Distributed Device Descriptors (D3) - Solution
Distributed Device Descriptors (D3) - Solution Outline
TODO
remove already in problem statement
Problem statement
In order to make meaningful security decisions (e.g. ....) about IoT devices we need an interoperable vocabulary for describing device types.<!---This needs expanding as to what the problem is and why it is important--->
The creation of the device vocabulary is a distributed problem; the device manufacturer is not always the best source of truth <!---what do we mean by truth?--->.
The system <!---what system? the following sound like a set of requirements rather than a problem statement---> we will define will support:
- Claims about device types (e.g. ... ) and device type qualities (e.g. ...) to be made by different stakeholders <!---we should probably define/give examples of stakeholders--->
- The maker of the claim will have a composable identity<!---needs further expansion/explanation--->, supporting at a minimum individual identity and organisational identity (current employer for example)
- A flexible method where the individual or entity <!---do you mean entities as defined below? Or an organisation?---> making decisions based on a distributed set of security claims has the ability to choose their preferred sources of trust, or 'weight' <!---may need a different word if this does not translate well, internationally---> the trust imbued in a claim appropriately.
NA: the detailed problem statement will go in the whitepaper.
Data gathering
As an early PoC, we will create a database to gather the D3 type information, in order to test its usability.
Device Types
DevType ID | unique identifier for device type |
---|---|
Manufacturer | manufacturer - (domain name) |
Model number | most fine-grained model number |
Model number Extra | other identifiers |
Link | link to definitive model definition |
Installer link | link to firmware |
Device Behaviours
TODO
write the basic behaviour schema
Behaviour ID | unique identifier for behaviour |
---|---|
src | |
dest | |
port | |
type | |
Sites
Site ID | unique identifier of the site |
---|---|
Owner | email of site manager (installation site) |
Postcode | rough location |
Type of site | (person's home / office / lab / ….) |
Type of connection | internet |
Internal Subnet | |
External IP | |
Install events
Dev ID | unique device ID |
---|---|
Site ID | unique identifier of the site |
DevType ID | unique identifier for device type |
Install date | date installed |
Serial number | unique id for device (manufacture) |
Serial extra | link to firmware |
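The tables above could be prototyped as follows. This is a minimal sketch only, assuming SQLite through Python's standard `sqlite3` module; the table names, column names and column types are hypothetical renderings of the fields listed above, and the Device Behaviours table is left out because its schema is still a TODO.

```python
import sqlite3

# Hypothetical schema: names and types are assumptions, not part of the proposal.
SCHEMA = """
CREATE TABLE device_types (
    devtype_id         TEXT PRIMARY KEY,  -- unique identifier for device type
    manufacturer       TEXT NOT NULL,     -- manufacturer (domain name)
    model_number       TEXT NOT NULL,     -- most fine-grained model number
    model_number_extra TEXT,              -- other identifiers
    link               TEXT,              -- link to model definition
    installer_link     TEXT               -- link to firmware
);
CREATE TABLE sites (
    site_id         TEXT PRIMARY KEY,     -- unique identifier of the site
    owner           TEXT,                 -- email of site manager
    postcode        TEXT,                 -- rough location
    site_type       TEXT,                 -- type of site
    connection_type TEXT,                 -- type of connection
    internal_subnet TEXT,
    external_ip     TEXT
);
CREATE TABLE install_events (
    dev_id        TEXT PRIMARY KEY,       -- unique device ID
    site_id       TEXT NOT NULL REFERENCES sites(site_id),
    devtype_id    TEXT NOT NULL REFERENCES device_types(devtype_id),
    install_date  TEXT,                   -- date installed
    serial_number TEXT,                   -- manufacturer serial number
    serial_extra  TEXT
);
"""


def create_poc_database(path=":memory:"):
    """Create the PoC tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

Calling `create_poc_database()` with no argument builds a throwaway in-memory database, which is enough to try out data entry against the proposed fields.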
URGENT
Need final comments on this proposal before it gets reworked
Implementation
Proposed implementation will use the emerging "Verifiable Credentials" to embody distributed descriptors:
https://www.w3.org/TR/vc-data-model/
Verifiable Credentials, as an initiative, embodies the principle of "rebooting the web of trust". As such it stands in stark contrast to the prevailing solutions in the device identity space.
The Verifiable Credential (VC) model is well suited to scenarios where the trust roots may change over time, and the trust decision is context sensitive. It is still possible to express "strong roots" in this model.
For real world IoT which embraces legacy deployments, these characteristics are essential.
At its simplest, the VC model consists of JSON-LD statements combined with digital signatures. Essentially they are signed tuples, where each attribute is represented by a URI.
Verifiable Credentials are part of a wider emerging Self-Sovereign Identity (SSI) ecosystem that includes Decentralized Identifiers (DIDs), although the two are not tightly bound. In this draft we consider Decentralized Identifiers to be optional. Distributed Ledger Technologies (DLTs) are often used to implement the identifier registry in SSI systems, but again DLTs should not be considered a mandatory part of this solution.
Implementation
Rough demo: https://ddd.tdxcloud.com/
Notation Convention
In the text below a convention will be used where:
- square brackets denote the signing entity
- the -SK suffix is used to denote a secret key
- the -PK suffix is used to denote a public key
- ENTITY in caps is used to denote the type of entity signed
- round brackets define the bounds of the signed payload
- a comma-separated payload identifies the separate elements being signed
- curly brackets denote a JSON payload
- HTTP/S references denote a DID reference
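To make the convention concrete, the sketch below models a statement as a small data structure and renders it in the bracket notation. The class name `Statement` and its field names are hypothetical and exist only for illustration; no signing is performed here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Statement:
    """Illustrative container for one signed statement (hypothetical names)."""
    signer: str                 # signing entity, e.g. "PERSON"
    entity_type: str            # the ENTITY in caps, e.g. "INHERIT"
    elements: Tuple[str, ...]   # the remaining comma-separated payload elements

    def to_notation(self) -> str:
        """Render as [SIGNER-SK]("ENTITY", element, element, ...)."""
        payload = ", ".join((f'"{self.entity_type}"',) + self.elements)
        return f"[{self.signer}-SK]({payload})"
```

For example, `Statement("PERSON", "INHERIT", ("https://devices.com/parent-device-type", "https://devices.com/child-device-type")).to_notation()` yields a string in the same form as the INHERIT statement used in the device type hierarchy section.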
SAM: "type of entity being signed"?
Entities - Relationship Model
The following key entities are defined:
- PERSON: individuals who make statements about device types. People are identified by email address.
- CODE: a piece of code or a process, identified by type.
- AGENT: a piece of code or a process, identified by type.
- ORG: legal organisations - identified by domain name. [many domains are NOT legal organisations eg: ManySecured.net, .iotsecurityfoundation.org]
- D3-TYPE: an abstracted device type - identified by URI.
- D3-DEVICE: an individual device instance.
EXAMPLES ENTITIES
A PERSON is an individual. They will frequently be identified by email, for example joe.bloggs@internetprovider.com or joe.bloggs@employee.com. Non-email identifiers could possibly be used (e.g. DID). The person may provide a personal identifier or an identifier provided by their employer (hence the two example email addresses). A PERSON may make CLAIMS.
CODE is functioning software. The CODE might run locally (on an edge gateway or a mobile device) or in the cloud. CODE can also make claims, as determined by some internal logic.
An ORG is a legal entity. For the purposes of claims, the ORG will usually be referenced by its primary internet domain, e.g. www.organisation.com.
A D3-TYPE is an abstract type of device. D3 types are hierarchical, where child types inherit properties from their parents. A simple hierarchy of types could be manufacturer/model/SKU/Firmware, which is a four-level hierarchy.
TODO: do we really want a strict hierarchy or a directed acyclic graph? Will property inheritance still work with DAGs?
A D3-DEVICE is a physical instance of a device, e.g. a specific lightbulb or a specific camera.
EXAMPLES RELATIONSHIPS
The most basic type of CLAIM is the claim that a specific device is of a specific type. A device type recognition claim.
This claim could be made by a PERSON, or it could be made by CODE (automated device recognition).
The device type claim could be made at various points of the hierarchy.
TE: Need some examples here to help with understanding the syntax used in following text.
SAM: computational process -- is this different from CODE? (In GiD is class of object to be identified, rather than computational process) [something missing after GiD]
VC statements
Each verifiable claim can be packaged and identified in the following ways:
- Portable payload: a VC compliant JSON document.
- #HASH: the hash of the JSON document can be used as a shorthand to the statement.
- URL: optionally an address that can be used to access the VC statement. The URL may be HTTP or a DLT locator.
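The #HASH shorthand can be illustrated as follows. This is a sketch under stated assumptions: the draft does not name a hash function or a canonicalisation scheme, so SHA-256 over a sorted-key JSON serialisation is used here purely as a stand-in. The example document follows the general shape of the W3C VC data model (v1), and its issuer and subject values are hypothetical.

```python
import hashlib
import json


def statement_hash(vc_document: dict) -> str:
    """Return a hex digest usable as a shorthand for the statement.

    Assumption: sorted keys and compact separators stand in for a proper
    canonicalisation step, which the draft does not yet specify.
    """
    canonical = json.dumps(
        vc_document, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical unsigned payload; identifiers are placeholders for illustration.
example_vc = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiableCredential"],
    "issuer": "did:example:person-123",
    "issuanceDate": "2021-01-01T00:00:00Z",
    "credentialSubject": {
        "id": "https://devices.com/child-device-type",
        "inherits": "https://devices.com/parent-device-type",
    },
}
```

Because the keys are sorted before hashing, two copies of the same document with different key order produce the same digest, which is the property a shorthand identifier needs.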
TE: Do the terms assertion/claim/statement above mean the same thing? A VC can contain more than one claim.
Person assertion
A person's existence is asserted through a DID, representing that individual.
For convenience we will provide a process that allows people to create a DID.
The simplest version will use a username plus a challenge-response exchange to police the creation of the DID.
Optionally an OAuth flow to an external ID provider can be used to help assert the user.
SAM: Is a person for authorship DID assertions? Can a robot make such assertions? Maybe a continuous integration and build process has no human-in-the-loop for code release.
NA: I would use CODE for this scenario, signed by the continuous integration code. Or use multiple layers: a PERSON pre-approves the CODE that does the signing.
Organisational membership - implicit
People are considered implicitly part of an organisation, if their email is common to the organisational domain name.
The email must be validated with a challenge. This validation can be attested to by a "trusted" email validator.
[EMAIL-VALIDATOR-SK]("ORG-MEMBER", PERSON-DID, person@domain.com, domain.com, validity-period)
Where validity period is the length of time the org membership is deemed valid.
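The implicit membership rule can be sketched as a simple domain comparison. The function names are hypothetical, the validity period is assumed to be a number of days, and an exact (case-insensitive) domain match is assumed; the draft does not say whether subdomains should also count. The email challenge and the validator's signature are outside the scope of this sketch.

```python
def email_domain(email: str) -> str:
    """Return the lower-cased domain part of an email address."""
    local, sep, domain = email.rpartition("@")
    if not (local and sep and domain):
        raise ValueError(f"not an email address: {email!r}")
    return domain.lower()


def org_member_statement(person_did, email, org_domain, validity_days):
    """Build the unsigned ORG-MEMBER payload if the email implies membership.

    Assumption: membership is implied only by an exact domain match.
    """
    if email_domain(email) != org_domain.lower():
        raise ValueError(f"{email!r} does not belong to {org_domain!r}")
    return ("ORG-MEMBER", person_did, email, org_domain, validity_days)
```

The returned tuple mirrors the element order of the ORG-MEMBER statement above; an email validator would sign it only after the challenge has succeeded.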
Privacy consideration
Need to review privacy considerations of individuals vs. traceability of statements.
Use VC presentations https://www.w3.org/TR/vc-data-model/#presentations, to minimise disclosure.
Signing
Every statement made by an individual is signed by their private key, which pairs to the public key resolved through the PERSON-DID.
Hence, in many of the following statements, [PERSON-SK] is used to denote such a signature.
Device type assertion
A basic device type expression consists of:
- Candidate URI: in theory can be hosted on any domain, or DLT
- Manufacturer: the organisational domain name
- Model number: the most fine grained model identifiable, free text descriptor
This statement can be signed by any person.
If the person belongs to the same organisation as the manufacturer, greater assurance can be placed in the expression.
Device type hierarchy
Device types are hierarchical. All child device types should inherit all properties of their parent.
This allows properties to be expressed efficiently. It also allows for device types to be refined.
[PERSON-SK]("INHERIT", https://devices.com/parent-device-type, https://devices.com/child-device-type)
The inheritable qualities of a device type are:
- All attributes declared in the device descriptor
- All D3 additional claims tied to the device type
QUESTION: do we need to explicitly distinguish between inheritable qualities and non-inheritable qualities?
Inherited qualities follow the standard rule that you inherit from your parent, unless it is overridden at the child.
A device type can only have one parent, but many children.
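The inheritance rule above (inherit from the parent unless overridden at the child, with a single parent per type) can be sketched as a walk up the parent chain. The function name and the two dictionary arguments are hypothetical; in practice the parent links would come from verified INHERIT statements.

```python
def resolve_properties(type_uri, parent_of, own_properties):
    """Return the effective properties of a device type.

    parent_of:      maps a child type URI to its single parent type URI
    own_properties: maps a type URI to the properties declared directly on it
    """
    chain = []
    seen = set()
    current = type_uri
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in device type hierarchy at {current!r}")
        seen.add(current)
        chain.append(current)
        current = parent_of.get(current)

    resolved = {}
    # Apply the root ancestor first so that values closer to the child win.
    for uri in reversed(chain):
        resolved.update(own_properties.get(uri, {}))
    return resolved
```

With a single parent per type the result is unambiguous. If the model moves to a DAG, the order in which multiple parents are applied would need to be defined, which this sketch does not attempt.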
SAM: Why must this be a hierarchy (directed acyclic graph)? Would an arbitrary directed graph work (in the same way claim graph inferencing is not limited to being hierarchical)?
Ok good point. I did half think through this. How robust is "inheritance" in DAGs
Firmware assertion
The existence of each distinguishable piece of firmware is asserted.
Typically this is done by the manufacturer, and in reality by an individual employed by the manufacturer.
[PERSON-SK]("FIRMWARE", https://devices.com/firmware-address, {firmware-descriptor})
A firmware descriptor is minimally composed of:
- the firmware payload
- URL from which the firmware can be downloaded
- Version number
- User readable friendly name
- Optional notes field
SAM: As above. Is a person necessary? Can a robot make such assertions? Maybe a continuous integration and build process has no human-in-the-loop for firmware releases.
I think we need to think through the semantics of signing chains.
a) code can sign firmware
b) person can sign code that signs firmware
c) person signs firmware that code has already signed
plus of course we can embed further semantics in the command
Firmware Type Binding
A firmware can be bound to one or more device types.
If a firmware is bound to a device type, it implies that the firmware is compatible with this device type:
[PERSON-SK]("FIRMWARE-BINDING", https://devices.com/firmware-address, https://devices.com/-device-type)
Firmware Version Update
A firmware version update statement is used to describe the fact that a firmware has been superseded by a newer version:
[PERSON-SK]("FIRMWARE-VERSION-UPDATE", https://devices.com/firmware-address-new, https://devices.com/firmware-address-old)
The most up to date firmware is the asserted firmware version that no version update statement names as superseded.
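The rule for finding the most up to date firmware can be sketched as a set difference. The function name is hypothetical, and the statements are assumed to have been verified already; each update is a (new, old) pair in the same order as the FIRMWARE-VERSION-UPDATE payload.

```python
def latest_firmware(asserted, updates):
    """Return asserted firmware addresses that no update statement supersedes.

    asserted: iterable of firmware addresses from FIRMWARE statements
    updates:  iterable of (new_address, old_address) pairs
    """
    superseded = {old for _new, old in updates}
    return [address for address in asserted if address not in superseded]
```

A list is returned rather than a single value because nothing in the statements prevents two unsuperseded versions from existing at once, for example after a branch in the update chain.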
Device Type - Least Privilege Behaviour Definition
A behaviour is a definition of approved (least privilege) internet behaviour expected of this device.
The behaviour may be expressed as an IETF YANG statement.
Or we may consider more compact, usable expressions.
A behaviour is a signed document. It could be accessible direct through URI or as a package.
TE: what is meant by 'package' here?
TODO: this all needs reconciling with MUD
[PERSON-SK]("BEHAVIOUR", https://devices.com/this-device-type, behaviour-doc)
SAM: What language will be used to describe behaviours? What will be the power of the language? E.g. will it need to be Turing Complete?
NA: not that complex. Thinking of the compressed behaviour descriptor below, and YANG.
Compressed behaviour descriptor
For readability, ease of expression and compactness we should consider a compressed description of behaviour.
The compressed behaviour description is a list in comma-separated values (CSV) form.
Each line is an allowed form of communication.
Each line is of the form:
[SOURCE-ADDRESS], [SOURCE-PORT], [DEST-ADDRESS], [DEST-PORT]
- Addresses can be IP addresses or URIs
- Addresses can support wild cards
- Port specification can support RANGES and COMMA separated lists
Each rule is being interpreted in the context of a specific device (the constrained device).
The term [THIS] is used as short hand to represent the device under consideration.
The following dynamically evaluated variables are therefore available to the descriptor:
- THIS-ADDRESS - the IP address of the device being considered
- THIS-GATEWAY - the gateway address of the device
- THIS-SUBNET - the subnet on which THIS device is registered
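A parser for the compressed descriptor might look like the sketch below. Several details are assumptions, since the draft does not fix them: `*` is taken as the wildcard, port ranges are written `low-high`, and a comma-separated port list must be wrapped in double quotes so that it survives CSV parsing. The function names are hypothetical, and wildcard matching of addresses is not implemented here.

```python
import csv
import io


def parse_ports(spec: str):
    """Return inclusive (low, high) port ranges; '*' is assumed to mean any port."""
    spec = spec.strip()
    if spec == "*":
        return [(0, 65535)]
    ranges = []
    for part in spec.split(","):
        low, _, high = part.strip().partition("-")
        ranges.append((int(low), int(high or low)))
    return ranges


def parse_behaviour(text: str, variables: dict):
    """Parse descriptor lines, substituting THIS-* variables in the address fields."""
    rules = []
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for row in reader:
        if not row:
            continue  # skip empty lines
        if len(row) != 4:
            raise ValueError(f"expected 4 fields, got {len(row)}: {row!r}")
        src, src_port, dst, dst_port = (field.strip() for field in row)
        rules.append({
            "src": variables.get(src, src),
            "src_ports": parse_ports(src_port),
            "dst": variables.get(dst, dst),
            "dst_ports": parse_ports(dst_port),
        })
    return rules
```

For instance, the line `THIS-ADDRESS, *, THIS-GATEWAY, 53` would allow the device to reach its gateway on port 53 from any source port, once THIS-ADDRESS and THIS-GATEWAY have been filled in for the device under consideration.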
The compressed descriptor should map precisely to a YANG descriptor
TO CHECK
Device Type - Updater
REMOVE: we have removed this, and replaced this by binding the URI source to the firmware assertion
The updater property identifies the "human usable" URI from which the most up to date firmware for this device can be accessed.
[PERSON-SK]("UPDATER", https://devices.com/this-device-type, https://devices.com/this-device-updater-uri)
Local device onboarding
If a device is formally provisioned/onboarded then critical information can be persisted and exposed at the gateway level.
[PERSON-SK]("DEVICE-ONBOARD", local://this-device-instance, {local-device-descriptor})
This process assumes a PERSON has explicitly onboarded the device to the home gateway.
TODO: we can look at other ways of persisting onboarding events
Local device detection
It is the responsibility of the gateway device to identify new candidate devices.
New device identification is an imperfect process.
[GATEWAY-SK]("DEVICE-DETECTION", local://this-device-instance, {local-device-evidence})
Device Recognition
Let us temporarily assume device recognition to be an out of band process.
The result of a recognition event is the assignment of a device instance to a device type.
[AGENT-SK]("RECOGNISED", https://devices.com/this-device-type, local://this-device-instance)
We will need to consider different candidate instances. MAC address being the most obvious.
This model can support non-IP device types. For example LoRa device ID.
There is a whole family of "recognisers" to consider:
- MAC address recogniser - would map a device instance to a high level device type ID - based on MAC address range
- User installed recogniser - maps a device to a specific device type ID based on user install event
- HTTPS Cert - maps device instance to device type based on the properties of the HTTPS certificate hosted on port 443
- FIDO Cert - maps device instance to device type based on properties of the discoverable cert
- CHIP Cert
- Lower level networking certificates - such as those described in "Locally significant certificates"
- QR Code hints - infer device type based on information found on a QR code sticker. (subset of user install event)
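As an illustration of the simplest recogniser in the list, the sketch below maps a MAC address to a device type by longest matching prefix and returns an unsigned RECOGNISED payload. The function name, the prefix table and the use of the MAC address inside the `local://` instance identifier are all assumptions made for illustration.

```python
def recognise_by_mac(mac: str, prefix_to_type: dict):
    """Return an unsigned ("RECOGNISED", type_uri, instance) tuple, or None.

    prefix_to_type: maps a colon-separated MAC prefix to a device type URI
                    (hypothetical table; real data would come from claims).
    """
    normalised = mac.replace("-", ":").lower()
    best = None
    for prefix, type_uri in prefix_to_type.items():
        candidate = prefix.lower()
        # Prefer the longest matching prefix, i.e. the most specific type.
        if normalised.startswith(candidate) and (best is None or len(candidate) > len(best[0])):
            best = (candidate, type_uri)
    if best is None:
        return None
    return ("RECOGNISED", best[1], f"local://{normalised}")
```

Returning `None` for an unknown prefix keeps the recogniser from guessing; a different recogniser from the list can then be tried.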
SAM: What are the false alarm rates of such schemes?
no idea till we implement:)
the cert backed (FIDO.CHIP etc) should be high unless their certificates are being fraudulently issued
Device descriptors workflow
The device descriptor workflow is designed to support the following properties:
- Any individual <!---or organisation?--->can propose a device or make a statement about a device
- Each of these statements stands as an autonomous cryptographically verifiable statement by the author
- We envisage a system where these statements are centrally gathered (e.g. IoTSF)
- But - the autonomous descriptors can exist in private ecosystems also
- Trusted authorities/intermediaries can apply processes to the atomic statements, which may include preferential treatment by trusted authors.
- Subsets and extracts can be made from the universe of device descriptors, as appropriate for different applications.
SAM: Does private mean not published?
SAM: Caution must be taken when dealing with sources that are not Authoritative (see the Global Identity Foundation for more on this.)
Blessed statements
Any statement made by one individual <!---organisation/entity?---> can be blessed by another. To bless a claim, means you agree with that claim.
SAM: Like "Chinese Whispers", blessing must not increase trust.
trust is subjective based on your trust base.
Claim graph inferencing
Claim graph inference is a process of drawing a set of usable conclusions from an interconnected graph of disparate claims.
The inferencing process is a subjective process, undertaken by the claim graph interpreter.
Each claim graph interpreter will have its own notions of who to trust. These a priori notions of trust can also be manifest as individual claims held by the interpreter.
The inferred facts are determined by performing an inferencing function on a combination of:
- the presented claim graph
- the interpreter-held claims representing a priori notions of trust
For example the interpreter may trust only statements made by people from Organisation A and Organisation B and ignore all other statements.
Or it may trust Org A and Org B and any statement blessed by Person X irrespective of their current organisational affiliation.
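The two example policies can be sketched as a filter over a list of claims. This is only an illustration of the idea, not a proposed inferencing design: the `Claim` fields, the "BLESS" kind name and the function name are hypothetical, and signature checks and organisational membership are assumed to have been verified beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    claim_id: str     # e.g. the hash of the statement
    author: str       # identifier of the person making the claim
    author_org: str   # organisation established for the author
    kind: str         # e.g. "RECOGNISED", or "BLESS" for a blessing
    subject: str      # for a blessing: the claim_id being blessed


def accepted_claims(claims, trusted_orgs, trusted_blessers):
    """Keep claims from trusted organisations or blessed by a trusted person."""
    blessed_ids = {
        c.subject for c in claims
        if c.kind == "BLESS" and c.author in trusted_blessers
    }
    return [
        c for c in claims
        if c.kind != "BLESS"
        and (c.author_org in trusted_orgs or c.claim_id in blessed_ids)
    ]
```

Passing an empty `trusted_blessers` set gives the first policy (trust only Organisation A and Organisation B); adding Person X to that set gives the second. The blesser's own organisation is deliberately ignored, matching "irrespective of their current organisational affiliation".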
SAM: SecPAL (Security Policy Assertion Language) or similar should guide the design of an inferencing system.
IKP: How do you revoke/lower confidence/trust in a person/organisation's statements (eg: if they have been hacked or taken over by disreputable organisation)?
TODOs:
Flesh out the different ways of creating a person
Flesh out the different ways of "authorising" a person
Separate out the data model (structure of claim) from signing method
Check JSON vs CDDL for expressing claim
Full examples for Compressed Behaviour Description
Lots more on inferencing