Distributed Device Descriptors (D3) - Solution

Distributed Device Descriptors (D3) - Solution Outline

TODO

remove already in problem statement

Problem statement

In order to make meaningful security decisions (e.g. ....) about IoT devices we need an interoperable vocabulary for describing device types.

The creation of the device vocabulary is a distributed problem; the device manufacturer is not always the best source of truth.

The system we will define will support:

Claims about device types (e.g. ... ) and device type qualities (e.g. ...) to be made by different stakeholders
The maker of the claim will have a composable identity, supporting at a minimum individual identity and organisational identity (current employer for example)
A flexible method whereby the individual or entity making decisions based on a distributed set of security claims can choose their preferred sources of trust, or 'weight' the trust placed in a claim appropriately.

NA: the detailed problem statement will go in the whitepaper.

Data gathering

As an early PoC, we will create a database to gather the D3 type information, in order to test its usability

Device Types

DevType ID	unique identifier for device type
Manufacturer	manufacturer - (domain name)
Model number	most fine-grained model number
Model number Extra	other identifiers
Link	link to definite model definition
Installer link	link to firmware

Device Behaviours

TODO

write the basic behaviour schema

Behaviour ID	unique identifier for behaviour
src
dest
port
type

Sites

Site ID	unique identifier of the site
Owner	email of site manager (installation site)
Postcode	rough location
Type of site	(person's home | office | lab | ….)
Type of connection	internet
Internal Subnet
External IP

Install events

Dev ID	unique device ID
Site ID	unique identifier of the site
DevType ID	unique identifier for device type
Install date	date installed
Serial number	unique id for device (manufacture)
Serial extra	link to firmware
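
As a rough illustration of how the PoC records above could be held in code, the sketch below models device types, sites and install events as Python dataclasses. The field names are taken from the tables above; the Python types, the optional/required split and the check_install helper are assumptions made for illustration only. Device behaviours are left out because their schema is still a TODO.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional


@dataclass(frozen=True)
class DeviceType:
    devtype_id: str                            # unique identifier for device type
    manufacturer: str                          # manufacturer (domain name)
    model_number: str                          # most fine-grained model number
    model_number_extra: Optional[str] = None   # other identifiers
    link: Optional[str] = None                 # link to model definition
    installer_link: Optional[str] = None       # link to firmware


@dataclass(frozen=True)
class Site:
    site_id: str                               # unique identifier of the site
    owner: str                                 # email of site manager
    postcode: str                              # rough location
    site_type: str                             # e.g. "home", "office", "lab"
    connection_type: str                       # e.g. "internet"
    internal_subnet: Optional[str] = None
    external_ip: Optional[str] = None


@dataclass(frozen=True)
class InstallEvent:
    dev_id: str                                # unique device ID
    site_id: str                               # refers to Site.site_id
    devtype_id: str                            # refers to DeviceType.devtype_id
    install_date: date
    serial_number: str                         # manufacturer's unique id
    serial_extra: Optional[str] = None


def check_install(event: InstallEvent,
                  sites: Dict[str, Site],
                  device_types: Dict[str, DeviceType]) -> bool:
    """Hypothetical helper: True if the event refers to a known site and type."""
    return event.site_id in sites and event.devtype_id in device_types
```

A consistency check of this kind is one simple way to test the usability of the gathered data, since an install event is only meaningful if both of its references resolve.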

URGENT

Need final comments on this proposal before it gets reworked

https://ddd.tdxcloud.com/

Implementation

Proposed implementation will use the emerging "Verifiable Credentials" to embody distributed descriptors:

https://www.w3.org/TR/vc-data-model/

Verifiable Credentials, as an initiative, embodies the principle of "rebooting the web of trust". As such it stands in stark contrast to the prevailing solutions in the device identity space.

The Verifiable Credential (VC) model is well suited to scenarios where the trust roots may change over time, and the trust decision is context sensitive. It is still possible to express "strong roots" in this model.

For real world IoT which embraces legacy deployments, these characteristics are essential.

At its simplest, the VC model consists of JSON-LD statements combined with digital signatures. Essentially they are signed tuples, where each attribute is represented by a URI.

Verifiable Credentials are part of a wider emerging Self-Sovereign Identity (SSI) ecosystem that includes Decentralized Identifiers (DIDs), although the two are not tightly bound. In this draft we consider Decentralized Identifiers to be optional. Distributed Ledger Technologies (DLTs) are often used to implement the identifier registry in SSI systems, but again DLTs should not be considered a mandatory part of this solution.

Implementation

Rough demo: https://ddd.tdxcloud.com/

Notation Convention

In the text below a convention will be used where:

square brackets denote the signing entity
the -SK suffix denotes a secret key
the -PK suffix denotes a public key
ENTITY in caps denotes the type of entity signed
round brackets define the bounds of the signed payload
commas separate the individual elements being signed
curly brackets denote a JSON payload
HTTP/S references denote a DID reference
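
To make the convention concrete, the sketch below renders a statement in this notation from its parts. The SignedStatement type and its field names are hypothetical and exist only for illustration; the rendered line matches the form of the signed tuples used later in this document. No signing is performed here, the output is purely the textual shorthand.

```python
from typing import NamedTuple, Tuple


class SignedStatement(NamedTuple):
    """Hypothetical in-memory form of one statement in the notation above."""
    signer: str                 # signing entity, e.g. "PERSON" or "GATEWAY"
    command: str                # type of statement, e.g. "INHERIT"
    elements: Tuple[str, ...]   # remaining elements of the signed payload

    def render(self) -> str:
        # Square brackets name the signer's secret key; round brackets
        # bound the payload; commas separate the signed elements.
        payload = ", ".join((f'"{self.command}"',) + tuple(self.elements))
        return f"[{self.signer}-SK]({payload})"


example = SignedStatement(
    signer="PERSON",
    command="INHERIT",
    elements=(
        "https://devices.com/parent-device-type",
        "https://devices.com/child-device-type",
    ),
)
print(example.render())
```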

SAM: "type of entity being signed"?

Entities - Relationship Model

The following key entities are defined:

PERSON: individuals who make statements about device types. People are identified by email address.
CODE: a piece of code (a software process), identified by type.
AGENT: a piece of code (a software process), identified by type.
ORG: legal organisations - identified by domain name. [many domains are NOT legal organisations, e.g. ManySecured.net, iotsecurityfoundation.org]
D3-TYPE: an abstracted device type - identified by URI.
D3-DEVICE: an individual device instance.

EXAMPLE ENTITIES
A PERSON is an individual. They will frequently be identified by email, for example joe.bloggs@internetprovider.com or joe.bloggs@employee.com. Non-email identifiers (e.g. a DID) could possibly also be used. The person may provide a personal identifier or an identifier provided by their employer (hence the two example emails). A PERSON may make CLAIMS.
CODE is functioning software. The CODE might run locally (on an edge gateway or a mobile device) or in the cloud. CODE can also make claims, as determined by some internal logic.
An ORG is a legal entity. For the purposes of claims, the ORG will usually be referenced by its primary internet domain, e.g. www.organisation.com.
A D3-TYPE is an abstract type of device. D3 types are hierarchical, where child types inherit properties from their parents. A simple hierarchy of types could be manufacturer/model/SKU/firmware, which is a four-level hierarchy.
TODO: do we really want a strict hierarchy or a directed acyclic graph? Will property inheritance still work with DAGs?
A D3-DEVICE is a physical instance of a device, e.g. a specific lightbulb or a specific camera.

EXAMPLE RELATIONSHIPS
The most basic type of CLAIM is the claim that a specific device is of a specific type: a device type recognition claim.
This claim could be made by a PERSON, or it could be made by CODE (automated device recognition).
The device type claim could be made at various points of the hierarchy.

TE: Need some examples here to help with understanding the syntax used in following text.

SAM: computational process -- is this different from CODE? (In GiD is class of object to be identified, rather than computational process) [something missing after GiD]

VC statements

Each verifiable claim can be packaged and identified in the following ways:

Portable payload: a VC compliant JSON document.
#HASH: the hash of the JSON document can be used as a shorthand to the statement.
URL: optionally an address that can be used to access the VC statement. The URL may be HTTP or a DLT locator.
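
A minimal sketch of the #HASH shorthand follows. It assumes SHA-256 and a simple sorted-key JSON serialisation; both are illustrative choices, not decisions made in this document. A real deployment would need an agreed canonicalisation (for example the JSON Canonicalization Scheme, RFC 8785) so that every party derives the same hash from the same statement.

```python
import hashlib
import json


def statement_hash(document: dict) -> str:
    """Return a hex digest usable as a shorthand for a VC statement.

    Assumption: sorted keys and compact separators stand in for a proper
    canonicalisation scheme, so that key order does not change the hash.
    """
    canonical = json.dumps(
        document, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two documents with the same content but different key order.
a = {"cmd": "DEVICE-TYPE", "subject": "https://devices.com/this-device-type"}
b = {"subject": "https://devices.com/this-device-type", "cmd": "DEVICE-TYPE"}
print(statement_hash(a) == statement_hash(b))
```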

TE: Do the terms assertion/claim/statement above mean the same thing? A VC can contain more than one claim.

Person assertion

A person's existence is asserted through a DID, representing that individual.

For convenience we will provide a process that allows people to create a DID.
The simplest version will use a username plus a challenge-response step to police the creation of the DID.
Optionally an OAuth flow to an external ID provider can be used to help assert the user.

SAM: Is a person necessary for authoring DID assertions? Can a robot make such assertions? Maybe a continuous integration and build process has no human-in-the-loop for code release.
NA: I would use CODE for this scenario, signed by the continuous integration code. Or
multi-layer: a PERSON pre-approves the CODE that does the signing.

Organisational membership - implicit

People are considered implicitly part of an organisation if the domain of their email address is the organisation's domain name.

The email must be validated with a challenge. This validation can be attested to by a "trusted" email validator.

[EMAIL-VALIDATOR-SK]("ORG-MEMBER", PERSON-DID, person@domain.com, domain.com, validity-period)

{
    {
        "cmd" : "org-member", /* command to claim a user belongs to an organisation */
        "subject" : [PERSON-DID], /* URI to the DID of the person */
        "domain" : [ORG-DOMAIN], /* domain name of the organisation */
        "validity-period" : [DAYS] /* number of days the claim is valid for */
    },
    "proof" :
    {
         /* public key of the mechanism that validates the email address; the claim is signed with the matching secret key */
         "verification-method" : [EMAIL-VALIDATOR-PK]
    }
}

Where validity period is the length of time the org membership is deemed valid.
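
The implicit membership rule and the validity period can be sketched as below. This is an illustration only: it assumes an exact, case-insensitive match between the email domain and the organisational domain (subdomains are not treated as members), and it treats the validity period as a whole number of days counted from an issue date. The function names are hypothetical, and the challenge that proves control of the mailbox is out of scope here.

```python
from datetime import date, timedelta


def email_domain(email: str) -> str:
    """Return the lower-cased domain part of an email address."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {email!r}")
    return domain.lower()


def is_implicit_member(email: str, org_domain: str) -> bool:
    """True if the email address sits directly under the organisation's domain."""
    return email_domain(email) == org_domain.lower()


def membership_is_current(issued_on: date, validity_days: int, today: date) -> bool:
    """True while the ORG-MEMBER claim is inside its validity period."""
    return issued_on <= today <= issued_on + timedelta(days=validity_days)


print(is_implicit_member("joe.bloggs@organisation.com", "organisation.com"))
print(membership_is_current(date(2021, 1, 1), 30, date(2021, 3, 1)))
```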

Privacy consideration

Need to review privacy considerations of individuals vs. traceability of statements.

Use VC presentations (https://www.w3.org/TR/vc-data-model/#presentations) to minimise disclosure.

Signing

Every statement made by an individual is signed by their private key, which pairs to the public key resolved through the PERSON-DID.

Hence in many of the following statements [PERSON-SK] is used to denote such a signature.

Device type assertion

A basic device type expression consists of:

Candidate URI: in theory can be hosted on any domain, or DLT
Manufacturer: the organisational domain name
Model number: the most fine grained model identifiable, free text descriptor

This statement can be signed by any person.

If the person belongs to the same organisation as the manufacturer, greater assurance can be placed in the expression.

{
    {
        "cmd" : "DEVICE-TYPE", /* command to assert the existence of a device type */
        "subject" : [DEVICE_TYPE_URI], /* URI of the type - could be a DID or a simple URI */
        "issuer" : [PERSON] /* URI to the person making the claim */
    },
    "proof" :
    {
     /* public key of the person making the claim; the claim is signed with the matching secret key */
     "verification-method" : [PERSON-PK]
    }
}

Device type hierarchy

Device types are hierarchical. All child device types should inherit all properties of their parent.

This allows properties to be expressed efficiently. It also allows for device types to be refined.

[PERSON-SK]("INHERIT", https://devices.com/parent-device-type, https://devices.com/child-device-type)

The inheritable qualities of a device type are:

All attributes declared in the device descriptor
All D3 additional claims, tied to the device type

QUESTION: do we need to explicitly distinguish between inheritable qualities and non inheritable qualities

Inherited qualities follow the standard rule that you inherit from your parent, unless it is overridden at the child.

A device type can only have one parent, but many children.
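
The inheritance rule can be sketched as a walk up the single-parent chain, with values declared nearer the child overriding those declared further up. The data layout (a child-to-parent map built from INHERIT statements, plus a map of declared properties per type) and the example URIs are assumptions for illustration.

```python
from typing import Dict, Optional


def effective_properties(
    device_type: str,
    parents: Dict[str, Optional[str]],
    declared: Dict[str, Dict[str, str]],
) -> Dict[str, str]:
    """Resolve the properties of a device type, including inherited ones."""
    # Collect the chain child -> parent -> ... -> root.
    chain = []
    seen = set()
    current: Optional[str] = device_type
    while current is not None:
        if current in seen:
            raise ValueError(f"inheritance cycle at {current}")
        seen.add(current)
        chain.append(current)
        current = parents.get(current)

    # Apply from the root downwards so that children override parents.
    merged: Dict[str, str] = {}
    for node in reversed(chain):
        merged.update(declared.get(node, {}))
    return merged


# Hypothetical three-level hierarchy: manufacturer / model / firmware.
parents = {
    "https://devices.com/acme/cam1/fw2": "https://devices.com/acme/cam1",
    "https://devices.com/acme/cam1": "https://devices.com/acme",
}
declared = {
    "https://devices.com/acme": {"manufacturer": "acme", "category": "generic"},
    "https://devices.com/acme/cam1": {"category": "camera"},
}
print(effective_properties("https://devices.com/acme/cam1/fw2", parents, declared))
```

With a single parent per type the override order is unambiguous. If the hierarchy were relaxed to a DAG, a rule for resolving conflicting values from different parents would also be needed.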

SAM: Why must this be a hierarchy (directed acyclic graph)? Would an arbitrary directed graph work (in the same way claim graph inferencing is not limited to being hierarchical)?
Ok good point. I did half think through this. How robust is "inheritance" in DAGs

Firmware assertion

The existence of each distinguishable piece of firmware is asserted.

Typically this is done by the manufacturer, and in reality by an individual employed by the manufacturer.

[PERSON-SK]("FIRMWARE", https://devices.com/firmware-address, {firmware-descriptor})

A firmware descriptor is minimally composed of:

the firmware payload
URL from which the firmware can be downloaded
Version number
User readable friendly name
Optional notes field

SAM: As above. Is a person necessary? Can a robot make such assertions? Maybe a continuous integration and build process has no human-in-the-loop for firmware releases.
I think we need to think through the semantics of signing chains.
a) code can sign firmware
b) person can sign code that signs firmware
c) person signs code that firmware has already signed
plus of course we can embed further semantics in the command

Firmware Type Binding

A firmware can be bound to one or more device types.

If a firmware is bound to a device type, it implies that the firmware is compatible with this device type:

[PERSON-SK]("FIRMWARE-BINDING", https://devices.com/firmware-address, https://devices.com/-device-type)

Firmware Version Update

A firmware version update statement is used to describe the fact that a firmware has been superseded by a newer version:

[PERSON-SK]("FIRMWARE-VERSION-UPDATE", https://devices.com/firmware-address-new, https://devices.com/firmware-address-old)

The most up to date firmware is the asserted firmware version that no version update statement names as the superseded (old) version.
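
This rule can be sketched as a set difference between the asserted firmware and everything named as the old side of an update statement. The representation of update statements as (new, old) pairs follows the argument order of the FIRMWARE-VERSION-UPDATE tuple above; the function name and the example URIs are hypothetical. The result is a set because nothing in the rule prevents several firmware versions from being unsuperseded at once.

```python
from typing import Iterable, Set, Tuple


def latest_firmware(
    asserted: Iterable[str],
    updates: Iterable[Tuple[str, str]],
) -> Set[str]:
    """Return the asserted firmware URIs that have not been superseded.

    `updates` holds (new, old) pairs, one per version update statement.
    """
    superseded = {old for _new, old in updates}
    return {fw for fw in asserted if fw not in superseded}


asserted = [
    "https://devices.com/fw/1.0",
    "https://devices.com/fw/1.1",
    "https://devices.com/fw/2.0",
]
updates = [
    ("https://devices.com/fw/1.1", "https://devices.com/fw/1.0"),
    ("https://devices.com/fw/2.0", "https://devices.com/fw/1.1"),
]
print(latest_firmware(asserted, updates))
```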

Device Type - Least Privilege Behaviour Definition

A behaviour is a definition of approved (least privilege) internet behaviour expected of this device.

The behaviour may be expressed as an IETF YANG statement.

Or we may consider more compact, usable expressions.

A behaviour is a signed document. It could be accessible directly through a URI or as a package.

TE: what is meant by 'package' here?

TODO: this all needs reconciling with MUD

[PERSON-SK]("BEHAVIOUR", https://devices.com/this-device-type, behaviour-doc)

SAM: What language will be used to describe behaviours? What will be the power of the language? E.g. will it need to be Turing Complete?
NA: not that complex.
Thinking of the compressed behaviour below
and YANG.

Compressed behaviour descriptor

For readability, ease of expression and compactness we should consider a compressed description of behaviour.

A compressed behaviour description is a list of comma-separated values (CSV).

Each line is an allowed form of communication.

Each line is of the form:

[SOURCE-ADDRESS], [SOURCE-PORT], [DEST-ADDRESS], [DEST-PORT]

Addresses can be IP addresses or URIs
Addresses can support wild cards
Port specification can support RANGES and COMMA separated lists

Each rule is being interpreted in the context of a specific device (the constrained device).

The term [THIS] is used as short hand to represent the device under consideration.

The following dynamically evaluated variables are therefore available to the descriptor:

THIS-ADDRESS - the IP address of the device being considered
THIS-GATEWAY - the gateway address of the device
THIS-SUBNET - the subnet on which THIS device is registered
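
The following sketch parses a compressed descriptor and checks a single flow against it. Several details are assumptions, since the syntax is not yet fixed: "*" is used as the wildcard, port ranges are written low-high, a port list is placed in a quoted CSV field so that its commas do not clash with the field separator, and addresses are compared as strings with shell-style wildcards. Subnet (CIDR) matching for THIS-SUBNET is not handled here.

```python
import csv
from fnmatch import fnmatchcase
from typing import Dict, List, Tuple

Rule = Tuple[str, str, str, str]  # source address, source port, dest address, dest port


def parse_rules(text: str, variables: Dict[str, str]) -> List[Rule]:
    """Parse descriptor lines, substituting THIS-* variables by value."""
    rules = []
    for row in csv.reader(text.strip().splitlines(), skipinitialspace=True):
        if not row:
            continue  # ignore blank lines
        if len(row) != 4:
            raise ValueError(f"expected 4 fields, got {row!r}")
        fields = [field.strip() for field in row]
        src, sport, dst, dport = (variables.get(f, f) for f in fields)
        rules.append((src, sport, dst, dport))
    return rules


def port_matches(spec: str, port: int) -> bool:
    """Match a port against '*', a single port, a range, or a list of these."""
    for part in spec.split(","):
        part = part.strip()
        if part == "*":
            return True
        low, sep, high = part.partition("-")
        if sep:
            if int(low) <= port <= int(high):
                return True
        elif int(part) == port:
            return True
    return False


def is_allowed(rules: List[Rule], src: str, sport: int, dst: str, dport: int) -> bool:
    """True if at least one rule permits the flow (everything else is denied)."""
    return any(
        fnmatchcase(src, r_src) and port_matches(r_sport, sport)
        and fnmatchcase(dst, r_dst) and port_matches(r_dport, dport)
        for r_src, r_sport, r_dst, r_dport in rules
    )


descriptor = """
THIS-ADDRESS, *, THIS-GATEWAY, 53
THIS-ADDRESS, *, *.example.com, "443,8000-8100"
"""
this = {"THIS-ADDRESS": "192.168.1.20", "THIS-GATEWAY": "192.168.1.1"}
rules = parse_rules(descriptor, this)
print(is_allowed(rules, "192.168.1.20", 40000, "api.example.com", 443))
```

Treating the list as an allow-list with an implicit default deny matches the least privilege intent of the behaviour definition.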

The compressed descriptor should map precisely to a YANG descriptor

TO CHECK

Device Type - Updater

REMOVE: we have removed this, and replaced this by binding the URI source to the firmware assertion

The updater property, identifies the "Human usable" URI from which the most up to date firmware for this device can be accessed

[PERSON-SK]("UPDATER",https://devices.com/this-device-type, https://devices.com/this-device-updater-uri)

Local device onboarding

If a device is formally provisioned/onboarded then critical information can be persisted and exposed at the gateway level.

[PERSON-SK]("DEVICE-ONBOARD", local://this-device-instance, {local-device-descriptor})

This process assumes a PERSON has explicitly onboarded the device to the home gateway.

TODO: we can look at other ways of persisting onboarding events

Local device detection

It is the responsibility of the gateway device to identify new candidate devices.

New device identification is an imperfect process.

[GATEWAY-SK]("DEVICE-DETECTION", local://this-device-instance, {local-device-evidence})

Device Recognition

Let us temporarily assume device recognition to be an out of band process.

The result of a recognition event is the assignment of a device instance to a device type

[AGENT-SK]("RECOGNISED", https://devices.com/this-device-type, local://this-device-instance)

We will need to consider different candidate instances. MAC address being the most obvious.

This model can support non-IP device types. For example LoRa device ID.

There is a whole family of "recognisers" to consider:

MAC address recogniser - would map a device instance to a high level device type ID - based on MAC address range
User installed recogniser - maps a device to a specific device type ID based on user install event
HTTPS Cert - maps device instance to device type based on the properties of the HTTPS certificate hosted on port 443
FIDO Cert - maps device instance to device type based on properties of the discoverable cert
CHIP Cert
Lower level networking certificates - such as those described in "Locally significant certificates"
QR Code hints - infer device type based on information found on a QR code sticker. (subset of user install event)
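
As an illustration of the simplest of these, a MAC address recogniser can be sketched as a lookup on the first 24 bits of the address (the vendor prefix). The prefix table, its contents and the function names below are hypothetical; a real recogniser would need a maintained prefix registry and would only ever yield a high level device type.

```python
from typing import Dict, Optional

HEX_DIGITS = set("0123456789ABCDEF")


def normalise_mac(mac: str) -> str:
    """Strip separators and upper-case a MAC address, e.g. 'aa:bb:..' -> 'AABB..'."""
    digits = "".join(ch for ch in mac if ch not in ":-.").upper()
    if len(digits) != 12 or not set(digits) <= HEX_DIGITS:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return digits


def recognise_by_mac(mac: str, prefix_table: Dict[str, str]) -> Optional[str]:
    """Map a device instance to a device type URI using the 24-bit MAC prefix.

    Returns None when the prefix is unknown, i.e. no recognition claim is made.
    """
    return prefix_table.get(normalise_mac(mac)[:6])


# Hypothetical prefix table: vendor prefix -> high level device type.
prefix_table = {"AABBCC": "https://devices.com/example-manufacturer"}
print(recognise_by_mac("aa:bb:cc:01:02:03", prefix_table))
```

A successful lookup would then be published as a RECOGNISED statement signed by the recogniser.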

SAM: What are the false alarm rates of such schemes?
No idea until we implement :)
The cert-backed recognisers (FIDO, CHIP etc.) should be highly reliable unless their certificates are being fraudulently issued.

Device descriptors workflow

The device descriptor workflow is designed to support the following properties:

Any individual can propose a device or make a statement about a device
Each of these statements stands as an autonomous cryptographically verifiable statement by the author
We envisage a system where these statements are centrally gathered (e.g. IoTSF)
But - the autonomous descriptors can exist in private ecosystems also
Trusted authorities/intermediaries can apply processes to the atomic statements, which may include preferential treatment by trusted authors.
Subsets and extracts can be made from the universe of device descriptors, as appropriate for different applications.

SAM: Does private mean not published?

SAM: Caution must be taken when dealing with sources that are not Authoritative (see the Global Identity Foundation for more on this.)

Blessed statements

Any statement made by one individual can be blessed by another. To bless a claim means you agree with that claim.

SAM: Like "Chinese Whispers", blessing must not increase trust.
trust is subjective based on your trust base.

Claim graph inferencing

Claim graph inference is a process of drawing a set of usable conclusions from an interconnected graph of disparate claims.

The inferencing process is a subjective process, undertaken by the claim graph interpreter.

Each claim graph interpreter will have its own notions of who to trust. These a priori notions of trust can also be manifest as individual claims held by the interpreter.

The inferred facts are determined by performing an inferencing function on a combination of:

the presented claim graph
the interpreter-held claims representing a priori notions of trust

For example the interpreter may trust only statements made by people from Organisation A and Organisation B and ignore all other statements.
Or it may trust Org A and Org B and any statement blessed by Person X irrespective of their current organisational affiliation.
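
The two example policies can be sketched as a filter over the claim graph. The Claim structure, its field names and the representation of blessings as (blesser, claim id) pairs are assumptions for illustration; signature verification and any weighting of trust are left out.

```python
from typing import Iterable, NamedTuple, Optional, Set, Tuple


class Claim(NamedTuple):
    claim_id: str               # e.g. the #HASH of the statement
    author: str                 # identifier of the PERSON who signed it
    author_org: Optional[str]   # organisation attested for the author, if any
    statement: str              # the content of the claim


def trusted_statements(
    claims: Iterable[Claim],
    blessings: Iterable[Tuple[str, str]],
    trusted_orgs: Set[str],
    trusted_blessers: Set[str],
) -> Set[str]:
    """Return the statements this interpreter accepts.

    A claim is accepted if its author belongs to a trusted organisation,
    or if it has been blessed by a trusted person.
    """
    blessed_ids = {cid for blesser, cid in blessings if blesser in trusted_blessers}
    return {
        c.statement
        for c in claims
        if c.author_org in trusted_orgs or c.claim_id in blessed_ids
    }


claims = [
    Claim("c1", "alice", "org-a.com", "device-1 is type-x"),
    Claim("c2", "bob", "org-c.com", "device-2 is type-y"),
    Claim("c3", "carol", None, "device-3 is type-z"),
]
blessings = [("person-x", "c3"), ("mallory", "c2")]

# Policy 1: trust only Org A and Org B.
print(trusted_statements(claims, [], {"org-a.com", "org-b.com"}, set()))
# Policy 2: additionally trust anything blessed by Person X.
print(trusted_statements(claims, blessings, {"org-a.com", "org-b.com"}, {"person-x"}))
```

The same claim graph gives different conclusions under the two policies, which is the sense in which inferencing is subjective to the interpreter.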

SAM: SecPAL (Security Policy Assertion Language) or similar should guide the design of an inferencing system.

IKP: How do you revoke or lower confidence/trust in a person's or organisation's statements (e.g. if they have been hacked or taken over by a disreputable organisation)?

TODOs:

Flesh out the different ways of creating a person
Flesh out the different ways of "authorising" a person
Separate out the data model (structure of claim) from signing method
Check JSON vs CDDL for expressing claim
Full examples for Compressed Behaviour Description
Lots more on inferencing
