Distributed Device Descriptors (D3) - Solution

Distributed Device Descriptors (D3) - Solution Outline

TODO

remove already in problem statement

Problem statement

In order to make meaningful security decisions (e.g. ....) about IoT devices we need an interoperable vocabulary for describing device types.<!---This needs expanding as to what the problem is and why it is important--->

The creation of the device vocabulary is a distributed problem; the device manufacturer is not always the best source of truth <!---what do we mean by truth?--->.

The system <!---what system? the following sound like a set of requirements rather than a problem statement---> we will define will support:

  1. Claims about device types (e.g. ... ) and device type qualities (e.g. ...) to be made by different stakeholders <!---we should probably define/give examples of stakeholders--->
  2. The maker of the claim will have a composable identity<!---needs further expansion/explanation--->, supporting at a minimum individual identity and organisational identity (current employer for example)
  3. A flexible method where the individual or entity <!---do you mean entities as defined below? Or an organisation?---> making decisions based on a distributed set of security claims has the ability to choose their preferred sources of trust, or 'weight' <!---may need a different word if this does not translate well, internationally---> the trust imbued in a claim appropriately.

NA: the detailed problem statement will go in the whitepaper.

Data gathering

As an early PoC, we will create a database to gather the D3 type information in order to test its usability.

Device Types

DevType ID: unique identifier for device type
Manufacturer: manufacturer - (domain name)
Model number: most fine-grained model number
Model number Extra: other identifiers
Link: link to definite model definition
Installer link: link to firmware

Device Behaviours

TODO

write the basic behaviour schema

Behaviour ID: unique identifier for behaviour
src
dest
port
type

Sites

Site ID: unique identifier of the site
Owner: email of site manager (installation site)
Postcode: rough location
Type of site: (person's home | office | lab | ….)
Type of connection: internet
Internal Subnet
External IP

Install events

Dev ID: unique device ID
Site ID: unique identifier of the site
DevType ID: unique identifier for device type
Install date: date installed
Serial number: unique id for device (manufacture)
Serial extra: link to firmware
URGENT

Need final comments on this proposal before it gets reworked

https://ddd.tdxcloud.com/

Implementation

Proposed implementation will use the emerging "Verifiable Credentials" to embody distributed descriptors:

https://www.w3.org/TR/vc-data-model/

Verifiable Credentials, as an initiative, embodies the principle of "rebooting the web of trust". As such it stands in stark contrast to the prevailing solutions in the device identity space.

The Verifiable Credential (VC) model is well suited to scenarios where the trust roots may change over time, and the trust decision is context sensitive. It is still possible to express "strong roots" in this model.

For real world IoT which embraces legacy deployments, these characteristics are essential.

At its simplest, the VC model consists of JSON-LD statements combined with digital signatures. Essentially they are signed tuples, where each attribute is represented by a URI.

Verifiable Credentials are part of a wider emerging Self-Sovereign Identity (SSI) ecosystem that includes Decentralized Identifiers (DIDs), although the two are not tightly bound. In this draft we consider Decentralized Identifiers to be optional. Distributed Ledger Technologies (DLTs) are often used to implement the identifier registry in SSI systems, but again DLTs should not be considered a mandatory part of this solution.

Rough demo: https://ddd.tdxcloud.com/

Notation Convention

In the text below a convention will be used where:

  • square brackets denote the signing entity
  • the -SK suffix denotes a secret key
  • the -PK suffix denotes a public key
  • ENTITY in caps denotes the type of entity signed
  • round brackets define the bounds of the signed payload
  • commas separate the individual elements being signed
  • curly brackets denote a JSON payload
  • HTTP/S references denote a DID reference
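
As a rough illustration of the convention above, the bracket notation can be read mechanically. The sketch below is hypothetical: the function name parse_statement and the returned dictionary keys are not part of the D3 proposal, and the parser only handles the simple forms used in this document (it assumes no commas inside an individual element).

```python
import re

# Matches: [SIGNER-SK]("COMMAND", element, element, ...)
STATEMENT_RE = re.compile(r'^\[(?P<signer>[^\]]+)\]\((?P<payload>.*)\)$')


def parse_statement(text):
    """Split a statement in the bracket notation into signer, command and elements."""
    match = STATEMENT_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a signed statement: {text!r}")
    # Naive split: assumes no commas inside an individual element.
    parts = [p.strip() for p in match.group("payload").split(",")]
    return {
        "signer": match.group("signer"),      # square brackets: the signing key
        "command": parts[0].strip('"'),       # first payload element: the command
        "elements": parts[1:],                # remaining signed elements
    }


example = ('[PERSON-SK]("INHERIT", https://devices.com/parent-device-type, '
           'https://devices.com/child-device-type)')
parsed = parse_statement(example)
print(parsed["signer"], parsed["command"], parsed["elements"])
```

A JSON payload in curly brackets would need a real tokenizer rather than a comma split; that is left out here to keep the sketch short.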

SAM: "type of entity being signed"?

Entities - Relationship Model

The following key entities are defined:

  • PERSON: individuals who make statements about device types. People are identified by email address.
  • CODE: a piece of code or a process, identified by type.
  • AGENT: a piece of code or a process, identified by type.
  • ORG: legal organisations, identified by domain name. [many domains are NOT legal organisations, e.g. ManySecured.net, .iotsecurityfoundation.org]
  • D3-TYPE: an abstracted device type - identified by URI.
  • D3-DEVICE: an individual device instance.

EXAMPLES ENTITIES

A PERSON is an individual. They will frequently be identified by email, for example joe.bloggs@internetprovider.com or joe.bloggs@employee.com. Non-email identifiers (e.g. a DID) could possibly be used. The person may provide a personal identifier or an identifier provided by their employer (hence the two different email examples). A PERSON may make CLAIMS.

CODE is functioning software. The CODE might run locally (on an edge gateway or a mobile device) or in the cloud. CODE can also make claims, as determined by some internal logic.

An ORG is a legal entity. For the purposes of claims, the ORG will usually be referenced by its primary internet domain, e.g. www.organisation.com.

A D3-TYPE is an abstract type of device. D3 types are hierarchical, where child types inherit properties from their parents. A simple hierarchy of types could be manufacturer/model/SKU/firmware, which is a four-level hierarchy.

TODO: do we really want a strict hierarchy or a directed acyclic graph. Will property inheritance still work with DAGS ?

A D3-DEVICE is a physical instance of a device, e.g. a specific lightbulb or a specific camera.

EXAMPLES RELATIONSHIPS

The most basic type of CLAIM is a device type recognition claim: the claim that a specific device is of a specific type.

This claim could be made by a PERSON, or it could be made by CODE (automated device recognition).

The device type claim could be made at various points of the hierarchy.

TE: Need some examples here to help with understanding the syntax used in following text.

SAM: computational process -- is this different from CODE? (In GiD is class of object to be identified, rather than computational process) [something missing after GiD]

VC statements

Each verifiable claim can be packaged and identified in the following ways:

  • Portable payload: a VC compliant JSON document.
  • #HASH: the hash of the JSON document can be used as a shorthand to the statement.
  • URL: optionally an address that can be used to access the VC statement. The URL may be HTTP or a DLT locator.
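
To make the #HASH shorthand concrete, here is a minimal sketch. The draft does not name a hash algorithm or a canonicalisation scheme, so SHA-256 and a sorted-key JSON serialisation are assumptions made purely for illustration, as are the function name statement_hash and the example claim values.

```python
import hashlib
import json


def statement_hash(document):
    """Return a SHA-256 hex digest over a deterministic JSON serialisation."""
    # Sorted keys and fixed separators stand in for a real canonicalisation scheme.
    canonical = json.dumps(document, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two hypothetical claims with the same content but different key order.
claim_a = {"cmd": "org-member", "subject": "did:example:alice", "domain": "example.com"}
claim_b = {"domain": "example.com", "subject": "did:example:alice", "cmd": "org-member"}

# Key order does not change the digest, so the hash is a stable shorthand.
print(statement_hash(claim_a) == statement_hash(claim_b))
```

Whatever scheme is finally chosen, the serialisation must be deterministic; otherwise two copies of the same statement could yield different shorthands.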

TE: Do the terms assertion/claim/statement above mean the same thing? A VC can contain more than one claim.

Person assertion

A person's existence is asserted through a DID, representing that individual.

For convenience we will provide a process that allows people to create a DID.

The simplest version will use a username plus a challenge-response to police the creation of the DID.

Optionally an OAuth flow to an external ID provider can be used to help assert the user.

SAM: Is a person for authorship DID assertions? Can a robot make such assertions? Maybe a continuous integration and build process has no human-in-the-loop for code release.

NA: I would use CODE for this scenario, signed by the continuous integration code. Or multi-layer: a PERSON pre-approves the CODE that does the signing.

Organisational membership - implicit

People are considered implicitly part of an organisation if their email address is on the organisation's domain name.

The email must be validated with a challenge. This validation can be attested to by a "trusted" email validator.

[EMAIL-VALIDATOR-SK]("ORG-MEMBER", PERSON-DID, person@domain.com, domain.com, validity-period)

{
{
"cmd" : "org-member", /* command to claim a user belongs to an organisation */
"subject" : [PERSON-DID], /* URI to the DID of the person */
"domain" : [ORG-DOMAIN], /* domain name of the organisation */
"validity-period" : [DAYS] /* number of days the claim is valid for */
},
"proof" :
{
/* signed with the secret key of the mechanism that validates the email address; this public key verifies the signature */
"verification-method" : [EMAIL-VALIDATOR-PK]
}
}

Where validity period is the length of time the org membership is deemed valid.
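
The two checks implied above (the email domain matches the organisation, and the claim is still inside its validity period) might look like the sketch below. The function names and the exact-match rule are assumptions; in particular, the draft does not say whether subdomains count as the same organisation, so this sketch treats them as different.

```python
from datetime import date, timedelta


def is_implicit_member(email, org_domain):
    """True if the email address sits directly on the organisation's domain."""
    local, sep, domain = email.rpartition("@")
    # Exact, case-insensitive match only; subdomains are NOT accepted here.
    return bool(local) and sep == "@" and domain.lower() == org_domain.lower()


def membership_is_current(issued_on, validity_days, today):
    """True while the ORG-MEMBER claim is inside its validity period."""
    return issued_on <= today <= issued_on + timedelta(days=validity_days)


print(is_implicit_member("person@domain.com", "domain.com"))
print(membership_is_current(date(2024, 1, 1), 90, date(2024, 2, 1)))
```

Neither check replaces the email challenge itself; they only describe what a verifier could recompute from an already validated claim.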

Privacy consideration

Need to review privacy considerations of individuals vs. traceability of statements.

Use VC presentations https://www.w3.org/TR/vc-data-model/#presentations, to minimise disclosure.

Signing

Every statement made by an individual is signed with their private (secret) key, which pairs with the public key resolved through the PERSON-DID.

Hence in many of the following statements [PERSON-SK] is used to denote such a signature.

Device type assertion

A basic device type expression consists of:

  • Candidate URI: in theory can be hosted on any domain, or DLT
  • Manufacturer: the organisational domain name
  • Model number: the most fine grained model identifiable, free text descriptor

This statement can be signed by any person.

If the person belongs to the same organisation as the manufacturer, greater assurance can be placed in the expression.

{
{
"cmd" : "DEVICE-TYPE", /* command to assert a device type */
"subject" : [DEVICE_TYPE_URI], /* URI type - could be DID or simple URI */
"issuer" : [PERSON] /* URI to the person making the claim */
},
"proof" :
{
/* signed with the secret key of the person making the claim; this public key verifies the signature */
"verification-method" : [PERSON-PK]
}
}

Device type hierarchy

Device types are hierarchical. All child device types should inherit all properties of their parent.

This allows properties to be expressed efficiently. It also allows for device types to be refined.

[PERSON-SK]("INHERIT", https://devices.com/parent-device-type, https://devices.com/child-device-type)

The inheritable qualities of a device type are:

  • All attributes declared in the device descriptor

  • All D3 additional claims, tied to the device type

QUESTION: do we need to explicitly distinguish between inheritable qualities and non inheritable qualities

Inherited qualities follow the standard rule that you inherit from your parent, unless it is overridden at the child.

A device type can only have one parent, but many children.
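
The override rule can be sketched as a walk up the single-parent chain. This is only an illustration under assumed data structures: parents maps a child type URI to its parent URI (as an INHERIT statement would), and declared maps a type URI to the properties declared directly on it. The property names and values are made up.

```python
def resolve_properties(device_type, parents, declared):
    """Merge properties from the root ancestor down to device_type."""
    chain = []
    seen = set()
    current = device_type
    while current is not None:
        if current in seen:
            raise ValueError(f"inheritance cycle at {current}")
        seen.add(current)
        chain.append(current)
        current = parents.get(current)  # None once the root is reached
    resolved = {}
    # Apply the root first so that values closer to the child override it.
    for uri in reversed(chain):
        resolved.update(declared.get(uri, {}))
    return resolved


parent = "https://devices.com/parent-device-type"
child = "https://devices.com/child-device-type"
parents = {child: parent}
declared = {
    parent: {"manufacturer": "devices.com", "ports": "443"},
    child: {"ports": "443,8883"},
}
print(resolve_properties(child, parents, declared))
```

With a directed acyclic graph instead of a strict hierarchy, a type could have several parents and the merge order would need an explicit rule, which is the open question raised in the comments.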

SAM: Why must this be a hierarchy (directed acyclic graph)? Would an arbitrary directed graph work (in the same way claim graph inferencing is not limited to being hierarchical)?

OK, good point. I did half think this through. How robust is "inheritance" in DAGs?

Firmware assertion

The existence of each distinguishable piece of firmware is asserted.

Typically this is done by the manufacturer, and in reality by an individual employed by the manufacturer.

[PERSON-SK]("FIRMWARE", https://devices.com/firmware-address, {firmware-descriptor})

A firmware descriptor is minimally composed of:

  • the firmware payload
  • URL from which the firmware can be downloaded
  • Version number
  • User readable friendly name
  • Optional notes field

SAM: As above. Is a person necessary? Can a robot make such assertions? Maybe a continuous integration and build process has no human-in-the-loop for firmware releases.

I think we need to think through the semantics of signing chains.

a) code can sign firmware

b) person can sign code that signs firmware

c) person signs code that firmware has already signed

plus of course we can embed further semantics in the command

Firmware Type Binding

A firmware can be bound to one or more device types.

If a firmware is bound to a device type, it implies that the firmware is compatible with this device type:

[PERSON-SK]("FIRMWARE-BINDING", https://devices.com/firmware-address, https://devices.com/-device-type)

Firmware Version Update

A firmware increment statement is used to describe the fact that a firmware has been superseded by a newer version:

[PERSON-SK]("FIRMWARE-VERSION-UPDATE", https://devices.com/firmware-address-new, https://devices.com/firmware-address-old)

The most up-to-date firmware is the asserted firmware version for which no version update statement names it as the superseded (old) version.
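
A minimal sketch of that rule, assuming the verified statements have already been reduced to plain data: asserted is a list of firmware URIs from FIRMWARE statements and updates is a list of (new, old) pairs in the same order as the FIRMWARE-VERSION-UPDATE tuple. The function name and the version URIs are hypothetical.

```python
def latest_firmware(asserted, updates):
    """Return asserted firmware URIs that no update statement supersedes."""
    # Each update is (new, old), matching the FIRMWARE-VERSION-UPDATE order.
    superseded = {old for _new, old in updates}
    return [fw for fw in asserted if fw not in superseded]


v1 = "https://devices.com/firmware/v1"
v2 = "https://devices.com/firmware/v2"
v3 = "https://devices.com/firmware/v3"

asserted = [v1, v2, v3]
updates = [(v2, v1), (v3, v2)]
print(latest_firmware(asserted, updates))
```

The function returns a list rather than a single value because nothing in the statements prevents two unsuperseded versions, for example parallel release branches.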

Device Type - Least Privilege Behaviour Definition

A behaviour is a definition of approved (least privilege) internet behaviour expected of this device.

The behaviour may be expressed as an IETF YANG statement.

Or we may consider more compact, usable expressions.

A behaviour is a signed document. It could be accessible directly through a URI or as a package.

TE: what is meant by 'package' here?

TODO: this all needs reconciling with MUD

[PERSON-SK]("BEHAVIOUR", https://devices.com/this-device-type, behaviour-doc)

SAM: What language will be used to describe behaviours? What will be the power of the language? E.g. will it need to be Turing Complete?

NA: not that complex. Thinking of the compressed behaviour below, and YANG.

Compressed behaviour descriptor

For readability, ease of expression and compactness we should consider a compressed description of behaviour.

The compressed behaviour description is a comma-separated values (CSV) list.

Each line is an allowed form of communication.

Each line is of the form:

[SOURCE-ADDRESS], [SOURCE-PORT], [DEST-ADDRESS], [DEST-PORT]

  • Addresses can be IP addresses or URIs
  • Addresses can support wild cards
  • Port specification can support RANGES and COMMA separated lists

Each rule is being interpreted in the context of a specific device (the constrained device).

The term [THIS] is used as short hand to represent the device under consideration.

The following dynamically evaluated variables are therefore available to the descriptor:

  • THIS-ADDRESS - the IP address of the device being considered
  • THIS-GATEWAY - the gateway address of the device
  • THIS-SUBNET - the subnet on which THIS device is registered
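
The following sketch shows one way such a descriptor could be parsed. Several details are assumptions rather than part of the draft: a port list containing commas is quoted so that it survives CSV splitting, "*" is accepted as "any port", and the addresses and rules are invented examples.

```python
import csv
import io


def parse_ports(spec):
    """Expand '80,443,8000-8002' into a sorted list of ints; '*' means any port."""
    if spec.strip() == "*":
        return None  # assumption: None stands for "any port"
    ports = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(p) for p in part.split("-", 1))
            ports.update(range(low, high + 1))
        else:
            ports.add(int(part))
    return sorted(ports)


def parse_behaviour(text, variables):
    """Parse descriptor lines and substitute the THIS-* variables."""
    rules = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        src, src_port, dst, dst_port = (field.strip() for field in row)
        rules.append({
            "src": variables.get(src, src),
            "src_ports": parse_ports(src_port),
            "dst": variables.get(dst, dst),
            "dst_ports": parse_ports(dst_port),
        })
    return rules


descriptor = "\n".join([
    'THIS-ADDRESS, *, *.devices.com, 443',
    'THIS-ADDRESS, *, THIS-GATEWAY, "53,123"',
])
variables = {
    "THIS-ADDRESS": "192.168.1.20",
    "THIS-GATEWAY": "192.168.1.1",
    "THIS-SUBNET": "192.168.1.0/24",
}
rules = parse_behaviour(descriptor, variables)
print(rules)
```

Wildcard addresses such as *.devices.com are kept as text here; matching them against observed traffic would be a separate step.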

The compressed descriptor should map precisely to a YANG descriptor

TO CHECK

Device Type - Updater

REMOVE: we have removed this, and replaced this by binding the URI source to the firmware assertion

The updater property identifies the "human usable" URI from which the most up-to-date firmware for this device can be accessed.

[PERSON-SK]("UPDATER",https://devices.com/this-device-type, https://devices.com/this-device-updater-uri)

Local device onboarding

If a device is formally provisioned/onboarded then critical information can be persisted and exposed at the gateway level.

[PERSON-SK]("DEVICE-ONBOARD", local://this-device-instance, {local-device-descriptor})

This process assumes a PERSON has explicitly onboarded the device to the home gateway.

TODO: we can look at other ways of persisting onboarding events

Local device detection

It is the responsibility of the gateway device to identify new candidate devices.

New device identification is an imperfect process.

[GATEWAY-SK]("DEVICE-DETECTION", local://this-device-instance, {local-device-evidence})

Device Recognition

Let us temporarily assume device recognition to be an out of band process.

The result of a recognition event is the assignment of a device instance to a device type

[AGENT-SK]("RECOGNISED", https://devices.com/this-device-type, local://this-device-instance)

We will need to consider different candidate instances. MAC address being the most obvious.

This model can support non-IP device types. For example LoRa device ID.

There is a whole family of "recognisers" to consider:

  • MAC address recogniser - would map a device instance to a high level device type ID - based on MAC address range
  • User installed recogniser - maps a device to a specific device type ID based on user install event
  • HTTPS Cert - maps device instance to device type based on the properties of the HTTPS certificate hosted on port 443
  • FIDO Cert - maps device instance to device type based on properties of the discoverable cert
  • CHIP Cert
  • Lower level networking certificates - such as those described in "Locally significant certificates"
  • QR Code hints - infer device type based on information found on a QR code sticker. (subset of user install event)
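
As an illustration of the simplest recogniser in the list, the sketch below maps a MAC address prefix to a high level device type and returns the unsigned payload of a RECOGNISED statement. The lookup table, the placeholder prefix AA:BB:CC, the type URI and the use of the MAC address as the local instance identifier are all assumptions.

```python
# Hypothetical table mapping MAC prefixes (OUIs) to high level device type URIs.
OUI_TO_TYPE = {
    "AA:BB:CC": "https://devices.com/example-manufacturer",
}


def recognise_by_mac(mac, table=OUI_TO_TYPE):
    """Return an unsigned RECOGNISED payload, or None if the prefix is unknown."""
    normalised = mac.upper().replace("-", ":")
    prefix = normalised[:8]  # first three octets, e.g. "AA:BB:CC"
    device_type = table.get(prefix)
    if device_type is None:
        return None
    # Same element order as the RECOGNISED statement: type first, then instance.
    return ("RECOGNISED", device_type, f"local://{normalised}")


print(recognise_by_mac("aa-bb-cc-01-02-03"))
print(recognise_by_mac("00:11:22:33:44:55"))
```

Signing the payload with the recogniser's key is omitted; the point is only that each recogniser reduces some evidence to the same kind of statement.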

SAM: What are the false alarm rates of such schemes?

No idea till we implement :)

The cert-backed recognisers (FIDO, CHIP etc.) should be high unless their certificates are being fraudulently issued.

Device descriptors workflow

The device descriptor workflow is designed to support the following properties:

  • Any individual <!---or organisation?--->can propose a device or make a statement about a device
  • Each of these statements stands as an autonomous cryptographically verifiable statement by the author
  • We envisage a system where these statements are centrally gathered (e.g. IoTSF)
  • But - the autonomous descriptors can exist in private ecosystems also
  • Trusted authorities/intermediaries can apply processes to the atomic statements, which may include preferential treatment by trusted authors.
  • Subsets and extracts can be made from the universe of device descriptors, as appropriate for different applications.

SAM: Does private mean not published?

SAM: Caution must be taken when dealing with sources that are not Authoritative (see the Global Identity Foundation for more on this.)

Blessed statements

Any statement made by one individual <!---organisation/entity?---> can be blessed by another. To bless a claim, means you agree with that claim.

SAM: Like "Chinese Whispers", blessing must not increase trust.

trust is subjective based on your trust base.

Claim graph inferencing

Claim graph inference is a process of drawing a set of usable conclusions from an interconnected graph of disparate claims.

The inferencing process is a subjective process, undertaken by the claim graph interpreter.

Each claim graph interpreter will have its own notions of who to trust. These a priori notions of trust can also be manifested as individual claims held by the interpreter.

The inferred facts are determined by performing an inferencing function on a combination of:

  • the presented claim graph
  • the interpreter-held claims representing a priori notions of trust

For example the interpreter may trust only statements made by people from Organisation A and Organisation B and ignore all other statements.

Or it may trust Org A and Org B and any statement blessed by Person X irrespective of their current organisational affiliation.
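
The two example policies above can be sketched as a simple filter. The data shapes are assumptions made for illustration: each claim is a dictionary with an id, an author and the author's organisation, and each blessing is a (blesser, claim id) pair. A real interpreter would first verify every signature.

```python
def accepted_claims(claims, blessings, trusted_orgs, trusted_blessers):
    """Keep claims from trusted organisations or blessed by a trusted person."""
    blessed_ids = {claim_id for blesser, claim_id in blessings
                   if blesser in trusted_blessers}
    return [c for c in claims
            if c["org"] in trusted_orgs or c["id"] in blessed_ids]


claims = [
    {"id": "c1", "author": "alice@org-a.com", "org": "org-a.com"},
    {"id": "c2", "author": "carol@org-c.com", "org": "org-c.com"},
    {"id": "c3", "author": "dave@org-d.com", "org": "org-d.com"},
]
blessings = [("person-x", "c2"), ("person-y", "c3")]

# Policy 1: trust only Organisation A and Organisation B.
policy_1 = accepted_claims(claims, blessings, {"org-a.com", "org-b.com"}, set())
# Policy 2: additionally trust anything blessed by Person X.
policy_2 = accepted_claims(claims, blessings, {"org-a.com", "org-b.com"}, {"person-x"})
print([c["id"] for c in policy_1], [c["id"] for c in policy_2])
```

Because the trusted sets are inputs, two interpreters can reach different conclusions from the same claim graph, which is the subjectivity described above.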

SAM: SecPAL (Security Policy Assertion Language) or similar should guide the design of an inferencing system.

IKP: How do you revoke/lower confidence/trust in a person/organisation's statements (e.g. if they have been hacked or taken over by a disreputable organisation)?

TODOs:

  • Flesh out the different ways of creating a person

  • Flesh out the different ways of "authorising" a person

  • Separate out the data model (structure of claim) from signing method

  • Check JSON vs CDDL for expressing claim

  • Full examples for Compressed Behaviour Description

  • Lots more on inferencing