Decentralized ActivityPub Draft

This document gives an overview of a stack of proposals that enable portability of identity and object storage within ActivityPub, through the use of DIDs and a proposed custom DID method.

The proposed custom DID method, did:fedi, is intended to be a self-certifying portable identity that provides the features of:

This is currently a set of experimental proposals and is subject to change as required. Implementing the custom DID method is not required; it serves as an opportunity to craft and adjust a custom DID method for use within ActivityPub, if necessary.

This is an unofficial set of proposed extensions to ActivityPub that presently bears no official status, recognition, nor participation by any standards body, working group, or committee. It currently serves as an exploratory draft that is subject to major breaking changes through development.

@TODO Omit W3C notices

did:fedi Specification

The method currently defines only three record types: create, update, and revoke. These record types and their respective serialization schemas are scoped to the 'variant' in use (currently only fedi:0), which allows breaking changes to be tried under different variant names while the DID method is being developed.

Creation

Genesis Record

Start with an initial skeleton JSON object as follows, which shall serve as our in-memory fediRecord:

Next, define the creation parameters object (creationParams), based on the preferences desired for the DID. The required parameters and their respective options are as defined:

canon
Defines the canonicalization scheme used for record signing and DID generation. The only option currently allowed is "jcs", referring to the JSON Canonicalization Scheme as defined in [[RFC8785]].
hash
Defines the final hash algorithm to be used to generate the did:fedi identifier, after signing. Currently allowed options are: sha256
length
Defines the length of the output hash, in bytes. The value may be less than the hash algorithm's standard output size, in which case the hash is truncated after generation. Hash sizes below 15 bytes (120 bits) are strongly discouraged, and implementations are encouraged to reject weak hash sizes.
encode
Defines the canonical base encoding of the desired resulting DID, for the DID-specific identifier portion. Currently allowed options are: base32, base58btc, base64url

An example of a valid set of parameters is demonstrated below.

Assign the parameters object as a value of the params property in the fediRecord object.
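The parameter options above can be illustrated with a short Python sketch. The option names and allowed values come from the list above; the length of 20, the choice of base58btc, the empty stand-in for the skeleton record, and the helper name validate_params are assumptions made only for this example.

```python
# Allowed option values, taken from the parameter descriptions above.
ALLOWED_CANON = {"jcs"}
ALLOWED_HASH = {"sha256": 32}  # algorithm name -> full digest size in bytes
ALLOWED_ENCODE = {"base32", "base58btc", "base64url"}
MIN_LENGTH = 15  # the draft strongly discourages hashes below 15 bytes


def validate_params(params: dict) -> None:
    """Raise ValueError if params uses an option outside the lists above."""
    if params.get("canon") not in ALLOWED_CANON:
        raise ValueError("unsupported canon")
    if params.get("hash") not in ALLOWED_HASH:
        raise ValueError("unsupported hash")
    if params.get("encode") not in ALLOWED_ENCODE:
        raise ValueError("unsupported encode")
    length = params.get("length")
    if not isinstance(length, int) or isinstance(length, bool):
        raise ValueError("length must be an integer")
    if not MIN_LENGTH <= length <= ALLOWED_HASH[params["hash"]]:
        raise ValueError("length out of range")


# Hypothetical example values; only the option names come from the draft.
creation_params = {
    "canon": "jcs",
    "hash": "sha256",
    "length": 20,
    "encode": "base58btc",
}
validate_params(creation_params)

# Stand-in for the in-memory fediRecord skeleton, which is not reproduced here.
fedi_record = {}
fedi_record["params"] = creation_params
```

Rejecting lengths below 15 bytes outright is a design choice of this sketch; the draft only encourages implementations to do so.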

Rotation Key Generation and Endorsement

Generate two Ed25519 keypairs for certifying the genesis record and any future changes. Each public key is encoded and endorsed as follows:

  1. Let rawPublicKey be the public key bytes of the generated Ed25519 key
  2. Let multicodecPubkey be the result of concatenating the bytes 0xed, 0x01, and the bytes of rawPublicKey
  3. Let armoredPubkey be the result of performing Base64url encoding on multicodecPubkey with padding characters omitted.
  4. Let multikey be the result of prepending the character "u" to the armoredPubkey string value.
  5. Let rotationKey be the result of creating a map object, where id has a short random unique string, and key has the value of multikey
  6. Append rotationKey to the list stored in fediRecord.rotationKeys
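The encoding steps above can be sketched in Python using only the standard library. Key generation itself is left to an Ed25519 library of the implementer's choice, so the example uses placeholder bytes in place of a real public key. The helper name make_rotation_key and the use of secrets.token_urlsafe for the id are assumptions of this sketch.

```python
import base64
import secrets

ED25519_PUB_PREFIX = bytes([0xED, 0x01])  # prefix bytes from step 2


def make_rotation_key(raw_public_key: bytes) -> dict:
    """Steps 2-5: wrap a raw Ed25519 public key as a rotationKeys entry."""
    if len(raw_public_key) != 32:
        raise ValueError("Ed25519 public keys are 32 bytes long")
    multicodec_pubkey = ED25519_PUB_PREFIX + raw_public_key
    # Base64url without padding characters (step 3).
    armored_pubkey = (
        base64.urlsafe_b64encode(multicodec_pubkey).rstrip(b"=").decode("ascii")
    )
    multikey = "u" + armored_pubkey  # step 4
    # The id format is an assumption; the draft only asks for a short,
    # random, unique string.
    return {"id": secrets.token_urlsafe(6), "key": multikey}


# Placeholder bytes stand in for a public key produced by an Ed25519 library.
raw_public_key = bytes(range(32))
fedi_record = {"rotationKeys": []}
fedi_record["rotationKeys"].append(make_rotation_key(raw_public_key))  # step 6
```

Because the first two bytes are fixed, every multikey produced this way begins with the characters "u7Q".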

Note that Ed25519 support in WebCrypto is fairly new. Only Firefox 129+ and Safari 17+ have unprefixed support, at the time of writing. Polyfills are needed to support other/older browsers, but polyfills do carry the risk of side-channel attacks due to the non-deterministic performance of JavaScript code execution.

Rotation Keys are only used internally within the DID method for certifying and updating the DID, and are not presented in any DID Document resolution output.

Endorsing User Keys

Public keys that are to be publicly endorsed within the resolved DID Document (and thus used by any services that consume the DID), for roles such as authentication, proof assertion, key exchange, and capability delegation/invocation, are listed in the userKeys list in a fedi record. These keys may be used and referenced in HTTP Signatures or in Object Integrity Proofs [[FEP-8b32]].

Use Label   Description
assert      Corresponds to assertionMethod in a Controller Document
auth        Corresponds to authentication in a Controller Document
keyexch     Corresponds to keyAgreement in a Controller Document
capdel      Corresponds to capabilityDelegation in a Controller Document
capinv      Corresponds to capabilityInvocation in a Controller Document

An example of a valid public key object, to be appended to the userKeys list in the fediRecord, is demonstrated as follows:

Services

It is strongly encouraged to delegate storage of ActivityPub objects and media to external services, rather than storing them directly "on-chain" within the did:fedi protocol infrastructure. The function of did:fedi is mainly to serve as the "DNS and Certificate Authority of the identity", not as arbitrary object storage. An implementation may handle identity and storage within the same application and server, provided that the two responsibilities are cleanly separated. Media and ActivityPub objects are recommended to be listed under separate services, even if they resolve to the exact same serviceEndpoint. DID URL resolution and interaction with serviceEndpoints are described further in DID URL Resolution.

The service property in a did:fedi record directly correlates to the service property value presented upon DID resolution. A service can be either ActivityPubService for a server that can handle inbox activity, other dynamic behavior of an ActivityPub implementation, authentication-gated posts, etc; or a MediaStorageService for plain media and object storage. An example of a service value is demonstrated below:

The id of a service entry may be any short URL-safe string. It is an important component in distinguishing between services in a DID URL. The serviceEndpoint may point to different locations over time, as a user moves between platforms and hosting providers, while the DID URL remains a stable identifier for media and ActivityPub content. Renaming a service to a different id is discouraged, as it will break any links that depend on that id.

Signing

When a did:fedi record is populated with the necessary public keys and information to be published, (... @TODO)

  1. Let fediRecord be the populated map object that is to be signed
  2. Set the sig property of fediRecord to null
  3. If fediRecord is not the first record (a genesis record), and instead an update operation, then:
    1. If the action property of fediRecord is set to "create", change the value to "update"
    2. Delete the params property from fediRecord, if it exists
    3. Delete the did property from fediRecord, if it exists
    4. Let lastHashRaw be the SHA-256 hash of the JCS-canonicalized previous record
    5. Let lastMultihash be the result of concatenating the bytes 0x12, 0x20, and the bytes of lastHashRaw
    6. Let lastHash be the result of: base64url encoding lastMultihash, and prepending the character "u"
    7. Set the last property of fediRecord to the value of lastHash
  4. Set the when property of fediRecord to the current datetime in ISO 8601 format, expressed in UTC (Zulu time) with a "Z" timezone designator only, and with no sub-second values included
  5. Let canonicalRecord be the result of JCS canonicalization of fediRecord
  6. Let signature be the result of cryptographically signing canonicalRecord, with a private key that corresponds with a public key listed under rotationKeys in the current in-memory did:fedi record
  7. Let sigObj be the value of a map containing a property named sig which holds the value of signature, and a property named id that corresponds to the id value of the public key that corresponds with the private key that was used for creating signature
  8. Set the sig property of the fediRecord to the value of sigObj
  9. If this is the first record, then:
    1. Let finalDid be the result of performing JCS canonicalization, hashing, and encoding based upon the configuration values specified in params
    2. Assign a property named did to fediRecord, with a value of the string "did:fedi:" concatenated with the value of finalDid
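The deterministic parts of the signing procedure (steps 2 through 5) can be sketched in Python as follows. Two loud assumptions apply. First, the canonicalize helper only approximates JCS using the standard json module and is adequate only for records made of strings, integers, lists, maps, and null; a real RFC 8785 library should be used in practice. Second, the Ed25519 signing in steps 6 through 8 and the DID generation in step 9 are left out. All function names are hypothetical.

```python
import base64
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def canonicalize(record: dict) -> bytes:
    """Approximate JCS. NOT a full RFC 8785 implementation: float formatting
    and UTF-16 key ordering are not handled."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def last_hash(previous_record: dict) -> str:
    """Steps 3.4-3.6: SHA-256 digest, 0x12 0x20 prefix, base64url, 'u' prefix."""
    digest = hashlib.sha256(canonicalize(previous_record)).digest()
    multihash = bytes([0x12, 0x20]) + digest
    return "u" + base64.urlsafe_b64encode(multihash).rstrip(b"=").decode("ascii")


def timestamp(now: Optional[datetime] = None) -> str:
    """Step 4: UTC datetime with a 'Z' designator and no sub-second part."""
    now = now or datetime.now(timezone.utc)
    now = now.astimezone(timezone.utc).replace(microsecond=0)
    return now.strftime("%Y-%m-%dT%H:%M:%SZ")


def prepare_for_signing(
    record: dict,
    previous_record: Optional[dict] = None,
    now: Optional[datetime] = None,
) -> bytes:
    """Steps 2-5: mutate record in place and return the bytes to be signed."""
    record["sig"] = None
    if previous_record is not None:  # update operation, not a genesis record
        if record.get("action") == "create":
            record["action"] = "update"
        record.pop("params", None)
        record.pop("did", None)
        record["last"] = last_hash(previous_record)
    record["when"] = timestamp(now)
    return canonicalize(record)
```

The returned bytes would then be passed to an Ed25519 signing function together with a private key whose public key is listed under rotationKeys.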

Verifying

@TODO write descriptive form of procedures

Submission

The parameters for an HTTP request for submission of a record are as follows:

Retrieval

The parameters for requesting the latest record for a DID are as follows:

@TODO Filter parameters

Record History

To query the full record history for a did:fedi identity, or to synchronize the full history all at once, the same criteria as for submission and retrieval apply, except that the value application/didfedi+jsonlines is used instead of application/didfedi+json. The POST body of a submission, or the response body of a query, then consists of multiple records delimited in JSON Lines format (each record terminated by a line break character, 0x0A), in ascending chronological order (the genesis record being first).
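As an illustration of the JSON Lines framing described above, the following Python sketch serializes and parses a record history. The helper names are hypothetical, and the HTTP request itself is not shown.

```python
import json

HISTORY_MEDIA_TYPE = "application/didfedi+jsonlines"  # value from the text above


def dump_history(records: list) -> bytes:
    """Serialize records, genesis record first, one JSON document per line."""
    lines = [json.dumps(r, separators=(",", ":")).encode("utf-8") for r in records]
    # Each record is terminated by a single 0x0A byte.
    return b"".join(line + b"\n" for line in lines)


def load_history(body: bytes) -> list:
    """Parse a JSON Lines body; empty lines are skipped."""
    return [json.loads(line) for line in body.split(b"\n") if line.strip()]
```

Note that json.dumps escapes line breaks inside string values, so a record can never span more than one line.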

Update

Any subsequent updates to a DID are identified by an action property set to update. Updates SHOULD NOT contain a params or did property. The same signing and verifying procedures outlined in the Signing and Verifying sections apply. Update records MUST contain a hash of the previous record, stored in the last property before signing. The when property of an update MUST be a later datetime than that of the previous record.

Updates to the DID are signified simply by the omission, replacement, or addition of values in the record. An update MUST have at least one valid public key listed in rotationKeys.
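The structural rules for update records can be sketched as a small Python check. This is only an illustration: it does not verify signatures, it treats any non-empty rotationKeys list as satisfying the key requirement, and it assumes the caller has already computed the expected hash of the previous record. The function name check_update is hypothetical.

```python
from datetime import datetime

WHEN_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # UTC, "Z" designator, no sub-seconds


def check_update(update: dict, previous: dict, expected_last: str) -> list:
    """Return a list of rule violations; an empty list means all checks passed."""
    problems = []
    if update.get("action") != "update":
        problems.append("action must be 'update'")
    if update.get("last") != expected_last:
        problems.append("last does not match the hash of the previous record")
    if not update.get("rotationKeys"):
        problems.append("at least one rotation key is required")
    try:
        newer = datetime.strptime(update["when"], WHEN_FORMAT)
        older = datetime.strptime(previous["when"], WHEN_FORMAT)
        if newer <= older:
            problems.append("when must be later than the previous record")
    except (KeyError, TypeError, ValueError):
        problems.append("when is missing or malformed")
    return problems
```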

Revocation

The permanent revocation of an identity is signified by a final signed record where the action property is set to revoke. No subsequent records may be accepted after revocation.

DID URL Resolution

It is strongly encouraged that implementations of the did:fedi method only implement resolution for the base identifier, and not perform object storage using paths identified under the identifier (e.g. NOT did:fedi:z6x5vHr6sWxo4qtYamucHWRuug/post/hello-world).

Instead, it is strongly encouraged to treat DID Resolution essentially as a DNS-like system for referencing different services, such as storage providers, which are inferred from the DID Document (as a byproduct of DID Resolution). The justifications for this design decision are:

DID URL dereferencing reuses the behavior already described in DID Resolution, with two required URL parameters: service and relativeRef. service corresponds to the id of a service description listed under the service list of a DID Document. relativeRef refers to the remaining path that locates the resource, when appended to a serviceEndpoint URL.

Therefore, given the following resolved DID Document:

along with a DID URL of:

did:fedi:z6x5vHr6sWxo4qtYamucHWRuug?service=ap&relativeRef=outbox

would dereference to a final resolvable URL of

https://example.social/user/bob/outbox

This behavior may closely align to the proposal [[FEP-e3e9]] (Actor-Relative URLs).
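The dereferencing described above can be sketched in Python. The DID Document used here is a hypothetical stand-in chosen so that the example DID URL yields the URL shown above; its service id form ("#ap") and its serviceEndpoint value are assumptions. The sketch appends relativeRef to the endpoint with a single slash between them, which is one reading of "appended" in the text.

```python
from urllib.parse import parse_qs


def dereference(did_url: str, did_document: dict) -> str:
    """Map a DID URL with service and relativeRef parameters to an HTTP(S) URL."""
    _did, _, query = did_url.partition("?")
    params = parse_qs(query)
    if "service" not in params or "relativeRef" not in params:
        raise ValueError("service and relativeRef are both required")
    service_id = params["service"][0]
    relative_ref = params["relativeRef"][0]
    for service in did_document.get("service", []):
        sid = service.get("id", "")
        # Assumption: accept a bare id ("ap") or a fragment form ("#ap").
        if sid == service_id or sid.endswith("#" + service_id):
            endpoint = service["serviceEndpoint"]
            return endpoint.rstrip("/") + "/" + relative_ref.lstrip("/")
    raise LookupError("no service with id " + service_id)


# Hypothetical resolved DID Document, for illustration only.
did_document = {
    "id": "did:fedi:z6x5vHr6sWxo4qtYamucHWRuug",
    "service": [
        {
            "id": "#ap",
            "type": "ActivityPubService",
            "serviceEndpoint": "https://example.social/user/bob/",
        }
    ],
}

url = dereference(
    "did:fedi:z6x5vHr6sWxo4qtYamucHWRuug?service=ap&relativeRef=outbox",
    did_document,
)
```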

Comparison to other DID methods

Other DID methods can absolutely be used within ActivityPub. There are just a few minor preferential motives for constructing a different DID method rather than using the current proposals or standards:

Compatibility URLs

In order to enable non-DID-aware applications to still be able to interact with DID-aware ActivityPub applications, DID-aware implementations MAY append a DID identifier to the URL endpoint of a DID resolver that natively supports the DID method of the identifier.

Additionally, in order for DID-aware applications to still make use of the portability of DIDs, an algorithm is proposed to extract a DID identifier back to its original form.

For each identifier in an ActivityPub object that may refer to an object, evaluate the identifier, objectId, as follows:

  1. If objectId begins with the string "did:", then handle the identifier as a DID, as-is, and skip any further extraction steps,
  2. Else if objectId begins with the string "http://" or "https://", and contains the string "/did:", then perform the following:
    1. Split the string objectId after the / in the first occurrence of "/did:", storing the first half as resolverHint, and the second half as compatDid
    2. If the DID, compatDid, has been resolved before, and the URL (without query string) is listed in alsoKnownAs in the last resolved DID Document, then compatDid may be considered as the canonical identifier in place of objectId
    3. Else if the DID, compatDid, has not been resolved before, then attempt to resolve the DID using a trusted DID resolver that natively supports the DID method, with resolverHint as a resolution parameter, if the DID method supports discovery via HTTP/HTTPS.
      • If the DID is successfully resolved, and the resulting DID Document contains a reference to objectId (without query string) in the alsoKnownAs list of the resolved DID Document, then compatDid may be considered as the canonical identifier in place of objectId
  3. Else, if any operation fails, or if the identifier does not match any of the criteria above, then resume with the rest of any other identifier evaluation procedures.
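The extraction steps above can be sketched in Python. The resolve callback is a hypothetical stand-in for a trusted, method-native resolver (or a cache of previously resolved documents, which merges steps 2.2 and 2.3 into one lookup); it takes compatDid and resolverHint and returns a DID Document or None. The draft does not say how a query string on objectId affects compatDid, so this sketch leaves it attached to compatDid and strips it only for the alsoKnownAs comparison.

```python
from typing import Callable, Optional, Tuple


def split_compat_url(object_id: str) -> Optional[Tuple[str, str]]:
    """Step 2.1: return (resolver_hint, compat_did), or None if not applicable."""
    if not object_id.startswith(("http://", "https://")):
        return None
    index = object_id.find("/did:")
    if index == -1:
        return None
    # Split after the "/" so the hint keeps its trailing slash.
    return object_id[: index + 1], object_id[index + 1:]


def canonical_id(
    object_id: str, resolve: Callable[[str, str], Optional[dict]]
) -> str:
    """Return the DID if the compatibility URL is confirmed, else object_id."""
    if object_id.startswith("did:"):  # step 1
        return object_id
    parts = split_compat_url(object_id)
    if parts is None:  # step 3: fall through to other procedures
        return object_id
    resolver_hint, compat_did = parts
    url_without_query = object_id.split("?", 1)[0]
    try:
        document = resolve(compat_did, resolver_hint)
    except Exception:  # any failure falls back to the original identifier
        return object_id
    if document and url_without_query in document.get("alsoKnownAs", []):
        return compat_did
    return object_id
```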

Note: The resolver hint is intended only for discovering further details about certain self-certified, key-based identities, as part of method-specific protocol use. It is not to be used as a blindly trusted general-purpose remote DID resolver, as that would break the entire DID trust model.

DID Update Hints

To enable a push-based method of notifying interested servers (of DID-aware ActivityPub applications) about changes to a DID, especially for mechanisms that do not use a distributed ledger table, it may be worth utilizing the existing push-based delivery procedures within ActivityPub. This also helps servers to be promptly notified of the new location of an actor, even if the original server is offline or is adversarially withholding the latest DID-specific record information.

A proposed serialization of transporting DID-specific update information may be as follows:

This is a very incomplete stand-in example, and will substantially change per feedback.

For simplicity, it MAY be presented as a transient activity with no identifier. Upon reception, an implementation that natively supports the DID method SHALL extract the map from the content property, remove the "@type" property, and evaluate the payload as if it were submitted to its public DID resolution endpoint. If the method-specific payload has inline self-certifying information (e.g. NOT did:web?), HTTP Signature verification is not required nor expected.

Security

Signature Malleability

Depending on the implementation, Ed25519 signatures (as with many other elliptic curve algorithms) may be malleable: an attacker can manipulate mathematical characteristics of a digital signature to produce another valid signature value, with no knowledge of the private key. This allows an attacker to modify an existing record and resend the modified copy as if it were another distinct record.

One possible attack scenario is an attacker discovering a new identity that isn't widely known yet, tweaking a copy of the first record update's signature to a different value, and publishing the altered record chain to other servers that aren't aware of the identity yet. Any legitimate subsequent updates to that record may then be rejected by those servers, as the last hash will differ, breaking the chain (unless resynced from before the fork, or other mitigations apply). Therefore, some strict constraints for the Ed25519 signature need to be defined and implemented before this is adopted in production-ready environments.

Alternatively, other options may be as simple as prohibiting two records from having the same when datetime value, and rebasing to whichever hash chain history is longer. A procedure to walk back and resync the record history would still have to be designed, though.
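As one illustration of the kind of strict signature constraint mentioned above, the following Python sketch rejects signatures whose S component is not below the Ed25519 group order, a range check that RFC 8032 requires during verification. This is offered only as an example of a candidate constraint, not as something this draft has adopted, and it does not replace full signature verification. The function name is hypothetical.

```python
# Order of the Ed25519 base-point subgroup, as given in RFC 8032.
ED25519_ORDER = 2**252 + 27742317777372353535851937790883648493


def has_canonical_s(signature: bytes) -> bool:
    """True if the S half of a 64-byte signature is below the group order."""
    if len(signature) != 64:
        return False
    # A signature is R (32 bytes) followed by S (32 bytes, little-endian).
    s = int.from_bytes(signature[32:], "little")
    return s < ED25519_ORDER
```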

Encoding Normalization

Because some implementations handle base encoding/decoding schemes (base64, etc.) permissively, they may accept inputs with trailing bits that are discarded from the decoded output, so that different inputs produce the same output. This can create complications similar to the replay attacks mentioned in Signature Malleability.
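One way to guard against this, sketched below in Python, is to re-encode the decoded bytes and require the result to match the input exactly. The function name is hypothetical, and the sketch assumes unpadded base64url as used elsewhere in this document.

```python
import base64


def strict_b64url_decode(text: str) -> bytes:
    """Decode unpadded base64url, rejecting any non-canonical spelling."""
    padded = text + "=" * (-len(text) % 4)
    decoded = base64.urlsafe_b64decode(padded)
    canonical = base64.urlsafe_b64encode(decoded).rstrip(b"=").decode("ascii")
    if canonical != text:
        # Covers stray trailing bits, padding characters, and ignored symbols.
        raise ValueError("non-canonical base64url input")
    return decoded
```

For example, "QQ" and "QR" differ only in bits that are discarded on decoding, so a permissive decoder may map both to the same single byte; the check above accepts only "QQ".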

Verification Bloat

Because the data structure is a simple hash chain, the record history all the way back to creation is required in order to validate any new additions. It is therefore not recommended to use the DID method for storing any highly volatile information; the envisioned update history for a typical user over 5 years should average only about 8 records. Any highly volatile information should be stored out-of-band, such as through a delegated service that is dereferenced through a DID URL. Rate limits may end up being standardized as a precaution, such as a minimum timespan between record updates (based on the when value).

Definitions

Multibase

A scheme of encoding data in a different base encoding (base64, base32, and others), while preceding the sequence with a single character that corresponds with which base encoding was used. [[MULTIBASE]]

Multicodec

A binary encoding that uses an unsigned integer (that gets encoded as a varint sequence) to describe the type of data that follows afterwards.

In order to make Multicodec-described data more safely transmittable over binary-unsafe mediums, it's usually paired with Multibase, where the whole Multicodec value (and its following data) is encoded in base encodings like Base64, Base32, and others. [[MULTICODEC]]

Multikey

A means of serializing key information (primarily public keys) in a structured format utilizing Multicodec identifiers and Multibase encoding. Multikey is described in the [[controller-document]] specification.

Varint

A varint (variable-length integer) is a method of compactly encoding an unsigned integer in the fewest bytes needed, while making it possible to infer where the encoded bytes of the integer start and end, rather than reserving an extra byte to express the integer length. A sequence is evaluated byte-wise, with the first byte holding the least significant bits of the integer, stepping up 7 bits for each subsequent byte.

Within a byte, bits 1-7 are used to store a portion of the integer, while the 8th bit (0x80) signifies if another byte follows (1), or if it's the last byte in the sequence (0).

Varints are the foundational encoding for multicodec (as part of the suite of Multiformats).
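The varint layout described above can be sketched in Python as follows. The function names are hypothetical, and the sketch does not enforce any maximum encoded length.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode an unsigned integer, least significant 7 bits first."""
    if value < 0:
        raise ValueError("varints encode unsigned integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: another byte follows
        else:
            out.append(byte)  # last byte: continuation bit is clear
            return bytes(out)


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Return (value, number_of_bytes_consumed) from the start of data."""
    value = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            return value, index + 1
    raise ValueError("truncated varint")
```

Encoding the integer 0xed this way yields the two bytes 0xed, 0x01, which is the prefix used for Ed25519 public keys in the rotation key steps, while values below 0x80 such as 0x12 and 0x20 encode to a single byte.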