Getting started with the Storage Kit.
There are different ways to get started with the Storage Kit:
Docker: The quickest way to launch and try out the latest builds. No build environment required.
Local Build: Build and run the Storage Kit locally. Requires a JDK 16 build environment, including Gradle.
Dependency (JVM): The Storage Kit can be used directly as a JVM dependency via Maven or Gradle.
Where to use the Storage Kit:
CLI Tool: The Storage Kit comes with a command-line interface (CLI) tool, which offers a rich set of commands covering the entire functionality the Storage Kit provides. The CLI tool can be used by running the Docker container or the executable produced by the local build.
REST API: If you want to run the Storage Kit as a service, your application can access all functionality via the REST API.
Dependency (JVM): The Storage Kit can be used directly as a JVM dependency via Maven or Gradle.
The following sections describe the architecture from a systemic and functional perspective; components and dependencies are also described.
The client REST API is still in beta testing.
Currently it is recommended to include the JVM client library as a dependency using Maven or Gradle.
Please let us know if you are interested in using the client in non-JVM languages, so we can prioritize this feature accordingly.
| Method | Route | Description |
|---|---|---|
| GET | /api-routes | Shows a simple routing table |
| GET | /api-documentation | Contains the OpenAPI documentation/definition |
| GET | /swagger | Shows a Swagger API interface |
| GET | /redoc | Shows a ReDoc API interface |
| POST | /edvs | Create an EDV |
| POST | /edvs/{edv-id}/docs | Create a document in an EDV |
| POST | /edvs/{edv-id}/docs/search | Perform an encrypted search |
| GET | /edvs/{edv-id}/docs/{doc-id} | Retrieve a specific document |
| PATCH | /edvs/{edv-id}/docs/{doc-id} | Update a specific document |
| DELETE | /edvs/{edv-id}/docs/{doc-id} | Delete a specific document |
| GET | /capabilities | Displays the capabilities of a server |
| GET | /configuration | Displays the configuration of a server |
| WEBSOCKET | /edvs/{edv-id}/notifications | Interact with the notification API |
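For illustration, a hypothetical client call to the POST /edvs route could look like the following Kotlin sketch. The host, port and request body shape are assumptions; the authoritative schema is served at /api-documentation.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

fun main() {
    // Assumed host/port; the body shape is illustrative only --
    // consult the OpenAPI definition at /api-documentation for the real schema.
    val request = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:7000/edvs"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString("""{"sequence": 0}"""))
        .build()

    val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
    println("${response.statusCode()}: ${response.body()}")
}
```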
Run different functionalities of the Storage Kit by executing individual commands.
Make sure you have Docker or a JDK 16 build environment, including Gradle, installed on your machine.
1. Clone the project
2. Change folder
3. Build the Docker container
4. Run the project
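The concrete commands are omitted above; a plausible sequence, assuming the public walt-id repository URL and an arbitrary local image tag, is:

```bash
# 1. Clone the project (repository URL assumed)
git clone https://github.com/walt-id/waltid-storagekit.git
# 2. Change folder
cd waltid-storagekit
# 3. Build the Docker container (the image tag is an arbitrary local name)
docker build -t waltid/storagekit .
# 4. Run the project
docker run -it waltid/storagekit
```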
Find other build options for Docker here.
1. Clone the project
2. Change folder
3. Build the project (other build options)
4. Set an alias
To make the executable more convenient to use, you can also set an alias as follows:
5. Use the CLI
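The individual commands are omitted above; a plausible end-to-end sequence, assuming the public walt-id repository URL, is:

```bash
# 1. Clone the project (repository URL assumed)
git clone https://github.com/walt-id/waltid-storagekit.git
# 2. Change folder
cd waltid-storagekit
# 3. Build the project with the wrapper script
./storagekit.sh build
# 4. Set an alias to the extracted executable (path taken from the build guide)
alias storagekit="$PWD/build/distributions/waltid-storagekit-1.0-SNAPSHOT/bin/waltid-storagekit"
# 5. Use the CLI
storagekit help
```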
See also: Client CLI examples
This layer consists of high-level server-side functions that work on top of the encrypted credentials.
It is helpful if data storage providers are able to notify clients when changes to persisted data occur. A server may optionally implement a mechanism by which clients can subscribe to changes in the vault.
Integrity protection is needed to prevent modification attacks. Some form of integrity protection must be integrated (e.g. HASHLINKS) to prevent the successful execution of such attacks.
This layer consists of a system that is capable of sharing data among multiple entities, of versioning and replication, and of performing privacy-preserving search in an efficient manner.
To enable privacy-preserving querying (where the search index is opaque to the server), the client must prepare a list of encrypted index tags (which are stored in the Encrypted Resource, alongside the encrypted data contents).
At least one versioning mechanism has to be supported. Replication is done by the client, not by the server (since the client controls the keys, knows about which other servers to replicate to, etc.). The versioning strategy may be implicit ("last write wins").
An individual vault's choice of authorization mechanism determines how a client shares resources with other entities (authorization capability link or similar mechanism).
This layer consists of a client-server system with capabilities of encrypting data in transit and at rest.
When a vault client makes a request to store, query, modify, or delete data in the vault, the server validates the request. Since the actual data and metadata in any given request is encrypted, such validation is necessarily limited and largely depends on the protocol and the semantics of the request.
The mechanism a server uses to persist data, such as storage on a local, networked, or distributed file system, is determined by the implementation. The persistence mechanism is expected to adhere to the common expectations of a data storage provider, such as reliable storage and retrieval of data.
The configuration allows the client to perform capability discovery regarding things like the authorization, protocol, and replication mechanisms used by the server.
When a client makes a request to store, query, modify, or delete data in the vault, the server enforces any authorization policy that is associated with the request.
Large data must be chunked into sizes that are easily managed by a server. It is the responsibility of the client to set the chunk size of each resource and to split large data into manageable chunks for the server. It is the responsibility of the server to deny requests to store chunks larger than it can handle. Each chunk is encrypted individually using authenticated encryption.
The process of storing encrypted data starts with the creation of a Resource by the client. If the data is less than the chunk size, it is embedded directly into the content. Otherwise, the data is sharded into chunks by the client (see next section), and each chunk is encrypted and sent to the server.
The Encrypted Resource is then created; if the data was sharded into chunks, this happens after the individual chunks have been written to the server.
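A minimal sketch of this storage flow, assuming AES-GCM as the authenticated encryption scheme (the concrete cipher and the transport to the server are not specified here):

```kotlin
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.SecretKey
import javax.crypto.spec.GCMParameterSpec

const val CHUNK_SIZE = 1_048_576 // 1 MiB maximum chunk size

// Encrypt one chunk with authenticated encryption (AES-GCM assumed here).
fun encryptChunk(fileKey: SecretKey, chunk: ByteArray): Pair<ByteArray, ByteArray> {
    val iv = ByteArray(12).also { SecureRandom().nextBytes(it) }
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, fileKey, GCMParameterSpec(128, iv))
    return iv to cipher.doFinal(chunk)
}

// Embed small data directly; otherwise shard it and encrypt each chunk individually.
fun storeResource(fileKey: SecretKey, data: ByteArray) {
    if (data.size < CHUNK_SIZE) {
        val (iv, ciphertext) = encryptChunk(fileKey, data)
        // ... embed iv + ciphertext directly into the Encrypted Resource ...
    } else {
        var offset = 0
        while (offset < data.size) {
            val end = minOf(offset + CHUNK_SIZE, data.size)
            val (iv, ciphertext) = encryptChunk(fileKey, data.copyOfRange(offset, end))
            // ... send each encrypted chunk to the server; ordering is recorded
            // in the Resource Structure (transport not shown) ...
            offset = end
        }
    }
}
```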
Learn what the Storage Kit is.
In a nutshell, the Storage Kit is a zero-trust solution for all things data storage and sharing.
The main purpose of the Storage Kit is to provide a highly secure and private infrastructure layer for data storage and data sharing for any application.
The importance of the Storage Kit becomes particularly evident in the context of Decentralized Identity and will be a core component for developers who are building solutions and use cases that involve keys, personal data or other secrets.
The following sections elaborate the basics of the Storage Kit to help you get started.
Here are the most important things you need to know about the Storage Kit:
It is written in Kotlin/Java. It can be directly integrated (Maven/Gradle dependency) or run as a RESTful web service. A CLI tool allows you to run all functions manually.
It is open source (Apache 2). You can use the code for free and without strings attached.
It abstracts complexity and low-level functionality via different interfaces (CLI, APIs).
It is a holistic solution that allows you to build use cases “end-to-end”. There is no need to research, combine or tweak different libraries to build pilots or production systems.
It is modular, composable and built on open standards, allowing you to customize and extend functionality with your own or third-party implementations and to prevent lock-in.
It is flexible in the sense that you can deploy and run it on-premise, in your (multi) cloud environment, or as a library in your application.
Find a full overview of all features here.
The Storage Kit provides diverse functionality that can be segmented into three layers (based on the Confidential Storage specs and requirements by the Decentralized Identity Foundation):
Layer 1 consists of a client-server system with capabilities of encrypting data in transit and at rest.
Layer 2 consists of a system that is capable of sharing data among multiple entities, of versioning and replication, and of performing privacy-preserving search in an efficient manner.
Layer 3 consists of high-level server-side functions that work on top of the encrypted credentials.
The following graphic illustrates these three layers and offers a functional perspective:
Confidential Storage as a general system involves various actors that have to interoperate:
EDVs - Encrypted Data Vaults, which store data, manage data access and perform queries on the stored data.
Users - Human or non-human actors that wish to access their EDVs.
Clients - Programs or libraries supporting access to conformant EDVs.
Providers - Servers that provide an environment to create and manage EDVs.
Services - Parties besides the user that wish to access data in a specific user's EDV (requiring the user's consent to do so).
A user uses one or more clients to access their EDVs (multiple client devices may be used to access data, e.g. a stationary and a mobile device, or for redundancy, to avoid losing data access if one device fails).
Clients may be connected to multiple EDVs (for redundancy or performance reasons).
Services are a special kind of client that access EDVs through Service Data Requests (described later on). EDVs are hosted by providers, and are created through specific requests sent by a client.
For building the project, JDK 16+ is required.
The walt.id wrapper script storagekit.sh is a convenient way for building and using the library on Linux.
The script takes one of the following arguments:
build|build-docker|build-podman|extract|execute.
For example, for building the project, simply supply the "build" argument:
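```bash
./storagekit.sh build
```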
Manually with Gradle:
After the Gradle build you can run the executable. In build/distributions/ you have two archives: a .tar and a .zip. Extract either one of them and run waltid-storagekit-1.0-SNAPSHOT/bin/waltid-storagekit.
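Put together, the manual Gradle flow looks like this (archive names inferred from the distribution name above):

```bash
gradle build
cd build/distributions
tar -xf waltid-storagekit-1.0-SNAPSHOT.tar   # or: unzip waltid-storagekit-1.0-SNAPSHOT.zip
waltid-storagekit-1.0-SNAPSHOT/bin/waltid-storagekit
```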
A client supports multiple EDV groupings by utilizing a session-based storage container, referred to simply as a "session" hereafter.
Such a session consists of a unique identifier, a key identifier, a Decentralized Identifier (DID), and a list of EDV metadata required to interact with the respective EDVs. EDV metadata consists of the EDV identifier, the server URL, the root delegation (from the EDV DID to the client's DID) and an index key:
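A hedged sketch of this container as Kotlin data classes; the field names are illustrative, not the actual codebase's:

```kotlin
// Hypothetical shape of the session container described above.
data class EdvMetadata(
    val edvId: String,          // EDV identifier
    val serverUrl: String,      // URL of the hosting provider
    val rootDelegation: String, // delegation from the EDV DID to the client's DID
    val indexKey: ByteArray     // key for the encrypted search index
)

data class Session(
    val id: String,    // unique session identifier
    val keyId: String, // key identifier
    val did: String,   // Decentralized Identifier (DID)
    val edvs: List<EdvMetadata>
)
```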
This container is not explicitly defined in the specifications; it was chosen to allow users to easily import and export all relevant data across multiple devices.
Note that this container as a whole can be exported from a client and imported into other clients, allowing users to easily access their connected EDVs on multiple devices. To do so, the session is encoded as JSON and encrypted into a JWT, using the client's master passphrase as the symmetric key.
The following sections elaborate on more advanced concepts like Searchable Symmetric Encryption (SSE), Authorization Capabilities (ZCaps) and Linked Data (LD) frameworks.
Data models for data requests (serialized to and from JSON):
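The model definitions are not reproduced on this page; the following Kotlin sketch is a hedged guess at their shape, based only on the fields mentioned elsewhere in this document (all field names are assumptions):

```kotlin
// Illustrative sketch of the data request/response models.
data class DataRequest(
    val context: String,   // context of the request
    val dataType: String,  // desired data type
    val serviceDid: String // the service's did:key (conveys its public key)
)

data class DataResponse(
    val fileKey: ByteArray,            // file key for decrypting the shared chunks
    val chunkInvocations: List<String> // delegated ZCap invocation per chunk, in order
)
```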
The REST API is largely defined by the service provider. It simply has to adhere to having at least one POST endpoint that takes the DataResponse as request body.
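A minimal sketch of such an endpoint, using Javalin purely as an example server framework (the route name is an assumption) together with the DataResponse model sketched above:

```kotlin
import io.javalin.Javalin

fun main() {
    val app = Javalin.create().start(7070)
    // A single POST endpoint accepting a DataResponse as the request body.
    app.post("/data-responses") { ctx ->
        val response = ctx.bodyAsClass(DataResponse::class.java)
        // ... fetch and decrypt the shared chunks using the response ...
        ctx.status(202)
    }
}
```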
Enhance your existing applications with Confidential Storage functionality by using the Storage Kit as a dependency.
Gradle
Maven
Required Maven repos:
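The exact snippets are omitted on this page; a hedged Gradle (Kotlin DSL) example follows, in which both the repository URL and the artifact coordinates are assumptions to be checked against the project README. Maven users would declare the same repository and dependency in their pom.xml.

```kotlin
// build.gradle.kts -- repository URL and coordinates are assumptions.
repositories {
    mavenCentral()
    maven("https://maven.walt.id/repository/waltid/")
}

dependencies {
    implementation("id.walt:waltid-storagekit:1.0-SNAPSHOT")
}
```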
We embraced the controller-service model for all REST components: the web server maps a specific route to a specific controller, which does nothing other than handle the request and response data; it does not contain any business logic. All of that is separated into service managers.
Also, as is recognizable from the methods suffixed with -Docs in the controllers, our OpenAPI documentation is configured and fed in-code, directly beneath the actual code. Having the code and its documentation near each other makes the documentation very easy to maintain, e.g. to keep it up-to-date when the code changes, which is easily forgotten when it lives in external configuration files with special syntax.
We are using Nimbus JOSE and JWT for all JWE (JSON Web Encryption) related purposes, as it is the most popular JOSE Java library and the project team already had experience with this library.
Bouncy Castle is used as the cryptography provider, supplying Java Cryptography Extension (JCE) and Java Cryptography Architecture (JCA) implementations. It was chosen as it is one of the most commonly used - and thus tightly integrated - cryptography layers for Java (and C#), providing audited APIs, parts of which are even certified for FIPS 140-2 (Level 1).
We use our own implementation for signing ZCap-LD capability delegations and invocations.
To store the data at Layer A on the system, TPM-backed full disk encryption can optionally be utilized to prevent metadata access even in the case of a physical breach. Clients are also secured with passphrase-based symmetric encryption, using PBES2 (published by RSA Laboratories).
When a client instance is started for the very first time, a number of things have to be set up to allow creating an EDV at a provider:
A master key has to be set up. For human-facing clients, this key is derived from a master passphrase. This symmetric master key will be used to encrypt all data-at-rest of the client instance.
A session is created. This session is initialized with a new Ed25519-based EdDSA public-private key pair, used for requests to services and EDVs and for authorization with ZCaps.
This key is used to create the session DID, also known as the "controller DID".
The controller DID is used to request a new EDV at a chosen provider. The request contains data about the client, most importantly the did:key.
The key receives the initial capability delegation from the root of trust. Several attributes are generated for the EDV (e.g. IDs, a did:key for the EDV).
The Storage Kit uses the concept of ZCap-LD to build a capability-based authorization, but with a modified specification.
This framework provides functionality such as sharing and managing confidential data by using capabilities. In this document, every ZCap-LD capability data model uses the JSON format. A ZCap-LD capability comprises several attributes:
a context definition
an associated id
an invoker, represented as a key (a did:key)
a parentCapability, which contains the whole parent ZCap-LD capability and not only a reference to the capability.
a proof field, which signs the capability with a Linked Data Proof to verify the capability. It also contains a proofPurpose to indicate if it is a capability delegation or invocation.
The field proof contains the following attributes:
"created" represents the exact date and time when the capability delegation was created
"creator" shows the key, a nonce, of the creator
"jws" is a claim signed with a signature. The server can verify the claim with a secret signing key. So the JSON Web Signature (jws) claim signature can be seen as a private key and the server’s secret signing key as the public key. With this the server is able to verify the proof.
"purpose" a string to show if it is a delegation or invocation.
"type" is used to indicate which digital signature the proof includes.
"verificationMethod" represents a key to verify the delegation
The root ZCap-LD capability contains another attribute called "rootCapability". It references the associated EDV (Encrypted Data Vault). This EDV can be seen as a root of trust: the EDV grants permission to the root capability, the root capability is able to grant permission to a child, and so on.
Any capability other than the root (target) capability needs to have a "parentCapability" property. It points to another capability or to the target. Several capabilities together form a capability chain.
In ZCap-LD delegation is handled via the capability chain.
Example:
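The original JSON example is not reproduced here; based on the description below, a hedged reconstruction might look like this (truncated values are kept truncated, and the exact @context and type strings are assumptions):

```json
{
  "@context": "https://w3id.org/security/v2",
  "id": "urn:uuid:...",
  "invoker": "did:key:z6Mk...",
  "rootCapability": "urn:edv:...",
  "proof": {
    "created": "2021-12-19T10:42:14Z",
    "creator": "did:key:z6Mkp...",
    "jws": "eyJi...",
    "proofPurpose": "capabilityDelegation",
    "type": "Ed25519Signature2018",
    "verificationMethod": "did:key:z6Mkk..."
  }
}
```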
The example shows the attributes needed to create a ZCap-LD capability delegation. Every capability needs a unique id; this id is a simple UUID (universally unique identifier). In this example the EDV delegates permission to the root capability, so the EDV serves as the invoker. The invoker attribute is represented as a did:key starting with "z6Mk...". To verify this delegation there is a proof attribute. It contains the important details: the capability was created on December 19th, 2021 at 10:42:14; the key of the creator starts with "z6Mkp..."; the jws starts with "eyJi..."; the proof purpose shows that this capability is a delegation; it includes a digital signature produced by an Ed25519 cryptographic key; and the key to verify the delegation starts with "z6Mkk...".
First, the proof itself needs to be verified. This can be done by checking the jws attribute. After that, the delegation can be verified by comparing the verification method and the invoker. In this example the delegation verification is successful because the key of the "invoker" property and the key of the "verificationMethod" property are equal. Finally, there is an attribute called "rootCapability". It is a reference to the associated EDV. The root capability can be compared to a root-of-trust component: it ensures that this delegation can be trusted.
In ZCap-LD an invocation is used to exercise the authority granted by a capability delegation. It contains a property called "action", which describes the behavior of the invocation. The "action" property consists of three parts: the operation to be performed, the document id identifying the target document, and the SHA-256 hash of the content.
Example:
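Again, a hedged reconstruction from the description below; the property names inside "action" are assumptions:

```json
{
  "@context": "https://w3id.org/security/v2",
  "id": "urn:uuid:...",
  "invoker": "did:key:z6Mkk...",
  "parentCapability": "(the full delegation shown above)",
  "action": {
    "operation": "CreateDocument",
    "documentId": "testDocument",
    "contentHash": "SHA256ofContent"
  },
  "proof": {
    "created": "2021-12-19T10:44:01Z",
    "creator": "did:key:z6Mkk...",
    "jws": "eyJi...",
    "proofPurpose": "capabilityInvocation",
    "type": "Ed25519Signature2018",
    "verificationMethod": "did:key:z6Mkk..."
  }
}
```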
First, the initial capability needs to be created and verified. This is done via the attribute "parentCapability", which contains a capability delegation; the delegation example is described in the delegation part of this document. Then the invocation is created. The action attribute tells what the invocation is about: in this case the system will create a new document with the name "testDocument" and the SHA-256 hash of the content. Like a delegation, an invocation includes a unique id, an invoker and a proof field. This capability invocation was created on December 19th, 2021 at 10:44:01; the key of the creator starts with "z6Mkk..."; the jws starts with "eyJi..."; the proof purpose shows that this capability is an invocation; it includes a digital signature produced by an Ed25519 cryptographic key; and the key to verify the invocation starts with "z6Mkk...".
To verify that the invocation has the permission to take place, its proof needs to be verified. After that, the "verificationMethod" attribute of the initial capability needs to be the same as the "creator" attribute of the invocation capability. In this example both keys are equal, so the initial capability grants authority to the invocation and the document will be created.
A caveat can be used to restrict a capability. In many cases it is useful to define restrictions on how the capability may be used. Caveats do not need to be redefined for child capabilities, because every capability inherits all the caveats of its parents. Caveats are identified by their types and other properties. The following example shows how a caveat is declared in JSON format.
Example:
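The original example is not shown here; a hedged sketch using the ValidUntil caveat type described later in this document (the "date" property name is an assumption):

```json
{
  "caveat": [
    {
      "type": "ValidUntil",
      "date": "2022-12-31T23:59:59Z"
    }
  ]
}
```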
When uploading a document, the user's client will store a file key to the encrypted file index, chunk the file (explained below) and encrypt the chunks with the file key.
Encrypted search
The underlying encrypted search implementation parses the document structure (depending on the file format, e.g. JSON, XML, etc.) and creates a list of keywords found in the file. This is the search index for this file. It gets encrypted with the encrypted search key; this is the encrypted index.
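A rough sketch of index creation, assuming HMAC-SHA256 as the keyed primitive that blinds keywords into opaque tags (the concrete scheme is an implementation detail not given here):

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Blind each keyword with the secret search key so the server only
// ever sees opaque index tags. HMAC-SHA256 is assumed for this sketch.
fun buildEncryptedIndex(searchKey: ByteArray, keywords: List<String>): List<ByteArray> {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(searchKey, "HmacSHA256"))
    return keywords.map { mac.doFinal(it.lowercase().toByteArray()) }
}
```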
Chunking
The document is split into chunks. Chunks may not exceed a size of 1 MiB each; this limit comes from the Confidential Storage specification.
Each chunk is individually encrypted with authenticated encryption using a file key.
An index (Resource Structure) is created, which is later used to recreate the file from the individual chunks. It is then encrypted, yielding the encrypted chunk index (Encrypted Resource Structure).
Chunk transmission to the EDV
The encrypted chunks are sent to the EDV, each request individually authorized using ZCaps.
The encrypted chunk index and encrypted search index get stored in the EDV (each also being authorized using ZCap capability invocations).
Services come with their own configuration files.
For the configuration of service -> implementation mappings, ServiceMatrix is used.
The default mapping file is "service-matrix.properties", and looks like this:
e.g., to change the keystore service, simply replace the line
id.walt.services.keystore.KeyStoreService=id.walt.services.keystore.SqlKeyStoreService
with your own implementation mapping, e.g. for the Azure HSM keystore:
id.walt.services.keystore.KeyStoreService=id.walt.services.keystore.azurehsm.AzureHSMKeystoreService
To add a service configuration:
id.walt.services.keystore.KeyStoreService=id.walt.services.keystore.SqlKeyStoreService:sql.conf
Service configuration is by default in HOCON format. Refer to the specific service on how their configuration is laid out.
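For illustration, a service configuration file like the sql.conf referenced above might look like this (the keys shown are assumptions, not the actual configuration schema):

```hocon
# sql.conf -- illustrative only; check the service's documentation
# for its real configuration keys.
connectionUrl = "jdbc:sqlite:data/keystore.db"
```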
Client data request from services
The service transmits a request to the client, indicating a context, the desired data type, and its did:key (conveying the service's public key).
Assuming the client accepts the request, it queries the document index for the list of required chunks (in correct order), and responds with a delegated ZCap authorization for each of the specific chunks, bundled with the file key.
Sequence Diagram illustrating service access:
Data exchange between service and EDV
The service contacts the EDV with an object retrieval request for each of the required file chunks. The requests are accompanied by the ZCap capability invocation the client delegated to the service.
The EDV encrypts the requested objects with the public key of the delegated did:key in the ZCap capability invocation and returns them to the service.
Note that both steps may happen in parallel.
Service data decryption
The service decrypts each of the individually EDV-encrypted chunks with its private key.
The service decrypts each of the individually client-encrypted chunks using the file key provided by the client.
The service puts the chunks in the order specified by the client response.
Revocation
The client may revoke the data access of a previously trusted service at any time using one of two possible methods:
The client indicates a caveat in the initial ZCap capability delegation to the service, e.g. using SimpleCredentialStatus2022, which queries a trusted revocation service. Multiple such caveats may be defined.
The client uses the built-in did:key revocation feature of the EDV, which works like a blacklist, marking otherwise valid ZCap authorizations for specific did:key ids as no longer valid.
Notice: This section describes a non-default alternative backend for using encrypted search. The current recommended way is to use the default hash-based index search.
A key feature of Confidential Storage is the ability to search through encrypted data. The main challenge is that the higher the security of a system is, the lower its performance and efficiency.
The reason the search functionality costs performance is obvious: if you want to search through encrypted data, you either have to decrypt the data first in order to search through it, or you use other methods that involve additional operations of their own. Regardless of the method, the system must always carry out additional steps that are not required in unencrypted systems.
The SSE concept tries to achieve a suitable balance between security and efficiency.
The following graphic shows the main components of a simple SSE system:
A secure authorization system is a vital part of a Confidential Storage solution, protecting data from misuse such as data manipulation or data theft. This is true for data storage as well as data transfer, as data must be processed confidentially and may only reach those who have the right to access it.
ZCap-LD uses the object capability model to grant and express authority. Its main function is to securely share and manage data. We opted for a capability-based system (instead of a traditional access control list based system).
To understand capability-based systems it is useful to start with the concept of capabilities:
Basically, a capability consists of a token or a key. The owner uses it to verify that they have permission to access an entity or object. It is implemented as a data structure consisting of the access rights and a unique identifier:
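As a conceptual sketch (the names are illustrative, not taken from the Storage Kit):

```kotlin
// A capability: a unique object identifier plus the access rights
// granted on that object.
enum class AccessRight { READ, WRITE, EXECUTE }

data class Capability(
    val objectId: String,        // unique identifier of the object
    val rights: Set<AccessRight> // operations that may be performed
)
```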
This identifier points to a specific object. The access rights declare which operations may be performed. For example, the access rights define a read-only access on a file, or a write access on a memory segment.
In a capability-based system, each user has access to a capability list. With these capabilities the system is able to check if the user is allowed to interact with the object.
A capability protects the object from unauthorized access, but the whole concept is useless if the capability itself is not protected from manipulation. If a program could change the access list and the identifier of a capability at any time, it would be able to force access to any object. Therefore, a capability-based system is usually built in such a way that direct modifications by a program are not allowed: the capability list can only be modified by the operating system or the hardware. What programs can do is call operating system or hardware operations; through these, programs can obtain new capabilities, delete capabilities or change the rights in a capability.
The goal of an authorization system is to protect confidential data. Conventional systems only partially perform this important task. Capabilities can be used to prevent processes from gaining access to data for which they have no permission: a process only holds the capabilities necessary to access the resources it actually needs. This ensures that no confidential data is revealed to parties without the right permissions.
The concept of capabilities combines well with sandboxing (i.e. the creation of an isolated environment in which it is safe to execute suspicious URLs or files). Capabilities can be used to sandbox every process of a program, granting each process access only to its associated data. This prevents the program from gaining access to sensitive data located outside the sandbox (e.g. on the user's desktop).
All commands can be displayed by entering help at the session command prompt; this also works for subcommands, like document update. history shows the command history. exit or quit will leave the interactive CLI.
By advancing the existing basic ZCap-LD implementation with the ZCap-Caveats extension system, it has become a full-blown ZCap-LD issuance and verification system.
ZCap-based capabilities may also include ZCap-Caveats.
ZCap-Caveats allow users even tighter control over who can access what data under which specific circumstances. ZCap-Caveats can also be set at multiple levels of delegation, e.g. a user restricting access to their documents, and a service setting ZCap-Caveats in a delegation to a sub-service, further restricting access to the user's data.
An overview is shown here:
The graphic above shows how capabilities work in the chain (and how we utilize them):
The EDV delegates access from the Root-of-Trust DID (of the EDV) to the user. The user receives a capability delegation without any caveats, so the user has full access: not only to the EDV, but also to delegate access to others.
When the user delegates access to a service, it is possible (and recommended) to include so-called "Caveats" in the capability chain. These provide various restrictions, starting at the layer at which they were included.
The service then creates a capability invocation from the (restricted) capability delegation; naturally, the caveats are thus also part of the capability chain. If the service tried to remove the caveats from the delegation in the chain, the capability delegation's signature would no longer match.
In the example featured in the graphic above, the following caveats were added to the capability delegation:
NoSubdelegationsAllowed - This caveat disallows the service from sub-delegating access to a sub-service/child-service (e.g. contractors).
ValidUntil - This caveat will expire the capability automatically at the specified date. At or after this date, the capability delegation is no longer valid, thus no (positive-verifiable) capability invocation can be constructed from the chain.
AllowedOperations - This caveat restricts the service to only being able to execute a specified set of operations (READ, CREATE, UPDATE, DELETE). In this case, the service may only read documents, and is disallowed from writing, updating or deleting any documents from the EDV.
AllowedDocuments - This caveat restricts the service to only allow operations on a specified set of documents. This is the primary caveat we are using for document sharing.
CredentialStatus2020 - This caveat allows easy revocation of the capability. When the capability is verified, the specified host is queried using the CredentialStatus2020 protocol to check whether the specific ID has been revoked. If it has, the capability will not verify positively.
These are the caveats currently integrated into our ZCap-LD module. However, they are not the only ones we support, as we built a plugin-like system for creating and integrating new restrictions in the form of ZCap-Caveats. They can even be registered dynamically at runtime, as the plugin system is based upon dynamic reflective access.
Let us run our products for you.
We offer our products as a managed cloud service for clients who do not want to deploy, manage and run our products by themselves.
The Cloud Platform includes:
All products
Enterprise use cases
Fully managed cluster
Cloud SLA / support
Get in touch if you want to learn more.
See also: This is a simple service that generates a data request (which is printed to the terminal) and handles acceptance of such data requests.
All our products are open source under the permissive Apache 2 license. This means:
You can use, modify and distribute our products for free and without strings attached.
You can use our products to build pilots or even production systems.
You can deploy, run and manage our products on-premise or in your own cloud environments.
Get in touch if you are using our products. We are curious to learn about your experience and happy to spread the word about your project via our newsletter or case studies.
Run our products with the support of our experts.
To complement our free products, we offer services for clients who want to manage our products by themselves:
Consulting
Custom development and integration
Support / SLAs
Get in touch if you want to learn more.