This layer consists of a system that is capable of sharing data among multiple entities, of versioning and replication, and of performing privacy-preserving search in an efficient manner.
To enable privacy-preserving querying (where the search index is opaque to the server), the client must prepare a list of encrypted index tags (which are stored in the Encrypted Resource, alongside the encrypted data contents).
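One common way to realize such opaque index tags (a sketch of the general technique, not necessarily the Storage Kit's exact scheme) is to HMAC each searchable attribute with a client-held index key; the server can then match equal tags byte-for-byte without learning the plaintext attribute:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class IndexTags {
    // Derives an opaque index tag: the server can compare tags for equality
    // without learning the underlying attribute value.
    static String tag(byte[] indexKey, String attribute) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(indexKey, "HmacSHA256"));
            byte[] out = mac.doFinal(
                    attribute.toLowerCase().getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(out);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] indexKey = "a-client-held-secret-index-key!!".getBytes(StandardCharsets.UTF_8);
        String stored = tag(indexKey, "invoice"); // stored with the Encrypted Resource
        String query  = tag(indexKey, "Invoice"); // prepared by the client at query time
        System.out.println(stored.equals(query)); // prints "true"
    }
}
```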
At least one versioning mechanism has to be supported. Replication is done by the client, not by the server (since the client controls the keys, knows about which other servers to replicate to, etc.). The versioning strategy may be implicit ("last write wins").
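An implicit "last write wins" resolution can be sketched as follows; the Version record and resolve function are hypothetical illustrations, not the Kit's actual types:

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;

public class LastWriteWins {
    // Hypothetical version record: each replica carries a modification timestamp.
    record Version(String content, Instant modified) {}

    // Under "last write wins", the replica with the latest timestamp survives.
    static Version resolve(List<Version> replicas) {
        return replicas.stream()
                .max(Comparator.comparing(Version::modified))
                .orElseThrow();
    }

    public static void main(String[] args) {
        Version a = new Version("v1", Instant.parse("2023-01-01T10:00:00Z"));
        Version b = new Version("v2", Instant.parse("2023-01-01T10:05:00Z"));
        System.out.println(resolve(List.of(a, b)).content()); // prints "v2"
    }
}
```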
An individual vault's choice of authorization mechanism determines how a client shares resources with other entities (authorization capability link or similar mechanism).
The following sections elaborate on the architecture from a systemic and functional perspective and describe its components and dependencies.
The Storage Kit provides diverse functionality that can be segmented into three layers (based on the Confidential Storage specs and requirements by the Decentralized Identity Foundation):
Layer 1 consists of a client-server system with capabilities of encrypting data in transit and at rest.
Layer 2 consists of a system that is capable of sharing data among multiple entities, of versioning and replication, and of performing privacy-preserving search in an efficient manner.
Layer 3 consists of high-level server-side functions that work on top of the encrypted credentials.
The following graphic illustrates these three layers and offers a functional perspective:
Confidential Storage as a general system involves various actors that have to interoperate:
EDVs - Encrypted Data Vaults that store data, manage data access, and perform queries on the stored data.
Users - Human or non-human actors that wish to access their EDVs.
Clients - Programs or libraries supporting access to conformant EDVs.
Providers - Servers that provide an environment to create and manage EDVs.
Services - Entities besides the user that wish to access data on a specific user's EDV (requiring the user's consent to do so).
A user uses one or more clients to access their EDVs (multiple client devices may be used to access data, e.g. a stationary and a mobile device, or for redundancy, to avoid losing data access if one device fails).
Clients may be connected to multiple EDVs (for redundancy or performance reasons).
Services are a special kind of client that access EDVs through Service Data Requests (which are described later on). EDVs are hosted by providers, and are created through specific requests sent by a client.
We embraced the controller-service model for all REST components: the web server maps a specific route to a specific controller, which itself does nothing other than handle the request and response data; it does not contain any business logic. All of that is separated into service managers.
As can be seen in all controller methods suffixed with -Docs, our OpenAPI documentation is configured and fed in-code, directly beneath the actual code. Having the code and its documentation near each other makes it very easy to maintain the documentation, e.g. to keep it up to date when the code changes, which is easily forgotten when documentation lives in external configuration files with a special syntax.
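The controller-service split described above can be sketched like this (class and method names are illustrative, not the Kit's actual API):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the controller-service separation; the real Kit
// wires controllers to routes via its web server.
public class ControllerServiceSketch {
    // Service manager: owns all business logic and state.
    static class EdvService {
        private final Map<String, String> edvs = new ConcurrentHashMap<>();
        String createEdv(String config) {
            String id = UUID.randomUUID().toString();
            edvs.put(id, config);
            return id;
        }
    }

    // Controller: only translates request data into a service call and the
    // result back into a response; no business logic lives here.
    static class EdvController {
        private final EdvService service = new EdvService();
        String handleCreate(String requestBody) {
            String id = service.createEdv(requestBody);
            return "{\"edvId\":\"" + id + "\"}"; // response payload
        }
    }

    public static void main(String[] args) {
        System.out.println(new EdvController().handleCreate("{}"));
    }
}
```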
We are using Nimbus JOSE and JWT for all JWE (JSON Web Encryption) related purposes, as it is the most popular JOSE Java library and the project team already had experience with it.
Bouncy Castle is used as the cryptography provider, supplying Java Cryptography Extension (JCE) and Java Cryptography Architecture (JCA) implementations. It was chosen as it is one of the most commonly used, and thus tightly integrated, cryptography layers for Java, providing audited APIs, parts of which are even certified for FIPS 140-2 (Level 1).
To store the data at rest on the system, TPM-backed full disk encryption can optionally be utilized to prevent metadata access even in the case of a physical breach. Clients are also secured with passphrase-based symmetric encryption, using PBES2 (published by RSA Laboratories).
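As an illustration of passphrase-based symmetric encryption for client data at rest, here is a sketch using the JDK's PBKDF2 with AES-GCM as a stand-in for the PBES2 scheme named above:

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class PassphraseBox {
    // Derive a symmetric key from the master passphrase (PBKDF2 here as a
    // stand-in for the PBES2 scheme used by the clients).
    static SecretKeySpec deriveKey(char[] passphrase, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(passphrase, salt, 100_000, 256);
            byte[] key = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
            return new SecretKeySpec(key, "AES");
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    static byte[] encrypt(SecretKeySpec key, byte[] iv, byte[] plaintext) {
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            return c.doFinal(plaintext);
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    static byte[] decrypt(SecretKeySpec key, byte[] iv, byte[] ciphertext) {
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
            return c.doFinal(ciphertext);
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public static void main(String[] args) {
        byte[] salt = new byte[16], iv = new byte[12];
        SecureRandom rng = new SecureRandom();
        rng.nextBytes(salt);
        rng.nextBytes(iv);
        SecretKeySpec key = deriveKey("master passphrase".toCharArray(), salt);
        byte[] ct = encrypt(key, iv, "client data at rest".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, iv, ct), StandardCharsets.UTF_8));
    }
}
```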
Our Ed25519-based EdDSA implementation is used for signing ZCap-LD capability delegations and invocations.
Getting started with the Storage Kit.
There are different ways to get started in using the Storage Kit.
Docker: Quick way to launch and try out the latest builds. No build environment required.
Local Build: Build and run the Storage Kit locally. Requires a JDK 16 build environment, including Gradle.
Dependency (JVM): The Storage Kit can be used directly as JVM-dependency via Maven or Gradle.
Where to use the Storage Kit:
CLI Tool: The Storage Kit comes with a command-line interface (CLI) tool, which offers a rich set of commands to run the entire functionality the Storage Kit provides. The CLI tool can be used by running the Docker container or the executable by the local build.
REST API: In case you want to run the Storage Kit as a service, your application can access all functionalities via the REST API.
Dependency (JVM): The Storage Kit can be used directly as JVM-dependency via Maven or Gradle.
This layer consists of high-level server-side functions that work on top of the encrypted credentials.
It is helpful if data storage providers are able to notify clients when changes to persisted data occur. A server may optionally implement a mechanism by which clients can subscribe to changes in the vault.
Integrity protection is needed to prevent modification attacks. Some form of integrity protection (e.g. Hashlinks) must be integrated to prevent such attacks from succeeding.
Data models for data requests (serialized to and from JSON):
The REST API is largely defined by the service provider.
It simply has to provide at least one POST endpoint with the DataResponse as request body.
Services come with their own configuration files.
For the configuration of service -> implementation mappings, ServiceMatrix is used.
The default mapping file is "service-matrix.properties", and looks like this:
e.g., to change the keystore service, simply replace the line
id.walt.services.keystore.KeyStoreService=id.walt.services.keystore.SqlKeyStoreService
with your own implementation mapping, e.g. for the Azure HSM keystore:
id.walt.services.keystore.KeyStoreService=id.walt.services.keystore.azurehsm.AzureHSMKeystoreService
To add a service configuration:
id.walt.services.keystore.KeyStoreService=id.walt.services.keystore.SqlKeyStoreService:sql.conf
Service configuration is by default in HOCON format. Refer to the specific service on how their configuration is laid out.
The client REST API is still in beta testing.
Currently it is recommended to include the JVM client library as a dependency using Maven or Gradle.
Please let us know if you are interested in using the client in non-JVM languages, so we can prioritize this feature accordingly.
Run different functionalities of the Storage Kit by executing individual commands.
Make sure you have Docker or a JDK 16 build environment including Gradle installed on your machine
1. Clone the project
2. Change folder
3. Build the Docker container
4. Run the project
1. Clone the project
2. Change folder
4. Set an alias
To make it more convenient to use, you can also set an alias for the executable as follows:
5. Use the CLI
Enhance your existing applications with Confidential Storage functionality, by using the Storage Kit as a dependency.
Gradle
Maven
Required Maven repos:
Find other build options for Docker.
3. Build the project
See also:
| Method | Route | Description |
|---|---|---|
| GET | /api-routes | Shows a simple routing table |
| GET | /api-documentation | Contains the OpenAPI documentation/definition |
| GET | /swagger | Shows a Swagger API interface |
| GET | /redoc | Shows a ReDoc API interface |
| POST | /edvs | Create an EDV |
| POST | /edvs/{edv-id}/docs | Create a document in an EDV |
| POST | /edvs/{edv-id}/docs/search | Perform an encrypted search |
| GET | /edvs/{edv-id}/docs/{doc-id} | Retrieve a specific document |
| PATCH | /edvs/{edv-id}/docs/{doc-id} | Update a specific document |
| DELETE | /edvs/{edv-id}/docs/{doc-id} | Delete a specific document |
| GET | /capabilities | Displays the capabilities of a server |
| GET | /configuration | Displays the configuration of a server |
| WEBSOCKET | /edvs/{edv-id}/notifications | Interact with the notification API |
For building the project, JDK 16+ is required.
The walt.id wrapper script storagekit.sh is a convenient way of building and using the library on Linux.
The script takes one of the following arguments:
build|build-docker|build-podman|extract|execute.
For example, to build the project, simply supply the "build" argument:
Manually with Gradle:
After the Gradle build you can run the executable.
In build/distributions/ you have two archives: a .tar and a .zip.
Extract either one of them and run waltid-storagekit-1.0-SNAPSHOT/bin/waltid-storagekit.
This layer consists of a client-server system with capabilities of encrypting data in transit and at rest.
When a vault client makes a request to store, query, modify, or delete data in the vault, the server validates the request. Since the actual data and metadata in any given request is encrypted, such validation is necessarily limited and largely depends on the protocol and the semantics of the request.
The mechanism a server uses to persist data, such as storage on a local, networked, or distributed file system, is determined by the implementation. The persistence mechanism is expected to adhere to the common expectations of a data storage provider, such as reliable storage and retrieval of data.
The configuration allows the client to perform capability discovery regarding things like the authorization, protocol, and replication mechanisms used by the server.
When a client makes a request to store, query, modify, or delete data in the vault, the server enforces any authorization policy that is associated with the request.
It is necessary that large data is chunked into sizes that are easily managed by a server. It is the responsibility of the client to set the chunk size of each resource and to split large data into chunks the server can manage. It is the responsibility of the server to deny requests to store chunks larger than it can handle. Each chunk is encrypted individually using authenticated encryption.
The process of storing encrypted data starts with the creation of a Resource by the client. If the data is less than the chunk size, it is embedded directly into the content. Otherwise, the data is sharded into chunks by the client (see next section), and each chunk is encrypted and sent to the server.
The Encrypted Resource is then created; if the data was sharded into chunks, this happens after the individual chunks have been written to the server.
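The chunking and per-chunk authenticated encryption described above can be sketched as follows (chunk size and key handling are simplified for illustration):

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Chunker {
    // Split data into client-chosen chunk sizes (kept tiny here for illustration).
    static List<byte[]> chunk(byte[] data, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        for (int i = 0; i < data.length; i += chunkSize) {
            chunks.add(Arrays.copyOfRange(data, i, Math.min(i + chunkSize, data.length)));
        }
        return chunks;
    }

    // Each chunk is encrypted individually with authenticated encryption
    // (AES-GCM here), so chunks can be stored and validated independently.
    static byte[] encryptChunk(SecretKey key, byte[] iv, byte[] chunk) {
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            return c.doFinal(chunk);
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();
        byte[] data = "a resource larger than one chunk".getBytes();
        SecureRandom rng = new SecureRandom();
        for (byte[] ch : chunk(data, 8)) {
            byte[] iv = new byte[12];  // fresh nonce per chunk
            rng.nextBytes(iv);
            System.out.println(encryptChunk(key, iv, ch).length); // chunk + 16-byte GCM tag
        }
    }
}
```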
When a client instance is started for the very first time, a number of things have to be set up to allow creating an EDV at a provider:
A master key has to be set up. For human-facing clients, this key is derived from a master passphrase. This symmetric master key will be used to encrypt all data-at-rest of the client instance.
A session is created. This session is initialized with a new Ed25519-based EdDSA public-private key pair for requests to services and EDVs, and for authorization with ZCaps.
This key is used to create the session DID - also known as "controller DID".
The controller DID is used to request a new EDV at a chosen provider. The request contains data about the client, most importantly the did:key.
The key receives the initial capability delegation from the root of trust, and several attributes are generated for the EDV (e.g. IDs, a did:key for the EDV).
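The session-key steps above can be sketched with the JDK's built-in Ed25519 support (Java 15+); the multicodec/multibase encoding of a real did:key is omitted here:

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

public class SessionSetup {
    // A fresh Ed25519 key pair backs the session (the JDK ships Ed25519
    // support since Java 15; no extra provider needed).
    static KeyPair newSessionKeyPair() {
        try {
            return KeyPairGenerator.getInstance("Ed25519").generateKeyPair();
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    // Requests to providers and EDVs are signed with the session's private key.
    static byte[] sign(KeyPair session, byte[] message) {
        try {
            Signature s = Signature.getInstance("Ed25519");
            s.initSign(session.getPrivate());
            s.update(message);
            return s.sign();
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    static boolean verify(PublicKey key, byte[] message, byte[] signature) {
        try {
            Signature s = Signature.getInstance("Ed25519");
            s.initVerify(key);
            s.update(message);
            return s.verify(signature);
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public static void main(String[] args) {
        KeyPair session = newSessionKeyPair();
        // The controller DID (a did:key) encodes this public key.
        System.out.println(Base64.getEncoder().encodeToString(session.getPublic().getEncoded()));
        byte[] sig = sign(session, "create-edv request".getBytes());
        System.out.println(verify(session.getPublic(), "create-edv request".getBytes(), sig)); // true
    }
}
```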
Learn what the Storage Kit is.
In a nutshell, the Storage Kit is a zero-trust solution for all things data storage and sharing.
The main purpose of the Storage Kit is to provide a highly secure and private infrastructure layer for data storage and data sharing for any application.
The importance of the Storage Kit becomes particularly evident in the context of Decentralized Identity and will be a core component for developers who are building solutions and use cases that involve keys, personal data or other secrets.
The following sections elaborate the basics of the Storage Kit to help you get started.
Here are the most important things you need to know about the Storage Kit:
It is written in Kotlin/Java. It can be directly integrated (Maven/Gradle dependency) or run as RESTful web-service. A CLI tool allows you to run all functions manually.
It is open source (Apache 2). You can use the code for free and without strings attached.
It abstracts complexity and low-level functionality via different interfaces (CLI, APIs).
It is a holistic solution that allows you to build use cases “end-to-end”. There is no need to research, combine or tweak different libraries to build pilots or production systems.
It is modular, composable and built on open standards, allowing you to customize and extend functionality with your own or third-party implementations and to prevent lock-in.
It is flexible in a sense that you can deploy and run it on-premise, in your (multi) cloud environment or as a library in your application.
Find a full overview of all features here
A client supports multiple EDV-groupings by utilizing a session-based storage container. It will be referred to simply as "session" hereafter.
Such a session consists of a unique identifier, a key identifier, a Decentralized Identifier (DID), and a list of the EDV metadata required to interact with the respective EDVs. EDV metadata consists of the EDV identifier, the server URL, the root delegation (from the EDV DID to the client's DID) and an index key:
This container is not explicitly defined in the specifications; it was chosen to allow users to easily import and export all relevant data to multiple of their devices.
Note that this container as a whole can be exported from a client and imported into other clients, allowing users to easily access their connected EDVs on multiple devices. To do so, the session is encoded as JSON and encrypted to a JWT, using the client's master passphrase as the symmetric key.
A secure authorization system is a vital part of a Confidential Storage solution to protect data from misuse, such as data manipulation or data theft. This applies to data storage and data transfer alike, as data must be processed confidentially and may only reach those who have the respective right to access it.
ZCap-LD uses the object capability model to grant and express authority. Its main function is to securely share and manage data. We opted for a capability-based system (instead of a traditional access control list based system).
To understand capability-based systems it is useful to start with the concept of capabilities:
Basically, a capability consists of a token or a key. Its owner uses it to prove that they have permission to access an entity or object. It is implemented as a data structure consisting of the access rights and a unique identifier:
This identifier points to a specific object. The access rights declare which operations may be performed. For example, the access rights define a read-only access on a file, or a write access on a memory segment.
In a capability-based system, each user has access to a capability list. With these capabilities the system is able to check if the user is allowed to interact with the object.
A capability protects the object from unauthorized access, but the whole concept is useless if a capability is not protected from manipulation. If a program could change the access list and the identifier of a capability at any time, the program would be able to force access to any object. Therefore, a capability-based system is usually built in such a way that direct modifications by a program are not allowed. So the capability list can only be modified by the operating system or the hardware. What programs can do is to call operating system or hardware operations. That means programs can get new capabilities, delete capabilities or change the rights in a capability.
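The capability data structure described above (a unique identifier plus access rights) can be sketched as follows; the types are hypothetical illustrations of the concept, including attenuation, i.e. deriving a capability with fewer rights but never more:

```java
import java.util.EnumSet;
import java.util.Set;

public class CapabilitySketch {
    enum Right { READ, WRITE, DELETE }

    // A capability pairs an object identifier with the rights it grants.
    record Capability(String objectId, Set<Right> rights) {
        boolean permits(String target, Right op) {
            return objectId.equals(target) && rights.contains(op);
        }

        // Attenuation: derive a new capability with a subset of the rights.
        Capability restrictTo(Set<Right> subset) {
            EnumSet<Right> kept = EnumSet.copyOf(subset);
            kept.retainAll(rights); // can never gain rights not already held
            return new Capability(objectId, kept);
        }
    }

    public static void main(String[] args) {
        Capability cap = new Capability("doc-1", EnumSet.of(Right.READ, Right.WRITE));
        System.out.println(cap.permits("doc-1", Right.WRITE));  // true
        Capability readOnly = cap.restrictTo(EnumSet.of(Right.READ));
        System.out.println(readOnly.permits("doc-1", Right.WRITE)); // false
    }
}
```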
The goal of an authorization system is to protect confidential data. Conventional systems only partially perform this important task. Capabilities can be used to prevent processes from gaining access to data for which they do not have permission: a process only holds the capabilities necessary to access the resources it needs. This ensures that no confidential data is revealed to anyone without the right permissions.
The concept of capabilities combines well with sandboxing (i.e. the creation of an isolated environment in which it is safe to execute suspicious URLs or files). Capabilities are used to sandbox every process, giving it access only to its associated data. This prevents the program from gaining access to sensitive data located elsewhere (e.g. on the user's desktop).
When uploading a document, the user's client will store a file key in the encrypted file index, chunk the file (explained below) and encrypt the chunks with the file key:
Client data request from services
The service transmits a request to the client, indicating a context, the desired data type, and its did:key (for the service's public key).
Assuming the client accepts the request, it queries the document index for the list of required chunks (in the correct order) and responds with a delegated ZCap authorization for each of the specific chunks, bundled with the file key.
Sequence Diagram illustrating service access:
Data exchange between service and EDV
The service contacts the EDV with an object retrieval request for each of the required file chunks. The requests are accompanied by the ZCap capability invocation that the client delegated to the service.
The EDV encrypts the requested objects with the public key of the delegated did:key in the ZCap capability invocation and returns them to the service.
Note that both of these steps may happen in parallel.
Service data decryption
The service decrypts each of the individually EDV-encrypted chunks with their private key.
The service decrypts each of the individually client-encrypted chunks using the file key provided by the client.
The service puts the chunks in the order specified by the client response.
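The two decryption steps and the reassembly above can be sketched as follows; a symmetric key stands in for the EDV-to-service layer, which in the real flow is asymmetric (encrypted to the service's did:key):

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.io.ByteArrayOutputStream;
import java.security.SecureRandom;
import java.util.List;

public class ServiceDecryption {
    static byte[] gcm(int mode, SecretKey key, byte[] iv, byte[] data) {
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(mode, key, new GCMParameterSpec(128, iv));
            return c.doFinal(data);
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    // Simulates the two encryption layers each chunk passes through and their
    // removal in reverse order, followed by in-order reassembly.
    static String roundTrip(List<byte[]> chunksInOrder) {
        try {
            SecretKey fileKey = KeyGenerator.getInstance("AES").generateKey(); // from the client
            SecretKey edvKey = KeyGenerator.getInstance("AES").generateKey();  // EDV-layer stand-in
            SecureRandom rng = new SecureRandom();
            ByteArrayOutputStream plaintext = new ByteArrayOutputStream();
            for (byte[] chunk : chunksInOrder) {
                byte[] iv1 = new byte[12], iv2 = new byte[12];
                rng.nextBytes(iv1);
                rng.nextBytes(iv2);
                // On the wire: client (file key) layer first, EDV layer on top.
                byte[] wire = gcm(Cipher.ENCRYPT_MODE, edvKey, iv2,
                        gcm(Cipher.ENCRYPT_MODE, fileKey, iv1, chunk));
                // The service removes the layers in reverse order...
                byte[] plain = gcm(Cipher.DECRYPT_MODE, fileKey, iv1,
                        gcm(Cipher.DECRYPT_MODE, edvKey, iv2, wire));
                plaintext.writeBytes(plain); // ...and appends chunks in the given order.
            }
            return plaintext.toString();
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(List.of("hello ".getBytes(), "world".getBytes())));
    }
}
```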
Revocation
The client may revoke the data access of a previously trusted service at any time using one of two possible methods:
The client indicates a caveat in the initial ZCap capability delegation to the service, e.g. using SimpleCredentialStatus2022, which accesses a trusted revocation service. Multiple such caveats may be defined.
The client uses the built-in did:key revocation feature of the EDV, which works like a blacklist, marking otherwise valid ZCap authorizations for specific did:key IDs as no longer valid.
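The blacklist-style did:key revocation in the second method can be sketched as follows (a minimal illustration, not the EDV's actual implementation):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class RevocationList {
    // EDV-side blacklist: did:key ids whose otherwise-valid ZCap
    // authorizations must be rejected.
    private final Set<String> revoked = ConcurrentHashMap.newKeySet();

    void revoke(String didKey) {
        revoked.add(didKey);
    }

    // Access requires a valid ZCap delegation chain AND a non-revoked did:key.
    boolean isAuthorized(String didKey, boolean zcapChainValid) {
        return zcapChainValid && !revoked.contains(didKey);
    }

    public static void main(String[] args) {
        RevocationList list = new RevocationList();
        String service = "did:key:example-service"; // hypothetical id
        System.out.println(list.isAuthorized(service, true));  // true
        list.revoke(service);
        System.out.println(list.isAuthorized(service, true));  // false
    }
}
```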
Notice: This section describes a non-default, alternative backend for encrypted search. The currently recommended way is to use the default hash-based index search.
A key feature of Confidential Storage is the ability to search through encrypted data. The main challenge is that the higher the security of a system is, the lower its performance and efficiency.
The reason the search functionality costs more performance is obvious: to search through encrypted data, you either have to decrypt the data first, or you have to use other methods that likewise involve additional operations. Regardless of the method, the system must always carry out additional steps that are not required in unencrypted systems.
The SSE concept tries to achieve a suitable balance between security and efficiency.
The following graphic shows the main components of a simple SSE system:
By extending the existing basic ZCap-LD implementation with the ZCap-Caveats extension system, it has become a full-blown ZCap-LD issuance and verification system.
ZCap-based capabilities may also include ZCap-Caveats.
ZCap-Caveats give users even tighter control over who can access which data under which specific circumstances. These caveats can be set at multiple levels of delegation, allowing a user to restrict access to their documents, and a service to set further ZCap-Caveats in a delegation to a sub-service, further restricting access to the user's data.
An overview is shown here: