Blockchain Hackathon! Hyperledger Indy

Blockchain is the hot buzzword right now. Blockchains and distributed ledgers are touted as the technology that will disrupt the digital world (whatever that means). They allow disparate users to exchange data without a centralised authority, and allow any entity to check the integrity of the data exchanged.

The crew at NextFaze, in collaboration with our close partner Meeco, wanted to see what all the fuss is about and whether this tech lives up to its reputation. In an epic move, the whole office shut down for a week for a Blockchain Hackathon!

Some goals were set out for the hackathon. We wanted to cut through the hype and discover whether blockchains have real-world utility. We wanted to gain general blockchain skills and acquire technical knowledge to be able to determine their usefulness for our existing and future clients. We wanted to learn the similarities, differences, and nuances between the different technologies. Our partner Meeco, a leader in digital privacy technology, wanted to assess the best technology for the future of its platform.

We broke into four groups and each picked a ledger to study: R3’s Corda, uPort on Ethereum, Hyperledger Fabric, and Hyperledger Indy. My team consisted of Shane Woolcock, Jacques Fourie, and myself, Brent Jacobs, and together we focussed on Hyperledger Indy, which we chose for its focus on identity and verifiable claims. The remainder of this post covers Hyperledger Indy only.

What is Hyperledger Indy?

Hyperledger Indy is a public permissioned distributed ledger built to be a decentralized identity platform. The ledger is used to store identity records which contain Decentralized Identifiers (DIDs), associated public keys, as well as things like credential schemas. This is the basis for a self-sovereign identity which the user owns and controls.

The ledger maintains the concept of a Web of Trust. A web of trust starts with Stewards and Trust Anchors, which are trusted individuals and organisations. As entities connect with one another, pairwise-unique DIDs are added to the ledger. Organisations and individuals can, in addition, publish their trust of other entities by granting Trust Anchor status to the users they trust. In this manner the web of trust grows.

Users can generate credentials (aka verifiable claims) about themselves or other users by referencing a user’s DID. A credential can be anything, such as “This user is over 18 years of age”, but is generally in the form of key-value pairs, and will be associated with a credential schema that has been published on the ledger. Claims can be shared with other users, and recipients of credentials can cryptographically verify the issuer of the claim and that it has not been modified.
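As a rough illustration of the key-value shape described above, the following Python sketch checks a credential against a schema and detects later modification. Everything in it is hypothetical: the schema fields, attribute names, and helper functions are made up for this post and are not Indy's API. The HMAC is only a stand-in for the issuer's signature; it relies on a shared secret, whereas a real credential is verified against the issuer's public key.

```python
import hashlib
import hmac
import json

# Hypothetical schema: a name, a version, and the attribute names it allows.
TRANSCRIPT_SCHEMA = {
    "name": "Transcript",
    "version": "1.0",
    "attr_names": ["name", "degree", "year"],
}


def conforms(schema, values):
    """True when the credential supplies exactly the schema's attributes."""
    return set(values) == set(schema["attr_names"])


def seal(issuer_key, schema, values):
    """Stand-in for the issuer's signature: a keyed digest over the content."""
    payload = json.dumps(
        {"schema": schema["name"], "values": values}, sort_keys=True
    )
    return hmac.new(issuer_key, payload.encode(), hashlib.sha256).hexdigest()


def is_untampered(issuer_key, schema, values, tag):
    """True when the values still match the digest produced at issue time."""
    return hmac.compare_digest(seal(issuer_key, schema, values), tag)


# Example: issue a credential, then change one value after the fact.
issuer_key = b"demo-key-for-faber-college"  # assumption: demo secret only
values = {"name": "Alice", "degree": "Bachelor of Science", "year": "2015"}
tag = seal(issuer_key, TRANSCRIPT_SCHEMA, values)

print(conforms(TRANSCRIPT_SCHEMA, values))                   # True
print(is_untampered(issuer_key, TRANSCRIPT_SCHEMA, values, tag))  # True

values["degree"] = "Doctor of Philosophy"
print(is_untampered(issuer_key, TRANSCRIPT_SCHEMA, values, tag))  # False
```

The point is only that the schema fixes which keys a credential may carry, and that any change to a value after issuing is detectable.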

The ledger is provided by a codebase called Plenum, which runs three ledgers in parallel: one to track nodes and their current states, another to track configuration parameters for the distributed system, and a third to record DIDs, identity records, credential schemas, and the like. Private keys associated with DIDs, credentials, and proofs are stored off-chain in the user’s wallet. Ledger transactions are sent between nodes using CurveZMQ and are persisted using RocksDB. I am not aware of any details of how the wallet is implemented.

Nodes stay in sync using the Redundant Byzantine Fault Tolerant (RBFT) consensus algorithm. A primary node is elected and dictates the order of transactions to the other nodes. A node that receives a request relays it to the other nodes and expects to receive ordering responses from the primary node. RBFT is a modification of the original Byzantine Fault Tolerant (BFT) algorithm. It runs multiple “instances” of the BFT algorithm, one of which is tagged as the master. Each instance has a different primary node and monitors the overall throughput. If the master instance is declining in performance, a “view change” occurs where a different instance is tagged as the new master.
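The monitoring idea can be sketched in a few lines of Python. This is a toy model of the description above, not Plenum's implementation: the Instance class, the 0.8 threshold, and the policy of promoting the fastest backup are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    primary: str       # node acting as primary for this instance
    throughput: float  # requests ordered per second, as observed


def needs_view_change(instances, master, ratio=0.8):
    """True when the master orders noticeably slower than the backups' average.

    `ratio` is an assumed threshold, not a value taken from Plenum.
    """
    backups = [inst.throughput for i, inst in enumerate(instances) if i != master]
    if not backups:
        return False
    return instances[master].throughput < ratio * (sum(backups) / len(backups))


def next_master(instances, master):
    """Assumed policy: promote the fastest of the non-master instances."""
    candidates = [i for i in range(len(instances)) if i != master]
    return max(candidates, key=lambda i: instances[i].throughput)


# Example: the master (index 0) has fallen well behind the two backups.
instances = [
    Instance("Node1", 40.0),
    Instance("Node2", 100.0),
    Instance("Node3", 90.0),
]
master = 0
if needs_view_change(instances, master):
    master = next_master(instances, master)
print(instances[master].primary)  # Node2
```

Running several instances side by side is what makes the comparison possible: a slow or malicious primary shows up as a master that lags behind its backups.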

The Example Scenario

An example always helps. The following is the example provided by Hyperledger Indy.

Alice, a graduate of the fictional Faber College, wants to apply for a job at the fictional company Acme Corp. As soon as she has the job, she wants to apply for a loan in Thrift Bank so she can buy a car. She would like to use her college transcript as proof of her education on the job application and once hired, Alice would like to use the fact of employment as evidence of her creditworthiness for the loan.

There are a lot of moving parts here. It’s quite a nice, succinct scenario that shows off a real-world use case.

First off, Faber College, Acme Corp, and Thrift Bank are already known to the ledger. They have all been on-boarded and trusted via a Steward, a pre-seeded entity that can grant the Trust Anchor role.

Additionally, a Government entity has been on-boarded. The government entity has published credential schemas to define the Transcript and Job Certificate credentials.

The first step is for Alice and Faber College to connect with one another. Faber College generates a new DID to connect with Alice and provides a connection request. Alice generates two DIDs, one for herself, and the other to provide in response to the connection request. Alice and Faber College have just connected using pairwise-unique DIDs known only to each other, but verifiable publicly. Faber College can additionally grant Alice the Trust Anchor role if it chooses to.
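The pairwise idea is easy to picture in code. The Python sketch below is purely illustrative: the Wallet class is hypothetical, and the identifiers are random placeholders under the "did:example:" prefix rather than real Indy DIDs backed by key pairs.

```python
import secrets


class Wallet:
    """Toy wallet that keeps one identifier per relationship."""

    def __init__(self, owner):
        self.owner = owner
        self.pairwise = {}  # peer name -> identifier used only with that peer

    def did_for(self, peer):
        """Return the identifier for this peer, creating it on first use."""
        if peer not in self.pairwise:
            # Placeholder identifier; a real DID is tied to a key pair.
            self.pairwise[peer] = "did:example:" + secrets.token_hex(16)
        return self.pairwise[peer]


alice = Wallet("Alice")
faber_did = alice.did_for("Faber College")
acme_did = alice.did_for("Acme Corp")

print(faber_did != acme_did)                         # True
print(alice.did_for("Faber College") == faber_did)   # True
```

Because each relationship gets its own identifier, Faber College and Acme Corp cannot correlate Alice simply by comparing the DIDs she used with each of them.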

Faber College can now issue a Transcript credential to Alice as proof of her qualifications. The credential is sent to Alice encrypted with the public key that is associated with the DID she used when connecting with Faber College. She decrypts the credential with her private key and stores it in her wallet.

With the Transcript credential in her wallet, Alice can now connect with Acme Corp, and provide the credential from Faber College as proof of her qualifications. The same connection process happens between Alice and Acme Corp. Acme Corp then sends a Job-Application proof request to Alice which requests a Transcript credential. When Alice responds with the Transcript credential, Acme Corp can cryptographically verify that it was indeed issued by Faber College and that it has not been modified. Acme Corp can be sure that Alice is indeed qualified for the role they are offering her.

The example continues in the same fashion: Alice gets a Job Certificate credential from her new employer and uses it to apply for a loan with Thrift Bank. At this point, however, all the concepts have been demonstrated.

The Investigation

Our initial plan was to see if we could use Indy to store a user’s identity, and generate some credentials which can be consumed by another user. Indy claims to be focused on providing users with full control over their information. With that in mind, we decided to build a front-end for managing a user’s data, and a system for a company to request that data.

At first, we wanted to see how the system worked and run through a demo to show us an overview of the concepts. We started with the Getting Started Guide. It explains a lot of Indy’s concepts and even shows code snippets on how to perform various actions. However, it doesn’t explain how to bring up a test network, or how to run the snippets ourselves. We never did work out how to run through the example scenario on a test network. We eventually found an older deprecated Getting Started Guide which we were able to get working. The guide indirectly points to a Readme which contains instructions to bring up a test network and a console to run the code snippets in the guide.

Our team really struggled with working through the documentation and instructions and we never really got off the ground. Eventually, we were directed to the indy-agent repo. This contains an express.js app which you can bring up quite easily using Docker. Unfortunately, the app is not documented well. We had to look through the source code to determine the passwords, and to work out what inputs were required on the various UX screens. We finally felt like we had a win as at least we had something we could demo to the NextFaze team to show the functionality of Indy. Demo day was a big flop for our team. Minutes into our demonstration, the test network froze with some random errors.

What can be improved

Indy is spread across too many repositories, each with its own Readme, often with a separate Getting Started guide, and with subfolders containing further Readmes and guides. These guides are often outdated, the steps don’t work as expected (or at all), or they require assumed knowledge. They regularly point to other repos’ documentation, or link to documentation that no longer exists.

When I’m learning new tech, I want to be able to bring it up quickly and simply and see it in action. The time of developers and companies is precious, so it’s important that documentation is concise, up to date, actually works, and demonstrates functionality with ease.

For Indy, all the outdated documentation needs to be cleared out. The documentation to bring up an example network and clients needs to be in one document in the main Readme of the repo. Likewise, the documentation for the scenario should very clearly state how to bring up a test network and how to get a console running so we can follow along and experiment. A person should be able to go to any of the repos and bring up a working demonstration in 5 minutes. The fact that it was so hard just to get it working makes it seem more like a proof-of-concept than a product that anyone can use.

A missing concept

I think there is a fundamental concept missing from the example scenario and the system as a whole. Entities that publish credential schemas need the ability to publish which users are authorised to generate credentials for that schema.

Using the Indy example scenario again, the government entity should be able to say that Faber College is authorised to generate Transcript credentials. This implies that Transcript credentials made by any other entity are not valid. This gives a tremendous amount of distributed power to entities relying on these credentials.

For example, Acme Corp now has the ability to verify whether the Transcript credentials it receives are from a valid educational institution or not. In the real world, there are hundreds if not thousands of educational institutions. Without the ability to authorise the generation of credentials, Acme Corp would need to explicitly know which institutions are valid and trustworthy. With the ability to authorise the generation of credentials, Acme Corp merely needs to trust that the government entity has done its due diligence with the education institutions it has authorised.
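To make the idea concrete, here is a minimal Python sketch of the check I have in mind. None of this exists in Indy as far as I know; the registry, the function names, and the "Diploma Mill" issuer are all made up for illustration.

```python
# Hypothetical registry published by the schema owner (the government entity):
# schema name -> issuers authorised to generate credentials for that schema.
AUTHORISED_ISSUERS = {
    "Transcript": {"Faber College"},
}


def is_authorised(schema_name, issuer):
    """True when the schema owner has authorised this issuer for the schema."""
    return issuer in AUTHORISED_ISSUERS.get(schema_name, set())


def accept_credential(credential):
    """Accept a credential only if its issuer is authorised for its schema."""
    return is_authorised(credential["schema"], credential["issuer"])


# Acme Corp receives two Transcript credentials from applicants.
genuine = {"schema": "Transcript", "issuer": "Faber College"}
dubious = {"schema": "Transcript", "issuer": "Diploma Mill"}

print(accept_credential(genuine))  # True
print(accept_credential(dubious))  # False
```

Acme Corp never has to maintain its own list of trustworthy institutions; it only needs to trust whoever published the schema and its list of authorised issuers.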

To be fair, the Hyperledger Indy team aren’t the only ones to miss this point. Microsoft also uses this example in their Decentralized Identity White Paper.

Last words

I think it’s great that the Hyperledger Indy team are trying to give users their digital identities back and define new ways for organisations and users to share data while maintaining privacy. It’s an important problem to solve. However, I wouldn’t recommend Hyperledger Indy for use in any real-world scenario just yet. The project is currently at the incubation stage. The documentation is fragmented and overly complex, which makes it difficult to work out how to run through a working example. Any new developers taking on Hyperledger Indy will likely struggle and may even give up. A business would be silly to adopt the burden of such a challenging technology.

By Brent Jacobs