Published on 03 February 2018 in Identity on Blockchain
In my opinion, Blockchain/DLT is currently having a Darwinian moment: one piece of technology set to live in different ecosystems, trying to innovate, until it rules... or dies. The Identity space, with its challenges, is not left unshaken by this momentum.
On February 1st, 2018, Kaliya Young raised a good question to the Identity community:
opening a thread for ID & security professionals to share why putting hashed PII on any immutable ledger(blockchain) is a bad Idea. @katrynadow @dgwbirch @trbouma @aniltj @windley @BenLaurie @bobblakley @ChristopherA @manusporny @drummondreed @JoeAndrieu @BeeckCenter
The basic requirement to be fulfilled is proving something was stated at some point in time and that it has not changed since. #blockchain
Asking the source is a privacy violation. The DMV and/or State Dept. have no need to know when I use their credentials to prove i'm over 21 or 18 or whatever.
We are trying to implement what remains an inherent advantage of paper - independence from system. We are still willing to sacrifice authenticity, integrity for this advantage.
In some schemes, but it doesn't need to be. The issuer can revoke in such a manner that revocation is public and checking status leaks no data to the issuer. Further, the revocation itself can be protected to only be useful to those who already have the payload. No PII leaked.
There were also some expectations about the availability of information for future statements:
That I'm asking. Right now. I also don't trust the DMV to maintain a 99.999% reliable 24x7 service endpoint. The idea that all these authorities are going to become reliable service providers is insane. Instead, they post to the chain when their systems need to record a change.
The point was settled that data must be stored hashed, and never in any reversible form:
Lots of confusion between putting encrypted versus hashed info on #blockchain. The former is BAD; the latter is ok, if it is a Merkle Root, separated from the Merkle Proof, and of course PII
The discussion did not dwell on the potential of the Merkle tree as a way to protect the privacy of PII. It is a mechanism used heavily in Bitcoin to commit to the transactions of a block (same type of data for all actors), but I'm not sure we can apply it as is to protect the PII of one individual (heterogeneous types of data per actor). There is surely room for another take on this by the crypto experts. But a Merkle tree is not failsafe in any case. It also has its own integrity challenges.
DLT is already the way to provide an immutable view of an action, transaction, or statement. Why do we need to store a value?
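To make the "Merkle Root separated from the Merkle Proof" idea concrete, here is a minimal Python sketch of how it might look for one person's heterogeneous attributes. It is an illustration under my own assumptions, not a scheme proposed in the thread: the attribute names, the per-attribute random salt, the domain-separation bytes and the duplicate-last-node padding are all hypothetical choices. Only `root` would be written to a ledger; the salts, values and proofs stay with the subject.

```python
import hashlib
import os


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(name: str, value: str, salt: bytes) -> bytes:
    # A per-attribute salt keeps low-entropy values (e.g. a birth date)
    # from being guessed by brute force from the leaf hash.
    return h(b"\x00" + salt + name.encode() + b"=" + value.encode())


def next_level(level):
    # Pad odd levels by repeating the last node. This is a simplification
    # with known malleability pitfalls; a real design needs more care.
    if len(level) % 2:
        level = level + [level[-1]]
    return [h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    level = list(leaves)
    while len(level) > 1:
        level = next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    # Collect the sibling hash at each level, with its side.
    proof, level = [], list(leaves)
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 else level
        sibling = index ^ 1
        proof.append((padded[sibling], sibling < index))
        level = next_level(level)
        index //= 2
    return proof


def verify(leaf, proof, root):
    node = leaf
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = h(b"\x01" + pair)
    return node == root


# Hypothetical attributes of a single subject.
attributes = [("name", "Alice Example"), ("birth_date", "1990-01-01"), ("licence_class", "B")]
salts = [os.urandom(16) for _ in attributes]
leaves = [leaf_hash(n, v, s) for (n, v), s in zip(attributes, salts)]

root = merkle_root(leaves)        # the only value that would go on-chain
proof = merkle_proof(leaves, 1)   # disclosed off-chain, together with the salt and value
print(verify(leaves[1], proof, root))
```

A verifier who receives the birth date, its salt and the proof can recompute the leaf and check it against the on-chain root without learning the other attributes. Note that this proves the value was committed at some point; it says nothing about whether it is still true today.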
There are three specific elements to remember here:
There was an impression that asking an issuer about some PII in real time is an unnecessary disclosure of information, and therefore a breach of privacy. But there are mechanisms to control that, such as a nonce:
[ramdom nonce, previously issued certificate FROM DMV]
What information, in a formal crypto model, I’m releasing to DMV?
On top of that, when a PII value changes, the described solution expects the subject to restate the PII... Will that happen willingly? How do we deal with fraud? For example, out of respect for privacy, I don't involve the DMV when checking whether someone has the right to drive. So if that person presents a non-expired electronic driver's licence proof, I may not know that it was revoked before expiry.
I may be misunderstanding your stmt of reqts, Tim, but I don't see what, on a blockchain, can tell you that an asserted attribute hasn't changed since last asserted.
If the technology is a ledger, it must be treated as one: a past proof of payment is no guarantee of my present or future solvency.
Will state be changed? By who? Maybe it’s a bad engineering idea to use blockchain as a bulletin board
Anyway, it depends on the exact PII, just one attribute or a relationship? A SSN or an >IP-SSN? A SSN or a SSN-time-geolocation? Who’s asserting the link?
It’s an immutable ledger but you’re using it to store mutable values
It’s a public ledger but you’re using it to store private data
Oh... and we may solve privacy, but not false statements of value written to the ledger, unless we are one of the parties involved in the transaction:
Well, duplicitous issuers can always issue simultaneous incompatible assertions...
Not if they hashed the data properly. Only recipients of both payloads can see the duplicity.
To detect the duplicity, the party consuming the statement must provide its own statement (with its own respect for privacy). Doing so will require a search function along the chain or OCSP-like functions.
If you trust the original source, just ask the source.... If you don’t trust the source is game over before starting the game
“Crypto can't create trust It merely automates the trust that already exists for other reasons" John Gilmore
If RPs go directly to AAs, they will always get the current attribute value according to the AA, and you won’t need the revocation mechanism or the hashing.
And that's where the battlefield of the Blockchain/DLT movement lies: NO TRUST
Not trusting central authorities isn't about convenience. It's about freedom.
... but still relying on keys to assert states.
The mechanics are easy enough. That's not the problem. The problem is recovering from total key loss, or key compromise, which blockchain makes no difference to.
To be frank: I've been thinking about this problem for years, and I've concluded that you need authorities.
It’s a distributed ledger with no trusted authority but you’re using it to intermediate RPs and trusted attribute authorities.
Such a question also needs to be evaluated against the new privacy regulations:
There may be GDPR issues in Europe: "The Article 29 Working Party has been unequivocal that hashing constitutes a technique of pseudonymisation, not anonymisation as it is still possible to link the dataset with the data subject." https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3080322
Hashed data with suitable nonces, as well as various hash trees should not be considered pseudo-anonymous, as the same values hashed at another time result in different hashes and thus are not correlateable. But I agee bare hashes are often a bad idea — a lesson from FOAF.
A hash of static data is a corelatable identifier, a DID is an opaque identifier that can be decoupled from the underlying data. A DID would be GDPR compliant, the hash probably not.
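The difference between a bare hash and a hash computed with a fresh nonce, which is at the heart of this exchange, can be shown in a few lines of Python. This is only a sketch under my own assumptions: the SSN is a placeholder, the function names are hypothetical, and using HMAC-SHA256 keyed by the nonce is one possible construction among others. It makes no claim about what a regulator would accept.

```python
import hashlib
import hmac
import os


def bare_hash(pii: str) -> str:
    # Same input always gives the same output: the digest acts as a stable identifier.
    return hashlib.sha256(pii.encode()).hexdigest()


def nonce_hash(pii: str, nonce: bytes) -> str:
    # A fresh random nonce per statement makes each digest different.
    return hmac.new(nonce, pii.encode(), hashlib.sha256).hexdigest()


ssn = "000-00-0000"  # placeholder value, not a real identifier

# Two bare hashes of the same PII are identical, so anyone can link them.
bare_linkable = bare_hash(ssn) == bare_hash(ssn)

# Two nonce-based hashes of the same PII differ, so they cannot be linked
# by someone who does not hold the nonces.
nonce_a, nonce_b = os.urandom(16), os.urandom(16)
digest_a, digest_b = nonce_hash(ssn, nonce_a), nonce_hash(ssn, nonce_b)
nonce_linkable = digest_a == digest_b

# Whoever was given the nonce and the value can still check the statement.
holder_can_verify = hmac.compare_digest(digest_a, nonce_hash(ssn, nonce_a))

print(bare_linkable, nonce_linkable, holder_can_verify)
```

The nonce has to be kept and shared off-chain with whoever must verify the statement, which brings back the question of who stores what, and for how long.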
Each time Blockchain/DLT is raised to solve someone's challenge, the debate turns into an opposition between innovation through Blockchain/DLT and innovation through improvement of current technology. The situation escalates further because it involves cryptography, which follows its own path of improvement.
Blockchain/DLT evangelists tend to confuse what may be with what can be:
That's bad too. Hence the desire for a distributed system that remains rigorous. Blockchain does that. I'm sure there are other ways, but those who say the blockchain gives you nothing are missing this capability as core to how blockchain increases privacy.
To be fair, how the blockchain can increase privacy.
We tend to forget that PKI by itself can now also solve this kind of question:
1) PKI is working but on a long scale, you think it's not (see let's encrypt) 2) talking of hashing: you need key exchange svc for all to be used, that's what a PKI is for 3) @Steve_Lockstep MDAV http://www.slideshare.net/Identiverse/kantara-workshop-mdav-mobile-device-attributes-validation-cis-2017 … is good example of what you try to achieve with only PKI
It's true that true independence cannot be achieved today and that Blockchain/DLT might achieve it... if it is not a utopia:
I think it's eventually going to replace the country clerk as the public record. And with good reason. Lower costs, less bureaucracy, less corruption, less susceptibility to natural disasters. Less dependence on civil servants who think their religion justifies breaking the law.
Another path to look at is DID, but does it really need to be combined with Blockchain/DLT to answer this question?
So: you point me at DID to indicate that you have a driving license and then I have to search the blockchain from that date onwards to look for a revocation. I can see that is a logically consistent way of working but how efficient is it?
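The efficiency concern in that last question can be pictured with a naive sketch. Everything here is hypothetical: the `LedgerEntry` shape, the "issue"/"revoke" kinds and the credential identifiers are invented for illustration and do not describe any real DID method or ledger.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class LedgerEntry:
    height: int          # position of the entry on the chain
    kind: str            # "issue" or "revoke" (hypothetical entry types)
    credential_id: str


def find_revocation(
    ledger: Iterable[LedgerEntry], credential_id: str, issued_at_height: int
) -> Optional[LedgerEntry]:
    # Naive approach: walk every entry written since issuance.
    for entry in ledger:
        if entry.height < issued_at_height:
            continue
        if entry.kind == "revoke" and entry.credential_id == credential_id:
            return entry
    return None


ledger = [
    LedgerEntry(1, "issue", "licence-42"),
    LedgerEntry(2, "issue", "licence-7"),
    LedgerEntry(3, "revoke", "licence-42"),
]

print(find_revocation(ledger, "licence-42", issued_at_height=1))
print(find_revocation(ledger, "licence-7", issued_at_height=2))
```

The cost of each check grows with everything written to the chain since the credential was issued. An index of revoked identifiers would make the lookup cheap, but then someone has to build and serve that index, and we are back to relying on a service.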