FTC Chief Technologist Ed Felten Discusses Anonymity and Privacy
Thursday, May 3rd, 2012
In March, the Federal Trade Commission started a new technology blog and Twitter account for FTC Chief Technologist Ed Felten. Recently, Felten wrote two posts concerning the issues of anonymity and privacy. In the first, he discusses “hashing” as a poor technique for “anonymization.” (We’ve discussed problems with anonymization and de-anonymization before.) Felten writes:
What is hashing anyway? What we’re talking about is technically called a “cryptographic hash function” (or, to super hardcore theory nerds, a randomly chosen member of a pseudorandom function family–but I digress). I’ll just call it a “hash” for short. A hash is a mathematical function: you give it an input value and the function thinks for a while and then emits an output value; and the same input always yields the same output. What makes a hash special is that it is as unpredictable as a mathematical function can be–it is designed so that there is no rhyme or reason to its behavior, except for the iron rule that the same input always yields the same output.
He goes on to give an example of how a hash can be a poor anonymization technique, but he also notes: “Does this means that hashing always fails, and is never a good way to scrub data? Almost, but not quite. There are more advanced uses of hashing that can offer some protection in some settings. But the casual assumption that hashing is sufficient to anonymize data is risky at best, and usually wrong.”
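To make the concern concrete, here is a minimal Python sketch of our own (it is not Felten's example). It assumes a hypothetical four-digit identifier and uses SHA-256 from Python's standard `hashlib` module; the names `hash_id` and `reverse_by_enumeration` are ours. Because the same input always yields the same output, anyone who can list every possible input can hash each candidate and compare it with the published value.

```python
import hashlib


def hash_id(value: str) -> str:
    """Return the SHA-256 hex digest of a string (our stand-in for "a hash")."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def reverse_by_enumeration(target_digest: str, digits: int = 4):
    """Hash every zero-padded numeric ID of the given length and compare.

    Returns the matching ID, or None if no candidate of that length matches.
    """
    for n in range(10 ** digits):
        candidate = f"{n:0{digits}d}"
        if hash_id(candidate) == target_digest:
            return candidate
    return None


# Hypothetical identifier, assumed here to be exactly four digits long.
secret_id = "0427"
published = hash_id(secret_id)  # what an "anonymized" data set would contain

# The "iron rule": hashing the same input again gives the same output.
assert hash_id(secret_id) == published

recovered = reverse_by_enumeration(published)
print(recovered)  # prints 0427
```

With only 10,000 possible inputs the search finishes almost instantly. Real identifiers have larger ranges, but whenever the set of possible values is small enough to enumerate, the same approach applies, which is one way to read the warning that hashing alone is “risky at best.”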

