The New York Times has an interesting story on privacy risks associated with third-party data collection, as exemplified by the case of Alex Rodriguez. The Yankees’ third baseman’s steroid use was recently revealed via data subpoenaed in a federal case.
The way Mr. Rodriguez’s positive steroid test result became public followed a path increasingly common in the computer age: third-party data collection. We are typically told that personal information is anonymously tracked for one reason — usually something abstract like making search results more accurate, recommending book titles or speeding traffic through the toll booths on the thruways. But it is then quickly converted into something traceable to an individual, and potentially life-changing.
In Mr. Rodriguez’s case, he participated in a 2003 survey of steroid use among Major League Baseball players. No names were to be revealed. Instead, the results were supposed to be used in aggregation — to determine if more than 5 percent of players were cheating — and the samples were then to be destroyed.
It is odd that most of the news coverage described the tests as “anonymous.” […] But when federal prosecutors came calling, as part of a steroid distribution case, it turned out that the “anonymous” samples suddenly had clear labels on them. […]
The [Electronic Frontier Foundation] argues that online service providers — social networks, search engines, blogs and the like — should voluntarily destroy what they collect, to avoid the kind of legal controversies the baseball players’ union is now facing. The union is being criticized for failing to act during what apparently was a brief window to destroy the 2003 urine samples before the federal prosecutors claimed them. “You don’t want to know that stuff,” [EFF Legal Director Cindy Cohn] says, speaking of the ordinary blogger collecting data on every commenter. “You don’t want to get a subpoena. For ordinary Web sites it is a cost to collect all this data.”