Many popular Web sites are “leaking” their users’ information, according to a new study from Craig Wills and Konstantin Naryshkin of the Worcester Polytechnic Institute and Balachander Krishnamurthy of AT&T Labs Research. In “Privacy leakage vs. Protection measures: the growing disconnect” (researchers’ pdf; archive pdf), the researchers examined more than 100 popular Web sites (ones that are not online social networks) to see if the Web sites leaked private information to prominent data collectors. The researchers say “that the privacy landscape is worsening as there is a growing disconnect between steadily increasing leakage to and linkage by aggregators with existing and proposed protection measures.”
Here’s more information from the introduction:
Recently, multiple vectors of private information leakage via Online Social Networks (OSN) and the two-decade long aggregation of data about users visiting popular Web sites have been reported. The problem of privacy has worsened significantly in spite of the various proposals and reports by researchers, government agencies, and privacy advocates. The ability of advertisers and third-party aggregators to collect a vast amount of increasingly personal information about users who visit various Web sites has been steadily growing. Numerous stories have expressed alarm about the situation with legislatures and privacy commissioners in different countries paying closer attention to the problem. The awareness about the steady erosion of privacy on the part of users is growing slowly. The potential economic impact as a result of loss of brand value has forced some companies to start paying closer attention to complaints from users and privacy advocates. […]
We show that beyond the egregious leakage of private information via OSNs and their more recent mobile counterparts, a key part of the Internet with tens of millions of users representing diverse demographics with accounts on popular non-OSN Web sites also suffer from private information leakage to prominent aggregators. Additionally, less well-understood notions of linkage are typically not addressed by most of the proposed privacy solutions. One such privacy issue arises from the existence of globally unique ids such as an OSN id or reused email addresses that could be used to link together pieces of seemingly distinct information. Beyond the intrinsic identifying nature of these ids, they aid in linking together other information, such as cookies from a home and work computer. New proposals, such as the recent United States Federal Trade Commission’s December 2010 report, fail to address several key issues. […]
We look at a broad array of sites in various categories where users establish identities and provide personal information. We examine the extent of direct leakage of private information as a result of typical user actions on these sites and present a view of exactly what subset of private information that third-party aggregators receive from these Web sites. Finally, we explore the potential for aggregators to link various pieces of information they receive via globally unique identifiers, such as userids from these sites, or via browser fingerprinting.