The New York Times has a story on the use of data held by internet companies in court. I don’t think it’s actually all that new a story; it’s more of a ‘lawyers are finally starting to catch up with the net’ story, in the same way you would have had a story about lawyers coming to terms with fax technology 20 years ago.

Towards the beginning is this quote: “And even though [the companies that provide Internet service and run Web sites] promise to protect the privacy of their users, they routinely hand over the most intimate information in response to legal demands from criminal investigators and lawyers fighting civil cases.”

This is no different from banks or doctors, each of whom usually promises and/or is required to keep customers’ or patients’ data confidential, handing it over when served with a subpoena. Subpoenas have been part of the courts’ processes for a very long time, and they routinely override confidentiality.

There is an important difference between subpoenas issued in criminal investigations or proceedings, and those in civil proceedings. The former are usually affected by strictures operating on law enforcement personnel (eg the Electronic Communications Privacy Act), but the latter aren’t. (They are still subject, of course, to the usual requirements of a subpoena, such as that the material sought is likely to be relevant and that the subpoena itself is not oppressively burdensome.) The article doesn’t say so, but it mainly concentrates on the criminal side, whereas it is the civil side where the threat to ‘privacy’ is arguably greater.

I put privacy in scare quotes, because that’s really the nub of the issue: the question is what protection will an individual actually receive over data linked to their internet use (eg what they sent; where they sent it to; what web sites they might have visited), versus what they expect they will receive.

The answer is probably that there is quite a big gap. First, I think many if not most users assume that no data at all is held; that is, because they can’t see or (usually) access any data about their net use, they think that the data does not exist. This is usually going to be wrong, as most, if not all, companies involved in providing net access or net services will keep all kinds of data, eg for troubleshooting, security or generally running their business, as well as for finding or exploiting streams of revenue.

Secondly, many users would probably expect that because they carry out their internet activity more or less in private, any data related to it is somehow private too. Again, this is often not going to be the case. Sure, there may be obligations of confidence, particularly if a website promises to keep its users’ data private. But most such obligations are either expressly qualified in the case of a lawful, valid subpoena, or will give way to one in court. That’s just the way subpoenas work. You might be able to persuade a court to impose some protections around the data, or the law may in some cases impose them for you (eg if it concerns identities of minors in some cases), but that’s after the data has been produced into court; it doesn’t stop it getting there.

Finally, what’s bringing this to a head is that both law enforcement officers and lawyers are getting much more tech-savvy. They know what data is likely to be held, and what it is likely to reveal: whether it’s IP addresses which can be used to identify ISP users, or emails thought to have been deleted, or metadata in documents that can reveal the person who authored them (see eg the BTK killer case), or private information picked up by Google whether intentionally or not (as Dell recently found to its embarrassment). Many of the major accountancy firms have computer forensic divisions that can recover and reconstruct all manner of data, and a number of their personnel are ex-law enforcement. One affidavit submitted in the prosecution of Zacarias Moussaoui sets out some detail about the FBI’s methods.

In a nutshell, getting and using this kind of data happens quite a bit now, and will happen even more in future.