FBI data warehouse revealed by EFF

Tenacious FoI and ‘institutional discovery’ work both in and out of the US courts by the Electronic Frontier Foundation has resulted in the FBI releasing lots of information about its enormous dataveillance program, based around the Investigative Data Warehouse (IDW). 

The clear and comprehensible report is available from EFF here, but the basic messages are that:

  • the FBI now has a data warehouse with over a billion unique documents, or seven times as many as are contained in the Library of Congress;
  • it is using content management and datamining software to connect, cross-reference and analyse data from over fifty previously separate datasets included in the warehouse (these include, by the way, the entire US-VISIT database, the No-Fly list and other controversial post-9/11 systems);
  • the IDW will be used for both link and pattern analysis using technology connected to the Foreign Terrorist Tracking Task Force (FTTTF) program, in other words Knowledge Discovery in Databases (KDD) software, which, through connecting people, groups and places, will generate entirely ‘new’ data and project links forward in time as predictions.

EFF conclude that datamining is the future for the IDW. This is true, but I would also say that it was the past and is the present too. Datamining is not new for the US intelligence services; indeed, many of the techniques we now call datamining were developed by the National Security Agency (NSA). There would be no point in the FBI just warehousing vast numbers of documents without techniques for analysing and connecting them. KDD may well be more recent for the FBI, and this phildickian ‘pre-crime’ is most certainly the future in more ways than one…

There is a lot that interests me here (and indeed, I am currently trying to write a piece about the socio-technical history of these massive intelligence data analysis systems), but one issue is whether this complex operation will ‘work’ or whether it will throw up so many random and worthless ‘connections’ (the ‘six-degrees of Kevin Bacon’ syndrome) that it will actually slow down or damage actual investigations into real criminal activities. That all depends on the architecture of the system, and that is something we know little about, although there are a few hints in the EFF report…

(thanks to Rosamunde van Brakel for the link)

UK National DNA Database – what will change?

The government’s official response to the damning ruling by the European Court over the retention of DNA and fingerprint samples and data is a farce, which seems utterly contemptuous of the ruling and reasoning of the court, and shows no sign of understanding the significance of Article 8 or the British common law principle of innocent until proven guilty.

One thing that has struck me recently in the UK has been the sudden increase in the level of defensiveness by New Labour over the surveillance apparatus it has constructed over the last 12 years. Report after report has damned their slapdash attitude to human rights and civil liberties – we expect the government’s official response to the Lords Constitution Committee report next week – and there have been attacks from various political ‘big beasts’ including David Blunkett, former MI5 chief Stella Rimington and, most recently, Stephen Byers. Even current cabinet ministers have reportedly asked for the ID card scheme to be scrapped.

As a result, there has been a splurge of sudden backtracks, retreats and promises of change and consultation on future plans, but there have also been rather devious attempts to avoid taking real action to remedy already existing wrongs. In the first category, we have seen the abandonment of Clause 152 of the Coroners and Justice Bill, where a blanket permission for government data-sharing had been hidden, and there have been suggestions that the proposed new super-database of communications traffic data might not be constructed after all – though largely, it seems, on grounds of cost not principle.

However, in the second category, today we got the government’s official response to the damning ruling by the European Court over the retention of DNA and fingerprint samples and data by the UK police. It is, to put it mildly, a farce, which seems utterly contemptuous of the ruling and reasoning of the court, and shows no sign of understanding the significance of Article 8 for individual liberty. Mind you, it also shows little sign of comprehending the British common law principle of innocent until proven guilty.

The government proposes to retain the DNA samples and profiles, and fingerprints (these are just as important and not so often mentioned in the news reports), of all those convicted of a crime. As for the innocent, the National DNA Database (NDNAD) holds more than 350,000 people who are certainly in such a position. However, the police apparently need two years to go through the Police National Computer to check the other 500,000+ DNA profiles of those not convicted of any crime, as they can’t be sure whether existing profiles match those who have committed offences (so much for joined-up government…). Then those people, who are, let’s not forget, entirely innocent in law, will be sorted into two categories – those arrested but not convicted for serious and violent offences, and those arrested and not convicted of minor offences.

Will the latter have their profiles immediately removed, as we might reasonably expect?

Err, no.

In fact, these innocent people will have their DNA profiles and fingerprints retained for 6 years – more than the number of years (5) that Scotland retains the DNA of those suspected of serious and violent offences. Those in the former category, arrested but not convicted for serious and violent offences, will have their DNA profiles and fingerprints retained for 12 years. In addition, the profiles of children will be retained until they are 18, and then removed only if they have been arrested (again, not convicted) for one minor offence.

Is this an acceptable response? Quite clearly not. It is against the spirit of the ruling by the European Court, even if it might be interpreted as complying with the exact wording issued. More to the point, it is an attempt to get around the difficult issues, not deal with them. It is devious, based on the pre-emptive logic of risk-surveillance principles, and goes against the long-standing principles of British Common Law as well as more recent developments in Human Rights law. It is not the response of a government that has any trust in the people who elected it. It allows the police to continue to populate the NDNAD by stealth. And they certainly are using whatever methods they can to do so. One key indicator is the rise in the number of stop and searches under Section 44 of the Terrorism Act, which in London, it was also reported today, rose from 72,000 in 2007 to 170,000 in 2008. On those figures the 2008 total is about 236% of the 2007 one, an increase of roughly 136%; amongst the black population the reported rise was 325%. There seems to be no mention in recent government statements of the role that discriminatory stop and search policing plays in populating the NDNAD. It is quite clear, however, that stop and search policing is discriminatory, and we know too that young black men are disproportionately represented in the NDNAD.

In this climate, with a government obsessed by pre-emptive security to compensate for its growing loss of power and trust, and a police service that appears, after the G20, increasingly out of control, what is the chance of developing a fair, accountable, just and transparent system of personal data retention in law enforcement in the UK? At the moment, it would appear, the answer is ‘very small’.

The loneliness of personal data

Still from I Love Alaska

There is something at once banal and heartbreaking about what is revealed through the examination of personal data. The episodic film, I Love Alaska, captures this beautifully. The film by Lernert Engelberts and Sander Plug is based on AOL’s accidental exposure of the search data of hundreds of thousands of its users, and focuses on just one, 711391. The film consists of an actress reading out the (unusually discursive and plain language) search terms of User 711391 like an incantation, with background sound from Alaskan locations and static camera shots that serve to emphasize her boredom, isolation and loneliness.

I was watching episode 5 of the film when two stories popped into my inbox that just happened to be related. The first was from the New York Times business section and dealt with the other side of the recent US sporting scandal over revelations that baseball player Alex Rodriguez had taken steroids. Like User 711391, Rodriguez had given up his data (in this case, a sample) in the belief that the data would be anonymous and aggregated. But it wasn’t.

So, then we come to how the state deals with this. The Toronto Globe and Mail comments on the way the Canadian federal government is, like so many others, proposing to introduce new legislation to monitor and control Internet use. The comment argues that there is no general need to store personal Internet use data (or Canada will end up like the UK…), and that Internet surveillance should be governed by judicial oversight. Quite so. But, as the NYT article points out, it isn’t just the expanding appetite of the state for data (frequently coupled in the UK with incompetence in data handling) that we should fear but the growth in numbers of, and lack of any oversight or control over, private-sector dataveillance operations.

Some people will argue that any talk of privacy here is irrelevant: User 711391 was cheating on her husband; Rodriguez was taking steroids; there are paedophiles and terrorists conspiring on the Internet. With surveillance the guilty are revealed. Surely, as Damon Knight’s classic short story, ‘I See You’, claimed, with everything exposed we are truly free from ‘sin’? But no. In its revelations, surveillance like this harms us all: it makes our lives banal and reveals only the sadness and the pain. For User 711391, her access to the Internet served at different times as her main source of entertainment, desire, friendship, and even conscience. The AOL debacle revealed all of this and demeaned her and many others in the process. Most of us deserve the comfort of our very ordinary secrets and the ability for things to be forgotten. This is the true value of privacy.

(Thanks to Chiara Fonio for letting me know about I Love Alaska)