datamining – ubisurv

FBI data warehouse revealed by EFF

Tenacious FoI and ‘institutional discovery’ work both in and out of the US courts by the Electronic Frontier Foundation has resulted in the FBI releasing lots of information about its enormous dataveillance program, based around the Investigative Data Warehouse (IDW).

The clear and comprehensible report is available from EFF here, but the basic messages are that:

the FBI now has a data warehouse with over a billion unique documents or seven times as many as are contained in the Library of Congress;
it is using content management and datamining software to connect, cross-reference and analyse data from over fifty previously separate datasets included in the warehouse. These include, by the way, the entire US-VISIT database, the No-Fly list and other controversial post-9/11 systems.
The IDW will be used for both link and pattern analysis using technology connected to the Foreign Terrorist Tracking Task Force (FTTTF) program, in other words Knowledge Discovery in Databases (KDD) software, which will, by connecting people, groups and places, generate entirely ‘new’ data and project links forward in time as predictions.

EFF conclude that datamining is the future for the IDW. This is true, but I would also say that it was the past and is the present too. Datamining is not new for the US intelligence services; indeed, many of the techniques we now call datamining were developed by the National Security Agency (NSA). There would be no point in the FBI just warehousing vast numbers of documents without techniques for analysing and connecting them. KDD may well be more recent for the FBI, and this phildickian ‘pre-crime’ is most certainly the future in more ways than one…

There is a lot that interests me here (and indeed, I am currently trying to write a piece about the socio-technical history of these massive intelligence data analysis systems), but one issue is whether this complex operation will ‘work’ or whether it will throw up so many random and worthless ‘connections’ (the ‘six degrees of Kevin Bacon’ syndrome) that it will actually slow down or damage investigations into real criminal activities. That all depends on the architecture of the system, and that is something we know little about, although there are a few hints in the EFF report…
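
To make the ‘six degrees’ worry concrete, here is a minimal Python sketch. It says nothing about how the IDW actually works: the toy network, the function names `build_random_network` and `reachable_within`, and all the numbers are my own assumptions for illustration. It simply counts how many people fall within a given number of hops of one starting person in a randomly wired contact network.

```python
import random
from collections import deque


def build_random_network(people, contacts_each, seed=0):
    """Toy, made-up contact network: each person is linked to a few random others."""
    rng = random.Random(seed)
    graph = {person: set() for person in range(people)}
    for person in range(people):
        for other in rng.sample(range(people), contacts_each):
            if other != person:
                graph[person].add(other)
                graph[other].add(person)  # links are treated as mutual
    return graph


def reachable_within(graph, start, max_hops):
    """Breadth-first search: return {person: hops} for everyone within max_hops of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distances[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for contact in graph.get(person, ()):
            if contact not in distances:
                distances[contact] = distances[person] + 1
                queue.append(contact)
    return distances


if __name__ == "__main__":
    # Assumed sizes, chosen only to keep the example quick to run.
    network = build_random_network(people=5000, contacts_each=5)
    for hops in range(1, 6):
        linked = len(reachable_within(network, 0, hops)) - 1  # exclude person 0
        print(f"{hops} hop(s): {linked} people 'connected' to person 0")
```

In a randomly wired network like this one, the count tends to multiply with each extra hop until it covers nearly everyone, which is why a link found several steps away from a suspect may carry very little information on its own.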

(thanks to Rosamunde van Brakel for the link)

Global CCTV datamining project revealed

As a result of an annual report on datamining sent to the US Congress by the Office of the Director of National Intelligence, a research project, Video Analysis and Content Extraction (VACE), has been revealed. The program is aiming to produce a computer system that will be able to search and analyse video images, especially “surveillance-camera data from countries other than the United States” to identify “well-established patterns of clearly suspicious behavior.”

Conducted by the Office of Incisive Analysis, part of the Intelligence Advanced Research Projects Activity (IARPA), the program has apparently been running since 2001, and is merely one of several post-9/11 research projects aiming to create advanced dataveillance systems to analyse data from global sources. How the USA would obtain the information is not specified…

One could spend a long time listing all the DARPA and IARPA projects that are running, many of which are speculative and come to nothing. The report also mentions the curious Project Reynard that I have mentioned before, which aims to analyse the behaviours of avatars in online gaming environments with the aim of detecting ‘suspicious behaviours’. Reynard is apparently achieving some successful results, but we have no real idea at what stage VACE is, and the report only states that some elements are being tested with real world data. This implies that there is nowhere near a complete system. Nevertheless the mentality behind these projects is worrying. It is hardly the first time that the USA has tried to create what Paul Edwards called a ‘closed world’ and these utopian projects which effectively try to know the whole world in some way (like ECHELON, or the FBI’s proposed Server in the Sky) are an ongoing US state obsession.

What is particularly worrying is the idea that ‘suspicious patterns of behaviour’ can be identified through constant surveillance and automated analysis, and that our behaviour and indeed our thoughts are no longer our own business. For it is thoughts, and the anticipation of action, that are the ultimate goal. One can see this, at a finer grain, in programs like Project Hostile Intent, a Department of Homeland Security initiative to analyse ‘microexpressions’, supposedly preconscious facial movements. The EU is not immune from such incredibly intrusive proposals: so-called ‘spy in the cabin’ cameras and microphones in the back of every seat have been proposed by the EU-funded SAFEE project, which is supported by a large consortium of security corporations. The European Commission has already hinted that it might try to ‘require’ airlines to use the system when developed.

No doubt too, because of the close (and largely secret and unaccountable) co-operation of the EU and USA on security issues, all the images and recordings would find their way into these proposed databases and their inhuman agents would check them over to make sure we are all passive, good humans with correct behaviours, expressions and thoughts, whether we are in the real or the virtual world…

Is Facebook going to sell your data or not?

There’s been some discussion recently over surveillance on Facebook and, in particular, the question of whether Facebook is planning to use the vast amounts of data it holds for more targeted and intrusive marketing. Britain’s Daily Telegraph reported yesterday, based on an interview with Randi Zuckerberg, Facebook’s global markets director (and not coincidentally, sister of founder Mark Zuckerberg), that it was going to do this. It based its conclusion on the fact that Facebook was demonstrating new instant polling tools at the Davos World Economic Forum, Facebook’s development of so-called User Engagement Advertising, and the fact that unnamed ‘marketing experts’ say that Facebook could be ‘worth millions’ to advertisers.

But it turns out this is putting 2+2 together to make 5. Techcrunch was one of many tech blogs that questioned the Daily Telegraph’s story. They asked Facebook what was going on and were told that the WEF polls were nothing to do with Engagement Ads (which have been on Facebook for a while already) and that ‘Facebook has, for many years, allowed the targeting of advertising in a non-personally identifiable way, based on profile attributes. Nothing has changed in our approach, and Facebook is committed, as always, to connecting users in a trusted environment.’

Now I don’t trust The Daily Telegraph, which has been declining in quality over the last few years and cutting experienced journalists in favour of using agency stories rewritten by trainees. But equally I don’t trust Facebook (or for that matter, any company run by rich kids whose only experience of the world is college, but that’s another story…). It is easy to imagine that they encourage such stories to test the waters. If the reaction was less worried, they might indeed decide to reveal themselves as a massive marketing scam, but the primary limitation to any social networking tool being used for purposes that users don’t like is that the users can just walk. Facebook appeared from nowhere to become a global player within a few years and it could disappear just as quickly when the next big thing arrives. The rise and fall of net-based companies is only going to get faster.

(Thanks to Sami Coll and Jason Nolan for bringing this to my attention)