Submitted by Jon on Mon, 12/06/2010 - 11:24
My brain was pretty close to silly putty by the end of last week. It had been snapping back and forth, rubber-band-like, between microscopic, tightly focused, gnarled and tricky use cases and their connection to the UN Global Pulse project - a global, systems-changing project.
The Global Pulse, at scale, is... well, the more time I spend on it the less I'm sure I know what it specifically is. In effect, it is a massive data coordination system which helps with visualizing and tracking anomalies and trends. The dream is to predict crises and improve prevention. This is easiest to picture in detecting disease outbreaks through various data-connected behavior changes (an increase in usage of oral rehydration salts, as evidenced by stock-outs in health clinics reported in a national health information system, could indicate a cholera problem). Its most valuable use case seems to be at the national level, but there would also (obviously) be a global level to track larger trends across countries and regions. And country-level offices could peer together with other Pulse installations, bring in global baseline data, and so on. It keeps going deeper and deeper in every direction possible.
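To make the stock-out example a little more concrete, here is a minimal sketch of one way a spike could be flagged: compare each week's count against a trailing baseline. This is not how Pulse does it (that was still being worked out); the function name, the window, the threshold and the weekly numbers are all made up for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(counts, window=4, threshold=3.0, min_sigma=1.0):
    """Return indices of values that sit far above their trailing baseline.

    Hypothetical helper: a value is flagged when it exceeds the mean of the
    previous `window` values by more than `threshold` standard deviations.
    `min_sigma` is an assumed floor so a perfectly flat baseline does not
    cause a division by zero.
    """
    flagged = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu = mean(baseline)
        sigma = max(stdev(baseline), min_sigma)
        if (counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Made-up weekly counts of clinics reporting ORS stock-outs.
weekly_stockouts = [2, 3, 2, 3, 2, 3, 14, 3]
spike_weeks = flag_anomalies(weekly_stockouts)
print(spike_weeks)
```

A real system would need far more than this (seasonality, reporting delays, missing reports), and knowing which signal to watch in the first place is exactly the local-context problem.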
Even accepting the insanely complicated data and architecture questions, how do you even find the right data (whether it's well-formatted or a pile of paper), connect it in, pull out solid anomaly tracking, and generate useful, predictive guesses on trends? That's the key in the next stage of the Pulse - starting in one country. This PulseLab will be able to grok the local context and know the right data to plug in.
The tricky data and architecture part has also been faced. Implementation will not be easy, by any means, but the goal of the architecture is to re-use and re-cycle as many existing tools as possible to slurp in data (both chunky databases and firehoses of live streaming data), standardize it, and then create a set of manipulation and visualization tools to help reveal trends and test hypotheses. This will likely take the form of a set of toyboxes of data sources, data transformation tools, apps (input/output to other useful systems like mappers, charting, Ushahidi, etc.), and a recipe box of how others have chained these together for specific data-digging goals. The recipe and the hypothesis testing tool (the "hunch") will likely compete to be the core social object of the system, with aid and government officials trading hunches and recipes (and recipes to support hunches, hunches based on recipe results...).
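For a rough feel of what a "recipe" might look like, here is a toy sketch that chains one data source through a couple of transformation tools into an app. Everything in it is an assumption for illustration: the function names, the record fields and the sample data are invented, and the real architecture had not been built when this was written.

```python
from typing import Callable, Iterable, Iterator

Record = dict


def clinic_reports() -> Iterator[Record]:
    """Hypothetical data source: messy, made-up clinic stock reports."""
    yield {"district": " North ", "item": "ors", "in_stock": "No"}
    yield {"district": "north", "item": "ORS", "in_stock": "no"}
    yield {"district": "South", "item": "ORS", "in_stock": "yes"}
    yield {"district": "south", "item": "ORS", "in_stock": "NO"}


def standardize(records: Iterable[Record]) -> Iterator[Record]:
    """Transformation tool: normalize casing, whitespace and booleans."""
    for r in records:
        yield {
            "district": r["district"].strip().lower(),
            "item": r["item"].strip().upper(),
            "in_stock": r["in_stock"].strip().lower() == "yes",
        }


def only_stockouts(records: Iterable[Record]) -> Iterator[Record]:
    """Transformation tool: keep only reports where the item ran out."""
    return (r for r in records if not r["in_stock"])


def count_by_district(records: Iterable[Record]) -> dict:
    """Stand-in for an output app (a mapper or chart would go here)."""
    counts: dict = {}
    for r in records:
        counts[r["district"]] = counts.get(r["district"], 0) + 1
    return counts


def run_recipe(source: Iterable[Record],
               transforms: list,
               app: Callable[[Iterable[Record]], dict]) -> dict:
    """A 'recipe': a source, an ordered chain of transforms, and an app."""
    stream = source
    for transform in transforms:
        stream = transform(stream)
    return app(stream)


result = run_recipe(clinic_reports(), [standardize, only_stockouts],
                    count_by_district)
print(result)
```

The point of keeping each piece small and swappable is that someone else could reuse the same transforms with a different source or a different app, which is what would make a recipe worth trading.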
There is a lot up in the air, and a lot still to congeal as development of this tool and the architecture gets moving. The (amazingly well-facilitated) process, which went from thinking through users and their requirements to finding the common underlying system-level requirements and speccing those out, was fantastic. It encouraged a lot of different and conflicting views around the product's end form to come to a loose consensus (and a better, more flexible product outline!).
If you're saying it can't be done, you're almost right. The first iterations will be limited and possibly fragile, relying on low-hanging data fruit instead of difficult-to-"harvest" data exhaust. Privacy issues abound, both on personal levels and at government security levels. Trust me when I say that the room was chock full of mind-bogglingly smart people who have dealt with the real worlds of development and reconstruction work, and these obstacles are being worked through by people who realize that lives and livelihoods are at stake in some of the privacy questions.
Here are summary notes from each day: One (Term of the Day: "Data Exhaust"), Two (TotD: "Data Esperanto"), and Three (TotD: "Contextualized Cartography"), as well as a solid overview of the project, and the call to action leading into last week's workshop. There is also a great writeup of the event by MIT's Nadav Aharony.