We have noticed recently that many people dispute the idea that data are widely accessible to the masses. They believe that most substantial data are hidden in heavily protected datacenters. In addition, the common belief is that even if the data were accessible, that would not make them of any use to a mass audience.
These arguments are partially true, and we found it interesting to try to summarize the main obstacles to mass usage of data.
We believe there are three such bottlenecks:
1° Deal with the absence of standards
Data come in many forms and in even more numerous formats. There have been some attempts to standardise data into unified international formats, but the least we can say is that these efforts have not paid off yet. As a result, any platform that wants to use heterogeneous data has to reformat them before any meaningful use. That is part of the work Data.org does, among many other players. There are good reasons to be optimistic here: not because data are becoming standardised, but because an increasing number of technologies can reformat data so that they can be exploited more easily.
On a related topic, and contrary to the common belief, data are widely available. One can access almost any type of data about traffic, weather, pollution, demographics, imports and exports of goods (in great detail), energy, etc. On this topic, the article "The Data Deluge" from The Economist is quite accurate.
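To make the reformatting work described in this section more concrete, here is a minimal sketch in Python. It assumes two invented sources reporting temperatures: a CSV file in Fahrenheit and a JSON feed in Celsius. The column names, keys and target schema are our own assumptions for illustration, not formats used by any platform mentioned here.

```python
import csv
import io
import json

# Hypothetical common schema: {"city": str, "temperature_c": float}


def from_csv(text):
    """Read a CSV source with assumed columns 'city' and 'temp_f' (Fahrenheit)."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {
            "city": row["city"],
            # Convert Fahrenheit to Celsius so all sources share one unit.
            "temperature_c": round((float(row["temp_f"]) - 32) * 5 / 9, 1),
        }
        for row in rows
    ]


def from_json(text):
    """Read a JSON source with assumed keys 'location' and 'celsius'."""
    return [
        {"city": item["location"], "temperature_c": float(item["celsius"])}
        for item in json.loads(text)
    ]


# Two invented samples standing in for heterogeneous real-world sources.
csv_source = "city,temp_f\nParis,68\nOslo,50\n"
json_source = '[{"location": "Lima", "celsius": 19.5}]'

# Once every source is mapped to the same schema, the records can be combined.
unified = from_csv(csv_source) + from_json(json_source)
for record in unified:
    print(record)
```

Each source gets its own small adapter, and everything downstream only sees the common schema. Real datasets would of course need far more care (missing values, encodings, conflicting definitions), which is exactly why this step remains a bottleneck.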
2° Ease access to data
Although an almost unlimited amount of data flows through the Internet, it remains very difficult to find. There is no real search engine for data, and no real one-stop shop for them yet. A few initiatives, among which data.gov is probably the best known, are currently trying to overcome this issue, but we still lack a "Datapedia" offering.
3° Provide fun
Data is usually boring. It is nothing like typing your friend's name into Google or browsing Twitter. Few people spend their weekends playing with data series in Excel. For anyone to be able to play with data, it has to happen through a very friendly interface, one that has more to do with the right side of our brain (the emotional one) than with the left (the rational one). This means that new tools must emerge, tools that are revolutionary compared to those that have existed over the past quarter of a century. Above all, it means that the era of the data spreadsheet (the Excel data standard) must end and be replaced by the era of data visualization, or in other words, of the graphical representation of data.
Unlocking these three bottlenecks is critical if data are to become the next revolutionary phase of the Internet. It would probably also require data to become more culturally accepted, i.e. that we understand that a piece of information is not necessarily hard to understand, compare, and use simply because it is made out of data. This might be the most difficult issue to deal with, but we have no doubt that it will be overcome very soon.