Good comment on this article by longhairedboy:
In many ways, the new initiative is an attempt to correct the failings of the government’s first “open data” effort, Data.gov.
This is stupid, actually. The Data.gov effort did not fall short because of clunky data formats, but rather because funding was drastically cut.
http://www.readwriteweb.com/archives/da ... _by_75.php
Many others have more thoughtfully described the problems with providing government data as APIs, and I'll provide links. Basically, the first step should always be providing data as downloads in easy-to-use text formats like CSV. APIs are much harder to design and support. Because they are typically underfunded in the government space, they usually perform too poorly to be relied on by third parties. Read more here:
http://sunlightlabs.com/blog/2012/gover ... ed-an-api/
http://www.peterkrantz.com/2012/publish ... pi-design/
http://sgillies.net/blog/1101/does-plei ... ve-an-api/
Government agencies shouldn't be spinning their wheels thinking about XML vs JSON, RESTful vs SOA. They should be concerned with providing accurate, timely, and extensive information in simple plain-text formats. Once this is achieved, it _may_ make sense to start thinking about APIs for a subset of those datasets.
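The commenter's "CSV first" point is easy to demonstrate: a plain-text download needs nothing but the standard library to consume, with no API client, auth, or rate limits. A minimal sketch (the column names and figures below are made up for illustration):

```python
import csv
import io

# A hypothetical agency download: plain CSV with made-up numbers,
# standing in for a file fetched from an open-data portal.
sample = io.StringIO(
    "year,agency,spending_usd\n"
    "2011,EPA,8682000000\n"
    "2012,EPA,8449000000\n"
)

# The entire "client" is the stdlib csv module.
rows = list(csv.DictReader(sample))
total = sum(int(r["spending_usd"]) for r in rows)
print(total)  # 17131000000
```

No versioned endpoints to design, no uptime to guarantee; the same two lines of parsing work for any agency that publishes a well-formed CSV.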
Fascinating. The best visual analysis has to offer... :)
Very clever and probably quite amazing. Still, I want to do that kind of thing on my own data, for myself.
This is very interesting and shows what I have long believed: data coming from distributed sources can be as statistically sound as data collected in traditional ways. In fact, with a bit of experience and focus, it can probably be better.
Wesabe is still the best example I can think of when it comes to personal data management. They had the right attitude toward my data, they struck the right balance between user input and platform functionality (despite the self-flagellation in the post), and I do believe they got it right. Time will tell whether they were just ahead of their time (in terms of users' willingness to get closer to their data) or whether the cause of failure was as the post explains.
Most cars now undergo regular state emissions and safety inspections. A mechanic plugs an electronic reader into what’s known as the onboard diagnostic unit, a computer that sits under your dashboard, monitoring data on acceleration, emissions, fuel levels and engine problems. The mechanic can then download the data to his own computer and analyze it.
Because carmakers believe such diagnostic data to be their property, much of it is accessible only by the manufacturer and authorized dealers and their mechanics. And even then, only a small amount of the data is available — most cars' computers don't store data, they only monitor it.
But what if a car’s entire data stream was made available to drivers in real time? You could use, for instance, a hypothetical “analyze-my-drive” application for your smart phone to tell you when it was time to change the oil or why your “check engine” light was on. The application could tell you how many miles you were getting to the gallon, and how much yesterday’s commute cost you in time, fuel and emissions. It could even tell you, say, that your spouse’s trips to the grocery store were 20 percent more fuel-efficient than yours.
Allowing drivers and carmakers access to real-time performance data wouldn’t prevent every future mechanical failure. But it would allow carmakers and entrepreneurs to develop analytical tools to help catch developing problems in both individual cars and entire model lines. Cars would continue to break down and even cause accidents, but it wouldn’t take a Congressional hearing to figure out why.
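The article's hypothetical "analyze-my-drive" application reduces to simple arithmetic once the raw data stream exists. A toy sketch of the idea, with entirely made-up trip data (the function names and figures are mine, not from the article):

```python
# Sketch of the article's hypothetical "analyze-my-drive" idea:
# given per-trip samples of distance and fuel burned (the kind of
# data an OBD stream could supply), compute fuel economy and
# compare two drivers. All names and numbers here are invented.

def mpg(miles, gallons):
    """Fuel economy for one trip, in miles per gallon."""
    return miles / gallons

def trip_cost(gallons, price_per_gallon):
    """Fuel cost of one trip, in dollars."""
    return gallons * price_per_gallon

# Two hypothetical grocery-store runs over the same 6-mile route.
my_trip = {"miles": 6.0, "gallons": 0.30}       # 20 mpg
spouse_trip = {"miles": 6.0, "gallons": 0.25}   # 24 mpg

mine = mpg(**my_trip)
theirs = mpg(**spouse_trip)
pct_better = round((theirs - mine) / mine * 100)
print(pct_better)  # 20 -- the article's "20 percent more fuel-efficient"
print(trip_cost(my_trip["gallons"], 4.00))  # 1.2 (dollars, at $4/gal)
```

The hard part, as the article notes, is not this arithmetic but getting access to the data stream in the first place.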
There are many initiatives making (public) data available to the public, but none making individuals' data available to the individuals themselves. This is what Mint tries to do.
Rather interesting. I'm not sure about all the conclusions; I do think we are more individual(istic) than the researchers conclude... but worth a look.
Beautiful... and occasionally meaningful.
A second concern I have is that unstructured text analysis is very difficult to do well; even specialist companies that have been at it for years find it difficult, so to suggest that SAP will simply do this because they set their mind to it is optimistic, to say the least. Entity extraction, semantic tagging and triple building, sentiment analysis, quality, and content authority are extremely challenging, especially as content feeds shift to activity streams, which provide far less context against which to analyze the content. To then marry this to structured data analysis is something that has so far been accomplished only in highly specialized fields like government intelligence and applications for financial traders.
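One way to see why a task like sentiment analysis resists a "just do it" approach: the obvious baseline, a lexicon lookup, has no notion of negation or context. A toy sketch (my own illustration, not any vendor's method):

```python
# Toy lexicon-based sentiment scorer, to illustrate why naive
# approaches fall short of what specialist companies struggle with:
# it counts positive and negative words but ignores negation,
# sarcasm, and domain context entirely.
POSITIVE = {"good", "great", "reliable"}
NEGATIVE = {"bad", "slow", "broken"}

def naive_sentiment(text):
    """Positive-minus-negative word count; crude by design."""
    words = text.lower().split()
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

print(naive_sentiment("the product is great"))      # 1
print(naive_sentiment("the product is not great"))  # 1 -- negation missed
```

Both sentences score identically, which is exactly the kind of failure that makes real entity extraction and sentiment analysis multi-year engineering efforts rather than a feature checkbox, and the problem only worsens on short, low-context activity-stream content.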
Lastly, there is one very big obstacle in the way of anyone attempting to capture unstructured data for analytics: office applications. The single greatest store of unstructured data in any enterprise is email, followed closely by documents created by personal productivity applications. Microsoft is certainly well positioned here; their entire SharePoint roadmap reads like a plan to integrate MS Office data with collaboration and transaction systems, but even Microsoft is finding that bringing intelligence to email is a tough rock to push. Similarly, companies like Gist, Xobni, Clear Context, Kwaga, and Postbox have emerged and will certainly develop leadership long before SAP enters the market.
From where I sit, the single biggest threat that Google presents is their ability to climb the learning curve very quickly; their lack of rigid product management has proved to be a remarkably effective quality for quickly iterating products that then gain broad consumer acceptance.