Igor Gembitsky on 2009-04-07
New grounds... collective consciousness is the only analogy I can think of.
This link has been bookmarked by 128 people. It was first bookmarked on 24 Jun 2008, by Steven Rafferty.
There is now a better way. Petabytes allow us to say: "Correlation is enough." We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.
data google technology culture curiosity science theory research wired philosophy causation chris_anderson sjn424
This is an interesting article that I look forward to reading. However, the scientific method is surely here to stay. Without a designed experiment to test a hypothesis, it seems to me that it remains impossible to isolate cause and effect, and so to forecast the effect of your actions.
New methods of data manipulation for a new age. We no longer have to deal with approximations in order to handle large data sets; we can actually sift through the data now, with all relationships being dynamic, constantly readjusting and redefining themselves, like an evolving organism that's representative of the whole of the human experience.
Igor Gembitsky on 2009-04-07
It really is a new age in which we don't have to deal with approximations to define relationships between data. It's all dynamic and in real time, and it really is as close to collective consciousness as we've ever gotten.
Frank Carey on 2009-04-07
I wonder what this technology can do with markets, stock and otherwise?
science data statistics google theory philosophy technology wired
"Sensors everywhere. Infinite storage. Clouds of processors. Our ability to capture, warehouse, and understand massive amounts of data is changing science, medicine, business, and technology. As our collection of facts and figures grows, so will the opportunity to find answers to fundamental questions. Because in the era of big data, more isn't just more. More is different."
As data sets get larger, the scientific method and the craving for theory die. Likens the world to a data set that computers can analyze for us, bypassing understanding as a key step to prediction and planning.
"All models are wrong, but some are useful."
Learning to use a "computer" of this scale may be challenging. But the opportunity is great: The new availability of huge amounts of data, along with the statistical tools to crunch these numbers, offers a whole new way of understanding the world. Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all.
Science google Data data mining visualization database_scholarly data_mining dh2008
thanks @bioinfoman3
statistics science mathematics hypothesis theory information trends
This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves.
The big target here isn't advertising, though. It's science. The scientific method is built around testable hypotheses. These models, for the most part, are systems visualized in the minds of scientists. The models are then tested, and experiments confirm or falsify theoretical models of how the world works. This is the way science has worked for hundreds of years.
Scientists are trained to recognize that correlation is not causation, that no conclusions should be drawn simply on the basis of correlation between X and Y (it could just be a coincidence). Instead, you must understand the underlying mechanisms that connect the two. Once you have a model, you can connect the data sets with confidence. Data without a model is just noise.
But faced with massive data, this approach to science — hypothesize, model, test — is becoming obsolete.
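The caveat in the passage above, that a correlation between X and Y "could just be a coincidence", can be made concrete with a small experiment. The following is only an illustrative sketch, not anything from the article: the function names, the number of series, their length, and the seed are all arbitrary assumptions. It generates many unrelated random series and reports the strongest correlation found between any pair.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_chance_correlation(num_series=200, length=10, seed=0):
    """Largest |r| among all pairs of independently generated series.

    Hypothetical helper: the series are pure noise, so any strong
    correlation it finds exists by chance alone.
    """
    rng = random.Random(seed)
    series = [[rng.gauss(0.0, 1.0) for _ in range(length)]
              for _ in range(num_series)]
    best = 0.0
    for i in range(num_series):
        for j in range(i + 1, num_series):
            best = max(best, abs(pearson(series[i], series[j])))
    return best


if __name__ == "__main__":
    best = strongest_chance_correlation()
    print(f"strongest correlation between unrelated series: {best:.2f}")
```

With enough short series to compare, some pair will look strongly related even though nothing connects them, which is the traditional argument for asking about mechanisms before trusting a pattern.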
Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all.
There's no reason to cling to our old ways. It's time to ask: What can science learn from Google?
The End of Theory: The Data Deluge Makes the Scientific Method Obsolete
Article on data being used to prove theory. Cool visuals that show how data is being analyzed.
mashups data mining statistics visualization science collaboration
"All models are wrong, but some are useful."
So proclaimed statistician George Box 30 years ago, and he was right. But what choice did we have? Only models, from cosmological equations to theories of human behavior, seemed to be able to consistently, if imperfectly, explain the world around us. Until now. Today companies like Google, which have grown up in an era of massively abundant data, don't have to settle for wrong models. Indeed, they don't have to settle for models at all.
Sixty years ago, digital computers made information readable. Twenty years ago, the Internet made it reachable. Ten years ago, the first search engine crawlers made it a single database. Now Google and like-minded companies are sifting through the most measured age in history, treating this massive corpus as a laboratory of the human condition. They are the children of the Petabyte Age.
Paul Hoff on 2008-06-30
Number crunching will show us correlations, but human interpretation (sense-making) of correlations still wins.
cloudcomputing cluster data models organisation science statistics trends wired google delicious
An extreme argument, but partly correct.
The Petabyte Age is different because more is different. Kilobytes were stored on floppy disks. Megabytes were stored on hard disks. Terabytes were stored in disk arrays. Petabytes are stored in the cloud. As we moved along that progression, we went from the folder analogy to the file cabinet analogy to the library analogy to — well, at petabytes we ran out of organizational analogies.
At the petabyte scale, information is not a matter of simple three- and four-dimensional taxonomy and order but of dimensionally agnostic statistics. It calls for an entirely different approach, one that requires us to lose the tether of data as something that can be visualized in its totality. It forces us to view data mathematically first and establish a context for it later. For instance, Google conquered the advertising world with nothing more than applied mathematics. It didn't pretend to know anything about the culture and conventions of advertising — it just assumed that better data, with better analytical tools, would win the day. And Google was right.
"All models are wrong, and increasing you can succeed without them."
article best articles blogs change computer computing culture science statistics data philosophy google theory technology wired tech numbers research information huge-entity.com