The End of Theory
Category:Computers & Internet | Tags:data, mining, science, machine, learning
Created:on 2008-06-29 | Updated:on 2008-07-16
-
The End of Theory: The Data Deluge Makes the Scientific Method Obsolete
"All models are wrong, and increasingly you can succeed without them."
more from www.wired.com
-
Sixty years ago, digital computers made information readable. Twenty years ago, the Internet made it reachable. Ten years ago, the first search engine crawlers made it a single database.
-
Google's founding philosophy is that we don't know why this page is better than that one: If the statistics of incoming links say it is, that's good enough.
-
Peter Norvig, Google's research director, offered an update to George Box's maxim: "All models are wrong, and increasingly you can succeed without them."
-
The scientific method is built around testable hypotheses. These models, for the most part, are systems visualized in the minds of scientists. The models are then tested, and experiments confirm or falsify theoretical models of how the world works. This is the way science has worked for hundreds of years.
-
Once you have a model, you can connect the data sets with confidence. Data without a model is just noise.
-
But faced with massive data, this approach to science — hypothesize, model, test — is becoming obsolete.
-
There is now a better way. Petabytes allow us to say: "Correlation is enough." We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.
-
This kind of thinking is poised to go mainstream.
-
What can science learn from Google?
-
-
The End Of The Scientific Method… Wha….? « Life as a Physicist
more from gordonwatts.wordpress.com
-
His basic thesis is that when you have so much data you can map out every connection, every correlation, then the data becomes the model. No need to derive or understand what is actually happening — you have so much data that you can already make all the predictions that a model would let you do in the first place. In short — you no longer need to develop a theory or hypothesis - just map the data!
-
First, in order for this to work you need to have millions and millions and millions of data points. You need, basically, every single outcome possible, with all possible other factors. Huge amounts of data. That does not apply to all branches of science.
-
The second problem with this approach is you will never discover anything new. The problem with new things is there is no data on them!
-
Anderson is right: we are entering a new age where the ability to mine these large amounts of data is going to open up whole new levels of understanding.
-
This is a new tool, and it will open up all sorts of doors for us. But the end of the scientific method? No, because that implies an end of discovery. An end of new things.
-
Correlations are a way of catching a scientist’s attention, but the models and mechanisms that explain them are how we make the predictions that not only advance science, but generate practical applications. One only needs to look at a promising field that lacks a strong theoretical foundation—high-temperature superconductivity springs to mind—to see how badly the lack of a theory can impact progress
-
-
Why the cloud cannot obscure the scientific method
more from arstechnica.com
- This article is a response to Chris Anderson's article "The End of Theory: The Data Deluge Makes the Scientific Method Obsolete" - http://www.wired.com/science/discoveries/magazine/16-07/pb_theory
post by imrchen on 2008-06-26
-
Anderson appears to take the position that the new research part of the equation has become superfluous; simply having a good algorithm that recognizes the correlation is enough.
-
Correlations are a way of catching a scientist's attention, but the models and mechanisms that explain them are how we make the predictions that not only advance science, but generate practical applications.
-
without the testable predictions made by the theory, we'll never be able to tell how precisely it is wrong
-
Overall, the foundation of the argument for a replacement for science is correct: the data cloud is changing science, and leaving us in many cases with a Google-level understanding of the connections between things. Where Anderson stumbles is in his conclusions about what this means for science. The fact is that we couldn't have even reached this Google-level understanding without the models and mechanisms that he suggests are doomed to irrelevance.
-
The End of Science
more from journalscape.com
-
It's the distinction between engineering and science. They work in a mutualistic feedback loop, but they are very conceptually different at the core.
-
An engineer, e.g., one at Google, may or may not care exactly how something works, or whether it has explanatory power that extends beyond what he is working on. His primary concern is that it just works.
-
A scientist is primarily concerned with questions of ontology, trying to figure out what the true state of the universe is. They may actually come at the problem from the bottom-up (more data driven) or the top-down (more theory driven). But their goal is understanding, not a workable product
-
I think that's the key distinction that Anderson is missing, and so it's just silly to suggest that the torrent of data and data mining techniques are going to render standard science obsolete.
-
-
The Google Way of Science
more from www.kk.org
-
Has scientific method become obsolete? « Entertaining Research
more from mogadalai.wordpress.com
-
At a more fundamental level, in spite of what Chris Anderson has to say, science is about explanations, coherent models and understanding. In my opinion, all of what Anderson shows is that, if you have enough data, you can develop technologies without having a clear handle on the underlying science; however, it is wrong to call these technologies science, and argue that you can do science without coherent models or mechanistic explanations.
-
-
Update on scientific methodology obsoleteness « Entertaining Research
more from mogadalai.wordpress.com
-
anyone who thinks the power of data mining will let them write a spam filter without understanding linguistic structure deserves the in-box they’ll get; and that anyone who thinks they can overcome these obstacles by chanting “Bayes, Bayes, Bayes”, without also employing exactly the kind of constraints Pereira mentions, is simply ignorant of the relevant probability theory.
-
-
Earning My Turns: The End of Theory: The Data Deluge Makes the Scientific Method Obsolete
more from earningmyturns.blogspot.com
-
I like big data as much as the next guy, but this is deeply confused. Where does Anderson think those statistical algorithms come from? Without constraints in the underlying statistical models, those "patterns" would be mere coincidences. Those computational biology methods Anderson gushes over all depend on statistical models of the genome and of evolutionary relationships.
Those large-scale statistical models are different from more familiar deterministic causal models (or from parametric statistical models) because they do not specify the exact form of observable relationships as functions of a small number of parameters, but instead they set constraints on the set of hypotheses that might account for the observed data. But without well-chosen constraints — from scientific theories — all that number crunching will just memorize the experimental data.
-
-
The Reality Club: THE END OF THEORY
George Dyson, Kevin Kelly and Stewart Brand's responses to Chris Anderson's article
more from www.edge.org
-
My guess is that this emerging method will be one additional tool in the evolution of the scientific method. It will not replace any current methods (sorry, no end of science!) but will complement established theory-driven science.
-
I do not see why large amounts of data will undermine the scientific method.
-
Sometimes it will be hard, or impossible, to discover simple models
explaining huge collections of messy data taken from noisy, nonlinear
phenomena. But it doesn't mean we shouldn't try. Hypotheses aren't
simply useful tools in some potentially-outmoded vision of science;
they are the whole point. Theory is understanding, and understanding
our world is what science is all about.
-
-
Google and the end of everything » mathewingram.com/work
more from www.mathewingram.com
-
As I understand it, his argument is that since we have so much data, we can just use algorithms to find correlations in the data, and that will produce as much insight as years of traditional scientific research.
-
think it has a number of serious flaws — and they are all summed up in the title, which implies that having a lot of data and some smart algorithms to sift through it means “the end of the scientific method.” That’s just ridiculous.
-
Expanding the amount of data — even exponentially — doesn’t change the fundamental way that the scientific method functions, it just makes it a lot easier to test a hypothesis.
-
And for the record, correlation still doesn’t mean causation, and likely won’t for the foreseeable future. Correlation just means that you found some data that shares some kind of relationship with other data; it can help suggest causation, but it doesn’t replace it.
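The point about correlation and causation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the variables `z`, `x`, and `y` and the noise levels are invented for illustration and do not come from any of the quoted articles. A hidden factor `z` drives both `x` and `y`, so the two correlate strongly even though neither causes the other; once `x` is set independently of `z`, the correlation with `y` disappears.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical hidden common cause (an assumption made for this sketch).
z = [rng.gauss(0, 1) for _ in range(n)]

# x and y each depend on z plus independent noise; x has no effect on y.
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]
observed = pearson(x, y)

# "Intervention": assign x at random, independent of z, and measure y again.
x_set = [rng.gauss(0, 1) for _ in range(n)]
y_after = [zi + rng.gauss(0, 0.5) for zi in z]
intervened = pearson(x_set, y_after)

print(f"observed correlation:   {observed:.2f}")
print(f"after setting x freely: {intervened:.2f}")
```

A pattern-finding algorithm that sees only `x` and `y` would report a strong relationship; only a model that includes `z` explains why changing `x` does nothing to `y`.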
-
-
Hacking Cough - Chris Edwards' blog: Scientific method's death a little premature
more from blog.hackingcough.com
-
But the core of all that Google does right now is based on a statistical approach that makes some basic assumptions about how language works. You might call it a model.
-
Yet, machine-learning algorithms depend on the construction of some kind of model. It is not necessarily a deterministic model in the way that classical mechanics is, but just because it invokes statistics does not make it any less a model-based technique.
-
Professor Jaroslav Stark of Imperial College sees modelling as a key to understanding what goes on inside living systems precisely because models are often inaccurate. For him, the fact that a model diverges from reality provides important clues to interactions that need to be taken into account. And they can provide a way to probe interactions where it is simply not possible to use traditional methods such as turning genes off selectively because that introduces other interactions
-
But that is what science is like: it finds new information, assimilates it and moves on.
-
Big computers can certainly help with the creation and execution of models. But it seems unlikely that unleashing petaflops and petaflops on a problem blind is going to do much for machine learning.
-
Kelly discounts the idea of the approach killing the scientific method, but dreams up a new term for it: "correlative analytics".
-
the people doing real work on this stuff will be asking themselves: how was the data collected; what were the conditions? In short, while they may not read the data, they will attempt to understand how it came into being and then try to fit it into a model.
-
The original use of the term data mining was pejorative: if you have enough data and search long enough, you can always find some model that fits your data arbitrarily well.
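The pejorative sense of "data mining" described above can be sketched in a few lines. The example below is an illustration under assumptions, not anything taken from the quoted post: the function name `best_spurious_correlation`, the sample sizes, and the use of Gaussian noise are all hypothetical choices. It generates a random target and then searches through purely random candidate series, keeping the strongest correlation it finds. None of the candidates has any real relationship to the target, yet a long enough search turns up a strong-looking match.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5


def best_spurious_correlation(n_points, n_candidates, seed=0):
    """Return the largest |r| between a random target and random candidates.

    Every series is independent noise, so any correlation found is chance.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(candidate, target)))
    return best


# Same seed, so the longer search includes the shorter one as a prefix.
few = best_spurious_correlation(n_points=10, n_candidates=5)
many = best_spurious_correlation(n_points=10, n_candidates=5000)

print(f"best |r| after 5 candidates:    {few:.2f}")
print(f"best |r| after 5000 candidates: {many:.2f}")
```

The more candidates are searched, the better the best fit looks, which is why a hypothesis stated before the search, and tested on fresh data, still matters.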
-
Say what you will about the quality of our available scientific models, but the scientific method of hypothesis testing is here to stay.
-
-
What Good is a Theory? | Cosmic Variance
more from cosmicvariance.com
-
Understanding is a good thing, and in some sense is the primary goal of science.
-
Theory is understanding, and understanding our world is what science is all about.
-
-
Tasty Data Goodies: Data mining: the theory of everything?
more from blog.swivel.com
-
There is no denying the importance of new data
technology, but Anderson fails to recognize that data mining cannot
replace science.
-
the "what" doesn't really matter without the "why."
-
data mining may
change the rules of the science game, it's definitely not the end of
theory
-
