Kevin Kelly -- The Technium - The Diigo Meta page

www.kk.org/...the_google_way.php

Michel Roland's personal annotations on this page

bibliothecaire
Bibliothecaire bookmarked on 2008-06-29
google science
  • Perhaps understanding and answers are overrated. "The problem with computers," Pablo Picasso is rumored to have said, "is that they only give you answers."  These huge data-driven correlative systems will give us lots of answers -- good answers -- but that is all they will give us. That's what the OneComputer does --  gives us good answers. In the coming world of cloud computing perfectly good answers will become a commodity. The real value of the rest of science then becomes asking good questions.

This link has been bookmarked by 26 people. It was first bookmarked on 29 Jun 2008, by Roger Chen.

  • 17 Oct 09
    • It may turn out that tremendously large volumes of data are sufficient to skip the theory part in order to make a predicted observation. Google was one of the first to notice this. For instance, take Google's spell checker. When you misspell a word when googling, Google suggests the proper spelling. How does it know this? How does it predict the correctly spelled word? It is not because it has a theory of good spelling, or has mastered spelling rules. In fact Google knows nothing about spelling rules at all.


      Instead Google operates a very large dataset of observations which show that for any given spelling of a word, x number of people say "yes" when asked if they meant to spell word "y." Google's spelling engine consists entirely of these datapoints, rather than any notion of what correct English spelling is. That is why the same system can correct spelling in any language.
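
      The idea in the passage above can be illustrated with a minimal sketch. It is not Google's actual system: the observation log, the function names build_table and suggest, and the sample words are all hypothetical, chosen only to show how a correction can come from counted "yes" answers rather than from any spelling rule.

```python
from collections import Counter, defaultdict

# Hypothetical log of observations: (what was typed, what was suggested,
# whether the user said "yes"). Invented data for illustration only.
OBSERVATIONS = [
    ("recieve", "receive", True),
    ("recieve", "receive", True),
    ("recieve", "relieve", True),
    ("recieve", "receive", False),
    ("bonjoor", "bonjour", True),
]

def build_table(observations):
    """Count, for each typed spelling, how often each suggestion was accepted."""
    table = defaultdict(Counter)
    for typed, suggested, accepted in observations:
        if accepted:
            table[typed][suggested] += 1
    return table

def suggest(table, typed):
    """Return the most frequently accepted suggestion, or None if unseen."""
    counts = table.get(typed)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

table = build_table(OBSERVATIONS)
print(suggest(table, "recieve"))
print(suggest(table, "bonjoor"))
print(suggest(table, "zzzz"))
```

      Nothing in the sketch encodes English or French spelling; the same counting works for any language that appears in the log, which is the point the passage makes.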

  • 16 Oct 09
    • There's a dawning sense that extremely large databases of information, starting in the petabyte level, could change how we learn things. The traditional way of doing science entails constructing a hypothesis to match observed data or to solicit new data. Here's a bunch of observations; what theory explains the data sufficiently so that we can predict the next observation?
    • It may turn out that tremendously large volumes of data are sufficient to skip the theory part in order to make a predicted observation. Google was one of the first to notice this. For instance, take Google's spell checker. When you misspell a word when googling, Google suggests the proper spelling. How does it know this? How does it predict the correctly spelled word? It is not because it has a theory of good spelling, or has mastered spelling rules. In fact Google knows nothing about spelling rules at all.
  • 17 Sep 08
  • 11 Aug 08
    ognjen
    ognjen s

    If you can learn how to spell without knowing anything about the rules or grammar of spelling, and if you can learn how to translate languages without having any theory or concepts about grammar of the languages you are translating, then what else can you

    science technology philosophy kkelly longtail searle

  • 22 Jul 08
  • 12 Jul 08
    • There's a dawning sense that extremely large databases of information, starting in the petabyte level, could change how we learn things. The traditional way of doing science entails constructing a hypothesis to match observed data or to solicit new data. Here's a bunch of observations; what theory explains the data sufficiently so that we can predict the next observation?
    • Google knows nothing about spelling rules at all.


      Instead Google operates a very large dataset of observations which show that for any given spelling of a word, x number of people say "yes" when asked if they meant to spell word "y."

  • 11 Jul 08
    anonymous

    In the coming world of cloud computing perfectly good answers will become a commodity. The real value of the rest of science then becomes asking good questions.

    cloudcomputing petabyteage science technology

  • 10 Jul 08
  • 08 Jul 08
    • There may be something to this observation. Many sciences such as astronomy, physics, genomics, linguistics, and geology are generating extremely huge datasets and constant streams of data in the petabyte level today. They'll be in the exabyte level in a decade. Using old fashioned "machine learning," computers can extract patterns in this ocean of data that no human could ever possibly detect. These patterns are correlations. They may or may not be causative, but we can learn new things. Therefore they accomplish what science does, although not in the traditional manner.

    • The technical term for this approach in science is Data Intensive Scalable Computation (DISC). Other terms are "Grid Datafarm Architecture" or "Petascale Data Intensive Computing."
  • 07 Jul 08
  • 05 Jul 08
    jurijmlotman
    Martin Lindner

    Just as we will eventually take the brain apart, neuron by neuron, and never find the model, we will discover that true AI came into existence without ever needing a coherent model or a theory of intelligence. Reality does the job just fine.

    mediatheory ai semantic_cloud deli

  • 02 Jul 08
  • myszenka
    Gosia Stergios

    This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. The technical

    new_technologies cloud_computing future_web data_mining

  • 01 Jul 08
    • The Google Way of Science
  • 30 Jun 08
    • The technical term for this approach in science is Data Intensive Scalable Computation (DISC). Other terms are "Grid Datafarm Architecture" or "Petascale Data Intensive Computing." The emphasis in these techniques is on the data-intensive nature of the computation, rather than on the computing cluster itself. The online industry calls this approach of investigation a type of "analytics." Cloud computing companies like Google, IBM, and Yahoo, and some universities have been holding workshops on the topic. In essence these pioneers are trying to exploit cloud computing, or the OneMachine, for large-scale science. The current tools include massively parallel software platforms like MapReduce and Hadoop (see my earlier post), cheap storage, and gigantic clusters of data centers. So far, very few scientists outside of genomics are employing these new tools. The intent of the NSF's Cluster Exploratory program is to match scientists owning large database-driven observations with computer scientists who have access to and expertise with cluster/cloud computing.
  • 29 Jun 08
  • brands
    Chuck Brands

    The Google Way of Science

  • mbauwens
    Michel Bauwens

    There's a dawning sense that extremely large databases of information, starting in the petabyte level, could change how we learn things.

    Cloud-Computing P2P-Science P2P

    • Perhaps understanding and answers are overrated. "The problem with computers," Pablo Picasso is rumored to have said, "is that they only give you answers."  These huge data-driven correlative systems will give us lots of answers -- good answers -- but that is all they will give us. That's what the OneComputer does --  gives us good answers. In the coming world of cloud computing perfectly good answers will become a commodity. The real value of the rest of science then becomes asking good questions.