Swarna Srinivasan's Library tagged datamining

07 Sep 09

AIR Lighthouse

  • Substantive knowledge of the database is captured in a sophisticated data
    structure similar to a very smart table of contents. Through the information
    maintained in this structure, the system knows how to organize the information
    for presentation to the user, which data elements are comparable across
    collections, and what types of analyses are sensible for particular data
    elements. AIR Lighthouse uses this information to guide users as they explore
    any of (and only) the questions that the data can address.

AIR Lighthouse

  • Developed by the American Institutes for Research (AIR), AIR Lighthouse empowers
    users to ask their own questions of complex datasets without specialized
    research or statistical skills. Users can create custom-run tables, graphs and
    other statistics over the Internet. This product is designed to integrate
    multiple complex surveys, assessments, or other data collections. It actually
    captures the knowledge of expert statistical analysts and stores this knowledge
    along with the data itself. To the user, the system seems to "know the data" and
    to choose the right analytic procedures. This knowledge enables the system to
    hide the technical details and sophisticated statistical procedures from the
    user, who sees only perfectly tailored answers to his or her queries.
07 Jul 09

Statistical Data Mining Tutorials

  • The following links point to a set of tutorials on many aspects of
    statistical data mining, including the foundations of probability, the
    foundations of statistical data analysis, and most of the classic machine
    learning and data mining algorithms.

    These include classification algorithms such as decision trees, neural nets,
    Bayesian classifiers, Support Vector Machines and case-based (aka
    non-parametric) learning. They include regression algorithms such as
    multivariate polynomial regression, MARS, Locally Weighted Regression, GMDH and
    neural nets. And they include other data mining operations such as clustering
    (mixture models, k-means and hierarchical), Bayesian networks and Reinforcement
    Learning.

Probability for Data Miners

  • Tutorial Slides by Andrew Moore

    This tutorial reviews Probability starting right at ground level. It is,
    arguably, a useful investment to be completely happy with probability before
    venturing into advanced algorithms from data mining, machine learning or applied
    statistics. In addition to setting the stage for techniques to be used over and
    over again throughout the remaining tutorials, this tutorial introduces the
    notion of Density Estimation as an important operation, and then introduces
    Bayesian Classifiers such as the overfitting-prone Joint-Density Bayes
    Classifier, and the overfitting-resistant Naive Bayes Classifier.
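    To make the Naive Bayes idea concrete, here is a minimal sketch in Python. It is
    not taken from the tutorial slides: the function names, the toy weather data in
    the usage note, and the Laplace smoothing constant `alpha` are all assumptions
    chosen for illustration. The classifier treats each feature as independent given
    the class, which is what keeps it from overfitting the way a full joint-density
    estimate can.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model.

    alpha is an assumed Laplace smoothing constant (not from the tutorial).
    """
    class_counts = Counter(labels)
    # feature_counts[(feature_index, label)][value] -> number of occurrences
    feature_counts = defaultdict(Counter)
    # values[feature_index] -> set of distinct values seen in training
    values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values, alpha


def predict(model, row):
    """Return the label with the highest smoothed log-posterior score."""
    class_counts, feature_counts, values, alpha = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        # Log prior plus one smoothed log-likelihood term per feature.
        score = math.log(n / total)
        for i, value in enumerate(row):
            count = feature_counts[(i, label)][value]
            score += math.log((count + alpha) / (n + alpha * len(values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

    With four hypothetical training rows such as ("sunny", "hot") -> "no" and
    ("rainy", "cool") -> "yes", calling `predict(model, ("rainy", "cool"))` scores
    each class by its prior times the per-feature likelihoods and returns the
    higher-scoring label.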

24 Jun 09

Email patterns can predict impending doom - tech - 22 June 2009 - New Scientist

EMAIL logs can provide advance warning of an organisation reaching crisis point. That's the tantalising suggestion to emerge from the pattern of messages exchanged by Enron employees.

After US energy giant Enron collapsed in December 2001, federal investigators obtained records of emails sent by around 150 senior staff during the company's final 18 months. The logs, which record 517,000 emails sent to around 15,000 employees, provide a rare insight into how communication within an organisation changes during stressful times.

Ben Collingsworth and Ronaldo Menezes at the Florida Institute of Technology in Melbourne identified key events in Enron's demise, such as the August 2001 resignation of CEO Jeffrey Skilling. They then examined the number of emails sent, and the groups that exchanged the messages, in the period around these events. They did not look at the emails' content.

Menezes says he expected communication networks to change during moments of crisis. Yet the researchers found that the biggest changes actually happened around a month before. For example, the number of active email cliques, defined as groups in which every member has had direct email contact with every other member, jumped from 100 to almost 800 around a month before the December 2001 collapse. Messages were also increasingly exchanged within these groups and not shared with other employees.

Menezes thinks he and Collingsworth may have identified a characteristic change that occurs as stress builds within a company: employees start talking directly to people they feel comfortable with, and stop sharing information more widely. They presented their findings at the International Workshop on Complex Networks, held last month in Catania, Italy.
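
The clique definition above (every member has had direct email contact with every other member) can be sketched in Python as follows. This is an illustration only, not the researchers' method: the article does not say which algorithm they used, whether they counted only maximal cliques, or what minimum group size applied. The function names, the `min_size` parameter, and the choice of the basic Bron-Kerbosch recursion are all assumptions.

```python
from collections import defaultdict


def build_contact_graph(messages):
    """Build an undirected graph from (sender, recipient) pairs.

    Assumption: one email in either direction counts as direct contact.
    """
    graph = defaultdict(set)
    for sender, recipient in messages:
        if sender != recipient:
            graph[sender].add(recipient)
            graph[recipient].add(sender)
    return graph


def maximal_cliques(graph, min_size=3):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(clique, candidates, excluded):
        # No candidates and nothing excluded means the clique cannot grow.
        if not candidates and not excluded:
            if len(clique) >= min_size:
                found.append(frozenset(clique))
            return
        for person in list(candidates):
            expand(clique | {person},
                   candidates & graph[person],
                   excluded & graph[person])
            # Move the processed person from candidates to excluded.
            candidates = candidates - {person}
            excluded = excluded | {person}

    expand(set(), set(graph), set())
    return found
```

Counting `len(maximal_cliques(graph))` over successive time windows of an email log would give a series comparable in spirit to the clique counts described in the article, under the assumptions stated above.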

www.newscientist.com/...an-predict-impending-doom.html

datamining HR2.0

17 Jun 09

Finding High-Potential Customers | Nielsen Wire

Now more than ever, marketers are trying to identify and reach profitable and discrete market segments that were previously overlooked in an effort to keep growing in difficult economic conditions. Predictive analytics is one way to do that: it marries a range of consumer, transaction and media information and reveals the "who" and "how" of an effective go-to-market strategy.

Predictive analytics can help marketers in several ways: it can provide a deeper understanding of customers to guide the marketing spend; it offers guidance on selecting messages and vehicles for communicating with them; and it provides key metrics for measuring campaign impact.

Nielsen teamed up with Experian, the credit reporting agency, and dLife, the leading online resource for diabetics, to develop a program targeting the diabetes community. Diabetes accounts for 31 percent of health care costs in the U.S. and approximately $175 billion in spending. Using diabetics' common interest - in this case - how their disease shapes their lives and influences their food purchasing patterns - the team was able to pinpoint high-value consumer targets, determine the best retail channels to reach them and deliver marketing materials that were specific to their needs.

Read the full case study on how predictive analytics shaped this campaign and can uncover the hidden treasure of consumer value in the current edition of Consumer Insight.
Tags: consumer behavior, Consumer Insight, consumer research, diabetes, marketing analytics, pred

blog.nielsen.com/...nding-high-potential-customers

predictive analytics datamining prediction market

09 May 09

Annals of Innovation: How David Beats Goliath: Reporting & Essays: The New Yorker

  • He would conduct business on the basketball court, he decided, the same way he
    conducted business at his software firm. He would speak calmly and softly, and
    convince the girls of the wisdom of his approach with appeals to reason and
    common sense.
  • He would never forget the first time he saw a basketball game. He thought it was
    mindless.
07 Apr 09

MIT OpenCourseWare | Sloan School of Management | 15.062 Data Mining, Spring 2003 | Study Materials

  • Study Materials

    XLMiner Software (Excel Add-in)  
02 Apr 09

The knees have it | Knobbly ID | The Economist

  • Lior Shamir, a geneticist at the National Institutes of Health in Maryland,
    has developed a knee-analysing mathematical algorithm for medical use.
    Algorithms are used by computers to analyse knee images in order to compare and
    contrast tiny structures in the joint that might indicate diseases like
    osteoarthritis. Computers make this work less labour-intensive. Dr Shamir and
    his colleagues now think his algorithm could identify individuals as well.

  • To find out, they used the algorithm to explore X-ray images of the general
    structure of various knees and then to look in greater detail by measuring the
    texture of the bone by monitoring differences in individual picture points,
    called pixels. The researchers found that analysing fine details at this level
    was roughly equivalent to studying fingerprints.
19 Mar 09

Genetic Future : Why biology students should learn how to program

  • I'd agree that biological data-sets can't compete with those of particle physicists in
    terms of sheer scale, although the speed with which they are accumulating is alarming.
    Where biological data-sets really become intimidating is in their diversity, in
    the complexity of the underlying processes, and in the levels of noise and bias.
    I suspect a lot of people used to dealing with extremely large data-sets would
    still balk at the complexity of computational biology once they dug a little
    deeper, particularly in a few years' time.
  • That said, such tools and databases, however powerful, will always lag
    substantially behind the science. For young biologists who want to work
    right at the cutting edge - which will require dealing directly with rapidly
    changing technologies, generating biological data at an increasingly dizzying
    pace and in constantly evolving formats - solid informatic skills, including
    at least basic programming and sound statistical knowledge, will make you
    a far more productive scientist.

Sense Networks

  • Users who go to rock clubs see rock club hotspots, users who frequent
    hip-hop clubs see hip-hop hotspots, and those who go to both see both. The
    question "where is everybody like me right now?" is thus answered for these
    users – even in a city they've never visited before.

Sense Networks

need to track this. machine learning algorithms and data mining methods

www.sensenetworks.com

datamining location data analytics predictors mobile machine learning

09 Mar 09

NASA-Cisco climate project to flash 'Planetary Skin' - NYTimes.com

  • NASA and Cisco (Nasdaq: CSCO) will develop the online collaborative
    platform to process data from satellite, airborne and sea- and land-based
    sensors around the globe.
  • The goal is to translate the data into information that governments and
    businesses can use to mitigate and adapt to climate change and manage energy and
    natural resources more effectively, NASA and Cisco officials explained in
    interviews.
28 Feb 09

Crowds are good | The kindness of crowds | The Economist

  • Rather, they have accidentally gathered a huge body of data on how people
    behave, and particularly on how they behave in situations where violence is in
    the air. This means that hypotheses about violent behaviour which could not be
    tested experimentally for practical or ethical reasons, can now be examined in a
    scientific way. And it is that which may help violence to be controlled.
  • Virtual reality may thus allow Dr Levine to understand the collective
    choreography of violence better than he does now, but he is already convinced
    that, despite the moral panic over violence in Britain today, the influence of
    groups is largely benign. His work could have practical consequences, since
    police generally aim to break crowds up. If he is right, that approach may
    unintentionally lead to more fights. It sounds counter-intuitive, but many of
    the best ideas are. And if it is true, then perhaps Big Brother could give up
    the CCTV habit and go and do something more useful instead.