Carlos Santos's Library tagged via:cshalizi

16 Sep 09

[0909.0844] High-Dimensional Non-Linear Variable Selection through Hierarchical Kernel Learning

We consider the problem of high-dimensional non-linear variable selection for supervised learning. Our approach is based on performing linear selection among exponentially many appropriately defined positive definite kernels that characterize non-linear interactions between the original variables. To select efficiently from these many kernels, we use the natural hierarchical structure of the problem to extend the multiple kernel learning framework to kernels that can be embedded in a directed acyclic graph; we show that it is then possible to perform kernel selection through a graph-adapted sparsity-inducing norm, in polynomial time in the number of selected kernels. Moreover, we study the consistency of variable selection in high-dimensional settings, showing that under certain assumptions, our regularization framework allows a number of irrelevant variables which is exponential in the number of observations. Our simulations on synthetic datasets and datasets from the UCI repository show state-of-the-art predictive performance for non-linear regression problems.

arxiv.org/0909.0844

MachineLearning via:cshalizi by:FrancisBach FeatureSelection KernelMethods

MIT Press Journals - Journal of Cognitive Neuroscience - Abstract

Converging evidence from humans and nonhuman primates is obliging us to abandon conventional models in favor of a radically different, distributed-network paradigm of cortical memory. Central to the new paradigm is the concept of memory network or cognit—that is, a memory or an item of knowledge defined by a pattern of connections between neuron populations associated by experience. Cognits are hierarchically organized in terms of semantic abstraction and complexity. Complex cognits link neurons in noncontiguous cortical areas of prefrontal and posterior association cortex. Cognits overlap and interconnect profusely, even across hierarchical levels (heterarchically), whereby a neuron can be part of many memory networks and thus many memories or items of knowledge.

www.mitpressjournals.org/...jocn.2009.21280

cortex cognitivescience memory brain via:cshalizi

10 Sep 09

Causal Inference in Statistics: An Overview (Pearl, 2009)

According to Shalizi, Pearl summarizes here everything he knows about causality.

ftp.cs.ucla.edu/...r350.pdf

Causality by:JudeaPearl via:cshalizi

10 Aug 09

Probabilistic Graphical Models - The MIT Press

D. Koller and N. Friedman's book. Amazingly, at US$95.00 it is still one of the more affordable books on this subject (compare its price and page count with Lauritzen's or Wainwright and Jordan's).

mitpress.mit.edu/...default.asp

GraphicalModels book via:cshalizi

14 Jul 09

[physics/9701026] Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification

"Gaussian processes are a natural way of defining prior distributions over functions of one or more input variables. In a simple nonparametric regression problem, where such a function gives the mean of a Gaussian distribution for an observed response, a Gaussian process model can easily be implemented using matrix computations that are feasible for datasets of up to about a thousand cases. Hyperparameters that define the covariance function of the Gaussian process can be sampled using Markov chain methods. Regression models where the noise has a t distribution and logistic or probit models for classification applications can be implemented by sampling as well for latent values underlying the observations. Software is now available that implements these methods using covariance functions with hierarchical parameterizations. Models defined in this way can discover high-level properties of the data, such as which inputs are relevant to predicting the response."

arxiv.org/9701026

via:cshalizi by:RadfordNeal GaussianProcesses MCMC Bayesian statistics

28 Jun 09

Rcode

R code for some books (nonlinear models, basic statistics, etc.)

www.commanster.eu/rcode.html

R book programming via:cshalizi

18 Jun 09

[0906.2885] Noisy Independent Factor Analysis Model for Density Estimation and Classification

"We consider the problem of multivariate density estimation when the unknown density is assumed to follow a particular form of dimensionality reduction, a noisy independent factor analysis (IFA) model. In this model the data are generated by a number of latent independent components having unknown distributions and are observed in Gaussian noise."

arxiv.org/0906.2885

FactorAnalysis via:cshalizi ICA densityEstimation classification MachineLearning

05 Jun 09

[0906.0858] Monte Carlo methods in statistical physics: Mathematical foundations and strategies

Monte Carlo is a versatile and frequently used tool in statistical physics and beyond. Correspondingly, the number of algorithms and variants reported in the literature is vast, and an overview is not easy to achieve. In this pedagogical review, we start by presenting the probabilistic concepts which are at the basis of the Monte Carlo method. From these concepts the relevant free parameters--which still may be adjusted--are identified. Having identified these parameters, most of the tangled mass of methods and algorithms in statistical physics Monte Carlo can be regarded as realizations of merely a handful of basic strategies which are employed in order to improve convergence of a Monte Carlo computation. Once the notations introduced are available, many of the most widely used Monte Carlo methods and algorithms can be formulated in a few lines. In such a formulation, the core ideas are exposed and possible generalizations of the methods are less obscured by the details of a particular algorithm.

arxiv.org/0906.0858

montecarlo via:cshalizi arxiv

26 May 09

Minimal sufficient causation and directed acyclic graphs

Notions of minimal sufficient causation are incorporated within the directed acyclic graph causal framework. Doing so allows for the graphical representation of sufficient causes and minimal sufficient causes on causal directed acyclic graphs while maintaining all of the properties of causal directed acyclic graphs. This in turn provides a clear theoretical link between two major conceptualizations of causality: one counterfactual-based and the other based on a more mechanistic understanding of causation. The theory developed can be used to draw conclusions about the sign of the conditional covariances among variables.

projecteuclid.org/DPubS

causality via:cshalizi

14 May 09

PLoS Computational Biology: Functional Brain Networks Develop from a “Local to Distributed” Organization

we combine resting state functional connectivity MRI (rs-fcMRI), graph analysis, community detection, and spring-embedding visualization techniques to analyze four separate networks defined in earlier studies. As we have previously reported, we find, across development, a trend toward ‘segregation’ (a general decrease in correlation strength) between regions close in anatomical space and ‘integration’ (an increased correlation strength) between selected regions distant in space. The generalization of these earlier trends across multiple networks suggests that this is a general developmental principle for changes in functional connectivity that would extend to large-scale graph theoretic analyses of large-scale brain networks.

www.ploscompbiol.org/...journal.pcbi.1000381

via:cshalizi brain graphtheory communitydetection

05 May 09

Editorial: publishing economics harm science's credibility - Ars Technica

"It would be nice to think that Elsevier will listen to scientists, but I suspect that this will not happen until scientists start getting a little more strident. If you are a scientist, publish your work in society journals rather than Elsevier journals. Try to avoid citing work published in Elsevier journals. Elsevier lives by a combination of pricing and impact factor, and scientists have direct control over only one of these—impact factor. Librarians could start looking at Elsevier journal usage patterns; perhaps they can follow Cornell's example, and subscribe to just a few Elsevier journals."

arstechnica.com/...-scientific-respectability.ars

elsevier publishing badscience via:cshalizi
