Carlos Santos's Library
[0909.0844] High-Dimensional Non-Linear Variable Selection through Hierarchical Kernel Learning
We consider the problem of high-dimensional non-linear variable selection for supervised learning. Our approach is based on performing linear selection among exponentially many appropriately defined positive definite kernels that characterize non-linear interactions between the original variables. To select efficiently from these many kernels, we use the natural hierarchical structure of the problem to extend the multiple kernel learning framework to kernels that can be embedded in a directed acyclic graph; we show that it is then possible to perform kernel selection through a graph-adapted sparsity-inducing norm, in polynomial time in the number of selected kernels. Moreover, we study the consistency of variable selection in high-dimensional settings, showing that under certain assumptions, our regularization framework allows a number of irrelevant variables which is exponential in the number of observations. Our simulations on synthetic datasets and datasets from the UCI repository show state-of-the-art predictive performance for non-linear regression problems.
MIT Press Journals - Journal of Cognitive Neuroscience - Abstract
Converging evidence from humans and nonhuman primates is obliging us to abandon conventional models in favor of a radically different, distributed-network paradigm of cortical memory. Central to the new paradigm is the concept of memory network or cognit—that is, a memory or an item of knowledge defined by a pattern of connections between neuron populations associated by experience. Cognits are hierarchically organized in terms of semantic abstraction and complexity. Complex cognits link neurons in noncontiguous cortical areas of prefrontal and posterior association cortex. Cognits overlap and interconnect profusely, even across hierarchical levels (heterarchically), whereby a neuron can be part of many memory networks and thus many memories or items of knowledge.
Causal Inference in Statistics: An Overview (Pearl, 2009)
According to Shalizi, Pearl summarizes here everything he knows about causality.
Probabilistic Graphical Models - The MIT Press
D. Koller and N. Friedman's book. Amazingly, at US$95.00 it is still one of the more affordable books on this subject (compare its price and page count with Lauritzen's or Wainwright and Jordan's).
[physics/9701026] Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification
"Gaussian processes are a natural way of defining prior distributions over functions of one or more input variables. In a simple nonparametric regression problem, where such a function gives the mean of a Gaussian distribution for an observed response, a Gaussian process model can easily be implemented using matrix computations that are feasible for datasets of up to about a thousand cases. Hyperparameters that define the covariance function of the Gaussian process can be sampled using Markov chain methods. Regression models where the noise has a t distribution and logistic or probit models for classification applications can be implemented by sampling as well for latent values underlying the observations. Software is now available that implements these methods using covariance functions with hierarchical parameterizations. Models defined in this way can discover high-level properties of the data, such as which inputs are relevant to predicting the response."
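The "matrix computations" mentioned in the abstract can be illustrated with a minimal sketch of Gaussian process regression with fixed hyperparameters. This is not the software the abstract refers to: the squared-exponential kernel, the function names, and the default hyperparameter values are all assumptions chosen for illustration, and the Markov chain sampling of hyperparameters is omitted.

```python
import math

def sq_exp_kernel(x1, x2, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two scalar inputs (an assumed kernel)."""
    return signal_var * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def solve_cholesky(low, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def gp_posterior_mean(train_x, train_y, test_x, noise_var=1e-2):
    """Posterior mean of a zero-mean GP regression model at the test inputs."""
    n = len(train_x)
    # Covariance of the noisy training responses: K + noise_var * I.
    cov = [[sq_exp_kernel(train_x[i], train_x[j]) + (noise_var if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    alpha = solve_cholesky(cholesky(cov), train_y)
    # Predictive mean is k(x*, X) @ (K + noise_var * I)^{-1} y.
    return [sum(sq_exp_kernel(xs, train_x[i]) * alpha[i] for i in range(n))
            for xs in test_x]

if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [math.sin(x) for x in xs]
    print(gp_posterior_mean(xs, ys, [0.5, 1.5, 2.5]))
```

The Cholesky factorization costs on the order of n^3 operations for n training cases, which is consistent with the abstract's remark that the approach is feasible up to about a thousand cases.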
[0906.2885] Noisy Independent Factor Analysis Model for Density Estimation and Classification
"We consider the problem of multivariate density estimation when the unknown density is assumed to follow a particular form of dimensionality reduction, a noisy independent factor analysis (IFA) model. In this model the data are generated by a number of latent independent components having unknown distributions and are observed in Gaussian noise."
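As a rough illustration of the generative model described in the abstract, the sketch below draws data from independent latent components, mixes them linearly, and adds Gaussian noise. The linear mixing matrix, the uniform latent distribution, and the function name `sample_noisy_ifa` are assumptions made for this example; the abstract itself leaves the component distributions unknown and does not state these details.

```python
import random

def sample_noisy_ifa(mixing, n_samples, noise_std, seed=0):
    """Draw samples x = A s + noise with independent non-Gaussian latents s.

    mixing is A, given as a list of rows (observed dimension x latent dimension).
    The latents are uniform on [-sqrt(3), sqrt(3)], which has unit variance;
    this choice is an assumption made only for illustration.
    """
    rng = random.Random(seed)
    n_obs = len(mixing)
    n_latent = len(mixing[0])
    half_width = 3 ** 0.5
    data = []
    for _ in range(n_samples):
        latent = [rng.uniform(-half_width, half_width) for _ in range(n_latent)]
        row = [
            sum(mixing[i][k] * latent[k] for k in range(n_latent))
            + rng.gauss(0.0, noise_std)  # isotropic Gaussian observation noise
            for i in range(n_obs)
        ]
        data.append(row)
    return data

if __name__ == "__main__":
    # Two latent components observed in three dimensions.
    a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    for row in sample_noisy_ifa(a, n_samples=3, noise_std=0.1):
        print(row)
```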
[0906.0858] Monte Carlo methods in statistical physics: Mathematical foundations and strategies
Monte Carlo is a versatile and frequently used tool in statistical physics and beyond. Correspondingly, the number of algorithms and variants reported in the literature is vast, and an overview is not easy to achieve. In this pedagogical review, we start by presenting the probabilistic concepts which are at the basis of the Monte Carlo method. From these concepts the relevant free parameters--which still may be adjusted--are identified. Having identified these parameters, most of the tangled mass of methods and algorithms in statistical physics Monte Carlo can be regarded as realizations of merely a handful of basic strategies which are employed in order to improve convergence of a Monte Carlo computation. Once the notations introduced are available, many of the most widely used Monte Carlo methods and algorithms can be formulated in a few lines. In such a formulation, the core ideas are exposed and possible generalizations of the methods are less obscured by the details of a particular algorithm.
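To give a sense of how a widely used Monte Carlo method can be "formulated in a few lines", here is a minimal sketch of a random-walk Metropolis sampler. It is a generic textbook version, not the notation or formulation of the review; the function name, the uniform proposal, and the standard normal target in the usage example are assumptions chosen for illustration.

```python
import math
import random

def metropolis(log_density, start, step_size, n_samples, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_density returns the log of the (possibly unnormalized) target density.
    Returns the list of samples and the fraction of accepted proposals.
    """
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        # Symmetric proposal: uniform step around the current state.
        proposal = x + rng.uniform(-step_size, step_size)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p_new / p_old).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples

if __name__ == "__main__":
    # Target: standard normal, specified up to a constant.
    draws, rate = metropolis(lambda x: -0.5 * x * x, start=0.0,
                             step_size=2.0, n_samples=50000)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(mean, var, rate)
```

The step size is one of the free parameters of the kind the abstract alludes to: steps that are too small or too large both slow convergence, through high autocorrelation or a low acceptance rate respectively.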
Minimal sufficient causation and directed acyclic graphs
Notions of minimal sufficient causation are incorporated within the directed acyclic graph causal framework. Doing so allows for the graphical representation of sufficient causes and minimal sufficient causes on causal directed acyclic graphs while maintaining all of the properties of causal directed acyclic graphs. This in turn provides a clear theoretical link between two major conceptualizations of causality: one counterfactual-based and the other based on a more mechanistic understanding of causation. The theory developed can be used to draw conclusions about the sign of the conditional covariances among variables.
PLoS Computational Biology: Functional Brain Networks Develop from a “Local to Distributed” Organization
We combine resting state functional connectivity MRI (rs-fcMRI), graph analysis, community detection, and spring-embedding visualization techniques to analyze four separate networks defined in earlier studies. As we have previously reported, we find, across development, a trend toward ‘segregation’ (a general decrease in correlation strength) between regions close in anatomical space and ‘integration’ (an increased correlation strength) between selected regions distant in space. The generalization of these earlier trends across multiple networks suggests that this is a general developmental principle for changes in functional connectivity that would extend to large-scale graph theoretic analyses of large-scale brain networks.
Editorial: publishing economics harm science's credibility - Ars Technica
"It would be nice to think that Elsevier will listen to scientists, but I suspect that this will not happen until scientists start getting a little more strident. If you are a scientist, publish your work in society journals rather than Elsevier journals. Try to avoid citing work published in Elsevier journals. Elsevier lives by a combination of pricing and impact factor, and scientists have direct control over only one of these—impact factor. Librarians could start looking at Elsevier journal usage patterns; perhaps they can follow Cornell's example, and subscribe to just a few Elsevier journals."
