
Kiran Kuppa's List: statistics

  • Mar 10, 11

    While there are a few existing online explanations of Bayes' Theorem, my experience with trying to introduce people to Bayesian reasoning is that the existing online explanations are too abstract.  Bayesian reasoning is very counterintuitive.  People do not employ Bayesian reasoning intuitively, find it very difficult to learn Bayesian reasoning when tutored, and rapidly forget Bayesian methods once the tutoring is over.  This holds equally true for novice students and highly trained professionals in a field.  Bayesian reasoning is apparently one of those things which, like quantum mechanics or the Wason Selection Test, is inherently difficult for humans to grasp with our built-in mental faculties.

    Or so they claim.  Here you will find an attempt to offer an intuitive explanation of Bayesian reasoning - an excruciatingly gentle introduction that invokes all the human ways of grasping numbers, from natural frequencies to spatial visualization.  The intent is to convey, not abstract rules for manipulating numbers, but what the numbers mean, and why the rules are what they are (and cannot possibly be anything else).  When you are finished reading this page, you will see Bayesian problems in your dreams.

  • Mar 10, 11

    One of the easiest ways to understand probabilities is to think of them in terms of Venn Diagrams. You basically have a Universe with all the possible outcomes (of an experiment for instance), and you are interested in some subset of them, namely some event. Say we are studying cancer, so we observe people and see whether they have cancer or not. If we take as our Universe all people participating in our study, then there are two possible outcomes for any particular individual, either he has cancer or not. We can then split our universe in two events: the event "people with cancer" (designated as A), and "people with no cancer" (or ~A). We could build a diagram like this:
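
    A minimal sketch of the same partition in Python, not taken from the linked article. The ten participants and their labels are invented purely for illustration; the point is only that A and ~A are disjoint, together cover the universe, and so have probabilities that sum to 1.

```python
from fractions import Fraction

# Hypothetical study data (an assumption for illustration):
# True means the participant has cancer, False means they do not.
participants = [True, False, False, False, True,
                False, False, False, False, False]

universe = set(range(len(participants)))             # everyone in the study
event_a = {i for i in universe if participants[i]}   # A: people with cancer
event_not_a = universe - event_a                     # ~A: people with no cancer

# A and ~A split the universe: no overlap, nothing left out.
assert event_a.isdisjoint(event_not_a)
assert event_a | event_not_a == universe

# With equally likely individuals, probability is relative size.
p_a = Fraction(len(event_a), len(universe))
p_not_a = Fraction(len(event_not_a), len(universe))

print("P(A)  =", p_a)
print("P(~A) =", p_not_a)
print("sum   =", p_a + p_not_a)
```

    Using exact fractions rather than floats keeps the "sums to 1" check free of rounding concerns.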

  • Nov 23, 11

    I'm a programmer with a decent background in math and computer science. I've studied computability, graph theory, linear algebra, abstract algebra, algorithms, and a little probability and statistics (through a few CS classes) at an undergraduate level.

    I feel, however, that I don't know enough about statistics. Statistics are increasingly useful in computing, with statistical natural language processing helping fuel some of Google's algorithms for search and machine translation, with performance analysis of hardware, software, and networks needing proper statistical grounding to be at all believable, and with fields like bioinformatics becoming more prevalent every day.

  • Nov 23, 11

    As a software engineer, I'm interested in topics such as statistical algorithms, data mining, machine learning, Bayesian networks, classification algorithms, neural networks, Markov chains, Monte Carlo methods, and random number generation.

    I personally haven't had the pleasure of working hands-on with any of these techniques, but I have had to work with software that, under the hood, employed them and would like to know more about them, at a high level. I'm looking for books that cover a great breadth - great depth is not necessary at this point. I think that I can learn a lot about software development if I can understand the mathematical foundations behind the algorithms and techniques that are employed.

    Can the Statistical Analysis community recommend books that I can use to learn more about implementing various statistical elements in software?

  • Dec 02, 11

    R is an elegant and comprehensive statistical and graphical programming language. Unfortunately, it can also have a steep learning curve. I created this website for both current R users, and experienced users of other statistical packages (e.g., SAS, SPSS, Stata) who would like to transition to R. My goal is to help you quickly access this language in your work.

    I assume that you are already familiar with the statistical methods covered and instead provide you with a roadmap and the code necessary to get started quickly, and orient yourself for future learning. I designed this web site to be an easily accessible reference.

  • Dec 25, 11

    The purpose of a measure of similarity is to compare two lists of numbers (i.e. vectors), and compute a single number which evaluates their similarity. Most measures were developed in the context of comparing pairs of variables (such as income or attitude toward abortion) across cases (such as respondents in a survey). In other words, the objective is to determine to what extent two variables co-vary, which is to say, have the same values for the same cases.
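
    As one concrete instance of such a measure, here is a short sketch of the Pearson correlation coefficient in plain Python. The choice of Pearson correlation, the function name `pearson`, and the sample values are assumptions made for illustration; the excerpt does not single out a particular measure.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers.

    Returns a single number in [-1, 1]: 1 when the variables rise
    together along a line, -1 when one falls as the other rises.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if len(x) < 2:
        raise ValueError("need at least two cases")

    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)

    # Sum of products of deviations (co-variation across cases).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical survey: one entry per respondent (case).
income = [20, 35, 50, 65, 80]
attitude = [1, 2, 3, 4, 5]

print(pearson(income, attitude))        # perfectly linear, increasing
print(pearson(income, attitude[::-1]))  # perfectly linear, decreasing
```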

  • Feb 10, 12

    The following lists of further reading are provided for each of the Core technical subjects. The exams for each subject will be based on the relevant syllabus and core reading, and the ActEd course material will be the main source of tuition for students.

  • Feb 21, 12

    Think Stats: Probability and Statistics for Programmers

    Allen B. Downey

  • Jun 16, 12

    This page is intended to assist students and professionals pursuing a career in the actuarial profession in preparing for the actuarial exams. The long term objective is to provide textbooks for most of the exams offered by both the Society of Actuaries and the Casualty Actuarial Society. All these books are free of charge and are available to the public.

  • Sep 02, 12

    This is a General Statistics Curriculum E-Book, which includes Advanced-Placement (AP) materials. This is an Internet-based probability and statistics E-Book. The materials, tools and demonstrations presented in this E-Book would be very useful for advanced-placement (AP) statistics educational curriculum. The E-Book is initially developed by the UCLA Statistics Online Computational Resource (SOCR). However, all statistics instructors, researchers and educators are encouraged to contribute to this project and improve the content of these learning materials.
    There are 4 novel features of this specific Statistics EBook. It is community-built, completely open-access (in terms of use and contributions), blends information technology, scientific techniques and modern pedagogical concepts, and is multilingual.

  • Sep 16, 12

    "Introduction to statistics. Will eventually cover all of the major topics in a first-year statistics course (not there yet!)"

  • Jan 30, 13

    These are a set of video lectures by Prof. Yaser S. Abu-Mostafa of Caltech on Statistical Learning Theory that accompany his book "Learning from Data". The topics covered in brief are:
    1. Bayesian Learning
    2. Bin Model
    3. Data Snooping
    4. Ensemble Learning
    5. Gradient Descent
    6. Learning Curves (Regression)
    7. Neural Networks
    8. Overfitting problem
    9. Radial basis functions and Regularization
    10. Support Vector Machines
    11. VC Dimension

  • Jan 13, 14

    "The articles on the left provide an introduction to R for people who are already familiar with other programming languages."

  • Mar 24, 14

    This is a draft textbook on data analysis methods, intended for a one-semester course for advanced undergraduate students who have already taken classes in probability, mathematical statistics, and linear regression.

    Contents

    I. Regression and Its Generalizations

    1. Regression Basics
    2. The Truth about Linear Regression
    3. Model Evaluation
    4. Smoothing in Regression
    5. Simulation
    6. The Bootstrap
    7. Weighting and Variance
    8. Splines
    9. Additive Models
    10. Testing Regression Specifications
    11. More about Hypothesis Testing
    12. Logistic Regression
    13. Generalized Linear Models and Generalized Additive Models

    II. Multivariate Data, Distribution Estimates, and Latent Structure

    14. Multivariate Distributions
    15. Density Estimation
    16. Relative Distributions and Smooth Tests
    17. Principal Components Analysis
    18. Factor Analysis
    19. Mixture Models
    20. Graphical Models

    III. Causal Inference

    21. Graphical Causal Models
    22. Identifying Causal Effects
    23. Estimating Causal Effects
    24. Discovering Causal Structure

    IV. Dependent Data

    25. Time Series
    26. Time Series with Latent Variables
    27. Longitudinal, Spatial and Network Data

  • Mar 28, 14

    CIML is a set of introductory materials that covers most major aspects of modern machine learning (supervised learning, unsupervised learning, large margin methods, probabilistic modeling, learning theory, etc.). Its focus is on broad applications with a rigorous backbone. A subset can be used for an undergraduate course; a graduate course could probably cover the entire material and then some.

  • Apr 05, 14

    Welcome to our new online textbook on forecasting. This book is a replacement for Makridakis, Wheelwright and Hyndman (Wiley 1998).

    This textbook is intended to provide a comprehensive introduction to forecasting methods and to present enough information about each method for readers to be able to use them sensibly. We don’t attempt to give a thorough discussion of the theoretical details behind each method, although the references at the end of each chapter will fill in many of those details. The book is written for three audiences: (1) people finding themselves doing forecasting in business when they may not have had any formal training in the area; (2) undergraduate students studying business; (3) MBA students doing a forecasting elective. We use it ourselves for a second-year subject for students undertaking a Bachelor of Commerce degree at Monash University, Australia.

  • Aug 15, 15

    "One of the first things a scientist hears about statistics is that there are two different approaches: frequentism and Bayesianism. Despite their importance, many scientific researchers never have the opportunity to learn the distinctions between them and the different practical approaches that result. The purpose of this post is to synthesize the philosophical and pragmatic aspects of the frequentist and Bayesian approaches, so that scientists like myself might be better prepared to understand the types of data analysis people do.

    I'll start by addressing the philosophical distinctions between the views, and from there move to discussion of how these ideas are applied in practice, with some Python code snippets demonstrating the difference between the approaches."
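
    To give a flavor of the contrast, here is a small Python sketch that is not taken from the linked post. It assumes a toy coin-flip problem with invented counts: the frequentist answer is the maximum-likelihood estimate, while the Bayesian answer is the mean of a Beta posterior under an assumed uniform Beta(1, 1) prior. The function names are hypothetical.

```python
from fractions import Fraction

def frequentist_estimate(heads, flips):
    """Maximum-likelihood estimate of P(heads): the observed frequency."""
    if flips <= 0:
        raise ValueError("need at least one flip")
    return Fraction(heads, flips)

def bayesian_posterior_mean(heads, flips, prior_a=1, prior_b=1):
    """Posterior mean of P(heads) under a Beta(prior_a, prior_b) prior.

    The Beta prior is conjugate to the binomial likelihood, so the
    posterior is Beta(prior_a + heads, prior_b + tails), whose mean is
    (prior_a + heads) / (prior_a + prior_b + flips).
    """
    return Fraction(prior_a + heads, prior_a + prior_b + flips)

# Hypothetical data: 7 heads in 10 flips.
heads, flips = 7, 10

mle = frequentist_estimate(heads, flips)
post_mean = bayesian_posterior_mean(heads, flips)

print("frequentist (MLE):        ", float(mle))
print("Bayesian (posterior mean):", float(post_mean))
```

    With only ten flips the prior pulls the Bayesian estimate slightly toward 1/2; as the number of flips grows, the two estimates converge.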
