51 items | 6 visits
Lectures, documents, blogs, resources that will help me learn more about data analytics and big data.
Updated on Mar 17, 14
Created on Oct 15, 12
Category: Computers & Internet
URL:
CIML is a set of introductory materials that covers most major aspects of modern machine learning (supervised learning, unsupervised learning, large margin methods, probabilistic modeling, learning theory, etc.). Its focus is on broad applications with a rigorous backbone. A subset can be used for an undergraduate course; a graduate course could probably cover the entire material and then some.
The current version is 0.9 (the "beta" pre-release).
"I usually encourage my students to go through a step-by-step troubleshooting process when trying to fix misbehaving code, in which we go through these common errors one by one and see if they could be causing the problem. Today, I decided to finally write this troubleshooting process down and turn it into a flowchart in non-threatening colours."
The Split-Apply-Combine Strategy for Data Analysis (Hadley Wickham)
"Over the next few months, I plan to write a number of posts about the role that geometry plays in the analysis of large, high-dimensional data sets."
"Here are the required inputs for a few algorithms. (For an overview, see e.g. Ch 29 of MacKay.) There are many more out there of course. I'm leaving off tuning parameters."
"The bigvis package provides tools for exploratory data analysis of large datasets (10-100 million obs)."
"Tabula is free and available under the MIT open-source license. Tabula lets you upload a (text-based) PDF file into a simple web interface and magically pull tabular data into CSV format."
Iterative Methods for Optimization
"With Vega you can describe data visualizations in a JSON format, and generate interactive views using either HTML5 Canvas or SVG."