Carlos Santos's Library tagged mapreduce

Ivory: A Hadoop toolkit for web-scale information retrieval

"Ivory is a Hadoop toolkit for web-scale information retrieval research that features a retrieval engine based on Markov Random Fields, appropriately named SMRF (Searching with Markov Random Fields). This open-source project began in Spring 2009 and represents a collaboration between the University of Maryland and Yahoo! Research. Ivory takes full advantage of the Hadoop distributed environment (the MapReduce programming model and the underlying distributed file system) for both indexing and retrieval. The current release of Ivory (release 0.2) works with Hadoop 0.20.1 (and requires certain features only found in that release). Ivory also uses Cloud9, a MapReduce library for Hadoop developed at the University of Maryland (currently also at release 0.2)."

www.umiacs.umd.edu/...index.html

hadoop MapReduce InformationRetrieval

19 Oct 09

MapReduce Online! (and some gimmes) « Data Beta

"Introducing HOP: the Hadoop Online Prototype. With modest changes to the structure of Hadoop, we were able to convert it from a batch-processing system to an interactive, online system that can provide features like “early returns” from big jobs, and continuous data stream processing, while preserving the simple MapReduce programming and fault tolerance models popularized by Google and Hadoop. And by the way, it exposes pipeline parallelism that can even make batch jobs finish faster. This is a project led by Tyson Condie, in collaboration with folks at Berkeley and Yahoo! Research."

databeta.wordpress.com/...mapreduce-online

MapReduce hadoop streaming

10 Jul 09

Hadoop Studio

Hadoop Studio is a map-reduce development environment (IDE) based on Netbeans. It makes it easy to create, understand and debug map-reduce applications based on Hadoop, without requiring development-time access to a map-reduce cluster.

www.hadoopstudio.org

hadoop mapreduce tools development netbeans HadoopStudio IDE via:pskomoroch

01 Jun 09

Cloudera Hadoop & Big Data Blog » Blog Archive » Introducing Sqoop

Sqoop (“SQL-to-Hadoop”) is a straightforward command-line tool with the following capabilities:

* Imports individual tables or entire databases to files in HDFS
* Generates Java classes to allow you to interact with your imported data
* Provides the ability to import from SQL databases straight into your Hive data warehouse

www.cloudera.com/...introducing-sqoop

cloudera sqoop hadoop mapreduce

17 May 09

22S:295-HPC Home Page

High Performance Computing in Statistics: course notes; uses R running on a cluster

www.stat.uiowa.edu/...295-hpc

mapreduce R statistics via:pskomoroch course parallel distributed

Hadoop User Group UK: HUGUK #2 - Wrap up

Practical MapReduce - (Tom White, Cloudera) video, slides
Introducing Apache Mahout - (Isabel Drost, ASF) video, slides

huguk.org/...huguk-2-wrap-up.html

apache mapreduce hadoop cloudera towatch mahout

13 Apr 09

Hadoop Training: Virtual Machine | Cloudera

  • In order to make it easy for you to get started with Hadoop and complete our various training exercises, we have created a virtual machine with everything you need. The VM includes Cloudera's Distribution for Hadoop, all of our example code, as well as Eclipse and other standard tools.

02 Apr 09

Amazon Elastic MapReduce

  • Amazon Elastic MapReduce is a web service
  • It utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3)
