Map-Reduce for Machine Learning on Multicore

 

 

We are at the beginning of the multicore era. Computers will have increasingly many cores (processors), but there is still no good programming framework for these architectures, and thus no simple and unified way for machine learning to take advantage of the potential speedup. In this paper, we develop a broadly applicable parallel programming method, one that is easily applied to many different learning algorithms. Our work is in distinct contrast to the tradition in machine learning of designing (often ingenious) ways to speed up a single algorithm at a time. Specifically, we show that algorithms that fit the Statistical Query model [15] can be written in a certain “summation form,” which allows them to be easily parallelized on multicore computers. We adapt Google’s map-reduce [7] paradigm to demonstrate this parallel speedup technique on a variety of learning algorithms, including locally weighted linear regression (LWLR), k-means, logistic regression (LR), naive Bayes (NB), SVM, ICA, PCA, Gaussian discriminant analysis (GDA), EM, and backpropagation (NN). Our experimental results show basically linear speedup with an increasing number of processors. [via]
http://www.cs.stanford.edu/people/ang//papers...
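
The "summation form" idea in the abstract can be illustrated with a minimal sketch. It assumes a simple one-variable least-squares fit, which is not one of the paper's listed experiments but has the same shape: the fit depends on the data only through sums over the examples, so each worker can sum its own chunk (map) and the partial sums can then be added together (reduce). The function names `map_partial_sums`, `reduce_sums`, and `fit_line` are hypothetical, and the thread pool is only a stand-in for the multicore engine the abstract describes.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_partial_sums(chunk):
    """Mapper (hypothetical name): sums (n, Σx, Σy, Σx², Σxy) over one chunk."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return (n, sx, sy, sxx, sxy)


def reduce_sums(a, b):
    """Reducer (hypothetical name): element-wise addition of two partial sums."""
    return tuple(p + q for p, q in zip(a, b))


def fit_line(data, num_workers=4):
    """Fit y = slope * x + intercept from per-chunk sums.

    Assumes `data` is a non-empty list of (x, y) pairs with at least
    two distinct x values.
    """
    # Split the data into one chunk per worker, dropping empty chunks.
    chunks = [data[i::num_workers] for i in range(num_workers)]
    chunks = [c for c in chunks if c]

    # Map step: each worker computes the sums for its own chunk.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(map_partial_sums, chunks))

    # Reduce step: add the partial sums to get the totals over all data.
    n, sx, sy, sxx, sxy = reduce(reduce_sums, partials)

    # Closed-form least-squares solution expressed in terms of the totals.
    denom = n * sxx - sx * sx
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Synthetic data on the line y = 2x + 1 (illustrative only).
data = [(float(x), 2.0 * x + 1.0) for x in range(100)]
slope, intercept = fit_line(data)
```

In CPython, threads do not run this arithmetic on several cores at once, so the sketch shows the structure rather than the speedup. Swapping in `ProcessPoolExecutor` from the same module is one way to spread the map step across cores, at the cost of sending each chunk to a separate process.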


 

 

 

 

