Using Hadoop


Ever wonder how Google and Yahoo! can process millions of queries a day and get results back to you, usually in less than a second? One tool behind this kind of scale is called "Hadoop". Google runs its own MapReduce framework, and Hadoop is an open-source implementation of the same ideas that Yahoo! uses heavily. Hadoop is a software tool that runs on small to very large compute clusters (>2000 nodes). It takes an application and divides it into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. Each fragment of data is processed separately (the "map" step) and the partial results are then brought back together (the "reduce" step), which is the Map/Reduce methodology. Hadoop goes a step further by providing its own distributed file system, which stores each application's data across the nodes of the cluster and allows for even greater aggregate bandwidth.
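To make the Map/Reduce idea a little more concrete, here is a minimal single-machine sketch in plain Python. It does not use Hadoop's API at all. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and word counting is just an assumed example workload. On a real cluster, each fragment would be mapped on a different node.

```python
from collections import defaultdict


def map_phase(fragment):
    """Emit a (word, 1) pair for every word in one fragment of input."""
    return [(word.lower(), 1) for word in fragment.split()]


def shuffle(pairs):
    """Group values by key, standing in for what the framework does
    between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single result."""
    return key, sum(values)


def word_count(fragments):
    """Run map over every fragment, then reduce the grouped output."""
    pairs = []
    for fragment in fragments:  # on a cluster, each fragment could run on its own node
        pairs.extend(map_phase(fragment))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the quick fox", "the lazy dog", "The end"]))
```

Because each call to `map_phase` only looks at its own fragment, the fragments can be processed in any order and on any machine, which is what makes the approach easy to spread over a cluster.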

This system allows for very high bandwidth and helps mitigate failures. If a node within the cluster fails, the work it was running is simply submitted again to run on another node within the cluster. Compute nodes can also be distributed among campuses, which allows a large pool of compute resources to work together.
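The resubmit-on-failure behaviour can be sketched in a few lines of Python. This is only a toy model under stated assumptions: a "node" is represented as a plain function, a failure is a `RuntimeError`, and the names `run_with_retry`, `healthy_node` and `failed_node` are hypothetical. It is not how Hadoop's scheduler is actually written.

```python
def run_with_retry(task, nodes):
    """Try the task on each node in turn until one of them succeeds.

    Returns the result together with the list of failure messages seen
    along the way. Raises RuntimeError if every node fails.
    """
    failures = []
    for node in nodes:
        try:
            return node(task), failures
        except RuntimeError as exc:
            # The node failed, so remember why and move on to the next one.
            failures.append(str(exc))
    raise RuntimeError(f"task failed on all nodes: {failures}")


def healthy_node(task):
    """Hypothetical node that completes its work (here: summing numbers)."""
    return sum(task)


def failed_node(task):
    """Hypothetical node that is down."""
    raise RuntimeError("node down")


if __name__ == "__main__":
    result, failures = run_with_retry([1, 2, 3], [failed_node, healthy_node])
    print(result, failures)
```

This only works because each fragment of work is independent, so running it a second time on a different node gives the same answer.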

I can see this tool being used for more than just search. It could be applied to other applications that need high-performance computing, such as bioinformatics and cheminformatics, oil discovery services, and financial applications.

Hadoop would probably fall into the Grid computing category. See this post on the advantages of using a Grid model.

It is free and available from Apache.org.

Vassilios
Co-Founder
OuterVillage.com
http://www.outervillage.com
