VMware’s Serengeti Brings Hadoop to Virtual, Cloud Environments

Hadoop is a framework for reliably running applications on large hardware clusters. Many large enterprises (such as Facebook and IBM) have come to rely on it as a vital part of their respective data-crunching infrastructures. Research firm IDC recently predicted that worldwide revenues from Hadoop and MapReduce, another framework for processing problems across huge datasets, could hit $812.8 million in 2016, a significant uptick from $77 million in revenues last year.

via VMware’s Serengeti Brings Hadoop to Virtual, Cloud Environments.

VMware has positioned Serengeti as a “one click” deployment toolkit that, when used in conjunction with its vSphere platform, can deploy an enterprise-level Hadoop cluster in a matter of minutes. The company claims that vSphere’s virtualization capabilities will boost the “availability and manageability” of Hadoop clusters.

Project Serengeti

Serengeti is an open source project initiated by VMware to enable the rapid deployment of an Apache Hadoop cluster HDFS, MapReduce, Pig, Hive, .. on a virtual platform.

Serengeti 0.5 currently supports vSphere, with the ability to support other platforms. The project is at an early stage, and is endorsed by all major Hadoop distributions including Cloudera, Greenplum, Hortonworks and MapR.

via Project Serengeti.

Hadoop Tutorial

Welcome to the Yahoo! Hadoop tutorial! This series of tutorial documents will walk you through many aspects of the Apache Hadoop system. You will be shown how to set up simple and advanced cluster configurations, use the distributed file system, and develop complex Hadoop MapReduce applications. Other related systems are also reviewed.

via Hadoop Tutorial – YDN.

Yahoo Challenges Apple with a Cocktail of Mobile Publishing Tools

It turns out that Yahoo (NASDAQ: YHOO) has ambitious plans to help publishers get more efficient about how they push content out to mobile devices. Specifically, Yahoo wants to become the new middleman of the mobile publishing world, giving media companies software that they could use to reach users of iPhones, Android devices, Windows phones, and other gadgets without having to bow to the programming approaches favored by their powerful makers—namely Apple, Google, and Microsoft.

via Yahoo Challenges Apple with a Cocktail of Mobile Publishing Tools | Xconomy.

The first thing you need to understand about Yahoo’s publishing vision is that it’s coming from the Platform Technology Group. This is the same part of the company that created and then open-sourced key technologies that are now part of the Web’s infrastructure, such as Hadoop, which allows companies to run big, distributed software systems,

Welcome to Apache™ Hadoop™!

The Apache™ Hadoop™ project develops open-source software for reliable, scalable, distributed computing.

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-avaiability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-availabile service on top of a cluster of computers, each of which may be prone to failures.

via Welcome to Apache™ Hadoop™!.