VMware’s Serengeti Brings Hadoop to Virtual, Cloud Environments

Hadoop is a framework for reliably running applications on large hardware clusters. Many large enterprises (such as Facebook and IBM) have come to rely on it as a vital part of their respective data-crunching infrastructures. Research firm IDC recently predicted that worldwide revenues from Hadoop and MapReduce, another framework for processing problems across huge datasets, could hit $812.8 million in 2016, a significant uptick from $77 million in revenues last year.

via VMware’s Serengeti Brings Hadoop to Virtual, Cloud Environments.

VMware has positioned Serengeti as a “one click” deployment toolkit that, when used in conjunction with its vSphere platform, can deploy an enterprise-level Hadoop cluster in a matter of minutes. The company claims that vSphere’s virtualization capabilities will boost the “availability and manageability” of Hadoop clusters.

Project Serengeti

Serengeti is an open source project initiated by VMware to enable the rapid deployment of an Apache Hadoop cluster (HDFS, MapReduce, Pig, Hive, ...) on a virtual platform.

Serengeti 0.5 currently supports vSphere, with the ability to support other platforms. The project is at an early stage, and is endorsed by all major Hadoop distributions including Cloudera, Greenplum, Hortonworks and MapR.

via Project Serengeti.

OpenStack Storage

OpenStack Object Storage (code-named Swift) is open source software for creating redundant, scalable object storage using clusters of standardized servers to store petabytes of accessible data. It is not a file system or real-time data storage system, but rather a long-term storage system for a more permanent type of static data that can be retrieved, leveraged, and then updated if necessary. Primary examples of data that best fit this type of storage model are virtual machine images, photo storage, email storage and backup archiving. Having no central “brain” or master point of control provides greater scalability, redundancy and permanence.

via OpenStack Storage » OpenStack Open Source Cloud Computing Software.

Coolest jobs in tech (literally): running a South Pole data center

That mission demands a level of reliability that many less remote data centers cannot provide. Raytheon Polar Services held the National Science Foundation’s Antarctic programs support contract until April. As Dennis Gitt, a former director of IT and communications services for the company, puts it, a failure anywhere in the Antarctic systems could lose data from events in space that may not be seen again for millennia.

via Coolest jobs in tech (literally): running a South Pole data center.

With a maximum population of 150 at the base during the Austral summer, South Pole IT professionals-in-residence are limited to a select few. And they don’t get to stay long—most of the WIPAC IT team only stays for a few months in the summer, during which they have to complete all planned IT infrastructure projects.

IBM Parallel Sysplex

In computing, a Parallel Sysplex is a cluster of IBM mainframes acting together as a single system image with z/OS. Used for disaster recovery, Parallel Sysplex combines data sharing and parallel computing to allow a cluster of up to 32 systems to share a workload for high performance and high availability.

via IBM Parallel Sysplex – Wikipedia, the free encyclopedia.

Managed DNS Advanced Feature: Active Failover

Datacenter and/or server failures are no fun for anyone, especially those responsible for website operations. If you’ve protected yourself by using Active Failover — an advanced feature available for DynECT Managed DNS users — your site will remain live and accessible without any of your visitors knowing the difference.

via Managed DNS Advanced Feature: Active Failover – Dyn.
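The excerpt does not say how Active Failover decides what to serve, so the following is only a rough sketch of the general idea behind DNS failover: answer with the primary address while it passes a health check, otherwise fall back to the next healthy address. The function name `choose_answer`, the `is_healthy` callback, and the example addresses are all hypothetical and are not part of any Dyn API.

```python
from typing import Callable, Optional, Sequence


def choose_answer(
    primary: str,
    backups: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy address in priority order, or None.

    Assumption: health is decided elsewhere (for example by an HTTP or
    ping probe); this function only applies the failover policy.
    """
    for address in [primary, *backups]:
        if is_healthy(address):
            return address
    # Nothing is healthy, so there is no safe address to hand out.
    return None


if __name__ == "__main__":
    # Pretend the primary data center is down (documentation-only addresses).
    down = {"192.0.2.10"}
    answer = choose_answer(
        "192.0.2.10", ["198.51.100.20"], lambda addr: addr not in down
    )
    print(answer)
```

In a real deployment the DNS record's TTL also matters, since resolvers keep serving a cached answer until it expires.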

Corosync

http://corosync.org/doku.php

The Corosync Cluster Engine is a Group Communication System with additional features for implementing high availability within applications. The project provides four C Application Programming Interface features:

  • A closed process group communication model with virtual synchrony guarantees for creating replicated state machines.
  • A simple availability manager that restarts the application process when it has failed.
  • A configuration and statistics in-memory database that provides the ability to set, retrieve, and receive change notifications of information.
  • A quorum system that notifies applications when quorum is achieved or lost.

Our project is used as a High Availability framework by projects such as Apache Qpid and Pacemaker.

We are always looking for developers or users interested in clustering or participating in our project.

The project is hosted by Fedora Hosted and The Linux Foundation.
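To make the quorum feature in the list above a little more concrete, here is a minimal sketch of a majority-based quorum tracker that notifies a callback when quorum is gained or lost. This is not Corosync's C API. The names `has_quorum` and `QuorumTracker` are hypothetical, and it assumes one vote per node with a strict-majority rule.

```python
from typing import Callable, Iterable, Set


def has_quorum(votes_present: int, expected_votes: int) -> bool:
    """True when strictly more than half of the expected votes are present."""
    return votes_present > expected_votes // 2


class QuorumTracker:
    """Tracks membership and reports quorum transitions to a callback."""

    def __init__(self, expected_votes: int, on_change: Callable[[bool], None]):
        self.expected_votes = expected_votes
        self.on_change = on_change
        self.members: Set[str] = set()
        self.quorate = False  # assume no quorum until members are seen

    def update_membership(self, members: Iterable[str]) -> None:
        """Record the current member set and notify only on a state change."""
        self.members = set(members)
        now_quorate = has_quorum(len(self.members), self.expected_votes)
        if now_quorate != self.quorate:
            self.quorate = now_quorate
            self.on_change(now_quorate)


if __name__ == "__main__":
    tracker = QuorumTracker(5, lambda ok: print("quorate" if ok else "quorum lost"))
    tracker.update_membership({"a", "b", "c"})  # 3 of 5 votes: quorum gained
    tracker.update_membership({"a"})            # 1 of 5 votes: quorum lost
```

The strict-majority rule means that in a cluster split into two halves, at most one side can be quorate, which is what lets an application decide safely whether to keep running.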

Fencing and Stonith

Fencing is a very important concept in computer clusters for HA (High Availability). Unfortunately, given that fencing does not offer a visible service to users, it is often neglected.

Fencing may be defined as a method to bring an HA cluster to a known state. But, what is a “cluster state” after all? To answer that question we have to see what is in the cluster.

via Fencing and Stonith.
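One way to picture "bringing the cluster to a known state" is the sketch below: a node whose heartbeat has gone quiet is in an unknown state, so it is powered off before anyone takes over its resources. This is a simplified illustration, not the Linux-HA implementation. The function names, the heartbeat table, and the `power_off` callback are all hypothetical.

```python
from typing import Callable, Dict, List


def nodes_to_fence(
    last_heartbeat: Dict[str, float], now: float, timeout: float
) -> List[str]:
    """Return nodes whose last heartbeat is older than the timeout."""
    return sorted(
        node for node, seen in last_heartbeat.items() if now - seen > timeout
    )


def recover(
    last_heartbeat: Dict[str, float],
    now: float,
    timeout: float,
    power_off: Callable[[str], bool],
) -> List[str]:
    """Fence silent nodes and return those confirmed to be off.

    Only nodes for which power_off reports success are returned, because
    resources should not be taken over from a node in an unknown state.
    """
    confirmed = []
    for node in nodes_to_fence(last_heartbeat, now, timeout):
        if power_off(node):
            confirmed.append(node)
    return confirmed


if __name__ == "__main__":
    heartbeats = {"node1": 100.0, "node2": 91.0}
    # Stand-in for a real power switch or management-board call.
    fenced = recover(heartbeats, now=101.0, timeout=5.0, power_off=lambda n: True)
    print(fenced)
```

The key design point is the ordering: fence first, confirm, and only then move resources, so that two nodes never write to the same shared data at once.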

STONITH (Shoot The Other Node In The Head)

Stonith is our fencing implementation. It provides node-level fencing.

Gotta love how they come up with those acronyms.  🙂