In 1993/1994, at NASA’s Goddard Space Flight Center, Donald Becker and Thomas Sterling designed a Commodity Off The Shelf (COTS) supercomputer: Beowulf. Since they couldn’t afford a traditional supercomputer, they built a cluster made up of 16 Intel 486 DX4 processors connected by channel-bonded Ethernet. This Beowulf supercomputer was an instant success.
Source: Linux totally dominates supercomputers | ZDNet
Linux first appeared on the Top500 in 1998. Before Linux took the lead, Unix was supercomputing’s top operating system. By 2003, the Top500 was well on its way to Linux domination, and by 2004 Linux had taken the lead for good.
The problem of reliable distributed storage is arguably even more historically challenging than distributed consensus. Mistakes in the algorithms required to implement distributed storage correctly can have serious consequences. Data sets in distributed storage systems are often extremely large, and storage errors may propagate alarmingly while remaining difficult to detect. The burgeoning size of this data is also changing the way we create backups, archives, and other fail-safe measures to protect against data loss.
Source: Presenting Torus: A modern distributed storage system by CoreOS
These real-time applications, according to Donna Dillenberger, a distinguished engineer at IBM’s Watson lab, can be done in a mainframe environment. They are not yet possible on clusters of smaller, industry-standard computers, she said. But there are several open-source software projects, like Apache Spark, that focus on real-time data processing across large numbers of computers.
via IBM Introduces z13, a Mainframe for the Smartphone Economy – NYTimes.com.
He estimates the total cost of ownership including hardware, software and labor will be 50 percent less with a mainframe than on his “sprawling server farm,” given the growing complexity of managing hardware and software from several suppliers.
No matter how you slice it, the database market is massive and evolving. It’s also a market that has received a disproportionate share of VC investment, with VCs plowing funding into a long list of database-related market segments, including NoSQL, Hadoop, graph databases, open-source SQL, cloud-based databases, visualization, etc. But for all of that innovation, the process of setting up and running very large databases remains either expensive or complicated. Expensive because large databases still often require expensive hardware and/or licenses. Complicated because setting up a massive cluster of commodity machines to run a database requires a ton of administrative work and expertise that not a lot of people have. It’s this administrative complexity that Crate is out to eliminate – and that’s the real story behind the investment: the democratization of database cluster management. Crate’s real claim to fame is that it allows developers – any developer – to easily set up a massively scalable data store on commodity hardware with sub-second query latency, simply and within minutes.
via Democratizing the Datastore: Why we invested in Crate | Yankee Sabra Limey.
Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, MPI, Hypertable, Spark (a new framework for low-latency interactive and iterative jobs), and other applications. Mesos is open source in the Apache Incubator.
via Apache Mesos: Dynamic Resource Sharing for Clusters.
Mesos is being used to manage clusters at Twitter, Airbnb, Conviva, UC Berkeley, and UC San Francisco.
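The core of Mesos’s sharing model is two-level scheduling: the master offers free resources to frameworks, and each framework decides which offers to accept. Here is a minimal toy sketch of that idea (not Mesos code; the class and method names are invented for illustration):

```python
# Toy sketch of two-level scheduling: the master offers its free CPUs
# to a framework, and the framework decides how many to accept.

class Framework:
    def __init__(self, name, demand):
        self.name = name
        self.demand = demand  # CPUs this framework still wants

    def resource_offer(self, cpus):
        """Accept as much of the offer as this framework needs."""
        take = min(cpus, self.demand)
        self.demand -= take
        return take

class Master:
    def __init__(self, cpus):
        self.free_cpus = cpus

    def offer(self, framework):
        """Offer all free CPUs; keep whatever the framework declines."""
        accepted = framework.resource_offer(self.free_cpus)
        self.free_cpus -= accepted
        return accepted

master = Master(cpus=16)
hadoop = Framework("Hadoop", demand=10)
spark = Framework("Spark", demand=10)

print(master.offer(hadoop))  # Hadoop accepts 10 of the 16 offered CPUs
print(master.offer(spark))   # Spark accepts the remaining 6
print(master.free_cpus)      # 0
```

The point of putting the accept/decline decision in the framework is that the master never needs to understand each framework’s scheduling logic, which is what lets Hadoop, MPI, and Spark share one cluster.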
At first sight, the relatively low performance per core of ARM CPUs seems like a bad match for servers. The dominant CPU in the server market is without doubt Intel’s Xeon. The success of the Xeon family is largely rooted in its excellent single-threaded (or per core) performance at moderate power levels (70-95W). Combine this exceptional single-threaded performance with a decent core count and you get good performance in almost any kind of application. Economies of scale and the resulting price levels are also very important, but the server market has been more than willing to pay a little extra if the response times are lower and the energy bills moderate.
via AnandTech | Calxeda’s ARM server tested.
As usual, AnandTech delivers a thorough review. Below is another interesting architectural tidbit.
Let’s start with a familiar block on the SoC (black): the external I/O controller. The chip has a SATA 2.0 controller capable of 3Gb/s, a General Purpose Media Controller (GPMC) providing SD and eMMC access, a PCIe controller, and an Ethernet controller providing up to 10Gbit speeds. PCIe connectivity cannot be used in this system, but Calxeda can make custom designs of the “motherboard” to let customers attach PCIe cards if requested.
A single misclick.
No, really: A Titan pilot beneath the Cluster banner was attempting a “bridge”—using a ship to act as an artificial warp corridor for other ships—to Asakai VI when he accidentally warped himself straight into a very surprised Pandemic Legion fleet. The pilot, named Dabigredboat, immediately came under heavy attack as the Legion pounced on the extremely valuable ship.
via EVE Online’s Battle of Asakai: who was involved, the stakes, and the aftermath | News | PC Gamer.
I sometimes find the drama in these MMORPGs fascinating, and EVE Online usually has the best stories. The cynical side of me suspects this might have been staged as a marketing promotion. I hear nothing but good things about EVE Online, however.
Hadoop Corona is the next version of Map-Reduce. The current Map-Reduce has a single Job Tracker that reached its limits at Facebook. The Job Tracker manages the cluster resources and tracks the state of each job. In Hadoop Corona, the cluster resources are tracked by a central Cluster Manager. Each job gets its own Corona Job Tracker, which tracks just that one job. The design provides several key improvements.
via hadoop-20/src/contrib/corona at master · facebook/hadoop-20 · GitHub.
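The split described above can be sketched in a few lines: the cluster manager tracks only resources, while per-job state lives in one tracker per job rather than in a single global Job Tracker. This is an illustrative sketch only, with invented names, not Corona’s actual API:

```python
# Toy sketch of the Corona split: a resource-only ClusterManager,
# plus one lightweight tracker per job holding that job's state.

class ClusterManager:
    """Tracks cluster resources only -- no per-job state."""
    def __init__(self, slots):
        self.free_slots = slots

    def grant(self, wanted):
        granted = min(wanted, self.free_slots)
        self.free_slots -= granted
        return granted

class CoronaJobTracker:
    """One tracker per job; tracks just that one job's tasks."""
    def __init__(self, job_id, tasks):
        self.job_id = job_id
        self.pending = tasks  # tasks still waiting for a slot

    def schedule(self, cluster_manager):
        got = cluster_manager.grant(self.pending)
        self.pending -= got
        return got

cm = ClusterManager(slots=8)
jobs = [CoronaJobTracker("job-1", tasks=5), CoronaJobTracker("job-2", tasks=5)]
for job in jobs:
    print(job.job_id, "scheduled", job.schedule(cm), "tasks")
# job-1 is granted 5 slots; job-2 gets the remaining 3
```

Because each job’s bookkeeping is isolated in its own tracker, no single process has to hold the state of every job in the cluster, which is exactly the bottleneck the single Job Tracker hit at Facebook’s scale.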
Cole revealed that Twitter’s MySQL database handles some huge numbers — three million new rows per day, the storage of 400 million tweets per day replicated four times over — but it is managed by a team of only six full-time administrators and a sole MySQL developer.
via Twitter, PayPal reveal database performance – Software – Technology – News – iTnews.com.au.
Daniel Austin, a technology architect at Paypal, has built a globally-distributed database with 100 terabytes of user-related data, also based on a MySQL cluster.
Austin said he was charged with building a system with 99.999 percent availability, without any loss of data, an ability to support transactions (and roll them back), and an ability to write data to the database and read it anywhere else in the world in under one second.
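To put that 99.999 percent (“five nines”) target in perspective, a quick bit of arithmetic shows how little downtime each availability level actually permits per year:

```python
# Downtime budget per year for common availability targets.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

for availability in (0.999, 0.9999, 0.99999):
    downtime = MINUTES_PER_YEAR * (1 - availability)
    print(f"{availability:.3%} -> {downtime:.1f} minutes of downtime per year")
```

Five nines leaves a budget of roughly 5.3 minutes of downtime per year, which is why the requirement pushes designs toward replication across regions rather than a single hardened site.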
Configuring and Managing a Red Hat Cluster describes the configuration and management of Red Hat cluster systems for Red Hat Enterprise Linux 5. It does not include information about Red Hat Linux Virtual Servers (LVS). Information about installing and configuring LVS is in a separate document.
via Configuring and Managing a Red Hat Cluster.