On Thursday a rocket failed. Three humans remain on the ISS. What’s next?

NASA’s strong preference is to keep astronauts aboard the station. But Todd said NASA does have procedures for operating the station without crew on board. “That’s something that we’re always prepared for,” he said. “I feel very confident that we could fly for a significant period of time.”

Source: On Thursday a rocket failed. Three humans remain on the ISS. What’s next? | Ars Technica

Kepler Spacecraft in Emergency Mode

The last regular contact with the spacecraft was on April 4. The spacecraft was in good health and operating as expected.

Kepler completed its prime mission in 2012, detecting nearly 5,000 exoplanet candidates, of which more than 1,000 have been confirmed. In 2014 the Kepler spacecraft began a new mission called K2. In this extended mission, K2 continues the search for exoplanets while introducing new research opportunities to study young stars, supernovae, and many other astronomical objects.

Source: Mission Manager Update: Kepler Spacecraft in Emergency Mode | NASA

Also From: Kepler Reaction Wheel Failure Cripples Spacecraft, but Mission Thrives

To save on bandwidth, Kepler only downlinks data from the pixels associated with 156,000 target stars out of the millions of stars in the Kepler field.  Data from an “aperture” of pixels around each target star are downlinked to Earth, and computer programs on Earth measure the brightness of the star based on the light that hit the pixels in the aperture.  If the telescope pointing is not good enough to keep the target stars in their respective apertures on the pixels, it is impossible to measure the brightness of those stars with a precision of 20 parts per million.
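
A toy sketch of the aperture idea, to show why pointing matters. The 5x5 pixel grid, the star that spreads its light evenly over a 2x2 block, and the count of 1000 are all made-up illustration values, not Kepler data, and the function names are hypothetical.

```python
# Toy aperture photometry: sum the counts of the pixels inside a fixed aperture.
# All numbers are invented for illustration; they are not Kepler measurements.

def aperture_flux(image, aperture):
    """Sum the pixel values at the (row, col) positions listed in `aperture`."""
    return sum(image[r][c] for r, c in aperture)

def place_star(rows, cols, corner, counts=1000):
    """Build an image with a star whose light is split evenly over a 2x2 block
    whose top-left pixel is `corner`."""
    image = [[0] * cols for _ in range(rows)]
    r0, c0 = corner
    for r in (r0, r0 + 1):
        for c in (c0, c0 + 1):
            image[r][c] = counts // 4
    return image

# Aperture chosen in advance: only these pixels would be downlinked.
APERTURE = [(1, 1), (1, 2), (2, 1), (2, 2)]

on_target = place_star(5, 5, corner=(1, 1))  # star sits inside the aperture
drifted = place_star(5, 5, corner=(1, 2))    # pointing drifted by one pixel

flux_on = aperture_flux(on_target, APERTURE)
flux_drifted = aperture_flux(drifted, APERTURE)
lost_fraction = 1 - flux_drifted / flux_on   # share of the light that fell outside
```

In this toy case a one-pixel drift loses half the light, which is enormous next to the 20 parts per million (0.00002) precision quoted above.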

Update From:  Kepler telescope readies for new mission after communications scare

Once the spacecraft checks out, Kepler will kick off its latest effort, looking toward the galactic center for planets whose gravity distorts the light from far more distant stars. This technique, known as gravitational microlensing, has been used with ground-based telescopes to discover about 46 planets, some of them orphaned from their parent stars. But the method is a first for Kepler, which searches for dips in starlight caused by planets crossing in front of their suns.

A400M probe focuses on impact of accidental data wipe

Computers operating each engine cannot work if this data, which is unique to each of the turboprops, is missing.

Source: Exclusive: A400M probe focuses on impact of accidental data wipe | Reuters

Under the A400M’s design, the first warning pilots would receive of the engine data problem would be when the plane was 400 feet (120 meters) in the air, according to a safety document seen by Reuters. On the ground, there is no cockpit alert.

Sounds like these data files became a single point of failure.

How a dumb software glitch kept thousands from reaching 911

At first, Intrado thought that the complaints arising from various PSAPs around the country were just isolated, unconnected events, even though alarm bells were going off an hour into the breakdown. Nobody noticed the warnings until it was too late; the server taking note of the alerts categorized them as “low level” incidents, so they were never flagged for a human, according to the FCC report.

via How a dumb software glitch kept thousands from reaching 911 – The Washington Post.

PSAP = Public Safety Answering Point (or, on a night like this, Poor Sucker At Phone)
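
One common guard against this kind of miss is to escalate when many low-level alerts pile up in a short window. The sketch below is only an illustration of that idea; the class name, threshold, and window are hypothetical and nothing here comes from Intrado's system or the FCC report.

```python
from collections import deque

class AlertEscalator:
    """Flag alerts for a human when low-level ones cluster in a short window."""

    def __init__(self, threshold=5, window_seconds=600):
        self.threshold = threshold            # hypothetical tuning value
        self.window_seconds = window_seconds  # hypothetical tuning value
        self._recent = deque()                # timestamps of recent low-level alerts

    def record(self, timestamp, severity):
        """Return True if this alert should be flagged for a human."""
        if severity != "low":
            return True  # anything above low level is always flagged
        self._recent.append(timestamp)
        # Drop alerts that have aged out of the window.
        while self._recent and timestamp - self._recent[0] > self.window_seconds:
            self._recent.popleft()
        return len(self._recent) >= self.threshold

escalator = AlertEscalator(threshold=3, window_seconds=60)
decisions = [escalator.record(t, "low") for t in (0, 10, 20, 200)]
# decisions == [False, False, True, False]
```

The third alert inside one minute trips the escalation; the straggler at 200 seconds does not, because the earlier alerts have aged out.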

Creating a Centralized Syslog Server

For this article, I’ll be focusing on syslog-ng as this is more up to date, and if the reader wishes, can be ‘supported’ via the company that owns the syslog-ng software by going with their enterprise edition version at a later date.

via Creating a Centralized Syslog Server | Linux Journal.

This is a good tutorial for getting started with syslog-ng. Monitoring the events logged to syslog can provide ample warning when a server is about to die.
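
For the client side, here is a minimal sketch of an application forwarding its log records to a central syslog server, using `SysLogHandler` from Python's standard library. The logger name and message are hypothetical, and a local UDP socket stands in for the server so the example runs on its own; a real setup would point at the syslog-ng host instead.

```python
import logging
import logging.handlers
import socket

def make_remote_logger(host, port, name="demo-app"):
    """Return a logger that forwards records to a syslog server over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger

# Stand-in for the central server: a local UDP socket on an ephemeral port.
# A real deployment would use the syslog-ng host (UDP port 514 by convention).
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.bind(("127.0.0.1", 0))
listener.settimeout(2.0)
host, port = listener.getsockname()

log = make_remote_logger(host, port)
log.warning("disk /dev/sda1 at 95 percent")

# The datagram that a central server would have received.
received, _ = listener.recvfrom(4096)
listener.close()
```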

Cloud Providers Work To Disperse Points Of Failure

In the end, cloud providers — many of which aim for 99.9 percent uptime, or “three nines” — are likely to offer individual companies a more reliable service than those companies attain for themselves, the CSA’s Howie says.

via Cloud Providers Work To Disperse Points Of Failure – Dark Reading.

Note that telecom typically targets five nines (99.999 percent) uptime. The point of this article may be that end users need to implement their own backup plans to get better than three nines of reliability.
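
To put numbers on the nines, here is a quick sketch. The downtime figures are plain arithmetic; the combined figure assumes two redundant services that fail independently, which is an optimistic assumption that real deployments rarely meet in full.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years

def downtime_minutes_per_year(availability_percent):
    """Minutes of downtime per year allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)

def combined_availability(*availabilities_percent):
    """Availability of redundant services, ASSUMING independent failures:
    the combination is down only when every service is down at once."""
    unavailable = 1.0
    for a in availabilities_percent:
        unavailable *= 1 - a / 100
    return 100 * (1 - unavailable)

three_nines = downtime_minutes_per_year(99.9)     # about 525.6 minutes (8.76 hours)
five_nines = downtime_minutes_per_year(99.999)    # about 5.26 minutes
two_providers = combined_availability(99.9, 99.9) # about 99.9999 percent
```

So three nines allows almost nine hours of outage a year, while five nines allows a little over five minutes. Two independent three-nines services used as backups for each other would, on paper, land near six nines.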

Mars Rover Curiosity in Safe Mode After Computer Glitch

The issue cropped up Wednesday (Feb. 27), when the spacecraft failed to send its recorded data back to Earth and did not switch into its daily sleep mode as planned. After looking into the issue, engineers decided to switch the Curiosity rover from its primary “A-side” computer to its “B-side” backup on Thursday at 5:30 p.m. EST (22:30 GMT).

via Mars Rover Curiosity in Safe Mode After Computer Glitch | Space.com.

Netflix Gives Data Center Tools to Fail

Netflix has released Hystrix, a library designed for managing interactions between distributed systems, complete with “fallback” options for when those systems inevitably fail.

The code for Hystrix, which Netflix tested on its own systems, can be downloaded from GitHub, along with documentation, a getting-started guide, and operations examples, among others.

via Netflix Gives Data Center Tools to Fail.

Netflix will also release the real-time dashboard it uses for monitoring Hystrix. That dashboard relies on a traffic-light system to display service dependencies for the last ten seconds, with colors measuring latency and the size of the circles showing traffic.
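
The “fallback” idea can be sketched in a few lines. This is not Hystrix's API (Hystrix is a Java library); it is a simplified circuit breaker with hypothetical names. After a set number of consecutive failures it stops calling the failing service and goes straight to the fallback. A production breaker would also retry the primary after a cool-down, which is left out here.

```python
class CircuitBreaker:
    """Simplified circuit breaker: after repeated failures, skip the primary."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.failures = 0  # consecutive failures seen so far

    @property
    def is_open(self):
        """True once the breaker has tripped and the primary is being skipped."""
        return self.failures >= self.failure_threshold

    def call(self, primary, fallback):
        if self.is_open:
            return fallback()      # do not even try the failing service
        try:
            result = primary()
        except Exception:
            self.failures += 1
            return fallback()
        self.failures = 0          # a success resets the count
        return result

# Hypothetical dependency that is currently down.
calls = {"primary": 0}

def recommendations():
    calls["primary"] += 1
    raise ConnectionError("recommendation service unavailable")

def cached_recommendations():
    return ["popular titles"]

breaker = CircuitBreaker(failure_threshold=2)
results = [breaker.call(recommendations, cached_recommendations) for _ in range(5)]
```

All five calls get the cached answer, but the broken service is only hit twice before the breaker opens and spares it further traffic.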

That smooth SpaceX launch? Turns out one of the engines came apart

The Falcon 9, as its name implies, has nine engines, and is designed to go to orbit if one of them fails. On-board computers will detect engine failure, cut the fuel supply, and then distribute the unused propellant to the remaining engines, allowing them to burn longer. This seems to be the case where that was required, and the computers came through. The engines are also built with protection to limit the damage in cases where a neighboring engine explodes, which appears to be the case here.

via That smooth SpaceX launch? Turns out one of the engines came apart | Ars Technica.
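
A back-of-the-envelope sketch of why losing an engine means a longer burn. This is a toy model, not SpaceX's flight software: it assumes every engine burns propellant at the same constant rate and that all unburned propellant is shared among the survivors. The 180-second nominal burn and the 80-second failure time are made-up values.

```python
def burn_time_with_engine_out(engines, nominal_burn_s, failure_time_s, failed=1):
    """Total burn time when `failed` engines shut down at `failure_time_s`.

    Toy model: equal, constant propellant flow per engine, and the propellant
    not yet burned is shared among the surviving engines.
    """
    rate = 1.0  # propellant units per engine per second (arbitrary)
    total_propellant = engines * rate * nominal_burn_s
    used_before_failure = engines * rate * failure_time_s
    remaining = total_propellant - used_before_failure
    survivors = engines - failed
    return failure_time_s + remaining / (survivors * rate)

nominal = 180.0  # hypothetical nominal burn, in seconds
total = burn_time_with_engine_out(9, nominal, failure_time_s=80.0)
extra = total - nominal  # 12.5 extra seconds of burn in this toy case
```

With nine engines, the last 100 seconds of propellant spread over eight engines lasts 112.5 seconds, so the burn stretches from 180 to 192.5 seconds.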