A fundamental design flaw in Intel’s processor chips has forced a significant redesign of the Linux and Windows kernels to defang the chip-level security bug.
Source: ‘Kernel memory leaking’ Intel processor design flaw forces Linux, Windows redesign • The Register
There were rumors of a severe hypervisor bug – possibly in Xen – doing the rounds at the end of 2017. It may be that this hardware flaw is that rumored bug: that hypervisors can be attacked via this kernel memory access cockup, and thus need to be patched, forcing a mass restart of guest virtual machines.
In 1993/1994, at NASA’s Goddard Space Flight Center, Donald Becker and Thomas Sterling designed a Commodity Off The Shelf (COTS) supercomputer: Beowulf. Since they couldn’t afford a traditional supercomputer, they built a cluster computer made up of 16 Intel 486 DX4 processors, which were connected by channel-bonded Ethernet. This Beowulf supercomputer was an instant success.
Source: Linux totally dominates supercomputers | ZDNet
Linux first appeared on the Top500 in 1998. Before Linux took the lead, Unix was supercomputing’s top operating system. From 2003 on, the Top500 was on its way to Linux domination. By 2004, Linux had taken the lead for good.
Each processor core can run its own small program independently of the others, which is a fundamentally more flexible approach than so-called Single-Instruction-Multiple-Data approaches utilized by processors such as GPUs; the idea is to break an application up into many small pieces, each of which can run in parallel on different processors, enabling high throughput with lower energy use, Baas said.
Because each processor is independently clocked, it can shut itself down to further save energy when not needed, said graduate student Brent Bohnenstiehl, who developed the principal architecture.
Source: World’s First 1,000-Processor Chip | UC Davis
“We’ve been running TPUs inside our data centers for more than a year, and have found them to deliver an order of magnitude better-optimized performance per watt for machine learning. This is roughly equivalent to fast-forwarding technology about seven years into the future (three generations of Moore’s Law),” the blog said. “TPU is tailored to machine learning applications, allowing the chip to be more tolerant of reduced computational precision, which means it requires fewer transistors per operation. Because of this, we can squeeze more operations per second into the silicon, use more sophisticated and powerful machine learning models, and apply these models more quickly, so users get more intelligent results more rapidly.”
Source: Google’s Tensor Processing Unit could advance Moore’s Law 7 years into the future | PCWorld
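The "seven years, three generations" figure in the quote can be sanity-checked with a few lines of arithmetic. The sketch below assumes that one "generation" means one doubling of performance per watt and that a generation lasts 7/3 years, which is simply what the quote's own numbers imply; neither assumption comes from Google. The function names are made up for illustration.

```python
import math

# Assumption: the quote's "seven years" for "three generations" implies
# roughly 2.33 years per generation.
YEARS_PER_GENERATION = 7 / 3


def generations_for_gain(gain):
    """Number of doublings needed to reach a given improvement factor."""
    return math.log2(gain)


def years_for_gain(gain, years_per_generation=YEARS_PER_GENERATION):
    """Years of Moore's-Law-style doubling needed to reach `gain`."""
    return generations_for_gain(gain) * years_per_generation


if __name__ == "__main__":
    # 8x is exactly three doublings; 10x is a literal "order of magnitude".
    for gain in (8, 10):
        print(gain, generations_for_gain(gain), years_for_gain(gain))
```

Under these assumptions an 8x gain corresponds to exactly three doublings, while a literal 10x gain needs slightly more than three, so "order of magnitude" and "three generations" agree only roughly.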
AMD is facing a lawsuit over claims that it misrepresented the core counts of its eight-core Bulldozer products, but the lawsuit’s technical merit seems extremely weak.
Source: AMD lawsuit over false Bulldozer chip marketing is bogus | ExtremeTech
This lawsuit essentially asks a court to define what a core is and how companies should count them. As annoying as it is to see vendors occasionally abuse core counts in the name of dubious marketing strategies, asking a courtroom to make declarations about relative performance between companies is a cure far worse than the disease. From big iron enterprise markets to mobile devices, companies deploy vastly different architectures to solve different types of problems.
This week we’ll look at Amazon’s mighty cloud infrastructure, including how it builds its data centers and where they live (and why).
Source: Inside Amazon’s Cloud Computing Infrastructure
The Federal Aviation Administration this week said it had completed the momentous replacement of 40-year-old main computer systems that control air traffic in the US.
Known as En Route Automation Modernization (ERAM), the system is expected to increase air traffic flow, improve automated navigation and strengthen aircraft conflict detection services, with the end result being increased safety and less flight congestion.
Source: FAA: 2 million lines of code process new air traffic system | Network World
Lichtl says that people have tried to use wavelet compression before, and these particular simulations are based on work done by Jonathan Regele, a professor at the department of aerospace engineering at Iowa State University.
“The difference is that without GPU acceleration, and without the architecture and the techniques that we just described, it takes months on thousands of cores to run even the simplest of simulations. It is a very interesting approach, but it doesn’t have industrial application without the hardware and the correct algorithms behind it. What the GPUs are doing here is enabling tremendous acceleration.”
via Rockets Shake And Rattle, So SpaceX Rolls Homegrown CFD.
To be more precise, if you get the temperature wrong in the simulation by a little, you get the kinetic energy of the gas wrong by a lot because there is an exponential relationship there. If you get the pressure or viscosity of the fluid wrong by a little bit, you will see different effects in the nozzle than will happen in the real motor.
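The excerpt does not say which exponential relationship it has in mind. As one illustration of how a small temperature error can turn into a large error elsewhere, the sketch below uses the Arrhenius rate law, k = A·exp(−Ea / (R·T)), which is common in combustion modeling. The choice of that law, the activation energy and the temperatures are all assumptions picked for illustration, not values from the article or from SpaceX.

```python
import math

R = 8.314  # universal gas constant, J/(mol*K)


def arrhenius_rate(temperature_k, activation_energy=150e3, prefactor=1.0):
    """Arrhenius rate k = A * exp(-Ea / (R * T)).

    activation_energy (J/mol) and prefactor are hypothetical values.
    """
    return prefactor * math.exp(-activation_energy / (R * temperature_k))


def relative_rate_error(true_temp_k, temp_error_fraction, activation_energy=150e3):
    """Relative error in the rate caused by a relative error in temperature."""
    wrong_temp_k = true_temp_k * (1 + temp_error_fraction)
    wrong = arrhenius_rate(wrong_temp_k, activation_energy)
    right = arrhenius_rate(true_temp_k, activation_energy)
    return wrong / right - 1


if __name__ == "__main__":
    # Overestimate the temperature by 1% at two hypothetical gas temperatures.
    for temp_k in (1000.0, 2000.0):
        print(temp_k, relative_rate_error(temp_k, 0.01))
```

With these made-up numbers, a 1% temperature error produces a rate error several times larger than 1%, and the amplification grows as the temperature drops.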
These real-time applications, according to Donna Dillenberger, a distinguished engineer at IBM’s Watson lab, can be done in a mainframe environment. They are not yet possible on clusters of smaller, industry-standard computers, she said. But there are several open-source software projects, like Apache Spark, that focus on real-time data processing across large numbers of computers.
via IBM Introduces z13, a Mainframe for the Smartphone Economy – NYTimes.com.
He estimates the total cost of ownership including hardware, software and labor will be 50 percent less with a mainframe than on his “sprawling server farm,” given the growing complexity of managing hardware and software from several suppliers.
The most prominent prior art invalidating this patent is the RAID6 implementation in the Linux kernel (RAID6 being one of the most commonly used erasure codes). An article dated 2004 (i.e. ten years before the patent was granted to StreamScale) describes how it is optimized: “For additional speed improvements, it is desirable to use any integer vector instruction set that happens to be available on the machine, such as MMX or SSE-2 on x86, AltiVec on PowerPC, etc.” Here SSE-2 stands for Streaming SIMD Extensions 2. The patent cites Anvin’s article, but only to state the problem, and does not acknowledge that it also contains the solution.
via Erasure Code Patents | Analysis of Erasure Code Patents for everyone.
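To make the vector-instruction idea concrete, here is a minimal sketch of RAID6-style P and Q syndromes over GF(2^8), using the reduction polynomial 0x11d and generator 2 as in the Linux RAID6 code. It is not the kernel's implementation: the function names are invented, and `gf_mul2_packed` only imitates a vector register by packing eight bytes into one Python integer, so it shows the shape of the trick rather than real MMX or SSE2 code.

```python
def gf_mul2(b):
    """Multiply one byte by 2 in GF(2^8), reducing by polynomial 0x11d."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11D
    return b & 0xFF


def gf_mul2_packed(w):
    """Multiply eight bytes packed in a 64-bit integer by 2, all at once.

    This mimics what an integer vector instruction set does across a
    whole register; the 64-bit width is an illustrative assumption.
    """
    high = w & 0x8080808080808080          # top bit of every byte
    mask = (high << 1) - (high >> 7)       # 0xff where the top bit was set
    shifted = (w << 1) & 0xFEFEFEFEFEFEFEFE
    return shifted ^ (mask & 0x1D1D1D1D1D1D1D1D)


def raid6_syndromes(disks):
    """Return (P, Q) for equal-length data blocks D_0 .. D_{n-1}.

    P is the XOR of all blocks; Q is the sum of 2^i * D_i in GF(2^8),
    evaluated with Horner's rule from the last disk to the first.
    """
    n = len(disks[0])
    p = bytearray(n)
    q = bytearray(n)
    for block in reversed(disks):
        for i in range(n):
            p[i] ^= block[i]
            q[i] = gf_mul2(q[i]) ^ block[i]
    return bytes(p), bytes(q)
```

The inner loop of `raid6_syndromes` touches one byte at a time; replacing `gf_mul2` with a packed multiply over whole words is the kind of optimization the quoted passage describes.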