The new approach comes with a snappy name: chiplets. You can think of them as something like high-tech Lego blocks. Instead of carving new processors from silicon as single chips, semiconductor companies assemble them from multiple smaller pieces of silicon—known as chiplets.
A fundamental design flaw in Intel’s processor chips has forced a significant redesign of the Linux and Windows kernels to defang the chip-level security bug.
There were rumors of a severe hypervisor bug – possibly in Xen – doing the rounds at the end of 2017. It may be that this hardware flaw is that rumored bug: that hypervisors can be attacked via this kernel memory access cockup, and thus need to be patched, forcing a mass restart of guest virtual machines.
In 1993/1994, at NASA’s Goddard Space Flight Center, Donald Becker and Thomas Sterling designed a Commodity Off The Shelf (COTS) supercomputer: Beowulf. Since they couldn’t afford a traditional supercomputer, they built a cluster computer made up of 16 Intel 486 DX4 processors connected by channel-bonded Ethernet. This Beowulf supercomputer was an instant success.
Linux first appeared on the Top500 in 1998. Before Linux took the lead, Unix was supercomputing’s top operating system. From 2003 on, the Top500 was on its way to Linux domination. By 2004, Linux had taken the lead for good.
Each processor core can run its own small program independently of the others, which is fundamentally more flexible than the so-called Single-Instruction-Multiple-Data approach used by processors such as GPUs. The idea is to break an application up into many small pieces, each of which can run in parallel on a different processor, enabling high throughput with lower energy use, Baas said.
Because each processor is independently clocked, it can shut itself down to further save energy when not needed, said graduate student Brent Bohnenstiehl, who developed the principal architecture.
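As a rough illustration of the idea Baas describes, here is a minimal Python sketch in which each worker runs a different small program on its own data, instead of every worker applying the same instruction to different data. The three task functions and the helper `run_independent_tasks` are hypothetical names made up for this example, and Python threads only stand in for processor cores; this is not the chip's actual programming model.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical "small programs": each one does different work,
# unlike SIMD, where every lane executes the same instruction.
def count_words(text):
    return len(text.split())

def sum_squares(values):
    return sum(v * v for v in values)

def reverse_text(text):
    return text[::-1]

def run_independent_tasks(tasks):
    """Run each (function, argument) pair on its own worker.

    Results come back in the same order the tasks were given.
    """
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(fn, arg) for fn, arg in tasks]
        return [future.result() for future in futures]

# One application broken into three unrelated pieces.
TASKS = [
    (count_words, "many small pieces run in parallel"),
    (sum_squares, [1, 2, 3, 4]),
    (reverse_text, "chip"),
]

RESULTS = run_independent_tasks(TASKS)
print(RESULTS)
```

The point of the sketch is only the shape of the work: three unrelated functions are dispatched at once, and none of them has to wait for, or match, what the others are doing.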
“We’ve been running TPUs inside our data centers for more than a year, and have found them to deliver an order of magnitude better-optimized performance per watt for machine learning. This is roughly equivalent to fast-forwarding technology about seven years into the future (three generations of Moore’s Law),” the blog said. “TPU is tailored to machine learning applications, allowing the chip to be more tolerant of reduced computational precision, which means it requires fewer transistors per operation. Because of this, we can squeeze more operations per second into the silicon, use more sophisticated and powerful machine learning models, and apply these models more quickly, so users get more intelligent results more rapidly.”
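The arithmetic behind the "seven years" comparison can be checked with a short Python sketch. It assumes, for illustration only, that one generation of Moore's Law means a doubling and that a generation takes about two years; the blog post does not state either figure, and the function names are made up for this example.

```python
import math

def doublings_for_gain(gain):
    """Number of doublings needed to reach a given improvement factor."""
    return math.log2(gain)

def years_for_gain(gain, years_per_doubling):
    """Years needed for that gain, assuming a fixed time per doubling."""
    return doublings_for_gain(gain) * years_per_doubling

# Assumption: each generation doubles performance per watt.
GAIN_AFTER_THREE = 2 ** 3

# An order of magnitude (10x) needs slightly more than three doublings.
DOUBLINGS_FOR_10X = doublings_for_gain(10)

# Assumption: roughly two years per generation.
YEARS_FOR_10X = years_for_gain(10, 2.0)

print(GAIN_AFTER_THREE, round(DOUBLINGS_FOR_10X, 2), round(YEARS_FOR_10X, 2))
```

Under those assumptions, three doublings give an 8x gain, and a full 10x takes a little over three doublings, or a bit under seven years. That is consistent with the blog's rough equivalence.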
AMD is facing a lawsuit over claims that it misrepresented the core counts of its eight-core Bulldozer products, but the lawsuit’s technical merit seems extremely weak.
This lawsuit essentially asks a court to define what a core is and how companies should count them. As annoying as it is to see vendors occasionally abuse core counts in the name of dubious marketing strategies, asking a courtroom to make declarations about relative performance between companies is a cure far worse than the disease. From big iron enterprise markets to mobile devices, companies deploy vastly different architectures to solve different types of problems.
This week we’ll look at Amazon’s mighty cloud infrastructure, including how it builds its data centers and where they live (and why).
The Federal Aviation Administration this week said it had completed the momentous replacement of 40-year-old main computer systems that control air traffic in the US.
Known as En Route Automation Modernization (ERAM), the system is expected to increase air traffic flow, improve automated navigation and strengthen aircraft conflict detection services, with the end result being increased safety and less flight congestion.
Lichtl says that people have tried to use wavelet compression before, and these particular simulations are based on work done by Jonathan Regele, a professor in the Department of Aerospace Engineering at Iowa State University.
“The difference is that without GPU acceleration, and without the architecture and the techniques that we just described, it takes months on thousands of cores to run even the simplest of simulations. It is a very interesting approach but it doesn’t have industrial application without the hardware and the correct algorithms behind it. What the GPUs are doing here is enabling tremendous acceleration.”
To be more precise, if you get the temperature wrong in the simulation by a little, you get the kinetic energy of the gas wrong by a lot because there is an exponential relationship there. If you get the pressure or viscosity of the fluid wrong by a little bit, you will see different effects in the nozzle than will happen in the real motor.
These real-time applications, according to Donna Dillenberger, a distinguished engineer at IBM’s Watson lab, can be done in a mainframe environment. They are not yet possible on clusters of smaller, industry-standard computers, she said. But there are several open-source software projects, like Apache Spark, that focus on real-time data processing across large numbers of computers.
He estimates the total cost of ownership, including hardware, software and labor, will be 50 percent less with a mainframe than with his “sprawling server farm,” given the growing complexity of managing hardware and software from several suppliers.