Every F-35 squadron, no matter the country, has a 13-server ALIS package that is connected to the worldwide ALIS network. Individual jets send logistical data back to their nation’s Central Point of Entry, which then passes it on to Lockheed’s central server hub in Fort Worth, Texas. In fact, ALIS sends back so much data that some countries are worried it could give away too much information about their F-35 operations.
Source: F-35’s Hacking Vulnerability | Could the F-35 Be Hacked?
Hackers could conceivably introduce bad data in the JRE that could compromise the safety of a mission, shortening the range of a weapon system so that a pilot thinks she is safely outside the engagement zone when she is most certainly not.
These vulnerabilities are likely a known, detectable exploit vector. Any military aircraft should be able to perform its mission while disconnected from a network, except perhaps for drones.
Mind the Bullshit Asymmetry Principle, articulated by the Italian software developer Alberto Brandolini in 2013: the amount of energy needed to refute bullshit is an order of magnitude bigger than that needed to produce it. Or, as Jonathan Swift put it in 1710, “Falsehood flies, and truth comes limping after it.” Plus ça change.
Source: How to Call B.S. on Big Data: A Practical Guide
Into this universe comes Airbus SE, the European aerospace conglomerate. Airbus is starting a new data company, called Airbus Aerial, to provide an array of unmanned aerial vehicle (UAV) services, a field the company estimates could grow to more than $120 billion annually as the use of these fleets expands, said Dirk Hoke, CEO of Airbus’s defence and space group. Hoke introduced the new company Wednesday at Xponential.
Source: Here Comes the War for Commercial Drone Dominance – Bloomberg
The data release, part of the company’s Webscope initiative and announced on Yahoo’s Tumblr blog, is intended for researchers to use in validating recommender systems, high-scale learning algorithms, user-behaviour modelling, collaborative filtering techniques and unsupervised learning methods.
Source: Yahoo releases massive research dataset
From: Yahoo Releases the Largest-ever Machine Learning Dataset for Researchers
Today, we are proud to announce the public release of the largest-ever machine learning dataset to the research community. The dataset stands at a massive ~110B events (13.5TB uncompressed) of anonymized user-news item interaction data, collected by recording the user-news item interactions of about 20M users from February 2015 to May 2015.
Their overview stated that machine learning techniques emphasized causality less than traditional economic statistical techniques, or what’s usually known as econometrics. In other words, machine learning is more about forecasting than about understanding the effects of policy.
That would make the techniques less interesting to many economists, who are usually more concerned with giving policy recommendations than with making forecasts.
Source: Economics Has a Math Problem – Bloomberg View
Accuracy of 90 percent with 80 percent consistency sounds good, but the scores are “actually very poor, since they are for an exceedingly easy case,” Amaral said in an announcement from Northwestern about the study.
Applied to messy, inconsistently scrubbed data from many sources in many formats – precisely the kind of data big data is often praised for being able to manage – the results would be far less accurate and far less reproducible, according to the paper.
via Test shows big data text analysis inconsistent, inaccurate | Computerworld.
Here’s an interesting explanation of how LDA (Latent Dirichlet Allocation) works. From: What is a good explanation of Latent Dirichlet Allocation?
From a 3,000-foot level, as I understand the explanation, LDA seems like a mechanism for scoring words in order to categorize sets of words, such as paragraphs or entire papers. It’s an interesting exercise, but a human must model the data first. Any time a program has to estimate or guess like this there will be error; the only question is how much error is acceptable before the results of this kind of analysis become unusable.
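To make the “scoring words to categorize texts” idea concrete, here is a minimal sketch using scikit-learn’s LDA implementation. The documents, topic count, and parameters are made-up illustrations, not anything from the linked answer; the point is that a human chooses the vectorization and the number of topics up front, and the model then assigns each document a probability distribution over topics.

```python
# A minimal LDA sketch, assuming scikit-learn is installed.
# Documents and topic count are illustrative only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the pilot flew the aircraft over the base",
    "aircraft engines and pilot training programs",
    "stock markets and interest rates moved today",
    "rates rose as markets reacted to the bank",
]

# LDA works on word counts, not raw text, so a human-chosen
# vectorization step comes first -- the "data modeling" step.
counts = CountVectorizer(stop_words="english").fit_transform(docs)

# The number of topics (2 here) is also a human choice.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)

# Each row is one document's mixture over the 2 topics; rows sum to 1.
# That per-document distribution is the "score" used to categorize texts.
print(doc_topics.shape)
```

Because the topic assignments are probabilistic estimates, two similar documents can land in different topics, which is exactly the acceptable-error question raised above.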
A good predictive model requires a stable set of inputs with a predictable range of values that won’t drift away from the training set. And the response variable needs to remain of organizational interest.
via Surviving Data Science “at the Speed of Hype” – John Foreman, Data Scientist.
If you want to move at the speed of “now, light, big data, thought, stuff,” pick your big data analytics battles. If your business is currently too chaotic to support a complex model, don’t build one. Focus on providing solid, simple analysis until an opportunity arises that is revenue-important enough and stable enough to merit the type of investment a full-fledged data science modeling effort requires.
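The stability requirement quoted above (inputs that “won’t drift away from the training set”) can be sketched as a simple range check. The feature values, tolerance band, and threshold here are entirely made up for illustration; real drift detection would use distributional tests, but the idea is the same.

```python
# A toy input-drift check: flag production values that fall outside
# the range seen at training time. All numbers are illustrative.
training_values = [10.2, 11.0, 9.8, 10.5, 10.9]  # feature at training time

lo = min(training_values)
hi = max(training_values)

def drifted(new_values, tolerance=0.1):
    """Return True if any incoming value falls outside the training
    range, widened by a small tolerance band."""
    band = (hi - lo) * tolerance
    return any(v < lo - band or v > hi + band for v in new_values)

print(drifted([10.1, 10.7]))  # values inside the training range
print(drifted([15.0, 10.3]))  # one value far outside it
```

If checks like this start firing, the model’s predictions are being asked about a world it never saw, which is the moment a “full-fledged data science modeling effort” stops paying off.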
Microsoft this morning announced a deal to buy Revolution Analytics, the top commercial provider of software and services for the open-source R programming language for statistical computing and predictive analytics.
via Microsoft to buy Revolution Analytics, pushing further into big data – GeekWire.
When data is abundant, intelligence will win
Putting the power to publish and consume content into the hands of more people in more places enables everyone to start conversations with facts. With facts, negotiations can become less about who yells louder, but about who has the stronger data. They can also be an equalizer that enables better decisions and more civil discourse. Or, as Thomas Jefferson put it at the start of his first term, “Error of opinion may be tolerated where reason is left free to combat it.”
via Official Google Blog: From the height of this place.
It then goes on to say this:
The vast majority of computing will occur in the cloud
Within the next decade, people will use their computers completely differently than how they do today. All of their files, correspondence, contacts, pictures, and videos will be stored or backed-up in the network cloud and they will access them from wherever they happen to be on whatever device they happen to hold.
Of course Google wants this, since everyone will then need to use services like Google’s to access their data. Do people really need all their data accessible to them 24/7? Can anyone trust the security of their data when it is placed in the hands of a stranger?
A bird in the hand is worth two in the bush. There is nothing more secure than one or more hard drives (extras for backups) in a safe deposit box. No one needs to access their tax returns from anywhere at any time just because they can.
The switch from relational hadn’t been too hard because Riak is a key-value store, which made modeling relatively easy. Key-value stores are relatively simple database management systems that store just pairs of keys and values.
McCaul reckoned, too, that the migration of data had been made possible because the structure of patient records lent itself to Riak’s key-value model.
via NHS grows a NoSQL backbone and rips out its Oracle Spine • The Register.
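To show why patient records map so naturally onto a key-value model, here is a minimal sketch using a plain Python dict as a stand-in for a store like Riak. The bucket name, key scheme, and record fields are hypothetical; Riak itself organizes data the same way, as buckets of keys pointing at opaque values.

```python
# A toy key-value store: a dict standing in for something like Riak.
# Bucket names, keys, and record fields below are made up.
store = {}

def put(bucket, key, value):
    """Store one value under a (bucket, key) pair."""
    store[(bucket, key)] = value

def get(bucket, key):
    """Fetch a value by (bucket, key), or None if absent."""
    return store.get((bucket, key))

# One patient record becomes one key -> one self-contained value,
# which is why migrating from relational rows is straightforward:
# no joins, just a lookup by identifier.
put("patients", "nhs-123", {"name": "A. Smith", "dob": "1970-01-01"})

record = get("patients", "nhs-123")
print(record["name"])
```

The trade-off, of course, is that anything beyond lookup-by-key (ad hoc queries, joins across records) is no longer the database’s job, which is what makes the relational-to-key-value switch easy only when access patterns are this simple.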