Scientific Data Has Become So Complex, We Have to Invent New Math to Deal With It

Scientists like DeDeo and Vespignani make good use of this piecemeal approach to big data analysis, but Yale University mathematician Ronald Coifman says that what is really needed is the big data equivalent of a Newtonian revolution, on par with the 17th century invention of calculus, which he believes is already underway. It is not sufficient, he argues, to simply collect and store massive amounts of data; they must be intelligently curated, and that requires a global framework.

via Scientific Data Has Become So Complex, We Have to Invent New Math to Deal With It – Wired Science.

Among the most notable insights Euler gleaned from the puzzle was that the exact positions of the bridges were irrelevant to the solution; all that mattered was the number of bridges and how they were connected. Mathematicians now recognize in this the seeds of the modern field of topology.
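
Euler's observation is easy to try out in code. The sketch below is mine, not from the Wired piece: it writes the seven bridges down as pairs of land masses (the labels A to D are made up for illustration) and applies the standard degree-parity test, which says a connected multigraph has a walk crossing every edge exactly once only if zero or two nodes have an odd number of edges. Nothing in it records where a bridge physically sits, only what it connects.

```python
from collections import Counter

# Koenigsberg as a multigraph: four land masses, seven bridges.
# A is the central island; the labels are illustrative only.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_nodes(edges):
    """Return the nodes that touch an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)


def has_euler_walk(edges):
    """Degree-parity test; assumes the graph is connected."""
    return len(odd_degree_nodes(edges)) in (0, 2)


if __name__ == "__main__":
    print(odd_degree_nodes(BRIDGES))  # every land mass has odd degree
    print(has_euler_walk(BRIDGES))    # so no such walk exists
```

Moving a bridge along the riverbank changes nothing here, but rewiring one to join a different pair of land masses can change the answer.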

Scientists who took chemistry into cyberspace win Nobel Prize

Chemical reactions occur at lightning speed as electrons jump between atomic nuclei, making it virtually impossible to map every separate step in chemical processes involving large molecules like proteins.

Powerful computer models, first developed by the three scientists in the 1970s, offer a new window onto such reactions and have become a mainstay for researchers in thousands of academic and industrial laboratories around the world.

via Scientists who took chemistry into cyberspace win Nobel Prize – chicagotribune.com.

Private Cygnus Spacecraft Makes Historic 1st Rendezvous with Space Station

Orbital officials initially aimed for Cygnus to arrive at the space station on Sunday, Sept. 22, but a data format issue between the spacecraft and orbiting lab forced the company to abort that first rendezvous attempt. Troubleshooting efforts with that glitch and the impending arrival of a new space station crew aboard a Russian Soyuz spacecraft, which launched and docked on Wednesday (Sept. 25), pushed Cygnus’ arrival to today.

via Private Cygnus Spacecraft Makes Historic 1st Rendezvous with Space Station | Space.com.

The other firm is SpaceX of Hawthorne, Calif., which has a $1.9 billion contract for 12 supply missions using its Dragon space capsules and Falcon 9 rockets. SpaceX has flown two of those delivery missions already, and is expected to test fly an upgraded version of its Falcon 9 rocket later today in a launch from California. Unlike Cygnus, SpaceX’s Dragon capsules are equipped with a heat shield and can return science experiments and gear to Earth from the station.

Hibernate (Java)

Hibernate is an object-relational mapping (ORM) library for the Java language, providing a framework for mapping an object-oriented domain model to a traditional relational database. Hibernate solves object-relational impedance mismatch problems by replacing direct persistence-related database accesses with high-level object handling functions.

Hibernate is free software that is distributed under the GNU Lesser General Public License.

Hibernate’s primary feature is mapping from Java classes to database tables (and from Java data types to SQL data types). Hibernate also provides data query and retrieval facilities. It also generates the SQL calls and attempts to relieve the developer from manual result set handling and object conversion and keep the application portable to all supported SQL databases with little performance overhead.

via Hibernate (Java) – Wikipedia, the free encyclopedia.

The site to download it is here.
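
To make the class-to-table idea concrete, here is a toy sketch of what an ORM does. It is not Hibernate and does not use Hibernate's API; it is a hand-rolled illustration using only Python's standard library. The `Book` class, the `SQL_TYPES` table and the helper functions are all hypothetical names invented for the example.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class Book:
    """A plain domain object; the field names double as column names."""
    id: int
    title: str
    pages: int


# Hypothetical mapping from Python types to SQL column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT"}


def create_table(conn, cls):
    """Derive a CREATE TABLE statement from the class definition."""
    cols = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(cls))
    conn.execute(f"CREATE TABLE {cls.__name__.lower()} ({cols})")


def save(conn, obj):
    """Generate the INSERT so the caller never writes SQL by hand."""
    marks = ", ".join("?" for _ in fields(obj))
    table = type(obj).__name__.lower()
    conn.execute(f"INSERT INTO {table} VALUES ({marks})", astuple(obj))


def load_all(conn, cls):
    """Turn result rows back into objects of the mapped class."""
    names = ", ".join(f.name for f in fields(cls))
    rows = conn.execute(f"SELECT {names} FROM {cls.__name__.lower()}")
    return [cls(*row) for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_table(conn, Book)
    save(conn, Book(1, "Elements", 500))
    print(load_all(conn, Book))
```

A real ORM adds a great deal on top of this (relationships, caching, transactions, SQL dialect handling), but the core move is the same: the class definition drives the generated SQL, and rows come back as objects.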

The Enron E-mails’ Immortal Life

This research has had widespread applications: computer scientists have used the corpus to train systems that automatically prioritize certain messages in an in-box and alert users that they may have forgotten about an important message. Other researchers use the Enron corpus to develop systems that automatically organize or summarize messages. Much of today’s software for fraud detection, counterterrorism operations, and mining workplace behavioral patterns over e-mail has been somehow touched by the data set.

via The Enron E-mails’ Immortal Life | MIT Technology Review.

UML Tool for Fast UML Diagrams

UMLet is a free, open-source UML tool with a simple user interface: draw UML diagrams fast, produce sequence and activity diagrams from plain text, export diagrams to eps, pdf, jpg, svg, and clipboard, share diagrams using Eclipse, and create new, custom UML elements. UMLet runs stand-alone or as an Eclipse plug-in on Windows, OS X and Linux. (Also, check out its sister tool PLOTlet to create chart grids and our other tools.)

via UML Tool for Fast UML Diagrams.

Information Extraction and Synthesis Laboratory

Cross-document coreference resolution is the task of grouping the entity mentions in a collection of documents into sets that each represent a distinct entity. It is central to knowledge base construction and also useful for joint inference with other NLP components. Obtaining large, organic labeled datasets for training and testing cross-document coreference has previously been difficult. We use a method for automatically gathering massive amounts of naturally-occurring cross-document reference data to create the Wikilinks dataset comprising 40 million mentions over 3 million entities. Our method is based on finding hyperlinks to Wikipedia from a web crawl and using anchor text as mentions. In addition to providing large-scale labeled data without human effort, we are able to include many styles of text beyond newswire and many entity types beyond people.

via Wikilinks – Information Extraction and Synthesis Laboratory.
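
The anchor-text trick described in that excerpt can be sketched in a few lines. This is my own rough illustration, not the lab's pipeline: it scans one HTML page for links whose host is on wikipedia.org, treats the page title in the URL as the entity, and records the link text as a mention of it. The class name, the helper function and the sample page are all made up for the example.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class WikiLinkCollector(HTMLParser):
    """Collect anchor text for links that point at Wikipedia articles."""

    def __init__(self):
        super().__init__()
        self.mentions = defaultdict(list)  # entity title -> anchor texts
        self._target = None                # entity of the link being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        url = urlparse(dict(attrs).get("href") or "")
        host = url.netloc.lower()
        on_wikipedia = host == "wikipedia.org" or host.endswith(".wikipedia.org")
        if on_wikipedia and url.path.startswith("/wiki/"):
            self._target = unquote(url.path[len("/wiki/"):])
            self._text = []

    def handle_data(self, data):
        if self._target is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._target is not None:
            mention = "".join(self._text).strip()
            if mention:
                self.mentions[self._target].append(mention)
            self._target = None


def collect_mentions(html):
    """Return {entity title: [anchor texts]} for one page of HTML."""
    collector = WikiLinkCollector()
    collector.feed(html)
    return dict(collector.mentions)


if __name__ == "__main__":
    page = (
        '<p><a href="https://en.wikipedia.org/wiki/Leonhard_Euler">Euler</a> '
        'was later called <a href="http://en.wikipedia.org/wiki/Leonhard_Euler">'
        'the Swiss mathematician</a> on <a href="https://example.com/">this site</a>.</p>'
    )
    print(collect_mentions(page))
```

Two differently worded links end up grouped under the same entity, which is the kind of labeled coreference data the excerpt says comes for free from a web crawl.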

Sharpening Endpoint Security

Endpoints are as hard to define as they are to protect. The term traditionally referred to desktops and laptops, but endpoints now encompass smartphones, tablets, point-of-sale machines, bar code scanners, multifunction printers and practically any other device that connects to the company network. Without a well-conceived strategy, keeping track of and securing these devices is difficult and frustrating.

via Sharpening Endpoint Security – Dark Reading.

Some IT shops buy cleverly marketed products that promise off-the-shelf endpoint security using anti-malware and sandboxing. In most cases, attackers can easily bypass those defenses.

The 5 Commandments Of Data And Why Analytics Efforts Are Still A Big Old Mess

Data has to be a strategic asset. The presence of consultants at a conference like Strata shows how much confusion people still have in realizing how to get the value that vendors promise in such bountiful amounts.

via The 5 Commandments Of Data And Why Analytics Efforts Are Still A Big Old Mess | TechCrunch.

I don’t have the patience to watch people talk, but it sounds like data analytics might be a lucrative field to be in right now.