Exploring the Wall Street Journal’s Pulitzer-Winning Medicare Investigation with SQL

This is a SQL-based introduction to the data and analysis behind the Wall Street Journal’s Pulitzer-winning “Medicare Unmasked” investigative project. It also doubles as a helpful guide if you’re attempting the midterm based on the WSJ’s Medicare investigation.

Source: Exploring the Wall Street Journal’s Pulitzer-Winning Medicare Investigation with SQL | Public Affairs Data Journalism at Stanford University

To follow along in this walkthrough, you can download my SQLite database here:

The hot new technology in Big Data is decades old: SQL

Over the past six months, vendors have responded to the demand for more corporate-friendly analytics by announcing a slew of systems that offer full SQL query capabilities with significant performance improvements over existing Hive/Hadoop systems. These systems are designed to allow full SQL queries over warehouse-size data sets, and in most cases they bypass Hadoop entirely (although some are hybrid approaches). Allowing much faster SQL queries at scale makes big data analytics accessible by many more people in the enterprise and fits in with existing workflows.

via The hot new technology in Big Data is decades old: SQL | Ars Technica.

SQL vs. NoSQL: Which Is Better?

So what can we conclude? Well, with the drivers here I focused primarily on ease-of-use. There are other factors that need to be considered as well. Do they support connection pooling, for example? Do they cache? What about pulling in large amounts of data? (Hint: Most of the better drivers for most of the popular languages support cursors, so you don’t have to pull all the data in at once.) Those are factors you’ll need to investigate as you choose a driver for the language and database you’re using. But in general, virtually all the popular languages today, including Java, PHP, Python, Perl, and even C++, have nice libraries that make database programming far easier than it used to be.

via SQL vs. NoSQL: Which Is Better?.
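
The point about cursors in the excerpt above can be made concrete with a small sketch. It uses Python’s built-in sqlite3 module and reads a result set in batches with `fetchmany` rather than loading every row at once. The `payments` table, its columns, the generated rows, and the `iter_rows` helper are all hypothetical, invented only for illustration; they are not taken from the article or from any real dataset.

```python
import sqlite3


def iter_rows(conn, query, batch_size=1000):
    """Yield rows from `query` one at a time, fetching them in batches.

    Only `batch_size` rows are held in memory at once, which is the
    benefit of a cursor that the excerpt alludes to.
    """
    cursor = conn.cursor()
    cursor.execute(query)
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:  # an empty list means the result set is exhausted
            break
        yield from batch


# Hypothetical demo data: an in-memory table standing in for a large one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (provider_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [(i, i * 10.0) for i in range(5000)],
)

# Aggregate while streaming, without building a list of all rows.
total = 0.0
row_count = 0
query = "SELECT provider_id, amount FROM payments"
for provider_id, amount in iter_rows(conn, query, batch_size=500):
    total += amount
    row_count += 1

print(row_count, total)
```

Most drivers for other databases that follow Python’s DB-API expose the same `execute` and `fetchmany` calls, so the pattern should carry over, though whether rows are actually streamed from the server depends on the driver.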

MongoDB does great with large complex structures that are typically read in individually, while the large relational databases do well when I’m processing huge amounts of data. And no, my clients’ data needs are nowhere near as big as Google’s, so we don’t encounter any performance and scalability problems.

SQL Developer – The Universal Database Frontend

SQL Developer is a database administration and query tool that provides a single consistent interface for various databases.

Visually navigate through your database structure, create and execute SQL queries and scripts the easy way. Or reverse engineer complete data models with the integrated diagram editor.

via SQL Developer – The Universal Database Frontend.