File and Data Storage: AFS

AFS (Andrew File System) is a distributed, networked file system that enables efficient file sharing between clients and servers. AFS files are accessible via the Web or through file transfer programs such as OpenAFS, Fetch (Macintosh), and SecureFX (Windows). Currently, all users with a full-service SUNet ID are granted 2 GB of AFS file space. Additional disk space is available by request for faculty-sponsored research, including dissertations.

via File and Data Storage: AFS | Information Technology Services.

Synology Network Attached Storage – DS1812+ Products

iSCSI is also supported as an ideal alternative to SAN solutions for business. Affordable and cost-effective, iSCSI allows large-scale businesses to consolidate storage into data center storage arrays while providing hosts with the illusion of locally-attached disks. With iSCSI support, the DS1812+ provides a seamless storage solution for virtualization servers such as VMware, Citrix, and Hyper-V.

via Synology Network Attached Storage – DS1812+ Products.

NSA Mimics Google, Pisses Off Senate

But the NSA also saw the database as something that could improve security across the federal government — and beyond. Last September, the agency open sourced its Google mimic, releasing the code as the Accumulo project. It’s a common open source story — except that the Senate Armed Services Committee wants to put the brakes on the project.

via NSA Mimics Google, Pisses Off Senate | Wired Enterprise | Wired.com.

Google Research Publication: BigTable

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this paper we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.

via Google Research Publication: BigTable.
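
At its core, the Bigtable data model is a sparse, sorted, multi-dimensional map: each value is addressed by a row key, a column name (family:qualifier), and a timestamp, and rows are kept in sorted order so range scans are cheap. Here is a toy Python sketch just to make the shape of that map concrete; it is an illustration of the paper's data model, not Google's client API, and the table contents are made up (loosely echoing the paper's web-indexing example).

```python
# Toy sketch of the Bigtable data model: a sparse, sorted map from
# (row key, column family:qualifier, timestamp) -> value.
# Illustration only, not Google's Bigtable client API.

class ToyBigtable:
    def __init__(self):
        # keys are (row_key, column, timestamp) tuples
        self._cells = {}

    def put(self, row_key, column, value, timestamp):
        self._cells[(row_key, column, timestamp)] = value

    def read_row(self, row_key):
        """Return the newest value for each column in a row."""
        newest = {}
        for (row, col, ts), value in self._cells.items():
            if row == row_key and (col not in newest or ts > newest[col][0]):
                newest[col] = (ts, value)
        return {col: value for col, (ts, value) in newest.items()}

    def scan(self, start_row, end_row):
        """Rows sort lexicographically, so range scans are simple and cheap."""
        rows = sorted({row for (row, _, _) in self._cells})
        return [(r, self.read_row(r)) for r in rows if start_row <= r < end_row]


# Example loosely modeled on the paper's web-indexing use case:
# the row key is a reversed URL, and columns group related data into families.
table = ToyBigtable()
table.put("com.example.www/index.html", "contents:", "<html>...</html>", timestamp=3)
table.put("com.example.www/index.html", "anchor:cnnsi.com", "Example", timestamp=3)
print(table.read_row("com.example.www/index.html"))
```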

Apache Accumulo

The Apache Accumulo™ sorted, distributed key/value store is a robust, scalable, high-performance data storage and retrieval system. Apache Accumulo is based on Google’s BigTable design and is built on top of Apache Hadoop, ZooKeeper, and Thrift. Apache Accumulo features a few novel improvements on the BigTable design in the form of cell-based access control and a server-side programming mechanism that can modify key/value pairs at various points in the data management process. Other notable improvements and features are outlined here.

via Apache Accumulo.
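
The cell-based access control mentioned above works by attaching a visibility expression to every key/value pair; a scan only returns cells whose expression is satisfied by the reader's authorizations. Below is a conceptual Python sketch of that filtering step, under a deliberately simplified expression grammar (single labels plus flat '&' and '|'). It is not Accumulo's actual Java API, and the example cells are invented.

```python
# Conceptual sketch of Accumulo-style cell-level visibility: every cell
# carries a visibility expression, and a scan filters cells against the
# reader's set of authorizations. Not the real Accumulo API; the real
# grammar also supports parentheses and nesting, which are omitted here.

def visible(expression, authorizations):
    """Evaluate a flat visibility expression such as 'admin|audit' or 'pii&hr'."""
    if not expression:          # an empty expression is visible to everyone
        return True
    if "|" in expression:
        return any(label in authorizations for label in expression.split("|"))
    if "&" in expression:
        return all(label in authorizations for label in expression.split("&"))
    return expression in authorizations


# cells: (row, column, visibility expression) -> value
cells = {
    ("user123", "name", ""):                  "Alice",
    ("user123", "ssn", "pii&hr"):             "000-00-0000",
    ("user123", "last_login", "admin|audit"): "2012-06-01",
}

def scan(cells, authorizations):
    return {key[:2]: value
            for key, value in cells.items()
            if visible(key[2], set(authorizations))}

print(scan(cells, ["audit"]))        # sees name and last_login, not ssn
print(scan(cells, ["pii", "hr"]))    # sees name and ssn, but not last_login
```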

SQL vs. NoSQL: Which Is Better?

So what can we conclude? Well, with the drivers here I focused primarily on ease-of-use. There are other factors that need to be considered, as well. Do they support connection pooling, for example? Do they cache? What about pulling in large amounts of data? (Hint: Most of the better drivers for most of the popular languages support cursors, so you don’t have to pull all the data in at once.) Those are factors you’ll need to investigate as you choose a driver for the language and database you’re using. But in general, virtually all the popular languages today, including Java, PHP, Python, Perl, and even C++, have nice libraries that make database programming far easier than it used to be.

via SQL vs. NoSQL: Which Is Better?.
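
On the point about pulling in large amounts of data: a cursor lets you fetch rows in batches instead of materializing the whole result set in memory. Here is a small sketch using Python's built-in sqlite3 module, with a throwaway in-memory table so it runs on its own; drivers for other databases (psycopg2, for example) expose the same fetchmany pattern.

```python
import sqlite3

# Sketch of cursor-based fetching: process a large result set in batches
# instead of pulling every row into memory with fetchall().
# The in-memory table is just a stand-in so the example runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO events (payload) VALUES (?)",
                 [(f"event-{i}",) for i in range(10_000)])

cur = conn.cursor()
cur.execute("SELECT id, payload FROM events ORDER BY id")

total = 0
while True:
    batch = cur.fetchmany(1000)     # pull at most 1000 rows at a time
    if not batch:
        break
    total += len(batch)             # stand-in for real per-row processing

print(total)                        # 10000
conn.close()
```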

MongoDB does great with large, complex structures that are typically read in individually, while the large relational databases do well when I’m processing huge amounts of data. And no, my clients’ data needs are nowhere near as big as Google’s, so we don’t encounter any performance or scalability problems.
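
To make that contrast concrete, here is the same order represented both ways in plain Python. The data is invented and no actual MongoDB driver is involved; the dicts and tuples simply stand in for a document and for normalized relational rows.

```python
# The same order, represented two ways.

# Document-store style: one nested structure, fetched in a single read by _id.
order_document = {
    "_id": "order-1001",
    "customer": {"name": "Acme Corp", "email": "ops@example.com"},
    "items": [
        {"sku": "WID-1", "qty": 3, "price": 9.99},
        {"sku": "GAD-7", "qty": 1, "price": 24.50},
    ],
}

# Relational style: the same data normalized into flat rows across tables,
# which is convenient when you are aggregating over huge numbers of orders.
orders = [("order-1001", "cust-1")]
customers = [("cust-1", "Acme Corp", "ops@example.com")]
order_items = [
    ("order-1001", "WID-1", 3, 9.99),
    ("order-1001", "GAD-7", 1, 24.50),
]

# Reassembling the nested view from the rows takes joins; reading the
# document version back is a single lookup. Bulk math over many orders
# is where the flat rows shine.
total = sum(qty * price for _, _, qty, price in order_items)
print(total)
```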

Cloud Security: What You Need to Know to Lock It Down

“The most important thing to remember when you’re storing or processing sensitive data in the cloud is that you are still fully responsible for the security of the data, and you are fully accountable if that data is lost or stolen,” Shaul concluded. “Even if your cloud provider offers some security services or indemnifies you for losses resulting from a breach, if your data is stolen, it’s still your problem.”

via Cloud Security: What You Need to Know to Lock It Down.

Cloud storage: a pricing and feature guide for consumers

Cloud storage services are cropping up left and right, all enticing their customers with a few gigabytes of storage that sync seemingly anywhere, with any device. We’ve collected some details on the most popular services, including Google Drive, to compare them.

via Cloud storage: a pricing and feature guide for consumers | Ars Technica.

I don’t normally post images here, but this chart makes for a quick reference. The linked article has much more detail and is worth a read.

What intrigued me about this is the max file size. This is probably set to keep people from building their own file containers (e.g. tar, zip, etc.), which is what I hoped to do. Dropbox allows a max file size of 300 MB; iCloud allows 25 MB. Building your own file containers gives you more local control over the security of those files, because it lets you use your own encryption. 300 MB seems suitable for even rather large databases; apps will have to be more frugal with a 25 MB limit. I need to start using my Dropbox account. I will report more on this later.
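
For what it’s worth, here is a rough sketch of that container idea: tar up a directory, encrypt the archive locally with your own key, and check it against the 300 MB limit before copying it into the synced folder. It assumes the third-party cryptography package (Fernet) and made-up paths, and it is only an outline of the approach, not a hardened tool.

```python
import os
import tarfile
from cryptography.fernet import Fernet   # third-party: pip install cryptography

MAX_DROPBOX_FILE = 300 * 1024 * 1024     # the 300 MB per-file limit noted above

# Hypothetical paths, used only for this example.
source_dir = "my_private_data"
os.makedirs(source_dir, exist_ok=True)   # stand-in directory so the sketch runs
archive_path = "container.tar.gz"
encrypted_path = "container.tar.gz.enc"
key_path = "container.key"

# 1. Build your own file container locally.
with tarfile.open(archive_path, "w:gz") as tar:
    tar.add(source_dir)

# 2. Encrypt it with a key that never leaves your machine.
key = Fernet.generate_key()
with open(key_path, "wb") as f:
    f.write(key)                         # store this outside the synced folder
with open(archive_path, "rb") as f:
    token = Fernet(key).encrypt(f.read())
with open(encrypted_path, "wb") as f:
    f.write(token)

# 3. Only sync it if it fits under the per-file limit.
size = os.path.getsize(encrypted_path)
if size <= MAX_DROPBOX_FILE:
    print("OK to sync:", size, "bytes")
else:
    print("Too big for a single Dropbox file:", size, "bytes")
```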