Google Storage - Google's answer to Amazon S3?
Monday, April 25, 2011 » Amazon, Cloud, Google Storage, IaaS, S3
While I've only had a little time to investigate and play around with the new Google service, I've been using S3 for a few years now, and S3 (along with the rest of Amazon Web Services (AWS)) has proven to be pretty fast, stable, and reliable (um...except for last week). Naturally, my first question was: how do they compare?

Both services provide RESTful APIs that allow access to objects (aka files) via HTTP. Requests usually require all kinds of authentication, but with minimal tweaking of permissions, objects can be accessed without any access controls at all, in essence turning both services into simple web servers. This turns out to be a perfect configuration for a few simple, non-scientific speed tests. I placed files of various sizes on each service and measured the time it took to retrieve them (using time wget from the command line). I ran my tests using several ISPs and from several locations. Initial tests show downloads from Google Storage to be, on average, 30% faster than those from Amazon S3.

Amazon has an add-on offering called CloudFront. CloudFront is a content distribution network (CDN) that caches objects from storage and serves end-user requests from the nearest edge location, optimizing download speeds by reducing the latency of network traversal. I re-ran my Amazon S3 tests using CloudFront and found that with CloudFront+S3, download speeds are almost dead-on the same as those from Google Storage.

File storage on Amazon costs, at maximum, $0.15/GB (there are discounts for volume or for sacrificing a few 9's of reliability). File storage on Google Storage costs $0.17/GB. Each service also has transfer charges (both are $0.10/GB upload and $0.15/GB download, plus various minor charges per 1,000 or 10,000 HTTP requests depending on the operation).
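To put the list prices above side by side, here is a minimal Python sketch of a rough bill. The function name, the dictionary layout, and the sample workload are my own illustration, not anything either service provides. It assumes the storage price is billed per GB per month, and it ignores the per-request charges and volume discounts.

```python
# Per-GB list prices quoted in this post (April 2011). Request charges and
# volume discounts are left out; monthly billing for storage is an assumption.
PRICES = {
    "Amazon S3": {"storage": 0.15, "upload": 0.10, "download": 0.15},
    "Google Storage": {"storage": 0.17, "upload": 0.10, "download": 0.15},
}


def estimate_cost(service, stored_gb, uploaded_gb, downloaded_gb):
    """Return a rough bill in dollars for the given usage."""
    p = PRICES[service]
    return (stored_gb * p["storage"]
            + uploaded_gb * p["upload"]
            + downloaded_gb * p["download"])


if __name__ == "__main__":
    # Hypothetical workload: 100 GB stored, 10 GB uploaded, 50 GB downloaded.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 100, 10, 50):.2f}")
```

For that hypothetical workload the estimate comes to $23.50 on Amazon S3 and $25.50 on Google Storage. Since the transfer prices are identical, the whole gap is the $0.02/GB difference in storage.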
While at first glance Google Storage is priced significantly higher, serving files out of Google Storage is cheaper than S3+CloudFront, so S3 (without CloudFront) and Google Storage may each have use cases where the economics favor one or the other. It's worth noting that CloudFront can cache objects from any server, anywhere, so it has its cost-beneficial use cases too. Also, since my tests were very casually constructed, your mileage may vary.

One apparent difference between the two services is that Amazon S3 presents a hierarchical file store with folders that can contain files and other folders (strictly speaking, S3 also stores objects under flat keys, and its tools derive the folders from key prefixes). Google Storage supports a flat file store. While there are no folders, filenames can include forward slashes (e.g. foo/bar), so a hierarchy of files can be copied into Google Storage, but each file is stored as a separate object at the top level of its bucket.

Both Google Storage and Amazon S3 offer access control based upon each service's respective user account system. Each also offers a fairly rich (complex) set of access controls. Amazon also has a "Signed URI" mechanism that provides a way to share S3 resources with any third party, regardless of whether they have an account on Amazon Web Services. If Google Storage has a similar mechanism, I haven't found it yet. I will delve further into access controls in a later article, after I've had more time to explore Google Storage.

Google provides a pair of tools to initially access and manage resources on Google Storage. GSutil is an open-source command-line utility written in Python that provides a way to upload and download files and to manage buckets and objects in Google Storage...and...most curiously...Amazon S3 too. I have not had a chance to exercise this tool yet, but it appears to allow similar operations on both cloud file stores. The other tool Google provides is a web-based utility called Google Storage Manager.
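As an aside on the flat file store described earlier: object names like foo/bar can still be browsed as if they were folders, because a tool only has to group names by prefix. The Python sketch below is a purely local illustration of that idea. The function name `list_folder` and the sample object names are hypothetical, and the code does not call either service's API.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Emulate a folder listing over flat object names.

    Returns (files, subfolders) found directly under `prefix`.
    """
    files, subfolders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Anything before the next delimiter acts as a subfolder name.
            subfolders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        elif rest:
            files.append(key)
    return sorted(files), sorted(subfolders)


if __name__ == "__main__":
    # Hypothetical object names stored in a single flat bucket.
    keys = ["foo/bar", "foo/baz/qux", "readme.txt"]
    print(list_folder(keys))
    print(list_folder(keys, "foo/"))
```

Listing the top level yields readme.txt plus a foo/ "folder", and listing under foo/ yields foo/bar plus a foo/baz/ "folder", even though all three objects live side by side in the bucket.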
The web-based utility is a graphical, explorer-like tool that provides basic capabilities such as uploading, deleting, creating new buckets, and setting ACLs to public. By comparison, Amazon S3's web-based console is much more complete, allowing detailed manipulation of ACLs and MIME types.

Amazon S3 has been around for a while and is a mature and stable offering. Many developers have built commercial consumer applications and services on top of S3 (Netflix, Dropbox, and Carbonite, to name just a few). Google Storage appears to hold great promise as a viable alternative and, perhaps, to make cloud storage enough of a commodity to drive down the price for everybody.

Today I received my invitation for "Google Storage for Developers," Google's limited-access (by invitation only) beta test of its new cloud-based storage offering. Google describes the offering as "... a new service for developers to store and access data in Google's cloud." Their intro goes on to say "It offers developers direct access to Google's scalable storage and networking infrastructure as well as powerful authentication and data sharing mechanisms." In other words, Google is preparing to launch their own competition to Amazon S3 (Amazon's cloud-based storage service).