Amazon S3 as a Backup Solution

by Nate 23. September 2007 22:57

I've always understood the importance of backing up my workstations, but haven't always followed through with action. Luckily it has never come back to burn me too badly, although I did have one bad experience a couple of years ago when I lost a good portion of my digital music library. Well, over the last year and a half I've tried to change this by consistently backing up important files to external hard drives and taking image snapshots of all my computers using Acronis' True Image product (see My Favorite Applications and Tools post for more info on True Image). However, with more and more of my life (documents, emails, music, pictures, projects, videos, etc.) living on a computer somewhere, I've decided that I now need a more robust and fail-proof backup solution. Enter Amazon Simple Storage Service (S3).

For those who aren't familiar with S3, it is an "in-the-cloud" storage service that Amazon provides as part of its growing list of web services. These web services are built using the same technologies that Amazon.com uses, meaning that a lot of lessons have been learned and integrated into them. Developers can access these services through a variety of web service interfaces, and, using S3, can gain access to unlimited storage on Amazon's robust infrastructure. And yes, this includes the same availability and performance that users have come to expect from Amazon.com. As for backups, S3 is appealing not only because of its availability and performance, but also because files stored in S3 are encrypted and redundantly stored in multiple data centers, meaning that the likelihood of loss of files is extremely low, especially when compared to the chance that your external hard drive will fail. That said, I still don't recommend making an S3 repository your authoritative copy of your data, as it's always a good idea to have at least one copy of your data somewhere where you can access it, no matter what.

I've been keeping an eye on Amazon Web Services for quite a while now, and have recently opened up a dialog with the Amazon Web Services team about the possibility of migrating some of my organization's map cache and media into S3. I've also been playing around with integrating S3 into a couple of demo ASP.NET and Ruby on Rails applications as proofs-of-concept. As I've learned more and more about S3, the idea of using it for personal backups of important information has grown on me.

As I hinted at earlier, Amazon gives developers tools to authenticate with S3 and transfer files to and from it. I, however, wanted an easier, more automated solution to help with my backups. I looked at a couple of options, including the Firefox Organizer for Amazon S3 (which I definitely suggest using, if just for browsing the files that you're storing on S3) and Jungle Disk, a GUI for Amazon S3 that mounts your S3 storage as a local drive and optionally automates backups. Jungle Disk is supported on Windows, Mac, and Linux, although I'm not sure how the drive is mounted on Linux. I decided on Jungle Disk, as it's easy to set up and use and acts a lot like a local hard drive. Note that there is a one-time $20 fee for a lifetime license for the utility. In my opinion it is well worth it. As I've said before, I'm willing to pay for a service that makes my life easier.

Jungle Disk has a ton of configuration options, but is still trivial to get set up and running. You simply download and install the utility, sign up for an S3 account, tell Jungle Disk your S3 account information, and tell it which files (or directories) you want it to back up and when. If you set it up for automatic backups, it will perform a scheduled check to see which files have changed and only back up those that have been modified, keeping the S3 bandwidth/storage costs as low as possible. Taken from the Amazon S3 pricing page:

Storage

  • $0.15 per GB-Month of storage used

Data Transfer

  • $0.10 per GB - all data transfer in
  • $0.18 per GB - first 10 TB / month data transfer out

Requests

  • $0.01 per 1,000 PUT or LIST requests
  • $0.01 per 10,000 GET and all other requests

Pretty simple and cheap, eh?
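To make the pricing above concrete, here is a rough sketch of a monthly cost estimate in Python. The rates are copied from the list above; the function name, its parameters, and the sample usage figures are my own assumptions for illustration, and the sketch only handles the first 10 TB tier for outbound transfer since that is the only tier quoted here.

```python
# Rates copied from the pricing list above (USD, as quoted in this post).
STORAGE_PER_GB_MONTH = 0.15
TRANSFER_IN_PER_GB = 0.10
TRANSFER_OUT_PER_GB = 0.18       # only valid for the first 10 TB / month
PUT_LIST_PER_1000 = 0.01
GET_OTHER_PER_10000 = 0.01
FIRST_TIER_OUT_GB = 10 * 1024    # assumption: 1 TB counted as 1024 GB


def estimate_monthly_cost(stored_gb, uploaded_gb, downloaded_gb,
                          put_list_requests=0, get_requests=0):
    """Return an estimated monthly bill in USD, rounded to cents.

    Hypothetical helper, not part of any Amazon or Jungle Disk tool.
    """
    if downloaded_gb > FIRST_TIER_OUT_GB:
        # Higher tiers are not listed in the post, so refuse to guess.
        raise ValueError("only the first 10 TB/month outbound tier is modeled")

    total = (
        stored_gb * STORAGE_PER_GB_MONTH
        + uploaded_gb * TRANSFER_IN_PER_GB
        + downloaded_gb * TRANSFER_OUT_PER_GB
        + (put_list_requests / 1000) * PUT_LIST_PER_1000
        + (get_requests / 10000) * GET_OTHER_PER_10000
    )
    return round(total, 2)


# Made-up example month: 50 GB stored, 5 GB uploaded, 1 GB downloaded,
# 2,000 PUT/LIST requests and 10,000 GET requests.
print(estimate_monthly_cost(50, 5, 1, 2000, 10000))  # 8.21
```

Storage dominates in a backup scenario like this one (7.50 of the 8.21 total), which is why only uploading changed files matters less for the bill than simply being selective about what you store.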

I just installed Jungle Disk and ran my first backup today. I got an average of ~600 kb/s on upload (and, yes, you can limit the upload speed and tell it to run only at certain times), and interacting with the files through Windows Explorer is just as fast as interacting with them on a local disk. In fact, as long as Jungle Disk knows where a copy of a file lives on my local machine, it will access that file automatically rather than going out to S3 to access it. If S3, however, has a newer copy of the file, it will fetch that copy instead.
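The "only back up what changed" behavior mentioned earlier can be sketched in a few lines of Python. This is not how Jungle Disk is actually implemented (I don't know its internals); it is a minimal illustration that assumes a hypothetical manifest mapping each relative file path to the modification time recorded at the last backup.

```python
import os
from typing import Dict, List


def files_to_upload(root: str, manifest: Dict[str, float]) -> List[str]:
    """Return relative paths under root that are new or modified.

    manifest is a hypothetical record of {relative_path: mtime} saved
    after the previous backup run. A file needs uploading if it is not
    in the manifest or its current mtime is newer than the recorded one.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            mtime = os.path.getmtime(path)
            if rel not in manifest or mtime > manifest[rel]:
                changed.append(rel)
    return sorted(changed)
```

Comparing modification times is cheap but can miss edits that preserve the timestamp; comparing content hashes would be more thorough at the cost of reading every file on each run.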

Overall, so far so good. Maybe I can finally have some peace of mind about my digital files?

Tags:

Amazon S3 | Life | Utilities

MapDotNet Server 2007 6.1.2 Released Along with Better Documentation

by Nate 7. September 2007 03:56

Less than a month after the 6.1.1 release, ISC announced yesterday the release of MapDotNet Server 2007 6.1.2. I love it when development teams embrace the "release early, release often" approach. This is the approach that the ArcGIS Explorer team has taken, and look how much their product has improved in the (relatively) short time that it has been out. According to the release notes, 6.1.2 is a maintenance release, but there are two critical improvements related to ArcSDE support that merit mention. Quoted from the release notes:

  • "Substantially improved rendering speeds with ArcSDE. This is especially the case in large multi-processor web garden deployments where the MDNS services are less likely to be processor-bound. Substantially improved SDE connection pooling resulted in upwards of 10 times the rendering performance in our tests. This was especially noticeable when ArcSDE is installed on a separate server from the MDNS web services."
  • "Improved locking support for ArcSDE where large numbers of spatial queries/edits/transforms and map renderings are occurring simultaneously. The ArcSDE ESRI client connector is not thread safe and under heavy load faults were encountered. This has been resolved through better locking in MDNS."

As I mentioned in a previous post, performance in general increased many-fold with the addition of SQL Server tile caching support and the tile over-fetching capabilities in the 6.1 release. And now that the performance of data stored in SDE (where just about all of our data are stored) is supposed to improve even more dramatically with this release, I'm looking forward to seeing the difference.

And on a slightly different - but just as important - note, if you check out the MapDotNet Server website, you'll notice that a lot has changed over the last couple of months. The Interactive SDK now has ten examples that you can both preview and download. In addition, the Wiki has some new content (of particular interest are the Virtual Earth extended template and the Performance Tuning entries). Kudos to the MapDotNet Server team for these improvements. There are still a lot of improvements to be made, but the team seems to be moving quickly in the right direction.

Tags:

ASP.NET | MapDotNet Server | Microsoft Virtual Earth | SQL Server
