This final post in the three-part series discusses some of the data management issues that arise once you’ve decided to move to a third-party data storage company. Do you need specialized services tailored to the type of work you do? Are these companies just file stores that act like hard drives, except online? These questions need to be answered before you decide whom to trust with hosting your data.

So you’ve made the decision to move to a third-party data host and let them retain your data on their servers. Now you have to decide which processes and services you want to keep doing yourself and which you want to hand over to your host.

At the basic level, most of these companies store your data and not much more. They provide a repository for your data, give you access to it, make sure it’s safe and secure, and leave the rest to you. In many ways, it’s not much different from a hard drive on your computer. You move things in and out as you please, and anything you do with that data is entirely your job. If all you need is straight file storage, there’s a world of options out there. With these companies, you can rent a block of storage or pay based on the size of your data and how much you use it. Mozy, to use one example, will allow you to automatically back up files and later retrieve them. The application creates a nearly seamless interface where, after the initial setup, you don’t have to keep managing things. Another option, Dropbox, creates a folder on your computer that looks just like any other folder you might have, except that its contents are uploaded to Dropbox’s servers and stored online.

If you want to go one step further, some of these companies provide specialized services on top of storing your data. If you need a database system, such as MySQL, Microsoft SQL Server, or Oracle, there are companies out there that will help you configure and maintain a complex system. You can hire database administrators to keep the databases up and running in optimal condition, system administrators to keep the servers running at peak performance, and analysts to help you generate useful information from large stores of data.

In the life science realm, you can cut out some of the effort on your part by linking professional services you use directly to your online data stores. For example, if you work with a third-party DNA sequencing company, it may be possible to have the results of that sequencing process uploaded directly from the sequencer to your online storage account. This way, you don’t have to wait to receive your data and then spend time uploading it yourself. Some companies even combine these services, performing bioinformatics and/or analysis work alongside online storage so you can simply view the results online without ever having to handle the raw data directly. Often these companies have a web interface in which you can view genomes, test results, or whatever data you were looking for. Some also offer an open API (Application Programming Interface), an interface through which software developers and scientific programmers can interact directly with the data via code. This can allow you to use your own software applications to do additional analysis, or anything else you might do if you had the data on a hard drive in your office or lab.
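To make the idea of API access a little more concrete, here is a minimal Python sketch of what pulling results from a hosted account might look like. Everything specific in it is an assumption for illustration: the base URL, the `/samples/<id>/results` path, the bearer-token header, and the JSON response are hypothetical, and a real provider's documentation will define its own equivalents.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical base URL; substitute the address your provider documents.
BASE_URL = "https://api.example.com/v1"


def build_results_request(sample_id, api_token):
    """Build (but do not send) a GET request for one sample's results.

    The path layout and the bearer-token header are assumptions,
    not any particular provider's API.
    """
    # Escape the sample ID so unusual characters cannot alter the URL path.
    safe_id = urllib.parse.quote(sample_id, safe="")
    url = f"{BASE_URL}/samples/{safe_id}/results"
    headers = {
        "Authorization": f"Bearer {api_token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)


def fetch_results(sample_id, api_token):
    """Send the request and decode the response, assuming it is JSON."""
    request = build_results_request(sample_id, api_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Once the results are in your own program as ordinary Python data, you can feed them to whatever analysis tools you already use, much as if the files were sitting on a local drive.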

The greatest drawback is that most of these services come with a large price tag attached. You pay for each service you tack on while also paying for all the data you upload and/or download. However, this additional expense needs to be weighed against the cost of doing everything these third-party companies do for you yourself. That might involve hiring an IT staff dedicated to running a small data center operation, experts in manipulating and organizing the data, trained staff to do analysis or other specialized work, and management to make sure it all runs smoothly. Additionally, if you’re storing large amounts of data, online storage typically becomes more cost efficient as your operation grows. Online storage companies typically offer discounts for bulk storage, and your costs can be spread over all the data sets you upload. At that scale, the price per terabyte may compare favorably with what a portable hard drive of the same size would cost, though this depends on the provider and on how long you keep the data.
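If you want to put rough numbers on this comparison, a small calculation like the following Python sketch can help. The cost categories are taken from the paragraph above, but the way they are combined is a simplifying assumption (flat monthly fees, hardware written off evenly over its lifetime), and all class and field names are made up for illustration. You would plug in quotes from your own provider and your own staffing estimates.

```python
from dataclasses import dataclass


@dataclass
class HostedPlan:
    price_per_tb_month: float   # storage fee per terabyte per month
    transfer_per_tb: float      # fee per terabyte uploaded or downloaded
    service_fees_month: float   # flat monthly total for add-on services


@dataclass
class InHouseSetup:
    hardware_cost: float        # up-front purchase of drives and servers
    hardware_life_months: int   # months over which the hardware is written off
    staff_cost_month: float     # IT, analysis, and management staff per month
    overhead_month: float       # power, space, and software per month


def hosted_monthly_cost(plan, stored_tb, transferred_tb):
    """Monthly cost of a hosted plan for a given storage and transfer volume."""
    return (plan.price_per_tb_month * stored_tb
            + plan.transfer_per_tb * transferred_tb
            + plan.service_fees_month)


def in_house_monthly_cost(setup):
    """Monthly cost of running storage yourself, hardware spread evenly."""
    return (setup.hardware_cost / setup.hardware_life_months
            + setup.staff_cost_month
            + setup.overhead_month)
```

Running both functions with your own figures at a few different data volumes shows where, if anywhere, one option overtakes the other for your operation.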

Managed services like these, in which you hand over a great deal of control over your data, can really streamline operations. You don’t have to deal with hard drives, software, and a potentially sizable staff, and can instead leave the data storage and management to experts who already have the resources in place to make your data as easily accessible and secure as possible.