Introduction to the world of data management
Shannon Hicks, my coworker at the Stroud Water Research Center, has written several posts about the cool data loggers and sensor packages she has developed here. Unfortunately, just designing a new instrument and deploying it in the watershed doesn’t answer questions about watershed science. To get the answers we’re looking for, we need to be able to combine, compare, and analyze both current and historical data from all of the different types of sensors and loggers all over the watershed.
Before coming to the Stroud Center, I would have thought that was a simple task: pull the data into Excel files, make a few graphs, and start doing the science. That’s what I did with the data for my undergraduate thesis. And if only one or two people are interested in the data, and that data comes from only a few loggers over a short time period, the file/Excel system kind of works.
But once you have more loggers, files, sites, and people involved, this system falls apart very, very fast.
What we need is a system that brings the data from our individual loggers (commercial and home-made) into the lab, combines them in a single place, provides the means to search, visualize, and do quality control on the results, and makes the data publicly available to anyone who is interested. I would like to say that we’ve solved all of these problems, but then I’d be flat-out lying.
What I can say, though, is that we’re taking steps toward our dream system. In my next few posts I’ll give more details on the pieces we’re working on, how far they’ve come, and what we still have to do.