Buoy System Handbook | ||
---|---|---|
<<< Previous | Processing Overview | Next >>> |
There are currently four data sources for the buoy system:
Cell Phone. Data telemetered via cell phone constitutes the backbone of the buoy data ingest.
Data is transmitted on an hourly basis back to the PhoG group Windows PC named helmholtz, or to its backup, Holmboe. Buoys M01 and N01 telemeter data using an Iridium phone. L01, J02, I01, F01, and E01 use regular cell calls. A01, B01, C02, and D01 use internet connections.
MATLAB runs within a cron job on the Linux workstation ekman, reading buoy-specific files from helmholtz.
Data is split out into constituent data streams, which may include met, doppler, and others. A list of the datastreams and a description of the structure of each can be found elsewhere.
As the Campbell data is read in, it is also archived so that it may be manually examined later.
Each data stream is processed and archived to NetCDF files describing the time series data. The raw data is archived separately from the processed data, and this particular data stream is known as sensor-raw and realtime. The most recent observation for most parameters is used to update a MySQL database table that is accessed by the web server. These entries are overwritten upon each new observation.
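The "most recent observation" table described above can be sketched as a one-row-per-parameter upsert. This is a minimal illustration, not the production code: the real system uses MySQL, while this sketch uses Python's built-in sqlite3 so that it runs without a server, and the table name latest_obs and its columns are hypothetical.

```python
import sqlite3


def open_latest_table(path=":memory:"):
    """Open a database holding one row per (buoy, parameter).

    The schema is hypothetical; the handbook does not document the real table.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS latest_obs ("
        " buoy TEXT NOT NULL,"
        " parameter TEXT NOT NULL,"
        " obs_time TEXT NOT NULL,"
        " value REAL,"
        " PRIMARY KEY (buoy, parameter))"
    )
    return conn


def update_latest(conn, buoy, parameter, obs_time, value):
    """Overwrite the stored observation for this buoy and parameter."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO latest_obs"
            " (buoy, parameter, obs_time, value) VALUES (?, ?, ?, ?)",
            (buoy, parameter, obs_time, value),
        )


def read_latest(conn, buoy):
    """Return (parameter, obs_time, value) rows for one buoy."""
    cursor = conn.execute(
        "SELECT parameter, obs_time, value FROM latest_obs"
        " WHERE buoy = ? ORDER BY parameter",
        (buoy,),
    )
    return cursor.fetchall()
```

Because the primary key is (buoy, parameter), a second call for the same pair replaces the earlier row rather than appending to it, which matches the overwrite behavior described above.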
GOES. As a backup to the cell phone system, most of the buoys are now outfitted with GOES transmitters. This constitutes an equivalent datastream to that provided by the cell phones. It is a smaller datastream due to bandwidth limitations, but the most important geophysical parameters are usually transmitted. For more information on the GOES satellite system, see [ Nestlebush ]. The script that initiates the GOES processing is currently ${BUOY_ROOT}/bin/process_goes.sh. The basic processing scenario is as follows:
Every hour at a preset time, each buoy transmits to the GOES satellite system. There is a schedule that is followed, where the buoys usually transmit in windows of 10 to 30 seconds. As soon as this transmission window has passed, it is safe to initiate a download from UMaine.
The download from the Wallops Data Center is accomplished via a Perl script. The Expect Perl module is employed to automate this process. Each buoy has its own unique id, referred to as a platform_id. The data is retrieved in a binary format. The Perl script which accomplishes this task is currently ${BUOY_ROOT}/bin/retrieve_goes_buffer.pl.
A MATLAB session launches, decodes the binary input, and classifies the data into the various buffer streams. After this point, the processing is indistinguishable from the cell phone processing. However, the data retrieved via GOES is archived separately from the cell phone data so as to correctly distinguish its origins. These archives are referred to as the goes-sensor-raw and goes-realtime datastreams.
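The timing rule in the GOES scenario above (wait until a buoy's hourly transmission window has passed before downloading) can be sketched as follows. This is only an illustration under stated assumptions: the handbook does not give the actual schedule, so the offset and window length are hypothetical inputs, and the optional safety margin is an addition of this sketch.

```python
from datetime import datetime, timedelta, timezone


def window_end(now, offset_s, window_s):
    """End of the most recent hourly window that began at or before `now`.

    offset_s: seconds past the top of the hour at which the buoy transmits
              (hypothetical; assumed to lie in the range 0..3599).
    window_s: window length in seconds (the handbook cites 10 to 30).
    """
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    start = top_of_hour + timedelta(seconds=offset_s)
    if start > now:
        # This hour's window has not opened yet; use the previous hour's.
        start -= timedelta(hours=1)
    return start + timedelta(seconds=window_s)


def safe_to_download(now, offset_s, window_s, margin_s=0):
    """True once the latest window (plus an optional margin) has closed."""
    return now >= window_end(now, offset_s, window_s) + timedelta(seconds=margin_s)
```

For a buoy assumed to transmit 20 minutes past the hour in a 30 second window, safe_to_download is false from hh:20:00 until hh:20:30 and true otherwise.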
NDBC. Data from the last 45 days for NOAA buoys, C-MAN stations, and drifters in the Gulf of Maine are retrieved via HTTP from NDBC.
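As an illustration of the HTTP retrieval, the sketch below fetches and parses one station's text file. It assumes the files in question are NDBC's public "realtime2" standard meteorological files, which the handbook does not confirm; the function names are hypothetical, and the parser relies only on the conventions that header lines start with "#" and that missing values are written as "MM".

```python
from urllib.request import urlopen

# Assumed product: NDBC's per-station "realtime2" text files.
NDBC_REALTIME_URL = "https://www.ndbc.noaa.gov/data/realtime2/{station}.txt"


def realtime_url(station_id):
    """Build the URL for one station, e.g. '44005' or 'MDRM1'."""
    return NDBC_REALTIME_URL.format(station=station_id.upper())


def fetch_realtime(station_id, timeout=30):
    """Download the station file and return it as text."""
    with urlopen(realtime_url(station_id), timeout=timeout) as response:
        return response.read().decode("ascii", errors="replace")


def parse_realtime(text):
    """Parse whitespace-delimited records into a list of dicts.

    The first '#' line supplies column names; later '#' lines (units)
    are skipped. Fields equal to 'MM' become None, the rest floats.
    """
    header = None
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("#"):
            if header is None:
                header = line.lstrip("#").split()
            continue
        if header is None:
            raise ValueError("data line encountered before any header line")
        record = {}
        for name, raw in zip(header, line.split()):
            record[name] = None if raw == "MM" else float(raw)
        records.append(record)
    return records
```

Each parsed record could then be packaged to match the input expected by the core processing routines described below.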
There is a core of processing routines that all of these sources feed into. For example, all Aanderaa data, regardless of source, feeds into a routine called aanderaa_processing_stream.m. The other instrument data streams have similarly named core processing routines. Therefore, in order to add Aanderaa data from a hypothetical new data stream, all one would have to do is package up the input data to match that which is expected by aanderaa_processing_stream.m.
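The "package the input to match the core routine" idea can be sketched as a small registry of per-instrument processors plus one adapter per source. The actual routines are MATLAB files, so the Python function below is only a stand-in for aanderaa_processing_stream.m; the StreamPacket structure, its fields, and the adapter are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class StreamPacket:
    """Hypothetical common input format shared by all sources."""
    buoy: str
    source: str                      # e.g. "cell", "goes", or a new source
    times: List[str]
    values: Dict[str, List[float]]   # parameter name -> time series


# Registry mapping an instrument name to its core processing routine.
PROCESSORS: Dict[str, Callable[[StreamPacket], dict]] = {}


def register(instrument):
    """Decorator that records a core routine under an instrument name."""
    def decorator(func):
        PROCESSORS[instrument] = func
        return func
    return decorator


@register("aanderaa")
def aanderaa_processing_stream(packet):
    """Stand-in for the MATLAB routine; it only summarizes its input."""
    return {"buoy": packet.buoy, "source": packet.source,
            "n_obs": len(packet.times)}


def process(instrument, packet):
    """Dispatch a packet to the core routine for its instrument."""
    if instrument not in PROCESSORS:
        raise ValueError(f"no core routine registered for {instrument!r}")
    return PROCESSORS[instrument](packet)


def package_new_source(buoy, rows: List[Tuple[str, float]]):
    """Adapter for a hypothetical new source of (time, temperature) rows."""
    return StreamPacket(
        buoy=buoy,
        source="new-source",
        times=[t for t, _ in rows],
        values={"temperature": [v for _, v in rows]},
    )
```

Adding a source then means writing one adapter like package_new_source; the core routine itself is left untouched.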
Once MATLAB processing has finished, rsync is invoked to mirror the data repository onto the machine gyre.umeoce.maine.edu.
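A mirroring step of this kind might be driven as in the sketch below. The host name comes from the text above, but the directory paths in the usage example, the choice of the -a and -z options, and the optional --delete flag are assumptions; the handbook does not give the actual rsync invocation.

```python
import subprocess


def build_rsync_command(local_root, remote_host, remote_root, delete=False):
    """Build an rsync argument list that copies local_root to the remote host.

    Trailing slashes make rsync copy the directory contents rather than
    nesting the directory inside the destination.
    """
    source = local_root.rstrip("/") + "/"
    destination = f"{remote_host}:{remote_root.rstrip('/')}/"
    command = ["rsync", "-az"]
    if delete:
        # Also remove remote files that no longer exist locally.
        command.append("--delete")
    command += [source, destination]
    return command


def mirror(local_root, remote_host, remote_root, delete=False):
    """Run rsync and raise CalledProcessError if it exits non-zero."""
    command = build_rsync_command(local_root, remote_host, remote_root, delete)
    subprocess.run(command, check=True)
```

For example, mirror("/data/buoy", "gyre.umeoce.maine.edu", "/data/buoy") would push a hypothetical /data/buoy tree to the same path on gyre.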
Quality control is guided by the valid_range attribute that is present for each measured parameter in its NetCDF file. For example, temperature on the Aanderaa has a valid range of [-0.5, 30] °C. Each datum is checked against its valid range by a single MATLAB routine. Values falling outside this range have their quality flags modified accordingly. Each day, a separate process examines how many times a parameter exceeded its valid range and reports it. This information can be used to determine whether or not a valid range needs to be changed. The valid range attribute is a part of the standard set of metadata for NetCDF files and need not be the same across all NetCDF files. For example, it may be determined that a different valid range exists for temperatures at 50 meters depth as opposed to temperatures at 2 meters depth.
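The range check and the daily exceedance report can be sketched as follows. The real check is a MATLAB routine reading valid_range from each NetCDF file; here the range is passed in directly, the numeric flag values are hypothetical, and the bounds are assumed to be inclusive.

```python
GOOD = 0          # hypothetical flag value: within valid_range
OUT_OF_RANGE = 1  # hypothetical flag value: outside valid_range


def range_check(values, valid_range):
    """Return one quality flag per value, with inclusive bounds.

    NaN compares false against both bounds, so it is flagged OUT_OF_RANGE.
    """
    low, high = valid_range
    return [GOOD if low <= v <= high else OUT_OF_RANGE for v in values]


def count_exceedances(flags_by_parameter):
    """Summarize how many values per parameter fell outside their range."""
    return {
        name: sum(1 for flag in flags if flag == OUT_OF_RANGE)
        for name, flags in flags_by_parameter.items()
    }
```

With the Aanderaa temperature range quoted above, range_check([-1.0, -0.5, 12.3, 30.0, 30.1], (-0.5, 30.0)) flags only the first and last values. A parameter whose daily count is persistently high is a candidate for a revised valid range.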
Parameters can also be marked as completely invalid, with no need for any range checking. This can be important if sensor values are fluctuating wildly. Some of those fluctuating values might well fall within the stated valid range, but they are nonetheless questionable. Marking such a parameter as invalid (done with the is_dead flag) deals with this situation easily.
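The precedence of is_dead over range checking can be illustrated with a short sketch. The flag values and the function signature are hypothetical; only the is_dead name comes from the text above.

```python
GOOD = 0          # hypothetical flag values
OUT_OF_RANGE = 1
DEAD = 2


def qc_flags(values, valid_range, is_dead=False):
    """Flag every value DEAD if the parameter is dead; else range-check."""
    if is_dead:
        # No range check at all: in-range values are just as suspect.
        return [DEAD] * len(values)
    low, high = valid_range
    return [GOOD if low <= v <= high else OUT_OF_RANGE for v in values]
```

A wildly fluctuating sensor that happens to report an in-range 10.0 is still flagged DEAD when is_dead is set, whereas the plain range check would have passed it as GOOD.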
Additional quality control checks are performed in realtime on many of the data variables, including winds, wave heights, air temperature, and barometric_pressure. See the specific process_realtime scripts for further details.