Shell Script for Directing Normal Cell Phone Data Ingest

The script takes the mooring id as its first argument and, optionally, a project name as its second (the default is gomoos). It starts by setting up the environment variables that the processing needs. It then creates a PID file, which also serves as a log file, so the current state of the processing can be determined by examining this file. For example, to follow the state of A0104, type

  tail -f /data/gomoos/buoy/run/A0104

Most of the processing is done in the shell function do_it. This makes the logging easier, because all of the function's output can be redirected to the PID file at once. After the processing is done, the PID file is appended to a master log and then deleted.
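The PID-file-as-log pattern described above can be sketched in isolation. This is a minimal illustration, not the production script: the temporary directory, the do_work stand-in, and the run_logged wrapper are hypothetical names chosen for the example.

```shell
#!/bin/bash
# Sketch of the PID-file-as-log pattern. All paths are throwaway temp
# locations (assumptions), not the real /data/gomoos/buoy tree.
work_dir=$(mktemp -d)
pidfile="${work_dir}/run_A0104"      # stands in for .../run/A0104
master_log="${work_dir}/buoy.log"    # stands in for the master log

# Hypothetical stand-in for the real processing function.
do_work() {
	echo "- Starting A0104 $$ at $(date)"
	echo "processing..."
	echo "- Finishing A0104 $$ at $(date)"
}

run_logged() {
	# An existing PID file means another run may be in progress.
	if [ -e "${pidfile}" ]; then
		echo "Pid file ${pidfile} exists, skipping."
		return 1
	fi
	touch "${pidfile}"
	# Everything the function prints goes to the PID file, so
	# "tail -f" on that file shows live progress.
	do_work >> "${pidfile}" 2>&1
	# Preserve the per-run log, then remove the PID file.
	cat "${pidfile}" >> "${master_log}"
	rm -f "${pidfile}"
}

run_logged
```

While run_logged is executing, the PID file both blocks a second run and holds the log for the current one. Once it returns, only the master log remains.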

The actual processing consists of

  1. Check the length of the input campbell file. If it is zero, then no processing is necessary. Otherwise...

  2. Start up MATLAB via the shell script process_cellphone.sh. This will read in the data file, run all the algorithms, and archive the data. The input file is written out in its entirety to a file underneath the primary data directory (where the NetCDF files reside) for each buoy. For example, if we are processing buoy A0104, then the primary data directory is /data/gomoos/buoy/archive/A0104, and the campbell data is archived to /data/gomoos/buoy/archive/A0104/archive/A0104.dat.

  3. Create a rolling data file from the campbell archive. This is around four days of data. Its primary purpose is to be a small and easily managed data source of recent data. Bigelow Labs uses it to retrieve recent observations without downloading the entire archive.

  4. The split.sh shell script splits out the campbell archive file by each buffer. This makes it easier for Robert Stessel to examine each buffer.

  5. Finally, the web pages for this particular buoy are refreshed.
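Steps 1 and 3 can be sketched as follows. The source does not show how rolling_archive.sh selects "around four days" of data, so this sketch assumes a fixed number of trailing lines as a stand-in. The function names has_data and make_rolling and the sample records are hypothetical.

```shell
#!/bin/bash
# has_data: succeeds only if the file exists and is non-empty (step 1).
has_data() {
	[ -s "$1" ]
}

# make_rolling: copy the last N lines of an archive into a small rolling
# file (step 3). Using a line count as a proxy for a time window is an
# assumption made for this sketch.
make_rolling() {
	local archive=$1 rolling=$2 max_lines=$3
	tail -n "${max_lines}" "${archive}" > "${rolling}"
}

# Throwaway demo data in a temp directory.
demo_dir=$(mktemp -d)
printf 'rec1\nrec2\nrec3\nrec4\nrec5\n' > "${demo_dir}/A0104.dat"
: > "${demo_dir}/empty.dat"

if has_data "${demo_dir}/A0104.dat"; then
	make_rolling "${demo_dir}/A0104.dat" "${demo_dir}/A0104.rolling.dat" 2
fi
has_data "${demo_dir}/empty.dat" || echo "empty.dat has no data, nothing to do"
cat "${demo_dir}/A0104.rolling.dat"
```

With the sample data, the rolling file holds only the last two records, which is the property that makes it cheap to download compared with the full archive.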

#!/bin/bash

#
# This function is completely logged.
function do_it {

	# Parent PID of this script, recorded in the log.
	ppid=$(ps -o ppid= -p $$ | tr -d ' ')
	echo ----------------------------------------------------------
	echo ----------------------------------------------------------
	echo - Starting ${current_buoy} $$ at `date`
	echo - PID is $$, PPID=${ppid}
	echo - `hostname -a`
	echo


	#
	# Check to see if the input buffer for this buoy has anything in it.  If
	# so, initiate cell phone processing.  Otherwise there is nothing to do.
	echo checking the incoming directory ${incoming_directory}... 
	input_buffer=${incoming_directory}/${current_buoy}.dat

	if [ ! -s "${input_buffer}" ]; then
		echo File ${input_buffer} is missing or empty
	else
		echo File ${input_buffer} has data in it.  GO FOR IT!!!

		echo Apparently this needs to be copied to the last_transmission directory
		cp ${input_buffer} ${incoming_directory}/last_transmission
		scp ${input_buffer} @gyre:${incoming_directory}/last_transmission
	
		# Log the MATLAB command exactly as it is run.
		matlab_cmd="process_cellphone_wrapper('${current_buoy}')"
		echo "nice time matlab -nojvm -nosplash -r ${matlab_cmd}"
		nice time matlab -nojvm -nosplash -r "${matlab_cmd}"

		rolling_archive.sh ${current_buoy}
		echo finished with rolling_archive.sh
		split.sh ${current_buoy}
		sh cache_updated_webpages.sh ${current_buoy} 
		sh ${RUNTIME_ROOT}/bin/cache_table_js_pages.sh ${current_buoy} 
		echo finished with cache_updated_webpages 

		echo starting rsync transfer at `date`
		export RSYNC_PASSWORD=gomoos_rsync_transfer
		rsync --verbose --progress --stats --compress \
		      --recursive --owner --times --perms --links \
		      --exclude "*~" \
		      /data/${BUOY_PROJECT}/buoy/archive/${current_buoy} gomoos@gyre::${RSYNC_TARGET}
		echo apparently finished with rsync transfer at `date`
	fi




	echo - Finishing ${current_buoy} $$ at `date`
	echo - PID is $$ 
	echo ----------------------------------------------------------
	echo ----------------------------------------------------------
	echo
}



#
# How many input arguments?  If two, then
# the 2nd argument is the "project".  Set this to "blue_hill_bay", for example.
# If not supplied, "gomoos" is the default.
echo "Script invoked as $0: $*"
if [ $# -eq 2 ]; then
	project=${2}
else
	project=gomoos
fi

RSYNC_TARGET=${project}_netcdf_archive
export RSYNC_TARGET

export DATA_ROOT=/data/${project}/buoy
export RUNTIME_ROOT=/data/gomoos/buoy
export BUOY_PROCESSING_CALLING_ENVIRONMENT="shell script"
export BUOY_PROJECT=${project}
incoming_directory=$DATA_ROOT/incoming



#
# Set up the rest of the environment.
. ${RUNTIME_ROOT}/bin/setup_gomoos_environment.sh

#
# Master log file
master_log_file=${RUNTIME_ROOT}/log/buoy.log


#
# The first argument is the buoy id.  It is required.
if [ -z "$1" ]; then
	echo "Usage: $0 buoy_id [project]"
	exit 1
fi
current_buoy=$1


#
# check the pid file.  If it exists, then don't run
pidfile=${RUNTIME_ROOT}/run/${current_buoy}
if [ -e "${pidfile}" ]
then
	#
	# pid file exists. must already be running?
#	echo "Pid file ${pidfile} exists, too much going on.  Remove this file if in error." | mail jevans
	echo "Pid file ${pidfile} exists, ${current_buoy} may already be running."
	exit 1
fi

touch ${pidfile}


do_it >> ${pidfile}


#
# Save the pidfile to the master log
cat ${pidfile} >> ${master_log_file}





#
# removing pid file
echo removing pid file ${pidfile} 
rm -f ${pidfile}

echo all done, bye
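
The existence check and the touch in the script are separate steps, so two invocations started at nearly the same time could both pass the check. A minimal sketch of one common alternative, assuming bash's noclobber option, is shown here. The function name acquire_lock and the temporary path are hypothetical, and this is not how the script currently works.

```shell
#!/bin/bash
# acquire_lock: create the lock file atomically, failing if it already exists.
acquire_lock() {
	local lockfile=$1
	# With noclobber set, '>' refuses to overwrite an existing file, so the
	# existence test and the creation happen in a single step. The subshell
	# keeps the option from leaking into the caller.
	( set -o noclobber; echo "$$" > "${lockfile}" ) 2> /dev/null
}

# Throwaway lock location (assumption), not the real run directory.
lock_dir=$(mktemp -d)
lockfile="${lock_dir}/A0104"

if acquire_lock "${lockfile}"; then
	echo "lock acquired"
fi
if ! acquire_lock "${lockfile}"; then
	echo "already locked"
fi
rm -f "${lockfile}"
```

Because the lock file is created by a redirection rather than by touch, it could still be used as the log target afterwards by appending to it with >>.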