Set up

Go through this document one chunk at a time to insert new soil moisture measurements data into the database.

First, load the necessary libraries

library(RPostgreSQL)
library(readr)
library(dplyr)
library(tidyr)
library(uuid)
library(magrittr)
library(kableExtra)
library(DT)

Then connect to the database

password <- scan(".pgpass", what="")

db <- dbConnect(PostgreSQL(), 
                host = "sesync-postgis01.research.sesync.org",
                dbname = "choptank", 
                user = "palmergroup",
                password = password)

Load up all of the functions defined in scripts in the R/ folder at once.

pathnames <- list.files(pattern="[.]R$", path="R", full.names=TRUE)
sapply(pathnames, FUN=source)

Upload and prepare data

Upload the spreadsheet with your new soil moisture data. Either use a file browser:

new_measurements <- file.choose()

Or supply the file path:

new_measurements <- read_csv("../odm2-samples/data/Field_measurements - SOIL_field_measurements.csv")

The new_measurements data frame should look something like this:

new_measurements %>% head() %>% kable(format = "markdown")

Date	Time	Site	Location	VWC_percent	soil_EC_mscm	soil_temp_C
November 28, 2017	08:52:00	QB	SC B	48.2	0.28	8.3
November 28, 2017	08:54:00	QB	SC C	42.0	0.27	8.1
November 28, 2017	08:57:00	QB	SC D	28.3	0.19	8.0
November 28, 2017	09:01:00	QB	SC E	12.8	0.16	7.8
November 28, 2017	10:14:00	DK	SC A	68.3	0.22	10.0
November 28, 2017	10:20:00	DK	SC B	57.9	0.23	10.1

Data column names

Make sure the column names for data match the template. This chunk should return TRUE.

all(c("VWC_percent", "soil_EC_mscm", "soil_temp_C") %in% names(new_measurements))

## [1] TRUE

Site codes

Make a new column with site code names called samplingfeaturecode if it isn’t there already. If you have the wetland site code and the transect point in 2 separate columns, you can combine them using paste.

new_measurements$samplingfeaturecode <- paste(new_measurements$Site, new_measurements$Location)

Make sure all the sites in the new data are in the database by reading in the samplingfeatures table and checking all the site code names in the new data. The function check_samplingfeaturecodes (defined in helpers.R) returns the names of any sites not in the database compared to a given vector.

check_samplingfeaturecodes(new_codes = new_measurements$samplingfeaturecode) %>% head()

##   new_codes in_db
## 1   BB SC D FALSE
## 2   BB SC E FALSE
## 3   BB SC E FALSE
## 4   BB SC E FALSE
## 5   BB SC E FALSE
## 6   BB SC E FALSE

If you need help figuring out the exact site names, look for site names in the database with a given pattern using the function get_site_names_like()

get_site_names_like("QB SC")

##  [1] "DB SC-E" "DB SC-D" "DB SC-C" "DB SC-B" "DB SC-A" "TB SC-A" "TB SC-B"
##  [8] "TB SC-C" "TB SC-D" "TB SC-E" "BB SC-E" "BB SC-D" "BB SC-B" "BB SC-A"
## [15] "QB SC-E" "QB SC-D" "QB SC-C" "QB SC-B" "QB SC-A"

Fix site code names if needed. For example, soil chamber sites might need hyphens added.

new_measurements$samplingfeaturecode <- gsub(pattern = "SC ", "SC-", new_measurements$samplingfeaturecode)

Check site names again if needed. If there are no rows returned then all the site names match sampling feature codes in the database.

check_samplingfeaturecodes(new_measurements$samplingfeaturecode)

## [1] new_codes in_db    
## <0 rows> (or 0-length row.names)

Method code

There needs to be a defined method in the methods table about how samples were collected. Make sure that sample collection method exists by checking the methods table online or reading it directly from the database using the dbReadTable() function. Note the methodcode or make a new one with the db_add_method() function.

dbReadTable(db, c("odm2", "methods")) %>% kable()

methodid	methodtypecv	methodcode	methodname	methoddescription	methodlink	organizationid
1	Specimen collection	waterisotopefield	water isotope sample collection	Collect water sample without headspace for water isotopes	NA	NA
3	Specimen analysis	waterisotopelab	Picarro L1102-i	D-18O and D-D ratios measured using a cavity ringdown spectroscopic liquid water laser isotope analyzer (Picarro L1102-i) in the Virginia Tech Water Quality Lab housed within the Biological Systems Engineering Department.	NA	NA
5	Instrument deployment	soilmoistureTDR	Soil moisture measurement	Point measurement of soil moisture using TDR probe with 1.5 inch probes	NA	NA

# db_add_method(methodname = "Soil moisture measurement",
#               methodcode = "soilmoistureTDR",
#               methodtypecv = "Instrument deployment",
#               methoddescription = "Point measurement of soil moisture using TDR probe with 1.5 inch probes")

Save the methodcode as a variable

methodcode = "soilmoistureTDR"

Date time formatting

Actions need a correctly formatted date and time. Format the date and time. If no time is given, assume midnight (to know that it isn’t a specific time). Also define the UTC offset variable (assuming it is the same as your computer’s timezone). Use ?strptime to find the correct character string for the input data or refer to strtime.org

new_measurements$datetime <- strptime(
  paste(new_measurements$Date, new_measurements$Time),
  format = "%B %d, %Y %H:%M:%S", tz = "")

Define UTC offset as a variable. This will be either -4 or -5 depending on whether its daylight savings time or not.

utcoffset <- format(Sys.time(), "%z") %>% substr(1, 3) %>% as.integer()

Annotations

Coming soon!

Add an annotation to an action or result

# dbReadTable(db, c("odm2", "cv_annotationtype"))
# db_add_annotation(annotationtypecv = "Action annotation",
#                   annotationtext = "measured less than 24 hours after rain")

More metadata

Specify some necessary metadata parameters. You will likely not need to change these. Kelly should probably add these as default function arguments so its not cluttering up this document as much.

actiontypecv = "Instrument deployment"
resulttypecv = "Measurement"
censorcodecv = "Not censored"
qualitycodecv = "Unknown"
aggregationstatisticcv = "Sporadic"
timeaggregationinterval = 1 # assume 1 minute
timeaggregationintervalunitsid = dbGetQuery(db, "SELECT unitsid FROM odm2.units WHERE unitsname = 'Minute'") # minute
processlinglevelid = 1 # definition = "Raw data" # processinglevelcode = 0
valuecount = 1
sampledmediumcv = "Soil"

Insert measurements data

This is where the magic happens! The function db_insert_measurements() writes and runs SQL code to upload data from one row of the new_measurements data frame at a time. So db_insert_measurements(1) would insert data from the first row. Use an apply statement to upload all of the data at once. The eval=FALSE argument means that this code will not run if you knit the whole document.

# db_insert_measurements_sm(1) # for just one measurement

sapply(1:nrow(new_measurements),
       function(x) db_insert_measurements_sm(x))

Query database

Use this chunk to make a data frame called sm_db with all the soil data in the database

get_sm_data <- function(){
  
  sql <- paste0("SELECT mrv.datavalue, mrv.valuedatetime, sf.samplingfeaturecode, r.featureactionid, v.variablecode, u.unitsname
 FROM odm2.measurementresultvalues mrv, odm2.results r, odm2.variables v, odm2.units u, odm2.samplingfeatures sf, odm2.featureactions fa
 WHERE r.variableid = v.variableid 
 AND r.featureactionid = fa.featureactionid
 AND fa.samplingfeatureid = sf.samplingfeatureid
 AND r.unitsid = u.unitsid
 AND mrv.resultid = r.resultid 
 AND r.sampledmediumcv = 'Soil'")

  sql <- gsub("\n", "", sql)
  dbGetQuery(db, sql)
}

sm_db <- get_sm_data()

organize back to the original format using the spread function in the tidyr package

sm_db_spread <- sm_db %>% group_by(samplingfeaturecode) %>%
  dplyr::select(-unitsname) %>%
  spread(variablecode, datavalue)

sm_db_spread %>% datatable()

Soil moisture measurements