I've been playing around with "R":http://www.r-project.org to correlate GPS and heart rate monitor logs from our cycling trip in Italy.
One of the problems I've run into is that GPS position accuracy degrades significantly when fewer than 4 satellites are in view, and altitude accuracy likewise degrades significantly when fewer than 6 satellites are in view.
These high-error data points add noise to the data set and complicate analysis. One option would be to calculate the position error for each point and carry that error estimate along with the position data.
That's hard. An easier way is to simply convert the inaccurate points to NULL (NA in R).
To import the data from my heart rate monitor, I used "s710":http://daveb.net/s710. To import the data from the GPS NMEA logs, I created a "gpsbabel":http://www.gpsbabel.org filter:
DESCRIPTION GRASS GIS v.in.ascii
SHORTLEN 8
EXTENSION grass
# FILE LAYOUT DEFINITIONS:
FIELD_DELIMITER PIPE
RECORD_DELIMITER NEWLINE
BADCHARS ,"
IFIELD LON_DECIMAL,"","%f"
IFIELD LAT_DECIMAL,"","%f"
IFIELD ALT_METERS,"","%f"
IFIELD GMT_TIME,"","%m/%d/%Y %H:%M:%S"
IFIELD PATH_SPEED,"","%f"
IFIELD PATH_COURSE,"","%f"
IFIELD GPS_HDOP,"","%f"
IFIELD GPS_VDOP,"","%f"
IFIELD GPS_PDOP,"","%f"
IFIELD GPS_SAT,"","%d"
IFIELD GPS_FIX,"","%s"
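Invoking the conversion looks something like this; the input file name here is just an example, and the style file is whatever you saved the definition above as:
gpsbabel -t -i nmea -f 060926_ride.nmea -o xcsv,style=grass.style -F 060926_bike_colle_valdelsa_siena.grass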
I had originally created this format for easy import into "GRASS GIS":http://grass.itc.it, but it works nicely for importing into "R":http://www.r-project.org as well:
Long|Lat|Altitude|timestamp|speed|bearing|hdop|vdop|pdop|satellites|fix
11.128023|43.430142|143.300000|09/26/2006 07:42:34|0.092600|11.690000|1.600000|0.000000|0.000000|6|3d
11.128025|43.430142|140.400000|09/26/2006 07:42:35|0.174911|177.690002|1.600000|0.000000|0.000000|6|3d
...
Read the GPS and heart rate data into R data frames with *read.table*.
raw_gps<-read.table("060926_bike_colle_valdelsa_siena.grass", sep="|", header=TRUE)
raw_hr1<-read.table("060926.0947.txt", header=TRUE)
raw_hr2<-read.table("060926.1447.txt", header=TRUE)
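A quick sanity check that everything imported, using the column names from the header line above:
table(raw_gps$satellites)
summary(raw_hr1$HR)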
Create vectors to *mask* off the bad data.
# TRUE where enough satellites contributed to the fix, NA otherwise
mask_4sat <- !((!(raw_gps$satellites > 4)) & NA)
mask_6sat <- !((!(raw_gps$satellites > 6)) & NA)
This uses R's three-valued logic: FALSE & NA evaluates to FALSE, but TRUE & NA evaluates to NA. Multiplying a data vector by its mask then passes good values through unchanged and turns bad ones into NA.
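A quick demonstration at the R prompt, including what multiplication by a mask does:
c(TRUE, FALSE) & NA
# NA FALSE
c(10, 20, 30) * c(TRUE, NA, TRUE)
# 10 NA 30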
More data massaging is in order:
* I don't have HR data for the entire time interval covered by the GPS data
* I need to line up the data sets
* The GPS data is sampled with a period of 1s, while the HR data is sampled with a period of 5s
The way I chose to deal with this is to create time series using *ts()*, then extend and resample them with *window()*.
# GPS series: 1s samples (frequency 3600/hour); the first fix, 07:42:34 UTC,
# is 42*60 + 34 = 2554 seconds into hour 7
speed <- ts(raw_gps$speed * mask_4sat, start=c(7, 2554), frequency=3600)
alt <- ts(raw_gps$Altitude * mask_6sat, start=c(7, 2554), frequency=3600)
# HR series: 5s samples (frequency 720/hour), starting at 07:47:00 and 12:47:15 UTC
hr1 <- ts(data=raw_hr1$HR, start=c(7, 564), frequency=720)
hr2 <- ts(data=raw_hr2$HR, start=c(12, 567), frequency=720)
I've chosen the hour as my unit of time. The *start* argument to *ts()* is particularly important for getting all of the data to line up; the easiest way to make the times agree is to ensure that the clock on the heart rate monitor is set to UTC, and that the GPS logs are in UTC. The *frequency* argument reflects the sampling rate of the GPS (3600 samples per hour) and the heart rate monitor (720 samples per hour).
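If you don't want to do that arithmetic by hand, a small helper (my own, not from any package) converts an HH:MM:SS UTC timestamp into the pair that *start* expects:
# hypothetical helper: HH:MM:SS UTC -> c(hour, sample offset) for a given frequency
ts_start <- function(h, m, s, frequency) c(h, (m * 60 + s) * frequency / 3600)
ts_start(7, 42, 34, 3600)   # 7 2554 -- the first GPS fix
ts_start(7, 47, 0, 720)     # 7  564 -- the first HR sample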
Now, mash all of this data together into a time series matrix, padding and resampling the constituent vectors to be uniform.
bike_colle_valdelsa_to_siena <- ts(
  matrix(
    c(
      # window() pads each series out to 07:30-13:30 UTC (extend=TRUE fills with NA)
      # and resamples the 1s GPS series down to the 5s HR rate (frequency=720)
      window(alt, start=7.5, end=13.5, frequency=720, extend=TRUE),
      window(speed, start=7.5, end=13.5, frequency=720, extend=TRUE),
      window(hr1, start=7.5, end=13.5, frequency=720, extend=TRUE),
      window(hr2, start=7.5, end=13.5, frequency=720, extend=TRUE)
    ),
    ncol=4),
  start=7.5, end=13.5, frequency=720)
colnames(bike_colle_valdelsa_to_siena) <- c("altitude", "speed", "HR1", "HR2")
Now all of the data is referenced to a common time base, and ready for some analysis.
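For a first look, something like the following works; *cor* needs to be told how to handle the NA points we introduced:
plot(bike_colle_valdelsa_to_siena)
cor(bike_colle_valdelsa_to_siena, use="pairwise.complete.obs")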
!/i/pics/060926.bike_colle_valdelsa_siena/060926.bike_colle_valdelsa_siena.alt_speed_hr.png!