Modeling GPS data


is not very easy. It seems pretty obvious at first, but when you look closer it's pretty damn hard.

There are a few caveats I've encountered...

Deviation. Especially when you're not outside or when the signal is somehow blocked. But the GPS signal deviates anyway: atmospheric conditions, reflections and receiver noise all add error (and the signal even used to be deliberately degraded, under Selective Availability, to prevent exact pinpointing).

The deviation makes your signal very jumpy, especially with bad reception. My signal will jump within a radius of a kilometer when the GPS logger is near the window.

Day at home (blue is my home)

The signal is better outside, especially when moving, but even then there's still a spike every now and then. Or sometimes the signal is off by a few meters, possibly due to bad calibration.
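For what it's worth, even a crude centered moving average over the fixes already damps a lot of this jitter. This is just a sketch of the idea (a Kalman filter would do a much better job); the function name and window size are my own:

```python
def smooth(points, window=5):
    """Smooth a list of (lat, lon) fixes with a centered moving average.

    A crude way to damp GPS jitter: each fix is replaced by the mean of
    the fixes around it. Near the ends the window simply shrinks.
    """
    half = window // 2
    out = []
    for i in range(len(points)):
        chunk = points[max(0, i - half):min(len(points), i + half + 1)]
        lat = sum(p[0] for p in chunk) / len(chunk)
        lon = sum(p[1] for p in chunk) / len(chunk)
        out.append((lat, lon))
    return out
```

The obvious downside: averaging also rounds off real corners, which is exactly the information you want to keep.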

The other problem is spikes. This one is by far the easier to tackle, since you can easily pinpoint sudden changes in speed (5 kilometers in 15 seconds? Right).

But what if you step into a train and your speed suddenly increases to 30 - 40 meters per second? How can you tell the difference from a spike? And how do you know that when you're walking in a circle, you're not just stationary in the centre and the GPS is jumping?

A trip on foot thru Amsterdam (no spikes)

And what is the right interval to record at? Every second? Every minute? What can you cut off the track without losing vital information like corners?
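For the cutting-without-losing-corners part, the classic trick is Ramer-Douglas-Peucker: drop every point that lies closer than some tolerance to the line through the segment's endpoints, so straight runs collapse but corners survive. A rough sketch, assuming planar coordinates (the tolerance is in the same units as the points):

```python
def rdp(points, epsilon):
    """Ramer-Douglas-Peucker line simplification.

    Recursively keeps the point farthest from the chord between the
    endpoints, as long as it sticks out more than epsilon.
    """
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    norm = (dx * dx + dy * dy) ** 0.5 or 1e-12
    # find the interior point with the largest perpendicular distance
    best_i, best_d = 0, -1.0
    for i, (x, y) in enumerate(points[1:-1], start=1):
        d = abs(dy * (x - x1) - dx * (y - y1)) / norm
        if d > best_d:
            best_i, best_d = i, d
    if best_d <= epsilon:
        # everything is close enough to a straight line: keep endpoints only
        return [points[0], points[-1]]
    left = rdp(points[:best_i + 1], epsilon)
    right = rdp(points[best_i:], epsilon)
    return left[:-1] + right
```

For real lat/lon tracks you'd first project to metres; the nice property is that the tolerance answers the interval question indirectly: record often, then simplify afterwards.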

As far as detecting stationary tracks goes, I figured a clustering technique like k-means would work very well. But clustering requires several passes through the entire dataset, which can be very costly in scripting languages.
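A cheaper alternative is a single-pass scan instead of k-means: grow a cluster while fixes stay within some radius of its running centroid, and call it stationary once it has enough fixes. This is a sketch of that swapped-in technique, not k-means itself; the radius, minimum count and flat-earth metre conversion are all assumptions of mine:

```python
from math import cos, radians

M_PER_DEG = 111320.0  # metres per degree of latitude (approx.)

def stationary_segments(fixes, radius_m=50.0, min_fixes=10):
    """Single pass over fixes: list of (lat, lon).

    Grows a cluster while fixes stay within radius_m of its running
    centroid; returns (start_index, end_index) pairs of stationary runs.
    Distances use a flat-earth approximation, fine at city scale.
    """
    segments, start = [], 0
    clat, clon, n = fixes[0][0], fixes[0][1], 1
    for i, (lat, lon) in enumerate(fixes[1:], start=1):
        dy = (lat - clat) * M_PER_DEG
        dx = (lon - clon) * M_PER_DEG * cos(radians(clat))
        if (dx * dx + dy * dy) ** 0.5 <= radius_m:
            # still inside the cluster: update the running centroid
            n += 1
            clat += (lat - clat) / n
            clon += (lon - clon) / n
        else:
            if n >= min_fixes:
                segments.append((start, i - 1))
            start, clat, clon, n = i, lat, lon, 1
    if n >= min_fixes:
        segments.append((start, len(fixes) - 1))
    return segments
```

One pass, constant memory, so it stays cheap even in a scripting language; the trade-off is that it only finds contiguous stationary runs, not "the same place visited twice" the way clustering would.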

At the moment I haven't found a satisfying algorithm that analyzes and compresses tracks like that without losing too much (vital) track information.

The only thing that seems to work decently is cutting away spikes (when the speed between two points differs by more than 8 m/s from the previous point, you can be pretty sure it's a spike; if not, it's probably a straight enough line anyway) and treating a point as static when the speed is around 1 m/s or less.
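Put together, that heuristic looks roughly like this. The thresholds match the text above; the flat-earth distance and the function name are my own simplifications:

```python
from math import cos, radians

M_PER_DEG = 111320.0  # metres per degree of latitude (approx.)

def classify(fixes, spike_jump=8.0, static_speed=1.0):
    """Label each gap between consecutive fixes.

    fixes: list of (timestamp_s, lat, lon).
    'spike'  when the speed changes by more than spike_jump m/s
             compared to the previous gap,
    'static' when the speed is at or below static_speed m/s,
    'moving' otherwise.
    """
    labels, prev_speed = [], 0.0
    for (t1, la1, lo1), (t2, la2, lo2) in zip(fixes, fixes[1:]):
        dy = (la2 - la1) * M_PER_DEG
        dx = (lo2 - lo1) * M_PER_DEG * cos(radians(la1))
        speed = (dx * dx + dy * dy) ** 0.5 / max(t2 - t1, 1e-9)
        if abs(speed - prev_speed) > spike_jump:
            labels.append("spike")
        elif speed <= static_speed:
            labels.append("static")
        else:
            labels.append("moving")
        prev_speed = speed
    return labels
```

It's blunt, but it catches the kilometre jumps from the window-sill logger while leaving the train case alone, since a train accelerates gradually rather than in one 8 m/s step.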