• Category Archives: tech talk
  • Computers and programs, maps and GPS, anything to do with data big or small, as well as my take on the pieces of equipment I use in other hobbies — think bike components, camping gear etc.

  • Back on the Python Train

    I’ve been doing a bit of experimenting with FIT files, and C and ogr2ogr and… I’ve decided to use Python for my latest GIS project.

    I was able to extract some (but not all) of the data I need from the FIT file using GPSBabel and, in an unwieldy process, send it as a CSV to PostGIS via ogr2ogr, and do the final processing within the database. What GPSBabel did not get me was the lap info; I’d need to write some kind of program to extract it myself, and any real processing I’d need to do (aggregating my track points into a line, for example, or turning timestamps and GPS positions into average speed) seemed more suitable for doing in a program anyway.

    Meanwhile, I had downloaded the ANT/FIT SDK, and it contained a C library as well as usage examples. These were all written to be a part of some Windows-based IDE’s build process, but they were easy enough to put into Makefile format and get running and, by modifying and re-writing the examples, I managed to extract all the necessary data from the FIT file. The next steps were to process and aggregate the data into summary form, and (using OGR libraries) to add the summarized activity data as a record into my database.

    I did some Googling for advice on how to go about this, but I got few hits for doing it in C and many for Python: basically there’s a library for reading FIT files, and multiple libraries each for processing geometry data and connecting to PostGIS, and the code for my first attempt came to about two dozen lines. It feels slow, but I noticed that my C program also took some time to read and process the points in the file — the Python version isn’t really slower in comparison, and writing the code was soooo easy that using Python was worth it for that reason alone.
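
    To give a feel for the kind of processing involved, here is a minimal sketch in plain Python, not the project's actual code. It assumes the track points have already been read out of the FIT file into (timestamp, lat, lon) tuples; the function names and the tuple layout are hypothetical, and distance comes from the haversine formula rather than from a GIS library.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def summarize(points):
    """Aggregate time-ordered (timestamp, lat, lon) tuples into one summary record."""
    distance = sum(
        haversine_m(a[1], a[2], b[1], b[2]) for a, b in zip(points, points[1:])
    )
    elapsed = (points[-1][0] - points[0][0]).total_seconds()
    avg_speed_kmh = (distance / elapsed) * 3.6 if elapsed > 0 else 0.0
    # WKT wants x y order, so longitude comes first
    wkt = "LINESTRING(" + ", ".join(f"{lon} {lat}" for _, lat, lon in points) + ")"
    return {
        "distance_m": distance,
        "elapsed_s": elapsed,
        "avg_speed_kmh": avg_speed_kmh,
        "wkt": wkt,
    }


start = datetime(2018, 11, 6, 9, 0, 0)
track = [(start, 0.0, 0.0), (start + timedelta(seconds=60), 0.0, 0.01)]
summary = summarize(track)
```

    The WKT string is one convenient way to hand a line geometry to PostGIS; the elapsed time here is wall-clock time between the first and last point, so any stops are included in the average speed.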

    I need to finish my little piece of code, then I can use it on my machine as a standalone program. The next step after that is to find out how to use it from a website — there are several ways, and they all seem easy enough — and build a front end to access my activity data. Knock on wood, but the worst of the learning curve is over.


  • New Project: Down the Rabbit Hole and Still Digging

    I started looking into my new project the other day. The first steps will have to be extracting information from GPX or FIT files, and adding the information to a PostGIS database. I managed to do this in several ways, mostly through a combination of GPSBabel and ogr2ogr, though no single way has done exactly what I want yet: ogr2ogr automatically adds GPX data to the tables in a manner similar to what I want, but extension data (heart rate, temperature) is not treated the way I want, while the FIT data needs to be extracted first into a format readable by ogr2ogr, and then put in the right table form after being put in the database, all of which turned out to be surprisingly easy. (Even so, I may just choose to go with adding the data from GPX for now.)

    The biggest problem I’ve run into so far is that GPSBabel does not extract all the data from the FIT file, and FIT is a proprietary, binary file format — I can’t get lap information, for example, just by scanning the file with awk or something. I may have to download and use the (again, proprietary) FIT SDK, in a C or other program I write myself. This may fit in well with what else I have to do, since I can call the parts of ogr2ogr I specifically need, directly from C.

    Before it gets to that point though, I have to decide what I especially want to do with this data, which will tell me what I need to extract, what I need to save, and what I can disregard, or discard after processing. Do I want to build a full-blown replacement for Garmin Connect, where I keep all relevant data? Or do I want to just build something, like a web badge, to show a minimum of data about the ride, data like distance, duration and a map of the ride, with maybe a link to the ride’s Garmin activity page? I am leaning towards the minimalist approach (which would entail just saving one record per activity, with fields containing aggregate data), but I think I want at least some of the individual track point data because I may want to graph things like elevation or heart rate.

    But maybe I don’t need to keep trackpoint data to build my graphs on the fly. Maybe I can make small graphs as PNG’s or GIF’s for the badge, and store those images in the database — hopefully they would be smaller than the trackpoints themselves. Alternately, I could store the entire FIT file (which is actually pretty small) in the database, and extract whatever I need on the fly. (I would still do a one-time analysis to get and store my aggregate data, since this might be a little too slow for on-the-fly data generation.) These choices will depend on the results of all the little coding/database/GIS experiments I’m doing now, extracting, converting and aggregating sample data.

    Ten Years Gone: This is what I wrote on this date in 2008. We voted today, and I remain hopeful, but it is certainly not as happy a day as that one was, and even with good news I don’t think we’ll match that day.

     


  • It Is Done

    My second GIS routing project is now finished; I just added the final touches to the front end a few minutes ago. It can be improved in several ways — the routing engine could be quite a bit faster, for one thing — and the data it runs on, from OpenStreetMap and other sources, should be updated periodically, but This Project version 1.0 is basically done. (I suppose I should add a write-up here before I put the thing to rest, but you know what I mean: the program/website itself is complete and fully functional.)

    That means I need a new map project. The routing experiment was meant to have three projects, or rather one project done three ways: one each using QGIS, pgRouting, and GRASS, before I decided to branch out into separate projects. I’ve now got the first two completed, but I have no idea what to do for the GRASS project — I guess it will just have to wait until inspiration strikes. In the meantime, I may go back to the first project, or at least glean some of the results from it, to help build a web page for the Lehigh Towpath, something I can add to my old bike page. This may also morph into some trail promotion project in real life.

    Yesterday was pretty nice, if cool, and Trick or Treat was really fun. Today is chilly, rainy, and windy, and I spent the day inside with no regrets. We’re going to see a concert, featuring Anne’s violin teacher, tonight in Palmerton.


  • Race Day

    Well, we’re back from the half marathon in Hershey, and now back also from our nap…

    The race started at 7:30 AM, so we had to be there at say 6:30, so had to leave the house at 5:00, meaning we all had to get up by 4:30. We were all in bed by 9:00 last night, but it was still a hard morning. We got there about 6:45 — crazy parking traffic — and that was almost like “just in the nick of time” considering the bathroom lines, but Bruce & Heather lined up with no problems and the race went off without a hitch, then just as the race started we met up with Lorraine.

    We walked around to several different vantages together, managed to see all our runners (Heather & Bruce, and Adelle & Liz who did it as a relay), and I even got a few photos. The whole thing was over by about 10:00. After navigating back through the parking traffic mess, we all met up for brunch at a place in Hershey. Good to get some food and to catch up with everyone, but it had been a cold, windy day and the place was chilly inside; we were glad to get back in the car and crank the heat. We were home by 2:00.

    New Tools Bring New Opportunities

    One area on my routing map has been a bit problematic: Rt 329 out of Northampton goes past a reservoir, or old quarry or something, and the DEM elevation data dips pretty hard right next to the road, as well as under it at a bridge. Since I find total ascent and descent for each road using interpolated DEM data at points along them, the roads that go over, or even just near, big elevation changes can have large ascent/descent values even if they are relatively flat.

    The bridges have had an easy enough fix for a while: I simply make the ascent and descent (and adjusted ascent/descent) zero for each bridge, and I do the same for very short roads connecting to the bridge, like abutments. In other words, I fudge the data… (I figure the bridges are all fairly flat anyway except some longer, river-crossing ones, and since those are pretty far apart their actual ascent/descent values won’t affect the routing calculations much.)

    Fixing these roads near the quarry was a bit harder. I didn’t want to set ascent/descent values down to zero for the whole long and moderately hilly road, but now that I can update the ascent/descent data much more quickly — this was that “the task went from several hours to under a minute” process improvement from the other day — I was able to do my fudging on the elevation-at-road-points data: I made the elevations in the “dipped” spots the same as the points just outside, then re-updated my database with the new script. It worked great, the roads now route more realistically in that area, and it took about 5 minutes to do.
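
    The idea can be sketched in a few lines of Python. This is an illustration only, not the SQL or script actually used: the function names and the sample elevations are made up, and the dip is filled by interpolating linearly between the two points just outside it, which is one reasonable reading of "the same as the points just outside."

```python
def ascent_descent(elevations):
    """Return (total_ascent, total_descent) over a sequence of elevations."""
    ascent = descent = 0.0
    for prev, cur in zip(elevations, elevations[1:]):
        delta = cur - prev
        if delta > 0:
            ascent += delta
        else:
            descent -= delta
    return ascent, descent


def flatten_dip(elevations, start, end):
    """Return a copy with samples start..end (inclusive) replaced by a straight
    line between their outside neighbors. Requires 0 < start <= end < len - 1."""
    fixed = list(elevations)
    left, right = fixed[start - 1], fixed[end + 1]
    span = end - start + 2  # number of steps from left neighbor to right neighbor
    for step, i in enumerate(range(start, end + 1), start=1):
        fixed[i] = left + (right - left) * step / span
    return fixed


# Hypothetical elevations (meters) sampled along a nearly flat road;
# samples 2 and 3 fall into the quarry in the DEM.
road = [100.0, 101.0, 80.0, 78.0, 102.0, 103.0]
raw_ascent, raw_descent = ascent_descent(road)
fixed_ascent, fixed_descent = ascent_descent(flatten_dip(road, 2, 3))
```

    With the dip in place the road appears to climb 26 m and drop 23 m; after the fix it climbs 3 m and drops nothing, which is much closer to how it would actually ride.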


  • Milestone

    So I’ve been messing with that Lehigh Valley bike commuter routing program again, and I have made some important strides:

    • I found a way to update the recommended routes (easy, advanced, etc.) by maintaining separate tables of these routes as linestrings (which I can add and subtract, draw and redraw), then updating the relevant field in the main table using a spatial join. The update process is now automated simply by running SQL files, one for each type of route.
    • I sat down with the workflow for updating the main map table, and managed to automate much of it — everything but the SAGA tasks, though I think I can automate them too, eventually. I also managed to streamline one of the more time-consuming tasks: Generating the ascent/descent tables used to take upwards of 12 hours when I first did it (using Python within QGIS), and my next iteration (using PostGIS) took about 20 minutes, but my latest method got it down to about 57 seconds. Fifty-seven seconds! All of these are now also stored as SQL files or functions, so they are available almost at the push of a button. (My goal is a shell script putting all of this together.)

    These were both pretty big deals, since they were the only things keeping the project from being truly functional. Before this, keeping the database up-to-date was like pulling teeth. Unfortunately, I decided to add some front-end functionality, testing to see if selected points are within the Lehigh Valley, and that’s been a bit of a struggle, but if I can find a host for the project I think I can go live really soon.
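
    For the "is this point inside the Lehigh Valley" check, one standard technique is a ray-casting point-in-polygon test. The sketch below is in Python for illustration (a real front end would more likely do this in JavaScript or in the database), and the boundary is a made-up rectangle with approximate coordinates, not the project's actual boundary polygon.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: True if (x, y) lies inside polygon, a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical bounding box as (lon, lat) vertices; a stand-in for the real boundary.
VALLEY = [(-75.8, 40.4), (-75.1, 40.4), (-75.1, 40.9), (-75.8, 40.9)]

bethlehem_ok = point_in_polygon(-75.37, 40.62, VALLEY)     # roughly Bethlehem
philadelphia_ok = point_in_polygon(-75.16, 39.95, VALLEY)  # roughly Philadelphia
```

    Each edge that crosses a horizontal ray running right from the point flips the inside/outside state, so an odd number of crossings means the point is inside. Points exactly on the boundary can go either way, which hardly matters for a sanity check like this.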

     


  • Housebound and Alone

    Well not really, but Anne is now into the second week of her bike adventure, and my stubbed toe — can you believe it? — along with the weather, kind of keeps me from wanting to be very active. So, what have I been up to?

    Well, for one thing, I’ve been trying to stay ahead of domestic disaster, here on my own. The Trail Summit kept me busy for a few days, then I went on a round of house cleaning: I straightened, dusted and vacuumed upstairs one day, then did the same downstairs another day, and in between I did some food shopping and ran errands. I also had a bit of an “infrastructure incident:” the support at the wall for one of the clothes hanger rods broke in my closet — I hung up some suits from the drycleaner the day before — so one errand was to Lowes, where I got the support but no other thing I needed. (We need a new kitchen clock, among other things.) This came on the heels of a completely wasted trip to a new phone repair place, which claims in their advertising, and in the “grand opening” article in the local paper, that they can fix just about anything. I show up, looking for a new battery and a replacement dust cover for the charging port — “we can’t fix that.” Yeah I was doing a slow burn after those trips…

    Anyway, I’m continuing with some minor repairs here, changing light bulbs, keeping busy, trying to stay on top of things. I got the phone parts, as well as a new Garmin battery, from Amazon, so when they show up I can do a little DIY repair. There are also a few things at Anne’s office that need doing, which I’ll probably tackle in the next few days. Keeping busy.

    The big thing I’ve been doing has been putting some finishing touches on my Lehigh Valley Commuter Bike Routing Project. I need to update the big database of streets (a daunting task), but I developed a way to quickly get streets that are a part of preferred routes, routes to be avoided, etc. identified and updated. This has been a stumbling block, because I’ve had unused logic in the code, to prefer or avoid roads based on which preferred routes layers were visible, and I had no realistic “preferred routes” developed. With this new trick (short scripts to do the updating, based on spatial joins), I drew up a bunch of easy routes, more advanced routes, legal but inappropriate roads, and dirt paths, and added them to the database. Son of a bitch, it all worked!

    I also did a little site cleanup, making things work and look nicer; it’s close enough to done that I may show it to someone soon, though it still has to live on my laptop, since I still have not found a free host that can/will handle pgRouting.

     


  • Odds & Ends

    Posted on by Don

    I have a bunch more photos to put up about the final leg of our vacation (Ben’s graduation), but before I get to that I have a few other items, and a few other vacation photos, I want to post that really don’t go anywhere else.

    Vacation Miscellany

    Just a few photos of things around the cabin. Our place apparently was a camp once, having multiple primitive cabins, etc, and had been refurbished — and had the main house added — after years of downward fashionableness and possible abandonment; three cabins were still standing, one converted into a sort of detached den or game room, and the other two converted into separate sleeping quarters. Behind the cabins, as things were now arranged, was a small pond with a dam at one end. I’m not sure how important the pond had been in the past — it had the look of a kiddie fishing area — but now it was brown and scummy, and working its way back to being a meadow. (The lake was a lot better, but the muck at the bottom made for unpleasant swimming. Only Alex and I tried, and we only tried once.)  There were other camp amenities, including a fire pit which we made use of on the chilly nights.

    Shapes and Clusters

    The clustering experiments were a success, but what I really want is to show the regions or neighborhoods where my cycling amenities are clustered. I’ve been trying several different ways to build a shape around a group of points:

    • Convex Hull: this one is pretty nice, it’s the shape you’d get if a rubber band were stretched around the points. It’s also built into both QGIS and PostGIS. Unfortunately, if the point cluster has concavities the convex hull won’t show them: an L-shaped cluster would get a roughly triangular region.
    • Concave Hull: this one is also available in both QGIS and PostGIS, but I don’t trust it — I can’t find too much about how it really works, its very name doesn’t make all that much sense, and it requires parameters that are not as well documented as I’d like.
    • Alpha Shape: the most promising of the bunch, defined pretty rigorously in “the literature,” and I like the looks of the shapes it makes. Unfortunately, it doesn’t exist in either QGIS or PostGIS; it is available as a package in R, so I’ve spent some time this week getting R to run correctly after much neglect, then installing the “alphahull” package and trying it out. I managed to import my data and create alpha shapes; now I have to find how to convert and export the shapes back into my database.
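
    To make the convex hull's blind spot concrete, here is a small self-contained Python sketch using Andrew's monotone chain algorithm. It is an illustration only (QGIS and PostGIS have their own implementations), and the L-shaped point set is made up.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive means a left turn at a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it would make a right turn or a straight line
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]


# A hypothetical L-shaped cluster: the inside corner of the L is at (1, 1).
l_shape = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2), (0, 1)]
hull = convex_hull(l_shape)
```

    The inside corner at (1, 1) never makes it into the hull, so the notch of the L is papered over by a straight edge from (2, 1) to (1, 2). That is exactly the concavity a concave hull or alpha shape is meant to preserve.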

    There is one other method I just thought of, and pretty simple compared to these approaches: I could just make a heat map from the clustered amenities, then use a “contour line” function on the heat map raster. If the others don’t give satisfaction I may try this.

    Around Here

    Today was a brief respite from days of heavy, almost continuous rain — more is coming, starting tomorrow. I took the opportunity to attack the jungle that once was our back yard, managed to use up all the weed-whacker twine, and ran over a yellow jacket’s nest (no stings, but a fairly hasty retreat into the house for a while), and the yard looks much better if not quite 100% yet.

    We’ve also had a Warm Showers guest: a young Brit named Arron who landed in New York and is cycling across the US. He’s early in his ride, not quite acclimated to cycling, and he’s getting a real baptism by fire, or at least by rain and hills and poor road choice, but he was a trooper. He stayed for two nights before heading for Coopersburg.


  • A Sojourn* Into Clustering

    I was looking at the towpath amenities project in the week before we went on vacation, mainly to play with database reporting software, and I noticed that my amenities all were pretty closely grouped together. This stands to reason, since the data is a ready-made cluster (it’s composed of amenities within a kilometer of Sand Island, so the clustering may just be an artifact of that search criterion), but also because the data set encompasses the compact Main Street restaurant district. Continuing on with my reporting experiments, I looked at all amenities within a mile of Sand Island, and now found myself looking at two distinct groups of amenities, the one around Main Street, and another on the south side of the Lehigh. This also stands to reason (Chamber-of-Commerce types like to joke that we’re the city with two downtowns), but again I wondered if it was some artifact of the analysis, or even if I was seeing patterns that didn’t really exist, and that got me thinking of what I actually thought I meant by “cluster.”

    Turns out, it’s a fairly big subject, with different ways of describing what “cluster” might mean — usually (and intuitively), it’s a subset of similar items within a larger data set, but then what does “similar” mean, and how similar do the members of a cluster have to be, especially compared to the rest of the set? For each way of understanding what a cluster is, there are various ways of finding the clusters within a data set. This whole subject is apparently a big deal, a subject of ongoing research, and an important tool in the fields of machine learning and big data.

    My problem was spatial, so for me “similar” meant “close together in terms of location.” Some Googling found that there were plenty of GIS solutions to clustering problems, and in fact PostGIS contains several functions implementing the more common and important clustering algorithms, including DBSCAN, the algorithm that comes closest to what I think “clustering” should mean for my situation.
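
    For a sense of what DBSCAN does, here is a bare-bones Python version. It is a teaching sketch, not what PostGIS runs (there the algorithm is exposed as the ST_ClusterDBSCAN window function); the sample points and the eps and min_pts values are made up, and distances are plain planar distances in whatever units the coordinates use.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force search; the neighborhood includes the point itself
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not dense itself
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return labels


# Two tight groups and one straggler (hypothetical planar coordinates).
amenities = [(0, 0), (0, 1), (1, 0), (1, 1),
             (10, 10), (10, 11), (11, 10), (11, 11),
             (50, 50)]
labels = dbscan(amenities, eps=1.5, min_pts=3)
```

    The appeal for this kind of problem is that the number of clusters is not chosen in advance: a cluster is just any region where enough points sit within eps of each other, and isolated points are reported as noise rather than forced into a group.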

    And here is where things became complicated…

    The clustering functions are not available in the version of PostGIS that I had installed. So I decided to upgrade PostGIS, did a bit of research and found many articles with titles like “How to Brick Your Database By Updating PostGIS.” The process itself is not difficult, it uses old-school “make” rather than a package manager, and the pitfalls are easily avoided, but now I was scared and I thought I’d better back up my whole database system before continuing. What this meant though, was that first I had to make room on my hard drive, which has one (small, overcrowded) main partition and a (large, empty) secondary area. First thing would be to back up the secondary partition to the NAS drive — something I’ve been remiss on ever since I installed Mint — then I’d move both my music (35 GB) and my photos (12 GB) over to the secondary drive, and then update the music and photo software so it knew where all the files went — it was starting to sound like that song about the hole in the bucket…

    I got through the first part, backing up the drive (which took hours), before we went on vacation. There was no Internet at our cabin, and I didn’t bring my computer anyway, so the rest had to await my return. The remainder of the hard drive cleanup (music and photos) also took some time but went smoothly enough, and I did a full backup of my databases.

    From here the process was a bit anticlimactic: I downloaded the new version, ran make and typed a few things into the database, and I was done without bricking a damn thing. I needed to lie on my fainting couch and rest for a day after that, but when I finally got around to using the new functions they were a breeze.

    I found some clusters and drew polygons around them — the subjects of another post —  but I have more to do to figure out what these things are actually telling me.

    *Hat tip to Achewood, still my favorite Internet thing ever.


  • Cancel The Exorcist

    I had a pedal-induced creak building in the 5010 over the past week or so. The last time I had anything like this it was one of the pivot bushings, so a few days ago I tightened them — no fix, and the creak was worse than ever yesterday. I did the pivots again today, and the creak remained.

    Next step was to look at the crank and bottom bracket. I’d never taken my crank off this bike, didn’t recognize the system (for the record: it’s a Race Face Aeffect crank with Cinch chainring tech), and tried for about an hour to remove it. This included at least 20 minutes looking through instructional videos, but this information seems to be some kind of secret…

    I finally found one that showed how, and here’s the secret: there is a dust cap on the drive side, removable by an 8mm Allen wrench, but you don’t remove it. Instead, use a 7mm Allen wrench inside the dust cap to unscrew an internal connection to the ISIS drive; this pushes against the dust cap and acts as a self-extractor. It came off smooth as silk — live and learn. (Note: take the dust cap off to re-install the crank, the internal screw is kind of finicky to get going.)

    I cleaned and lubed the crank parts, then looked at the bottom bracket and found my problem: the drive side had come loose, and that, coupled with the grit that subsequently got into the threads, was the likely cause of my creak. I pulled the BB, cleaned and greased the threads, put it all back together, and took it for a test ride. Perfect! No squeaks and no creaks, and that’s good because I don’t know what I would have needed to do next.


  • I Can’t Stop

    Here’s one I made, using Carto:

    This looks a bit closer to usable, though I don’t see much real style control when making the map. Maybe a little studying over at Carto…

    Meanwhile, I tried one of my demo web pages that uses Leaflet, embedded in a test post, and it (the map) worked great. Unfortunately, the map itself was only one part of another website, and stuffing the entire thing into an iframe caused some display/clutter issues, so I decided to write another demo to see what I can do. This can sit and stew for a while, I think I got whatever out of my system — it’s time for a ride.