GIS – Page 2 – The Other Don Kelly

← Older postsNewer posts→

Tag Archives GIS
Anything to do with Geographical Information Systems or mapping.
Native Paths Update

Posted on January 18, 2022 10:02 PM by Don

I kept at it, and am now about a quarter of the way through the trails — the motorway parts, at least — in my Indian Paths of Pennsylvania project. I have a pretty good idea of how the book is organized now, and came up with a pretty decent workflow that gets me through a single path in just under an hour. I do one or two a day. It’s pretty easy to get absorbed, trying to find the tiny old roads and landmarks based on their descriptions in the book, and I’ve been totally sucked into the history of that Colonial-Revolutionary era. (I picked up Mason & Dixon again, since it goes right through the middle of that time and place.)

I also think there will be some epic rides this summer, based on these routes — I’ve been drooling over some of the scenes I see in Google Street View.

📂This entry was posted in day by day looking back tech talk 📎and tagged GIS native paths
Foiled Again!

Posted on January 5, 2022 2:35 PM by Don

I have a love-hate relationship with Paul A. W. Wallace’s Indian Paths of Pennsylvania. I love reading the individual chapters on each path — their descriptions, and the accounts of them in the letters and diary entries of early Colonial explorers, but any hard look at the specifics and the trails themselves become frustratingly vague. This is all the more frustrating because the information looks specific and authoritative enough, until you take that close look…

Some of this is because the original information is vague — nobody was tracking their steps with a GPS back then — so the actual trail location is not perfectly known, and partly it’s because the trails themselves are long gone (though some are at least partly followed by modern roads), so it’s hard to search them out without trespassing, but there also just seems to be some missing ingredient needed to define a trail network.

A few years ago I thought that this last part could be solved with a little bit of GIS detective work, so I started a QGIS project to define the trails and see about building a network, but I sort of ran out of steam — I basically foundered on the vagueness of the trail descriptions. I did one or two in the Lehigh Valley, and realized that the sleuthing needed was a lot more laborious than simple data entry, and the project languished after those first few paths.

I was thinking about all this again recently, and realized that there is a critical first step I ignored: the book serves primarily as an automotive guide, with detailed instructions for driving in the vicinity of each path. I also thought that if I broke the task down to a set of database tables, I could link these auto routes to their various paths and book chapters . (Some trail chapters actually describe multiple trails and subtrails, while some motorway descriptions continue across multiple chapters, so many-to-many relationships abound but that’s what databases are for. Furthermore, most of the trail chapters have a start and an endpoint, yet more data I can use to cross reference.)

This scheme fell apart within the first few trails. The very first trail, the “Allegheny Path,” has Philadelphia as the start point and “Pittsburg and Kittanning” as the endpoint — so which is the endpoint? Apparently neither, because the trail is only described as far as Harrisburg; the “Allegheny Path” chapter ends with references to several other trails (different chapters, in other words) heading West from Harrisburg as possible continuations. So OK, I can deal with this: my endpoints are really Philadelphia and Harrisburg, and I’ll stuff the rest of the info into my “description” column. (There is a second path listed in that first chapter, but it is little more than a historical aside and a reference to another path/chapter. This is going to get tricky.)

Luckily the motorway for the Allegheny Path is easy to follow. I used an open routing plugin to follow along a bunch of control points, and voilá I had my linestring. This ain’t so bad!

The very next chapter, I ran into motorway difficulties: the route description made no sense. Either the routes were not prepared with adequate ground-truthing (unlikely, though I was starting to feel uncharitable), or the roads (and their designations) had changed at some point in the 55 years since the book came out. This seems the more likely explanation, since I-80 goes right through the area in question, was only finished in 1970, and probably changed a lot of things in its wake. I actually found the Wikipedia article on the Bald Eagle Creek Path more useful.

So I’m back to deciphering and making judgement calls rather than strictly converting the information from one format to another, even for these road descriptions. I didn’t expect this project to be done in an afternoon, or even a week or so, but “going to take forever because I’m not really working on it” is now closer to my expectation.

(Note: I found that someone already took these paths and put them into a GIS, but it’s on PA-Share and that’s proved difficult to work with — and deliberately limited, unless you pay — so far. We’ll see…)

📂This entry was posted in looking back tech talk 📎and tagged GIS native paths
Oh No Not That Again (Part 2 of 2)

Posted on December 13, 2021 11:27 AM by Don

So I was playing with the commuter mapping program the other day after doing some simple maintenance, just finding routes from here to there, and it started bothering me again that I could not route onto the towpath from Sand Island — the network was incorrect, it had no intersection from Main Street onto the path.

I get my road data from OpenStreetmap, and I know that, in OSM, the trail is properly connected at Sand Island — I fixed that myself years ago, but never went through the rigamarole of updating and rebuilding my network. It didn’t seem worth the work for such a small change. There is another way to make that change though: I could modify my existing network, but that always seemed like it would be even more complicated and difficult than rebuilding from scratch.

But would it be? The task really boils down to two things: adding a node where I want the new intersection to be, and then splitting the newly intersecting roads in two at the intersection point. Adding a node is easy enough, but splitting a road has a lot of moving parts — each of the two new road segments has to be assigned about 30 attributes, some of which they can inherit directly from the original road, others basically pro-rated from the original road based on the new road segment lengths, and yet others related to connecting the new road segments to the new node. It’s straightforward, but there are a lot of small, tedious calculations to perform and keep track of. Sounds like a job for the computer…

What I did was write a PostgreSQL function that takes the node and the road, and returns two new road segments. I also wrote a wrapper script to update the network by calling this function. (I decided to just add my new nodes “by hand.”) It mostly works, though in one test case it didn’t split the road exactly where I thought it should (no idea why), and the new network routes like a champ.

This isn’t a substitute for rebuilding the network: this is a quick fix for a small problem, and the the pro-rated attributes especially are a hack, an approximation; I can easily see situations where pro-rating say, ascent/descent data would be inaccurate. But this is fine for now.

📂This entry was posted in tech talk 📎and tagged GIS
Oh No Not That Again (Part 1 of 2)

Posted on December 11, 2021 8:08 PM by Don

I’ve been looking at my Lehigh Valley bike commuter routing project again.

I decided to update the recommended routes with additions based on some of our recent CAT rides, and found that the line geometries representing the various routes were missing. It’s no biggie, some things didn’t survive those destructive “upgrades” I did a few years ago, and the actual recommendation info is stored in among the road network data anyway.

But, I still had the old routes as GeoJSON files, and it’s easier to work with them as geometries in their own right than as attributes on the road network, so I added them back into the database. Then I added that new route (Cedar Street, which parallels Union Boulevard but is much quieter), and used it to update the network. Piece of cake!

I also decided to tackle the problem of updating the network paths themselves, which is not so much a piece of cake. I get the roads from OpenStreetmap, and there are mostly automated tools to build a routing network from OSM road data, and that’s followed by a whole lot of additional data massage to put it in the form I use. But the underlying OSM data isn’t always accurate — roads don’t go where they are supposed to, intersections don’t actually connect, that sort of thing. I would find a lot of this out after building the network, but the task of editing the network, once it’s built, is so onerous that my preferred method has been to fix the issues in OpenStreetmap, then just download the roads and rebuild the whole network from scratch — also onerous, but slightly less so.

Anyway, I planned to make this a part of the usual site maintenance if this ever went live: maybe once a month I would download the OSM roads, rebuild the network, and then install all my extra stuff, and in between these upgrades I would fix OpenStreetmap whenever I found a problem.

The last time, and in fact the only time, I ever went through this updating process, was October 2018. I did some serious cleanup on OSM before that, so the map was in pretty good shape, but I got an embarrassing surprise when I demo’ed it to John R (an actual computer professional), who was thinking of commuting to Easton via the towpath. I’d just added offroad path options, and I was eager to show John my new toy, but the program refused to route onto the towpath at Sand Island — there was a missing intersection! A classic case of “broken demo.”

The need for (and my interest in) the routing program faded not long after that, so, although I cleaned up the offending roads and paths on Sand Island within OpenStreetmap, I never did download any newer road versions. And that’s how it sat for three years, until this week…

(to be continued)

📂This entry was posted in day by day tech talk 📎and tagged GIS
Map Update

Posted on August 1, 2021 3:53 PM by Don

I finally got around to riding the southernmost part of the D&L about two weeks ago, riding from Yardley to Bristol and back, and ground-truthing the trail and access points. I can scratch that off my bucket list, and I don’t see any reason to ride south of Yardley again — this trail section, especially the Morrisville-Levittown portion, is nowhere near as nice as other areas — but I got what I needed to finish my trail amenities map. I may do a little exploring on the Black Diamond north of White Haven just for the sake of completeness, but I think I now have everything I was looking for.

📂This entry was posted in tech talk 📎and tagged GIS
The Wheat From The Chaff

Posted on June 28, 2021 8:51 PM by Don

I’m not sure if this is going to rise to the level of “new GIS project,” but I have been playing around a lot lately with the local transportation authority’s GTFS feed — where GTFS stands for “General Transit Feed Specification,” a standard for publishing public transit information on the Internet.

These feeds are like a cross between spreadsheets and database tables, and by a judicious massaging of the data you can extract bus stop and route information. Unfortunately, that massaging is a real necessity: the specification is built to convey a lot of information, and to cover a lot of different transit situations, so there’s no simple route-and-stop information — it’s buried in cross-references and spread across multiple tables. All this extraction and data crunching is fairly straightforward though, and there are even tools to automate the process (I use a QGIS plugin).

Or the process would be straightforward, if we were not dealing with LANTA. These feeds are updated periodically, and about a year ago the new LANTA feeds sort of devolved into chaos, with extra routes showing up that had no real world connection, odd use of abbreviations for bus stop names (abbreviations are sort of frowned upon, for what ought to be obvious reasons), and their cross-referencing system becoming unnecessarily complex. It was hard to figure out what was going on — I thought at first that it was my analysis software mangling the data, but no it was them.

Well, they’ve been working through a huge revamp of their entire bus route network, so maybe that was the source of some of the bogus data. The new routes and schedules went into effect on June 21, and an updated feed followed soon after; I downloaded the new one and crunched the data — and the garbage was all still there! But, I noticed that in among the old chaos was a new and much cleaner set of data, valid starting on the 21st, showing the new bus routes and the correctly-named bus stops. So now I do a double extraction, first massaging the feed into a useful form, then extracting from that the new, valid and cleaned-up route data. Voilá!

I have some vague plan to add these bus routes to OpenStreetMap, but that’s a big undertaking, and I would prefer to rely on eyewitness ground-truthing (ie riding the bus) than a data set — which means even more work. For now I’m content with just having got the damn data.

📂This entry was posted in tech talk 📎and tagged GIS
More Fun with Routing

Posted on January 25, 2020 1:16 PM by Don

I’m not sure why I did it, but I installed PHP and Apache on my new computer, then moved a bunch of my “internal website” stuff over from storage. Everything seemed to work pretty well, so I tried the commuter routing program — I got errors, natch.

I looked at the error messages and realized that the pgRouting routines had changed, so my database routing functions were out of date — that led to me discover that even the newer version of PgAdmin3 doesn’t work well with my newer Postgresql version, especially when it comes to functions. So, I installed phpPgAdmin — which was also borked, and in the same way, but I was able to fix the source code. Even working properly it couldn’t do what I needed though, which was to modify my old function. I tried writing a new function through phpPgAdmin, which was extremely laborious, and basically re-wrote the original, broken function, so now I had two useless functions that I couldn’t modify. Ugggh, time for bed.

I woke up this morning and got it done old-school, writing a SQL script to define the function and running that from the command line. Presto, now I have a working function, and a working commuter routing program. Bonus: the new version of pgRouting is much faster (though that could be the new computer), and some routing errors are now fixed. Wish you could see it!

📂This entry was posted in tech talk 📎and tagged GIS
Project Drift

Posted on November 13, 2019 11:02 PM by Don

I’ve done a few more Road Scholar gigs this year, and my co-guide and I both feel that the ride choices could be improved, mainly by doing more bike paths and rail-trails, and doing less actual road riding. This would avoid the biggest issues we face (traffic and hills), and maybe allow the rides to be a bit longer and more enjoyable.

Meantime, I’d noticed a tendency, among our van drivers, to use Google Maps to navigate our pick-up, drop-off and other van access points. This is I think a good thing, but it’s led to map searches finding the wrong drop-off point — nearby features rather than the specific location we use. It works well enough that “OK, turn left here and pull into that parking lot” will get us there once we’re close enough, but navigating to an actual position (a given latitude and longitude, for instance) would work much better.

Finally, I thought it would be good to have an official repository somewhere, of the rides: their official routes (I use GPS to navigate on the rides) as well as waypoints, like lunch spots, points of interest along the ride, and those pick-up and drop-off points. Ideally, I would be able to load a ride into my GPS and have all info for the ride at my fingertips.

These all coalesced in my mind into the Great Big Ride Database GIS Project. The project would be made of three parts: storage of rides (official or otherwise) and waypoints into a ride database, transfer of rides/waypoints to and from my GPS, and analysis of the ride data.

First Steps, and Revolting Developments

I started by keeping “official versions” of our rides on RideWithGPS, and I would download them as GPX files onto my Garmin when I needed them. This would only take care of the route itself, however; I thought that there was also a need to maintain a list of waypoints associated with each route, so I decided to build some kind of database to hold routes and their waypoints.

Since I would like to be able to just hand over the ride information in some file format, my first attempt was to build the database as a GeoPackage file. This actually worked pretty well, when my plan was just to stuff the data into storage. But then, my plans started to morph: I needed to actually analyze the data (with a spatial query) to generate info I needed. The GeoPackage file should have been able to handle this, but I think I must have done something wrong back when I installed the underlying GeoPackage/SpatiaLite libraries, or I was doing something wrong now, but I just couldn’t get any spatial functions to work. After frustrating myself for a while I just moved the database over to PostGIS. My project was changing, but at least it worked.

So at this point, I started looking at the problem of getting the point data to places where I could use it — like onto and off of my Garmin. I collected a bunch of the waypoints as “saved locations” on my GPS, but then I couldn’t find any good way to export or upload them. (The Google tells me that Garmin apparently has some Windows programs that can manage waypoints, but that does me no good.)

I eventually dropped back and punted by writing a Python script. I scrounged around inside my Garmin and found a file called Locations.fit that seemed to be where the saved locations were stored, and used that fitparse library to rummage inside the FIT file, eventually figuring out the (undocumented) structure used to store waypoints. I could now export the waypoints into a QGIS layer, then I managed to realize that I could import the waypoints to my GPS via a GPX file in the same way I could import rides via GPX, and could even combine waypoints with the ride trackpoints in the same file for importing. Major breakthrough! — though the Garmin seemingly ignores all waypoint information (symbology, comment) except the name.

So things are now a bit different than how I first planned it, but I have a system that works. Next up: evaluating potential routes.

📂This entry was posted in tech talk 📎and tagged GIS
Fun With Maps

Posted on March 26, 2019 1:54 PM by Don

A friend sent me a video how-to to build a 3d map the other day, and while I thought it was really cool I didn’t want to use the software in the video. I have some pretty good stuff already, I thought, and tried to find a way to do it with either GRASS or QGIS. GRASS was a bit of a bust: I really hate the interface they use for 3d, and couldn’t find much on how to drape one layer over another — it used to be easy!

QGIS wasn’t much better, but then I am a few versions behind. There is a plugin, however, which enabled me to make a 3D map website. So here’s mine:

Old School Bethlehem in 3D

I used the USGS topographic map from 1894, and “draped” it over the DEM I made for the Lehigh Valley cycle routing project (which DEM unfortunately has height in feet rather than meters, so the hill heights scale a bit big). The view in the picture is of Bethlehem and environs, with South Mountain and Lehigh Mountain on the left, and the Camel Hump, back when it was still Quaker Hill, in the upper right. Click the image and it’ll take you to the map website.

I noticed, when playing with that topo map, that for things like roads it doesn’t align everywhere with current maps. The map was provided with a CRS by USGS, but I suspect it was guesswork: there is no projection or datum information on the map itself. (The corners do line up exactly.) This may be because of surveying inaccuracies, back then or even for modern maps — I’m mostly using OpenStreetMap, after all — or it could be that the roads themselves were moved or straightened over the years, or they guessed wrong with the CSR. I thought it interesting then, that on the 3D map the hills and contour lines line up as well as they do: the surveyors knew where the hills were, at the very least.

📂This entry was posted in tech talk 📎and tagged GIS
New Project: Down the Rabbit Hole and Still Digging

Posted on November 6, 2018 11:28 AM by Don

I started looking into my new project the other day. The first steps will have to be extracting information from GPX or FIT files, and adding the information to a PostGIS database. I managed to do this in several ways, mostly through a combination of GPSBabel and ogr2ogr, though no single way has done exactly what I want yet: ogr2ogr automatically adds GPX data to the tables in a manner similar to what I want, but extension data (heart rate, temperature) is not treated the way I want, while the FIT data needs to be extracted first into a format readable by ogr2ogr, and then put in the right table form after being put in the database, all of which turned out to be surprisingly easy. (Even so, I may just choose to go with adding the data from GPX for now.)

The biggest problem I’ve run into so far is that GPSBabel does not extract all the data from the FIT file, and FIT is a proprietary, binary file format — I can’t get lap information, for example, just by scanning the file with awk or something. I may have to download and use the (again, proprietary) FIT SDK, in a C or other program I write myself. This may fit in well with what else I have to do, since I can call the parts of ogr2ogr I specifically need, directly from C.

Before it gets to that point though, I have to decide what I especially want to do with this data, which will tell me what I need to extract, what I need to save, and what I can disregard, or discard after processing. Do I want to build a full-blown replacement for Garmin Connect, where I keep all relevant data? Or do I want to just build something, like a web badge, to show a minimum of data about the ride, data like distance, duration and a map of the ride, with maybe a link to the ride’s Garmin activity page? I am leaning towards the minimalist approach (which would entail just saving one record per activity, with fields containing aggregate data), but I think I want at least some of the individual track point data because I may want to graph things like elevation or heart rate.

But maybe I don’t need to keep trackpoint data to build my graphs on the fly. Maybe I can make small graphs as PNG’s or GIF’s for the badge, and store those images in the database — hopefully they would be smaller than the trackpoints themselves. Alternately, I could store the entire FIT file (which is actually pretty small) in the database, and extract whatever I need on the fly. (I would still do a one-time analysis to get and store my aggregate data, since this might be a little too slow for on-the-fly data generation.) These choices will depend on the results of all the little coding/database/GIS experiments I’m doing now, extracting, converting and aggregating sample data.

Ten Years Gone: This is what I wrote on this date in 2008. We voted today, and I remain hopeful, but it is certainly not as happy a day as that one was, and even with good news I don’t think we’ll match that day.

📂This entry was posted in tech talk the sporting life 📎and tagged GIS

First Steps, and Revolting Developments