tech talk – Page 2 – The Other Don Kelly

← Older postsNewer posts→

Category Archives tech talk
Computers and programs, maps and GPS, anything to do with data big or small, as well as my take on the pieces of equipment I use in other hobbies — think bike components, camping gear etc.
Fun With Rasters

Posted on October 17, 2022 10:23 AM by Don
I’ve been experimenting with raster data lately, photographing trail maps from Indian Paths of Pennsylvania and then digitizing them for use as map overlays in my project (I rough in the paths by tracing over them on the maps). This has worked really well, at least for when there is a path on the map — alternate paths are sometimes missing — but it came with a few problems:
- The digitized maps (georeferenced to match my map and converted to GeoTIFF format) look great, but the first one I did weighed in at a whopping 27MB. Since I expect to generate at least a hundred of these, that’s a significant amount of disk space.
- The maps start out as color photographs, and once they are in map form there is a lot of extraneous stuff that overlays (and blocks) the basemap beneath it.
So, I came up with a workflow that brings in my map images while avoiding these problems:
1. I start by taking a photo (with my phone) of the map in question, trying to get “nothing but map” in the shot.
2. Using GIMP, I rotate and crop as necessary, then clean up the photo by making off-white sections white, despeckling, and increasing brightness/contrast. I then invert the colors, making it a B&W negative before saving.
3. In QGIS I georeference the modified photo, using river confluences and other geographic features as my reference points. (I try for six or more “ground control” points to reference, and use the 2nd-order polynomial transformation to account for bent pages in the photo, though if the resulting transformation doesn’t look good I’ll try other options.)
4. Finally I convert the resulting TIFF from RGB format (colors) to PCT (a sort of numeric) format, and save at half the original resolution.
I can load the resulting raster as an overlay, and the raster pixels should be one of only two values (zero and one). I make the zero values transparent and the one values black, and now I have a very usable map overlay. The final GeoTIFF files average about 100KB each.

This makes tracing the paths very easy, maybe too easy: I feel a temptation to take the paths as gospel, even though I have no real idea of either the original map accuracy or the accuracy of the georeferenced overlay. Then again, it is the information as given in the book, and that’s what I set out to capture. Anyway, it’s a good first step. I’ve done about a dozen so far.
tech talk 📎and tagged GIS native paths QGIS
Next Steps for The Native Paths

Posted on August 18, 2022 10:24 AM by Don
So I have about a dozen left of the native paths to add to my database, in this first pass through Indian Paths of Pennsylvania (that book I’m following/analyzing/whatever). This is the pass where I go through the book from beginning to end, adding the basic info for each path to the database, adding the info about the start points and destinations, and generating the routes described in each chapter’s “For the Motorist” section.

There are a few big pieces left to this project, which I think I can do all together in a second pass through the book:
1. I need to document the relationships between the paths, as described for each path in the book (which is what required me to go through the book one first time: to get all the paths documented, before trying to map the relationships).
2. I need to generate the actual footpath routes. This will probably be the most difficult and labor intensive task in the whole project, and I expect it will likely involve digitizing all the (low quality) maps in the book; it may also require trying to find primary sources, old deeds and land grants etc, and even after all that I expect I’ll have to live with a great deal of ambiguity in the routes.
3. I’m not sure if I want to do this yet, but as I go through the book I may document any points of interest (landmarks, native towns that aren’t trail endpoints) that I haven’t already included.
I’ve been thinking about the first part for a while, and have set up a separate bridge table in the database to capture these relationships; the table is set up with links to a subject path (the one that’s doing the referring) and an object path (the one getting referenced), and a link to another table with the list of possible relationships between them: intersections, alternate routes and spurs; aliases and alternate names; concurrencies (ie where the path shares some section of trail with another path); and the ever-popular “for more information see also.” I can add more relationships as I see the need.

For the second task, I think I’ll want to use the maps in the book, even if it’s just to trace over. That means scanning the maps in some way without damaging the book (I may just photograph them with my phone), then georeferencing them and saving the result somewhere. I suspect I’ll end up with a pretty big set of raster data, and now need to consider how to do to organize it. Rasters are not something I have much experience with, so there will likely be a learning curve involved — I think I may put them in the database in some way.

In terms of original research, my plan at the start was to use Indian Paths of Pennsylvania as my sole source — my project would be the book translated into GIS form — but I’ve already used other sources (e.g., Wikipedia, town websites) to flesh out histories and descriptions, and I think I’m seeing the book now as a condensation of other info, even if it’s just the author’s research files; the info in the book may have been “condensed,” oversimplified, to the point of vagueness, more exact versions of the trail descriptions exist somewhere. I really don’t want to get into actual archival research for this though, and it may just end up that if I dig really deep, I’ll only find that all the primary information is pretty vague too…

Finally, that third task has me a bit stuck: I’d originally planned to only record the endpoints of the paths (as given in the book), and even called my points table “termini.” Now I’m looking to enter things like landmarks, known trail junctions — there are several places called “the parting of the ways” — towns that aren’t actually endpoints, and all sorts of other points of interest. I painted myself into a corner with that “termini” name, and even if it wouldn’t be too much of a stretch to stick these other points in the termini table, I may add a separate “points of interest” table, or at least add a column to the termini table to designate non-endpoints. (Maybe I’m overthinking this, I could easily find the endpoints and non-endpoints within a mixed table, just by using simple searches.)

This kind of gets to what I want my native paths data to eventually look like. Many of these landmarks and points of interest are likely to be nodes in a trail network (just like the endpoints), so I am back to thinking they should all be part of the same table. I also expect that I’ll have trail segments from node to node, and my final paths will be lists of trail segments from start point to end point, so my final product will not look much like what I’m building now.

Well, I still have some time to think about it.
tech talk 📎and tagged GIS native paths PostGIS QGIS
Going Mobile

Posted on June 13, 2022 12:30 PM by Don
I downloaded the route data for our upcoming trip from Adventure Cyclist as a GPX file, and I also got the Adventure Cycling route app and downloaded the trip there as well. The trip data consists of the route itself (as a path) and the locations (points) of recommended places for food, lodging, bike repairs, and so on; the data is the distillation of their collected wisdom and experience for any given ride. It’s been a goldmine of information for planning our trip, and knowing that it’s based on the knowledge and experience of other travelers makes me a bit more comfortable relying on it.

The GPX is what I got first, and I plan to put the relevant parts of it on my Garmin for the trip, but I opened it in QGIS first because of course I did…

There were six GPX files representing the bike routes as GPX tracks — the main route, a spur to Banff, and a gravel-bike alternate near Fernie, one file for each route in each direction — and one other file with all the services as GPX waypoints. The tracks didn’t contain much information, though the trackpoints themselves did have elevations; the meat of the data was in the service waypoints, and it was interesting to see what information Adventure Cyclist put in for each feature, and how they fit it within the confines of the GPX format.

I don’t usually use GPX except for when I move things to and from my Garmin, so I don’t know too much about it but my impression is that it is highly structured, and, unless you use “extensions,” which not every application will honor or display, it’s a bit rigid in what it can hold — my data has almost always been square pegs, and GPX is all about round holes…

What Adventure Cyclist did was to stuff a lot of the information into the “name” field using initials and abbreviations (“R, CS, M” for restaurant, convenience store and motel, for instance) along with the name, and put the telephone number (along with some travel directions; these were the only contact info given) into both the “comment” and the “description” fields, possibly because different GPS software would look at different fields.

I took this data and massaged it for my own purposes. I also got the app and downloaded the route there. It had the same data, in a very readable and actually beautiful form; it also looked like they maybe used the same GPX data, maybe in another file format but the same structure, and massaged it on the fly. Very interesting…

This got me thinking about my trail amenities map:
- Do I have too much contact information, or not enough? (Answer: my contact information is just right.)
- Am I presenting the information well, especially for use on a phone? (Answer: not really.)
- How should I represent my amenities data, especially for places that have multiple amenities — hotel with restaurant, convenience store with bathroom? (Answer: this will require a whole lot of rework, but I think I should show multiple amenities as multiple symbols in a popup.)
So I am now rethinking my own map based on what I liked about the Adventure Cycling map, but in the meantime I compromised and added some information-massage code of my own, to turn my phone information a clickable link: click on the number (on your phone) and your phone will make the call.

It’s a start.
tech talk 📎and tagged bicycle GIS
Quick And Dirty For The Win

Posted on May 27, 2022 9:25 AM by Don

Well, that KML problem didn’t get to marinate for long: I decided to just have a python script write the output as a generic text file. It’s a pretty dumb program, but it worked like a charm. I tested it by making maps of all my remaining unofficial rides, maybe half a dozen in all. The process took less than a minute per map, thus saving me about an hour of work, at the cost of only a month of coding… Hey but it was an interesting learning experience!

It was also interesting to see the difference between the maps of official rides and the less-developed, unofficial ones — the new ones seem so sparse, with missing lunch spots and fewer access locations. It’s clear, from seeing them on these maps, that at least some of the new rides aren’t ready for prime time.

Anyway, yesterday was a bike rest day for me, and I got a bunch of yardwork and other chores done. Today and tomorrow look like rain.

tech talk 📎and tagged GIS
Another Day’s Useless Energy Spent

Posted on May 23, 2022 10:28 AM by Don
I make online maps of Shawnee’s Road Scholar bike rides, to help the sag-wagon drivers find things like drop-off and pick-up locations, lunch spots, and other places where they’d meet up with the riders. (I did this after a driver searched for directions to the “D&L trailhead in White Haven,” but there were two trailheads in the town and he went to the wrong one. No biggie, the trailheads were less than a mile apart, but this kind of ambiguity looked like a problem that needed solving.) My first iteration was just a list of all locations in Google Maps’ “My Places,” but now I have a separate, full-blown map (again in Google Maps) for each ride, showing the route itself as well as marked points for the drop-off, pick-up, lunch, and all practical access locations, each with appropriate symbology.

Here is an example for the Lehigh Towpath ride:

(You can view the legend by clicking the button to the left of my name at the top of that map, or view the bigger map by clicking the square at top right.)

I use Google Maps because that’s what the drivers use on their phones for navigation, and this makes it easy for them to navigate to any of my marked points without having to leave the Google Maps universe. To make the maps available, I generate QR codes for each map link; the driver can get the map and open it almost effortlessly with their phone’s barcode scanner. They don’t always use it — they weren’t as dazzled by my geeky brilliance as I was hoping — but hey I was impressed.

The map data comes from a QGIS project I keep for all of the rides we do, or might do, as part of Shawnee’s Road Scholar bike program. My workflow is: select the route for the chosen ride and add it to a KML file, then select the relevant points of interest and append them as another layer in that file. (KML is “Keyhole Markup Language,” Google’s XML-based geographic data format.) I then import the KML file into Google Maps, and I’ve got my route and points on their own new map. A little bit of styling and I’m done — easy peasy! It takes about 5-10 minutes for each map. (The QR codes are generated separately using Jasper Reports.)

And Here My Troubles Began

I have these maps made for all the rides we’ve done, but I have a whole bunch of other rides, either unused or not fully developed, which don’t have their own maps yet, so there are more maps to do. Making each individual map is pretty easy, but I don’t always get the symbology (or the data associated with each feature) exactly the same each time, and I think that automating the task will help with consistency; also, if I start creating a lot more potential rides, the automated map-making could be a potential time saver.

To automate my workflow, I decided to break it down into three basic tasks:
1. extract the data I’d need from the database,
2. use that data to generate a KML file, and
3. import the KML file into Google Maps. (OK, this last step is still pretty manual…)
This is the project I took with me to Colorado, working on it whenever we had an hour or two of downtime — something I can get away with when I vacation with a bunch of readers…

The first step was pretty straightforward, though the SQL (especially for the points of interest) grew into a behemoth, with multiple CTE’s and a whole lot of joins to get things the way I wanted — setting it up just so took a bit of work, but like I said, the task was straightforward.

The next task was more convoluted, and involved a learning curve — I decided to use Python to generate the KML, and this meant using several new packages: geopandas, which makes working with data a bit easier, and simplekml, which, well, does the KML part. Geopandas was pretty easy to use, but simplekml was quirky, maybe a bit too simple, and in the end I actually had to modify the source code to make it work right.

But, in the end I got my KML file (technically, a KMZ file, which included the icons I’d need for styling), and I was able to import it to both Google Earth and Google Maps — the only problem was that Google Maps ignored all my styling! (It worked fine in Google Earth.)

So, somewhere between steps 2 and 3 there was a glitch, a mismatch between what “styles ” meant for KML, and what Google Maps actually used. I spent a good part of the last few days looking at different versions of KML and KMZ files, seeing what worked, or didn’t work, with Google Maps, and I think I have an idea of what’s going on:

Google Maps — or at least, the “My Maps” part of Google Maps — ignores most of the style information, like “use this icon and color,” and seems to code the style information, what they actually use, directly within the style ID’s: a style ID of “icon-1567-9C27B0-normal” would mean “use specific icon #1567, with color #9C27B0 for the normal (un-highlighted) icon.”

So OK, first mystery solved, but what does that get me? I suppose I can use Google’s hard-coded style ID’s, but simplekml does not allow you to specify ID’s or change them, so it looks like that package isn’t as useful as it needs to be, and other Python KML packages seem… not simple.

My next steps? My original plan was to code the entire task in SQL, and I may fall back on that, or I may do something using PHP. I might even take some text-based approach (like I did with geojson a few years ago), since KML is at bottom just a text file. But the truth is, I’m really just going to put this away for a while and let it marinate.
tech talk 📎and tagged bicycle GIS
Indian Paths Update

Posted on March 20, 2022 8:13 PM by Don

I’m still cruising along on this project: I’ve got just over 110 paths in the database (of maybe 150 total), about 130 towns or other path endpoints, and 92 motorway routes. I have added no actual paths yet, but the motor routes are starting to look like a real network.

My current plan is to parse the book three times: once (this time around) to capture the paths, path endpoints, and motor routes; once (the final, and probably most difficult, round) to try and develop the original foot paths; and in between these rounds I will go through the paths/chapters and try to capture all the cross-references between them.

I noticed early on that there were a lot of things like “this is an extension of that other path,” “so-and-so path also goes by this name,” “this path intersects with these others,” and such like throughout the text; the path descriptions are festooned with these kinds of cross-references.

(I also finally picked up on the fact that paths without a path/chapter number are not actually part of the previous chapter, but are basically “chapterless,” just the next path name in alphabetical order. They act sort of as placeholders, the alternate names of other, more fully fleshed-out paths — that is, more cross-references.)

I want to hold on to all this cross-reference information in my database, so I set up a bridge table to work something like a resource description framework, with the referring path as the subject, the referenced path as the object, and for the predicate I would use a description of the relationship type, such as “[subject path] is a continuation of [object path],” “[object path’s name] is an alternate name for [subject path],” “for more info see [object],” and so on. I now have all of this set up and ready to go, but before filling it in with information I want to have all the paths already in the database. Soon…

Meanwhile, the details, of each path or town I add, have all been real eye-openers. I often do a little internet research on each town, or village, or Native name I come across, and each bit of info, each piece of the puzzle is another portal into that era.

looking back tech talk 📎and tagged GIS native paths PostGIS QGIS
New Bike!

Posted on March 19, 2022 11:44 AM by Don

My writer’s block continues, but I thought I’d pop in to announce that I finally got my new bike, a Kona Sutra SE. This is a bike seriously designed for touring: fenders, a triple ring, (mechanical) disk brakes, bullet-proof tires, bar-end shifters, a Brooks saddle, and a rear rack, with frame attachments for a front rack as well — and oh yeah, steel is real, baby!

My New Kona Sutra SE

I picked it up Thursday night from Cutters Bike Shop, where it had just arrived; they knew I wanted one, and they called a week or so ago to say one was on the way, and then it was there at the shop, and so was I… I gave it a test ride with panniers, then rode it home — it doesn’t fit on the roof rack. Yesterday was its real maiden voyage, a bakery ride up to Nazareth.

tech talk the sporting life 📎and tagged bicycle
Native Paths Update

Posted on January 18, 2022 10:02 PM by Don

I kept at it, and am now about a quarter of the way through the trails — the motorway parts, at least — in my Indian Paths of Pennsylvania project. I have a pretty good idea of how the book is organized now, and came up with a pretty decent workflow that gets me through a single path in just under an hour. I do one or two a day. It’s pretty easy to get absorbed, trying to find the tiny old roads and landmarks based on their descriptions in the book, and I’ve been totally sucked into the history of that Colonial-Revolutionary era. (I picked up Mason & Dixon again, since it goes right through the middle of that time and place.)

I also think there will be some epic rides this summer, based on these routes — I’ve been drooling over some of the scenes I see in Google Street View.

day by day looking back tech talk 📎and tagged GIS native paths
N+1

Posted on January 18, 2022 9:42 PM by Don
I knew it would come to this sooner or later — I’m in the market for a new bike. There’s nothing wrong with any of my other bikes, but none of them are really touring bikes, and I’ll be joining Anne on a trip this summer. Fully self-supported, front and rear panniers, camping in the Rockies — the works. (I hated “touring” every time I’ve ever done it, but I suspect that that’s really an equipment issue — I do enjoy our towpath “bikepacking” trips.)

Anyway, I’ve been doing some research, and what I think I need is:
- a bike with a “touring” frame — low bottom bracket, longer wheelbase and chainstays,
- a triple chainring,
- mechanical disk brakes, and
- a few add-ons that would be nice if they came stock, like racks and fenders.
There are a few bikes that I think might fit the bill, ones I found in several “best touring bike” listicles, namely the Trek 520, the Kona Sutra SE, and the Surly Disk Trucker. The Trek looks to be impossible to find anywhere right now, and the Surly only seems available (sight unseen) via the Internet, but I found a Sutra SE at a local bike shop, and it looks like my size. I need to do a bit more search and research — I’m also looking among the world’s used bikes — but I think I can already see how this will shake out.
tech talk the sporting life
Foiled Again!

Posted on January 5, 2022 2:35 PM by Don

I have a love-hate relationship with Paul A. W. Wallace’s Indian Paths of Pennsylvania. I love reading the individual chapters on each path — their descriptions, and the accounts of them in the letters and diary entries of early Colonial explorers, but any hard look at the specifics and the trails themselves become frustratingly vague. This is all the more frustrating because the information looks specific and authoritative enough, until you take that close look…

Some of this is because the original information is vague — nobody was tracking their steps with a GPS back then — so the actual trail location is not perfectly known, and partly it’s because the trails themselves are long gone (though some are at least partly followed by modern roads), so it’s hard to search them out without trespassing, but there also just seems to be some missing ingredient needed to define a trail network.

A few years ago I thought that this last part could be solved with a little bit of GIS detective work, so I started a QGIS project to define the trails and see about building a network, but I sort of ran out of steam — I basically foundered on the vagueness of the trail descriptions. I did one or two in the Lehigh Valley, and realized that the sleuthing needed was a lot more laborious than simple data entry, and the project languished after those first few paths.

I was thinking about all this again recently, and realized that there is a critical first step I ignored: the book serves primarily as an automotive guide, with detailed instructions for driving in the vicinity of each path. I also thought that if I broke the task down to a set of database tables, I could link these auto routes to their various paths and book chapters . (Some trail chapters actually describe multiple trails and subtrails, while some motorway descriptions continue across multiple chapters, so many-to-many relationships abound but that’s what databases are for. Furthermore, most of the trail chapters have a start and an endpoint, yet more data I can use to cross reference.)

This scheme fell apart within the first few trails. The very first trail, the “Allegheny Path,” has Philadelphia as the start point and “Pittsburg and Kittanning” as the endpoint — so which is the endpoint? Apparently neither, because the trail is only described as far as Harrisburg; the “Allegheny Path” chapter ends with references to several other trails (different chapters, in other words) heading West from Harrisburg as possible continuations. So OK, I can deal with this: my endpoints are really Philadelphia and Harrisburg, and I’ll stuff the rest of the info into my “description” column. (There is a second path listed in that first chapter, but it is little more than a historical aside and a reference to another path/chapter. This is going to get tricky.)

Luckily the motorway for the Allegheny Path is easy to follow. I used an open routing plugin to follow along a bunch of control points, and voilá I had my linestring. This ain’t so bad!

The very next chapter, I ran into motorway difficulties: the route description made no sense. Either the routes were not prepared with adequate ground-truthing (unlikely, though I was starting to feel uncharitable), or the roads (and their designations) had changed at some point in the 55 years since the book came out. This seems the more likely explanation, since I-80 goes right through the area in question, was only finished in 1970, and probably changed a lot of things in its wake. I actually found the Wikipedia article on the Bald Eagle Creek Path more useful.

So I’m back to deciphering and making judgement calls rather than strictly converting the information from one format to another, even for these road descriptions. I didn’t expect this project to be done in an afternoon, or even a week or so, but “going to take forever because I’m not really working on it” is now closer to my expectation.

(Note: I found that someone already took these paths and put them into a GIS, but it’s on PA-Share and that’s proved difficult to work with — and deliberately limited, unless you pay — so far. We’ll see…)

looking back tech talk 📎and tagged GIS native paths

And Here My Troubles Began