• Category Archives: tech talk
  • Computers and programs, maps and GPS, anything to do with data big or small, as well as my take on the pieces of equipment I use in other hobbies — think bike components, camping gear etc.

  • Map Update

    I finally got around to riding the southernmost part of the D&L about two weeks ago, riding from Yardley to Bristol and back, and ground-truthing the trail and access points. I can scratch that off my bucket list, and I don’t see any reason to ride south of Yardley again — this trail section, especially the Morrisville-Levittown portion, is nowhere near as nice as other areas — but I got what I needed to finish my trail amenities map. I may do a little exploring on the Black Diamond north of White Haven just for the sake of completeness, but I think I now have everything I was looking for.

  • Every Week Is Infrastructure Week


    Speaking of Scott & Kellyn… I went looking for new parts for my road bike, but found instead that the component supply chain is still completely disrupted by COVID, and I may need to wait a long time to replace that freehub. So I went back to CAT, where Scott helped me find a wheel compatible with my shifting system, and then guided me through rebuilding it: cleaning out and re-greasing the freehub, ditto the bearings, and putting it all together with my old sprockets. It looks a bit weird on my bike (even though it’s also Campagnolo, it’s decades older than my other components), and I am still on borrowed time with my other worn drivetrain components, but the bike is rideable again. Thanks Scott!


    My laptop started making a tiny creaking sound when I opened it, and the other day I noticed that the right hinge had become detached in some way. Fixing this, according to YouTube, is an easy enough home repair, and System76 says I can send the laptop back to the factory to get it fixed, but I think I’ll split the difference and bring it to some local repair place.

    Before I brought it somewhere though, I wanted to make sure I had my data backed up. (I used to do backups regularly but, ironically enough, my old backup drive crashed a while ago. So, step one was to get a new drive.) I picked up a 2 TB USB hard drive at Staples, then spent a little time fixing it up the way I wanted it, re-formatting it with a more Linux-friendly filesystem and replacing the drive’s icon with one I like better. After that I just copied my home folder over to the new drive and called it a backup. (I also needed to back up a few other things, like global configuration files, but that was just more file copying.)

    My last remaining backups were the databases. These required a few special programs that I hadn’t used since my last major upgrade, and I discovered in the moment that they were not configured correctly; in fact, my whole database system was a misconfigured hodgepodge… I had to backtrack a bit and get my system in order, which meant doing a bit of learning first, but I eventually got the whole thing running and even managed to automate the process.

    The next step is to bring the laptop to a repair store.

  • More Ball Bearing Woes

    Morning weigh-in: 175.5#, 11.5% BF

    I took my road bike over to the CAT office yesterday and took apart my freehub. What a mess! There are a whole lot of very small ball bearings in there, and the bearing cage basically fell apart — sound familiar? The slop in the freehub was basically caused by missing BB’s but the entire inside was trashed anyway. Maybe it can be serviced, but I think it’ll be better to just get a new one, if I can find something compatible with that wheel. I will also be replacing the cassette, the chain and the front rings, so maybe I should just get a new wheel as well.

    One good thing: while I was at CAT, Scott was able to get my pedals off, so now I can replace them.

    Today we had to run a bunch of errands, and when we got home we were hungry and tired and crabby. So, we hopped on the mountain bikes and rode to Freemansburg, maybe four miles away on the towpath. We got takeout at a place called Cherry’s Caribbean Palace (curried goat, mac & cheese, rice and plantains) and ate at picnic tables by the restored Mule Barn. It totally changed the afternoon’s vibe. I’ve been meaning to check that place out (it’s on my amenities map), and I saw online that they just got recognition as one of the best Caribbean food places around, so this was a pretty good opportunity to do some food exploring. Two thumbs up.

  • New Phones

    Morning weigh-in: 175.5#, 15% BF

    Anne and I just replaced our phones, and in the nick of time too: they were ancient, and falling apart (literally my power button just fell off), and the network they worked on was about to be sunsetted…

    We both got the new Samsung Galaxy A52, which is not top of the line but it still blows our old phones away, thank you very much, and it costs about half of what the top models go for. We bought them online, unlocked, and got new SIM chips from our phone company, and then we transferred our numbers and data to the new phones. Easy enough process, and everything seems to be working OK; now all we have to do is get used to them.

    Rain Comes At You Fast: We had plans to head up to the Finger Lakes for the 4th of July, Ben and Candace (and their new dog) and us, camping and cycling from tomorrow through the 5th. But the forecast has been getting more and more dire as our trip approached, and so we got together in a Zoom chat last night and changed our plans: they’ll be coming to visit us for part of the weekend, and we’ll do a hike if the weather allows.

    I got in another sweltering towpath ride this afternoon, super hot even though I was moseying, but it was beautiful out there. Still, I had the trail mostly to myself, except for a few kids at the swimming holes.

  • The Wheat From The Chaff

    I’m not sure if this is going to rise to the level of “new GIS project,” but I have been playing around a lot lately with the local transportation authority’s GTFS feed — where GTFS stands for “General Transit Feed Specification,” a standard for publishing public transit information on the Internet.

    These feeds are like a cross between spreadsheets and database tables, and by a judicious massaging of the data you can extract bus stop and route information. Unfortunately, that massaging is a real necessity: the specification is built to convey a lot of information, and to cover a lot of different transit situations, so there’s no simple route-and-stop information — it’s buried in cross-references and spread across multiple tables. All this extraction and data crunching is fairly straightforward though, and there are even tools to automate the process (I use a QGIS plugin).
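
    To make the cross-referencing concrete, here is a minimal sketch of that kind of extraction in plain Python. It is not the QGIS plugin's method. The file and column names (routes.txt, trips.txt, stop_times.txt, stops.txt and their ID fields) come from the GTFS specification, but the rows are made-up stand-ins, and the function names are hypothetical.

```python
import csv
import io

# Made-up stand-ins for four GTFS files; a real feed ships these as .txt files in a zip.
ROUTES = """route_id,route_short_name
R1,101
R2,102
"""
TRIPS = """route_id,service_id,trip_id
R1,WK,T1
R1,WK,T2
R2,WK,T3
"""
STOP_TIMES = """trip_id,stop_sequence,stop_id
T1,1,S1
T1,2,S2
T2,1,S1
T2,2,S2
T3,1,S2
T3,2,S3
"""
STOPS = """stop_id,stop_name
S1,Main St & Broad St
S2,Sand Island
S3,Fourth St & New St
"""


def read_table(text):
    """Parse one GTFS table (CSV text with a header row) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def stops_by_route(routes, trips, stop_times, stops):
    """Follow stop_times -> trips -> routes to list the stop names on each route."""
    route_name = {r["route_id"]: r["route_short_name"] for r in routes}
    stop_name = {s["stop_id"]: s["stop_name"] for s in stops}
    trip_route = {t["trip_id"]: t["route_id"] for t in trips}
    result = {}
    for st in stop_times:
        route_id = trip_route[st["trip_id"]]
        result.setdefault(route_name[route_id], set()).add(stop_name[st["stop_id"]])
    # Sort the names so the output is stable from run to run.
    return {name: sorted(names) for name, names in result.items()}


ROUTE_STOPS = stops_by_route(
    read_table(ROUTES), read_table(TRIPS), read_table(STOP_TIMES), read_table(STOPS)
)
```

    The point of the sketch is the chain of lookups: a stop only connects to a route by way of a trip, which is why the route-and-stop information cannot be read off any single table.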

    Or the process would be straightforward, if we were not dealing with LANTA. These feeds are updated periodically, and about a year ago the new LANTA feeds sort of devolved into chaos, with extra routes showing up that had no real world connection, odd use of abbreviations for bus stop names (abbreviations are sort of frowned upon, for what ought to be obvious reasons), and their cross-referencing system becoming unnecessarily complex. It was hard to figure out what was going on — I thought at first that it was my analysis software mangling the data, but no it was them.

    Well, they’ve been working through a huge revamp of their entire bus route network, so maybe that was the source of some of the bogus data. The new routes and schedules went into effect on June 21, and an updated feed followed soon after; I downloaded the new one and crunched the data, and the garbage was all still there! But I noticed that in among the old chaos was a new and much cleaner set of data, valid starting on the 21st, showing the new bus routes and the correctly-named bus stops. So now I do a double extraction, first massaging the feed into a useful form, then extracting from that the new, valid and cleaned-up route data. Voilà!
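
    One plausible way to do that second pass is to keep only the routes whose service calendar starts on or after the changeover date. The sketch below assumes the validity dates live in the feed's calendar.txt (some feeds use calendar_dates.txt instead), and the rows, the year in the cutoff date, and the function name are all made up for illustration. It is not necessarily the filter used here.

```python
import csv
import io

# Made-up rows: one service calendar from before the revamp, one from after.
CALENDAR = """service_id,start_date,end_date
OLD_WK,20200831,20210620
NEW_WK,20210621,20211231
"""
TRIPS = """route_id,service_id,trip_id
OLD_101,OLD_WK,T1
101,NEW_WK,T2
102,NEW_WK,T3
"""


def current_route_ids(calendar_text, trips_text, cutoff):
    """Return route_ids that have at least one trip on a service starting on or after cutoff."""
    calendar = csv.DictReader(io.StringIO(calendar_text))
    # GTFS dates are YYYYMMDD strings, so plain string comparison orders them correctly.
    valid_services = {
        row["service_id"] for row in calendar if row["start_date"] >= cutoff
    }
    trips = csv.DictReader(io.StringIO(trips_text))
    return sorted({t["route_id"] for t in trips if t["service_id"] in valid_services})


NEW_ROUTES = current_route_ids(CALENDAR, TRIPS, "20210621")
```

    With the sample rows, only the two routes attached to the newer calendar survive; the leftover route tied to the old calendar drops out.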

    I have some vague plan to add these bus routes to OpenStreetMap, but that’s a big undertaking, and I would prefer to rely on eyewitness ground-truthing (i.e., riding the bus) rather than on a data set, which means even more work. For now I’m content with just having got the damn data.

  • Got A Screw Loose

    My friend Greg maintains that if your bike has a creak, it’s best to leave it alone, because if you do manage to exorcise that one, another creak will come take its place. Nonetheless, I decided yesterday to deal with a persistent creak down near my bottom bracket on the Santa Cruz.

    This creak started (true to form) just after I’d dealt with a bit of play that had developed in one of my shock mount bushings. That was an easy enough fix, once I got the parts from the bike store, but as soon as I solved that (finally, and after more than a month of annoyance), up popped the new creak.

    I’ve had this creak before, and if it’s the same one it just means cleaning and tightening the bottom bracket and crank threads. I girded my loins with some YouTube how-to’s (I can never remember what type of crank I have on which bike, or how to extract it), went out to start the process, and — the bolt holding the crank on is loose, like loose loose. I tightened it back up, thinking that might be all that was really wrong. Bullet dodged!

    I did a towpath ride this morning to Northampton, and the creak, if anything, was worse.

  • Another Bite At The Apple

    I finished those online courses on SQL. There was a bit of cognitive drift into “other data models” (i.e., JSON and XML) towards the end, but all in all it was a positive experience: I really did learn a lot, and more important, I’m now very comfortable using these new things I learned.

    (The XML foray was an interesting thing in its own right: these courses — and this is probably true of most “free courses” on the internet — are about 10 years old, which translates to “about the time XML’s heyday was starting to fade.” The course touched on XQuery and XSLT, but there was a definite “we suspect you won’t need to know this in the future” vibe about the lessons, and here in the future I had a hard time even finding a way to run the demos, and much of my internet research consisted of reading articles with names like “Why Should We Care About XML Anymore?” I eventually resorted to a bash script — brutishly practical, my workhorse go-to — to invoke the bloated, oh-so-elegant Java classes I found online to perform the queries.)

    Anyway, I was pretty impressed with myself, both for sticking it out and for actually learning something from these courses. I think that the “internet course” format is something I take to pretty well, at least the ones I found at edX, and in fact something I enjoy spending my time with. Naturally enough, I decided to continue by taking another set of courses, and this time I’m giving R another go. I’m about halfway through the third course, in a series on data analysis put out by Harvard, currently working through visualization (graphs) and ggplot, and once again I’m doing well and loving it — so far. Only time will tell.

  • Organization Man

    It wasn’t quite new-year’s-resolution level, but I’ve been having a sustained burst of productivity lately, or if not productivity then at least activity: I have been much better about cello practice; I’ve been more on top of bills, and housework, and exercise (i.e. morning calisthenics, not the biking); I’ve been making progress on learning SQL; and I’ve even chipped away at the greater part of my Flickr photos backlog. And I’ve managed to get all this done, to become my new, more organized self, through the use of my simple, lowly to-do list.

    I’ve written about my to-do list before. It’s basically just a text file; in the morning, or sometimes the night before, I’ll write what I want to get done at the top of the file, then as the day progresses and I do things I can mark the tasks done. If I don’t get to something it’s no big deal, it’s just not marked done and I can add it to the next day’s tasks (or not), but at any idle moment during the day I can see at a glance what I could be productive about, and the process gives me a chance to think about what I want to accomplish, what I ought to be doing, what might be more or less urgent, etc, for any given day. I also add specific appointments (a doctor visit, an afternoon ride with someone) to the end of the list, so I remember to budget my to-do tasks around them. The structure is pretty simple:

    Sunday 1/17/2021
    exercise (done)
    dishes (done)
      phone (done)
      gas (done)
      electric (done)
    study sql
    blog (started... running notes go here until it's marked done)
    @1:00 group road ride (done)
    Saturday 1/16/2021
    dishes (done)
    exercise (done)
    cello (done)
    study sql (done)
    flickr (done)
    work on bikes

    And so on.
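
    Because the file is plain text with a regular shape, it is also easy to process. As a minimal sketch, assuming day headers look like "Sunday 1/17/2021" and finished tasks end in "(done)", the hypothetical function below collects whatever is still open under each day, which is the list that could be carried over to tomorrow. The sample text is a shortened version of the list above.

```python
import re

TODO_TEXT = """\
Sunday 1/17/2021
exercise (done)
dishes (done)
study sql
blog (started... running notes go here until it's marked done)
@1:00 group road ride (done)
Saturday 1/16/2021
dishes (done)
cello (done)
work on bikes
"""

# A day header is a weekday name followed by a month/day/year date.
DAY_HEADER = re.compile(r"^[A-Z][a-z]+day \d{1,2}/\d{1,2}/\d{4}$")


def unfinished_by_day(text):
    """Map each day header to the tasks under it that are not marked (done)."""
    days = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if DAY_HEADER.match(line):
            current = line
            days[current] = []
        elif current is not None and not line.endswith("(done)"):
            # Strip a trailing parenthetical such as running notes.
            days[current].append(re.sub(r"\s*\(.*\)$", "", line))
    return days


OPEN_TASKS = unfinished_by_day(TODO_TEXT)
```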

    (I also keep a separate file, a spreadsheet that I call my “food diary,” where I keep track of everything I eat each day, but that does not get used nearly as much as the to-do list. It has a different pedigree, being something I saw once about behaviorist approaches to dieting, and has been much less successful in keeping me engaged enough to use it.)

    I find that I am more energetic in the late morning or early afternoon, but that may also be because the morning is when I’m selecting my day’s tasks, and therefore thinking more about them, rather than it being an issue of afternoon energy levels. The one thing that does sap energy levels — the thing that wrecks any given day’s remaining plans — is biking. Any day with a longish bike ride, nothing seems to get done after the ride…

    Anyway, here’s a product of one of my previous to-do lists: my first cycling video, posted on YouTube. The raw GoPro video quality is very high and the files are huge, so I spent some time learning how to process the clip into a format with reasonable values for both quality and file size. It looked great, but YouTube has taken to throttling quality to conserve bandwidth during this COVID-level use era. Here it is:

  • New Project

    I’m coming to a close on the Trail Amenities project — rather, I’m running out of things to do with it — and I’ve been feeling a bit tired of all things GIS lately, so I’ve been looking for a new project, something that will teach me some new software or skill.

    My first thought was to get a better grasp of R, especially R graphics, and to do that I’d work with CDC COVID data. Early in the pandemic I was doing a lot of downloading, analyzing and graphing of case and death data, in a sort of “play along at home” mode. In a way it helped me get an emotional handle on things: in analyzing it myself I felt I regained a bit of control over the situation. At the time I was using the LibreOffice Calc spreadsheet.

    Anyway, when things started getting bad again I returned to looking at the data, and I used it to explore R, but I found I didn’t have the motivation or focus, though I did manage to get more of a handle on graphics. Part of my loss of focus was that the data magic was gone — I had COVID fatigue — but part was also that I constantly bumped up against realizing that the data manipulation would have been easier for me in SQL.

    I switched from the COVID data back to working with my trail amenities data, which gave me a chance to practice accessing my database from R, but in the end I realized I’d rather play with PostgreSQL than R so I decided that my new project would be to really learn how to work with PostgreSQL.

    And that’s what I’m doing. I found a series of free online courses on databases and SQL, from Stanford via edX, and I’m working my way through them (I’m currently on the second course). This should give me a good handle on using SQL; the other half of what I want is database administration, which so far I’ve just been able to pick up in pieces here and there.

  • Another Way To Look At It

    In my database of towpath-accessible amenities, my original definition of “accessible” was simply “within a half mile, by paths available to a cyclist, of an access point.” This seemed pretty reasonable — people don’t really want to travel more than a few hundred yards off the trail to get a bite or whatever, and beyond that point an amenity isn’t really part of the trail ecosystem anymore. This break-off point is more or less arbitrary, so I chose a half mile as a nice, round and fairly inclusive distance.

    As an example of what this might look like, here is a view of the Sand Island trailhead, with accessible amenities selected using this simple definition:

    map of bethlehem
    Amenities within a half mile of Sand Island

    You can see that there is not much available immediately near the trailhead, and then further north there’s a hotel (blue square) and a bunch of food/drink amenities (yellow circles) on Main Street’s “restaurant row.”

    I saw two issues with this. One is that people will probably be willing to go a bit further from the trail to get to lodging or a bike store, but these places are not shown, and the other issue is that the arbitrary break-off point comes in the middle of a dense clump of amenities — this is the case with “restaurant row” — and it seems silly to include one restaurant just under the half-mile cutoff while excluding the four next door, just beyond it.

    So (to address the second issue) I expanded my definition to include clusters: if there is a dense group of amenities, and at least one of the amenities in the group is within a half mile of the trailhead, all of them in the group are considered accessible.

    I also included any lodging or bike store within a full mile of the trailhead (to deal with the first issue), and now the map looks like this:

    map of bethlehem
    Amenities with a more expansive (i.e., “clustered”) definition of accessible.

    The map now includes the rest of the tourist/nightlife area near Broad and Main Streets, as well as several more hotels and two bike stores (the orange diamonds). This is the version of “accessible” my amenities map uses.
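
    The rules behind this version are simple enough to write out. The sketch below is an illustration in plain Python, not the database function actually used: it assumes each amenity already carries its path distance from the trailhead and the id of the dense group it belongs to (how those groups are found is left out), and every name and number in it is made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Amenity:
    name: str
    kind: str                # e.g. "food", "lodging", "bike_shop"
    miles: float             # path distance from the trailhead
    cluster: Optional[int]   # id of the dense group it belongs to, if any


def accessible(amenities, near=0.5, far=1.0, far_kinds=("lodging", "bike_shop")):
    """Apply the three rules: within `near`, a far_kind within `far`, or in a seeded cluster."""
    # A cluster is "seeded" when at least one member is within the near cutoff.
    seeded = {
        a.cluster for a in amenities if a.cluster is not None and a.miles <= near
    }
    chosen = []
    for a in amenities:
        if (
            a.miles <= near
            or (a.kind in far_kinds and a.miles <= far)
            or a.cluster in seeded
        ):
            chosen.append(a.name)
    return chosen


SAMPLE = [
    Amenity("Cafe A", "food", 0.45, 1),
    Amenity("Cafe B", "food", 0.55, 1),
    Amenity("Cafe C", "food", 0.60, 1),
    Amenity("Hotel D", "lodging", 0.90, None),
    Amenity("Bike Shop E", "bike_shop", 0.80, 2),
    Amenity("Diner F", "food", 0.85, 2),
    Amenity("Pub G", "food", 1.20, None),
]
ACCESSIBLE = accessible(SAMPLE)
```

    In the sample, Cafes B and C get in only because Cafe A pulls their whole cluster in, while Diner F stays out: its cluster contains a bike shop, but nothing in it is within a half mile.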

    But should I be even more expansive in my definition? What if someone wants to find a restaurant near their hotel, or near a bike shop, and the hotel or bike shop is one of the outliers, nowhere near the other “accessible amenities” even if there might be places nearby? (This actually happens in Bethlehem, where there is a separate business district on the south side of the river.) I decided to explore this possibility.

    My first approach to a more expansive definition of accessibility was to expand my definition of an accessible cluster: if a group of amenities contains anything considered accessible, like an amenity within a half mile of the trailhead, or a hotel or bike shop within a mile of the trailhead, then all amenities in the group are accessible. The new amenities can be seen in this map:

    map of bethlehem
    Sand Island amenities, including clusters near hotels and bike shops

    This was easy to implement: it just meant a tweak or two to the function I used for the original cluster definition. Now the Southside downtown is pretty well represented. By the way, here is the same map, but with clusters (accessible and inaccessible) shown:

    map of bethlehem
    Amenities near Sand Island, with cluster regions shown.

    The colored polygons in this map are the regions where clusters of amenities can be found. (Note the yellow polygon in the northeast corner. That represents a cluster where no amenity met any of my criteria, and no amenities are shown.)

    I worked out this definition of accessibility about the same time as the original clustered version, but didn’t use this definition for my map because it seemed a bit too expansive, with these second-order amenities — accessible not to the trailhead per se, but to other places that someone may want to visit — making the whole map too busy without adding much more value. After all, if someone wants dinner recommendations near their hotel, they can always ask at the front desk…

    But lately I’ve been thinking about a different approach to expanding my definition: what about the routes from the trailhead to the hotel (or bike shop)? Would it be useful to show the amenities along the way? Here is a map, just of the amenities that are within 50 yards of the shortest path to the accessible hotels and bike shops:

    map of bethlehem
    Amenities along the routes to hotels and bike shops.
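
    The "within 50 yards of the path" test comes down to a point-to-line distance. As a rough sketch, assuming the shortest path is already known and that everything is in a flat, projected coordinate system measured in yards, it could look like this in plain Python. The coordinates and names are invented, and a spatial database would normally do this step with its own distance functions.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line, then clamp to the ends of the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def along_route(route, amenities, max_dist=50.0):
    """Names of amenities lying within max_dist of any segment of the route."""
    found = []
    for name, point in amenities.items():
        nearest = min(
            distance_to_segment(point, a, b) for a, b in zip(route, route[1:])
        )
        if nearest <= max_dist:
            found.append(name)
    return found


# Invented route (trailhead to hotel) and amenity positions, all in yards.
ROUTE = [(0, 0), (0, 400), (300, 400)]
AMENITIES = {"deli": (30, 200), "pub": (150, 440), "diner": (200, 250)}
NEARBY = along_route(ROUTE, AMENITIES)
```

    Here the deli sits 30 yards off the first leg and the pub 40 yards off the second, so both make the cut; the diner is 150 yards from the nearest leg and does not.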

    This seems to strike a happy medium, inclusive but not too inclusive. Here are those amenities along the hotel/bike shop routes, with the more expansive version of the clustered amenities superimposed:

    map of bethlehem
    Amenities near Sand Island, including those in clusters near, or along routes to, lodging and bike shops

    I kind of like this approach, though I wonder if it’s more a CYA reflex: I don’t want hungry people to pass restaurants on the way to the bike shop, and not see them on my map. After all, this is still a second-order set of amenities, and even if it’s not quite as busy as my first attempt at a more expansive definition, there is a lot of overlap. I’ll be thinking about this a bit more…