• Category Archives: tech talk
  • Computers and programs, maps and GPS, anything to do with data big or small, as well as my take on the pieces of equipment I use in other hobbies — think bike components, camping gear etc.

  • More Ball Bearing Woes

    Posted by Don

    Morning weigh-in: 175.5#, 11.5% BF

    I took my road bike over to the CAT office yesterday and took apart my freehub. What a mess! There are a whole lot of very small ball bearings in there, and the bearing cage basically fell apart (sound familiar?). The slop in the freehub was caused by missing ball bearings, but the entire inside was trashed anyway. Maybe it can be serviced, but I think it’ll be better to just get a new one, if I can find something compatible with that wheel. I will also be replacing the cassette, the chain and the front rings, so maybe I should just get a new wheel as well.

    One good thing: while I was at CAT, Scott was able to get my pedals off, so now I can replace them.

    Today we had to run a bunch of errands, and when we got home we were hungry and tired and crabby. So, we hopped on the mountain bikes and rode to Freemansburg, maybe four miles away on the towpath. We got takeout at a place called Cherry’s Caribbean Palace (curried goat, mac & cheese, rice and plantains) and ate at picnic tables by the restored Mule Barn. It totally changed the afternoon’s vibe. I’ve been meaning to check that place out (it’s on my amenities map), and I saw online that they just got recognition as one of the best Caribbean food places around, so this was a pretty good opportunity to do some food exploring. Two thumbs up.


  • New Phones

    Posted by Don

    Morning weigh-in: 175.5#, 15% BF

    Anne and I just replaced our phones, and in the nick of time too: they were ancient, and falling apart (literally my power button just fell off), and the network they worked on was about to be sunsetted…

    We both got the new Samsung Galaxy A52, which is not top of the line but it still blows our old phones away, thank you very much, and it costs about half of what the top models go for. We bought them online, unlocked, and got new SIM cards from our phone company, and then we transferred our numbers and data to the new phones. Easy enough process, and everything seems to be working OK; now all we have to do is get used to them.

    Rain Comes At You Fast: We had plans to head up to the Finger Lakes for the 4th of July, Ben and Candace (and their new dog) and us, camping and cycling from tomorrow through the 5th. But the forecast has been getting more and more dire as our trip approached, and so we got together in a Zoom chat last night and changed our plans: they’ll be coming to visit us for part of the weekend, and we’ll do a hike if the weather allows.

    I got in another sweltering towpath ride this afternoon, super hot even though I was moseying, but it was beautiful out there. Still, I had the trail mostly to myself, except for a few kids at the swimming holes.


  • The Wheat From The Chaff

    Posted by Don

    I’m not sure if this is going to rise to the level of “new GIS project,” but I have been playing around a lot lately with the local transportation authority’s GTFS feed — where GTFS stands for “General Transit Feed Specification,” a standard for publishing public transit information on the Internet.

    These feeds are like a cross between spreadsheets and database tables, and by a judicious massaging of the data you can extract bus stop and route information. Unfortunately, that massaging is a real necessity: the specification is built to convey a lot of information, and to cover a lot of different transit situations, so there’s no simple route-and-stop information — it’s buried in cross-references and spread across multiple tables. All this extraction and data crunching is fairly straightforward though, and there are even tools to automate the process (I use a QGIS plugin).
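
    As a rough illustration of that cross-referencing, here is a minimal Python sketch (not the QGIS plugin's method) that walks the standard GTFS tables: routes.txt links to trips.txt by route_id, trips.txt links to stop_times.txt by trip_id, and stop_times.txt links to stops.txt by stop_id. The function names and the tiny inline sample data are made up for the example; a real feed would be read from the unzipped text files.

```python
import csv
import io
from collections import defaultdict


def read_table(text):
    """Parse one GTFS table (CSV text with a header row) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def stops_by_route(routes, trips, stop_times, stops):
    """Return {route_short_name: sorted list of stop names} by chaining the joins."""
    stop_names = {s["stop_id"]: s["stop_name"] for s in stops}
    route_of_trip = {t["trip_id"]: t["route_id"] for t in trips}
    route_names = {r["route_id"]: r["route_short_name"] for r in routes}

    found = defaultdict(set)
    for st in stop_times:
        route_id = route_of_trip.get(st["trip_id"])
        if route_id in route_names and st["stop_id"] in stop_names:
            found[route_names[route_id]].add(stop_names[st["stop_id"]])
    return {name: sorted(names) for name, names in found.items()}


# Tiny made-up sample standing in for the real feed files.
ROUTES = "route_id,route_short_name\nR1,101\n"
TRIPS = "route_id,service_id,trip_id\nR1,WKD,T1\n"
STOP_TIMES = "trip_id,stop_id,stop_sequence\nT1,S1,1\nT1,S2,2\n"
STOPS = "stop_id,stop_name\nS1,Main St & Broad St\nS2,Sand Island\n"

result = stops_by_route(
    read_table(ROUTES), read_table(TRIPS), read_table(STOP_TIMES), read_table(STOPS)
)
print(result)
```

    Building the three lookup dictionaries first keeps the pass over stop_times.txt, usually by far the largest table, to a single loop.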

    Or the process would be straightforward, if we were not dealing with LANTA. These feeds are updated periodically, and about a year ago the new LANTA feeds sort of devolved into chaos, with extra routes showing up that had no real-world connection, odd use of abbreviations for bus stop names (abbreviations are sort of frowned upon, for what ought to be obvious reasons), and their cross-referencing system becoming unnecessarily complex. It was hard to figure out what was going on: I thought at first that it was my analysis software mangling the data, but no, it was them.

    Well, they’ve been working through a huge revamp of their entire bus route network, so maybe that was the source of some of the bogus data. The new routes and schedules went into effect on June 21, and an updated feed followed soon after; I downloaded the new one and crunched the data, and the garbage was all still there! But, I noticed that in among the old chaos was a new and much cleaner set of data, valid starting on the 21st, showing the new bus routes and the correctly-named bus stops. So now I do a double extraction, first massaging the feed into a useful form, then extracting from that the new, valid and cleaned-up route data. Voilà!
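
    The second pass of that double extraction might look something like the Python sketch below. It assumes the feed lists its service periods in calendar.txt (some feeds use calendar_dates.txt instead), that the clean data can be picked out by its service start_date, and that the year was 2021; the cutoff value, function name, and sample rows are all made up for illustration.

```python
CUTOFF = "20210621"  # assumed cutoff: GTFS dates are YYYYMMDD strings; the year is a guess


def current_trips(calendar_rows, trip_rows, cutoff=CUTOFF):
    """Keep only trips whose service period starts on or after the cutoff date."""
    # YYYYMMDD strings sort chronologically, so plain string comparison works.
    valid_services = {
        row["service_id"] for row in calendar_rows if row["start_date"] >= cutoff
    }
    return [trip for trip in trip_rows if trip["service_id"] in valid_services]


# Made-up rows: one stale service period and one that starts with the new network.
calendar_rows = [
    {"service_id": "OLD", "start_date": "20200824", "end_date": "20210620"},
    {"service_id": "NEW", "start_date": "20210621", "end_date": "20211231"},
]
trip_rows = [
    {"route_id": "R9", "service_id": "OLD", "trip_id": "T1"},
    {"route_id": "R1", "service_id": "NEW", "trip_id": "T2"},
]

kept = current_trips(calendar_rows, trip_rows)
print([trip["trip_id"] for trip in kept])
```

    Filtering at the trip level means the routes and stops that only the stale trips refer to drop out on their own once the joins are rerun.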

    I have some vague plan to add these bus routes to OpenStreetMap, but that’s a big undertaking, and I would prefer to rely on eyewitness ground-truthing (i.e., riding the bus) rather than on a data set, which means even more work. For now I’m content with just having got the damn data.


  • Got A Screw Loose

    My friend Greg maintains that if your bike has a creak, it’s best to leave it alone, because if you do manage to exorcise that one, another creak will come take its place. Nonetheless, I decided yesterday to deal with a persistent creak down near my bottom bracket on the Santa Cruz.

    This creak started (true to form), just after I’d dealt with a bit of play that had developed in one of my shock mount bushings. That was an easy enough fix, once I got the parts from the bike store, but as soon as I solved that — finally, and after more than a month of annoyance — up popped the new creak.

    I’ve had this creak before, and if it’s the same one it just means cleaning and tightening the bottom bracket and crank threads. I girded my loins with some YouTube how-to’s (I can never remember what type of crank I have on which bike, or how to extract it), went out to start the process, and — the bolt holding the crank on is loose, like loose loose. I tightened it back up, thinking that might be all that was really wrong. Bullet dodged!

    I did a towpath ride this morning to Northampton, and the creak, if anything, was worse.


  • Another Bite At The Apple

    I finished those online courses on SQL. There was a bit of cognitive drift into “other data models” (i.e., JSON and XML) towards the end, but all in all it was a positive experience: I really did learn a lot, and more important, I’m now very comfortable using these new things I learned.

    (The XML foray was an interesting thing in its own right: these courses — and this is probably true of most “free courses” on the internet — are about 10 years old, which translates to “about the time XML’s heyday was starting to fade.” The course touched on XQuery and XSLT, but there was a definite “we suspect you won’t need to know this in the future” vibe about the lessons, and here in the future I had a hard time even finding a way to run the demos, and much of my internet research consisted of reading articles with names like “Why Should We Care About XML Anymore?” I eventually resorted to a bash script — brutishly practical, my workhorse go-to — to invoke the bloated, oh-so-elegant Java classes I found online to perform the queries.)

    Anyway, I was pretty impressed with myself, both for sticking it out and for actually learning something from these courses. I think that the “internet course” format is something I take to pretty well, at least the ones I found at edX, and in fact something I enjoy spending my time with. Naturally enough, I decided to continue by taking another set of courses, and this time I’m giving R another go. I’m about halfway through the third course, in a series on data analysis put out by Harvard, currently working through visualization (graphs) and ggplot, and once again I’m doing well and loving it — so far. Only time will tell.


  • Organization Man

    It wasn’t quite new-year’s-resolution level, but I’ve been having a sustained burst of productivity lately, or if not productivity then at least activity: I have been much better about cello practice; I’ve been more on top of bills, and housework, and exercise (i.e. morning calisthenics, not the biking); I’ve been making progress on learning SQL; and I’ve even chipped away at the greater part of my Flickr photos backlog. And I’ve managed to get all this done, to become my new, more organized self, through the use of my simple, lowly to-do list.

    I’ve written about my to-do list before. It’s basically just a text file; in the morning, or sometimes the night before, I’ll write what I want to get done at the top of the file, then as the day progresses and I do things I can mark the tasks done. If I don’t get to something it’s no big deal, it’s just not marked done and I can add it to the next day’s tasks (or not), but at any idle moment during the day I can see at a glance what I could be productive about, and the process gives me a chance to think about what I want to accomplish, what I ought to be doing, what might be more or less urgent, etc, for any given day. I also add specific appointments (a doctor visit, an afternoon ride with someone) to the end of the list, so I remember to budget my to-do tasks around them. The structure is pretty simple:

    Sunday 1/17/2021
    exercise (done)
    cello
    dishes (done)
    bills:
      phone (done)
      gas (done)
      electric (done)
    study sql
    flickr
    blog (started... running notes go here until it's marked done)
    garbage
    @1:00 group road ride (done)
    
    Saturday 1/16/2021
    dishes (done)
    exercise (done)
    cello (done)
    study sql (done)
    blog
    flickr (done)
    work on bikes

    And so on.
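
    Because the format is so regular, a file like this is also easy to summarize with a script. Here is a small Python sketch; the parsing rules (a weekday name starts a new day, a trailing "(done)" marks a finished task, a line ending in a colon is only a group heading) are inferred from the sample above, and the function name is invented.

```python
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")


def summarize(text):
    """Return {day header: (done count, total count)} for a to-do file."""
    summary = {}
    day = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(WEEKDAYS):
            # A new day header resets the running counts.
            day = line
            summary[day] = (0, 0)
        elif day is not None and not line.endswith(":"):
            # Group headings like "bills:" are skipped; their sub-items count.
            done, total = summary[day]
            summary[day] = (done + line.endswith("(done)"), total + 1)
    return summary


SAMPLE = """\
Sunday 1/17/2021
exercise (done)
cello
bills:
  phone (done)
  gas (done)
@1:00 group road ride (done)

Saturday 1/16/2021
dishes (done)
blog
"""

report = summarize(SAMPLE)
for day, (done, total) in report.items():
    print(f"{day}: {done}/{total} done")
```

    Appointments (the lines starting with "@") are counted like any other task here; treating them separately would only take one more branch.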

    (I also keep a separate file, a spreadsheet that I call my “food diary,” where I keep track of everything I eat each day, but that does not get used nearly as much as the to-do list. It has a different pedigree, being something I saw once about behaviorist approaches to dieting, and has been much less successful in keeping me engaged enough to use it.)

    I find that I am more energetic in the late morning or early afternoon, but that may also be because the morning is when I’m selecting my day’s tasks, and therefore thinking more about them, rather than it being an issue of afternoon energy levels. The one thing that does sap energy levels — the thing that wrecks any given day’s remaining plans — is biking. Any day with a longish bike ride, nothing seems to get done after the ride…

    Anyway, here’s a product of one of my previous to-do lists: my first cycling video, posted on YouTube. The raw GoPro video quality is very high and the files are huge, so I spent some time learning how to process the clip into a format with reasonable values for both quality and file size. It looked great, but YouTube has taken to throttling quality to conserve bandwidth during this COVID-level use era. Here it is:


  • New Project

    I’m coming to a close on the Trail Amenities project — rather, I’m running out of things to do with it — and I’ve been feeling a bit tired of all things GIS lately, so I’ve been looking for a new project, something that will teach me some new software or skill.

    My first thought was to get a better grasp of R, especially R graphics, and to do that I’d work with CDC COVID data. Early in the pandemic I was doing a lot of downloading, analyzing and graphing of case and death data, in a sort of “play along at home” mode. In a way it helped me get an emotional handle on things: in analyzing it myself I felt I regained a bit of control over the situation. At the time I was using the LibreOffice Calc spreadsheet.

    Anyway, when things started getting bad again I returned to looking at the data, and I used it to explore R, but I found I didn’t have the motivation or focus, though I did manage to get more of a handle on graphics. Part of my loss of focus was that the data magic was gone — I had COVID fatigue — but part was also that I constantly bumped up against realizing that the data manipulation would have been easier for me in SQL.

    I switched from the COVID data back to working with my trail amenities data, which gave me a chance to practice accessing my database from R, but in the end I realized I’d rather play with PostgreSQL than R so I decided that my new project would be to really learn how to work with PostgreSQL.

    And that’s what I’m doing. I found a series of free online courses on databases and SQL, from Stanford via edX, and I’m working my way through them (I’m currently on the second course). This should give me a good handle on using SQL; the other half of what I want is database administration, which so far I’ve just been able to pick up in pieces here and there.


  • Another Way To Look At It

    In my database of towpath-accessible amenities, my original definition of “accessible” was simply “within a half mile, by paths available to a cyclist, of an access point.” This seemed pretty reasonable — people don’t really want to travel more than a few hundred yards off the trail to get a bite or whatever, and beyond that point an amenity isn’t really part of the trail ecosystem anymore. This break-off point is more or less arbitrary, so I chose a half mile as a nice, round and fairly inclusive distance.

    As an example of what this might look like, here is a view of the Sand Island trailhead, with accessible amenities selected using this simple definition:

    map of bethlehem
    Amenities within a half mile of Sand Island

    You can see that there is not much available immediately near the trailhead, and then further north there’s a hotel (blue square) and a bunch of food/drink amenities (yellow circles) on Main Street’s “restaurant row.”

    I saw two issues with this. One is that people will probably be willing to go a bit further from the trail to get to lodging or a bike store, but these places are not shown, and the other issue is that the arbitrary break-off point comes in the middle of a dense clump of amenities — this is the case with “restaurant row” — and it seems silly to include one restaurant just under the half-mile cutoff while excluding the four next door, just beyond it.

    So (to address the second issue) I expanded my definition to include clusters: if there is a dense group of amenities, and at least one of the amenities in the group is within a half mile of the trailhead, all of them in the group are considered accessible.

    I also included any lodging or bike store within a full mile of the trailhead (to deal with the first issue), and now the map looks like this:

    map of bethlehem
    Amenities with a more expansive (i.e., “clustered”) definition of accessible.

    The map now includes the rest of the tourist/nightlife area near Broad and Main Streets, as well as several more hotels and two bike stores (the orange diamonds). This is the version of “accessible” my amenities map uses.
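
    The combined rule can be written down compactly. The Python sketch below assumes each amenity already carries a precomputed route distance to the trailhead and an optional cluster label (how the clusters themselves are detected is out of scope here); the field names, sample data, and function name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

HALF_MILE = 0.5
FULL_MILE = 1.0
FAR_KINDS = {"lodging", "bike_shop"}  # kinds worth a longer detour


@dataclass
class Amenity:
    name: str
    kind: str
    miles: float              # route distance from the trailhead (assumed precomputed)
    cluster: Optional[int]    # cluster label, or None if not in a dense group


def accessible(amenities):
    """Apply the half-mile rule, the one-mile rule, and the cluster rule."""
    near = {a.name for a in amenities if a.miles <= HALF_MILE}
    # A cluster counts if at least one member is within a half mile.
    live_clusters = {
        a.cluster for a in amenities if a.cluster is not None and a.name in near
    }
    result = set(near)
    for a in amenities:
        if a.cluster in live_clusters:
            result.add(a.name)
        elif a.kind in FAR_KINDS and a.miles <= FULL_MILE:
            result.add(a.name)
    return result


sample = [
    Amenity("Cafe A", "food", 0.45, 1),
    Amenity("Cafe B", "food", 0.55, 1),     # past the cutoff, but in Cafe A's cluster
    Amenity("Hotel C", "lodging", 0.90, None),
    Amenity("Diner D", "food", 0.90, 2),    # cluster 2 has no member within a half mile
]
chosen = accessible(sample)
print(sorted(chosen))
```

    In the sample, Cafe B rides in on its cluster and Hotel C on the one-mile rule, while Diner D stays out because nothing in its cluster is within a half mile.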

    But should I be even more expansive in my definition? What if someone wants to find a restaurant near their hotel, or near a bike shop, and the hotel or bike shop is one of the outliers, nowhere near the other “accessible amenities” even if there might be places nearby? (This actually happens in Bethlehem, where there is a separate business district on the south side of the river.) I decided to explore this possibility.

    My first approach to a more expansive definition of accessibility was to expand my definition of an accessible cluster: if a group of amenities contains anything considered accessible, like an amenity within a half mile of the trailhead, or a hotel or bike shop within a mile of the trailhead, then all amenities in the group are accessible. The new amenities can be seen in this map:

    map of bethlehem
    Sand Island amenities, including clusters near hotels and bike shops

    This was easy to implement: it just meant a tweak or two to the function I used for the original cluster definition. Now the Southside downtown is pretty well represented. By the way, here is the same map, but with clusters (accessible and inaccessible) shown:

    map of bethlehem
    Amenities near Sand Island, with cluster regions shown.

    The colored polygons in this map are the regions where clusters of amenities can be found. (Note the yellow polygon in the northeast corner. That represents a cluster where no amenity met any of my criteria, and no amenities are shown.)

    I worked out this definition of accessibility about the same time as the original clustered version, but didn’t use this definition for my map because it seemed a bit too expansive, with these second-order amenities — accessible not to the trailhead per se, but to other places that someone may want to visit — making the whole map too busy without adding much more value. After all, if someone wants dinner recommendations near their hotel, they can always ask at the front desk…

    But lately I’ve been thinking about a different approach to expanding my definition: what about the routes from the trailhead to the hotel (or bike shop)? Would it be useful to show the amenities along the way? Here is a map, just of the amenities that are within 50 yards of the shortest path to the accessible hotels and bike shops:

    map of bethlehem
    Amenities along the routes to hotels and bike shops.

    This seems to strike a happy medium, inclusive but not too inclusive. Here are those amenities along the hotel/bike shop routes, with the more expansive version of the clustered amenities superimposed:

    map of bethlehem
    Amenities near Sand Island, including those in clusters near, or along routes to, lodging and bike shops

    I kind of like this approach, though I wonder if it’s more a CYA reflex: I don’t want hungry people to pass restaurants on the way to the bike shop, and not see them on my map. After all, this is still a second-order set of amenities, and even if it’s not quite as busy as my first attempt at a more expansive definition, there is a lot of overlap. I’ll be thinking about this a bit more…
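
    The “within 50 yards of the shortest path” test boils down to a point-to-polyline distance. Here is a self-contained Python sketch, assuming coordinates are already projected to a flat grid measured in yards; in a spatial database this would more likely be a single distance query, and the names and sample points below are invented.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b, all (x, y) tuples."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def along_route(points, route, max_dist=50.0):
    """Return names of points within max_dist of any segment of the route."""
    hits = []
    for name, xy in points.items():
        nearest = min(
            point_segment_distance(xy, a, b) for a, b in zip(route, route[1:])
        )
        if nearest <= max_dist:
            hits.append(name)
    return hits


route = [(0, 0), (400, 0), (400, 300)]   # made-up path, in yards
points = {
    "Pizza": (200, 30),     # 30 yd off the first leg
    "Bakery": (460, 150),   # 60 yd off the second leg
    "Pub": (430, 320),      # about 36 yd past the end of the route
}
near_route = along_route(points, route)
print(near_route)
```

    Clamping the projection to the segment is what lets a place just past the end of the route still count, which seems right for something next door to the hotel or bike shop.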


  • Infrastructure Fun

    I got in a few rides these past few weeks, and some good cello time too, but my major focus has been on “infrastructure” projects:

    Bike

    The Santa Cruz, after four years of that “new bike feeling,” is starting to show some signs of age. Nothing bad, just things like shifting problems in the highest gears, so I might need new cables and maybe housing, and some trouble with the tire valves: I’ve got a slow leak in the rear tire caused by a torn o-ring, and a gummed up valve up front.

    For the tires I got a “valve repair kit” from Saucon Valley Bikes. The tubeless tire valves are pretty easy to take apart and work with, so I was able to replace the rear o-ring — I can’t be sure if it worked perfectly, but it’s working enough for now — and have new valve innards on deck if the front tire becomes too annoying. The shifting seems sort of OK for the moment after I did some serious derailleur cleaning, but I can tell I’ll have to deal with those cables sooner rather than later.

    Meantime, I noticed a slight creak coming from the bottom bracket…

    SSL

    For my website I’ve been using an SSL/TLS certificate from Let’s Encrypt, which I obtained using SSLForFree, since Let’s Encrypt is pretty difficult on its own. These certificates need to be renewed every 90 days, but when I went to do it the next-to-last time, I found that SSLForFree had been bought out by ZeroSSL, who use their own certificates and who intend to charge for anything beyond a limited number of free ones. I used them that time, but spent the next three months looking into a better option.

    The ZeroSSL certificate expired a few days ago, but I had already replaced it with one from Let’s Encrypt, using a rather laborious process on yet another website. It’s very doable, but I think I’ll continue looking for a better method.
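
    Whatever the renewal method, it helps to know how much time is left on the current certificate. Here is a minimal Python sketch, assuming the expiry is available as a notAfter string in the format Python's ssl module reports for a certificate; the function name and the sample dates are made up.

```python
import ssl
from datetime import datetime, timezone


def days_left(not_after, now=None):
    """Whole days until a certificate's notAfter time (negative if expired)."""
    expires = ssl.cert_time_to_seconds(not_after)   # seconds since the epoch, UTC
    if now is None:
        now = datetime.now(timezone.utc)
    return int((expires - now.timestamp()) // 86400)


# Made-up example: a certificate that expires 30 days after "now".
sample_now = datetime(2021, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
remaining = days_left("Jan 31 12:00:00 2021 GMT", now=sample_now)
print(remaining)
```

    With 90-day certificates, a check like this run from a scheduled job could flag anything under a chosen threshold well before the site starts throwing browser warnings.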

    Towpath Amenities

    This is a bit of old news, but I’ve added the amenities and access points along the towpath between New Hope and Morrisville. I have about 10 miles left to add, the section from Morrisville to Bristol, and I have all access points and amenities I could find added to my database. All that’s left is to ground-truth some of the info, then I can update the map. This last addition will make the map complete, but that won’t make the job done: this job will never be done.

    I started thinking about my method of routing the other day: the routine finds the point on the road network closest to my access point (the start) and the point on the network closest to my amenity (the endpoint), then finds the shortest path through the network between start and end points. But what if the start and end points on the network are not particularly close to their respective access or amenity points?

    I originally assumed that this would not be an issue: access points were basically intersections of the D&L with the road network, and almost all amenities should be very near some road or path that customers use to get there. Then I figured out a way to check…

    Most amenities were within about 25 yards of their route’s endpoint, the distance being mostly open space like a parking lot or driveway. I figured that this was acceptable, but I also found a few amenities that were more than that distance, between say 25 and 50 yards from their endpoints. Again they were on the far sides of parking lots and such from the ends of their routes, but these distances seemed a bit too large to leave be, so I added service lanes and driveways as necessary — I’m not sure why these weren’t already a part of the network, but they are there now; I updated the routes to the offending amenities and all was well.

    There was a third group of amenities that I found, and these were the ones I had been worrying about: the ones where the database has a route, but in real life the route’s endpoint is nowhere near the amenity, and maybe the amenity isn’t even accessible from the endpoint. (One example could be a store along a roadway I’d deliberately excluded from the route network, such as a fast food place along a highway. The routing program would find a path to the closest point still on the allowed roads, and leave the cyclist to connect the endpoint and the amenity “as the crow flies,” crossing freeways or God-knows-what, and I’m back to square one.)

    Luckily, I only found a few of these, and they all were total outliers: places that were in the database, but were too distant and isolated to be considered “accessible.” For now I’m leaving them in the database, but I guess I’ll eventually have to remove them. I’ll have to look more carefully at the relationship between new amenities and the road network in the future if I add any more, to make sure they actually connect.
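
    The routing method and the sanity check described above can be sketched together in Python. This is a simplified stand-in, not the actual routine: it snaps to the nearest network node rather than the nearest point on an edge, uses a made-up four-node network with coordinates in yards, and treats anything more than 50 yards from its endpoint as needing review, which is an assumed threshold (the 25 and 50 yard figures come from the text; the third bucket is a guess).

```python
import heapq
import math

# Made-up network: node -> (x, y) in yards, plus undirected edges.
NODES = {"A": (0, 0), "B": (300, 0), "C": (300, 400), "D": (0, 400)}
EDGES = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C")]


def dist(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def snap(point):
    """Return (nearest node, snap distance) for an off-network point."""
    node = min(NODES, key=lambda n: dist(NODES[n], point))
    return node, dist(NODES[node], point)


def shortest_path_length(start, goal):
    """Dijkstra's algorithm over the edge list, weighted by segment length."""
    graph = {n: [] for n in NODES}
    for u, v in EDGES:
        w = dist(NODES[u], NODES[v])
        graph[u].append((v, w))
        graph[v].append((u, w))
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == goal:
            return d
        if d > best.get(u, math.inf):
            continue  # stale queue entry
        for v, w in graph[u]:
            if d + w < best.get(v, math.inf):
                best[v] = d + w
                heapq.heappush(queue, (d + w, v))
    return math.inf


def check(gap):
    """Bucket a snap distance using the rough thresholds from the text."""
    if gap <= 25:
        return "ok"
    if gap <= 50:
        return "add a driveway or service lane"
    return "review: endpoint is not near the amenity"


access_point = (0, 10)
amenity = (324, 432)          # 40 yd from node C
start, _ = snap(access_point)
end, gap = snap(amenity)
length = shortest_path_length(start, end)
print(start, end, length, check(gap))
```

    The point of keeping the snap distance around, rather than discarding it once the route is found, is that it is exactly the number that exposes routes ending nowhere near their amenity.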

    Network (the other kind)

    One last piece of infrastructure activity: we are switching our internet provider, from DSL on Verizon to RCN cable. I bought a cable modem and a wifi router, and called RCN the other day; the cable installers should be here this afternoon.

    I got us the slowest package, 10 Mbps, which is about four times faster than what we have now and costs about $20/mo less, before even considering the cost of the landline we’ll be abandoning when we get rid of Verizon. (If we need it we can upgrade our package, but we’ve been making do with DSL for so long that 10 Mbps will probably seem blazing fast.)


  • Now What?

    One of the pedals on my road bike has developed a squeaky bearing lately, and I thought that maybe it’s time for a new pair. (I’ve used the same clipless pedal system — Speedplay Frogs — for more than 20 years. With pedals on multiple bikes and cleats on my bike shoes, it’s a fairly big investment in the one technology.) I went online to order the new pedals, and found that they have become extremely scarce — like nonexistent, discontinued scarce. Turns out that Speedplay was bought by Wahoo, and they decided to shut Speedplay down while they “reconsider the product line” or whatever they might call it. WTF?

    My immediate options are to see if I can find a new pair on eBay or whatever (no luck yet), or to replace the bearing (which also requires an eBay purchase, but spare parts seem plentiful so far), or just keep re-packing the pedal with grease and hoping for the best, which is what I did this afternoon. I guess I’ll eventually have to completely replace the Frogs with some new system, and sooner rather than later. Three sets of pedals and two sets of cleats: it’ll be a substantial chunk of change, but I’m not even sure what that replacement system will be yet. It’s a total shame, really; the Frogs are great pedals.