• Category Archives: tech talk
  • Computers and programs, maps and GPS, anything to do with data big or small, as well as my take on the pieces of equipment I use in other hobbies — think bike components, camping gear etc.

  • Got A Screw Loose

    My friend Greg maintains that if your bike has a creak, it’s best to leave it alone, because if you do manage to exorcise that one, another creak will come take its place. Nonetheless, I decided yesterday to deal with a persistent creak down near my bottom bracket on the Santa Cruz.

    This creak started (true to form), just after I’d dealt with a bit of play that had developed in one of my shock mount bushings. That was an easy enough fix, once I got the parts from the bike store, but as soon as I solved that — finally, and after more than a month of annoyance — up popped the new creak.

    I’ve had this creak before, and if it’s the same one it just means cleaning and tightening the bottom bracket and crank threads. I girded my loins with some YouTube how-to’s (I can never remember what type of crank I have on which bike, or how to extract it), went out to start the process, and — the bolt holding the crank on is loose, like loose loose. I tightened it back up, thinking that might be all that was really wrong. Bullet dodged!

    I did a towpath ride this morning to Northampton, and the creak, if anything, was worse.


  • Another Bite At The Apple

    I finished those online courses on SQL. There was a bit of cognitive drift into “other data models” (i.e., JSON and XML) towards the end, but all in all it was a positive experience: I really did learn a lot, and more important, I’m now very comfortable using these new things I learned.

    (The XML foray was an interesting thing in its own right: these courses — and this is probably true of most “free courses” on the internet — are about 10 years old, which translates to “about the time XML’s heyday was starting to fade.” The course touched on XQuery and XSLT, but there was a definite “we suspect you won’t need to know this in the future” vibe about the lessons, and here in the future I had a hard time even finding a way to run the demos, and much of my internet research consisted of reading articles with names like “Why Should We Care About XML Anymore?” I eventually resorted to a bash script — brutishly practical, my workhorse go-to — to invoke the bloated, oh-so-elegant Java classes I found online to perform the queries.)

    Anyway, I was pretty impressed with myself, both for sticking it out and for actually learning something from these courses. I think that the “internet course” format is something I take to pretty well, at least the ones I found at edX, and in fact something I enjoy spending my time with. Naturally enough, I decided to continue by taking another set of courses, and this time I’m giving R another go. I’m about halfway through the third course, in a series on data analysis put out by Harvard, currently working through visualization (graphs) and ggplot, and once again I’m doing well and loving it — so far. Only time will tell.


  • Organization Man

    It wasn’t quite new-year’s-resolution level, but I’ve been having a sustained burst of productivity lately, or if not productivity then at least activity: I have been much better about cello practice; I’ve been more on top of bills, and housework, and exercise (i.e. morning calisthenics, not the biking); I’ve been making progress on learning SQL; and I’ve even chipped away at the greater part of my Flickr photos backlog. And I’ve managed to get all this done, to become my new, more organized self, through the use of my simple, lowly to-do list.

    I’ve written about my to-do list before. It’s basically just a text file; in the morning, or sometimes the night before, I’ll write what I want to get done at the top of the file, then as the day progresses and I do things I can mark the tasks done. If I don’t get to something it’s no big deal: it’s just not marked done and I can add it to the next day’s tasks (or not). At any idle moment during the day I can see at a glance what I could be productive about, and the process gives me a chance to think about what I want to accomplish, what I ought to be doing, what might be more or less urgent, etc., for any given day. I also add specific appointments (a doctor visit, an afternoon ride with someone) to the end of the list, so I remember to budget my to-do tasks around them. The structure is pretty simple:

    Sunday 1/17/2021
    exercise (done)
    cello
    dishes (done)
    bills:
      phone (done)
      gas (done)
      electric (done)
    study sql
    flickr
    blog (started... running notes go here until it's marked done)
    garbage
    @1:00 group road ride (done)
    
    Saturday 1/16/2021
    dishes (done)
    exercise (done)
    cello (done)
    study sql (done)
    blog
    flickr (done)
    work on bikes

    And so on.
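
    A plain-text format like this is easy to process with a short script. The following is only a minimal Python sketch, not part of the actual to-do setup described above: it assumes each day starts with a "Weekday m/d/yyyy" header, that finished tasks end in "(done)", and that lines ending in a colon (like "bills:") are group labels rather than tasks. The function name `open_tasks` and the sample text are made up for illustration.

```python
import re

SAMPLE = """\
Sunday 1/17/2021
exercise (done)
cello
dishes (done)
bills:
  phone (done)
  gas (done)
  electric (done)
study sql
@1:00 group road ride (done)

Saturday 1/16/2021
dishes (done)
blog
flickr (done)
"""

# Assumption: a day header is a weekday name followed by a m/d/yyyy date.
DAY_HEADER = re.compile(r"^[A-Z][a-z]+day \d{1,2}/\d{1,2}/\d{4}$")


def open_tasks(text):
    """Map each day header to the tasks under it that are not marked (done)."""
    result = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines separate days
        if DAY_HEADER.match(line):
            current = line
            result[current] = []
        elif current is not None:
            is_label = line.endswith(":")       # e.g. "bills:"
            is_done = line.endswith("(done)")
            if not is_label and not is_done:
                result[current].append(line)
    return result


for day, tasks in open_tasks(SAMPLE).items():
    print(day, "->", ", ".join(tasks))
```

    For the sample above, the loop lists "cello" and "study sql" as still open for Sunday, and "blog" for Saturday.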

    (I also keep a separate file, a spreadsheet that I call my “food diary,” where I keep track of everything I eat each day, but that does not get used nearly as much as the to-do list. It has a different pedigree, being something I saw once about behaviorist approaches to dieting, and has been much less successful in keeping me engaged enough to use it.)

    I find that I am more energetic in the late morning or early afternoon, but that may also be because the morning is when I’m selecting my day’s tasks, and therefore thinking more about them, rather than it being an issue of afternoon energy levels. The one thing that does sap energy levels — the thing that wrecks any given day’s remaining plans — is biking. Any day with a longish bike ride, nothing seems to get done after the ride…

    Anyway, here’s a product of one of my previous to-do lists: my first cycling video, posted on YouTube. The raw GoPro video quality is very high and the files are huge, so I spent some time learning how to process the clip into a format with reasonable values for both quality and file size. It looked great, but YouTube has taken to throttling quality to conserve bandwidth during this COVID-level use era. Here it is:


  • New Project

    I’m coming to a close on the Trail Amenities project — rather, I’m running out of things to do with it — and I’ve been feeling a bit tired of all things GIS lately, so I’ve been looking for a new project, something that will teach me some new software or skill.

    My first thought was to get a better grasp of R, especially R graphics, and to do that I’d work with CDC COVID data. Early in the pandemic I was doing a lot of downloading, analyzing and graphing of case and death data, in a sort of “play along at home” mode. In a way it helped me get an emotional handle on things: in analyzing it myself I felt I regained a bit of control over the situation. At the time I was using the LibreOffice Calc spreadsheet.

    Anyway, when things started getting bad again I returned to looking at the data, and I used it to explore R, but I found I didn’t have the motivation or focus, though I did manage to get more of a handle on graphics. Part of my loss of focus was that the data magic was gone — I had COVID fatigue — but part was also that I constantly bumped up against realizing that the data manipulation would have been easier for me in SQL.

    I switched from the COVID data back to working with my trail amenities data, which gave me a chance to practice accessing my database from R, but in the end I realized I’d rather play with PostgreSQL than R so I decided that my new project would be to really learn how to work with PostgreSQL.

    And that’s what I’m doing. I found a series of free online courses on databases and SQL, from Stanford via edX, and I’m working my way through them — I’m currently on the second course. This should give me a good handle on using SQL; the other half of what I want is database administration, which so far I’ve just been able to pick up in pieces here and there.


  • Another Way To Look At It

    In my database of towpath-accessible amenities, my original definition of “accessible” was simply “within a half mile, by paths available to a cyclist, of an access point.” This seemed pretty reasonable — people don’t really want to travel more than a few hundred yards off the trail to get a bite or whatever, and beyond that point an amenity isn’t really part of the trail ecosystem anymore. This break-off point is more or less arbitrary, so I chose a half mile as a nice, round and fairly inclusive distance.

    As an example of what this might look like, here is a view of the Sand Island trailhead, with accessible amenities selected using this simple definition:

    map of bethlehem
    Amenities within a half mile of Sand Island

    You can see that there is not much available immediately near the trailhead, and then further north there’s a hotel (blue square) and a bunch of food/drink amenities (yellow circles) on Main Street’s “restaurant row.”

    I saw two issues with this. One is that people will probably be willing to go a bit further from the trail to get to lodging or a bike store, but these places are not shown, and the other issue is that the arbitrary break-off point comes in the middle of a dense clump of amenities — this is the case with “restaurant row” — and it seems silly to include one restaurant just under the half-mile cutoff while excluding the four next door, just beyond it.

    So (to address the second issue) I expanded my definition to include clusters: if there is a dense group of amenities, and at least one of the amenities in the group is within a half mile of the trailhead, all of them in the group are considered accessible.

    I also included any lodging or bike store within a full mile of the trailhead (to deal with the first issue), and now the map looks like this:

    map of bethlehem
    Amenities with a more expansive (i.e., “clustered”) definition of accessible.

    The map now includes the rest of the tourist/nightlife area near Broad and Main Streets, as well as several more hotels and two bike stores (the orange diamonds). This is the version of “accessible” my amenities map uses.
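
    The clustered rule can be written down fairly compactly. Here is a minimal Python sketch of the three conditions (within a half mile, lodging or bike store within a full mile, or a member of a cluster that has at least one amenity within a half mile). It is an illustration only, not the database function actually used for the map: the `Amenity` class, the `cluster` ids, and the sample names and distances are all hypothetical, and how "dense groups" get detected in the first place is left out.

```python
from dataclasses import dataclass
from typing import Optional

HALF_MILE = 0.5
FULL_MILE = 1.0
EXTENDED_KINDS = {"lodging", "bike_shop"}  # allowed out to a full mile


@dataclass
class Amenity:
    name: str
    kind: str
    miles: float             # route distance from the trailhead
    cluster: Optional[int]   # hypothetical cluster id, None if unclustered


def accessible(amenities):
    """Return the names of amenities that count as accessible."""
    # Clusters containing at least one amenity within a half mile.
    seeded = {a.cluster for a in amenities
              if a.cluster is not None and a.miles <= HALF_MILE}
    names = set()
    for a in amenities:
        near = a.miles <= HALF_MILE
        extended = a.kind in EXTENDED_KINDS and a.miles <= FULL_MILE
        clustered = a.cluster in seeded
        if near or extended or clustered:
            names.add(a.name)
    return names


sample = [
    Amenity("cafe A", "food", 0.45, 1),     # inside the half mile
    Amenity("cafe B", "food", 0.55, 1),     # same cluster as cafe A
    Amenity("hotel", "lodging", 0.80, None),
    Amenity("diner", "food", 0.80, None),   # too far, no cluster
    Amenity("bar", "food", 0.90, 2),        # cluster with nothing nearby
    Amenity("pub", "food", 0.95, 2),
]
print(sorted(accessible(sample)))
```

    In this made-up sample, "cafe B" is pulled in by its cluster and the hotel by the one-mile rule, while the diner and the second cluster stay off the map.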

    But should I be even more expansive in my definition? What if someone wants to find a restaurant near their hotel, or near a bike shop, and the hotel or bike shop is one of the outliers, nowhere near the other “accessible amenities” even if there might be places nearby? (This actually happens in Bethlehem, where there is a separate business district on the south side of the river.) I decided to explore this possibility.

    My first approach to a more expansive definition of accessibility was to expand my definition of an accessible cluster: if a group of amenities contains anything considered accessible, like an amenity within a half mile of the trailhead, or a hotel or bike shop within a mile of the trailhead, then all amenities in the group are accessible. The new amenities can be seen in this map:

    map of bethlehem
    Sand Island amenities, including clusters near hotels and bike shops

    This was easy to implement; it just meant a tweak or two to the function I used for the original cluster definition. Now the Southside downtown is pretty well represented. By the way, here is the same map, but with clusters (accessible and inaccessible) shown:

    map of bethlehem
    Amenities near Sand Island, with cluster regions shown.

    The colored polygons in this map are the regions where clusters of amenities can be found. (Note the yellow polygon in the northeast corner. That represents a cluster where no amenity met any of my criteria, and no amenities are shown.)

    I worked out this definition of accessibility about the same time as the original clustered version, but didn’t use this definition for my map because it seemed a bit too expansive, with these second-order amenities — accessible not to the trailhead per se, but to other places that someone may want to visit — making the whole map too busy without adding much more value. After all, if someone wants dinner recommendations near their hotel, they can always ask at the front desk…

    But lately I’ve been thinking about a different approach to expanding my definition: what about the routes from the trailhead to the hotel (or bike shop), would it be useful to show the amenities along the way? Here is a map, just of the amenities that are within 50 yards of the shortest path to the accessible hotels and bike shops:

    map of bethlehem
    Amenities along the routes to hotels and bike shops.
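
    Selecting amenities "within 50 yards of the shortest path" comes down to a point-to-polyline distance test. The Python sketch below shows one way that test could look, assuming the route and the amenities are already in a projected coordinate system measured in yards; a spatial database would normally do this with its own distance functions. The function names, coordinates, and amenity names are invented for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (planar coordinates)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_route(p, route):
    """Smallest distance from p to any segment of the route polyline."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(route, route[1:]))


def along_route(amenities, route, limit=50.0):
    """Names of amenities within `limit` yards of the route."""
    return [name for name, p in amenities.items()
            if distance_to_route(p, route) <= limit]


# Hypothetical route (in yards) with one right-angle turn.
route = [(0, 0), (400, 0), (400, 300)]
amenities = {
    "deli": (200, 30),     # 30 yards off the first leg
    "bakery": (200, 80),   # 80 yards off, too far
    "tavern": (440, 150),  # 40 yards off the second leg
    "diner": (430, 350),   # past the end of the route
}
print(along_route(amenities, route))
```

    Only the deli and the tavern fall inside the 50-yard corridor in this example.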

    This seems to strike a happy medium, inclusive but not too inclusive. Here are those amenities along the hotel/bike shop routes, with the more expansive version of the clustered amenities superimposed:

    map of bethlehem
    Amenities near Sand Island, including those in clusters near, or along routes to, lodging and bike shops

    I kind of like this approach, though I wonder if it’s more a CYA reflex: I don’t want hungry people to pass restaurants on the way to the bike shop, and not see them on my map. After all, this is still a second-order set of amenities, and even if it’s not quite as busy as my first attempt at a more expansive definition, there is a lot of overlap. I’ll be thinking about this a bit more…


  • Infrastructure Fun

    I got in a few rides these past few weeks, and some good cello time too, but my major focus has been on “infrastructure” projects:

    Bike

    The Santa Cruz, after four years of that “new bike feeling,” is starting to show some signs of age. Nothing bad, just things like shifting problems in the highest gears, so I might need new cables and maybe housing, and some trouble with the tire valves: I’ve got a slow leak in the rear tire caused by a torn o-ring, and a gummed up valve up front.

    For the tires I got a “valve repair kit” from Saucon Valley Bikes. The tubeless tire valves are pretty easy to take apart and work with, so I was able to replace the rear o-ring — I can’t be sure if it worked perfectly, but it’s working enough for now — and have new valve innards on deck if the front tire becomes too annoying. The shifting seems sort of OK for the moment after I did some serious derailleur cleaning, but I can tell I’ll have to deal with those cables sooner rather than later.

    Meantime, I noticed a slight creak coming from the bottom bracket…

    SSL

    For my website I’ve been using an SSL/TLS certificate from Let’s Encrypt, which I obtained using SSLForFree, since Let’s Encrypt is pretty difficult on its own. These certificates need to be renewed every 90 days, but when I went to do it the next-to-last time, I found that SSLForFree had been bought out by ZeroSSL, who use their own certificates and who intend to charge for anything beyond a limited number of free ones. I used them that time, but spent the next three months looking into a better option.

    The ZeroSSL certificate expired a few days ago, but I had already replaced it with one from Let’s Encrypt, using a rather laborious process on yet another website. It’s very doable, but I think I’ll continue looking for a better method.

    Towpath Amenities

    This is a bit of old news, but I’ve added the amenities and access points along the towpath between New Hope and Morrisville. I have about 10 miles left to add, the section from Morrisville to Bristol, and I have all access points and amenities I could find added to my database. All that’s left is to ground-truth some of the info, then I can update the map. This last addition will make the map complete, but that won’t make the job done — this job will never be done.

    I started thinking about my method of routing the other day: the routine finds the point on the road network closest to my access point (the start) and the point on the network closest to my amenity (the endpoint), then finds the shortest path through the network between start and end points. But what if the start and end points on the network are not particularly close to their respective access or amenity points?

    I originally assumed that this would not be an issue: access points were basically intersections of the D&L with the road network, and almost all amenities should be very near some road or path that customers use to get there. Then I figured out a way to check…

    Most amenities were within about 25 yards of their route’s endpoint, the distance being mostly open space like a parking lot or driveway. I figured that this was acceptable, but I also found a few amenities that were more than that distance, between say 25 and 50 yards from their endpoints. Again they were on the far sides of parking lots and such from the ends of their routes, but these distances seemed a bit too large to leave be, so I added service lanes and driveways as necessary — I’m not sure why these weren’t already a part of the network, but they are there now; I updated the routes to the offending amenities and all was well.

    There was a third group of amenities that I found, and these were the ones I had been worrying about: the ones where the database has a route, but in real life the route’s endpoint is nowhere near the amenity, and maybe the amenity isn’t even accessible from the endpoint. (One example could be a store along a roadway I’d deliberately excluded from the route network, such as a fast food place along a highway. The routing program would find a path to the closest point still on the allowed roads, and leave the cyclist to connect the endpoint and the amenity “as the crow flies,” crossing freeways or God-knows-what, and I’m back to square one.)

    Luckily, I only found a few of these, and they all were total outliers: places that were in the database, but were too distant and isolated to be considered “accessible.” For now I’m leaving them in the database, but I guess I’ll eventually have to remove them. I’ll have to look more carefully at the relationship between new amenities and the road network in the future if I add any more, to make sure they actually connect.
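
    The routing idea and the check on snap distances can be sketched in a few lines of Python. This is a simplified illustration under several assumptions, not the routing routine used for the database: it snaps to the nearest network node rather than the nearest point along a road, it uses a tiny made-up network with coordinates in yards, and the wording of the three verdicts is invented (only the 25 and 50 yard breakpoints come from the discussion above).

```python
import heapq
import math

# Hypothetical road network: node -> (x, y) in yards, plus undirected edges.
NODES = {"A": (0, 0), "B": (300, 0), "C": (300, 400), "D": (0, 400)}
EDGES = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C")]


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def nearest_node(point):
    """Snap a point to the closest network node; return (node, gap in yards)."""
    node = min(NODES, key=lambda n: dist(NODES[n], point))
    return node, dist(NODES[node], point)


def shortest_path_length(start, goal):
    """Dijkstra's algorithm over the network, edge weights = edge lengths."""
    graph = {n: [] for n in NODES}
    for u, v in EDGES:
        w = dist(NODES[u], NODES[v])
        graph[u].append((v, w))
        graph[v].append((u, w))
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == goal:
            return d
        if d > best.get(u, math.inf):
            continue  # stale queue entry
        for v, w in graph[u]:
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(queue, (nd, v))
    return math.inf


def route_report(access_point, amenity):
    """Route length plus a verdict on how far the amenity is from its endpoint."""
    start, _ = nearest_node(access_point)
    end, end_gap = nearest_node(amenity)
    length = shortest_path_length(start, end)
    if end_gap <= 25:
        verdict = "ok"
    elif end_gap <= 50:
        verdict = "add a driveway or service lane"
    else:
        verdict = "check by hand"
    return length, end_gap, verdict


for amenity in [(310, 410), (330, 430), (300, 600)]:
    print(route_report((0, 5), amenity))
```

    The point of the report is the second and third values: a route can exist and look fine in the database while the gap between its endpoint and the actual amenity is large enough to need a closer look.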

    Network (the other kind)

    One last piece of infrastructure activity: we are switching our internet provider, from DSL on Verizon to RCN cable. I bought a cable modem and a wifi router, and called RCN the other day; the cable installers should be here this afternoon.

    I got us the slowest package, 10 Mbps, which is about four times faster than what we have now and costs about $20/mo less, before even considering the cost of the landline we’ll be abandoning when we get rid of Verizon. (If we need it we can upgrade our package, but we’ve been making do with DSL for so long that 10 Mbps will probably seem blazing fast.)


  • Now What?

    One of the pedals on my road bike has developed a squeaky bearing lately, and I thought that maybe it’s time for a new pair. (I’ve used the same clipless pedal system — Speedplay Frogs — for more than 20 years. With pedals on multiple bikes and cleats on my bike shoes, it’s a fairly big investment in the one technology.) I went online to order the new pedals, and found that they have become extremely scarce — like nonexistent, discontinued scarce. Turns out that Speedplay was bought by Wahoo, and they decided to shut Speedplay down while they “reconsider the product line” or whatever they might call it. WTF?

    My immediate options are to see if I can find a new pair on eBay or whatever (no luck yet), or to replace the bearing (which also requires an eBay purchase, but spare parts seem plentiful so far), or just keep re-packing the pedal with grease and hoping for the best — that’s what I did this afternoon. I guess I’ll eventually have to completely replace the Frogs with some new system, and rather sooner than later. Three sets of pedals and two sets of cleats — it’ll be a substantial chunk of change, but I’m not even sure what that replacement system will be yet. It’s a total shame, really, the Frogs are great pedals.


  • What Is Your Quest?

    Posted on by Don

    I’ve been moving forward with the additional D&L access and amenities points for my project, but the trail sections south of Riegelsville are terra incognita, especially when it comes to trail access, so I relied on GIS to find access points: I split the road network into “trail” and “not trail” sections, and intersection points (that aren’t at bridges) made for pretty good access candidates; some closer map inspections verified a few obvious trailheads, and weeded out some things like private drives. A lot still needed to be verified via “ground truthing” though, and so the other day I went out for a ride, starting from Riegelsville, south along the towpath to Tinicum Park.
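
    The "trail" versus "not trail" split can be illustrated with a small Python sketch. It assumes a simplified network where each segment is a pair of node ids with a type and a bridge flag, and it treats a node shared by a trail segment and a non-bridge road segment as an access candidate. Real GIS data would use geometric intersections instead, and the node ids and the `access_candidates` name here are hypothetical.

```python
def access_candidates(segments):
    """Nodes where the trail meets a non-bridge road segment.

    Each segment is (node_a, node_b, kind, is_bridge), kind being
    "trail" or "road".
    """
    trail_nodes = set()
    road_nodes = set()
    for a, b, kind, is_bridge in segments:
        if kind == "trail":
            trail_nodes.update((a, b))
        elif not is_bridge:
            # Roads crossing on a bridge do not give access to the trail.
            road_nodes.update((a, b))
    return sorted(trail_nodes & road_nodes)


segments = [
    ("t1", "t2", "trail", False),
    ("t2", "t3", "trail", False),
    ("t3", "t4", "trail", False),
    ("t2", "r1", "road", False),  # street meeting the trail at t2
    ("t3", "r2", "road", True),   # bridge crossing overhead at t3
    ("r1", "r2", "road", False),
]
print(access_candidates(segments))
```

    Here only "t2" comes out as a candidate; the crossing at "t3" is dropped because it is a bridge. Candidates found this way would still need the map inspection and ground truthing described above.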

    I had my candidate points loaded in my GPS so I could see on the map when I came upon one; I could add locations I’d missed, and delete false positives as I spotted them, and by the time I was done I should have a pretty good idea of how to get on and off the trail. This method worked really well, and the only real problems were judgement calls at what seemed like private access points. (Things are a little different in Bucks County: there are some wealthy homes between the trail and the river, with their own driveways and footbridges, and while some crossings are obviously marked “Private – No Trespassing,” others were maintained, and painted, as if they were park property.) Judgement calls, and I think I made the right calls, but for the most part it didn’t matter — all these access points were too far from any amenities to be useful.

    It was easy and pleasant work, and I took pictures on the way back:

    Total distance, out and back, was about 24 miles, and the ride took about two and a half hours. I have the new access points and amenities incorporated into my map.


  • Yet Another Half-Finished Magnum Opus

    I put the D&L trail data online, formatted as a map/website that I hope could be useful to someone, as soon as I make a few tweaks — and add the missing portion of trail. Meantime,

    Click Here

    Enjoy! I’ll be adding some more useful UI parts soon, and eventually I’ll get to the trail stuff south of Riegelsville.


  • Trail Project Update

    I got those new amenities into my D&L trail amenities database — it was a piece of cake, once the data was cleaned up. The whole process went smoothly; even the main database report (done through Jaspersoft Studio, which can be a pain in the neck to work with) digested the new information without any glitches.

    But I couldn’t leave well enough alone after that: I decided to add the amenities between Allentown and Northampton where the trail is incomplete. My original feeling was that the trail would be too vague through here — where are the “access points,” the intersections between the trail and the wider world’s road network, if you’re riding on the roads and there is no specific trail? But I’ve been riding this section a bit more lately, since it’s not been inundated with users like other sections, and discovered that much of it is only “unfinished” in the sense that it’s not up to the specifications of the rest of the D&L; it’s perfectly rideable on a mountain bike, and actually more fun than the more polished sections. If you don’t mind riding the rougher stuff there’s not much trail missing, and the remaining road portions are remote enough that there’s no need to worry about nearby amenities.

    So, I repeated the process for the incomplete section: identify trail access points, import amenities from OpenStreetMap (and then clean them up, by far the most laborious part) and finally tie access points to amenities using that routing distance matrix script. Again, it worked like a charm.

    …and then I started thinking about what I might want for output. That Jaspersoft Studio report is nice, but the current output is a PDF — ugh, not very net-friendly — and I thought it might be nicer to get a more straightforwardly data-oriented output, something I can massage and format as necessary in a browser, something like JSON. Unfortunately, though Jaspersoft can do JSON output, I couldn’t quite figure how to get it to do what I wanted. PostgreSQL, the database I’m using, has JSON capabilities of its own, and final-product-straight-from-the-database seems like a better approach, but I didn’t know much else about those capabilities, so I sat down over the past few days and doped it out. The learning curve was pretty steep, more like a brick wall, and my code is pretty convoluted, but I did get it to work. Of course, this success is just a lead-in to another escalation: if I want a web page I now have to code out the rest of the stack.
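
    To give a rough idea of the kind of output in question, here is a small Python sketch that turns flat (access point, amenity) rows into nested JSON, one object per access point. It is not the PostgreSQL code mentioned above, which builds the JSON inside the database (PostgreSQL offers functions such as `json_agg` and `json_build_object` for that); the row layout, the field names, and the amenity names are all made up for illustration.

```python
import json
from collections import defaultdict

# Hypothetical query result: (access_point, amenity_name, type, miles).
ROWS = [
    ("Sand Island", "Example Cafe", "food", 0.4),
    ("Sand Island", "Example Hotel", "lodging", 0.8),
    ("Sand Island", "Example Deli", "food", 0.3),
    ("Riegelsville", "Example Inn", "lodging", 0.2),
]


def to_json(rows):
    """Group amenity rows under their access point and serialize as JSON."""
    grouped = defaultdict(list)
    for access_point, name, kind, miles in rows:
        grouped[access_point].append(
            {"name": name, "type": kind, "miles": miles}
        )
    doc = [
        {
            "access_point": access_point,
            # Closest amenities first.
            "amenities": sorted(items, key=lambda item: item["miles"]),
        }
        for access_point, items in grouped.items()
    ]
    return json.dumps(doc, indent=2)


print(to_json(ROWS))
```

    A structure like this is easy to load and format in a browser, which is the main appeal over a PDF report.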