• Category Archives tech talk
  • Computers and programs, maps and GPS, anything to do with data big or small, as well as my take on the pieces of equipment I use in other hobbies — think bike components, camping gear etc.

  • Just An Experiment

    Here is a test of an embedded Strava activity:

    There’s nothing special about this particular run; I just picked it as an example to see what it looked like in my browser.

    Here’s another example of an embedded map, this time from Google Maps:

    Again, there is nothing special about this map — in fact, I’d be wary of using it, as it’s probably years out of date — I just picked it out from a bunch I made once. The point is to notice that the embedded map showed up.

    One more embedded map, this time from Ride With GPS:

    They all just seem to work, right? Contrast these with this one from Garmin:

    If you have anything other than Firefox, you may see the embedded activity (inside the box I added for clarity), but if you’re using (a more modern version of) Firefox you should just see a gray line and a blank space in the box — Firefox is blocking what it now considers an insecure script coming from Garmin. I talked to Garmin tech support, and they say it’s a Firefox problem — that is, their insecure script is really a Firefox problem — and they won’t be fixing it.

    This screws up about a half dozen pages here, and a few more on my old blog, and maybe even some other websites where I’ve embedded Garmin rides over the years. I think I may be going back and re-doing my ride pages in Ride With GPS. Ugh, work… Oh well, lesson (re)learned: avoid counting on Garmin, especially their website.


  • Gautama’s Gizmo Time

    Posted on by Don

    Today I’m writing at Lit, the relatively new coffee shop in Southside, while hanging with Anne. I could say I’m now living the dream: laptop and wi-fi in a caffeinated third space, except that the dream actually is a bit noisier than I would have liked…

    Sometimes I think I’m drawn to tech more because I like the idea of tech (the toys and gizmos, smartphones and memory sticks), like the feeling of being in Staples or another stationery store. All that paper! The desks! The organizers! Actually getting the new stationery, or learning the tech, leaves a hollow, let-down feeling, like after all your Christmas presents have been opened, or like a sugar-buzz come-down.

    My latest tech acquisition, my latest trip around the tech wheel of samsara, is a WordPress security plug-in. I’d noticed a lot of traffic on the site, which is cool, but the traffic was basically to my login page, which is not so cool. The plug-in I installed blocks IPs that try to log in with the wrong user name or password, plus a few other things, and I spent a good part of last night watching it catch and block malicious users. I know I’ll eventually get bored and achieve that come-down, but for now it’s hypnotic…


  • Springtime! Happy Easter!

    Posted on by Don

    I’m sitting at the Bethlehem Library right now — I was planning to maybe ride today but last night’s wet spring snow put the kibosh on that. I felt a bit cooped up and wanted to get out of the house for a bit, so I thought I’d check the new coffee shop (the Church Street Market) across the way but it’s closed on Mondays. I’ll be meeting Anne and maybe Deb at Wise Bean in a bit, but I had a hankering to do the laptop-and-café thing… Oh well, the library works.

    We had a fairly hectic weekend: Friday night was the Adult Easter Egg Hunt at Anne’s niece’s house, Saturday was a Seder at Toby & Erika’s, and yesterday we did an Easter brunch. I got in a ride on Saturday morning, and I’m glad I did since conditions were really good, and it looks like they won’t be good again for a little while. Spring’s coming, it’s just taking its time.

    Fun with Computer Maintenance: I got fed up with Adblock and installed uBlock Origin instead. Better, faster, less intrusive.

    PostGIS Fun: I merged all the bus routes into one GeoJSON file, then loaded them into the database. I then took that table and broke it into two others: one containing the bus stops, with their names, reference numbers, OSM attributes, and geometries, and a bridging table (sans geometry) containing fields for the bus stop reference, route and stop order for all routes. Some new ideas: “select distinct on” and “window functions.” Works like a charm!
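
    For anyone curious about the shape of that split, here is a minimal Python sketch of the same idea done outside the database. The sample rows, the field names and the `split_stops` helper are all made up for illustration; the real work was done in SQL, and this only mimics what “select distinct on” does for the stops table.

```python
def split_stops(rows):
    """Split per-route stop rows into a stops table and a bridging table.

    rows: iterable of dicts with the (hypothetical) keys
    ref, name, lon, lat, route, stop_order.
    """
    stops = {}    # one entry per distinct stop reference
    bridge = []   # (ref, route, stop_order) tuples, no geometry
    for row in rows:
        # keep the first row seen for each ref, roughly what
        # "select distinct on (ref)" does in Postgres
        stops.setdefault(row["ref"], {
            "ref": row["ref"],
            "name": row["name"],
            "geometry": (row["lon"], row["lat"]),
        })
        bridge.append((row["ref"], row["route"], row["stop_order"]))
    return list(stops.values()), bridge


# made-up sample data: stop 1001 is shared by two routes
rows = [
    {"ref": "1001", "name": "Stop A", "lon": -75.38, "lat": 40.61, "route": "R1", "stop_order": 1},
    {"ref": "1002", "name": "Stop B", "lon": -75.37, "lat": 40.62, "route": "R1", "stop_order": 2},
    {"ref": "1001", "name": "Stop A", "lon": -75.38, "lat": 40.61, "route": "R2", "stop_order": 5},
]
stops, bridge = split_stops(rows)
print(len(stops), "stops,", len(bridge), "bridging rows")
```

    The stops list ends up with one row per stop, while the bridging list keeps one row per stop-on-a-route, which is what lets a single stop belong to several routes.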

    Off to the Wise Bean…


  • Data Cleanup Automation Fun

    I’m still playing with LANTA’s bus routes data, girding myself for (eventually) adding the bus routes into OpenStreetMap, but I recently decided to rebuild my data, basically starting over from scratch with the PDFs I got from the LANTA website. My current workflow is:

    PDF --> CSV --> cleaned-up CSV --> GeoJSON --> cleaned-up GeoJSON --> (eventually) a PostGIS database

    The conversion from PDF to CSV was automatic, using a Java program I found online, and the CSV cleanup (fixing things like transposed latitude/longitude, missing minus signs, etc.) was done manually, using a text editor and LibreOffice Calc. That was relatively uneventful but laborious: there are 68 individual bus routes, each with its own file.

    Things got even more laborious with the next two tasks. My first go-around with the conversion to GeoJSON was done manually within QGIS: load each of the CSV files, individually filling out the required parameters for each one. I wasn’t looking forward to converting the files individually, so I wrote a Python script to save all my route layers in GeoJSON format. (Just as an aside, I have to say I really like GeoJSON as a vector file format: I find it much easier to work with than the dated, unwieldy Shapefile standard; it’s also easier to open and work with in the JOSM editor. All my new data, if it’s not going into the database, is getting stored as GeoJSON files.)
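
    To show what the CSV to GeoJSON step amounts to, here is a small sketch using only the Python standard library. It is not my QGIS script: the `csv_to_geojson` helper, its parameter names and the one-row sample are assumptions for illustration, and it presumes the file has a header row with `Latitude` and `Longitude` columns.

```python
import csv
import io
import json


def csv_to_geojson(csv_text, lon_field="Longitude", lat_field="Latitude"):
    """Convert CSV text with longitude/latitude columns to a GeoJSON dict."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # pull the coordinates out so they are not repeated as attributes
        lon = float(row.pop(lon_field))
        lat = float(row.pop(lat_field))
        features.append({
            "type": "Feature",
            # GeoJSON coordinate order is longitude first, then latitude
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": dict(row),
        })
    return {"type": "FeatureCollection", "features": features}


# made-up one-row sample
sample = "Stop Number,Latitude,Longitude\n1001,40.61,-75.38\n"
print(json.dumps(csv_to_geojson(sample), indent=2))
```

    Every column other than the two coordinate columns is carried over as a string property, which matches how a plain CSV import behaves before any cleanup.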

    The “GeoJSON cleanup” is where I massage the data into the forms I want: some of the table columns are unnecessary, some need to be renamed, there are a few extra columns to add (and populate), and finally I wanted to convert the LANTA bus stop names format (in Robbie-the-Robot-style ALL CAPS) to something a little easier on the eyes. Doing this manually would have been beyond laborious, so I wrote another Python script to massage the route files. This turned out to be more of a learning experience (as in, multiple versions of the program failed spectacularly until I got it right) and probably took longer than the brute force, manual changes approach would have, but at least it wasn’t laborious…

    I’m still not really sure why this version worked when others did not, but here is my code:

    # script to run through all visible layers,
    # adding/deleting/renaming fields as required
    # to convert from LANTA bus data to something
    # more like OpenStreetMap bus stop attributes
    # it also properly capitalizes feature names
    # (written for the QGIS 2.x Python console: Python 2, PyQt4)

    from PyQt4.QtCore import QVariant
    from qgis.core import QgsField
    import re

    # some functions and regular expressions for string manipulation
    def match_lower( matchobj ):
        return matchobj.group().lower()
    def match_upper ( matchobj ):
        return matchobj.group().upper()
    # fields to delete (matched at the start of the field name)
    reg = re.compile( 'Lati|Longi|Time|At Street|On Street|Direct|Placem' )
    # stop-position codes and abbreviations to upper-case; the alternatives
    # are grouped so that the parentheses apply to all four codes
    reg_ns = re.compile( r'\((?:ns|fs|mid|off)\)|\bncc\b|\bhs\b', re.IGNORECASE )
    # ordinals such as 4Th, to be lower-cased after title()
    reg_nth = re.compile( r'[1-9]?[0-9][a-z]{,2}', re.IGNORECASE )
    reg_leftParen = re.compile( r'([^\s])(\(\w*\))' )
    reg_rightParen = re.compile( r'(\))([^\s])' )
    reg_space = re.compile( r'\s+' )

    for layer in iface.mapCanvas().layers():

        # reset these variables for each new layer processed

        myLayerName = layer.name()
        highwayExists = False
        networkExists = False
        operatorExists = False
        publicExists = False
        busExists = False
        routeExists = False
        delList=[]

        layer.startEditing()
        pr = layer.dataProvider()

        # change the names of some fields
        fields = pr.fields()
        count = 0
        for field in fields:
            fieldName = field.name()
            print "test for changing field name " + fieldName + " count ", count
            if ( fieldName == 'Public Information Name' ):
                print 'changing to name'
                layer.renameAttribute( count, 'name' )
            if ( fieldName == 'Stop Number' ):
                print 'changing to ref'
                layer.renameAttribute( count, 'ref' )
            if ( fieldName == 'Stop Order' ):
                print 'changing to stop_order'
                layer.renameAttribute( count, 'stop_order' )
            count += 1
        layer.updateFields()

        layer.commitChanges()
        layer.reload()
        layer.startEditing()
        pr = layer.dataProvider()

        # delete some fields
        fields = pr.fields()
        count = 0
        for field in fields:
            fieldName = field.name()
            print "test for deleting fields " + fieldName + " count ", count
            m = reg.match( fieldName )
            if m:
                print fieldName
                delList.append( count )
                print delList
            count += 1
        pr.deleteAttributes(delList)
        layer.updateFields()

        layer.commitChanges()
        layer.reload()
        layer.startEditing()
        pr = layer.dataProvider()

        # add some fields, checking if they don't already exist
        count = 0
        fields = pr.fields()
        for field in fields:
            fieldName = field.name()
            print "test for adding fields " + fieldName + " count ", count
            if( fieldName == 'highway' ):
                highwayExists = True
                print 'highway', count
            if( fieldName == 'network' ):
                networkExists = True
                print 'network', count
            if(fieldName == 'public_transport' ):
                publicExists = True
            if(fieldName == 'bus' ):
                busExists = True
            if ( fieldName == 'operator' ):
                operatorExists = True
            if( field.name() == 'route' ):
                routeExists = True
            count += 1
        if( not highwayExists ):
            print "adding highway"
            pr.addAttributes( [ QgsField("highway", QVariant.String) ] )
            layer.updateFields()
        if( not networkExists ):
            print "adding network"
            pr.addAttributes( [ QgsField("network", QVariant.String) ] )
        if( not operatorExists ):
            print "adding operator"
            pr.addAttributes( [ QgsField("operator", QVariant.String) ] )
        if( not publicExists ):
            print "adding public_transport"
            pr.addAttributes( [ QgsField("public_transport", QVariant.String) ] )
        if( not busExists ):
            print "adding bus"
            pr.addAttributes( [ QgsField("bus", QVariant.String) ] )
        if not routeExists:
            print "adding route"
            pr.addAttributes( [ QgsField('route', QVariant.String) ] )
        layer.updateFields()

        # add attributes to new fields for all features
        for feature in layer.getFeatures():
            feature['highway'] = "bus_stop"
            feature['network'] = 'LANTA'
            feature['operator'] = 'Lehigh and Northampton Transportation Authority'
            feature['public_transport'] = 'platform'
            feature['bus'] = 'yes'
            feature['route'] = myLayerName
            layer.updateFeature(feature)
        layer.updateFields()

        # clean up text in name field
        fieldName = 'name'
        for feature in layer.getFeatures():
            myString = feature[fieldName]
            myString = myString.title()
            myString = reg_ns.sub( match_upper, myString )
            myString = reg_nth.sub( match_lower, myString )
            myString = reg_space.sub( ' ', myString )
            myString = reg_leftParen.sub( r'\1 \2 ', myString )
            # the result was originally assigned to a misspelled variable
            # (mystring), so this substitution was silently discarded
            myString = reg_rightParen.sub( r'\1 \2', myString )
            print myString
            feature[fieldName] = myString
            layer.updateFeature(feature)

        layer.commitChanges()
        layer.reload()
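
    The name cleanup at the end of that script is the part that is easiest to try outside QGIS. Here is a standalone Python 3 sketch of the same idea; the `clean_name` helper and the sample stop names are made up for illustration, and the ordinal pattern is a stricter variation rather than the one used in the script above.

```python
import re

# stop-position codes and abbreviations that should end up in capitals
REG_NS = re.compile(r"\((?:ns|fs|mid|off)\)|\bncc\b|\bhs\b", re.IGNORECASE)
# ordinals such as 4Th or 22Nd, which title() capitalizes wrongly
REG_NTH = re.compile(r"\b[0-9]+(?:st|nd|rd|th)\b", re.IGNORECASE)
# an opening parenthesis jammed against the previous word
REG_LEFT_PAREN = re.compile(r"(\S)(\()")
REG_SPACE = re.compile(r"\s+")


def clean_name(name):
    """Turn an ALL CAPS stop name into something easier on the eyes."""
    name = name.title()
    name = REG_NS.sub(lambda m: m.group().upper(), name)
    name = REG_NTH.sub(lambda m: m.group().lower(), name)
    name = REG_LEFT_PAREN.sub(r"\1 \2", name)
    return REG_SPACE.sub(" ", name).strip()


# made-up sample names
print(clean_name("4TH ST & BROADWAY(NS)"))
print(clean_name("NCC  MAIN CAMPUS"))
```

    Running the substitutions on plain strings like this makes it much quicker to see what each regular expression does before letting it loose on a whole layer.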

    Once I got this up and running, I realized that I wanted to do some more preliminary cleanup on my spreadsheets, so I was back to square one. I couldn’t really find out how to do a bulk load of my CSV files into QGIS, and I realized that QGIS was just using ogr2ogr under the hood, so I decided to do the bulk converting, CSV to GeoJSON, with a shell script that calls ogr2ogr. Yet another learning curve later, and it works great. More code:

    for i in *.csv
    do
        echo "$i"
        # skip the first line of each file, then convert what is left
        tail -n+2 "$i" | ogr2ogr -nln "${i%.csv}" -f "GeoJSON" "${i%csv}geojson" \
            CSV:/vsistdin/ -oo "X_POSSIBLE_NAMES=Lon*" -oo "Y_POSSIBLE_NAMES=Lat*" -oo KEEP_GEOM_COLUMNS=NO
        ogrinfo "${i%csv}geojson"
    done

    It struck me then that all my data was really text, and so working with it in a more unix-ey fashion, with shell scripting and text manipulation programs (sed, awk) to do the conversions directly from the CSV files, was probably my better strategy. Oh well, I did the first part of it with bash at least, and the Python script works well enough.

    UPDATE:

    I did it anyway, using awk to add/subtract/rename/Capitalize the CSV data before running it through ogr2ogr. The code (below) is about 35 lines (as opposed to 150 for the Python script, which only does part of the job anyway) and runs really fast:

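    #! /bin/bash
    # needs gawk: the \y word-boundary operator is a gawk extension
    for i in *.csv
    do
        echo "Converting $i to GeoJSON"
        awk -F "," -v rtname="${i%.csv}" '
        # header row: replace the LANTA column names with OSM-style ones
        $2 != "" && /Stop/ {
            print "ref,Latitude,Longitude,name,stop_order,highway,public_transport,bus,network,operator,route"
        }
        # data rows start with a four-digit stop number
        $2 != "" && /^[0-9]{4,4}/ {
            # title-case the stop name, one word at a time
            string = tolower($8)
            n = split(string, a, " ")
            string = toupper(substr(a[1],1,1)) substr(a[1],2)
            for (j = 2; j <= n; j++) {
                string = string " " toupper(substr(a[j],1,1)) substr(a[j],2)
            }
            # upper-case every (ns), (fs), (mid), Ncc and Hs token; backslashes
            # are doubled because this is a string, not a regex constant
            t_str = "\\((ns|fs|mid)\\)|\\y[nN]cc\\y|\\y[Hh]s\\y"
            out = ""
            while ( match( string, t_str ) ) {
                out = out substr(string, 1, RSTART - 1) toupper(substr(string, RSTART, RLENGTH))
                string = substr(string, RSTART + RLENGTH)
            }
            string = out string
            gsub(/ *\(/, " (", string)
            print $1 "," $2 "," $3 "," string "," $9 ",bus_stop,platform,yes,LANTA,Lehigh and Northampton Transportation Authority," rtname
        }' "$i" > test1.csv
        cat test1.csv
        # same geometry-column options as the earlier ogr2ogr call
        ogr2ogr -nln "${i%.csv}" -f "GeoJSON" "${i%csv}geojson" test1.csv \
            -oo "X_POSSIBLE_NAMES=Lon*" -oo "Y_POSSIBLE_NAMES=Lat*" -oo KEEP_GEOM_COLUMNS=NO
        ogrinfo "${i%csv}geojson"
        rm test1.csv
    done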


  • O Brave New World!

    Posted on by Don

    I finally bit the bullet: I put this site on SSL.

    I’ve been wanting to do this for a while, though my motives have been a bit unclear unless I count Internet street cred. The idea is to be able to use HTTPS rather than the HTTP protocol, where the “S” at the end stands for “secure”: the connection between my server and your browser is encrypted (using SSL, or TLS for the pedantic) in a way that keeps the data transferred between them secret, while also verifying that the data source is actually me. This comes in handy if we don’t want third parties snooping on our connection, or modifying the data by adding malware or advertisements. I’m not sure how much those are real problems for my little blog, but all the cool kids are moving to HTTPS so I guess I’d better do it as well.

    The process was a bit time consuming, but it turned out to be easier and more straightforward than I thought it would be. The verification is done via a cryptographically-signed “certificate” from an already-trusted source (a certificate authority), and getting a certificate, for free, from a trusted source was the hardest part of the process. Even so, counting my learning curve (but not my fretting/waffling), getting and installing the certificate took all of 10 minutes. There were a few more hoops to jump through with my host (another 10 minutes, plus more waffling), but cPanel did most of the work, and now it’s done: my site communicates via the secure HTTPS protocol, redirecting from HTTP if necessary. You can see the green lock up in the address bar, which indicates the secure connection.

    Now you can enjoy my site, knowing that it truly is me, talking to you, in secret…

     


  • I Get Mail!

    Morning weigh-in: 190.5#, 13.5% BF (must have been that midnight egg-salad sandwich)

    I get emails here, mostly spam, and mostly through my Contact Form, which then sends the messages along as emails. Things are usually quiet unless I post, but I’ve been posting almost every day lately, so I’ve been getting one or two every day. It’s kind of funny to watch the evolution of the subject/content as I slowly ban certain spammy words or phrases when I notice a pattern. I used to get way more than one or two; the low-hanging fruit is all blocked, and sometimes the ones that do get through make so little sense they just can’t be useful to the sender. Here are a few that I haven’t erased yet; most of this batch actually look like they’re supporting some business plan, however poorly thought out:

    Subject: We Will Like To Get Some Information?
    Message Body: Hey, I hope your day is going well; we are sending this message to acquire more information…To find out more about us, please visit our website at: [redacted for your benefit]

     

    Subject: just a though
    Message Body: GoodMorning, I was wondering if you’ve ever heard of [redacted]? its a place where you can purchase digital marketing services for extremely cheap from freelancers.
    I just wanted to ask you about it since I know that as a webmaster or business owner (like myself), you’re always looking for ways to save a few dollars on services like online marketing etc.
    Let me know what you think about it. I’ve used many gigs on there so I wanted to know your thoughts.I do recommend it though!
    I love your site! Have a great day!

     

    Subject: Improve Your Websites Rank in Google
    Message Body: New link building Software builds links to your website whilst you sleep. Save hours of manual labour and get your site ranked page 1 in Google in days. Submit your links to 1000’s of sites on complete autopilot and get a massive increase in traffic to your website. Try it out for a limited time at a special discounted price. Take a look at this powerful software in action; [redacted]

     

    And just today, my sad new favorite:

    Subject: Is It Too Late To Join The Bitcoin Revolution?
    Message Body: Bitcoin has increased in value by over 3000 in the last few years alone. The Crypto World is exploding with potential right now – this is the time to start and profit! Find out more; [redacted]

     

    Kind of click-baity, but at least they aren’t the word salad I sometimes get… I don’t use anything like Captcha, but I have blocked entire domains and blocks of IP addresses, as well as a whole lot of words and phrases — can you guess any of them from the above examples?


  • Too Much Pork For Just One Fork

    Morning weigh-in: 192.5#, 14.0% BF

    Whoops! That’s been up like that for a day or so now. We’ve been eating pulled pork, more pulled pork and leftover pork for the last few days too, so I see a pattern. I did get in a trainer workout Tuesday night, but I also found myself struggling, pushing harder than I thought necessary to get my heart rate up, and morning yoga/weights/push-ups seem harder over the last few days too. Natural slump? Over-training? I plan to just work through it.

    I went to the allergist yesterday. It was good to see her, get updates on my allergy sensitivities — not much has changed — and get some good advice (and medication) for dealing with my eczema. We’ll see how the new regimen works out.

    Meantime, I’ve been working on some Maps for CAT. Here’s a sample; it’s a work in progress but I feel pretty good about it. I actually used my routing program to do the directions — with a little bit of human editing!

    Bicycle route map: Nazareth to NCC
    Cycling route: Nazareth to NCC

  • Back Out In The Muck

    Yesterday’s ride along the canal was a slopfest, clothes and bike gray & gritty from the gravel/cinder surface, and I was whooped by a 14-mile ride over soft paths. So today, I’m heading out again, this time with Greg H to some actual trails, which hopefully stay drier and more solid. I’m heading out in a few minutes, just after blogging and a little lunch.

    A quick aside on the mapping front: I took a long time dithering about it, but I wrote my own chainage routine, and my own ascent/descent calculation function, both in PL/pgSQL, and both — especially the ascent routine, where there was a lot of room for improvement over my PyQGIS script — worked perfectly. (The ascent routine took about 20 minutes to run everything, as opposed to 4-8 hours for QGIS.) I still have to zero out the data at bridges, but I am now back to where I can wait for outside data (recommended routes, etc) to continue.
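
    The ascent/descent idea itself is simple enough to sketch in a few lines of Python. This is not my PL/pgSQL function, just the arithmetic under the assumption that each road comes with a list of elevations sampled in order along its length; the `ascent_descent` name and the sample numbers are made up.

```python
def ascent_descent(elevations):
    """Return (total_ascent, total_descent) for elevations sampled along a road."""
    ascent = 0.0
    descent = 0.0
    # compare each sample with the one before it
    for prev, curr in zip(elevations, elevations[1:]):
        delta = curr - prev
        if delta > 0:
            ascent += delta
        else:
            descent -= delta   # delta is zero or negative here
    return ascent, descent


# made-up elevations in meters
print(ascent_descent([100.0, 104.0, 103.0, 110.0, 110.0, 95.0]))
```

    Traveling the road in the opposite direction just swaps the two totals, so each road only needs to be processed once.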


  • Shoveling Out, On Steroids

    Another dusting last night, an easy shovel job but the neighborhood looks really pretty, especially on my walk this morning. Anne went early to deal with her office’s walkways, then met Debbie for breakfast at the new breakfast place on Main Street (the Flying Egg, go there, it’s pretty nice). Anyway, after I got up and shoveled here, I texted to see if she needed help; she replied that the job was done and I should come over and join them. Great start to the day, nice to see Debbie, and the point of my story was that it was beautiful out, with early-morning-rosy winter clouds, before it all morphed into a generic “sunny winter day,” which was nice in its own way but that early sky really was cool.

    On the home front, we got our new oven yesterday. It looks pretty nice and stainless-steel modern, the range is a bit more aggressive than our old one and, most important, the oven keeps the correct temperature. Too bad the delivery came while I was trying to sleep in — not too early really, but before 9:00, and I was trying to catch up on my sleep after a rough few days…

    I’ve had a bit of an eczema problem lately, and it really got crazy this week. We super-cleaned the house, I switched to baths instead of showers… and I broke down yesterday, went to my GP and got some prescription strength cortisone cream, as well as a Prednisone prescription. I’ve been warned about euphoria, mania etc as side effects, but nothing: I’ve basically been just putzing around the house today, though my skin is running through a fast-motion miracle cure so there’s that. I have an allergist appointment in the New Year, and I got a referral for a dermatologist from the GP. I’m going up in the attic soon to find the humidifier. Life goes on.

    Meantime, the mapping (rather, the fixing of the mapping scale-up problems) continues. I had problems with getting the elevation changes, and eventually had to abandon a QGIS solution and build my own PostGIS function to get the “chainages.” The term is apparently a holdover from ancient surveyor days, when they used chains to measure distances; what I needed was a shapefile of points, set every 10 meters along each road in the database, but the new file had to refer back to the road database in a certain way, and the QGIS plugin just wasn’t flexible enough for what I needed. (My solution worked like a charm.) The next step was to use SAGA and my elevation data to give each point an elevation. Since the new chainages were themselves now in the database rather than a standalone file, that process was its own struggle, or rather learning experience, but it’s done now. Next up is generating the ascent/descent data, which I might decide to do in PostGIS as well, since my current, PyQGIS-based method is run-all-night-check-results-in-the-morning slow. Tomorrow, or this weekend…
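
    For the curious, the chainage idea can be sketched in plain Python. This is not my PostGIS function; it assumes a road is a list of (x, y) vertices in a projected coordinate system measured in meters, and the `chainage_points` helper is a made-up name.

```python
import math


def chainage_points(line, spacing=10.0):
    """Return points every `spacing` meters along a polyline.

    line: list of (x, y) tuples in a projected coordinate system.
    The start point is always included; the end point is included only
    if the line length is an exact multiple of the spacing.
    """
    points = []
    next_dist = 0.0    # distance along the line of the next point to emit
    travelled = 0.0    # distance along the line at the start of this segment
    for (x0, y0), (x1, y1) in zip(line, line[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        while seg_len > 0 and next_dist <= travelled + seg_len:
            # fraction of the way along the current segment
            t = (next_dist - travelled) / seg_len
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            next_dist += spacing
        travelled += seg_len
    return points


# an L-shaped, 30 meter long made-up road
for point in chainage_points([(0.0, 0.0), (15.0, 0.0), (15.0, 15.0)]):
    print(point)
```

    The distance counter carries over from one segment to the next, so the spacing stays even around corners instead of restarting at every vertex.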


  • This Week Today

    Stopping by again…

    Mapping: I had, and still have, a few technical issues to deal with, but the full Lehigh Valley database is now in PostGIS, along with elevation data — bogus elevation data, that’s one of my technical issues — and the demo map can now route with the new database. But it’s got the slows, it’s got the slooowws… With about 3200 road segments in the “toy database,” it could route in about 1-2 seconds, but the full-map version took about 6 seconds per routing task — and there may be multiple routing tasks in each route, from start point, to via point and then through subsequent via points, and finally to the endpoint. Unacceptable!

    I did some searches online, and sure enough there are a lot of people complaining about pgRouting performance and looking to speed it up. The general consensus: there are a few things you can do, including tuning your database, but the actual bottlenecks are the pgRouting algorithms. Some suggested using osm2po, another program that converts OpenStreetMap data for databases but can also do routing: tried it and it’s blindingly fast. D’oh! (Unfortunately, I didn’t see much there in the way of customized, dynamic cost functions, so I can’t see how to turn it into the answer I’m looking for.) I tried a bunch of the Postgres/PostGIS performance-tuning tips anyway, and they did seem to help a little.

    I eventually came across one potential solution: route only on a subset of the roads in the database, using a bounding box. For each pair of points to route between, I find the smallest rectangle that contains both, then expand it by 2000 meters in every direction (like a buffer zone); this is my bounding box, and the routing search is limited to the roads that touch or fall within that box. This seemed to do the trick: my routing times are back down to about 1-2 seconds.
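
    The bounding box itself is just a bit of min/max arithmetic. Here is a minimal Python sketch, assuming the two routing points are (x, y) pairs in a projected coordinate system measured in meters; the `routing_bbox` name and the sample coordinates are made up, and the actual filtering of roads happens in the database.

```python
def routing_bbox(start, end, buffer_m=2000.0):
    """Smallest rectangle containing both points, grown by buffer_m on every side.

    Returns (min_x, min_y, max_x, max_y).
    """
    min_x = min(start[0], end[0]) - buffer_m
    min_y = min(start[1], end[1]) - buffer_m
    max_x = max(start[0], end[0]) + buffer_m
    max_y = max(start[1], end[1]) + buffer_m
    return (min_x, min_y, max_x, max_y)


# two made-up points that lie almost due east-west of each other
print(routing_bbox((1000.0, 5000.0), (4000.0, 5200.0)))
```

    Only road segments that touch or fall inside the returned rectangle would be handed to the routing search.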

    Except near (wait for it) those confounded bridges. The valley is broken up by the Lehigh River, with occasional bridges, and if there are no bridges within the bounding box for a route that needs to cross the river, no route will be found. Meanwhile, when routing points are on a diagonal, the bounding boxes are fairly big, but routing points that run mainly east-west or north-south produce long, skinny bounding boxes. I found a few “dead zones” where routes couldn’t be found, especially east-west ones north of Northampton, routes with skinny bounding boxes where the bridges are a little sparser. My original bounding boxes were expanded by a buffer that was only 1000 meters; I went to 2000 meters in an attempt to alleviate the bridge problem. This didn’t solve it entirely, but it did help, and there was no real performance hit going from 1000 to 2000 meters. I’ll probably look at distances between bridges, and revise my buffer zone to be just bigger than, say, half that distance.
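
    That last idea could look something like this in Python. It is only a sketch of something I haven’t done yet: it assumes the bridges can be reduced to distances along the river in meters, and the `buffer_from_bridges` helper, its `margin` parameter and the sample positions are all made up.

```python
def buffer_from_bridges(bridge_positions, margin=1.1):
    """Pick a buffer slightly larger than half the widest gap between bridges.

    bridge_positions: distances along the river, in meters.
    margin: how much bigger than half the widest gap the buffer should be.
    """
    if len(bridge_positions) < 2:
        raise ValueError("need at least two bridges to measure a gap")
    ordered = sorted(bridge_positions)
    # gaps between neighboring bridges
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return margin * max(gaps) / 2


# made-up bridge positions; the widest gap here is 4500 meters
print(buffer_from_bridges([0.0, 1500.0, 6000.0, 7000.0]))
```

    Half the widest gap is the farthest any point along the river can be from the nearest bridge, which is why a buffer just bigger than that should close most of the dead zones.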

    Reading: I picked up Don DeLillo’s Underworld again, intending to just read the first part. I love the first chapter but never finished the book because I found the rest boring; now I am engrossed and don’t know what I was thinking back then.

    Listening: WXPN has been playing “The 70’s, A-Z” this past week, every song they have in their library that was released in the Seventies, played in alphabetical order. We’ve been following along religiously, and it’s been fascinating and fun but they’re only up to “T,” and it gets wearing. Full disclosure: the radio is off right now…

    The only time they weren’t playing the 70’s was for their Friday “Free at Noon” concert at the World Cafe, which this week featured Russ’s band Cherry. So, we went down to Philly with Ray and Lorraine, where we met Frank and Patricia, and Ben, and Gabby, and we all watched the show and then went out to lunch with Russ at the White Dog Cafe. As always, we spent a few minutes at Penn Books before the ride home. All the talk in Philly, among us and overheard on the street, was about the upcoming snow on Saturday…

    By the way, Saturday was Luminaria Night in Bethlehem, here is a photo of ours:

    candles in bags on sidewalk
    Luminaria Night

    One last thing: here is what I wrote ten years ago.