Category Archives: Code

Non-web related coding, mostly Python.

Innotech iComm – it actually has an API

I’ve mentioned before that we run a rather venerable HVAC controller at the church which drives our heating system.

Last weekend, I was bored enough to start digging in to the network protocol it speaks, but before I spent hours slaving over Wireshark, I took a last look around to make sure I hadn’t missed an “official” programmable interface.

Somewhat to my surprise, I had!

iCommX help screen

1999’s idea of the future

Leaving aside a bit of sniggering about how ActiveX hasn’t been “The Way of the Future” since I was half my current age, let’s try firing it up from a sensible programming language (Python)…

import win32com.client

def get_px(block_name, field_name):
    """Constructs the COM object for reading and writing a block
    Block names can be viewed in MAXMon; most of the field names are also visible there if you open (double click) the block"""
    px = win32com.client.Dispatch("icommx.pointx")
    px.ServerAddress = "localhost"
    px.DeviceAddress = "6:1" # connection number, device number
    px.PointAddress = "%s~%s" % (block_name, field_name)
    return px

def get_temps():
    for tzone in ["Outside Air", "Boiler Flow", "Utility Space", "Church Space", "Hall Space", "HWS Cylinder"]:
        px = get_px(tzone, "OUTPUT")
        print(tzone, px.Value)

get_temps()

There are various hoops to jump through to generate the necessary Python COM bindings, which you can Google for. Needless to say, all this COM stuff requires us to be on Windows. Saving this as icp.py, let’s give it a try:

C:\Users\dtn>python icomm-python\icp.py
Outside Air 27.03
Boiler Flow 24.23
Utility Space 20.67
Church Space 21.32
Hall Space 23.91
HWS Cylinder 54.75

Result! (Hot day in Oxford … the boiler is not running, yet the hot water is toasty thanks to our solar panels.) You can also, somewhat scarily, assign to the “Value” field for blocks which are inputs such as override buttons or temperature set points.

The API is a bit limited because there’s no way to enumerate the available blocks – you just have to know (using one of the companion apps like MaxMon) what they are. More disappointingly, I couldn’t find a way to write the schedule on the seven-day timer blocks. However, a bit of thought made me realise this doesn’t matter too much: I can still write to a boolean flag connected to the “override” pin on those blocks, and write an override value into each one – number of minutes until heating next goes on/off.

Long story short, we can finally wire our church heating directly into the bookings calendar and stop having to program it by hand! Of course, we will need a permanently stationed Windows box, but we’re going to need one anyway for some other projects. Exciting times…

Replacing HallMaster

So, as explained in my last post, we needed a new invoicing system for the church hall. The back-of-a-beer-mat requirements were pretty simple…

  • One-off bookings can be invoiced by a manual process, but the regulars must be sent an invoice by e-mail automatically at the end of each month.
  • One invoice should contain all bookings for a customer in the period covered.
  • Must be possible to see an aggregated summary of which customers owe us how much money
  • Must be easy to track repeating series of bookings and make ad-hoc tweaks

There did not seem to be anything out there to buy (at least, not for sensible money), so I sat down over a couple of weekends and wrote it.

Naturally, I used other systems as much as possible. Back in the day, we used Google Calendar to track the actual bookings, which is a well known interface and worked nicely. Unlike HallMaster, it doesn’t automatically prevent clashing bookings – but we only have one human entering bookings, and this hasn’t been a problem in six months of operation. It also doesn’t guarantee that every calendar event is associated with a known customer – but since most of our invoices go out monthly, we solved that with a weekly nagging e-mail to the staff in question about any calendar events which can’t be matched to a customer or room.

So, with a Google Calendar full of bookings in place, I turned to Django for the next bit. Thanks to the automatic form generation (and a bit of customisation of the automatic admin interface), writing a web app to keep track of customers and their billing contact details was super-quick. Each customer also has a list of “event names” which are the titles under which their bookings appears in Google Calendar. This also allows the use of e.g. “Private Event 42” for customers who don’t want their name visible in the public calendar.

And now to the beating heart of the system: automatic invoicing.

This has to be customer-centric, so the first thing I implemented was a method on the Customer model which could generate an invoice for that customer’s bookings in a given time period. Looking things up in Google Calendar is super-easy: they have a nice, well-documented API including example code which can be copied and pasted for a fast start. Their API also has the crucial feature of forcing the “expansion” of repeating events, i.e. your code doesn’t have to understand repeats; it just gets an event for every repeat.

From there, looking up each room’s hourly cost according to whatever tariff the customer is on is an easy modelling and maths problem. Thankfully we’re not required to charge VAT owing to our size and charitable status, so that complication doesn’t have to be factored in.

The most complicated part turned out to be generating invoices in an acceptable format. My first thought was plain text e-mails, but getting a big table of bookings laid out nicely in one is a challenge, and probably impossible given that many e-mail clients don’t use a monospace font. What’s more, it would look rubbish on a mobile phone, which is where many e-mails get read these days.

Generating PDFs from Python doesn’t seem to be a super-easy solved problem, so in the end, I went for generating invoices as HTML (using a normal Django template) and then turning this in to PDF using wkhtmltopdf (which has some decent Python bindings). This does require the right fonts to be installed on the target system (ttf-mscorefonts-installer on Debian) to avoid invoices coming out in a blocky default font, but other than that blip it seems to be 100% reliable, and doesn’t require an X server.

We send these out attached to a plain text e-mail, which just says how much you owe and refers you to the attachment for details.

Naturally, the invoicing logic is covered by a wide set of automated tests. We do have the odd hirer whose bookings cross multiple days (i.e. span midnight), so I tested that in particular and also the possibility of bookings which span the clocks going forward or backward. Django makes this fairly easy to cope with: store all the times in UTC, which is also what the Google Calendar will give them to you in. Then all you need to do is format them for display to the user in local time (which covers BST when needed).

Automatically issued invoices (recipients redacted)

With that logic in place, it was just a quick custom management command to loop over all customers and invoice them if they are a regular and it’s the first of the month. Set that on a cron job for early morning on the first of the month, sit back, and relax.

Lessons learned

Make it easy to look under the hood. The Django admin interface helps with this, as does the facility I built in to CC all outgoing e-mails to extra recipients. I use this to stick them all in folders under my own inbox so I can refer back to them.

Ensure a failure to invoice one customer doesn’t stop all the others being invoiced. Easy enough with a try/catch, and the customer-centric design makes it easy to issue that one customer’s invoice later with a manual invocation from “manage.py shell” once the problem is fixed. This works well and we’ve had 100% successful invoicing runs for the last five months (out of six since going live).

Make it easy for people to get the payment reference right when paying you via BACS. Our system assigns a customer number to each hirer, and tells them to put just that as their payment reference. There’s no need for an invoice number or dates, and the reference can be the same for all payments, which helps with some online banking systems that make it painful to set a fresh reference for each payment to the same recipient.

  • This works pretty well, and nearly all customers manage to do the right thing. Some insist on quoting an invoice number (which is managable, we can just de-reference that back to the customer), and the odd one or two continue to make stuff up. I have an “upload bank statement” view which allows me to manually assign incoming payments where we haven’t detected the customer automatically. At the moment this is rare enough that I’m letting it slide, but if we ever get an automatic feed in of this information (Monzo and Starling, where are those charity accounts?), we will start cracking down and insisting on the right reference.

Don’t start your invoice numbers with the letter I, people will mis-read it as the number 1. We solved that by switching to “V” instead.

The end result is less than 2,000 lines of code which I’m continuing to evolve and tweak, but overall, this has solved it. We can see exactly who owes us money, split between the one-ofs and the regulars, and being reliably sent a bill on time is helping the vast majority of them to pay up promptly with the right reference. We’re approaching the mid-point of the year and well on track to get back to previously attained levels of hiring income.

The authors of HallMaster might well argue that I haven’t replaced their entire product with two weekends and not much code – and they’d be right, my system is designed to do exactly what we need and not service multiple customers – but I can’t help but feel it’s high time they offered fully automatic invoicing to really save their users some time. It’s not difficult and it makes things feel so much more professional.

Chromebook: the story so far

Google kindly gave my employer a Chromebook recently – all the better to talk to Google’s marketing team on about what one might buy. And once that call was over, there was no immediate use for it around the office, so I … borrowed it … for an extended road test at my flat.

Asus C202S Chromebook

So far, I’m impressed. This particular Chromebook is the Asus C202S, and it’s almost supplanted my company-issue Lenovo X230 as my do-everything laptop at home. One particularly impressive thing is that it can sit on my lap for hours without producing enough heat to be uncomfortable – something any “proper” laptop fails at within 30 minutes or so.

Since nearly everything is done via a browser these days, doing life admin and messing around on the internet is all taken care of. The only things I miss are Thunderbird (because the best webmail client I can find lacks the “redirect” feature) and an SSH client (yes, I have an installation of Shell In A Box, but that requires a password rather than SSH keys).

It uses some slick integration with Google Drive to make up for the lack of a local filesystem, and given I have an Android phone, that works well.

All in all, I suspect I’ll be buying one of these as my next “home laptop” rather than anything more full-fat. But then, I do still have a desktop with two 19″ monitors for the heavy lifting.

Sometimes, you just don’t want to be right

It’s been a few years since I “enjoyed” writing code to mangle Microsoft Office documents, but I’ve been getting back into such things at work lately. My team and I had lots of fun yesterday tracking down the source of a bug in POI which led to Excel mangling comments when the file was saved. Frankly, I wasn’t sure to be delighted or disgusted that the “fix” was changing an ID value in the XML output. Yes, it would seem that a so-called arbitrary identifier has significance to Excel beyond cross-referencing elements in the same file, and using the same not-so-arbitrary identifier written by Excel itself dissuades it from trashing the file on save.

Google and IPv6 e-mail

Update: The change described below does not seem to have reliably stopped Google from bouncing my e-mails. Time to ask them what they’re doing…

I obviously spoke too soon. Having complimented Google for finally enabling IPv6 on Google Apps, I was lying in bed this morning firing off a few e-mails from my phone when this bounce came back:

This message was created automatically by mail delivery software. A message that you sent could not be delivered to one or more of its recipients. This is a permanent error. The following address(es) failed:

example@my-friend-with-google-apps.example.com

SMTP error from remote mail server after end of data:
host ASPMX.L.GOOGLE.COM [2a00:1450:400c:c05::1a]:
550-5.7.1 [2001:41c8:10a:400::1 16] Our system has detected that this 550-5.7.1 message does not meet IPv6 sending guidelines regarding PTR records 
550-5.7.1 and authentication. Please review 550-5.7.1 https://support.google.com/mail/?p=ipv6_authentication_error for more 550 5.7.1 information. ek7si798308wic.60 - gsmtp

Hmm. The recipient address has been changed, but the rest of the above is verbatim. The page Google link to says:

“The sending IP must have a PTR record (i.e., a reverse DNS of the sending IP) and it should match the IP obtained via the forward DNS resolution of the hostname specified in the PTR record. Otherwise, mail will be marked as spam or possibly rejected.”

All of which is reasonable-ish, but the sending IP does have a PTR record which matches the IP obtained by forward resolution:

david@jade:~$ host 2001:41c8:10a:400::1
1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.4.0.a.0.1.0.8.c.1.4.1.0.0.2.ip6.arpa domain name pointer diamond.dnorth.net.

david@jade:~$ host diamond.dnorth.net.
diamond.dnorth.net has address 212.110.165.240
diamond.dnorth.net has IPv6 address 2001:41c8:10a:400::1

So what are they objecting to? Some Googling and some speculation suggests that they might be looking at all hosts in the chain handling the message (!). Further down the bounce in the original message text we find:

Received: from [2a01:348:1af:0:1571:f2fc:1a42:9b38]
	by diamond.dnorth.net with esmtpsa (TLS1.0:RSA_ARCFOUR_MD5:128)
	(Exim 4.80)
	(envelope-from me@mydomain.example.com)
	id 1Vrm3Q-0002Ay-NH; Sat, 14 Dec 2013 10:02:36 +0000

Now, the IPv6 address given there is the one my phone had at the time. It doesn’t have reverse DNS because you can’t disable IPv6 privacy extensions in Android (also Google’s fault!), and assigning reverse DNS to my entire /64 would require a zone file many gigabytes big.

At this point, it’s probably best to stop speculating on Google’s opaque system and start working around it from my end. Others have resorted to disabling IPv6 for their e-mail server altogether – no thanks – or just for sending to gmail.com. This latter approach doesn’t work for me as the example above involves my-friend-with-google-apps.example.com – and potentially lots of different domains will be using Google Apps for mail, so a simple domain-based white/blacklist isn’t going to cut it.

After spending some time with the excellent Exim manual, I’ve come up with a solution. It involves replacing the dnslookup router with two routers, one for mail to GMail/Google Apps hosted domains, and one for other traffic. Other settings on the routers are omitted for brevity, but you should probably keep the settings you found originally.

dnslookup_non_google:
  debug_print = "R: dnslookup (non-google) for $local_part@$domain"
  # note this matches the host name of the target MX
  ignore_target_hosts = *.google.com : *.googlemail.com
  # not no_more, because the google one might take it

dnslookup_google:
  debug_print = "R: dnslookup (google) for $local_part@$domain"
  # strip received headers to avoid Google's silly IPv6 rules
  headers_remove = Received
  headers_add = X-Received: Authenticated device belonging to me or one of my users
  no_more

SQL? Dude, you’re doing it wrong

I’ve increasingly formed the opinion over the past few years that (almost) anyone writing software, certainly in the SME or 90% of open-source space, simply shouldn’t be writing raw SQL.

This is the 21st century, and all the major programming languages have these things called ORMs. Since all you actually wanted from your database was some kind of load/save/search for the objects that make up your software’s state, it turns out encoding that metaphor at the object level is much nicer than writing the code to do it all yourself.

The added bonus of this is that when you want support for a new DBMS, you just need to see if your ORM has support. Generate a schema, run through all your tests, a couple of minor fixes, job done. And you’ll make your sysadmin a happy man by not dictating a choice of DBMS to him or her which makes their life harder.

 You may be drawing breath to argue that writing the raw SQL yourself by hand is ‘more efficient’, but come on. Your blog has a couple of hundred posts and maybe a few thousand comments. The daily hit rate of ten and a half people isn’t going to tax even MySQL – so whether you use it, PostgreSQL, or even the free edition of IBM’s DB2 (eight-character limit on database names, anyone?) really doesn’t matter. It’s an implementation detail you shouldn’t worry too much about, and certainly shouldn’t prematurely optimize by getting too familiar with.

Django has led the way in the Python world for years on the ORM question, but all the other languages have them too. Make the jump, and you can always escape to raw SQL if you really need it in one corner of your application.

Dear Internet, any ideas for a free(ish) room booking system?

I’ve been tasked with finding a room booking/hiring system for St Columba’s.

Requirements:

  • Booking of different spaces by different users
  • Ability to generate a report of billable hours outstanding for a given user
  • Ability to send invoices and reminders to users and internal staff
  • Integration with Google Calendar, as that’s how we publish hiring timetables on the website at the moment

At the moment, the most promising free option out there seems to be MRBS, but I’d have to add the Google Calendar and invoicing integration myself.

Does anyone have any bright ideas? Let me know if so.

The port 0 trick

The port 0 trick came in handy when writing eximunit, and it’s something surprisingly few developers know about, so I thought it worth recounting here:

The problem: you want to set up a web/mail/whatever server programmatically (e.g. as part of some tests). This server wants to bind to port 80/25/whatever. Your first problem is that it can’t bind to these because you’re not running your tests as root (or as an administrator on Windows).

The lazy approach at this point is to hard-code a port number over 1024, which you don’t have to be privileged to bind to. But this all falls to bits if you want to run the same test simultaneously on the same machine, or you need lots of different ports during the course of one test.

At this point, you can reach for the port 0 trick: on both Windows and Linux, if you bind a socket to port 0, the kernel will assign it a free port number somewhere above 1024. Truly well-written software (e.g. Jetty) will not only let you configure it to bind to port 0, but will make it easy to parse its logs to obtain the actual port number it got assigned. Less helpful software (Tomcat) will let you configure it to bind to port 0, but print 0 in all its logs, never the actual number. And the majority of software just won’t let you put 0 as a port number in its configuration.

At this point (subject to a slight race condition), you can grab some port numbers yourself and feed them to whatever you’re trying to configure:

def findFreePorts(howMany=1):
    """Return a list of n free port numbers on localhost"""
    results = []
    sockets = []
    for x in range(howMany):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.bind(('localhost', 0))
        # work out what the actual port number it's bound to is
        addr, port = s.getsockname()
        results.append(port)
        sockets.append(s)

    for s in sockets:
        s.close()

    return results

The above is written in Python, but it translates trivially to any programming language which knows what a socket is.

You’ll note that the correct way to get, say, five free port numbers is to call the above method once with 5 as its argument. If you wrote a simple method which just returned one number, there would be nothing to stop it returning the same number each time you called it (the Linux kernel is rather more helpful, and will usually hand out a different number to each port 0 request, but I wouldn’t  rely on this).

Introducing eximunit

For a few years now, I’ve run a hosting co-operative with a few friends. Although the cost savings versus all renting VMs individually are probably marginal at best these days, one of the nice things about it is the chance to run things like our incoming MX on one machine only, instead of all having to run our own anti-spam and other measures. The incoming mail is handled by Exim, and each user of our system can add domains for which mail is processed. They get to toggle SMTP-time rejection of spam and viruses, and specify the final destination machine for incoming mail to their domain.

This has all been working well for over  two years, but occasionally something has to change: a few months ago, we got rid of sender verify callouts, now widely considered abusive by SMTP server admins, and more recently we added support for tagging messages with headers to say if they passed or failed DKIM verification. And every time I make such a change, I worry that I might have inadvertently broken something. This server handles mail for 30 domains and 8 people, some of who rely on it to run businesses! Panic!

I usually end up reassuring myself by doing some ad-hoc testing by hand after reconfiguring the server. At the most basic level, whatever your SMTP server is, you can use netcat to have a conversation with it on port 25:

d@s:~$ nc localhost 25
220 mailserver.splice.org.uk ESMTP Exim 4.71 Sat, 17 Mar 2012 09:51:20 +0000
HELO localhost
250 mailserver.splice.org.uk Hello localhost [127.0.0.1]
MAIL FROM: <>
250 OK
RCPT TO: someaddress@dnorth.net
550-Callout verification failed:
550 550 Unrouteable address
QUIT
221 mailserver.splice.org.uk closing connection

And there, I’ve just convinced myself that one of our features is still working: the mailserver should call forward to the final destination for mail to @dnorth.net addresses to check the local part (‘someaddress’ in this case) is valid, and reject the message up-front if it’s not.

Exim also has a load of other toys you can take advantage of: say I want to check how mail to webmaster@dnorth.net is routed:

d@s:~$ exim4 -bt webmaster@dnorth.net
R: hubbed_hosts for dnorth.net
webmaster@dnorth.net
 router = hubbed_hosts, transport = remote_smtp
 host mx-internal.dnorth.net [192.0.2.100]

(IP addresses changed for example purposes, obviously)

And finally, there’s debug mode: you can run

exim4 -bhc <ip address>

to run a complete ‘fake’ SMTP session as though you were connecting from the given IP address. You can send messages, but they won’t actually go through, and exim prints a lot of debug output to give you a clue as to its inner workings as it decides how to route the message.

This is all very well, but a quick brainstorming session gives a list of over 30 things I might want to check about my mailserver:

  • Basic check that mail is accepted to our domains
  • Only existent addresses on our domains should have mail accepted
  • Domains with SMTP-time spam rejection on should have spam rejected
  • Same for viruses
  • Same for greylisting

Testing all these by hand isn’t going to fly, so what tools can we find for automating it? A bit of Googling turns up swaks, which looks quite handy, but suffers from two drawbacks for me: first, it’s a bit low-level, and a collection of scripts calling it will be a bit difficult to read and maintain for testing all 30 of my assertions. Second, it really sends the e-mails in the success case, and I don’t want my users to get test messages or have to set up aliases for receiving and discarding them. swaks will definitely become my tool of choice for ad-hoc testing in future, but meanwhile…

The other promising Google result is Test::MTA::Exim4, which is a Perl wrapper for testing an exim config file. However, a few problems: (1) it’s Perl, and I Don’t Do Perl. (2), it’s limited to testing the routing of addresses, so it’s not going to cut it for checking spam rejection etc.

Having at least pretended not to be suffering from NIH syndrome, let’s spec out a fantasy system for doing what I want: I would like to be able to write some nice high-level tests in my favourite language, Python, which look a bit like this

class HubbedDomainTests(EximTestCase):
    """
    Tests for domains our server acts as the 'proxy MX' for, doing
    scanning etc before forwarding the mail to the destination machine
    """

    def testProxiedMailAccepted(self):
        """Proxied mail should be accepted"""
        session = self.newSession()
        session.mailFrom('dtnorthie@gmail.com').rcptTo('webmaster@dnorth.net').randomData()

    def testLocalPartsVerifiedWithDestinationMachine(self):
        """Local parts should be verified with the destination machine"""
        session = self.newSession()
        session.mailFrom('dtnorthie@gmail.com').assertRcptToRejected('doesnotexist@dnorth.net')

I could then run these in the usual manner for Python unit tests, and lastly, I want them backed by an exim4 -bhc session so that they’re as realistic as possible without actually sending messages.

This post is long enough already, so I’ll cut to the chase and say that I’ve made a start on writing it, and you can find out more at Bitbucket. In a follow-up post, I’ll talk about how it was done.