Category Archives: SysAdmin

Solid State Disks

For my less geeky readers, a Solid State Disk (SSD) is similar to the mechanical “hard disk” which has traditionally been the storage in most PCs and laptops. However, an SSD has no moving parts and works entirely off memory chips – a bit like a USB memory stick. The big advantage of them is they’re a lot faster to read data from than a mechanical hard disk.

SSDs have been standard issue at work for a while now, but I hadn’t yet had occasion to buy one myself. I was pleasantly surprised by how much of a price collapse had taken place – I picked up a 250GB SSD from Ebuyer for £64. The use-case was my home desktop PC, which was glacially slow at resuming from hibernate, and struggled to run Windows under VirtualBox. A Core2 Duo should have no problem with this from the CPU side, but it felt like I/O performance was the problem.

I was surprised by just how small and light the SSD was – of course, with no moving parts or motors to spin the platters, it’s about half the size and a tenth the weight of the equivalent mechanical disk.

Out with the old, in with the new

Since the SSD was the same size (250 marketing gigabytes, which is to say 232.9 actual gigabytes) as the “spinning rust” it was replacing, I simply connected it, booted into SystemRescueCD from my network boot server, and used dd to copy the block device of the old disk over onto the new one. It looked like it was going to take about four hours, so I issued a shutdown for five hours hence and went out for the day.
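The clone itself is a single dd invocation. Here's a minimal sketch of the idea using throwaway image files in place of the real block devices – with actual disks the if=/of= arguments would be the whole-disk devices, so triple-check yours with lsblk before running anything like this:

```shell
# Stand-ins for the real devices; on real hardware these would be
# something like /dev/sda (old disk) and /dev/sdb (new SSD)
OLD=/tmp/old-disk.img
NEW=/tmp/new-disk.img

# Fabricate a small "old disk" so the example is self-contained
dd if=/dev/urandom of="$OLD" bs=1M count=8 2>/dev/null

# The actual clone: copy every block, carrying on past read errors
dd if="$OLD" of="$NEW" bs=4M conv=noerror,sync 2>/dev/null

# Belt and braces: confirm the copy is bit-identical
cmp "$OLD" "$NEW" && echo "clone OK"
```

The five-hour shutdown was nothing cleverer than `shutdown -h +300`.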

Having now had a chance to try it out, I’m properly impressed – the machine boots and resumes much faster and Windows under VirtualBox is now snappy enough to be usable. At that sort of price, I shall have to see about getting one for my laptop too.

Update: some good discussion on social media. The downsides of SSDs are pointed out, e.g. limited write capacity (if you were to write to this one continuously for 2.5 days, you’d wear it out), and that they aren’t suitable for archiving as power-off data retention can be limited to months. None of this matters for my use-case, but enterprise-grade SSDs with enterprise-grade price tags also exist to try and solve at least the write lifetime issue.

Thunderbird Autoconfiguration

If you’re a diehard like me who still runs their own e-mail server – perhaps for a few friends and family as well as yourself – you might find Thunderbird’s autoconfiguration useful. Even if you only set up Thunderbird on a new machine once a year, you’ll be into a net saving of time the second year. And it’s especially useful if you have to talk a less technically-minded user through the setup, because all they need is their e-mail address and password.

You simply write an XML file in this format, and expose it on your web server at the standard autoconfig address (http://autoconfig.yourdomain/mail/config-v1.1.xml). Thunderbird should then proudly announce “Configuration found at ISP”. Sorted.
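For reference, a skeleton of the config-v1.1.xml format looks like this – the domain, hostnames and provider id here are placeholders, so point them at your own server:

```xml
<clientConfig version="1.1">
  <emailProvider id="example.org">
    <domain>example.org</domain>
    <displayName>Example Mail</displayName>
    <incomingServer type="imap">
      <hostname>mail.example.org</hostname>
      <port>993</port>
      <socketType>SSL</socketType>
      <authentication>password-cleartext</authentication>
      <username>%EMAILADDRESS%</username>
    </incomingServer>
    <outgoingServer type="smtp">
      <hostname>mail.example.org</hostname>
      <port>587</port>
      <socketType>STARTTLS</socketType>
      <authentication>password-cleartext</authentication>
      <username>%EMAILADDRESS%</username>
    </outgoingServer>
  </emailProvider>
</clientConfig>
```

The %EMAILADDRESS% placeholder is substituted by Thunderbird with whatever the user typed in, which is what makes the address-and-password-only setup possible.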

Raspberry Pi network boot server

I’ve been erasing and reinstalling a few machines lately – some for sale on eBay, some to set them up for friends. And I’ve finally reached the point where I’m fed up of having to dig out the writable DVDs and download DBAN or Knoppix for the nth time. It’s especially irritating when the computer in question doesn’t have a DVD drive.

One answer to this is network booting – any PC with an ethernet port these days can usually do it, so all I needed was a network boot server on my LAN at home. I’ve always been a little nervous of this as it means taking the job of doing DHCP away from the router and doing it myself on a Linux box*, but since I have an always-on Raspberry Pi anyway, I thought I might as well have a go.

There’s plenty of information on the internet about this sort of thing, but most of it is out of date or needlessly complicated. I used dnsmasq as my DHCP and TFTP server, and the very latest and greatest pxelinux to boot to. It’s worth noting that if you get a new enough version (I compiled myself a 6.03), it has native support for fetching the things to be booted over HTTP (no need to mess around chain-loading something more complicated to do the HTTP bit). This is important, as TFTP is glacially slow to the point of being unusable for multi-megabyte boot images.

Armed with this, you can do the following in dnsmasq.conf (the address range, hostname and paths are examples – adjust for your LAN):

# Network boot
dhcp-range=192.168.1.50,192.168.1.150,12h
enable-tftp
tftp-root=/srv/tftp
dhcp-boot=pxelinux.0
# DHCP Option 210 = PXE path prefix (RFC 5071)
dhcp-option-force=210,http://netboot.lan/boot/

You need to use Apache or similar to expose the right files under /boot/, of course. With that in place, you can use paths relative to http://your-network-boot-server/boot/ in the menu configuration for pxelinux, and you should see things being booted over HTTP nice and quickly.
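A menu configuration along these lines should then work – the hostname and image paths below are made up for illustration, and yours will differ:

```
# /srv/tftp/pxelinux.cfg/default
DEFAULT menu.c32
PROMPT 0
TIMEOUT 100

LABEL sysrescue
  MENU LABEL SystemRescueCD
  # kernel and initrd fetched over HTTP thanks to pxelinux 6.x native support
  KERNEL http://netboot.lan/boot/sysrcd/rescue64
  APPEND initrd=http://netboot.lan/boot/sysrcd/initram.igz
```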

I’ll finish with a screenshot of a three-year-old laptop displaying my network boot menu:


* Yes, if you have a clever enough router you can simply have it announce the network boot server in its DHCP replies and leave it doing DHCP, but when was the last time you saw that option in a consumer-grade router?

Linux Capabilities and rdiff-backup

Hazel Smith gave an excellent talk at FLOSS UK’s Unconference in London last weekend about Linux Capabilities and using them to run a backup system with minimal permissions. Several people in the room, myself among them, sat up and went “Nice idea. I’ll be using that”. Here’s what I did:


For many years, I’ve used rdiff-backup to back up a bunch of different Linux systems. It works well, keeps the most recent backup on disk in its original form (including file owners and permissions) and allows access to the previous n days’ worth of backups too, stored efficiently. I keep 30 days and it’s occasionally been handy to have the history available.

My system was “pull” based as you’d expect: the backup server logged into each of the to-be-backed-up systems over SSH and ran rdiff-backup via sudo. You can then configure sudo so that the “backuphelper” user rdiff-backup logs in as can run rdiff-backup as root without a password being prompted for. This then gives rdiff-backup the power to read all the files and do a system-wide backup. On the receiving end, the entire rdiff-backup process and the scripts calling it run as a cron job under the root user so it can preserve owners and permissions on the backed-up data.

The problems

This system had served me well for a number of years, but as Hazel’s talk pointed out, it definitely violates the principle of least privilege to run the entire process as root on both the backup and target servers.

Fixing it: the target servers

I followed a similar process to the slides. On Debian, if you haven’t fiddled with the contents of /etc/pam.d then the installation of the libpam-cap package automatically adds the necessary line to common-auth, so it works for SSH, cron and su spawned shells, and you only need to configure your grants in /etc/security/capability.conf.

Because rdiff-backup is written in Python, you can’t set capabilities on the script itself: the shell spawns a Python process so you need to set the capabilities on /usr/bin/python (or rather /usr/bin/python2.7 at the time of writing, as the former is a symlink). There is much waffle on the internet about how unfortunate it feels to be putting capabilities on the interpreter rather than the intended program. However, since in this case the capabilities only work if Python is run from a user who has already been granted them by pam_cap, it doesn’t seem like too much of a problem. I briefly pondered using a hard link python-with-extra-caps which was owned by ‘backuphelper’, but that felt like a maintenance burden. Overall, I still think doing it this way exposes a much smaller attack surface than running rdiff-backup as root.
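Concretely, the grant on a target machine boils down to one line of pam_cap configuration – the ‘backuphelper’ user name and python path are the ones assumed above, and the file name is libpam-cap’s default:

```
# /etc/security/capability.conf -- inheritable capability for the backup user
cap_dac_read_search backuphelper
```

The interpreter file then needs the matching capability raised so the inheritable bit can take effect, e.g. `setcap cap_dac_read_search+ei /usr/bin/python2.7`.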

Taking it further

I see that Hazel’s post-talk additions to the slides noted the further possibility of using capabilities to avoid running the backup process as root on the backup server. I had a go and managed to get it working.

Again, because rdiff-backup is Python, the capabilities needed setting on the Python interpreter. Once more I made a dedicated ‘backuphelper’ or similar user to run the backup process. I moved the SSH keys and known_hosts list across from ‘root’ which used to own them, and found that rdiff-backup at the backup server end needs CAP_DAC_OVERRIDE (the ability to write arbitrary files, not just read them), and also CAP_FOWNER if you’re expecting it to preserve Linux owner/group information and permissions.
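On the backup server, the equivalent pam_cap grant covers both capabilities (same caveats as before – user name assumed, file name per libpam-cap):

```
# /etc/security/capability.conf on the backup server
cap_dac_override,cap_fowner backuphelper
```

And likewise the interpreter needs the file capability raised, e.g. `setcap cap_dac_override,cap_fowner+ei /usr/bin/python2.7`.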

I don’t think the backup server has gained a great deal of security from this: if you can write and chown/chmod arbitrary files, then you can certainly take total control of a system in a few steps. But at the very least, not running it all as root limits the damage that can be done by accidents (bugs/flaws in rdiff-backup) and adds more steps to get in the way of a poorly crafted or targeted exploit.

A couple of other twiddles were needed: if your backup process sends e-mail, it will now do so as ‘backuphelper’ not root, so make sure that user has a sensible from address. Lastly, my backup scripts run ‘shutdown’ to switch off the backup server when it’s finished work. I had to arrange for this to be done via sudo now that the entire backup process is no longer run by root.
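The shutdown permission is a one-line sudoers rule – the user name and command path are assumptions, and it's best dropped into a file under /etc/sudoers.d via visudo:

```
# Allow the backup user to power the machine off, and nothing else
backuphelper ALL = (root) NOPASSWD: /sbin/shutdown
```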


As in many configuration files, order matters in the PAM configuration. RTFM. If you want to see whether pam_cap is working, you can do this:

grep CapInh /proc/$$/status

And use capsh --decode= on the resulting hex string to understand what you’ve got. If it’s all zeroes, check capability.conf and your PAM configuration.

Last words

If you’re thinking of giving your backup system an overhaul, don’t forget to test whether the backups the new system takes can actually be restored from.

FLOSS Unconference 2015

I had a good time at this in London yesterday – some interesting talks in the morning including one about Linux Capabilities which I’ll definitely be lifting some ideas from, and a couple of my questions (“Why don’t developers and sysadmins like each other?”, “What did you do with your Raspberry Pi?”) were discussed in the afternoon.

I thought I’d be in a minority as a developer, but in fact it was about two-thirds dev and one-third sysadmin. Some of us considered ourselves both, of course.

Unfortunately I was feeling quite wrung out after a long week and decided to make a dash for the early train home, but I’ll definitely be back at similar events in future.

TalkTalk Business, the good and the bad

Last year, the assimilation of Be into Sky prompted us to have a think about our internet provider at the church. We have a single phone line into the church office, rarely used for calls but often used for internet.

Prompted by the attraction of having one bill to pay, and not paying as much for line rental as we were to BT, we moved both phone and internet over to TalkTalk’s business offering. They did send us a router, but I just plugged their account details into our existing one and left everything as it was. The switch-over was refreshingly simple – because they were providing the landline too and plugging it into their equipment at the exchange, faffing around with migration codes for the ADSL wasn’t necessary.

For six months, all seemed well – our ADSL was fine and running at 9Mbps, and the phone line we never used was presumably OK. Then a couple of weeks ago, the real test of the supplier started when the line developed a fault. Our ADSL wouldn’t stay sync’d and was running at a third of its usual speed with massive packet loss.

I was pleasantly surprised to get hold of a human being in support on a bank holiday Monday, and even more so when he was prepared to take my word for it, without arguing, that I’d tried replacing everything my side of the master socket and even used the test socket* to eliminate a possible fault in our equipment. The only annoyance was a classic call-centre screw-up – the automated system picks up, asks you to key in the phone number you’re calling about, then puts you through to a human who … asks you for the phone number you’re calling about. Pretty shaky for a telecoms company…

It all went a bit sideways from there – I explained that the church building isn’t manned continuously, so if he was going to get BT Openreach to send an engineer, he needed to (a) call me back and say when that would be, and (b) make damn sure the engineer had my mobile number. Neither of those things happened, and I had to call back two days later to be told an engineer had been sent and failed to gain access to the premises. I was told an engineer had been re-booked for Thursday between 1 and 6. I duly spent my Thursday afternoon sat in a chilly church with no WiFi, and phoned to tell them nobody had turned up. I was told that I had been misinformed, and they’d failed to re-book the engineer. Suppressing my anger, I asked them to try again and make less of a hash of it. This time, I got the 8am to 1pm slot on Friday morning, and thankfully BT’s man arrived by 8.15.

He swiftly identified a junction box just inside our property (but before the master socket) which was full of water. One replacement later, and everything is fine again.

I’m not sure who to blame for the screw-up over sending an engineer – perhaps such problems are an inevitable side-effect of being TalkTalk’s customer on BT’s piece of copper, much like the loss of accountability between Network Rail and the train companies.

What else? I wasn’t impressed by TalkTalk’s free router – the web interface has a noticeable delay on all operations, and it’s slightly lacking in features (e.g. you can configure the DHCP to always give the same IP address to a given device, but you can’t specify which IP address). The fact that the £20 TP-Link one I got from Argos is better is a bit of a clue as to how much they spent on theirs.

* Un-screw the faceplate from your master socket, and you’ll find the test socket behind it.

Blocking executable files (even buried inside ZIPs) in Exim

One of the handful of family members I host e-mail for had a narrow escape the other day, just about managing to avoid opening an .exe file buried inside a ZIP file attached to an e-mail purporting to be from Amazon.

The quality of some fake e-mails sloshing around these days is very, very good, and it seemed in this case that even the full might of SpamAssassin and ClamAV (with unofficial malware signatures) hadn’t sufficed to stop this one getting to the user’s inbox.

Spurred on by the thought of how long it might have taken me to disinfect their Windows box if they’d opened the .exe, I decided to take more drastic measures and block attachments containing .exes on the server.

Plenty of recipes for doing this are to be found on the net. The really nice bit for me, though, was the chance to break out Eximunit and do some test-driven sysadmin:

import os

from eximunit import EximTestCase

from email.MIMEMultipart import MIMEMultipart
from email.MIMEBase import MIMEBase
from email.MIMEText import MIMEText
from email.Utils import COMMASPACE, formatdate
from email import Encoders

EXE_REJECT_MSG = """Executable attachments are not accepted. Contact postmaster if you have a
legitimate reason to send such files."""

ZIP_EXE_REJECT_MSG = """Executable attachments are not accepted, even inside ZIP files. Contact
postmaster if you have a legitimate reason to send such files."""

class ExeTests(EximTestCase):
    """Tests for .exe rejection"""

    def setUp(self):
        # Set the default IP for sessions to be faked from
        # (the actual call and address were elided from the original post)
        pass

    def testDavidDomainRejectsExe(self):
        # session/envelope chain reconstructed -- newSession() and mailFrom()
        # are assumed eximunit method names; addresses were elided
        (self.newSession().mailFrom('')
             .rcptTo('')
             .assertDataRejected(self.messageWithAttachment('test.exe'),
                                 EXE_REJECT_MSG))

    def testDavidDomainRejectsExeZip(self):
        # as above; the ZIP test file name was also elided
        (self.newSession().mailFrom('')
             .rcptTo('')
             .assertDataRejected(self.messageWithAttachment(''),
                                 ZIP_EXE_REJECT_MSG))

    def testDavidDomainAcceptsJPG(self):
        # positive case -- assertDataAccepted assumed as the counterpart
        # to assertDataRejected
        (self.newSession().mailFrom('')
             .rcptTo('')
             .assertDataAccepted(self.messageWithAttachment('test.jpg')))

    def testDavidDomainAcceptsJpgZip(self):
        (self.newSession().mailFrom('')
             .rcptTo('')
             .assertDataAccepted(self.messageWithAttachment('')))

    def messageWithAttachment(self, filename):
        msg = MIMEMultipart()
        msg['From'] = ''
        msg['To'] = COMMASPACE.join([''])
        msg['Date'] = formatdate(localtime=True)
        msg['Subject'] = 'This is a subject about .exe and or .zip'

        msg.attach(MIMEText('Test message body'))

        part = MIMEBase('application', "octet-stream")
        part.set_payload(open(filename, "rb").read())
        Encoders.encode_base64(part)
        part.add_header('Content-Disposition',
                        'attachment; filename="%s"' % os.path.basename(filename))
        msg.attach(part)
        return msg.as_string()

The tests (both positive and negative cases) helped me to hammer out a couple of initial bugs. I really don’t know how anyone runs a live e-mail service without this sort of reassurance when tweaking the settings.

P.S. In the 48 hours since it went live, the new check has rejected over 60 messages, all of them containing a single .exe buried inside a ZIP. Many, but not all, of the messages are purporting to be from Amazon, and a surprising variety of different hosts are sending them, presumably part of a botnet of compromised machines.

Google and IPv6 e-mail

Update: The change described below does not seem to have reliably stopped Google from bouncing my e-mails. Time to ask them what they’re doing…

I obviously spoke too soon. Having complimented Google for finally enabling IPv6 on Google Apps, I was lying in bed this morning firing off a few e-mails from my phone when this bounce came back:

This message was created automatically by mail delivery software. A message that you sent could not be delivered to one or more of its recipients. This is a permanent error. The following address(es) failed:

SMTP error from remote mail server after end of data:
host ASPMX.L.GOOGLE.COM [2a00:1450:400c:c05::1a]:
550-5.7.1 [2001:41c8:10a:400::1 16] Our system has detected that this
550-5.7.1 message does not meet IPv6 sending guidelines regarding PTR records
550-5.7.1 and authentication. Please review
550 5.7.1 for more information. ek7si798308wic.60 - gsmtp

Hmm. The recipient address has been changed, but the rest of the above is verbatim. The page Google link to says:

“The sending IP must have a PTR record (i.e., a reverse DNS of the sending IP) and it should match the IP obtained via the forward DNS resolution of the hostname specified in the PTR record. Otherwise, mail will be marked as spam or possibly rejected.”

All of which is reasonable-ish, but the sending IP does have a PTR record which matches the IP obtained by forward resolution:

david@jade:~$ host 2001:41c8:10a:400::1 domain name pointer

david@jade:~$ host has address has IPv6 address 2001:41c8:10a:400::1

So what are they objecting to? Some Googling and some speculation suggests that they might be looking at all hosts in the chain handling the message (!). Further down the bounce in the original message text we find:

Received: from [2a01:348:1af:0:1571:f2fc:1a42:9b38]
	by with esmtpsa (TLS1.0:RSA_ARCFOUR_MD5:128)
	(Exim 4.80)
	id 1Vrm3Q-0002Ay-NH; Sat, 14 Dec 2013 10:02:36 +0000

Now, the IPv6 address given there is the one my phone had at the time. It doesn’t have reverse DNS because you can’t disable IPv6 privacy extensions in Android (also Google’s fault!), and assigning reverse DNS to my entire /64 would require a zone file many gigabytes big.
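To put a number on that: a /64 contains 2^64 addresses, so even at one modest PTR line per address (80 bytes is an assumed figure), a complete zone file would certainly qualify as "many gigabytes":

```python
# Back-of-envelope size of a zone file with a PTR record for every
# address in a /64 (80 bytes per record is an assumed line length)
addresses = 2 ** 64
bytes_per_record = 80
zone_bytes = addresses * bytes_per_record
print("%.1f zettabytes" % (zone_bytes / 1e21))  # prints "1.5 zettabytes"
```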

At this point, it’s probably best to stop speculating on Google’s opaque system and start working around it from my end. Others have resorted to disabling IPv6 for their e-mail server altogether – no thanks – or just for mail destined for Google’s own domains. This latter approach doesn’t work for me, as the example above involves a custom domain hosted on Google Apps – and potentially lots of different domains will be using Google Apps for mail, so a simple domain-based white/blacklist isn’t going to cut it.

After spending some time with the excellent Exim manual, I’ve come up with a solution. It involves replacing the dnslookup router with two routers, one for mail to GMail/Google Apps hosted domains, and one for other traffic. Other settings on the routers are omitted for brevity, but you should probably keep the settings you found originally.

# router names and 'driver' lines here are the standard boilerplate;
# keep your existing transport and other settings on both
dnslookup_not_google:
  driver = dnslookup
  debug_print = "R: dnslookup (non-google) for $local_part@$domain"
  # note this matches the host name of the target MX
  ignore_target_hosts = * : *
  # not no_more, because the google one might take it

dnslookup_google:
  driver = dnslookup
  debug_print = "R: dnslookup (google) for $local_part@$domain"
  # strip received headers to avoid Google's silly IPv6 rules
  headers_remove = Received
  headers_add = X-Received: Authenticated device belonging to me or one of my users

Sysadmin by point and click

I promised an update on Google Apps some time ago. This week, we finally flipped the e-mail for Saint Columba’s over to Google Apps hosted mail. Combined with using Mythic’s DNS servers (free with the domain registration and with a pretty nice web interface), that meant the church’s online presence was fully disentangled from my servers for the first time in years.

While that’s a good thing on a pragmatic, I-don’t-have-time-to-run-this-any-more basis, it’s been a bit of a rough ride. Here are some of the things I’ve learned.

  • Google Apps ‘groups’ are a replacement for what you think of as e-mail aliases or forwarders. Make a group, set it to ‘anyone on the internet can post’, and add the addresses you want it to forward to. In my case, I found Google Apps e-mail users who were part of the group failed to get any messages sent to it. Then it suddenly started working after 24 hours. The frustration of not knowing why this is (or if it’ll stop working again tomorrow) was probably the low point of the whole exercise.
  • Forwarding mail from a google account somewhere else requires somewhere else to give you a confirmation code before it’ll work. Not entirely unreasonable, but frustrating when it’s two accounts in the same Apps domain.
  • The out of office auto-responders on external GMail accounts don’t work for mail forwarded via apps groups/aliases. There is no documentation of this, they just fail to send a reply (not really an Apps problem, more a migration from old GMail account to Apps problem).
  • Single-line or empty test messages with just a subject line will often get eaten/filtered to junk by Google Apps as spam. This makes testing rather ‘fun’.
  • Google Apps e-mail supports two-stage authentication with your phone, which is handy, but domain administrators have to flip a setting to let users enable it. Whilst you’re there, enable SSL everywhere (boo for this not being the default).
  • All parts of Google Apps have IPv6 enabled. Nice to know we can remain the only (?) church in Oxford with an IPv6 website and e-mail.

Now it’s done and working, we’ll be sticking with it, but I can’t pretend it’s been very good for my blood pressure getting there.

Twin Nvidia graphics cards and Ubuntu

My home desktop PC, running Ubuntu, has had two monitors for quite some time, but when an obliging friend on IRC offered me a third, I couldn’t pass it up.

This meant installing a second graphics card, and fortunately I had a second Nvidia GeForce 9400GT (PCI Express), identical to the one already fitted. This went in perfectly but wasn’t recognised by anything. When I took the cover off again and looked more closely, it seems there’s a jumper which needs rotating to switch into ‘dual video cards’ mode:

Having done this, everything sprang into life. Tell Nvidia’s settings applet to overwrite the existing Xorg.conf file (not merge with it), set them all as separate X displays, and Ubuntu spreads cleanly across three of them. Shame I don’t have a fourth…