Planet BitFolk

Jon Spriggs: Talk Summary – A Eulogy for Auntie Pat

Format: Theatre Style room. ~30 attendees.

Slides: No slides provided (nothing to present on!), but the script is here

Video: Not recorded.

Slot: 11 AM, 10th February 2025, 10 minutes

Notes: This is a little unusual, both because I’m posting it as a “Talk Summary” and because it was a eulogy. Auntie Pat died in December. The talk I delivered was my memories of her, augmented by a few comments from her next nearest relative, the daughter of her cousin. The room was mostly filled with people I didn’t know, except for one row with my brother and his family. Following the funeral, several people suggested I’d done very well. One person remarked they hadn’t heard the talk because they forgot to wear their hearing aid; I guess when someone passes away in their 80s, most of their friends will be of a similar age. Several people expressed sadness that they hadn’t known all the things I shared about her. We all enjoyed the memories of her.

Jon Spriggs: Building a Linux Firewall with AlmaLinux 9, NetworkManager, BGP, DHCP and NFTables with Puppet

I’m in the process of building a Network Firewall for a work environment. This blog post is based on that work, but with all the identifying marks stripped off.

For this particular project, we standardised on AlmaLinux 9 as the OS base. We did some testing and proved that the Red Hat default firewalling product, firewalld, is not appropriate for this platform, but determined that NFTables, or NetFilter Tables (the successor to IPTables), is.

I’ll warn you, I’m pretty prone to long and waffling posts, but there’s a LOT of technical content in this one. There is also a Git repository with the final code. I hope that you find something of use in here.

This document explains how I use Vagrant with Virtualbox to build a test environment, how I install a Puppet Server, and how that server works out what settings to push to its clients. With that puppet server, I show how to build and configure a firewall using Linux tools and services, setting up an NFTables policy and routing between firewalls using FRR to provide BGP, and then I show how to deploy a DHCP server.

Let’s go!

The scenario

A network diagram, showing a WAN network attached to the top of firewall devices and out via the Host machine, a transit network linking the bottom of the firewall devices, and attached to the side, networks identified as "Prod", "Dev" and "DHCP" each with IP allocations indicated.

To prove the concept, I have built two Firewall machines (A and B), plus six hosts, one attached to each of the A and B side subnets called “Prod”, “Dev” and “Shared”.

Any host on any of the “Prod” networks should be able to speak to any host on any of the other “Prod” networks, or back to the “Shared” networks. Any host on any of the “Dev” networks should be able to speak to any host on the other “Dev” networks, or back to the “Shared” networks.

Any host in Prod, Dev or Shared should be able to reach the internet, and shared can reach any of the other networks.

To ensure I can guarantee the MAC addresses I will be using, I am using a standard Virtual Machine prefix: 16:0D:EC:AF: followed by an octet to identify the firewall ID, fwA is 11 and fwB is 12, and then the interface ID as the last octet. The WAN interface gets 01, prod gets 02, dev gets 03, shared 04 and transit 05. This also means that when I move from deploying this on my laptop with Vagrant, to deploying it on my actual lab environment, I can apply the same MAC addressing scheme, and guarantee that I’ll know which interface is which, no matter what order they’re detected by the guest VM.
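As a rough illustration of how that scheme decodes, here is a throwaway bash sketch (not part of the repository) that picks the firewall ID and interface role back out of an address:

mac="16:0D:EC:AF:11:02"
fw_id="${mac:12:2}"   # 5th octet: 11 = fwA, 12 = fwB
if_id="${mac:15:2}"   # 6th octet: 01 wan, 02 prod, 03 dev, 04 shared, 05 transit
case "$if_id" in
  01) role="wan" ;;
  02) role="prod" ;;
  03) role="dev" ;;
  04) role="shared" ;;
  05) role="transit" ;;
esac
echo "Firewall ${fw_id}, interface ${role}"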

A note on IP addresses and DNS names used in this document

In this blog post, “Private” IP addresses use the “Inter-networking” network assignment from IANA (198.18.0.0/15), as documented in RFC2544, while the “Public” IP addresses use the default Vagrant “Host” network of 10.0.2.0/24, with the host assigned 10.0.2.2 and providing the default gateway to the guests.

In the actual lab environment, these addresses would be replaced by assigned network segments in RFC1918 “Private” address spaces, or by ranges allocated by the upstream ISP. Please *DO NOT* build your network assuming these addresses are appropriate for your use! In addition, DNS names will use example.org following the advice of RFC2606.

Building the Proof of Concept

I’m using Vagrant with Virtualbox to build my firewall and some test boxes. The “WAN” interface will be simulated by the NAT interface provided by Vagrant’s first interface (which is required for provisioning anyway), and will receive a DHCP address. All other interfaces will be private, host-only networks using the Virtualbox network manager. Once the firewall is built and running, it will serve DHCP to all downstream clients.

All of the following code can be found on my Github repository: JonTheNiceGuy/vagrant-puppet-firewall

Working from a common base

To start with, I build out my Vagrantfile (link to the code). A Vagrantfile is used to define how Vagrant will build one or more virtual machines, similar to how you might use a Terraform HCL file to deploy some cloud assets. I’ll show several sections from this file as we go along, but here’s the start of it. This part won’t be used to provision any virtual machines, and is instead just Boilerplate for the hosts which follow.

############################################################
############################## Define variables up-front
############################################################
vms_A_number    = 11
vms_B_number    = 12
global_mac_base = "160DECAF"
vms_A_mac_base  = "#{global_mac_base}#{vms_A_number < 10 ? '0' : ''}#{vms_A_number}"
vms_B_mac_base  = "#{global_mac_base}#{vms_B_number < 10 ? '0' : ''}#{vms_B_number}"
############################################################
############################## Standard VM Settings
############################################################
Vagrant.configure("2") do |config|
  ############################ Default options for all hosts
  config.vm.box = "almalinux/9"
  config.vm.synced_folder ".", "/vagrant", type: :nfs, mount_options: ['rw', 'tcp', 'nolock']
  config.vm.synced_folder "../..", "/etc/puppetlabs/code/environments/production/src_modules/", type: :nfs, mount_options: ['rw', 'tcp', 'nolock']
  config.vm.provision "shell", path: 'client/make_mount.py'
  config.vm.provider :virtualbox do |vb|
    vb.memory = 2048
    vb.cpus = 2
    vb.linked_clone = true
  end
  ############################ Install nginx to host a simple webserver
  config.vm.provision "shell", inline: <<-SCRIPT
    # Setup useful tools
    if ! command -v fping >/dev/null
    then
      dnf install -y epel-release && dnf install -y fping mtr nano nginx && systemctl enable --now nginx
      # Configure web server to reply with servername
      printf '<!DOCTYPE html><head><title>%s</title></head><body><h1>\n%s\n</h1></body></html>' "$(hostname -f)" > /usr/share/nginx/html/index.html
    fi
SCRIPT
  ############################ Vagrant Cachier Setup
  if Vagrant.has_plugin?("vagrant-cachier")
    config.cache.scope = :box
    # Note that the DNF plugin was only finalised after the last
    # release of vagrant-cachier before it was discontinued. As such
    # you must do `vagrant plugin install vagrant-cachier` and then
    # find where it has been installed (usually
    # ~/.vagrant/gems/*/gems/vagrant-cachier-*) and replace it with
    # the latest commit from the upstream git project. Or uninstall
    # vagrant-cachier :)
    config.cache.enable :dnf
    config.cache.synced_folder_opts = {
      type: :nfs,
      mount_options: ['rw', 'tcp', 'nolock']
    }
  end
end

This does several key things. Firstly, it defines the size of the virtual machines which will be deployed and installs some common testing tools. It also sets up some variables for use later in the script (around MAC addresses and IP offsets), and it makes sure that mounted directories are always remounted (because Vagrant isn’t very good at doing that following a reboot).

There’s one script in here called make_mount.py, which I won’t go into in detail, but essentially it just re-creates, on subsequent reboots, all the NFS mounts that Vagrant set up. Unfortunately, I couldn’t do something similar for the Virtualbox Shared Folders. Feel free to bring this up in the comments if you want to know more.
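As a rough idea of the approach, this is a hypothetical sketch only (the real logic lives in client/make_mount.py in the repository): copy any NFS mounts Vagrant has made into /etc/fstab so they come back after a reboot.

awk '$3 ~ /^nfs/' /proc/mounts | while read -r src dst fstype opts _; do
  # Only add an fstab entry if one for this mount point doesn't already exist
  grep -qs "[[:space:]]${dst}[[:space:]]" /etc/fstab || \
    echo "${src} ${dst} ${fstype} ${opts} 0 0" >> /etc/fstab
done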

Building a Puppetserver for testing your module

As I do more with Puppet, I’ve realised that being able to test a manual deployment of a set of modules with puppet apply /path/to/manifest.pp doesn’t actually test how your manifests will work in a real environment. To solve this, each of the test environments I build deploys a puppet server as well as the test machine or machines, and then I join the devices to that puppet server and let them deploy from it.

Let’s set up that puppet server, starting with the Vagrantfile definition. This snippet goes inside the Vagrant.configure("2") do |config| block, at the end of the code snippet I pasted before.

  config.vm.define "puppet" do |config|
    config.vm.hostname = "puppet"
    # \/ The puppetserver needs more memory
    config.vm.provider "virtualbox" do |vb|
        vb.memory = 4096
    end
    # \/ Fixed IP address needed for Vagrant-Cachier-NFS
    config.vm.network "private_network", ip: "192.168.56.254", name: "vboxnet0"
    # \/ Install and configure the Puppet server, plus the ENC.
    config.vm.provision "shell", path: "puppetserver/setup.sh"
  end

This showcases some really useful parts of Vagrant. Firstly, you can override the memory allocation, going from the 2048 MB we had set as a default up to 4096 MB, and you can also define new networks to attach the VMs to. In this case, we have a “private_network”, configured as a “host only network” in Virtualbox lingo, which means it’s attached not only to the virtual machine, but also to the host machine.

When we run vagrant up with just this machine defined (see the command below), it will run the scripts defined before, and then start this setup script. Let’s dig into that for a second.
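For example, to bring up only the puppet server rather than the whole lab:

vagrant up puppet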

Setting up a Puppet Server

A puppet server is basically just a Certificate Authority, plus a web server to return the contents of your manifests to your client, plus some settings to use with that. Here’s a simple setup for that.

#!/bin/bash
START_PUPPET=0
################################################################
######### Install the Puppet binary and configure it as a server
################################################################
if ! command -v puppetserver >/dev/null
then
    rpm -Uvh https://yum.puppet.com/puppet8-release-el-9.noarch.rpm
    dnf install -y puppetserver puppet-agent
    alternatives --set java "$(alternatives --list | grep -E 'jre_17.*java-17' | awk '{print $3}')/bin/java"
    /opt/puppetlabs/bin/puppet config set server puppet --section main
    /opt/puppetlabs/bin/puppet config set runinterval 60 --section main
    /opt/puppetlabs/bin/puppet config set autosign true --section server
    START_PUPPET=1
fi

Set up like this, the puppet server will automatically accept any connecting client. There are security implications here!
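If that blanket autosigning worries you outside of a throwaway lab, the alternative is to leave autosign off and sign agents by hand. A minimal sketch (the certname below is just an example):

# Disable blanket autosigning, then approve each agent after checking its certname
/opt/puppetlabs/bin/puppet config set autosign false --section server
puppetserver ca list
puppetserver ca sign --certname vms11fw11.example.org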

Getting modules into the server

In a real-world deployment, you’ll have your Puppet Server with modules full of manifests attached to it. You may use some sort of automation to install or refresh those manifests; for example, in our lab, we use a tool called r10k to update the puppet modules on the host.

Instead of doing that for this test, in the Vagrantfile I mounted my “puppet modules” directory from the host machine into the puppet server, and then I link each directory from the mounted path into where the puppet modules reside. This means we can also install publicly released modules, like puppetlabs-stdlib (which provides a series of standard resources), into the puppet server without impacting my puppet modules directory. Here’s that code:

cd /etc/puppetlabs/code/environments/production/src_modules || exit 1
for dirname in puppet-module*
do
    TARGET="/etc/puppetlabs/code/environments/production/modules/$(echo "$dirname" | sed -E -e 's/.*puppet-module-//')"
    if [ ! -e "$TARGET" ]
    then
        ln -s "/etc/puppetlabs/code/environments/production/src_modules/${dirname}" "$TARGET"
    fi
done

################################################################
######### Install common modules
################################################################
/opt/puppetlabs/bin/puppet module install puppetlabs-stdlib

Defining what the clients will get

The puppet server then needs to know which manifests and settings to deploy to any node which connects to it. This is called an “External Node Classifier” or ENC.

The ENC receives the certificate name of the connecting host, and matches that against some internal logic to work out what manifests, in what environment they are coming from, and what settings to ship to the node. It then returns this as a JSON string for the Puppet Server to compile and send to the client.

The ENC defined in this dummy puppet server is extremely naive, and basically just reads a JSON file from disk. Here’s how it’s installed from the setup script:

if ! [ -e /opt/puppetlabs/enc.sh ]
then
    cp /vagrant/puppetserver/enc.sh /opt/puppetlabs/enc.sh && chmod +x /opt/puppetlabs/enc.sh
    /opt/puppetlabs/bin/puppet config set node_terminus exec --section master
    /opt/puppetlabs/bin/puppet config set external_nodes /opt/puppetlabs/enc.sh --section master
    START_PUPPET=1
fi

Then here is the enc.sh script

#!/bin/bash

if [ -e "/vagrant/enc.${1}.json" ]
then
    cat "/vagrant/enc.${1}.json"
    exit 0
fi
if [ -e "/vagrant/enc.json" ]
then
    cat "/vagrant/enc.json"
    exit 0
fi
printf '{"classes": {}, "environment": "production", "parameters": {}}'

And finally, here’s the enc.json for this test environment:

{
    "classes": {
        "nftablesfirewall": {},
        "basevm": {},
        "hardening": {}
    },
    "environment": "production",
    "parameters": {}
}
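You can exercise the ENC by hand on the puppet server to see exactly what a given certname would be handed back (the certname below is just an example; with no matching per-host file, the script falls through to the enc.json above):

/opt/puppetlabs/enc.sh vms11fw11.example.org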

So, we now have enough to provision a device connecting into the puppet server. Now we need to build our first Firewall.

Building a firewall

First we need the Virtual Machine to build. Again, we’re using the Vagrantfile to define this.

  config.vm.define :fwA do |config|
    # eth0 mgmt via vagrant ssh, simulating "WAN", DHCP to 10.0.2.x                        # eth0 wan
    config.vm.network "private_network", auto_config: false, virtualbox__intnet: "prodA"   # eth1 prod
    config.vm.network "private_network", auto_config: false, virtualbox__intnet: "devA"    # eth2 dev
    config.vm.network "private_network", auto_config: false, virtualbox__intnet: "sharedA" # eth3 prod
    config.vm.network "private_network", auto_config: false, virtualbox__intnet: "transit" # eth4 transit
    config.vm.provider "virtualbox" do |vb|
      vb.customize ["modifyvm", :id, "--macaddress1", "#{vms_A_mac_base}01"] # wan
      vb.customize ["modifyvm", :id, "--macaddress2", "#{vms_A_mac_base}02"] # prod
      vb.customize ["modifyvm", :id, "--macaddress3", "#{vms_A_mac_base}03"] # dev
      vb.customize ["modifyvm", :id, "--macaddress4", "#{vms_A_mac_base}04"] # shared
      vb.customize ["modifyvm", :id, "--macaddress5", "#{vms_A_mac_base}05"] # transit
    end
    config.vm.network "private_network", ip: "192.168.56.#{vms_A_number}", name: "vboxnet0" # Only used in this Vagrant environment for Puppet
    config.vm.hostname = "vms#{vms_A_number}fw#{vms_A_number}"
    config.vm.provision "shell", path: "puppetagent/setup-and-apply.sh"
  end

This gives us enough to build Firewall A. To build Firewall B, replace any “A” string (like “sharedA”, “vms_A_mac_base” or “vms_A_number”) with “B” (so “sharedB” and so on). The firewall has 5 interfaces, which are:

  • wan; technically a NAT interface in Vagrant, but in our lab would be completely exposed to the internet for ingress and egress traffic.
  • transit; used to pass traffic between VLANs (shared, prod and dev)
  • shared, prod and dev which carry the traffic for the machines classified as “production” or “development”, or for the shared management and access to them.

The puppet manifests we’ll see in a minute rely on those interfaces having the last 4 hexadecimal digits of the MAC address defined with specific values in order to identify the machine ID and the interface association. Fortunately, Virtualbox can assign these interfaces specific MAC addresses! Another win for Vagrant+Virtualbox. As before, we also add the private network which gives access to Puppet, which would normally be accessed over the WAN interface.

In here we have another shell script, this time puppetagent/setup-and-apply.sh. This one joins the puppet agent to the server, links the build modules (like we did with the puppet server) to replicate the build process with Packer, and then applies “standard” configuration from the local machine. Finally, it asks the server to apply the server configuration (using the ENC script we set up before). I won’t go into the local build modules (called “basevm” and “hardening”) here, because in this context they basically just say “I ran” and then end. But let’s take a look at the puppet module itself.

Initialising the Puppet Module

There are six files in the puppet manifests, starting with init.pp. If you’ve not written any Puppet before, a module is defined as a manifest class with some optional parameters passed to it. You can also define default values using hiera to retrieve values from the data directory. The manifest can call out to subclasses, and can also transfer files and build templates. Let’s take a look at that init.pp file.

# @summary Load various sub-manifests
class nftablesfirewall {
  # Setup interfaces
  class { 'nftablesfirewall::interfaces': }

  # make this server route traffic
  class { 'nftablesfirewall::routing':
    require => Class['nftablesfirewall::interfaces'],
  }
  class { 'nftablesfirewall::bgp':
    require => Class['nftablesfirewall::interfaces'],
  }

  # Allow traffic flows across the firewall
  class { 'nftablesfirewall::policy':
    require => Class['nftablesfirewall::interfaces'],
  }

  # make this server assign IP addresses
  class { 'nftablesfirewall::dhcpd':
    require => Class['nftablesfirewall::interfaces'],
  }
}

The class calls subclasses by using the construct class { 'class::subclass': } and, in some cases, uses the “metaparameters” require, before or notify to establish the order they run in. The subclass manifests live in files named after each subclass, so let’s take a look at these.

Defining the interfaces

The later subclasses need the interfaces to be defined properly first, so when we take a look at interfaces.pp, it does one of three things. Let’s pull these apart one at a time.

If there is an interface called eth0, then we’ve not renamed these interfaces, so we need to do that first of all. Let’s take a look at that:

  if ($facts['networking']['interfaces']['eth0']) {
    #################################################################
    ######## Using the MAC address we've configured, define each
    ######## network interface. On cloud platforms, we'd need to
    ######## figure out a better way of doing this!
    #################################################################
    # This relies HEAVILY on the mac address for the device on eth0 
    # following this format:       16:0D:EC:AF:xx:01
    # The first 8 hex digits (160DECAF) don't really matter, but the
    # 9th and 10th are the VM number and the 11th and 12 are the 
    # interface ID. This MAC prefix I found is a purposefully 
    # unallocated prefix for virtual machines.
    #
    # Puppet magic to turn desired interface names etc into MAC
    # addresses, thanks to ChatGPT.
    #
    # https://chatgpt.com/share/67ae1617-a398-8002-807b-4bc4298b40bb
    $interface_map = {
      'wan'     => '01',
      'prod'    => '02',
      'dev'     => '03',
      'shared'  => '04',
      'transit' => '05',
    }
    $interfaces = $interface_map.map |$role, $suffix| {
      $match = $facts['networking']['interfaces'].filter |$iface, $details| {
        $details['mac'] and $details['mac'] =~ "${suffix}$"
      }

      if !empty($match) {
        { $role => $match.values()[0]['mac'] }  # Store the MAC address
      } else {
        {}
      }
    }.reduce |$acc, $entry| {
      $acc + $entry  # Merge all key-value pairs into a final hash
    }

    file { '/etc/udev/rules.d/70-persistent-net.rules':
      ensure  => present,
      owner   => root,
      group   => root,
      mode    => '0644',
      content => template('nftablesfirewall/etc/udev/rules.d/70-persistent-net.rules.erb'),
      notify  => Exec['Reboot'],
    } -> exec { 'Reboot':
      command     => '/bin/bash -c "(sleep 30 && reboot) &"',
      # We delay 30 seconds so the reboot doesn't kill puppet and report an error.
      refreshonly => true
    }
  }

I’m unashamed to say that I asked ChatGPT for some help here! I wanted to figure out how to name the interfaces without knowing the exact MAC address. Fortunately, Puppet identifies lots of details about the system, referred to as facts (you can read all of the facts your Puppet system knows about a node by running facter -p on a system with Puppet installed). In this case, we’re asking Puppet to parse all of the interfaces and check the MAC address of each to figure out which one is which. Once it knows that, it creates a file for udev, a system which identifies how to initialise components and, in some cases, renames how they are seen by the system. The system won’t recognise the changes until it’s been rebooted, so if we create or modify that file, we sleep for 30 seconds (to let the puppet run finish) and then reboot.
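If you want to poke at those facts yourself on one of the firewalls, something like this works (facter ships with the puppet-agent package, so it lives under /opt/puppetlabs; the eth1 fact path is just an example):

# Dump the whole networking.interfaces structured fact, or ask for a single leaf
/opt/puppetlabs/bin/facter -p networking.interfaces
/opt/puppetlabs/bin/facter -p networking.interfaces.eth1.mac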

What does the template for that udev file look like? Pretty simple actually.

<% @interfaces.each do |interface,mac| -%>
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="<%= mac %>", NAME="<%= interface %>"
<% end %>

Once that’s run, it looks like this:

SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="16:0D:EC:AF:11:01", NAME="wan"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="16:0D:EC:AF:11:02", NAME="prod"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="16:0D:EC:AF:11:03", NAME="dev"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="16:0D:EC:AF:11:04", NAME="shared"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="16:0D:EC:AF:11:05", NAME="transit"

Once the system comes back up, Puppet will run immediately, and we take advantage of this! If the interface is no longer called eth0 (the other branch of that check), we can set some IP addresses. To do that, we use the MAC address allocation again! This time we’re using the second-from-last pair of hex digits to work out the firewall ID, and we use that firewall ID to identify the subnets to use, by adding it to the base value for the third IP octet in the local subnets (shared, prod, dev) and to the last IP octet in the connecting subnets (wan and transit). Let’s take a look at just this bit. It starts at the top of the file, where we pass some parameters into the class:

class nftablesfirewall::interfaces (
  String  $network_base   = '198.18',
  Integer $prod_base      = 32, # Start of Supernet
  Integer $prod_mask      = 24,
  Integer $dev_base       = 64, # Start of Supernet
  Integer $dev_mask       = 24,
  Integer $shared_base    = 96, # Start of Supernet
  Integer $shared_mask    = 24,
  Integer $transit_actual = 255,
  Integer $transit_mask   = 24,
) {

And then later, we have this:

    #################################################################
    # This block here works out which host we are, based on the 5th
    # octet of the MAC address
    #################################################################
    $vm_offset = Integer(
      regsubst(
        $facts['networking']['interfaces']['wan']['mac'],
        '.*:([0-9A-Fa-f]{2}):[0-9A-Fa-f]{2}$',
        '\1'
      )
    )

    #################################################################
    # Next calculate the IP addresses to assign to each NIC
    #################################################################
    $transit_ip    = "${network_base}.${transit_actual}.${vm_offset}/${transit_mask}"
    $dev_actual    = $dev_base + $vm_offset
    $dev_ip        = "${network_base}.${dev_actual}.1/${dev_mask}"
    $prod_actual   = $prod_base + $vm_offset
    $prod_ip       = "${network_base}.${prod_actual}.1/${prod_mask}"
    $shared_actual = $shared_base + $vm_offset
    $shared_ip     = "${network_base}.${shared_actual}.1/${shared_mask}"

Once we have these values, we can start assigning IP addresses. In the diagram at the top of the page, I used the offsets 11 for fwA and 12 for fwB, and the diagram shows the IP addresses allocated to each of those networks; for fwA, wan gets a DHCP address, prod gets 198.18.43.1/24, dev gets 198.18.75.1/24, shared gets 198.18.107.1/24 and transit gets 198.18.255.11/24. These are all offset from the supernet allocation. If you were expecting more than 32 firewalls in your supernet (the numbering starts at “0”, so offsets of 0 to 31), then you could allocate different ranges!
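A quick sanity check of that arithmetic for fwA (vm_offset 11), using the class defaults above:

vm_offset=11
echo "prod    198.18.$((32 + vm_offset)).1/24"     # 198.18.43.1/24
echo "dev     198.18.$((64 + vm_offset)).1/24"     # 198.18.75.1/24
echo "shared  198.18.$((96 + vm_offset)).1/24"     # 198.18.107.1/24
echo "transit 198.18.255.${vm_offset}/24"          # 198.18.255.11/24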

Anyway, to allocate the addresses to the interfaces, I want to use NetworkManager, as it’s built into these systems, and has some pretty good tooling around it. You can either mangle text files and re-apply them, or interact with a command line tool called nmcli. Rather than putting a whole load of work into building the text files, or executing lots of nested nmcli commands, I wrote a single python script, called configure_nm_if.py, and we execute this from the manifest, both as a test, to see if we need to make any changes, and to make the change itself.

    exec { 'Configure WAN Interface': # wan interface uses DHCP, so set to auto
      require => File['/usr/local/sbin/configure_nm_if.py'],
      command => '/usr/local/sbin/configure_nm_if.py wan auto',
      unless  => '/usr/local/sbin/configure_nm_if.py wan auto --test',
      notify  => Exec['Reboot'],
    }
    exec { 'Configure Dev Interface':
      require => File['/usr/local/sbin/configure_nm_if.py'],
      command => "/usr/local/sbin/configure_nm_if.py dev ${dev_ip}",
      unless  => "/usr/local/sbin/configure_nm_if.py dev ${dev_ip} --test",
      notify  => Exec['Reboot'],
    }

The script starts by working out which interfaces are configured by checking all the files in /etc/NetworkManager/system-connections and /run/NetworkManager/system-connections. In each of those files, lines are generally split into a key (like “interface” or “uuid”) and a value, which is what we’re looking for. Here’s that bit of code:

# Imports added here for clarity; the custom exception classes (ArgumentException,
# ProfileNotFound, NmcliFailed) are defined elsewhere in the full script.
import pathlib
import re
import subprocess

class nm_profile:
    def __init__(self, search_string: str):
        # The matching profile file and its calculated settings are per-instance state
        self.file = None
        self.settings = {}
        if search_string is None:
            raise ArgumentException('Invalid Search String')
        search = re.compile(r'^([^=]+)=(.*)\s*$')
        nm_dir = pathlib.Path("/run/NetworkManager/system-connections")
        for file_path in nm_dir.glob("*.nmconnection"):
            # Compare as a string; file_path is a pathlib.Path object
            if str(file_path) == search_string:
                self.file = file_path
            else:
                with open(file_path, "r") as f:
                    lines = f.readlines()
                    for line in lines:
                        compare = search.match(line)
                        if compare and compare.group(2) == search_string:
                            self.file = file_path
                            break
            if self.file is not None:
                break
        # Do the same thing for /etc/NetworkManager (cropped for brevity)
        if self.file is None:
            raise ProfileNotFound(
                f'Unable to find a profile matching the search string "{search_string}"')

        nmcli = subprocess.run(
            ["/bin/nmcli", "--terse", "connection", "show", self.file],
            capture_output=True, text=True
        )
        for line in nmcli.stdout.splitlines():
            data = line.split(":", 1)
            value = data[1].strip()
            if value == '':
                value = None
            self.settings[data[0].strip()] = value

This means that when we find the file with the configuration we want, we run nmcli to get the full, calculated collection of settings for that file. Next we work out whether anything would change between what it currently is (from the nmcli connection show command) and what we want it to be (from the arguments we pass into the script). That’s here:

def main():
    parser = argparse.ArgumentParser(
        description="Modify NetworkManager connection settings.")
    parser.add_argument("ifname", help="Interface name")
    parser.add_argument("ip", help="IP address or 'auto'")
    parser.add_argument("--dryrun", action="store_true",
                        help="Enable dry run mode")
    parser.add_argument("--test", action="store_true",
                        help="Enable test mode")
    args = parser.parse_args()

    actions = {}

    nm = nm_profile(args.ifname)

    current_id = nm.settings.get("connection.id")
    next_id = args.ifname
    if current_id != next_id:
        logging.debug(f'Change id from "{current_id}" to "{next_id}"')
        actions['connection.id'] = next_id

    current_method = nm.settings.get("ipv4.method")
    next_method = "manual" if args.ip != "auto" else "auto"

    if current_method != next_method:
        logging.debug(f'Change method from {current_method} to {next_method}')
        actions['ipv4.method'] = next_method

    current_ip = nm.settings.get("ipv4.addresses")
    next_ip = args.ip if args.ip != "auto" else None
    if next_ip is None and current_ip is not None:
        logging.debug(f'Change ipv4.address from {current_ip} to ""')
        actions['ipv4.addresses'] = ""
    elif next_ip != current_ip:
        logging.debug(
            f'Change ipv4.address from {current_ip if current_ip is not None else "None"} to {next_ip}')
        actions['ipv4.addresses'] = next_ip

Then, based on whether or not there are changes to be made, we either return a “success” or a “failure” if we’re testing for those changes (a failure provokes the manifest to trigger the change), or we make the change. That’s here:

    if len(actions) > 0:
        if args.test:
            logging.debug('There are outstanding actions, exit rc 1')
            sys.exit(1)

        command = [
            '/bin/nmcli', 'connection', 'modify', 
            nm.settings.get('connection.uuid', str(nm.file))
        ]

        for action in actions.keys():
            command.append(action)
            command.append(actions[action])
        logging.info(f'About to run the following command: {command}')

        if not args.dryrun:
            nmcli = subprocess.run(
                command,
                capture_output=True, text=True
            )
            if nmcli.returncode > 0:
                raise NmcliFailed(
                    f'Failed to run command {command}, RC: {nmcli.returncode} StdErr: {nmcli.stderr} StdOut: {nmcli.stdout}')

And then if we’ve made changes, we restart the connection, which provides us with a test that the change is a valid one!

        command = [
            '/bin/nmcli', 'connection', 'down', nm.settings.get(
                'connection.uuid', str(nm.file))
        ]
        logging.info(f'About to run the following command: {command}')

        if not args.dryrun:
            nmcli = subprocess.run(
                command,
                capture_output=True, text=True
            )
            if nmcli.returncode > 0:
                raise NmcliFailed(
                    f'Failed to run command {command}, RC: {nmcli.returncode} StdErr: {nmcli.stderr} StdOut: {nmcli.stdout}')

            command = [
                '/bin/nmcli', 'connection', 'up', nm.settings.get(
                    'connection.uuid', str(nm.file))
            ]
            logging.info(f'About to run the following command: {command}')
            nmcli = subprocess.run(
                command,
                capture_output=True, text=True
            )
            if nmcli.returncode > 0:
                raise NmcliFailed(
                    f'Failed to run command {command}, RC: {nmcli.returncode} StdErr: {nmcli.stderr} StdOut: {nmcli.stdout}')

Once that script has executed for each of the interfaces, we trigger a reboot (30 seconds after the Puppet agent has finished running, again). This is because the Puppet agent only gathers the details of the interfaces when it first runs, and so the subsequent manifests need these interfaces to be detected properly.

I mentioned before that the interfaces subclass needed to do one of three things. The third thing it “should” do is nothing, because this subclass is heavily reliant on reboots! If there are no changes to make, it just lets the code carry on so we can start working with the other aspects, and we’ll go next to BGP.

A brief note on my understanding of BGP

I want to take a quick diversion before I get started on the puppet code. I’m not hugely comfortable with BGP or, in fact, any of the dynamic routing protocols. I do understand that it’s a core and key part of the internet, and without it networking teams across the world would be lost!

That said, I’ve relied heavily on advice from a colleague at this point, so while this file does work, it may not be best practice. Please speak to someone more competent and confident with routing to help you if you have ANY issues whatsoever at this point!

Routing with BGP and FRR

I’m using FRR to set up BGP peers. Each peer advertises its own network segment to all of its peers. The BGP subclass manifest calculates the network segments in the same way as the interfaces subclass manifest did. We also build a list of all of the peers (the other firewalls in the supernets).

  if ($facts['networking']['interfaces']['transit'] and $facts['networking']['interfaces']['transit']['ip']) {
    $vm_lan_ip_address = $facts['networking']['interfaces']['transit']['ip']

    #################################################################
    ######## Work out the offset to get the firewall ID
    #################################################################
    $split_ip = split($vm_lan_ip_address, '[.]')
    # Extract the last octet, ensuring it exists
    if $split_ip and size($split_ip) == 4 {
      $vm_last_octet = Integer($split_ip[3])

      # Time to add the other important addresses for this device
      $dev_address    = "${network_base}.${$dev_offset + $vm_last_octet}.0/24"
      $prod_address   = "${network_base}.${$prod_offset + $vm_last_octet}.0/24"
      $shared_address = "${network_base}.${$shared_offset + $vm_last_octet}.0/24"

      # Calculate the peers from the range 0..31 (excluding this one)
      $peer_addresses = range(0, 31).map |$i| {
        "${network_base}.${transit_octet}.${i}"
      }.filter |$ip| { $ip != $vm_lan_ip_address }

We can start to build our configuration file… after we’ve defined a handful of initial variables:

class nftablesfirewall::bgp (
  String  $bgp_our_asn            = '65513',
  Boolean $bgp_our_peer_enabled   = true,
  Boolean $bgp_advertise_networks = true,
  Boolean $bgp_cloud_peer_enabled = false,
  String  $bgp_cloud_peer_asn     = '65511',
  Array   $bgp_cloud_peer_ips     = ['198.18.0.2', '198.18.0.3'],
  String  $network_base           = '198.18',
  Integer $transit_octet          = 255,
  Integer $prod_offset            = 32,
  Integer $dev_offset             = 64,
  Integer $shared_offset          = 96,
) {

The ASNs are in the range of “Private ASNs” from 64512-65535 allocated by IANA in RFC1930, and are roughly equivalent to the IP allocation 10.0.0.0/8.

FRR configuration looks a little like Cisco router configuration, and starts off as a template, like this:

! ######################################################
! # Basic Setup
! ######################################################
!
log syslog informational
frr defaults traditional
!
! ######################################################
! # Our BGP side
! ######################################################
!
router bgp <%= @bgp_our_asn %>
no bgp ebgp-requires-policy
bgp router-id <%= @vm_lan_ip_address %>
!
<%- if @bgp_our_peer_enabled -%>
! ######################################################
! # Firewall BGP peers (how we find our own routes)
! ######################################################
!
neighbor FW-PEERS peer-group
neighbor FW-PEERS remote-as <%= @bgp_our_asn %>
<% @peer_addresses.each do |ip| -%>
neighbor <%= ip %> peer-group FW-PEERS
<% end -%>
!
<%- end -%>
<%- if @bgp_cloud_peer_enabled -%>
! ######################################################
! # Cloud BGP peers (how Cloud finds us)
! ######################################################
!
neighbor CLOUD-PEERS peer-group
neighbor CLOUD-PEERS remote-as <%= @bgp_cloud_peer_asn %>
<% @bgp_cloud_peer_ips.each do |ip| -%>
neighbor <%= ip %> peer-group CLOUD-PEERS
<% end -%>
!
<%- end -%>
<%- if @bgp_advertise_networks -%>
! ######################################################
! # Our local networks
! ######################################################
!
address-family ipv4 unicast
    network <%= @dev_address %>
    network <%= @prod_address %>
    network <%= @shared_address %>
!
<%- end -%>
<%- if @bgp_our_peer_enabled -%>
! ######################################################
! Firewall BGP peers
! ######################################################
!
neighbor FW-PEERS activate
!
<%- end -%>
<%- if @bgp_cloud_peer_enabled -%>
! ######################################################
! Cloud BGP peers
! ######################################################
!
neighbor CLOUD-PEERS activate
!
<%- end -%>
exit-address-family
!
! ######################################################
! We don't use IPv6 yet
! ######################################################
!
address-family ipv6 unicast
exit-address-family
!

When rendered down for fwA (using the default parameters above, and abbreviating the long peer list), it looks something like this:

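! (a sketch of the rendered result for fwA; the peer list is abbreviated)
! ######################################################
! # Basic Setup
! ######################################################
!
log syslog informational
frr defaults traditional
!
! ######################################################
! # Our BGP side
! ######################################################
!
router bgp 65513
no bgp ebgp-requires-policy
bgp router-id 198.18.255.11
!
! ######################################################
! # Firewall BGP peers (how we find our own routes)
! ######################################################
!
neighbor FW-PEERS peer-group
neighbor FW-PEERS remote-as 65513
neighbor 198.18.255.0 peer-group FW-PEERS
neighbor 198.18.255.1 peer-group FW-PEERS
neighbor 198.18.255.2 peer-group FW-PEERS
! ... one "neighbor x.x.x.x peer-group FW-PEERS" line for each remaining
! ... transit address up to 198.18.255.31, skipping our own .11 ...
neighbor 198.18.255.31 peer-group FW-PEERS
!
! ######################################################
! # Our local networks
! ######################################################
!
address-family ipv4 unicast
    network 198.18.75.0/24
    network 198.18.43.0/24
    network 198.18.107.0/24
!
! ######################################################
! Firewall BGP peers
! ######################################################
!
neighbor FW-PEERS activate
!
exit-address-family
!
! ######################################################
! We don't use IPv6 yet
! ######################################################
!
address-family ipv6 unicast
exit-address-family
!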

When FRR is running we can access the “virtual TTY” interface of FRR by running vtysh and issuing commands. The main one I’ve been using is show ip bgp summary which tells you if your peer is connected, like this:

[root@vms11fw11 frr]# vtysh 

Hello, this is FRRouting (version 8.5.3).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

vms11fw11# show ip bgp summary

IPv4 Unicast Summary (VRF default):
BGP router identifier 198.18.255.11, local AS number 65513 vrf-id 0
BGP table version 6
RIB entries 11, using 2112 bytes of memory
Peers 31, using 22 MiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
198.18.255.0    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.1    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.2    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.3    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.4    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.5    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.6    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.7    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.8    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.9    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.10   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.12   4      65513         4         4        0    0    0 00:00:27            3        3 N/A
198.18.255.13   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.14   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.15   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.16   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.17   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.18   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.19   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.20   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.21   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.22   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.23   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.24   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.25   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.26   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.27   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.28   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.29   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.30   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.31   4      65513         0         0        0    0    0    never       Active        0 N/A

Total number of neighbors 31
vms11fw11#
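A couple of other vtysh commands that are worth knowing for the same sort of checking, run non-interactively here (both are standard FRR show commands):

vtysh -c 'show ip bgp'         # the full BGP table
vtysh -c 'show ip route bgp'   # which BGP routes made it into the kernel routing table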

There is a separate block of configuration which allows for the upstream cloud provider to also offer BGP, but this is not available in Vagrant!

What’s next?

Defining your Firewall policy

I mentioned at the top of the post that I was using NFTables, which is the successor to IPTables. The policy we are defining is very simple, but you can see quite quickly how this policy can be enhanced. This isn’t a template (although it could be), it’s just a plain file that Puppet installs via the policy subclass manifest, and then configures the default value in sysconfig to load that policy file.

How does that policy look? It has four pieces: variable definitions, the input policy, the forward policy and the postrouting masquerading (or NAT) chain. Let’s pick these apart separately.

In each block (table inet filter for policy elements and table ip nat for Masquerading) we can define some variables. They are separate and distinct from each other. Here I’ll specify all of the supernets (both “this cloud” and “another cloud”) in the policy and just the relevant local supernets for the masquerading.

#!/usr/sbin/nft -f

flush ruleset

table inet filter {

    ##############################################################################
    # Define network objects to be used later
    ##############################################################################
    set management_networks {
      type ipv4_addr
      flags interval
      ##############################################
      # NOTES BELOW ON WHY EACH NETWORK IS SPECIFIED
      ##############################################
      #            Vagrant      Local          Cloud
      elements = { 10.0.2.0/24, 198.18.0.0/19, 198.19.0.0/19 }
    }

    set prod_networks {
      type ipv4_addr
      flags interval
      ##############################################
      # NOTES BELOW ON WHY EACH NETWORK IS SPECIFIED
      ##############################################
      #            Local           Cloud
      elements = { 198.18.32.0/19, 198.19.32.0/19 }
    }

    set dev_networks {
      type ipv4_addr
      flags interval
      ##############################################
      # NOTES BELOW ON WHY EACH NETWORK IS SPECIFIED
      ##############################################
      #            Local           Cloud
      elements = { 198.18.64.0/19, 198.19.64.0/19 }
    }

    set shared_networks {
      type ipv4_addr
      flags interval
      ##############################################
      # NOTES BELOW ON WHY EACH NETWORK IS SPECIFIED
      ##############################################
      #            Local           Cloud
      elements = { 198.18.96.0/19, 198.19.96.0/19 }
    }

    # ...... Chains follow

    chain output {
        type filter hook output priority 0; policy accept;
    }
}

table ip nat {
    ##############################################################################
    # Define network objects to be used later
    ##############################################################################
    set masq_networks {
      type ipv4_addr
      flags interval
      ##############################################
      # NOTES BELOW ON WHY EACH NETWORK IS SPECIFIED
      ##############################################
      #            Prod            Dev             Shared
      elements = { 198.18.32.0/19, 198.18.64.0/19, 198.18.96.0/19 }
    }

    # ...... Chain follows
}

Input refers to what traffic is connecting to the host in question. As it’s a firewall, we want as little available as possible: ICMP, SSH from management addresses, DHCP assignment for VMs attached to this firewall, and BGP, to allow the peers to see each other. We should also allow established traffic to flow, and the “loopback” lo interface should be allowed to talk to anything on this host. This is actually combined with the previous code block, and I’ll indicate where that has happened, like I did before.

table inet filter {
    # ...... Variables as before
    chain input {
        type filter hook input priority 0; policy drop;

        # Allow loopback traffic
        iifname "lo" accept

        # Allow established and related connections
        ct state { established, related } accept

        # Allow ICMP traffic
        ip protocol icmp accept

        # Allow SSH (TCP/22) from specific subnets
        ip saddr @management_networks tcp dport 22 log prefix "A-NFT-input.management: " accept
        ip saddr @shared_networks     tcp dport 22 log prefix "A-NFT-input.shared: " accept

        # Allow DHCP and BOOTP traffic
        # This means that the nodes attached to this device can get IP addresses.
        ip protocol udp udp sport 68 udp dport 67 accept
        ip protocol udp udp sport 67 udp dport 68 accept

        # Allow BGP across the Transit interface
        iifname "transit" ip protocol tcp tcp dport 179 accept
        oifname "transit" ip protocol tcp tcp dport 179 accept

        # Drop everything else
        log prefix "DROP_ALL-NFT-input: " drop
    }
    # ...... Forward chain follows
}

Forwarding relates to what passes over this box. We want:

  • all established traffic to be allowed to pass
  • almost all ICMP traffic to be permitted
  • the shared supernet to be able to talk to any host
  • the dev supernet to be able to talk to any other host in the dev supernet, or to any host in the shared supernet
  • the prod supernet to be able to talk to any other host in the prod supernet, or to any host in the shared supernet
  • any host in the shared, dev and prod supernets to be able to talk to any host on the internet (except excluded network ranges)
  • excluded network ranges to be dropped

Let’s take a look at that.

table inet filter {
    # ...... Variables as before
    # ...... Input chain as before
    chain forward {                                         # Forward is "What can go THROUGH this host"
        type filter hook forward priority 0; policy drop;

        # Allow established and related connections
        ct state { established, related } accept

        # ICMP rules
        ip protocol icmp icmp type { echo-reply, echo-request, time-exceeded, destination-unreachable } accept

        # Shared network can talk out to anything
        ip saddr @shared_networks log prefix "A-NFT-forward.shared-any: " accept
        
        # Allow intra-segment traffic
        ip saddr @dev_networks    ip daddr @dev_networks  log prefix "A-NFT-forward.dev-dev: "   accept
        ip saddr @prod_networks   ip daddr @prod_networks log prefix "A-NFT-forward.prod-prod: " accept
        
        # Allow Prod, Dev access to Shared
        ip saddr @dev_networks    ip daddr @shared_networks log prefix "A-NFT-forward.dev-shared: " accept
        ip saddr @prod_networks   ip daddr @shared_networks log prefix "A-NFT-forward.prod-shared: " accept

        # Allow all segments access to the Internet, block the following subnets
        ip daddr != {
          0.0.0.0/8,                                      # RFC1700 (local network)
          10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16,      # RFC1918 (private networks)
          169.254.0.0/16,                                 # RFC3927 (link local)
          192.0.0.0/24,                                   # RFC5736 ("special purpose") 
          192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24,  # RFC5737 ("TEST-NET")
          192.88.99.0/24,                                 # RFC3068 ("6to4 relay")
          198.18.0.0/15,                                  # RFC2544 ("Inter-networking tests")
          224.0.0.0/4, 240.0.0.0/4                        # RFC1112, RFC6890 ("Special Purpose" and Multicast)
        } log prefix "A-NFT-forward.all-internet: " accept

        # Drop everything else
        log prefix "DROP_ALL-NFT-forward: " drop
    }
}

And lastly, we take a look at the Masquerading part of this. Here we want to masquerade (or “Hide NAT”) any traffic leaving on the WAN interface.

table ip nat {
    # ...... Variables as before
    chain postrouting {
        type nat hook postrouting priority 100; policy accept;

        # Masquerade all traffic going out of the WAN interface
        ip saddr @masq_networks oifname "wan" masquerade
    }
}

As you can see, the language of these policies is quite easy:

  • iifname the interface the traffic came in on
  • oifname the interface the traffic exits on
  • ip saddr the IP address, subnets or variable name the source address is in
  • ip daddr the IP address, subnets or variable name the destination address is in
  • ip protocol {udp|tcp|icmp} the protocol that the service travels over
  • {tcp|udp} sport The source TCP or UDP port
  • {tcp|udp} dport The destination TCP or UDP port
  • ct state the connection status of a packet
  • accept Permit the traffic to flow
  • drop Stop the traffic from flowing
  • log prefix "Some String" Add a prefix to the log line

Because the file is executable and starts with #!/usr/sbin/nft -f, applying this policy is just a case of executing it! Dead simple.
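For checking the file before it goes live, plain nft is enough (the path here is just an assumption; use wherever the policy subclass installs the file):

nft -c -f /etc/nftables/firewall.nft   # parse-check only, nothing is loaded
nft list ruleset                       # show what the kernel is actually running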

The only thing left to do is to set up DHCP for the nodes and to test it!

DHCPd

DHCP is a protocol for automatically assigning IP addresses to nodes. In this case, we’re using dnsmasq, which is a small server that performs DNS resolution, as well as DHCP and, if we need it later, TFTP. This is a simple package-install away, and a very simple configuration file template too.

dhcp-option=option:dns-server,<%= @dns_servers %>

# Listen only on the specified interfaces
interface=<%= @dev_nic %>
dhcp-range=<%= @dev_nic %>,<%= @dev_subnet %>.10,<%= @dev_subnet %>.250,255.255.255.0,6h
dhcp-option=<%= @dev_nic %>,option:router,<%= @dev_gateway %>

interface=<%= @prod_nic %>
dhcp-range=<%= @prod_nic %>,<%= @prod_subnet %>.10,<%= @prod_subnet %>.250,255.255.255.0,6h
dhcp-option=<%= @prod_nic %>,option:router,<%= @prod_gateway %>

interface=<%= @shared_nic %>
dhcp-range=<%= @shared_nic %>,<%= @shared_subnet %>.10,<%= @shared_subnet %>.250,255.255.255.0,6h
dhcp-option=<%= @shared_nic %>,option:router,<%= @shared_gateway %>

Here we define the network interface to listen on (ending _nic), the subnet range to allocate (ending _subnet) and the gateway address of this host (ending _gateway). We’ve also told it where to get its DNS records from (dns_servers).
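A couple of quick checks once dnsmasq is running on a firewall (the lease file path below is the usual EL9 default, but may vary by distro):

ss -ulpn | grep dnsmasq                # confirm it is listening on the DNS/DHCP ports
cat /var/lib/dnsmasq/dnsmasq.leases    # leases handed out so far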

Building the testing hosts

Back to our Vagrantfile. We define the “VM” entries in the diagram at the top, attached to each of the networks (Prod, Dev and Shared) on the A and B sides. The configuration is largely the same between each of these items, so I’ll only show one of them:

  config.vm.define :prodA do |config|
    config.vm.network "private_network", auto_config: false, virtualbox__intnet: "prodA"
    config.vm.network "private_network", ip: "192.168.56.#{vms_A_number + 10}", name: "vboxnet0"
    config.vm.hostname = "prod-#{vms_A_number}"
    config.vm.provision "shell", path: "client/manage_routes.sh"
  end

Honestly, the hostname didn’t need to be set, but it makes life easier, and the private_network on vboxnet0 is just there for the DNF cache, as we’re not using Puppet here. The only thing the client/manage_routes.sh script does is remove the default route that Vagrant puts in to connect the node to the host for outbound NAT, ensuring that traffic all goes through the firewall instead!
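The gist of that script is something like the following sketch (the real version is client/manage_routes.sh in the repository; the interface name is an assumption): drop the NAT default route on the Vagrant NIC so that the default route handed out by the firewall’s DHCP (option:router) wins.

ip route del default dev eth0 2>/dev/null || true   # remove Vagrant's NAT default route
ip route show default                               # what's left should point at the firewall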

So, once we’ve got all of that, we can test it!

Testing your Lab

Running vagrant up will start all the VMs. Each node has 2GB of RAM, plus the puppet server which has 4GB, so make sure your host OS has at least 20GB RAM. Once you’re done with your test, destroy it with vagrant destroy and it will ask you if you’re sure. If you’ve done some tweaking, and need to re-provision something, run vagrant provision or vagrant up --provision. You can also just do vagrant up hostname (like vagrant up puppet, vagrant up puppet fwA or vagrant up puppet fwA fwB, for example) or vagrant destroy hostname to manage individual nodes.

Because of how Puppet works, if you do this, be aware you may need to remove puppet certificates with puppetserver ca clean --certname hostname.as.fqdn (you’ll see the hostnames when puppet agent is run). Honestly, I ended up recreating everything if I was doing that much tweaking!

Once you’ve got nodes up and running, you can run vagrant ssh hostname (like vagrant ssh prodA) and execute commands on there. Remember up near the top of this, I created an nginx server? With this, and running for node in prodA prodB devA devB sharedA sharedB; do echo $node ; vagrant ssh $node -- ip -4 -br a ; done to get a list of the IP addresses, you can run vagrant ssh prodA -- curl http://ip.for.sharedA.node (like vagrant ssh prodA -- curl http://198.18.107.123) to make sure that your traffic across the firewalls is working right.

You can also do vagrant ssh fwA and then run sudo journalctl -k | grep A-NFT-forward to see packets flowing across the firewall, sudo journalctl -k | grep DROP_ALL-NFT to see packets being dropped, and sudo journalctl -k | grep A-NFT-input to see packets destined for the firewall. Beware with that last one, you’ll also see all your new SSH connections into it!


Wow! This is a BIG one! I hope you’ve found it useful. It took a while to build, and even longer to test! Enjoy!!

Chris Wallace: Art at Southmead Hospital

During my short stay with a bout of pneumonia, I spent each of four nights in a different ward. I...

Alan Pope: Spotlighting Community Stories

tl;dr I’m hosting a Community Spotlight Webinar today at Anchore featuring Nicolas Vuilamy from the MegaLinter project. Register here.


Throughout my career, I’ve had the privilege of working with organizations that create widely-used open source tools. The popularity of these tools is evident through their impressive download statistics, strong community presence, and engagement both online and at events.

During my time at Canonical, we saw the tremendous reach of Ubuntu, along with tools like LXD, cloud-init, and yes, even Snapcraft.

At Influxdata, I was part of the Telegraf team, where we witnessed substantial adoption through downloads and active usage, reflected in our vibrant bug tracker.

Now at Anchore, we see widespread adoption of Syft for SBOM generation and Grype for vulnerability scanning.

What makes Syft and Grype particularly exciting, beyond their permissive licensing, consistent release cycle, dedicated developer team, and distinctive mascots, is how they serve as building blocks for other tools and services.

Syft isn’t just a standalone SBOM generator - it’s a library that developers can integrate into their own tools. Some organizations even build their own SBOM generators and vulnerability tools directly from our open source foundation!

$ docker-scout version
 [ASCII-art Docker Scout banner omitted]

version: v1.13.0 (go1.22.5 - darwin/arm64)
git commit: 7a85bab58d5c36a7ab08cd11ff574717f5de3ec2

$ syft /usr/local/bin/docker-scout | grep syft
 ✔ Indexed file system /usr/local/bin/docker-scout
 ✔ Cataloged contents f247ef0423f53cbf5172c34d2b3ef23d84393bd1d8e05f0ac83ec7d864396c1b
 ├── ✔ Packages [274 packages]
 ├── ✔ File digests [1 files]
 ├── ✔ File metadata [1 locations]
 └── ✔ Executables [1 executables]
github.com/anchore/syft v1.10.0 go-module

(I find it delightfully meta to discover syft inside other tools using syft itself)

A silly meme that isn't true at all :)

This collaborative building upon existing tools mirrors how Linux distributions often build upon other Linux distributions. Like Ubuntu and Telegraf, we see countless individuals and organizations creating innovative solutions that extend beyond the core capabilities of Syft and Grype. It’s the essence of open source - a multiplier effect that comes from creating accessible, powerful tools.

While we may not always know exactly how and where these tools are being used (and sometimes, rightfully so, it’s not our business), there are many cases where developers and companies want to share their innovative implementations.

I’m particularly interested in these stories because they deserve to be shared. I’ve been exploring public repositories like the GitHub network dependents for syft, grype, sbom-action, and scan-action to discover where our tools are making an impact.

The adoption has been remarkable!

I reached out to several open source projects to learn about their implementations, and Nicolas Vuilamy from MegaLinter was the first to respond - which brings us full circle.

Today, I’m hosting our first Community Spotlight Webinar with Nicolas to share MegaLinter’s story. Register here to join us!

If you’re building something interesting with Anchore Open Source and would like to share your story, please get in touch. 🙏

Alun JonesLithium Ion Discharge Curve

Note: I started writing this post in July 2023, and forgot to finish it. The live battery graphs, linked below, give a bunch of extra info for guesstimating discharge curves - about 476 words

Chris WallaceTrees of Essaouira

Palms and Norfolk pines dominate the street scene. The Norfolk pines (Araucaria heterophylla) do...

BitFolk WikiSponsored hosting

Sponsored Projects: +BarCamp Surrey

← Older revision Revision as of 23:53, 13 January 2025
(One intermediate revision by the same user not shown)
Line 43: Line 43:


* [https://57north.co/ 57North Hacklab]
* [https://57north.co/ 57North Hacklab]
* [https://barcampsurrey.org/ BarCamp Surrey]
* [https://edinburghhacklab.com/ Edinburgh Hacklab]
* [https://edinburghhacklab.com/ Edinburgh Hacklab]
* [https://eof.org.uk/ EOF Hackspace]
* [https://eof.org.uk/ EOF Hackspace]
Line 54: Line 55:
* [http://www.somakeit.org.uk/ Southampton Makerspace]
* [http://www.somakeit.org.uk/ Southampton Makerspace]
* [http://www.surrey.lug.org.uk/ Surrey Linux User Group]
* [http://www.surrey.lug.org.uk/ Surrey Linux User Group]
* [http://ubuntupodcast.org/ Ubuntu Podcast]

BitFolk Issue TrackerBilling - Feature #219 (Closed): Add host name to data transfer reports

Seems to be working

BitFolk Issue TrackerBilling - Feature #219 (In Progress): Add host name to data transfer reports

BitFolk WikiBooting

Boot process: Removed tag for lost animated gif from imgur. Very old info anyway

← Older revision Revision as of 02:13, 10 January 2025
Line 28: Line 28:


==Boot process==
==Boot process==
<imgur thumb="yes" w="562">cufm6gd.gif</imgur>
On the right you should see an animated GIF of a terminal session where a Debian stretch VPS is booted with a GRUB-legacy config file. The bootloader config is viewed and booted. Then '''grub-pc''' package is installed to convert the VPS to GRUB2. ttyrec or ttygif seem to have introduced some corruption and offset characters, but you probably get the idea.
If you're looking at your console in the Xen Shell then the first thing that you should see is a list of boot methods that BitFolk's GRUB recognises. It checks for each of the following things, in order, on each of your block devices and every partition of those block devices:
If you're looking at your console in the Xen Shell then the first thing that you should see is a list of boot methods that BitFolk's GRUB recognises. It checks for each of the following things, in order, on each of your block devices and every partition of those block devices:


Alun JonesManaging load from abusive web bots

A few months back I created a small web application which generated a fake hierarchy of web pages, on the fly, using a Markov Chain to make gibberish content that - about 1987 words

David LeadbeaterDéjà vu: Ghostly CVEs in my terminal title

Exploring a security bug in Ghostty that is eerily familiar.

BitFolk Issue TrackerBilling - Feature #219 (Closed): Add host name to data transfer reports

A customer has requested that rather than just the VPS account name being used in the data transfer emails, the host name (reverse DNS of the primary IPv4 address) should be shown too, as they found it confusing which VPS was the subject of the report.

So instead of:

From: BitFolk Data Transfer Monitor <xfer@bitfolk.com>
Subject: [4/4] BitFolk VPS 'tom' data transfer report

This is a weekly report of data transfer for domain 'tom' on
host 'tanqueray.bitfolk.com'. This report is informational only and not
an invoice or bill.

It would look more like:

From: BitFolk Data Transfer Monitor <xfer@bitfolk.com>
Subject: [4/4] BitFolk VPS 'tom' (tom.dogsitter.services) data transfer report

This is a weekly report of data transfer for domain 'tom'
(tom.dogsitter.services) on host 'tanqueray.bitfolk.com'. This report
is informational only and not an invoice or bill.

BitFolk Issue TrackerBitFolk - Feature #216: Add phishing-resistant authentication for https://panel.bitfolk.com/

Thanks. I have started to give this a read now. 😀

Jon SpriggsQuick Tip: Don’t use concat in your spreadsheet, use textjoin!

I found this on Threads today

CONCAT vs TEXTJOIN – The ultimate showdown! 🥊
TEXTJOIN is the GOAT:
=TEXTJOIN(“, “, TRUE, A1:A10)
● Adds delimiters automatically
● Ignores empty cells
● Works with ranges
Goodbye CONCAT, you won’t be missed!

And I’ve tested it this morning. I don’t have Excel any more, but it works in Google Sheets, no worries!

BitFolk Issue TrackerBitFolk - Feature #216: Add phishing-resistant authentication for https://panel.bitfolk.com/

Here's an excellent tour through everything WebAuthn.

https://www.imperialviolet.org/tourofwebauthn/tourofwebauthn.html

Andy SmithI recommend avoiding the need to have panretinal photocoagulation (PRP) laser treatment

WARNING

This article contains descriptions of medical procedures on the eye. If that sort of thing makes you squeamish you may want to give it a miss.

Yesterday I had panretinal photocoagulation (PRP) laser treatment in both eyes, and it was quite unpleasant. I recommend trying really hard to avoid having to ever have that if possible.

PRP is used to manage symptoms of proliferative diabetic retinopathy. A laser is used to burn abnormal new blood vessels around the retina.

Having had a different kind of laser treatment before I wasn't expecting this to be a big deal. Unfortunately I was wrong and it was a bit of an ordeal.

As usual at these eye examinations I had drops to dilate my pupils and a bunch of different scans and photographs of the back of my eye taken so they knew what they were dealing with. Then in preparation for the procedure, some numbing eye drops. It's an odd sensation not being able to feel your eyelids or the skin around your eyes, but that part wasn't uncomfortable.

Next up the consultant held some sort of eyepiece firmly against the surface of my eyeball and applied a decent amount of pressure to keep it in place.

Then the laser pulses began. Many, many pulses. Each caused an unpleasant stabbing sensation in my eyeball with a dull ache following it. It wasn't so much that it was painful — Wikipedia describes this as "stinging" and in isolation I'd agree with that description. However while this was taking place my head was in a chin rest with an eyepiece thing pressed against my eyeball and the knowledge that if I moved unexpectedly then I risked having my vision destroyed by the laser. And these laser pulses were coming multiple times per second.

I was doing some grunting at the discomfort of each laser pulse when…

Consultant: What! I'm on 30% power. If I make it any lower it'll be homeopathy, know what I mean? It needs to be effective.

Me, through gritted teeth: Just do what you need to do.

Another thing I was not prepared for was total blindness during the procedure and for a few minutes after. He was telling me to look in certain directions, but my vision had gone completely black due to the laser so I couldn't actually tell which direction I was looking in.

Then when it was finally over for one eye, I still could not see anything and due to the anaesthetic could not even tell if my eye was open or not as I couldn't feel my eyelid. Thankfully that recovered after a couple of minutes so he could begin on the other eye…

Post procedure was not too bad. It's an outpatient procedure and I was immediately able to go home on the bus! My eyes just felt tired and took a lot longer to recover from the dilation drops than they usually do (I have vision tests several times a year and they always involve dilation drops). A headache between the temples did force me to go to bed early, but feel fine today.

So… if you have diabetes then blood sugar control is important to help avoid having to go through something like this. If you lose a genetic lottery then after decades living with diabetes you may need it anyway, or if you win then perhaps you never do, but I just suggest doing what you can to improve your odds.

This is still only the second most unpleasant procedure I've had on my eye though!

Jon SpriggsA few weird issues in the networking on our custom AWS EKS Workers, and how we worked around them

For “reasons”, at work we run AWS Elastic Kubernetes Service (EKS) with our own custom-built workers. These workers are based on Alma Linux 9, instead of AWS’ preferred Amazon Linux 2023. We manage the deployment of these workers using AWS Auto-Scaling Groups.

Our unusual configuration of these nodes means that we sometimes trip over configurations which are tricky to get support on from AWS (no criticism of their support team; if I was in their position, I wouldn’t want to try to provide support for a customer’s configuration that was so far outside the recommended setup either!)

Over the past year, we’ve upgraded EKS1.23 to EKS1.27 and then on to EKS1.31, and we’ve stumbled over a few issues on the way. Here are a few notes on the subject, in case they help anyone else in their journey.

All three of the issues below were addressed by running an additional script on the worker nodes, triggered every minute by a systemd timer.

Incorrect routing for the 12th IP address onwards

Something the team found really early on (around EKS 1.18 or somewhere around there) was that the AWS VPC-CNI wasn’t managing the routing tables on the node properly. We raised an issue on the AWS VPC CNI (we were on CentOS 7 at the time) and although AWS said they’d fixed the issue, we currently need to patch the routing tables every minute on our nodes.

What happens?

When you get past the number of IP addresses that a single ENI can have (typically ~12), the AWS VPC-CNI will attach a second interface to the worker and start adding new IP addresses to that. The VPC-CNI should set up routing for that second interface, but for some reason, in our case, it doesn’t. You can see this happening with a tcpdump: the traffic comes in on the second ENI, eth1, but then tries to exit the node on the first ENI, eth0, like this:

[root@test-i-01234567890abcdef ~]# tcpdump -i any host 192.0.2.123
tcpdump: data link type LINUX_SLL2
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
09:38:07.331619 eth1  In  IP ip-192-168-1-100.eu-west-1.compute.internal.41856 > ip-192-0-2-123.eu-west-1.compute.internal.irdmi: Flags [S], seq 1128657991, win 64240, options [mss 1359,sackOK,TS val 2780916192 ecr 0,nop,wscale 7], length 0
09:38:07.331676 eni989c4ec4a56 Out IP ip-192-168-1-100.eu-west-1.compute.internal.41856 > ip-192-0-2-123.eu-west-1.compute.internal.irdmi: Flags [S], seq 1128657991, win 64240, options [mss 1359,sackOK,TS val 2780916192 ecr 0,nop,wscale 7], length 0
09:38:07.331696 eni989c4ec4a56 In  IP ip-192-0-2-123.eu-west-1.compute.internal.irdmi > ip-192-168-1-100.eu-west-1.compute.internal.41856: Flags [S.], seq 3367907264, ack 1128657992, win 26847, options [mss 8961,sackOK,TS val 1259768406 ecr 2780916192,nop,wscale 7], length 0
09:38:07.331702 eth0  Out IP ip-192-0-2-123.eu-west-1.compute.internal.irdmi > ip-192-168-1-100.eu-west-1.compute.internal.41856: Flags [S.], seq 3367907264, ack 1128657992, win 26847, options [mss 8961,sackOK,TS val 1259768406 ecr 2780916192,nop,wscale 7], length 0

The critical line here is the last one – it’s come in on eth1 and it’s going out of eth0. Another test here is to look at ip rule

[root@test-i-01234567890abcdef ~]# ip rule
0:	from all lookup local
512:	from all to 192.0.2.111 lookup main
512:	from all to 192.0.2.143 lookup main
512:	from all to 192.0.2.66 lookup main
512:	from all to 192.0.2.113 lookup main
512:	from all to 192.0.2.145 lookup main
512:	from all to 192.0.2.123 lookup main
512:	from all to 192.0.2.5 lookup main
512:	from all to 192.0.2.158 lookup main
512:	from all to 192.0.2.100 lookup main
512:	from all to 192.0.2.69 lookup main
512:	from all to 192.0.2.129 lookup main
1024:	from all fwmark 0x80/0x80 lookup main
1536:	from 192.0.2.123 lookup 2
32766:	from all lookup main
32767:	from all lookup default

Notice here that we have two entries from all to 192.0.2.123 lookup main and from 192.0.2.123 lookup 2. Let’s take a look at what lookup 2 gives us, in the routing table

[root@test-i-01234567890abcdef ~]# ip route show table 2
192.0.2.1 dev eth1 scope link

Fix the issue

This is pretty easy – we need to add a default route if one doesn’t already exist. Long before I got here, my boss created a script which first runs ip route show table main | grep default to get the gateway for that interface, then runs ip rule list, looks for each lookup <number> and finally runs ip route add to put the default route on that table, the same as on the main table.

ip route add default via "${GW}" dev "${INTERFACE}" table "${TABLE}"
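For illustration, here’s a minimal sketch of that logic (not the original script; it assumes bash, root, and that the gateway and interface from the main table are what every extra table should use):

#!/usr/bin/env bash
# Make sure every policy-routing table referenced by "ip rule" has a
# default route, copied from the main table (a sketch of the idea above).
set -euo pipefail

# Gateway and interface from the main table's default route,
# e.g. "default via 10.0.0.1 dev eth0 proto dhcp ..."
read -r _ _ GW _ INTERFACE _ < <(ip route show table main | grep '^default')

# Every "lookup <number>" table referenced by the policy rules
for TABLE in $(ip rule list | grep -o 'lookup [0-9]\+' | awk '{print $2}' | sort -u); do
    # Only add a default route if the table doesn't already have one
    if ! ip route show table "$TABLE" | grep -q '^default'; then
        ip route add default via "${GW}" dev "${INTERFACE}" table "${TABLE}"
    fi
done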

Is this still needed?

I know when we upgraded our cluster from EKS1.23 to EKS1.27, this script was still needed. When I’ve just checked a worker running EKS1.31, after around 12 hours of running, and a second interface being up, it’s not been needed… so perhaps we can deprecate this script?

Dropping packets to the containers due to Martians

When we upgraded our cluster from EKS1.23 to EKS1.27 we also changed a lot of the infrastructure under the surface (AlmaLinux 9 from CentOS7, Containerd and Runc from Docker, CGroups v2 from CGroups v1, and so on). We also moved from using an AWS Elastic Load Balancer (ELB) or “Classic Load Balancer” to AWS Network Load Balancer (NLB).

Following the upgrade, we started seeing packets not arriving at our containers and the system logs on the node were showing a lot of martian source messages, particularly after we configured our NLB to forward original IP source addresses to the nodes.

What happens

One thing we noticed was that each time we added a new pod to the cluster, it added a new eni[0-9a-f]{11} interface, but the sysctl value for net.ipv4.conf.<interface>.rp_filter (return path filtering – basically, should we expect traffic from that source to be arriving at this interface?) was set to 1, or “Strict mode”, where the source MUST be reachable via the best return path for the interface it arrived on. The AWS VPC-CNI is supposed to set this to 2, or “Loose mode”, where the source only has to be reachable from any interface.

In this case you can tell it’s happening because you’ll see entries like this in your system journal (assuming you’ve got net.ipv4.conf.all.log_martians=1 configured):

Dec 03 10:01:19 test-i-01234567890abcdef kernel: IPv4: martian source 192.168.1.100 from 192.0.2.123, on dev eth1

The net result is that packets would be dropped by the host at this point, and they’d never be received by the containers in the pods.

Fix the issue

This one is also pretty easy. We run sysctl -a and loop through any entries which match net.ipv4.conf.([^\.]+).rp_filter = (0|1) and then, if we find any, we run sysctl -w net.ipv4.conf.\1.rp_filter = 2 to set it to the correct value.
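As a minimal sketch of that loop (again assuming bash and root, run from the same per-minute timer):

# Switch strict (1) or disabled (0) reverse-path filtering to loose
# mode (2) on every per-interface conf entry, as described above.
sysctl -a 2>/dev/null |
    grep -E '^net\.ipv4\.conf\.[^.]+\.rp_filter = [01]$' |
    cut -d' ' -f1 |
    while read -r key; do
        sysctl -w "${key}=2"
    done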

Is this still needed?

Yep, absolutely. As of our latest upgrade to EKS1.31, if this value isn’t set, then it will drop packets. VPC-CNI should be fixing this, but for some reason it doesn’t. And setting net.ipv4.conf.all.rp_filter to 2 doesn’t seem to make a difference, which is contrary to the relevant kernel documentation.

After 12 IP addresses are assigned to a node, Kubernetes services stop working for some pods

This was pretty weird. When we upgraded to EKS1.31 on our smallest cluster we initially thought we had an issue with CoreDNS, in that it sometimes wouldn’t resolve IP addresses for services (DNS names for services inside the cluster are resolved by <servicename>.<namespace>.svc.cluster.local to an internal IP address for the cluster – in our case, in the range 172.20.0.0/16). We upgraded CoreDNS to the EKS1.31 recommended version, v1.11.3-eksbuild.2 and that seemed to fix things… until we upgraded our next largest cluster, and things REALLY went wrong, but only when we had increased to over 12 IP addresses assigned to the node.

You might see this as frequent restarts of a container, particularly if you’re reliant on another service to fulfil an init container or the liveness/readiness check.

What happens

EKS1.31 moves KubeProxy from iptables or ipvs mode to nftables – a shift we had to make internally as AlmaLinux 9 no longer supports iptables mode, and ipvs is often quite flaky, especially when you have a lot of pod movements.

With a single interface and up to 11 IP addresses assigned to that interface, everything runs fine, but the moment we move to that second interface, much like in the first case above, we start seeing those pods attached to the second+ interface being unable to resolve service addresses. On further investigation, doing a dig from a container inside that pod to the service address of the CoreDNS service 172.20.0.10 would time out, but a dig against the actual pod address 192.0.2.53 would return a valid response.

Under the surface, on each worker, KubeProxy adds a rule to nftables to say “if you try and reach 172.20.0.10, please instead direct it to 192.0.2.53”. As the containers fluctuate inside the cluster, KubeProxy is constantly re-writing these rules. For whatever reason though, KubeProxy currently seems unable to determine that a second or subsequent interface has been added, and so these rules are not applied to the pods attached to that interface… or at least, that’s what it looks like!

Fix the issue

In this case, we wrote a separate script which was also triggered every minute. This script checks whether the set of eth[0-9]+ interfaces has changed, by running ip link; if it has, it runs crictl pods (which lists all the running pods in Containerd), looks for the Pod ID of KubeProxy, and then runs crictl stopp <podID> [1] and crictl rmp <podID> [1] to stop and remove the pod, forcing kubelet to restart the KubeProxy on the node.

[1] Yes, they aren’t typos, stopp means “stop the pod” and rmp means “remove the pod”, and these are different to stop and rm which relate to the container.
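A minimal sketch of that check (assumptions: the kube-proxy pod sandbox is named kube-proxy, so crictl’s --name filter finds it, and the previous interface state is kept in a file between runs of the timer):

# Compare the current set of eth* interfaces with the last run; if it
# changed, bounce the KubeProxy pod sandbox so kubelet recreates it.
STATE=/var/run/eks-eth-interfaces
CURRENT=$(ip -o link show | awk -F': ' '$2 ~ /^eth[0-9]+$/ {print $2}' | sort)

if [ ! -f "$STATE" ] || [ "$CURRENT" != "$(cat "$STATE")" ]; then
    printf '%s\n' "$CURRENT" > "$STATE"
    POD_ID=$(crictl pods --name kube-proxy -q | head -n 1)
    if [ -n "$POD_ID" ]; then
        crictl stopp "$POD_ID"
        crictl rmp "$POD_ID"
    fi
fi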

Is this still needed?

As this was what I was working on all day yesterday, yep, I’d say so 😊 – in all seriousness though, if this hadn’t been a high-priority issue on the cluster, I might have tried to upgrade the AWS VPC-CNI and KubeProxy add-ons to a later version, to see if the issue was resolved, but at this time, we haven’t done that, so maybe I’ll issue a retraction later 😂

Featured image is “Apoptosis Network (alternate)” by “Simon Cockell” on Flickr and is released under a CC-BY license.

Ross YoungerGetting value from CI

Many of my peers within the software world know the value of Continuous Integration and don’t need convincing. This article is for everybody else.

Introduction

In my first job out of college we had what you’d recognise as CI, though the term wasn’t so popular then. It was powerful, very useful, but a source of Byzantine complexity.

I’ve also worked for people who didn’t think CI was worth doing because it was too expensive to set up and maintain. This is not totally unreasonable; the real question is to figure out where the value for your project might lie.

Recently, a friend wrote:

I don't really know very much about CI. I would be interested in knowing more and might even use some of the quick wins (...) I do not want to become completely reliant upon GitHub for anything.

So let’s start with a primer.

Terminology: What is CI?

Unfortunately the term “CI” is sometimes misused and/or confused.

The short answer is that it’s automation that regularly (continuously) does something useful with your codebase. These actions might take place on every commit, nightly, or be activated by some external trigger.

CI usually refers to a spectrum of practices, each step building on the last:

  • Continuous Build: builds your code, usually to the unit or module level. Runs unit tests.
  • Continuous Integration: assembles modules into a “finished application”, whatever that means. Runs integration tests.
  • Continuous Test: a full suite of automated tests. May include regression, performance, deployability and data migration.
  • Continuous Delivery: when the test suite passes, the latest version of the system is automatically released to a staging environment. This might involve building packages and putting them in a download area.
  • Continuous Deployment: when the automated tests pass, the software automatically goes live. Hold tight!

Exactly what these phases mean, and how far you go with them, depends on your project.

  • What suits my embedded firmware probably won’t suit your cloud app or that other person’s desktop app.
  • The lines between the phases are blurry. For example, it may or may not make sense to build and integrate everything in one go.

Why CI

If deployed appropriately, CI can save time, reduce costs and improve quality. Even on a hobby project, there is often value in saving your time.

1. Automating stuff, so the humans don’t have to

You could use your engineers to do the repetitive drudge work of creating a release across multiple platforms. You could have them run a full barrage of tests before committing a code change… but should you? Engineers are expensive and generally dislike boring stuff, so the smart business move is usually to automate away the repetitive parts and have them focus where they can deliver most value.

If you’re not sure, consider this: how much time does your team spend per release cycle on the repetitive parts? Multiply that by your expected frequency of release cycles, and that should lead you to the answer.

2. Automatic analysis and status reporting

One place I worked had a release process which relied on an engineer reading multiple megabytes of log file to see if things had been successful. Many things could go wrong and leave the final output in a plausible but half-broken state. Worse, it wasn’t as simple as running the script in stop-on-error mode, because some of the steps were prone to false alarms.

You may be ahead of me here, but I didn’t think much of that setup.

Compilation failed? Show me the compiler output from the file that failed.

A test failed? I want to see the result of that test (expected & observed results).

Everything passed? Great, but don’t spend megabytes to convey one single bit of information.

At its simplest, a small project will have a single main branch, and the operational information you need can be boiled down to a small number of states:

  • Red traffic light: something is broken
  • Yellow traffic light: non-critical warning (not all projects use this)
  • Green traffic light: everything is working

In a non-remote workplace it might make sense to set up some sort of status annunciator.

  • Some people use coloured lava lamps or similar.
  • At one place I worked the machinery in the factory had physical traffic light (andon) lamp sets. We set one of these up, driven by a Raspberry Pi wired in to the build server.
  • Some projects build more elaborate virtual dashboards that suit their needs. Multiple branches, multiple build configurations, whatever makes sense.

3. Improved quality

This one might be self-evident, but I’ll spell it out anyway.

A good CI system will let you incorporate tests of many different types, with variable pass/fail criteria. Think beyond unit and integration testing:

  • Regression (check that your bugs stay fixed)
  • Code quality (code/test coverage analysis; static analysis; dynamic memory leak analysis; automated code style checks)
  • Security analysis (are there any known issues in your dependencies?)
  • License/SBOM compliance
  • Fuzz testing (how does it handle randomised, unexpected inputs?)
  • Performance requirements
  • “Early warning” performance canaries
  • Standards compliance
  • System data migration
  • On-device testing (might be real, emulated or simulated hardware)
Performance canaries

Particularly where physical devices are involved, you might have a performance margin built in to your hardware spec. As the project evolves, new features will inevitably erode this margin. When you run out, things are going to go wrong, so you want to take action before you get there.

An early warning canary is some sort of metric with a threshold. Examples might include free memory, CPU/MCU consumption, or task processing time. When the threshold is passed, that's a sign that things are getting tight and it's time to take pre-emptive action. You might plan to spend some time on algorithmic optimisations, or to kick off a new hardware spin.
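As a toy illustration (not from any particular project), a canary can be as small as a threshold check that fails the job:

# Fail the job when available memory on the test device drops below an
# arbitrary threshold.
THRESHOLD_MB=64
FREE_MB=$(awk '/^MemAvailable:/ {print int($2/1024)}' /proc/meminfo)

if [ "$FREE_MB" -lt "$THRESHOLD_MB" ]; then
    echo "Canary tripped: only ${FREE_MB}MB available (threshold ${THRESHOLD_MB}MB)" >&2
    exit 1
fi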

If you can automate a really robust set of tests, you can have a lot of confidence in the state of your code at any given time. This gives incredible agility: you can release at any time, if the tests pass. This is the key to moving quickly, and is how a number of tech companies operate.

For a success story involving physical devices, check out the HP LaserJet team’s DevOps transformation.

4. Reduced time to resolve issues

If there’s one thing I’ve learned in the software business, it’s that it’s cheaper to find bugs closer to development - by orders of magnitude.

In other words, reduce the feedback cycle to reduce your costs. This is where automated tests and checks have great value.

  • If there is something wrong in code I modified a minute ago, I’m still in the right headspace and can usually fix it pretty quickly.
  • If it takes a few days to get a test result, I won’t remember all the detail and will have to refresh my memory.
  • If it takes several months to hear that something’s wrong, I may be working on a totally different part of the system and it will take longer to context switch.
  • If a bug report comes in from the field a year or two later, I might as well be starting again from scratch.

But - as ever - engineering is a trade-off. You can’t write a test to catch a bug you haven’t foreseen. It may be prohibitively expensive to test all possible combinations before release.

Why not CI

CI is not suitable for all software projects.

If you’re writing a scratch throw-away project that won’t live for very long, even simple CI may not be worth it.

If you have a legacy codebase that was written without testing in mind, it might be prohibitively expensive to refactor to set these up. Nevertheless, in such projects there is often still some value to be found in a continuous build.

Let’s be pragmatic.

Tests aren’t everything

On the face of it, more testing means greater quality, right? Well… maybe?

Keep the end goal in sight. It’s up to you to decide what makes sense for your situation; I recommend taking a whole-of-organisation view.

  • You need to balance test runtime against overall feedback cycles. If the tests take too long to run, you’re slowing people down.
  • Some tests are expensive in terms of time or consuming resources, so you might not want to run them daily.
  • Tests involving physical devices can be difficult to automate, and risk creating a process bottleneck. (Consider emulation and/or simulation where appropriate.)
  • Beware of over-testing; you may not need to exhaustively check all the combinations. Statistical techniques might help you out here.
  • Beware of making your black-box tests too strict; this can lead to brittle tests that are more hassle to maintain than they are worth.

Costs and maintenance

It will take time and effort to set up CI. How much time and effort, I can’t say.

In times past, CI was quite the bespoke effort.

These days there is good tooling support for many environments, so it is usually pretty quick to get something going. From there you can decide how far to go.

It might be too big for your platform

Cloud-hosted CI platforms are designed for small, lightweight jobs. Think seconds to minutes, not hours.

If you need to build a large application or a full Yocto firmware image, it’s going to be tough to make that fit within the limits of a cloud-hosted CI platform. Don’t despair! There are ways out, but you need to be smart. Alternative options include:

  • self-hosting CI runners that take part in a cloud-hosted source repository;
  • self-hosting the CI environment (e.g. Gitlab, Jenkins, CircleCI), noting that most source code hosting platforms have integrations;
  • splitting up the task into multiple smaller CI jobs, making good use of artefacts between stages;
  • reconsidering what is truly worth automating anyway.

👷 Steps you can take

1. Build your units

In most projects you’ve already had to set up a build system. Automating this is usually pretty cheap, though you will need to get the tooling right.

Tooling on cloud platforms

On-cloud CI (as provided by Github, Gitlab, Bitbucket and others) is generally containerised. What this means is that your project has to know how to install its own tooling, starting with a minimal (usually Linux) container image.

This is really good practice! Doing so means your required tools are themselves expressed in source code under revision control.

Where this might get tricky is if you have multiple build configurations (platforms or builds with different features). Don’t be surprised if automating reveals shortcomings in your setup.

If you have autogenerated documentation, consider running that too. (In Rust, for example, it could be as easy as adding a cargo doc step.)
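To make that concrete, a build-stage job often boils down to a handful of commands; here’s a sketch for a Rust project (the CI platform wrapping of runners, YAML and caching is deliberately left out):

set -e
cargo build --all-targets   # build the units
cargo test                  # run the unit tests
cargo doc --no-deps         # build the autogenerated documentation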

2. Test your units

Adding unit tests to CI is usually pretty cheap, though it will depend on the language and available test frameworks.

If you want to include language-intrinsic checks (e.g. code style, static analysis) this is a good time to build them in. Some analyses can be quite expensive so it may not make sense to run all the checks at the same frequency.

3. Integrate it

If you’re pulling multiple component parts (microservices, standalone executables) together to make an end result, that’s the next step. Do they play nicely? Do you want to run any isolated tests among them before you move to delivery-level tests?

4. Add more checks

I spoke about these above.

This is where things stop being cheap and you have to start thinking about building out supporting infrastructure.

5. Deliver it

Now we’re getting quite situation-specific. Think about what it means to deliver your project.

Are you building a package for an ecosystem (Rust crate / Python PyPI / npm / …)? You might be able to automate the packaging steps, and that might be pretty cheap.

Are you building an application? Perhaps you can automate the process of building the installer / container / whatever shape it takes. If you have multiple build configurations or platforms, it could get very tedious to build them all by hand and there is often a win for automation.

Where there's code signing involved, you'll need to decide whether it makes sense to automate that or leave it as a manual release step. Never put private keys or other code signing secrets directly into source! Some platforms have secrets mechanisms that may be of use, but it pays to be cautious. If your secrets leak, how will you repair the situation?

Closing thoughts

  • Most projects will benefit from a little CI. You don’t need to have unit tests, though they are a good idea.
  • You’re going to have to maintain your CI, so build it for maintainability like you do your software.
  • Apply agile to your CI as you do to your deliverables. Perfect is the enemy of good enough. Build something, get feedback, iterate!
  • CI vendors want to lock you in to their platform. Keep your eyes open.
  • Don’t let CI become an all-consuming monster that prevents you from delivering in the first place!

Andy SmithCheck yo PTRs

Backstory

The other day I was looking through a log file and saw one of BitFolk's IP addresses doing something. I didn't recognise the address so I did a reverse lookup and got 2001-ba8-1f1-f284-0-0-0-2.autov6rev.bitfolk.space — which is a generic setting and not very useful.

It's quick to look this up and fix it of course, but I wondered how many other such addresses I had forgotten to take care of the reverse DNS for.

ptrcheck

In order to answer that question, automatically and in bulk, I wrote ptrcheck.

It was able to tell me that almost all of my domains had at least one reference to something without a suitable PTR record.

$ ptrcheck --server [::1] --zone strugglers.net
192.168.9.10 is pointed to by: intentionally-broken.strugglers.net.
  Missing PTR for 192.168.9.10
1 missing/broken PTR record
$

Though it wasn't all bad news. 😀

$ ptrcheck --server [::1] --zone dogsitter.services -v
Connecting to ::1 port 53 for AXFR of zone dogsitter.services
Zone contains 57 records
Found 3 unique address (A/AAAA) records
2001:ba8:1f1:f113::80 is pointed to by: dogsitter.services., dev.dogsitter.services., www.dogsitter.services.
  Found PTR: www.dogsitter.services.
85.119.84.147 is pointed to by: dogsitter.services., dev.dogsitter.services., tom.dogsitter.services., www.dogsitter.services.
  Found PTR: dogsitter.services.
2001:ba8:1f1:f113::2 is pointed to by: tom.dogsitter.services.
  Found PTR: tom.dogsitter.services.
100.0% good PTRs! Good job!
$

How it works

See the repository for full details, but briefly: ptrcheck does a zone transfer of the zone you specify and keeps track of every address (A / AAAA) record. It then does a PTR query for each unique address record to make sure it

  1. exists
  2. is "acceptable"

You can provide a regular expression for what you deem to be "unacceptable", otherwise any PTR content at all is good enough.
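The same flow as a rough shell sketch (not ptrcheck itself; ZONE and SERVER are placeholders, and the "acceptable" regex check is left out):

ZONE=example.com
SERVER=::1

# AXFR the zone, collect the unique A/AAAA addresses, then check each has a PTR
dig @"$SERVER" "$ZONE" AXFR +onesoa |
    awk '$4 == "A" || $4 == "AAAA" {print $5}' | sort -u |
    while read -r addr; do
        ptr=$(dig +short -x "$addr")
        if [ -z "$ptr" ]; then
            echo "Missing PTR for $addr"
        fi
    done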

Why might a PTR record be "unacceptable"??

I am glad you asked.

A lot of hosting providers generate generic PTR records when the customer doesn't set their own. They're not a lot better than having no PTR at all.

Failure to comply is no longer an option (for me)

The program runs silently (unless you use --verbose) so I was able to make a cron job that runs once a day and complains at me if any of my zones ever refer to a missing or unacceptable PTR ever again!

By the way, I ran it against all BitFolk customer zones; 26.5% of them had at least one missing or generic PTR record.

BitFolk WikiMonitoring

Setup: NRPE example config

← Older revision Revision as of 11:17, 19 November 2024
Line 21: Line 21:


These sorts of checks can work without an agent (i.e. without anything installed on your VPS). More complicated checks such as disk space, load or anything else that you can check with a script will need some sort of agent such as an [https://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE--2D-Nagios-Remote-Plugin-Executor/details NRPE] daemon or [[Wikipedia:SNMP|SNMP]] daemon.
These sorts of checks can work without an agent (i.e. without anything installed on your VPS). More complicated checks such as disk space, load or anything else that you can check with a script will need some sort of agent such as an [https://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE--2D-Nagios-Remote-Plugin-Executor/details NRPE] daemon or [[Wikipedia:SNMP|SNMP]] daemon.
===NRPE===
NRPE is a typical agent you would run that would allow BitFolk's monitoring system to execute health checks on your VPS. On Debian/Ubuntu systems it can be installed from the package '''nagios-nrpe-server'''. This will normally pull in the package '''monitoring-plugins-basic''' which contains the check plugins.
Check plugins end up in the '''/usr/lib/nagios/plugins/''' directory. NRPE can run any of these when asked and feed the info back to BitFolk's Icinga. All of the existing ones should support a <code>--help</code> argument to let you know how to use them, e.g.
<syntaxhighlight lang="text">
$ /usr/lib/nagios/plugins/check_tcp --help
</syntaxhighlight>
You can run check plugins from the command line:
<syntaxhighlight lang="text">
$ /usr/lib/nagios/plugins/check_tcp -H 85.119.82.70 -p 443
TCP OK - 0.000 second response time on 85.119.82.70 port 443|time=0.000322s;;;0.000000;10.000000
</syntaxhighlight>
There are a large number of Nagios-compatible check plugins in existence so you should be able to find one that does what you want. If there isn't, it's easy to write one. Here's an example of using '''check_disk''' to check the disk space of your root filesystem.
<syntaxhighlight lang="text">
$ /usr/lib/nagios/plugins/check_disk -w '10%' -c '4%' -p /
DISK OK - free space: / 631 MB (11% inode=66%);| /=5035MB;5381;5739;0;5979
</syntaxhighlight>
Once you have that working, you put it in an NRPE config file such as '''/etc/nagios/nrpe.d/xvda1.cfg'''.
<syntaxhighlight lang="text">
command[check_xvda1]=/usr/lib/nagios/plugins/check_disk -w '10%' -c '4%' -p /
</syntaxhighlight>
You should then tell BitFolk (in a support ticket) what the name of it is ("'''check_xvda1'''"). It will then get added to BitFolk's Icinga.
By this means you can check anything you can script.


==Alerts==
==Alerts==

BitFolk WikiUser:Equinox/WireGuard

← Older revision Revision as of 20:33, 13 November 2024
Line 1: Line 1:


'''STOP !!!!!!!!'''


'''This is NOT READY ! I guarantee it won't work yet. (Mainly the routing, also table=off needs research, I can't remember exactly what it does).'''
'''STOP !!!!!'''


'''Is is untried / untested. It's a first draft fished out partly from my running system and partly my notes.'''
'''This is a first draft! If you're a hardy network type who can recover from errors / omissions in this page then go for it (and fix this page!)'''
 
'''HOWEVER...''' If you are a hardy network type and you want a go ... Have at it.




Line 32: Line 29:
PrivateKey = # Insert the contents of the file server-private-key generated above
PrivateKey = # Insert the contents of the file server-private-key generated above
ListenPort = # Pick an empty UDP port to listen on. Remember to open it in your firewall
ListenPort = # Pick an empty UDP port to listen on. Remember to open it in your firewall
Address = 10.254.1.254/24
Address = 10.254.1.254/32
Address = 2a0a:1100:1018:1::fe/64
Address = 2a0a:1100:1018:1::fe/128
Table = off
</syntaxhighlight>
</syntaxhighlight>


Line 53: Line 49:
Address = 10.254.1.1/24
Address = 10.254.1.1/24
Address = 2a0a:1100:1018:1::1/64
Address = 2a0a:1100:1018:1::1/64
Table = off
</syntaxhighlight>
</syntaxhighlight>


Line 60: Line 55:
=== Add the Client Information to the Server ===
=== Add the Client Information to the Server ===


Append the following to the server /etc/wireguard/wg0.conf, inserting your generated information where appropriate:
Append the following to the '''server''' /etc/wireguard/wg0.conf, inserting your generated information where appropriate:


<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
Line 73: Line 68:
=== Add the Server Information to the Client ===
=== Add the Server Information to the Client ===


Append the following to the server /etc/wireguard/wg0.conf, inserting your generated information where appropriate:
Append the following to the '''client''' /etc/wireguard/wg0.conf, inserting your generated information where appropriate:


<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
Line 93: Line 88:
# systemctl start wg-quick@wg0
# systemctl start wg-quick@wg0
# systemctl enable wg-quick@wg0      # Optional - start VPN at startup
# systemctl enable wg-quick@wg0      # Optional - start VPN at startup
</syntaxhighlight>
If all went well you should now have a working tunnel. Confirm by running:
<syntaxhighlight lang="text">
# wg
</syntaxhighlight>
If both sides have a reasonable looking "latest handshake" line then the tunnel is up.
The wg-quick scripts automatically set up routes / default routes based on the contents of the wg0.conf files, so at this point you can test the link by pinging addresses from either side.
Two further things may/will need to be configured to allow full routing...
==== Enable IP Forwarding ====
Edit /etc/sysctl.conf, or a local conf file in /etc/sysctl.d/ and enable IPv4 and/or IPv6 forwarding
<syntaxhighlight lang="text">
net.ipv4.ip_forward=1
net.ipv6.conf.all.forwarding=1
</syntaxhighlight>
Reload the kernel variables:
<syntaxhighlight lang="text">
systemctl reload procps
</syntaxhighlight>
==== WireGuard Max MTU Size ====
If some websites don't work properly over IPv6 (Netflix) you may be running into MTU size problems. If using nftables, this can be entirely fixed with the following line in the forward table:
<syntaxhighlight lang="text">
oifname "wg0" tcp flags syn tcp option maxseg size set rt mtu
</syntaxhighlight>
==== Enable Forwarding In Your Firewall ====
You will have to figure out how to do this for your flavour of firewall. If you happen to be using nftables, the following snippet is an example (by no means a full config!) of how to forward IPv6 back and forth. This snippet allows all outbound traffic but throws incoming traffic to the table "ipv6-incoming-firewall" for further filtering. (For testing you could just "accept" but don't leave it like that!)
<syntaxhighlight lang="text">
chain ip6-forwarding {
    type filter hook forward priority 0; policy drop;
    oifname "wg0" tcp flags syn tcp option maxseg size set rt mtu
    ip6 saddr 2a0a:1100:1018:1::/64 accept
    ip6 daddr 2a0a:1100:1018:1::/64 jump ipv6-incoming-firewall
}
</syntaxhighlight>
</syntaxhighlight>


Jon SpriggsTalk Summary – OggCamp ’24 – Kubernetes, A Guide for Docker users

Format: Theatre Style room. ~30 attendees.

Slides: Available to view (Firefox/Chrome recommended – press “S” to see the required speaker notes)

Video: Not recorded. I’ll try to record it later, if I get a chance.

Slot: Graphine 1, 13:30-14:00

Notes: Apologies for the delay on posting this summary. The talk was delivered to a very busy room. Lots of amazing questions. The presenter notes were extensive, but entirely unused when delivered. One person asked a question, I said I’d follow up with them later, but didn’t find them before the end of the conference. One person asked about the benefits of EKS over ECS in AWS… as I’ve not used ECS, I couldn’t answer, but it sounds like they largely do the same thing.

Ross YoungerAnnouncing qcp

The QUIC Copier (qcp) is an experimental high-performance remote file copy utility for long-distance internet connections.

Source repository: https://github.com/crazyscot/qcp

📋 Features

  • 🔧 Drop-in replacement for scp
  • 🛡️ Similar security to scp, using existing, well-known mechanisms
  • 🚀 Better throughput on congested networks

📖 About qcp

qcp is a hybrid protocol combining ssh and QUIC.

We use ssh to establish a control channel to the target machine, then spin up the QUIC protocol to transfer data.

This has the following useful properties:

  • User authentication is handled entirely by ssh
  • Data is transmitted over UDP, avoiding known issues with TCP over “long, fat pipe” connections
  • Data in transit is protected by TLS using ephemeral keys
  • The security mechanisms all use existing, well-known cryptographic algorithms

For full documentation refer to qcp on docs.rs.

Motivation

I needed to copy multiple large (3+ GB) files from a server in Europe to my home in New Zealand.

I’ve got nothing against ssh or scp. They’re brilliant. I’ve been using them since the 1990s. However they run on top of TCP, which does not perform very well when the network is congested. With a fast fibre internet connection, a long round-trip time and noticeable packet loss, I was right in the sour spot. TCP did its thing and slowed down, but when the congestion cleared it was very slow to get back up to speed.

If you’ve ever been frustrated by download performance from distant websites, you might have been experiencing this same issue. Friends with satellite (pre-Starlink) internet connections seem to be particularly badly affected.

💻 Getting qcp

The project is a Rust binary crate.

You can install it:

  • as a Debian package or pre-compiled binary from the latest qcp release page (N.B. the Linux builds are static musl binaries);
  • with cargo install qcp (you will need to have a rust toolchain and capnpc installed);
  • by cloning and building the source repository.

You will need to install qcp on both machines. Please refer to the README for more.

See also

Andy SmithProtecting URIs from Tor nodes with the Apache HTTP Server

Recently I found one of my web services under attack from clients using Tor.

For the most part I am okay with the existence of Tor, but if you're being attacked largely or exclusively through Tor then you might need to take actions like:

  • Temporarily or permanently blocking access entirely.
  • Taking away access to certain privileged functions.

Here's how I did it.

Step 1: Obtain a list of exit nodes

Tor exit nodes are the last hop before reaching regular Internet services, so traffic coming through Tor will always have a source IP of an exit node.

Happily there are quite a few services that list Tor nodes. I like https://www.dan.me.uk/tornodes which can provide a list of exit nodes, updated hourly.

This comes as a list of IP addresses one per line so in order to turn it into an httpd access control list:

$ curl -s 'https://www.dan.me.uk/torlist/?exit' |
    sed 's/^/Require not ip /' |
    sudo tee /etc/apache2/tor-exit-list.conf >/dev/null

This results in a file like:

$ head -10 /etc/apache2/tor-exit-list.conf
Require not ip 102.130.113.9
Require not ip 102.130.117.167
Require not ip 102.130.127.117
Require not ip 103.109.101.105
Require not ip 103.126.161.54
Require not ip 103.163.218.11
Require not ip 103.164.54.199
Require not ip 103.196.37.111
Require not ip 103.208.86.5
Require not ip 103.229.54.107

Step 2: Configure httpd to block them

Totally blocking traffic from these IPs would be easier than what I decided to do. If you just wanted to totally block traffic from Tor then the easy and efficient answer would be to insert all these IPs into an nftables set or an iptables IP set.
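For completeness, a sketch of what that blunt approach might look like with an nftables set (table, chain and set names are made up; this is not what I did):

nft add table inet filter
nft add chain inet filter input '{ type filter hook input priority 0; policy accept; }'
nft add set inet filter tor_exits '{ type ipv4_addr; }'
nft add rule inet filter input ip saddr @tor_exits drop
# ...then feed the downloaded (IPv4) list into the set, e.g.:
# nft add element inet filter tor_exits "{ $(curl -s 'https://www.dan.me.uk/torlist/?exit' | grep -v : | paste -sd, -) }"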

For me, it's only some URIs on my web service that I don't want these IPs accessing and I wanted to preserve the ability of Tor's non-abusive users to otherwise use the rest of the service. An httpd access control configuration is necessary.

Inside the virtualhost configuration file I added:

    <Location /some/sensitive/thing>
        <RequireAll>
            Require all granted
            Include /etc/apache2/tor-exit-list.conf
        </RequireAll>
    </Location>

Step 3: Test configuration and reload

It's a good idea to check the correctness of the httpd configuration now. Aside from syntax errors in the list of IP addresses, this might catch if you forgot any modules necessary for these directives. Although I think they are all pretty core.

Assuming all is well then a graceful reload will be needed to make httpd see the new configuration.

$ sudo apache2ctl configtest
Syntax OK
$ sudo apache2ctl graceful

Step 4: Further improvements

Things can't be left there, but I haven't got around to any of this yet (a rough sketch of the first three items follows the list).

  1. Script the repeated download of the Tor exit node list. The list of active Tor nodes will change over time.
  2. Develop some checks on the list such as:
    1. Does it contain only valid IP addresses?
    2. Does it contain at least min number of addresses and less than max number?
  3. If the list changed, do the config test and reload again. httpd will not include the altered config file without a reload.
  4. If the list has not changed in x number of days, consider the data source stale and think about emptying the list.
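A rough sketch of the first three items (run from root's crontab; the minimum/maximum list sizes are arbitrary examples):

#!/usr/bin/env bash
set -euo pipefail

LIST=/etc/apache2/tor-exit-list.conf
NEW=$(mktemp)
trap 'rm -f "$NEW"' EXIT

curl -fsS 'https://www.dan.me.uk/torlist/?exit' |
    grep -E '^[0-9a-fA-F:.]+$' |            # crude "does this look like an IP?" check
    sed 's/^/Require not ip /' > "$NEW"

COUNT=$(wc -l < "$NEW")
if [ "$COUNT" -lt 500 ] || [ "$COUNT" -gt 10000 ]; then
    echo "Suspicious list size ($COUNT), refusing to install it" >&2
    exit 1
fi

# Only touch httpd if the list actually changed
if ! cmp -s "$NEW" "$LIST"; then
    cp "$LIST" "$LIST.bak"
    cp "$NEW" "$LIST"
    if apache2ctl configtest; then
        apache2ctl graceful
    else
        cp "$LIST.bak" "$LIST"              # roll back a broken list
        exit 1
    fi
fi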

Performance thoughts

I have not checked how much this impacts performance. My service is not under enough load for this to be noticeable for me.

At the moment the Tor exit node list is around 2,100 addresses and I don't know how efficient the Apache HTTP Server is about a large list of Require not ip directives. Worst case is that for every request to that URI it will be scanning sequentially through to the end of the list.

I think that using httpd's support for DBM files in RewriteMaps might be quite efficient but this comes with the significant issue that IPv6 addresses have multiple formats, while a DBM lookup will be doing a literal text comparison.

For example, all of the following represent the same IPv6 address:

  • 2001:db8::
  • 2001:0DB8::
  • 2001:Db8:0000:0000:0000:0000:0000:0000
  • 2001:db8:0:0:0:0:0:0

httpd does have built-in functions to upper- or lower-case things, but not to compress or expand an IPv6 address. httpd access control directives are also able to match the request IP against a CIDR net block, although at the moment Dan's Tor node list does only contain individual IP addresses. At a later date one might like to try to aggregate those individual IP addresses into larger blocks.

httpd's RewriteMaps can also query an SQL server. Querying a competent database implementation like PostgreSQL could be made to alleviate some of those concerns if the data were represented properly, though this does start to seem like an awful lot of work just for an access control list!

Over on Fedi, it was suggested that a firewall rule — presumably using an nftables set or iptables IP set, which are very efficient — could redirect matching source IPs to a separate web server on a different port, which would then do the URI matching as necessary.

<nerdsnipe>There does not seem to be an Apache HTTP Server authz module for IP sets. That would be the best of both worlds!</nerdsnipe>

BitFolk WikiIPv6/VPNs

Using WireGuard

← Older revision Revision as of 16:31, 31 October 2024
Line 19: Line 19:
== Using WireGuard ==
== Using WireGuard ==
Probably the more sensible choice in the 2020s, but, help?
Probably the more sensible choice in the 2020s, but, help?
[[User:Equinox/WireGuard|Not ready yet, but it's a start]]


== Using tincd ==
== Using tincd ==

Chris WallaceMoving from exist-db 3.0.1 to 6.0.1 6.2.0

Moving from exist-db 3.0.1 to 6.0.1 6.2.0. That’s an awful lot of release notes to read through...

David LeadbeaterRestrict sftp with Linux user namespaces

A script to restrict SFTP to some directories, without needing chroot or other privileged configuration.

Andy SmithGenerating a link-local address from a MAC address in Perl

Example

On the host

$ ip address show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether aa:00:00:4b:a0:c1 brd ff:ff:ff:ff:ff:ff
[…]
    inet6 fe80::a800:ff:fe4b:a0c1/64 scope link 
       valid_lft forever preferred_lft forever

Generated by script

$ lladdr.pl aa:00:00:4b:a0:c1
fe80::a800:ff:fe4b:a0c1

Code

#!/usr/bin/env perl

use warnings;
use strict;
use 5.010;

if (not defined $ARGV[0]) {
    die "Usage: $0 MAC-ADDRESS"
}

my $mac = $ARGV[0];

if ($mac !~ /^
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}
    /ix) {
    die "'$mac' doesn't look like a MAC address";
}

my @octets = split(/:/, $mac);

# Algorithm:
# 1. Prepend 'fe80::' for the first 64 bits of the IPv6
# 2. Next 16 bits: Use first octet with 7th bit flipped, and second octet
#    appended
# 3. Next 16 bits: Use third octet with 'ff' appended
# 4. Next 16 bits: Use 'fe' with fourth octet appended
# 5. Next 16 bits: Use 5th octet with 6th octet appended
# = 128 bits.
printf "fe80::%x%02x:%x:%x:%x\n",
    hex($octets[0]) ^ 2,
    hex($octets[1]),
    hex($octets[2] . 'ff'),
    hex('fe' . $octets[3]),
    hex($octets[4] . $octets[5]);

See also

Alan PopeWhere are Podcast Listener Communities

Parasocial chat

On Linux Matters we have a friendly and active, public Telegram channel linked on our Contact page, along with a Discord Channel. We also have links to Mastodon, Twitter (not that we use it that much) and email.

At the time of writing there are roughly this ⬇️ number of people (plus bots, sockpuppets and duplicates) in or following each Linux Matters “official” presence:

Channel Number
Telegram 796
Discord 683
Mastodon 858
Twitter 9919

Preponderance of chat

We chose to have a presence in lots of places, but primarily the presenters (Martin, Mark, and myself (and Joe)) only really hang out to chat on Telegram and Mastodon.

I originally created the Telegram channel on November 20th, 2015, when we were publishing the Ubuntu Podcast (RIP in Peace) A.K.A. Ubuntu UK Podcast. We co-opted and renamed the channel when Linux Matters launched in 2023.

Prior to the channel’s existence, we used the Ubuntu UK Local Community (LoCo) Team IRC channel on Freenode (also, RIP in Peace).

We also re-branded our existing Mastodon accounts from the old Ubuntu Podcast to Linux Matters.

We mostly continue using Telegram and Mastodon as our primary methods of communication because on the whole they’re fast, reliable, stay synced across devices, have the features we enjoy, and at least one of them isn’t run by a weird billionaire.

Other options

We link to a lot of other places at the top of the Linux Matters home page, where our listeners can chat, mostly to each other and not us.

Being over 16, I’m not a big fan of Discord, and I know Mark doesn’t even have an account there. None of us use Twitter much anymore, either.

Periodically I ponder if we (Linux Matters) should use something other than Telegram. I know some listeners really don’t like the platform, but prefer other places like Signal, Matrix or even IRC. I know for sure some non-listeners don’t like Telegram, but I care less about their opinions.

Part of the problem is that I don’t think any of us really enjoy the other realtime chat alternatives. Both Matrix and Signal have terrible user experience, and other flaws. Which is why you don’t tend to find us hanging out in either of those places.

There are further options I haven’t even considered, like Wire, WhatsApp, and likely more I don’t even know or care about.

So we kept using Telegram over any of the above alternative options.

Pondering Posting Polls

I have repeatedly considered asking the listeners about their preferred chat platforms via our existing channels. But that seems flawed, because we use what we like, and no matter how many people prefer something else, we’re unlikely to move. Unless something strange happens 👀 .

Plus, often times, especially on decentralised platforms, the audience can be somewhat “over-enthusiastic” about their preferred way being The Way™️ over the alternatives. It won’t do us any favours to get data saying 40% report we should use Signal, 40% suggest Matrix and 20% choose XMPP, if the four of us won’t use any of them.

Pursue Podcast Palaver Proposals

So rather than ask our audience, I thought I’d see what other podcasters promote for feedback and chatter on their websites.

I picked a random set from shows I have heard of, and may have listened to, plus a few extra ones I haven’t. None of this is endorsement or approval, I wanted the facts, just the fax, ma’am.

I collated the data in a json file for some reason, then generated the tables below. I don’t know what to do with this information, but it’s a bit of data we may use if we ever decide to move away from Telegram.
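
Generating the tables from a JSON file needs nothing fancy. Here is a minimal sketch of the idea; the podcasts.json filename and its layout are made up for illustration, not the actual file I used:

import json

# Hypothetical layout: {"Linux Matters": ["EM", "MA", "TW", "DS", "TG", "MX"], ...}
COLUMNS = ["EM", "MA", "TW", "DS", "TG", "IR", "DW", "SK", "MX", "LI", "WF", "SG", "WA", "FB"]

with open("podcasts.json") as f:
    shows = json.load(f)

# Print one row of check marks per show, in the same column order as the tables.
print("Show " + " ".join(COLUMNS))
for show, platforms in shows.items():
    row = ["✅" if col in platforms else "  " for col in COLUMNS]
    print(show + " " + " ".join(row))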

Presenting Pint-Sized Payoff

The table shows some nerdy podcasts along with their primary means (as far as I can tell) of community engagement. Data was gathered manually from podcast home pages and “about” pages. I generally didn’t go into the page content for each episode. I made an exception for “Dot Social” and “Linux OTC” because there’s nothing but episodes on their home page.

It doesn’t matter for this research, I just thought it was interesting that some podcasters don’t feel the need to break out their contact details to a separate page, or make it more obvious. Perhaps they feel that listeners are likely to be viewing an episode page, or looking at a specific show metadata, so it’s better putting the contact details there.

I haven’t included YouTube, where many shows publish and discuss, in addition to a podcast feed.

I am also aware that some people exclusively, or perhaps primarily publish on YouTube (or other video platforms). Those aren’t podcasts IMNSHO.

Key to the tables below. Column names have been shortened because it’s a w i d e table. The numbers indicate how many podcasts use that communication platform.

  • EM - Email address (13/18)
  • MA - Mastodon account (9/18)
  • TW - Twitter account (8/18)
  • DS - Discord server (8/18)
  • TG - Telegram channel (4/18)
  • IR - IRC channel (5/18)
  • DW - Discourse website (2/18)
  • SK - Slack channel (3/18)
  • LI - LinkedIn (2/18)
  • WF - Web form (2/18)
  • SG - Signal group (3/18)
  • WA - WhatsApp (1/18)
  • FB - FaceBook (1/18)

Linux

Show EM MA TW DS TG IR DW SK MX LI WF SG WA FB
Linux Matters ✅ ✅ ✅ ✅ ✅ ✅
Ask The Hosts ✅ ✅ ✅ ✅ ✅
Destination Linux ✅ ✅ ✅ ✅ ✅
Linux Dev Time ✅ ✅ ✅ ✅ ✅
Linux After Dark ✅ ✅ ✅ ✅ ✅
Linux Unplugged ✅ ✅ ✅ ✅
This Week in Linux ✅ ✅ ✅ ✅ ✅
Ubuntu Security Podcast ✅ ✅ ✅ ✅ ✅
Linux OTC ✅ ✅ ✅

Open Source Adjunct

Show EM MA TW DS TG IR DW SK MX LI WF SG WA FB
2.5 Admins ✅ ✅
Bad Voltage ✅ ✅ ✅ ✅
Coffee and Open Source ✅
Dot Social ✅ ✅
Open Source Security ✅ ✅ ✅
localfirst.fm ✅

Other Tech

Show EM MA TW DS TG IR DW SK MX LI WF SG WA FB
ATP ✅ ✅ ✅ ✅
BBC Newscast ✅ ✅ ✅
The Rest is Entertainment ✅

Point

Not entirely sure what to do with this data. But there it is.

Is Linux Matters going to move away from Telegram to something else? No idea.

Alun JonesMessing with web spiders

Yesterday I read a Mastodon posting. Someone had noticed that their web site was getting huge amounts of traffic. When they looked into it, they discovered that it was OpenAI's - about 422 words

Alan PopeWindows 3.11 on QEMU 5.2.0

This is mostly an informational PSA for anyone struggling to get Windows 3.11 working in modern versions of QEMU. Yeah, I know, not exactly a massively viral target audience.

Anyway, short answer, use QEMU 5.2.0 from December 2020 to run Windows 3.11 from November 1993.

Windows 3.11, at 1280x1024, running Internet Explorer 5, looking at a GitHub issue

An innocent beginning

I made a harmless jokey reply to a toot from Thom at OSNews, lamenting the lack of native Mastodon client for Windows 3.11.

When I saw Thom’s toot, I couldn’t resist, and booted a Windows 3.11 VM that I’d installed six weeks ago, manually from floppy disk images of MSDOS and Windows.

I already had Lotus Organiser installed to post a little bit of nostalgia-farming on threads - it’s what they do over there.

Post by @popey

I thought it might be fun to post a jokey diary entry. I hurriedly made my silly post five minutes after Thom’s toot, expecting not to think about this again.

Incorrect, brain

I shut the VM down, then went to get coffee, chuckling to my smart, smug self about my successful nerdy rapid-response. While the kettle boiled, I started pondering - “Wait, if I really did want to make a Mastodon client for Windows 3.11, how would I do it?”

I pondered and dismissed numerous shortcuts, including, but not limited to:

  • Fake it with screenshots doctored in MS Paint
  • Run an existing DOS Mastodon Client in a Window
  • Use the Windows Telnet client to connect insecurely to my laptop running the Linux command-line Mastodon client, Toot
  • Set up a proxy through which I could get to a Mastodon web page

I pondered a different way, in which I’d build a very simple proof of concept native Windows client, and leverage the Mastodon API. I’m not proficient in (m)any programming languages, but felt something like Turbo Pascal was time-appropriate and roughly within my capabilities.

Diversion

My mind settled on Borland Delphi, which I’d never used, but looked similar enough for a silly project to Borland Turbo Pascal 7.0 for DOS, which I had. So I set about installing Borland Delphi 1.0 from fifteen (virtual) floppy disks, onto my Windows 3.11 “Workstation” VM.

Windows 3.11, with a Borland Delphi window open

Thank you, whoever added the change floppy0 option to the QEMU Monitor. That saved a lot of time, and reduced the process down to repeating this loop fourteen times:

"Please insert disk 2"
CTRL+ALT+2
(qemu) change floppy0 Disk02.img
CTRL+ALT+1
[ENTER]

During my research for this blog, I found a delightful, nearly decade-old video of David Intersimone (“David I”) running Borland Delphi 1 on Windows 3.11. David makes it all look so easy. Watch this to get a moving-pictures-with-sound idea of what I was looking at in my VM.

Once Delphi was installed, I started pondering the network design. But that thought wasn’t resident in my head for long, because it was immediately replaced with the reason why I didn’t use that Windows 3.11 VM much beyond the original base install.

The networking stack doesn’t work. Or at least, it didn’t.

That could be a problem.

Retro spelunking

I originally installed the VM by following this guide, which is notable as having additional flourishes like mouse, sound, and SVGA support, as well as TCP/IP networking. Unfortunately I couldn’t initially get the network stack working as Windows 3.11 would hang on a black screen after the familiar OS splash image.

Looking back to my silly joke, those 16-bit Windows-based Mastodon dreams quickly turned to dust when I realised I wouldn’t get far without an IP address in the VM.

Hopes raised

After some digging in the depths of retro forums, I stumbled on a four-year-old repo maintained by Jaap Joris Vens.

Here’s a fully configured Windows 3.11 machine with a working internet connection and a load of software, games, and of course Microsoft BOB 🤓

Jaap Joris published this ready-to-go Windows 3.11 hard disk image for QEMU, chock full of games, utilities, and drivers. I thought that perhaps their image was configured differently, and thus worked.

However, after downloading it, I got the same “black screen after splash” as with my image. Other retro enthusiasts had the same issue, and reported the details on this issue, about a year ago.

does not work, black screen.

It works for me and many others. Have you followed the instructions? At which point do you see the black screen?

The key to finding the solution was a comment from Jaap Joris pointing out that the disk image “hasn’t changed since it was first committed 3 years ago”, implying it must have worked back then, but doesn’t now.

Joy of Open Source

I figured that if the original uploader had at least some success when the image was created and uploaded, it was likely that QEMU, or some other component it uses, had broken (or been broken) in the meantime.

So I went rummaging in the source archives, looking for the most recent release of QEMU, immediately prior to the upload. QEMU 5.2.0 looked like a good candidate, dated 8th December 2020, a solid month before 18th January 2021 when the hda.img file was uploaded.

If you build it, they will run

It didn’t take long to compile QEMU 5.2.0 on my ThinkPad Z13 running Ubuntu 24.04.1. It went something like this. I presumed that getting the build dependencies for whatever is the current QEMU version in the Ubuntu repo today would get me most of the requirements.

$ sudo apt-get build-dep qemu
$ mkdir qemu
$ cd qemu
$ wget https://download.qemu.org/qemu-5.2.0.tar.xz
$ tar xvf qemu-5.2.0.tar.xz
$ cd qemu-5.2.0
$ ./configure
$ make -j$(nproc)

That was pretty much it. The build ran for a while, and out popped binaries and the other stuff you need to emulate an old OS. I copied the bits required directly to where I already had put Jaap Joris’ hda.img and start script.

$ cd build
$ cp qemu-system-i386 efi-rtl8139.rom efi-e1000.rom efi-ne2k_pci.rom kvmvapic.bin vgabios-cirrus.bin vgabios-stdvga.bin vgabios-vmware.bin bios-256k.bin ~/VMs/windows-3.1/

I then tweaked the start script to launch the local home-compiled qemu-system-i386 binary, rather than the one in the path, supplied by the distro:

$ cat start
#!/bin/bash
./qemu-system-i386 -nic user,ipv6=off,model=ne2k_pci -drive format=raw,file=hda.img -vga cirrus -device sb16 -display gtk,zoom-to-fit=on

This worked a treat. You can probably make out in the screenshot below, that I’m using Internet Explorer 5 to visit the GitHub issue which kinda renders when proxied via FrogFind by Action Retro.

Windows 3.11, at 1280x1024, running Internet Explorer 5, looking at a GitHub issue

Share…

I briefly toyed with the idea of building a deb of this version of QEMU for a few modern Ubuntu releases and throwing that in a Launchpad PPA, then realised I’d need to make sure the name doesn’t collide with the packaged QEMU in Ubuntu.

I honestly couldn’t be bothered to go through the pain of effectively renaming (forking) QEMU to something like OLDQEMU so as not to damage existing installs. I’m sure someone could do it if they tried, but I suspect it’s quite a lot of search and replace, or a case of moving the binaries somewhere under /opt. Too much effort for my brain.

I then started building a snap of qemu as oldqemu - which wouldn’t require any “real” forking or renaming. The snap could be called oldqemu but still contain qemu-system-i386 which wouldn’t clash with any existing binaries of the same name as they’d be self-contained inside the compressed snap, and would be launched as oldqemu.qemu-system-i386.

That would make for one package to maintain rather than one per release of Ubuntu. (Which is, as I am sure everyone is aware, one of the primary advantages of making snaps instead of debs in the first place.)

Anyway, I got stuck with another technical challenge in the time I allowed myself to make the oldqemu snap. I might re-visit it, especially as I could leverage the Launchpad Build farm to make multiple architecture builds for me to share.

…or not

In the meantime, the instructions are above, and also (roughly) in the comment I left on the issue, which has kindly been re-opened.

Now, about that Windows 3.11 Mastodon client…

Alan PopeVirtual Zane Lowe for Spotify

tl;dr

I bodged together a Python script using Spotipy (not a typo) to feed me #NewMusicDaily in a Spotify playlist.

No AI/ML, all automated, “fresh” tunes every day. Tunes that I enjoy get preserved in a Keepers playlist; those I don’t like get relegated to the Sleepers playlist.

Any tracks older than eleven days are deleted from the main playlist, so I automatically get a constant flow of new stuff.

My personal Zane Lowe in a box

Nutshell

  1. The script automatically populates this Virtual Zane Lowe playlist with semi-randomly selected songs that were released within the last week or so, no older (or newer).
  2. I listen (exclusively?) to that list for a month, signaling songs I like by hitting a button on Spotify.
  3. Every day, the script checks for ‘expired’ songs whose release date has passed by more than 11 days.
  4. The script moves songs I don’t like to the Sleepers playlist for archival (and later analysis), and to stop me hearing them.
  5. It moves songs I do like to the Keepers playlist, so I don’t lose them (and later analysis).
  6. Goto 1.

I can run the script at any time to “top up” the playlist or just let it run regularly to drip-feed me new music, a few tracks at a time.

Clearly, once I have stashed some favourites away in the Keepers pile, I can further investigate those artists, listen to their other tracks, and potentially discover more new music.

Below I explain at some length how and why.

NoCastAuGast

I spent an entire month without listening to a single podcast episode in August. I even unsubscribed from everything and deleted all the cached episodes.

Aside: Fun fact: The Apple Podcasts app really doesn’t like being empty and just keeps offering podcasts it knows I once listened to despite unsubscribing. Maybe I’ll get back into listening to these shows again, but music is on my mind for now.

While this is far from a staggering feat of human endeavour in the face of adversity, it was a challenge for me, given that I listened to podcasts all the time. This has been detailed in various issues of my personal email newsletter, which goes out on Fridays and is archived to read online or via RSS.

In August, instead, I re-listened to some audio books I previously enjoyed and re-listened to a lot of music already present on my existing Spotify playlists. This became a problem because I got bored with the playlists. Spotify has an algorithm that can feed me their idea of what I might want, but I decided to eschew their bot and make my own.

Note: I pay for Spotify Premium, then leveraged their API and built my “application” against that platform. I appreciate some people have Strong Opinions™️ about Spotify. I have no plans to stop using Spotify anytime soon. Feel free to use whatever music service you prefer, or self-host your 64-bit, 192 kHz Hi-Res Audio from HDTracks through an Elipson P1 Pre-Amp & DAC and Cary Audio Valve MonoBlok Power Amp in your listening room. I don’t care.

I’ll be here, listening on my Apple AirPods, or blowing the cones out of my car stereo. Anyway…

I spent the month listening to great (IMHO) music, predominantly released in the (distant) past on playlists I chronically mis-manage. On the other hand, my son is an expert playlist curator, a skill he didn’t inherit from me. I suspect he “gets the aux” while driving with friends, partly due to his Spotify playlist mastery.

As I’m not a playlist charmer, I inevitably got bored of the same old music during August, so I decided it was time for a change. During the month of September, my goal is to listen to as much new (to me) music as I can and eschew the crusty playlists of 1990s Brit-pop and late-70s disco.

How does one discover new music though?

Novel solutions

I wrote a Python script.

Hear me out. Back in the day, there was an excellent desktop music player for Linux called Banshee. One of the great features Banshee users loved was “Smart Playlists.” This gave users a lot of control over how a playlist was generated. There was no AI, no cloud, just simple signals from the way you listen to music that could feed into the playlist.

Watch a youthful Jorge Castro from 13 years ago do a quick demo.

Jorge Demonstrating the awesome power of Smart Playlists in Banshee (RIP in Peace)

Aside: Banshee was great, as were many other Mono applications like Tomboy and F-Spot. It’s a shame a bunch of blinkered, paranoid, noisy, and wrong Linux weirdos chased the developers away, effectively killing off those excellent applications. Good job, Linux community.

Hey ho. Moving on. Where was I…

Spotify clearly has some built-in, cloud-based “smarts” to create playlists, recommendations, and queues of songs that its engineers and algorithm think I might like. There’s a fly in the ointment, though, and her name is Alexa.

No, Alexa, NO!

We have a “Smart” speaker in the kitchen; the primary music consumers are not me. So “my” listening history is now somewhat tainted by all the Chase Atlantic & Central Cee my son listens to, and the Michael (fucking) Bublé my wife enjoys. She enjoys it so much that Bublé has featured on my end-of-year “Spotify Unwrapped” multiple times.

I’m sure he’s a delightful chap, but his stuff differs from my taste.

I had some ideas to work around all this nonsense. My goals here are two-fold.

  1. I want to find and enjoy some new music in my life, untainted by other house members.
  2. Feed the Spotify algorithm with new (to me) artists, genres and songs, so it can learn what else I may enjoy listening to.

Obviously, I also need to do something to muzzle the Amazon glossy screen of shopping recommendations and stupid questions.

The bonus side-quest is learning a bit more Python, which I completed. I spent a few hours one evening on this project. It was a fun and educational bit of hacking during time I might otherwise use for podcast listening. The result is four hundred or so lines of Python, including comments. My code, like my blog, tends to be a little verbose because I’m not an expert Python developer.

I’m pretty positive primarily professional programmers potentially produce petite Python.

Not me!

Noodling

My script uses the Spotify API via Spotipy to manage an initially empty, new, “dynamic” playlist. In a nutshell, here’s what the Python script does with the empty playlist over time (there’s a simplified sketch of this just after the list):

  • Use the Spotify search API to find tracks and albums released within the last eleven days to add to the playlist. I also imposed some simple criteria and filters.
    • Tracks must be accessible to me on a paid Spotify account in Great Britain.
    • The maximum number of tracks on the playlist is currently ninety-four, so there’s some variety, but not too much as to be unwieldy. Enough for me to skip some tracks I don’t like, but still have new things to listen to.
    • The maximum tracks per artist or album permitted on the playlist is three, again, for variety. Initially this was one, but I felt it’s hard to fully judge the appeal of an artist or album based on one song (not you, Black Lace), and I don’t want entire albums on the list. Three is a good middle-ground.
    • The maximum number of tracks to add per run is configurable and was initially set at twenty, but I’ll likely reduce that and run the script more frequently for drip-fed freshness.
  • If I use the “favourite” or “like” button on any track in the list before it gets reaped by the script after eleven days, the song gets added to a more permanent keepers playlist. This is so I can quickly build a collection of newer (to me) songs discovered via my script and curated by me with a single button-press.
  • Delete all tracks released more than eleven days ago if I haven’t favourited them. I chose eleven days to keep it modern (in theory) and fresh (foreshadowing). Technically, the script does this step first to make room for additional new songs.
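
For a flavour of what that looks like in code, here is a heavily simplified Spotipy sketch. The playlist ID and limits are placeholders, and the use of the tag:new album search filter is just one way to find recent releases; the real script is longer and fussier:

import datetime
import spotipy
from spotipy.oauth2 import SpotifyOAuth

PLAYLIST_ID = "your_playlist_id"   # placeholder
MAX_AGE_DAYS = 11
MAX_TO_ADD = 20

# SpotifyOAuth picks up SPOTIPY_CLIENT_ID/SECRET/REDIRECT_URI from the environment.
sp = spotipy.Spotify(auth_manager=SpotifyOAuth(scope="playlist-modify-public"))

# tag:new asks the search API for albums released in roughly the last two weeks.
results = sp.search(q="tag:new", type="album", market="GB", limit=MAX_TO_ADD)

cutoff = datetime.date.today() - datetime.timedelta(days=MAX_AGE_DAYS)
to_add = []
for album in results["albums"]["items"]:
    # Only compare full dates; release_date precision can also be year or month.
    if album["release_date_precision"] != "day":
        continue
    if datetime.date.fromisoformat(album["release_date"]) < cutoff:
        continue
    tracks = sp.album_tracks(album["id"], limit=3)["items"]  # max three per album
    to_add.extend(t["uri"] for t in tracks)

if to_add:
    sp.playlist_add_items(PLAYLIST_ID, to_add[:MAX_TO_ADD])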

None of this is set in stone, but it is configurable with variables at the start of the script. I’ll likely be fiddling with these through September until I get it “right,” whatever that means for me. Here’s a handy cut-out-and-keep block diagram in case that helps, but I suspect it won’t.

 +---------------------------------+
 |         Spotify (Cloud)         |
 |   +---------------------+       |
 |   |    Main Playlist    |       |
 |   +---------------------+       |
 |      |                 |        |
 |      | Like            | Dislike|
 |      v                 |        |
 |  +-------------------+ |        |
 |  |  Keeper Playlist  | |        |
 |  +-------------------+ |        |
 |                        |        |
 |                        v        |
 |  +---------------------+        |
 |  |  Sleeper Playlist   |        |
 |  +---------------------+        |
 +----------------+----------------+
                  ^
                  |
                  v
    +---------------------------+
    |       Python Script       |
    |  +---------------------+  |
    |  |  Calls Spotify API  |  |
    |  |  and Manages Songs  |  |
    |  +---------------------+  |
    +---------------------------+

Next track

The expectation is to run this script automatically every day, multiple times a day, or as often as I like, and end up with a frequently changing list of songs to listen to in one handy playlist. If I don’t like a song, I’ll skip it, and when I do like a song, I’ll likely play it more than once, and maybe click the “Like” icon.

My theory is that the list becomes a mix of between thirty and ninety artists who have released albums over the previous rolling week. After the first test search on Tuesday, the playlist contained 22 tracks, which isn’t enough. I scaled the maximum up over the next few days. It’s now at ninety-four. If I exhaust all the music and get bored of repeats, I can always up the limit to get a few new songs.

In fact, on the very first run of the script, the test playlist completely filled with songs from one artist who had just released a new album. That triggered the implementation of the three-songs-per-artist/album rule to reduce the chance of that happening.

I appreciate that listening to tracks out of sequence, rather than as a full album, is different from what the artist intended. But thankfully, I don’t listen to a lot of Adele, and the script no longer adds whole albums full of songs to the list. So, no longer a “me” problem.

No AI

I said at the top I’m not using any “AI/ML” in my script, and while that’s true, I don’t control what goes on inside the Spotify datacentre. The script is entirely subject to the whims of the Spotify API as to which tracks get returned to my requests. There are some constraints to the search API query complexity, and limits on what the API returns.

The Spotify API documentation has been excellent so far, as has the Spotipy docs.

Popular songs and artists often organically feature prominently in the API responses. Plus (I presume) artists and labels have financial incentives or an active marketing campaign with Spotify, further skewing search results. Amusingly, the API has an optional “hipster” tag to show the bottom 10% of results (ranked by popularity). I did that once, didn’t much like it, and won’t do it again.

It’s also subject to the music industry publishing music regularly, and licensing it to be streamed via Spotify where I live.

Not quite

With the script as-is, initially, I did not get fresh new tunes every single day as expected, so I had a further fettle to increase my exposure to new songs beyond what’s popular, trending, or tagged “new”. I changed the script to scan the last year of my listening habits to find genres of music I (and the rest of the family) have listened to a lot.

I trimmed this list down (to remove the genre taint) and then fed these genres to the script. It then randomly picks a selection of those genres and queries the API for new releases in those categories.

With these tweaks, I certainly think this script and the resulting playlist are worth listening to. It’s fresher and more dynamic than the 14-year-old playlist I currently listen to. Overall, the script works so that I now see songs and artists I’ve not listened to—or even heard of—before. Mission (somewhat) accomplished.

Indeed, with the genres feature enabled, I could add a considerable amount of new music to the list, but I am trying to keep it a manageable size, under a hundred tracks. Thankfully, I don’t need to worry about the script pulling “Death Metal,” “Rainy Day,” and “Disney” categories out of thin air because I can control which ones get chosen. Thus, I can coerce the selection while allowing plenty of randomness and newness.

I have limited the number of genre-specific songs so I don’t get overloaded with one music category over others.

Not new

There are a couple of wrinkles. One song that popped into the playlist this week is “Never Going Back Again” by Fleetwood Mac, recorded live at The Forum, Inglewood, in 1982. That’s older than the majority of what I listened to in all of August! It looks like Warner Records Inc. released that live album on 21st August 2024, well within my eleven-day boundary, so it’s technically within “The Rules” while also not being fresh, new music.

There’s also the compilation complication. Unfresh songs from the past re-released on “TOP HITS 2024” or “DANCE 2024 100 Hot Tracks” also appeared in my search criteria. For example, “Talk Talk” by Charli XCX, from her “Brat” album, released in June, is on the “DANCE 2024 100 Hot Tracks” compilation, released on 23rd August 2024, again, well within my eleven-day boundary.

I’m in two minds about these time-travelling playlist interlopers. I have never knowingly listened to Charli XCX’s “Brat” album by choice, nor have I heard live versions of Fleetwood Mac’s music. I enjoy their work, but it goes against the “new music” goal. But it is new to me which is the whole point of this exercise.

The further problem with compilations is that they contain music by a variety of artists, so they don’t hit the “max-per-artist” limit but will hit the “max-per-album” rule. However, if the script finds multiple newly released compilations in one run, I might end up with a clutch of random songs spread over numerous “Various Artists” albums, maxing out the playlist with literal “filler.”

I initially allowed compilations, but I’m irrationally bothered that one day, the script will add “The Birdie Song” by Black Lace as part of “DEUTSCHE TOP DISCO 3000 POP GEBURTSTAG PARTY TANZ SONGS ZWANZIG VIERUNDZWANZIG”.

Nein.

I added a filter to omit any “album type: compilation,” which knocks that bopping-bird-based botherer squarely on the bonce.

No more retro Europop compilation complications in my playlist. Alles klar.

Not yet

Something else I had yet to consider is that some albums have release dates in the future. Like a fresh-faced newborn baby with an IDE and API documentation, I assumed that albums published would generally have release dates of today or older. There may be a typo in the release_date field, or maybe stuff gets uploaded and made public ahead of time in preparation for a big marketing push on release_date.

I clearly do not understand the music industry or publishing process, which is fine.

Nuke it from orbit

I’ve been testing the script while I prototyped it, this week, leading up to the “Grand Launch” in September 2024 (next month/week). At the end of August I will wipe the slate (playlist) clean, and start again on 1st September with whatever rules and optimisations I’ve concocted this week. It will almost certainly re-add some of the same tracks after the 31st August “Grand Purge”, but that’s expected, and working as designed. The rest will be pseudo-random genre-specific tracks.

I hope.

Newsletter

I will let this thing go mad each day with the playlist and regroup at the end of September to evaluate how this scheme is going. Expect a follow-up blog post detailing whether this was a fun and interesting excursion or pure folly. Along the way, I did learn a bit more about Python, the Spotify API, and some other interesting stuff about music databases and JSON.

So it’s all good stuff, whether I enjoy the music or not.

You can get further, more timely updates in my weekly email newsletter, or view it in the newsletter archive, and via RSS, a little later.

Ken said he got “joy out of reading your newsletter”. YMMV. E&OE. HTH. HAND.

Nomenclature

Every good project needs a name. I initially called it my “Personal Dynamic Playlist of Sixty tracks over Eleven days,” or PDP-11/60 for short, because I’m a colossal nerd. Since bumping the max-tracks limit for the playlist, it could be re-branded PDP-11/94. However, this is a relatively niche and restrictive playlist naming system, so I sought other ideas.

My good friend Martin coined the term “Virtual Zane Lowe” (Zane is a DJ from New Zealand who is apparently renowned for sharing new music). That’s good enough for me. Below are links to all three playlists if you’d like to listen, laugh, live, love, or just look at them.

The “Keepers” and “Sleepers” lists will likely be relatively empty for a few days until the script migrates my preferred and disliked tracks over for safe-keeping & archival, respectively.

November approaches

Come back at the end of the month to see if my script still works, if the selections are good, if I’m still listening to this playlist, and, most importantly, whether I enjoy doing so!

If it works, I’ll probably continue using it through October and into November as I commute to and from the office. If that happens, I’ll need to update the playlist artwork. Thankfully, there’s an API for that, too!

I may consider tidying up the script and sharing it online somewhere. It feels a bit niche and requires a paid Spotify account to even function, so I’m not sure what value others would get from it other than a hearty chuckle at my terribad Python “skills.”

One potentially interesting option would be to map the songs in Spotify to another service, such as Apple Music, or even videos on YouTube. The YouTube API should enable me to manage video playlists that mirror the ones I manage directly on Spotify. That could be a fun further extension to this project.

Another option I considered was converting it to a web app, a service I (and other select individuals) can configure and manage in a browser. I’ll look into that at the end of the month. If the current iteration of the script turns out to be a complete bust, then this idea likely won’t go far, either.

Thanks for reading. AirPods in. Click “Shuffle”.

Ross YoungerBroadcast graphics for fencing

I created a TV graphics package for fencing tournaments.

Earlier this year, Christchurch played host to the Commonwealth Junior & Cadet fencing tournament.

Selected parts of the tournament were livestreamed, with a package broadcast on Sky TV (NZ). The broadcast and finals streams had a live graphics package fed from the scoreboard.

The programmes were produced using a broadcast-spec OB truck supplied by Kiwi Outside Broadcast. The truck graphics PC used Captivate to generate its graphics, which were output as key+fill SDI signals. These were fed to the vision mixer and keyed onto the picture in the usual way.

The package can be seen in action on the Commonwealth Junior & Cadet 2024 programmes.

The details read from the scoreboard cover the “hit” lamps, scores, clock (including fractional seconds in the last 10s), period, red/yellow cards, and the priority indicator. On top of that, the package provides a place to enter the fencer names and nationalities, and set colours for them.

This is all made possible by the scoreboard, a Favero FA-07, offering a data feed over an RS-422 interface. I wrote a Python script to parse the data feed, turn it into a JSON dictionary and pass it on to Captivate.
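
For a rough idea of the shape of that script, here is a minimal sketch. The serial settings and the frame layout below are placeholders for illustration, not the real Favero FA-07 protocol:

import json
import serial  # pyserial

PORT = "/dev/ttyUSB0"  # placeholder device name for the RS-422 adapter

with serial.Serial(PORT, baudrate=2400, timeout=1) as link:
    while True:
        frame = link.read(10)          # hypothetical fixed-length frame
        if len(frame) < 10:
            continue
        state = {                      # byte positions are placeholders
            "score_left": frame[1],
            "score_right": frame[2],
            "clock": "%02d:%02d" % (frame[3], frame[4]),
            "lamps": frame[5],
        }
        print(json.dumps(state))       # handed on to Captivate in the real setup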

Alan PopeText Editors with decent Grammar Tools

This is another blog post lifted wholesale out of my weekly newsletter. I do this when I get a bit verbose to keep the newsletter brief. The newsletter is becoming a blog incubator, which I’m okay with.

A reminder about that newsletter

The newsletter is emailed every Friday - subscribe here, and is archived and available via RSS a few days later.

I talked a bit about the process of setting up the newsletter on episode 34 of Linux Matters Podcast. Have a listen if you’re interested.

Linux Matters 34

Patreon supporters of Linux Matters can get the show a day or so early and without adverts.

Multiple kind offers

Good news, everyone! I now have a crack team of fine volunteers who proofread the text that lands in your inbox/browser cache/RSS reader. Crucially, they’re doing that review before I send the mail, not after, as was previously the case. Thank you for volunteering, mystery proofreaders.

popey dreamland

Until now, my newsletter “workflow” (such as it was) involved hoping that I’d get it done and dusted by Friday morning. Then, ideally, it would spend some time “in review”, followed by saving to disk. But if necessary, it would be ready to be opened in an emergency text editor at a moment’s notice before emails were automatically sent by lunchtime.

I clearly don’t know me very well.

popey reality

What actually happened is that I would continue editing right up until the moment I sent it out, then bash through the various “post-processing” steps and schedule the emails for “5 minutes from now.” Boom! Done.

This often resulted in typos or other blemishes in my less-than-lovingly crafted emails to fabulous people. A few friends would ping me with corrections. But once the emails are sent, reaching out and fixing those silly mistakes is problematic.

Someone should investigate over-the-air updates to your email. Although zero-day patches and DLC for your inbox sound horrendous. Forget that.

In theory, I could tweak the archived version, but that is not straightforward.

Tool refresh?

Aside: Yes, I know it’s not the tools, but I should slow down, be more methodical and review every change to my document before publishing. I agree. Now, let’s move on.

While preparing the newsletter, I would initially write in Sublime Text (my desktop text editor of choice), with a Grammarly† (affiliate link) LSP extension, to catch my numerous blunders, and re-word my clumsy English.

Unfortunately, the Grammarly extension for Sublime broke a while ago, so I no longer have that available while I prepare the newsletter.

I could use Google Docs, I suppose, where Grammarly still works, augmenting the built-in spell and grammar checker. But I really like typing directly as Markdown in a lightweight editor, not a big fat browser. So I guess I need to figure something else out to check my spelling and grammar before the awesome review team gets it, to save at least some of my blushes.

I’m not looking for suggestions for a different text editor—or am I? Maybe I am. I might be.

Sure, that’ll fix it.

ZX81 -> Spectrum -> CPC -> edlin -> Edit -> Notepad -> TextPad -> Sublime -> ?

I’ve used a variety of text editors over the years. Yes, the ZX81 and Sinclair Spectrum count as text editors. Yes, I am old.

I love Sublime’s minimalism, speed, and flexibility. I use it for all my daily work notes, personal scribblings, blog posts, and (shock) even authoring (some) code.

I also value Sublime’s data-recovery features. If the editor is “accidentally” terminated or a power-loss event occurs, Sublime reliably recovers its state, retaining whatever you were previously editing.

I regularly use Windows, Linux, and macOS on any given day across multiple computers. So, a cross-platform editor is also essential for me, but only on the laptop/desktop, as I never edit on mobile‡ devices.

I typically just open a folder as a “workspace” in a window or an additional tab in one window. I frequently open many folders, each full of files across multiple displays and machines.

All my notes are saved in folders that use Syncthing to keep in sync across machines. I leave all of those notes open for days, perhaps weeks, so having a robust sync tool combined with an editor that auto-reloads when files change is key.

The notes are separately backed up, so cloud storage isn’t essential for my use case.

Something else?

Whatever else I pick, it’s really got to fit that model and requirements, or it’ll be quite a stretch for me to change. One option I considered and test-drove is NotepadNext. It’s an open-source re-implementation of Notepad++, written in C++ and Qt.

A while back, I packaged up and published it as a snap, to make it easy to install and update. It fits many of the above requirements already, with the bonus of being open-source, but sadly, there is no Grammarly support there either.

I’d prefer no :::: W I D E - L O A D :::: electron monsters. Also, not Notion or Obsidian, as I’ve already tried them, and I’m not a fan. In addition, no, not Vim or Emacs.

Bonus points if you have a suggestion where one of the selling points isn’t “AI”§.

Perhaps there isn’t a great plain text editor that fulfills all my requirements. I’m open to hearing suggestions from readers of this blog or the newsletter. My contact details are here somewhere.


† - Please direct missives about how terrible Grammarly is to /dev/null. Thanks. Further, suggestions that I shouldn’t rely on Grammarly or other tools and should just “Git Gud” (as the youths say) may be inserted into the A1481 on the floor.

‡ - I know a laptop is technically a “mobile” device.

§ - Yes, I know that “Not wanting AI” and “Wanting a tool like Grammarly” are possibly conflicting requirements.

◇ - For this blog post I copy and pasted the entire markdown source into a Google doc, used all the spelling and grammar tools, then pasted it back into Sublime, pushed to git, and I’m done. Maybe that’s all I need to do? Keep my favourite editor, and do all the grammar in one chunk at the end in a tab of a browser I already had open anyway. Beat that!

Andy SmithDaniel Kitson – Collaborator (work in progress)

Collaborators

Last night we went to see Daniel Kitson's "Collaborator" (work in progress). I'd no idea what to expect but it was really good!

A photo of the central area of a small theatre in the round. There are four tiers of seating and then an upper balcony. Most seats are filled. The central stage area is empty except for four large stacks of paper.
The in-the-round setup of Collaborator at The Albany Theatre, Deptford, London

It has been reviewed at 4/5 stars in Chortle and positively in the Guardian, but I don't recommend reading any reviews because they'll spoil what you will experience. We went into it blind, as I always prefer that to doing thorough research of a show. I think that was the correct decision. I've been on Daniel's fan newsletter for ages but hadn't had a chance to see him live until now.

While I've seen some comedy gigs that resembled this, I've never seen anything quite like it.

At £12 a ticket this is an absolute bargain. We spent more getting there by public transport!

Shout out to the nerds

If you're a casual comedy enjoyer looking for something a bit different then that's all you need to know. If like me however you consider yourself a bit of a wanky appreciator of comedy as an art form, I have some additional thoughts!

Collaborator wasn't rolling-on-the-floor-in-tears funny, but was extremely enjoyable and Jenny and I spent the whole way home debating how Kitson designed it and what parts of it really meant. Not everyone wants that in comedy, and that's fine. I don't always want it either. But to get it sometimes is a rare treat.

It's okay to enjoy a McIntyre or Peter Kay crowd-pleaser about "do you have a kitchen drawer full of junk?" or "do you remember white dog poo?" but it's also okay to appreciate something that's very meta and deconstructive. Stewart Lee for example is often accused of being smug and arrogant when he critiques the work of other comedians, and his fans to some extent are also accused of enjoying feeling superior more than they enjoy a laugh - and some of them who miss the point definitely are like this.

But acts like Kitson and Lee are constructed personalities where what they claim to think and how they behave is a fundamental part of the performance. You are to some extent supposed to disagree with and be challenged by their views and behaviours — and I don't just mean they are edgelording with some "saying the things that can't be said" schtick. Sometimes it's fun to actually have thoughts about it. It's a different but no less valid (or more valid!) experience. A welcome one in this case!

I mean, I might have to judge you if you enjoy Mrs Brown's Boys, but I accept it has an audience as an art form.

White space

There was a comment on Fedi about how the crowd pictured here appears to be a sea of white faces, despite London being a fairly diverse city. This sort of thing hasn't escaped me. I've found it to be the case in most of the comedy gigs I've attended in person, where the performer is white. I don't know why. In fact, acts like Stewart Lee and Richard Herring will frequently make reference to the fact that their stereotypical audience member is a middle aged white male computer toucher with lefty London sensibilities. So, me then.

Don't get me wrong, I do try to see some diverse acts and have been in a demographic minority a few times. Sadly enough, just going to see a female act can be enough to put you in an audience of mostly women. That happened when we went to see Bridget Christie's Who Am I? ("a menopause laugh a minute with a confused, furious, sweaty woman who is annoyed by everything", 4 stars, Chortle), and it's a shame that people seem to stick in their lanes so much.


Josh HollandEven more on git scratch branches: using Jujutsu

Even more on git scratch branches: using Jujutsu

This is the third post in an impromptu series:

  1. Use a scratch branch in git
  2. More on git scratch branches: using stgit

It seems the main topic of this blog is now git scratch branches and ways to manage them, although the main prompt for this one is discovering someone else had exactly the same idea, as I found from a blog post extolling Jujutsu.

I don’t have much to add to the posts from qword and Sandy, beyond the fact that Jujutsu really is the perfect tool to make this workflow straightforward. The default change selection logic in jj rebase means that 9 times out of 10 it’s enough just to run jj rebase -d master to get everything up to date with the master branch, and the Jujutsu workflow as a whole really is a great experience.

So go forth, use Jujutsu to manage your dev branch, and hopefully I’ll never have to write another post on this, and you can have the traditional “I rewrote my blogging engine from scratch again” post that I’ve been owing for a month or two now.

Ross YoungerFault-finding at the ends of the earth

This is a tale from many months ago, working on an embedded ARM target.

In my private journal I wrote:

Today I feel like I saddled up and rode my horse to the literal ends of the earth. I was fault-finding in the setting-up-of-the-universe that happens before your program starts up, and in the tearing-it-down-again that happens after you declare you’re finished.

If you know C++, you might guess that this was a story about static object allocation and deallocation. You’d be right. So, destructors belonging to static-allocated objects. You’d never think they’d run on a bare-metal embedded target.

Well, they can. If your target supports exit() - e.g. if you are running with newlib - then an atexit handler is set up for you, and that will be set up to run the static destructors. If your program then calls exit() (as, say, your on-silicon unit tests might, at the end of a test run) then things are at risk of turning to custard.

You might have enabled an interrupt for some peripheral on the silicon. In order to do anything really useful, the ISR might reference a static object. If you do this, you’d damn well better make sure the object has a static destructor that disables the interrupt, or hilarity is one day going to ensue. You know, the sort of hilarity that involves being savaged by a horde of angry rampaging badgers, or your socks catching fire.

But wait, I hear you say, it called exit! The program no longer exists! Well, sure it doesn’t; but what happens on exit? On this particular ARM target, running tests via a debugger as part of a CI chain, when the atexit handlers have run the process signals final completion with a semihosting call, which is a special flavour of debug breakpoint. It is… not fast. If your interrupt happens regularly, the goblins are going to get you before the pseudo-system-call completes. Your test framework will fail the test executable for hanging, despite somehow passing all of its tests.

There was an actual bug in there, and it was mine. Class X, which contained an RTOS queue and enabled an interrupt, only had a default destructor. On exit, somewhen between static destructors and completion of the semihosting exit call, the ISR fired. It duly failed to insert an item into the now-destroyed queue, so jumped to the internal panic routine. That routine contained a breakpoint and then went nowhere fast, waiting for a debugger command that was never going to arrive — hence the time-out. Maybe it would have been useful to have a library option to skip the static destructors, but I probably wouldn’t have been aware of it ahead of time anyway.

The static destructor ordering fiasco can also be yours for the taking, but thankfully that hadn’t bitten me. Nevertheless, it was a rough day.

Cover image: Cyber Bug Search, by juicy_fish on Freepik

Chris WallaceNon-Eulerian paths

I’ve been doing a bit of work on Non-Eulerian paths.  I haven't made any algorithmic progress...

Chris WallaceMore Turtle shapes

I’ve come across some classic curves using arcs which can be described in my variant of Turtle...

Phil SpencerWho’s sick of this shit yet?

I find some headlines just make me angry these days, especially ones centered around hyper late stage capitalism.


This one about Apple and Microsoft just made me go “Who the fuck cares?” and seriously, why should I care? Those two idiot companies having insane and disgustingly huge market caps isn’t something I’m impressed by.

If anything it makes me furious.

Do something useful besides making iterations of the same ol junk. Make a few thousand houses, make an affordable grocery supply chain.

If you’re doing anything else you’re a waste of everyone’s time… as I type this on my Apple computer. Still, that bit of honesty aside, I don’t give a fuck about either company’s made-up valuation.

Ross YoungerThe Road Less Travelled

I’ve started an occasional YouTube series about quirky, off-the-beaten-track places.

At the time of writing there are 5 episodes online. I’m still experimenting with style and techniques. My inspiration is a combination of The Tim Traveller and Tom Scott.

So far it consists of a few random places around South Island. There’s no particular timescale for publishing new episodes; I have many demands on my time, and this project isn’t yielding an income at the moment. (It may never; I’m doing it for fun.) Presenting is taking some getting used to!

Here’s the playlist:

Phil SpencerNew year new…..This

I have made a new year’s goal to retire this server before March. The OS has been upgraded many, many times over the years and various software I’ve used has come and gone, so there is lots of cruft. This server/VM started in San Francisco, and then my provider stopped offering VMs in CA and moved my VM to the UK, which is where it has been ever since. This VM started its life in Jan 2008 and it is time for it to die.

During my 2 week xmas break I have been updating web facing software as much as I could, so that when I do put the bullet in the head of this thing I can transfer my blog, wiki, and a couple of other still-active sites to the new OS with minimal tweaking in the new home.

So far the biggest issues I ran into were with my MediaWiki. That entire site is very old, from around 2006, two years before I started hosting it for someone, and then I inherited it entirely around 2009, so the database is very finicky to upgrade and some of the extensions are no longer maintained. What I ended up doing was setting up a Docker instance at home to test upgrading and work through the kinks, and I have put together a solid step-by-step on how to move/upgrade it to the latest version.

I have also gotten sick of running my own e-mail servers; the spam management, certificates, block lists… it’s annoying. I found out recently that iCloud, which I already have a subscription to, allows up to 5 custom e-mail domains, so I retired my Philtopia e-mail to it early in December, and as of today I moved the vo-wiki domain to it as well. Much less hassle for me; I already work enough at work, I don’t need to work at home as well.

The other work continues, site by site but I think I am on track to put an end to this ol server early in the year.

Phil Spencer8bit party

It’s been a few years… four? since my Commodore 64 collection started, and I’ve now got 2 working C64s and a C128 that functions, along with 2 disk drives, a tape drive and a collection of add-on hardware and boxed games.

That isn’t all I am collecting, however; I also have my Nintendo Entertainment System, and even more recently I acquired a Sega Master System. The 8bit era really seems to catch my eye far more than anything that came after. I suppose it’s because the whole era made it on hacks and luck.

In any case, here are some pictures of my collection. I don’t collect for the sake of collecting; everything I have I use or play, because otherwise why bother having it?

Enjoy

My desk
NES
Sega Master System
Commodore 64
Games

Phil SpencerI think it’s time the blog came back

It’s been a while since I’ve written a blog post, almost 4 years in fact but I think it is time for a comeback.

The reason for this being that social media has become so locked down you can’t actually give a valid opinion about something without someone flagging your comment or it being caught by a robot. Oddly enough, it seems the right wing folks can say whatever they want against the immigrant villain of the month or LGBTQIA+ issues without being flagged, but if you dare stand up to them or offer an opposing opinion: 30 day ban!

So it is time to dust off the ol’ blog and put my opinions to paper somewhere else, just like the olden days before social media! It isn’t all bad of course; I’ve found Mastodon quite open to opinions, but the fediverse is getting a lot of corporate attention these days and I’m sure it’s only a year or two before even that ends up a complete mess.

Crack open the blogs and let those opinions fly

Paul RaynerPrint (only) my public IP

Every now and then, I need to know my public IP. The easiest way to find it is to visit one of the sites which will display it for you, such as https://whatismyip.com. Whilst useful, all of the ones I know (including that one) are chock full of adverts, and can’t easily be scraped as part of any automated scripting.

This has been a minor irritation for years, so the other night I decided to fix it.

http://ip.pr0.uk is my answer. It’s 50 lines of Rust, and is accessible via TCP on port 11111, and via HTTP on port 8080.

use std::io::Write;

use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr, TcpListener, TcpStream};
use chrono::Utc;
use threadpool::ThreadPool;

fn main() {
    let worker_count = 4;
    let pool = ThreadPool::new(worker_count);
    let tcp_port = 11111;
    let socket_v4_tcp = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), tcp_port);

    let http_port = 8080;
    let socket_v4_http = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), http_port);

    let socket_addrs = vec![socket_v4_tcp, socket_v4_http];
    let listener = TcpListener::bind(&socket_addrs[..]);
    if let Ok(listener) = listener {
        println!("Listening on {}:{}", listener.local_addr().unwrap().ip(), listener.local_addr().unwrap().port());
        for stream in listener.incoming() {
            let stream = stream.unwrap();
            let addr =stream.peer_addr().unwrap().ip().to_string();
            if stream.local_addr().unwrap_or(socket_v4_http).port() == tcp_port {
                pool.execute(move||send_tcp_response(stream, addr));
            } else {
                //http might be proxied via https so let anything which is not the tcp port be http
                pool.execute(move||send_http_response(stream, addr));
            }
        }
    } else {
        println!("Unable to bind to port")
    }
}

fn send_tcp_response(mut stream:TcpStream, addr:String) {
    stream.write_all(addr.as_bytes()).unwrap();
}

fn send_http_response(mut stream:TcpStream, addr:String) {

    let html = format!("<html><head><title>{}</title></head><body><h1>{}</h1></body></html>", addr, addr);
    let length = html.len();
    let response = format!("HTTP/1.1 200 OK\r\nContent-Length: {length}\r\n\r\n{html}" );
    stream.write_all(response.as_bytes()).unwrap();
    println!("{}\tHTTP\t{}",Utc::now().to_rfc2822(),addr)
}

A little explanation is needed on the array of SocketAddr. This came from an initial misreading of the docs, but I liked the result and decided to keep it that way. The call to bind() will only listen on one port - the first one in the array which it can bind to. The result is that when you run this program, it listens on port 11111. If you keep it running and start another copy, that one listens on port 8080 (because it can’t bind to port 11111). So to run this on my server, I just have systemd keep 2 copies alive at any time.
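
As a usage example, grabbing your address from the TCP service only takes a few lines; here is a minimal Python sketch using the hostname and port mentioned above:

import socket

# Connect to the TCP service on port 11111 and read back our public IP.
with socket.create_connection(("ip.pr0.uk", 11111), timeout=5) as s:
    print(s.recv(64).decode())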

The code and binaries for Linux and Windows are available on Github.

Next steps

I might well leave it there. It works for me, so it’s done. Here are some things I could do though:

1) Don’t hard code the ports
2) Proxy https
3) Make a client
4) Make it available as a binary for anyone to run on crates.io
5) Optionally print the TTL. This would be mostly useful to people running their own instance.

Boring Details

Logging

I log the IP, port, and time of each connection. This is just in case it ever gets flooded and I need to block an IP/range. The code you see above is the code I run. No browser detection, user agent or anything like that is read or logged. Any data you send with the connection is discarded. If I proxied https via nginx, that might log a bit of extra data as a side effect.

Systemd setup

There’s not much to this either. I have a template file:

[Unit]
Description=Run the whatip binary. Instance %i
After=network.target

[Service]
ExecStart=/path/to/whatip
Restart=on-failure

StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=whatip%i

[Install]
WantedBy=multi-user.target

stored at /etc/systemd/system/whatip@.service and then set up two instances to run:

systemctl enable whatip@1
systemctl enable whatip@2

Thanks for reading

David Leadbeater"[31m"?! ANSI Terminal security in 2023 and finding 10 CVEs

A paper detailing how unescaped output to a terminal could result in unexpected code execution, on many terminal emulators. This research found around 10 CVEs in terminal emulators across common client platforms.

Alun JonesMessing with web spiders

You've surely heard of ChatGPT and its ilk. These are massive neural networks trained using vast swathes of text. The idea is that if you've trained a network on enough - about 467 words

Alun JonesI wrote a static site generator

Back in 2019, when Google+ was shut down, I decided to start writing this blog. It seemed better to take my ramblings under my own control, rather than posting content - about 716 words

Alex HudsonJobs in the AI Future

Everyone is talking about what AI can do right now, and the impact that it is likely to have on us. This weekends’s Semafor Flagship (which is an excellent newsletter; I recommend subscribing!) asks a great question: “What do we teach the AI generation?”. As someone who grew up with computers, knowing he wanted to write software, and knowing that tech was a growth area, I never had to grapple with this type of worry personally. But I do have kids now. And I do worry. I’m genuinely unsure what I would recommend a teenager to do today, right now. But here’s my current thinking.

Paul RudkinYour new post

Your new post

This is a new blog post. You can author it in Markdown, which is awesome.

David LeadbeaterNAT-Again: IRC NAT helper flaws

A Linux kernel bug allows unencrypted NAT'd IRC sessions to be abused to access resources behind NAT, or drop connections. Switch to TLS right now. Or read on.

Paul RaynerPutting dc in (chroot) jail

A little over 4 years ago, I set up a VM and configured it to offer dc over a network connection using xinetd. I set it up at http://dc.pr0.uk and made it available via a socket connection on port 1312.

Yesterday morning I woke to read a nice email from Sylvan Butler pointing out that users could run shell commands from dc…

I had set up the dc command to run as a user “dc”, but still, if someone could run a shell command they could, for example, put a key in the dc user’s .ssh config, run sendmail (if it was set up), try for privilege escalation to get root, etc.

I’m not sure what the 2017 version of me was thinking (or wasn’t), but the 2022 version of me is not happy to leave it like this. So here’s how I put dc in jail.

Firstly, how do you run shell commands from dc? It’s very easy. Just prefix with a bang:

$ dc
!echo "I was here" > /tmp/foo
!cat /tmp/foo
I was here

So, really easy. Even if it was hard, it would still be bad.

This needed to be fixed. Firstly I thought about what else was on the VM - nothing that matters. This is a good thing because the helpful Sylvan might not have been the first person to spot the issue (although network dc is pretty niche). I still don’t want this vulnerability though as someone else getting access to this box could still use it to send spam, host malware or anything else they wanted to do to a cheap tiny vm.

I looked at restricting the dc user further (it had no login shell, and no home directory already), but it felt like I would always be missing something, so I turned to chroot jails.

A chroot jail lets you run a command with a specified directory used as / for that command. The command (in theory) can’t escape that directory, so it can’t see or touch anything outside it. Chroot is a kernel feature and a basic security building block of Linux, so it should be good enough to protect network dc if set up correctly, even if it’s not perfect.
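
As a minimal illustration of the mechanism (using a hypothetical /srv/jail directory; the real jail is built below), chroot just takes the new root followed by the command to run inside it:

# /srv/jail becomes / for the command; anything the command needs
# (binaries, libraries) must already exist inside /srv/jail
sudo chroot /srv/jail /bin/sh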

Firstly, let’s set up the directory for the jail. We need the programs to run inside the jail, and their dependent libraries. The script to run a networked dc instance looks like this:

#!/bin/bash
dc --version
sed -u -e 's/\r/\n/g' | dc

Firstly, I’ve used bash here, but this script is trivial, so it can use sh instead. We also need to keep the sed (I’m sure there are plenty of ways to do the replace without sed, but it’s working fine as it is). For each of the three programs needed to run the script, I ran ldd to get their dependencies:

$ ldd /usr/bin/dc
	linux-vdso.so.1 =>  (0x00007fffc85d1000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc816f8d000)
	/lib64/ld-linux-x86-64.so.2 (0x0000555cd93c8000)
$ ldd /bin/sh
	linux-vdso.so.1 =>  (0x00007ffdd80e0000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa3c4855000)
	/lib64/ld-linux-x86-64.so.2 (0x0000556443a1e000)
$ ldd /bin/sed
	linux-vdso.so.1 =>  (0x00007ffd7d38e000)
	libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007faf5337f000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faf52fb8000)
	libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007faf52d45000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faf52b41000)
	/lib64/ld-linux-x86-64.so.2 (0x0000562e5eabc000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faf52923000)
$

So we copy those files into the same directory structure inside the jail directory (a sketch of the copy commands follows the listing). Afterwards it looks like this:

$ ls -alR
.:
total 292
drwxr-xr-x 4 root root   4096 Feb  5 10:13 .
drwxr-xr-x 4 root root   4096 Feb  5 09:42 ..
-rwxr-xr-x 1 root root  47200 Feb  5 09:50 dc
-rwxr-xr-x 1 root root     72 Feb  5 10:13 dctelnet
drwxr-xr-x 3 root root   4096 Feb  5 09:49 lib
drwxr-xr-x 2 root root   4096 Feb  5 09:50 lib64
-rwxr-xr-x 1 root root  72504 Feb  5 09:58 sed
-rwxr-xr-x 1 root root 154072 Feb  5 10:06 sh

./lib:
total 12
drwxr-xr-x 3 root root 4096 Feb  5 09:49 .
drwxr-xr-x 4 root root 4096 Feb  5 10:13 ..
drwxr-xr-x 2 root root 4096 Feb  5 10:01 x86_64-linux-gnu

./lib/x86_64-linux-gnu:
total 2584
drwxr-xr-x 2 root root    4096 Feb  5 10:01 .
drwxr-xr-x 3 root root    4096 Feb  5 09:49 ..
-rwxr-xr-x 1 root root 1856752 Feb  5 09:49 libc.so.6
-rw-r--r-- 1 root root   14608 Feb  5 10:00 libdl.so.2
-rw-r--r-- 1 root root  468920 Feb  5 10:00 libpcre.so.3
-rwxr-xr-x 1 root root  142400 Feb  5 10:01 libpthread.so.0
-rw-r--r-- 1 root root  146672 Feb  5 09:59 libselinux.so.1

./lib64:
total 168
drwxr-xr-x 2 root root   4096 Feb  5 09:50 .
drwxr-xr-x 4 root root   4096 Feb  5 10:13 ..
-rwxr-xr-x 1 root root 162608 Feb  5 10:01 ld-linux-x86-64.so.2
$
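
For completeness, a sketch of the copy commands that produce the layout above (paths taken from the ldd output, using /home/dc/ as the jail directory, as in the xinetd config further down; linux-vdso is provided by the kernel, so there is nothing to copy for it):

mkdir -p /home/dc/lib/x86_64-linux-gnu /home/dc/lib64
cp /usr/bin/dc /bin/sh /bin/sed /home/dc/
cp /lib/x86_64-linux-gnu/{libc.so.6,libselinux.so.1,libpcre.so.3,libdl.so.2,libpthread.so.0} \
   /home/dc/lib/x86_64-linux-gnu/
cp /lib64/ld-linux-x86-64.so.2 /home/dc/lib64/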

and here is the modified dctelnet command:

#!/sh
#dc | dos2unix 2>&1
./dc --version
./sed -u -e 's/\r/\n/g' | ./dc

I’ve switched to using sh instead of bash, and all of the commands are now relative paths, as they are just in the root directory.

First attempt

Now I have a directory that I can use for a chrooted network dc. I need to set up the xinetd config to use chroot and the jail I have set up:

service dc
{
	disable		= no
	type		= UNLISTED
	id		= dc-stream
	socket_type	= stream
	protocol	= tcp
	server		= /usr/sbin/chroot
	server_args	= /home/dc/ ./dctelnet
	user		= root
	wait		= no
	port		= 1312
	rlimit_cpu	= 60
	env		= HOME=/ PATH=/
}

I needed to set the HOME and PATH environment variables, otherwise I got a segfault (I’m not sure whether it was sh, sed or dc causing it), and to run chroot you need to be root, so I could no longer run the service as the user dc. This shouldn’t be a problem because the resulting process is constrained.

A bit more security

Chroot jails have a reputation for being easy to get wrong, and they are not something I have done a lot of work with, so I want to take a bit of time to think about whether I’ve left any glaring holes, and also try to improve on the simple option above a bit if I can.

Firstly, can dc still execute commands with the ! operation?

 ~> nc -v dc.pr0.uk 1312
Connection to dc.pr0.uk 1312 port [tcp/*] succeeded!
dc (GNU bc 1.06.95) 1.3.95

Copyright 1994, 1997, 1998, 2000, 2001, 2004, 2005, 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.
!ls
^C⏎

Nope. Ok, that’s good. The chroot jail has sh though, and has it in the PATH, so can it still get a shell and call dc, sh and sed?

 ~> nc -v dc.pr0.uk 1312
Connection to dc.pr0.uk 1312 port [tcp/*] succeeded!
dc (GNU bc 1.06.95) 1.3.95

Copyright 1994, 1997, 1998, 2000, 2001, 2004, 2005, 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.
!pwd
^C⏎

pwd is a builtin, so it looks like the answer is no, but why? Running strings on my version of dc, there is no mention of sh or exec, but there is a mention of system. From the man page of system:

The system() library function uses fork(2) to create a child process that executes the shell  command  specified in command using execl(3) as follows:

           execl("/bin/sh", "sh", "-c", command, (char *) 0);

So dc calls system() when you use !, which makes sense. system() calls /bin/sh, which does not exist in the jail, breaking the ! call.

For a system that I don’t care about, that is of little value to anyone else, and that sees very little traffic, that’s probably good enough. But I want to make it a bit better: if there were a problem with the dc program, or you could get it to pass something to sed and trigger an issue there, you could mess with the jail file system, overwrite the dc binary, and likely break out of the jail, since the whole thing is running as root.

So I want to do two things. Firstly, I don’t want dc running as root in the jail. Secondly, I want to throw away the environment after each use, so if you figure out how to mess with it you don’t affect anyone else’s fun.

Here’s a bash script which I think does both of these things:

#!/bin/bash
set -e
DCDIR="$(mktemp -d /tmp/dc_XXXX)"
trap '/bin/rm -rf -- "$DCDIR"' EXIT
cp -R /home/dc/ $DCDIR/
cd $DCDIR/dc
PATH=/
HOME=/
export PATH
export HOME
/usr/sbin/chroot --userspec=1001:1001 . ./dctelnet
  • Line 2 - set -e causes the script to exit on the first error
  • Lines 3 & 4 - make a temporary directory to run in, then set a trap to clean it up when the script exits.
  • I then copy the required files for the jail to the new temp directory, set $HOME and $PATH, and run the jail as an unprivileged user (uid 1001).

Now to make some changes to the xinetd file:

service dc
{
        disable         = no
        type            = UNLISTED
        id              = dc-stream
        socket_type     = stream
        protocol        = tcp
        server          = /usr/local/bin/dcinjail
        user            = root
        wait            = no
        port            = 1312
        rlimit_cpu      = 60
        log_type        = FILE /var/log/dctelnet.log
        log_on_success  = HOST PID DURATION
        log_on_failure  = HOST
}

The new version just runs the script from above. It still needs to run as root to be able to chroot.

I’ve also added some logging as this has piqued my interest and I want to see how many people (other than me) ever connect, and for how long.
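
One easy step to forget: xinetd only picks up the edited service definition after a reload or restart, so something along these lines is needed (the exact command depends on the init system in use):

sudo systemctl restart xinetd    # or: sudo service xinetd restart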

As always, I’m interested in feedback or questions. I’m no expert in this setup, so I may not be able to answer questions, but if you see something that looks wrong (or that you know is wrong), please let me know. I’m also interested to hear about other ways of doing process isolation - I know I could have used containers, and I think I could have used systemd or SELinux features (or both) to further lock down the dc user and achieve a similar result.

Thanks for reading.

Christopher RobertsFixing SVG Files in DokuWiki

Having upgraded a DokuWiki server from Ubuntu 16.04 to 18.04, I found that SVG images were no longer displaying in the browser. As I was unable to find any applicable answers online, I thought I should break my radio silence by detailing my solution.

Inspecting the file using the browser’s developer tools (Network tab) and refreshing the page showed that the file was being served as application/octet-stream. Sure enough, using curl showed the same:

curl -Ik https://example.com/file.svg
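
A quick way to look at just the header in question (the same check as above, filtered; before the fix this reported application/octet-stream, and after the fix below it should report image/svg+xml):

curl -sIk https://example.com/file.svg | grep -i '^content-type'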

All the advice online is to ensure that /etc/nginx/mime.types includes the line:

image/svg+xml   svg svgz;

But that was already in place.

I decided to try uploading the SVG file again, in case the Inkscape format was causing breakage. Yes, a long-shot indeed.

The upload was rejected by DokuWiki, as SVG was not in the list of allowed file extensions; so I added the following line to /var/www/dokuwiki/conf/mime.local.conf:

svg   image/svg_xml

Whereupon the images started working again. Presumably DokuWiki was seeing the mime type as image/svg instead of image/svg+xml, and this mismatch was preventing nginx from serving up the correct content type.

Hopefully this will help others, do let me know if it has helped you.

Paul RaynerSnakes and Ladders, Estimation and Stats (or 'Sometimes It Takes Ages')

Snakes And Ladders

Simple kids game, roll a dice and move along a board. Up ladders, down snakes. Not much to it?

We’ve been playing snakes and ladders a bit (lot) as a family because my 5 year old loves it. Our board looks like this:

Some games on this board take a really long time. My son likes to play games till the end, so until all players have finished. It’s apparently really funny when everyone else has finished and I keep finding the snakes over and over. Sometimes one player finishes really quickly - they hit some good ladders, few or no snakes and they are done in no time.

This got me thinking. What’s the distribution of game lengths for snakes and ladders? How long should we expect a game to take? How long before we typically have a winner?

Fortunately for me, snakes and ladders is a very simple game to model with a bit of python code.

Firstly, here are the rules we play:

1) Each player rolls a normal 6-sided dice and moves their token that number of squares forward.
2) If a player lands on the head of a snake, they go down the snake.
3) If a player lands on the bottom of a ladder, they go up to the top of the ladder.
4) If a player rolls a 6, they get another roll.
5) On this board, some ladders and snakes interconnect - the bottom of a snake is the head of another, or the top of a ladder is also the head of a snake. When this happens, you do all of the actions in turn, so down both snakes, or up the ladder and then down the snake.
6) You don’t need an exact roll to finish; once you get to 100 or more, you are done.

To model the board in python, all we really need are the coordinates of the snakes and the ladders - their starting and ending squares.

def get_snakes_and_ladders():
    snakes = [
        (96,27),
        (88,66),
        (89,46),
        (79,44),
        (76,19),
        (74,52),
        (57,3),
        (60,39),
        (52,17),
        (50,7),
        (32,15),
        (30,9)
    ]
    ladders = [
        (6,28),
        (10,12),
        (18,37),
        (40,42),
        (49,67),
        (55,92),
        (63,76),
        (61,81),
        (86,94)
    ]
    return snakes + ladders

Since snakes and ladders are both mappings from one point to another, we can combine them in one array as above.

The game is modelled with a few lines of Python:

from random import randint

class Game:

    def __init__(self) -> None:
        self.token = 1
        snakes_and_ladders_list = get_snakes_and_ladders()
        self.sl = {}
        for entry in snakes_and_ladders_list:
            self.sl[entry[0]] = entry[1]

    def move(self, howmany):
        self.token += howmany
        while (self.token in self.sl):
            self.token = self.sl[self.token]
        return self.token

    def turn(self):
        num = self.roll()
        self.move(num)
        if num == 6:
            self.turn()
        if self.token>=100:
            return True
        return False

    def roll(self):
        return randint(1,6)

A turn consists of all the actions taken by a player before the next player gets their turn. This can consist of multiple moves if the player rolls one or more sixes, as rolling a six gives you another move.

With this, we can run some games and plot them. Here’s what a sample looks like.

The Y axis is the position on the board, and the X axis is the number of turns. This small graphical representation of the game shows how variable it can be. The red player finishes in under 20 moves, whereas the orange player takes over 80.

To see how variable it is, we can run the simulation a large number of times and look at the results. Running for 10,000 games we get the following:

Statistic   Result (turns)
min         5
max         918
mean        90.32
median      65

So the fastest finish in 10,000 games was just 5 turns, and the slowest was an awful (if you were rolling the dice) 918 turns.

Here are some histograms for the distribution of game lengths, the distribution of number of turns for a player to win in a 3 person game, and the number of turns for all players to finish in a 3 person game.

The python code for this post is at snakes.py

Alex HudsonIntroduction to the Metaverse

You’ve likely heard the term “metaverse” many times over the past few years, and outside the realm of science fiction novels, it has tended to refer to some kind of computer-generated world. There’s often little distinction between a “metaverse” and a relatively interactive virtual reality world.

There are a huge number of people who think this is simply a marketing term, and Facebook’s recent rebranding of its holding company to “Meta” has only reinforced this view. However, I think this view is wrong, and I hope to explain why.

Alex HudsonIt's tough being an Azure fan

Azure has never been the #1 cloud provider - that spot continues to belong to AWS, which is the category leader. However, in most people’s minds, it has been a pretty reasonable #2, and while not necessarily vastly differentiated from AWS there are enough things to write home about.

However, even as a user and somewhat of a fan of the Azure technology, it is proving increasingly difficult to recommend.

Josh HollandMore on git scratch branches: using stgit

More on git scratch branches: using stgit

I wrote a short post last year about a useful workflow for preserving temporary changes in git by using a scratch branch. Since then, I’ve come across stgit, which can be used in much the same way, but with a few little bells and whistles on top.

Let’s run through a quick example to show how it works. Let’s say I want to play around with the cool new programming language Zig and I want to build the compiler myself. The first step is to grab a source code checkout:

$ git clone https://github.com/ziglang/zig
Cloning into 'zig'...
remote: Enumerating objects: 123298, done.
remote: Counting objects: 100% (938/938), done.
remote: Compressing objects: 100% (445/445), done.
remote: Total 123298 (delta 594), reused 768 (delta 492), pack-reused 122360
Receiving objects: 100% (123298/123298), 111.79 MiB | 6.10 MiB/s, done.
Resolving deltas: 100% (91169/91169), done.
$ cd zig

Now, according to the instructions we’ll need to have CMake, GCC or clang and the LLVM development libraries to build the Zig compiler. On NixOS it’s usual to avoid installing things like this system-wide but instead use a file called shell.nix to specify your project-specific dependencies. So here’s the one ready for Zig (don’t worry if you don’t understand the Nix code, it’s the stgit workflow I really want to show off):

$ cat > shell.nix << EOF
{ pkgs ? import <nixpkgs> {} }:
pkgs.mkShell {
  buildInputs = [ pkgs.cmake ] ++ (with pkgs.llvmPackages_12; [ clang-unwrapped llvm lld ]);
}
EOF
$ nix-shell

Now we’re in a shell with all the build dependencies, and we can go ahead with the mkdir build && cd build && cmake .. && make install steps from the Zig build instructions1.

But now what do we do with that shell.nix file?

$ git status
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        shell.nix

nothing added to commit but untracked files present (use "git add" to track)

We don’t really want to add it to the permanent git history, since it’s just a temporary file that is only useful to us. But the other options of just leaving it there untracked or adding it to .git/info/exclude are unsatisfactory as well: before I started using scratch branches and stgit, I often accidentally deleted my shell.nix files which were sometimes quite annoying to have to recreate when I needed to pin specific dependency versions and so on.

But now we can use stgit to take care of it!

$ stg init # stgit needs to store some metadata about the branch
$ stg new -m 'add nix config'
Now at patch "add-nix-config"
$ stg add shell.nix
$ stg refresh
Now at patch "add-nix-config"

This little dance creates a new commit adding our shell.nix managed by stgit. You can stg pop it to unapply, stg push2 to reapply, and stg pull to do a git pull and reapply the patch back on top. The main stgit documentation is helpful to explain all the possible operations.

This solves all our problems! We have basically recreated the scratch branch from before, but now we have pre-made tools to apply, un-apply and generally play around with it. The only problem is that it’s really easy to accidentally push your changes back to the upstream branch.

Let’s have another example. Say I’m sold on the stgit workflow, I have a patch at the bottom of my stack adding some local build tweaks and, on top of that, a patch that I’ve just finished working on that I want to push upstream.

$ cd /some/other/project
$ stg series # show all my patches
+ add-nix-config
> fix-that-bug

Now I can use stg commit to turn my stgit patch into a real immutable git commit that stgit isn’t going to mess around with any more:

$ stg commit fix-that-bug
Popped fix-that-bug -- add-nix-config
Pushing patch "fix-that-bug" ... done
Committed 1 patch
Pushing patch "add-nix-config" ... done
Now at patch "add-nix-config"

And now what we should do before git pushing is stg pop -a to make sure that we don’t push add-nix-config or any other local stgit patches upstream. Sadly it’s all too easy to forget that, and since stgit updates the current branch to point at the current patch, just doing git push here will include the commit representing the add-nix-config patch.
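
In other words, the safe manual routine looks something like this (patch names as in the example above):

stg pop -a     # unapply every local stgit patch
git push       # only real commits go upstream
stg push -a    # reapply the local patches afterwards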

The way to prevent this is through git’s hook system. Save this as pre-push3 (make sure it’s executable):

#!/bin/bash
# An example hook script to verify what is about to be pushed.  Called by "git
# push" after it has checked the remote status, but before anything has been
# pushed.  If this script exits with a non-zero status nothing will be pushed.
#
# This hook is called with the following parameters:
#
# $1 -- Name of the remote to which the push is being done
# $2 -- URL to which the push is being done
#
# If pushing without using a named remote those arguments will be equal.
#
# Information about the commits which are being pushed is supplied as lines to
# the standard input in the form:
#
#   <local ref> <local sha1> <remote ref> <remote sha1>

remote="$1"
url="$2"

z40=0000000000000000000000000000000000000000

while read local_ref local_sha remote_ref remote_sha
do
    if [ "$local_sha" = $z40 ]
    then
        # Handle delete
        :
    else
        # verify we are on a stgit-controlled branch
        git show-ref --verify --quiet "${local_ref}.stgit" || continue
        if [ $(stg series --count --applied) -gt 0 ]
        then
            echo >&2 "Unapplied stgit patch found, not pushing"
            exit 1
        fi
    fi
done

exit 0

Now we can’t accidentally4 shoot ourselves in the foot:

$ git push
Unapplied stgit patch found, not pushing
error: failed to push some refs to <remote>

Happy stacking!


  1. At the time of writing, Zig depends on the newly-released LLVM 12 toolchain, but this hasn’t made it into the nixos-unstable channel yet, so this probably won’t work on your actual NixOS machine.↩︎

  2. an unfortunate naming overlap between pushing onto a stack and pushing a git repo↩︎

  3. A somewhat orthogonal but also useful tip here so that you don’t have to manually add this to every repository is to configure git’s core.hooksDir to something like ~/.githooks and put it there.↩︎

  4. You can always pass --no-verify if you want to bypass the hook.↩︎

Jon FautleyUsing the Grafana Cloud Agent with Amazon Managed Prometheus, across multiple AWS accounts

Observability is all the rage these days, and the process of collecting metrics is getting easier. Now, the big(ger) players are getting in on the action, with Amazon releasing a Managed Prometheus offering and Grafana now providing a simplified “all-in-one” monitoring agent. This is a quick guide to show how you can couple these two together, on individual hosts, and incorporating cross-account access control. The Grafana Cloud Agent Grafana Labs have taken (some of) the best bits of the Prometheus monitoring stack and created a unified deployment that wraps the individual moving parts up into a single binary.
