Planet BitFolk

BitFolk Wiki: Monitoring

Setup: NRPE example config

Revision as of 11:17, 19 November 2024


These sorts of checks can work without an agent (i.e. without anything installed on your VPS). More complicated checks such as disk space, load or anything else that you can check with a script will need some sort of agent such as an [https://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE--2D-Nagios-Remote-Plugin-Executor/details NRPE] daemon or [[Wikipedia:SNMP|SNMP]] daemon.
===NRPE===
NRPE is a typical agent you would run that would allow BitFolk's monitoring system to execute health checks on your VPS. On Debian/Ubuntu systems it can be installed from the package '''nagios-nrpe-server'''. This will normally pull in the package '''monitoring-plugins-basic''' which contains the check plugins.
Check plugins end up in the '''/usr/lib/nagios/plugins/''' directory. NRPE can run any of these when asked and feed the info back to BitFolk's Icinga. All of the existing ones should support a <code>--help</code> argument to let you know how to use them, e.g.
<syntaxhighlight lang="text">
$ /usr/lib/nagios/plugins/check_tcp --help
</syntaxhighlight>
You can run check plugins from the command line:
<syntaxhighlight lang="text">
$ /usr/lib/nagios/plugins/check_tcp -H 85.119.82.70 -p 443
TCP OK - 0.000 second response time on 85.119.82.70 port 443|time=0.000322s;;;0.000000;10.000000
</syntaxhighlight>
There are a large number of Nagios-compatible check plugins in existence so you should be able to find one that does what you want. If there isn't, it's easy to write one. Here's an example of using '''check_disk''' to check the disk space of your root filesystem.
<syntaxhighlight lang="text">
$ /usr/lib/nagios/plugins/check_disk -w '10%' -c '4%' -p /
DISK OK - free space: / 631 MB (11% inode=66%);| /=5035MB;5381;5739;0;5979
</syntaxhighlight>
Once you have that working, you put it in an NRPE config file such as '''/etc/nagios/nrpe.d/xvda1.cfg'''.
<syntaxhighlight lang="text">
command[check_xvda1]=/usr/lib/nagios/plugins/check_disk -w '10%' -c '4%' -p /
</syntaxhighlight>
You should then tell BitFolk (in a support ticket) the name of the command ("'''check_xvda1'''") and it will then get added to BitFolk's Icinga.
By this means you can check anything you can script.
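As an illustration, here is a minimal sketch of a home-made check plugin (the path, name and thresholds below are made up for this example and untested); it prints a one-line status and uses the standard plugin exit codes of 0 for OK, 1 for WARNING and 2 for CRITICAL:
<syntaxhighlight lang="text">
#!/bin/bash
# Illustrative example only: warn/crit on the number of running processes.
WARN=300
CRIT=400
procs=$(ps -e --no-headers | wc -l)
if [ "$procs" -ge "$CRIT" ]; then
    echo "PROCS CRITICAL - $procs processes | procs=$procs"
    exit 2
elif [ "$procs" -ge "$WARN" ]; then
    echo "PROCS WARNING - $procs processes | procs=$procs"
    exit 1
else
    echo "PROCS OK - $procs processes | procs=$procs"
    exit 0
fi
</syntaxhighlight>
Saved somewhere like '''/usr/local/lib/nagios/plugins/check_procs_simple''' and made executable, it can then be exposed through NRPE with a <code>command[...]</code> line just like the '''check_xvda1''' example above.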


==Alerts==

BitFolk Wiki: User:Equinox/WireGuard

Revision as of 20:33, 13 November 2024


'''STOP !!!!!'''

'''This is a first draft! If you're a hardy network type who can recover from errors / omissions in this page then go for it (and fix this page!)'''




PrivateKey = # Insert the contents of the file server-private-key generated above
ListenPort = # Pick an empty UDP port to listen on. Remember to open it in your firewall
Address = 10.254.1.254/32
Address = 2a0a:1100:1018:1::fe/128
</syntaxhighlight>


Address = 10.254.1.1/24
Address = 2a0a:1100:1018:1::1/64
</syntaxhighlight>


=== Add the Client Information to the Server ===

Append the following to the '''server''' /etc/wireguard/wg0.conf, inserting your generated information where appropriate:

=== Add the Server Information to the Client ===

Append the following to the '''client''' /etc/wireguard/wg0.conf, inserting your generated information where appropriate:
# systemctl start wg-quick@wg0
# systemctl enable wg-quick@wg0      # Optional - start VPN at startup
</syntaxhighlight>
If all went well you should now have a working tunnel. Confirm by running:
<syntaxhighlight lang="text">
# wg
</syntaxhighlight>
If both sides have a reasonable looking "latest handshake" line then the tunnel is up.
The wg-quick scripts automatically set up routes / default routes based on the contents of the wg0.conf files, so at this point you can test the link by pinging addresses from either side.
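For example, from the client side you could ping the server's tunnel addresses as configured above (adjust if you chose different addresses):
<syntaxhighlight lang="text">
# ping -c 3 10.254.1.254
# ping -c 3 2a0a:1100:1018:1::fe
</syntaxhighlight>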
Two further things will probably need to be configured to allow full routing...
==== Enable IP Forwarding ====
Edit /etc/sysctl.conf, or a local conf file in /etc/sysctl.d/, and enable IPv4 and/or IPv6 forwarding:
<syntaxhighlight lang="text">
net.ipv4.ip_forward=1
net.ipv6.conf.all.forwarding=1
</syntaxhighlight>
Reload the kernel variables:
<syntaxhighlight lang="text">
systemctl reload procps
</syntaxhighlight>
==== WireGuard Max MTU Size ====
If some websites don't work properly over IPv6 (Netflix is a known example) you may be running into MTU size problems. If using nftables, this can be entirely fixed with the following line in the forward chain:
<syntaxhighlight lang="text">
oifname "wg0" tcp flags syn tcp option maxseg size set rt mtu
</syntaxhighlight>
==== Enable Forwarding In Your Firewall ====
You will have to figure out how to do this for your flavour of firewall. If you happen to be using nftables, the following snippet is an example (by no means a full config!) of how to forward IPv6 back and forth. This snippet allows all outbound traffic but throws incoming traffic to the chain "ipv6-incoming-firewall" for further filtering. (For testing you could just "accept", but don't leave it like that!)
<syntaxhighlight lang="text">
chain ip6-forwarding {
    type filter hook forward priority 0; policy drop;
    oifname "wg0" tcp flags syn tcp option maxseg size set rt mtu
    ip6 saddr 2a0a:1100:1018:1::/64 accept
    ip6 daddr 2a0a:1100:1018:1::/64 jump ipv6-incoming-firewall
}
</syntaxhighlight>


Jon Spriggs: Talk Summary – OggCamp ’24 – Kubernetes, A Guide for Docker users

Format: Theatre Style room. ~30 attendees.

Slides: Available to view (Firefox/Chrome recommended – press “S” to see the required speaker notes)

Video: Not recorded. I’ll try to record it later, if I get a chance.

Slot: Graphine 1, 13:30-14:00

Notes: Apologies for the delay on posting this summary. The talk was delivered to a very busy room. Lots of amazing questions. The presenter notes were extensive, but entirely unused when delivered. One person asked a question, I said I’d follow up with them later, but didn’t find them before the end of the conference. One person asked about the benefits of EKS over ECS in AWS… as I’ve not used ECS, I couldn’t answer, but it sounds like they largely do the same thing.

Andy Smith: Protecting URIs from Tor nodes with the Apache HTTP Server

Recently I found one of my web services under attack from clients using Tor.

For the most part I am okay with the existence of Tor, but if you're being attacked largely or exclusively through Tor then you might need to take actions like:

  • Temporarily or permanently blocking access entirely.
  • Taking away access to certain privileged functions.

Here's how I did it.

Step 1: Obtain a list of exit nodes

Tor exit nodes are the last hop before reaching regular Internet services, so traffic coming through Tor will always have a source IP of an exit node.

Happily there are quite a few services that list Tor nodes. I like https://www.dan.me.uk/tornodes which can provide a list of exit nodes, updated hourly.

This comes as a list of IP addresses, one per line, so in order to turn it into an httpd access control list:

$ curl -s 'https://www.dan.me.uk/torlist/?exit' |
    sed 's/^/Require not ip /' |
    sudo tee /etc/apache2/tor-exit-list.conf >/dev/null

This results in a file like:

$ head -10 /etc/apache2/tor-exit-list.conf
Require not ip 102.130.113.9
Require not ip 102.130.117.167
Require not ip 102.130.127.117
Require not ip 103.109.101.105
Require not ip 103.126.161.54
Require not ip 103.163.218.11
Require not ip 103.164.54.199
Require not ip 103.196.37.111
Require not ip 103.208.86.5
Require not ip 103.229.54.107

Step 2: Configure httpd to block them

Totally blocking traffic from these IPs would be easier than what I decided to do. If you just wanted to totally block traffic from Tor then the easy and efficient answer would be to insert all these IPs into an nftables set or an iptables IP set.
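For completeness, a rough sketch of that nftables approach could look something like this (illustrative only; the table, chain and set names are invented, and the list would need splitting since it can contain IPv6 addresses as well as IPv4):

$ sudo nft add table inet filter
$ sudo nft add chain inet filter input '{ type filter hook input priority 0; policy accept; }'
$ sudo nft add set inet filter tor_exits '{ type ipv4_addr; }'
$ sudo nft add rule inet filter input ip saddr @tor_exits drop
$ sudo nft add element inet filter tor_exits \
    "{ $(curl -s 'https://www.dan.me.uk/torlist/?exit' | grep -v : | paste -sd, -) }"

That only covers IPv4; the IPv6 entries in the list would need an equivalent ip6_addr set and rule.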

For me, it's only some URIs on my web service that I don't want these IPs accessing and I wanted to preserve the ability of Tor's non-abusive users to otherwise use the rest of the service. An httpd access control configuration is necessary.

Inside the virtualhost configuration file I added:

    <Location /some/sensitive/thing>
        <RequireAll>
            Require all granted
            Include /etc/apache2/tor-exit-list.conf
        </RequireAll>
    </Location>

Step 3: Test configuration and reload

It's a good idea to check the correctness of the httpd configuration now. Aside from syntax errors in the list of IP addresses, this might catch a missing module needed for these directives, although I think they are all pretty core.

Assuming all is well then a graceful reload will be needed to make httpd see the new configuration.

$ sudo apache2ctl configtest
Syntax OK
$ sudo apache2ctl graceful

Step 4: Further improvements

Things can't be left there, but I haven't got around to any of this yet.

  1. Script the repeated download of the Tor exit node list. The list of active Tor nodes will change over time.
  2. Develop some checks on the list such as:
    1. Does it contain only valid IP addresses?
    2. Does it contain at least min number of addresses and less than max number?
  3. If the list changed, do the config test and reload again. httpd will not include the altered config file without a reload.
  4. If the list has not changed in x number of days, consider the data source stale and think about emptying the list.
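Something like the following (an untested sketch; the URL, path and bounds here are placeholder assumptions) could cover the first three points when run regularly as root from cron:

#!/bin/bash
# Sketch: refresh the Tor exit list and reload httpd only if it changed.
set -euo pipefail

list_url='https://www.dan.me.uk/torlist/?exit'
conf=/etc/apache2/tor-exit-list.conf
min=500
max=10000

tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT

curl -sf "$list_url" | sed 's/^/Require not ip /' > "$tmp"

# Sanity checks: every line looks like an IP address, count within bounds.
count=$(wc -l < "$tmp")
if [ "$count" -lt "$min" ] || [ "$count" -gt "$max" ]; then
    echo "Refusing to install suspicious list of $count entries" >&2
    exit 1
fi
if grep -Evq '^Require not ip [0-9A-Fa-f:.]+$' "$tmp"; then
    echo "List contains lines that do not look like IP addresses" >&2
    exit 1
fi

# Only touch httpd if the list actually changed.
if ! cmp -s "$tmp" "$conf"; then
    install -m 0644 "$tmp" "$conf"
    apache2ctl configtest && apache2ctl graceful
fi

Point 4 (staleness) would additionally need something like a timestamp check on the installed file.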

Performance thoughts

I have not checked how much this impacts performance. My service is not under enough load for this to be noticeable for me.

At the moment the Tor exit node list is around 2,100 addresses and I don't know how efficient the Apache HTTP Server is about a large list of Require not ip directives. Worst case is that for every request to that URI it will be scanning sequentially through to the end of the list.

I think that using httpd's support for DBM files in RewriteMaps might be quite efficient but this comes with the significant issue that IPv6 addresses have multiple formats, while a DBM lookup will be doing a literal text comparison.

For example, all of the following represent the same IPv6 address:

  • 2001:db8::
  • 2001:0DB8::
  • 2001:Db8:0000:0000:0000:0000:0000:0000
  • 2001:db8:0:0:0:0:0:0

httpd does have built-in functions to upper- or lower-case things, but not to compress or expand an IPv6 address. httpd access control directives are also able to match the request IP against a CIDR net block, although at the moment Dan's Tor node list does only contain individual IP addresses. At a later date one might like to try to aggregate those individual IP addresses into larger blocks.

httpd's RewriteMaps can also query an SQL server. Querying a competent database implementation like PostgreSQL could be made to alleviate some of those concerns if the data were represented properly, though this does start to seem like an awful lot of work just for an access control list!

Over on Fedi, it was suggested that a firewall rule — presumably using an nftables set or iptables IP set, which are very efficient — could redirect matching source IPs to a separate web server on a different port, which would then do the URI matching as necessary.

<nerdsnipe>There does not seem to be an Apache HTTP Server authz module for IP sets. That would be the best of both worlds!</nerdsnipe>

BitFolk Wiki: IPv6/VPNs

Using WireGuard

Revision as of 16:31, 31 October 2024
== Using WireGuard ==
Probably the more sensible choice in the 2020s, but, help?
[[User:Equinox/WireGuard|Not ready yet, but it's a start]]


== Using tincd ==

BitFolk Wiki: User:Equinox/WireGuard

Created page with " '''STOP !!!!!!!!''' '''This is NOT READY ! I guarantee it won't work yet. (Mainly the routing, also table=off needs research, I can't remember exactly what it does).''' '''Is is untried / untested. It's a first draft fished out partly from my running system and partly my notes.''' '''HOWEVER...''' If you are a hardy network type and you want a go ... Have at it. You read the bit about STOP above, didn't you.. WireGuard is part of the Linux kernel and is minimal..."

New page


'''STOP !!!!!!!!'''

'''This is NOT READY ! I guarantee it won't work yet. (Mainly the routing, also table=off needs research, I can't remember exactly what it does).'''

'''It is untried / untested. It's a first draft fished out partly from my running system and partly my notes.'''

'''HOWEVER...''' If you are a hardy network type and you want a go ... Have at it.





You read the bit about STOP above, didn't you? WireGuard is part of the Linux kernel and is minimal and lightweight, gaining what can be interpreted as [https://lwn.net/ml/linux-kernel/CA+55aFz5EWE9OTbzDoMfsY2ez04Qv9eg0KQhwKfyJY0vFvoD3g@mail.gmail.com/ approval] by Linus Torvalds himself. WireGuard is almost symmetrical in operation so there aren't really servers and clients, just peers. It operates over the UDP protocol and while there is an initial handshake when "connecting", nothing is transmitted between the peers unless necessary. (There is a keep-alive option to defeat NAT gateway timeouts). WireGuard is fast: it can saturate my home broadband connection where OpenVPN could not.

This WireGuard guide section was created using Debian 12 and 'wg-quick' at both ends of the VPN. As they say, YMMV - use what you can, adapt where necessary. Let's stick to the example requirements set above for this guide. (WireGuard can be used in other configurations, for example, to route everything through the VPN except the VPN traffic itself).

=== Server Setup ===

<syntaxhighlight lang="text">
# apt install wireguard
# cd /etc/wireguard
# wg genkey > server-private-key
# wg pubkey < server-private-key > server-public-key
# wg genpsk > shared-secret-key # Optional for improved security
</syntaxhighlight>

Edit /etc/wireguard/wg0.conf

<syntaxhighlight lang="text">
[Interface]
PrivateKey = # Insert the contents of the file server-private-key generated above
ListenPort = # Pick an empty UDP port to listen on. Remember to open it in your firewall
Address = 10.254.1.254/24
Address = 2a0a:1100:1018:1::fe/64
Table = off
</syntaxhighlight>

=== Client Setup ===

<syntaxhighlight lang="text">
# apt install wireguard
# cd /etc/wireguard
# wg genkey > client-private-key
# wg pubkey < client-private-key > client-public-key
</syntaxhighlight>

Edit /etc/wireguard/wg0.conf

<syntaxhighlight lang="text">
[Interface]
PrivateKey = # Insert the contents of the file client-private-key generated above
Address = 10.254.1.1/24
Address = 2a0a:1100:1018:1::1/64
Table = off
</syntaxhighlight>

Now we have a wg0 interface defined on both ends with statically assigned IPv4 and IPv6 addresses.

=== Add the Client Information to the Server ===

Append the following to the server /etc/wireguard/wg0.conf, inserting your generated information where appropriate:

<syntaxhighlight lang="text">
# Client Name
[Peer]
PublicKey = # Insert contents of client-public-key generated in Client Setup
# PresharedKey = # Uncomment and insert contents of shared-secret-key generated in Server Setup - if you did
AllowedIPs = 10.254.1.0/24 # Source IPv4 addresses from the client side allowed to transmit to the server
AllowedIPs = 2a0a:1100:1018:1::/64 # Source IPv6 addresses from the client side allowed to transmit to the server
</syntaxhighlight>

=== Add the Server Information to the Client ===

Append the following to the server /etc/wireguard/wg0.conf, inserting your generated information where appropriate:

<syntaxhighlight lang="text">
# Your VPS Name
[Peer]
PublicKey = # Insert contents of server-public-key generated in Server Setup
# PresharedKey = # Uncomment and insert contents of shared-secret-key generated in Server Setup - if you did
Endpoint = 85.119.82.121:PORT-NUMBER # Replace PORT-NUMBER with the UDP port number you chose in Server Setup ListenPort
# PersistentKeepalive = 170 # Uncomment and choose a timeout in seconds for keepalive packets - Necessary if the client is behind NAT
AllowedIPs = 0.0.0.0/0, ::/0 # Allow any IP from the server side to transmit to the client
</syntaxhighlight>

=== Start The VPN ===

Run on both sides:

<syntaxhighlight lang="text">
# systemctl daemon-reload # Have systemd find the new wg0.conf files
# systemctl start wg-quick@wg0
# systemctl enable wg-quick@wg0 # Optional - start VPN at startup
</syntaxhighlight>

=== Useful Information / Commands ===

Status of all WireGuard interfaces:

<syntaxhighlight lang="text">
# wg
</syntaxhighlight>

wg-quick is a higher level system to manage wg, wg being the low level kernel-based part. The wg0.conf files made above are a hybrid format with some options targeted at wg-quick and some at wg. When wg-quick talks to wg it strips any wg-quick only options from the config before submitting it.
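If you're curious what survives that stripping, wg-quick can show you the config it would hand to the kernel side:
<syntaxhighlight lang="text">
# wg-quick strip wg0
</syntaxhighlight>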

There are useful wg-quick options that can be added to wg0.conf, particularly this one:

<syntaxhighlight lang="text">
PostUp = /etc/wireguard/wg0-postup.sh
</syntaxhighlight>

... into which you can write any complex routing setup you like.
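As an illustration: with <code>Table = off</code>, wg-quick installs no routes itself, so a client-side post-up script might, for example, send IPv6 traffic through the tunnel (a sketch only, assuming the client has no existing IPv6 default route):
<syntaxhighlight lang="text">
#!/bin/sh
# Example /etc/wireguard/wg0-postup.sh for the client - remember to make it executable.
# Route all IPv6 via the tunnel; adjust or extend for your own routing needs.
ip -6 route add default dev wg0
</syntaxhighlight>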

=== Android Client ===

To configure an [https://play.google.com/store/apps/details?id=com.wireguard.android Android client] by scanning a QR code, create a fully configured android-client.conf file for your new Android client, just as a client wg0.conf was created above. Install qrencode and then:

<syntaxhighlight lang="text">
# qrencode -t png -o qr-code.png -r android-client.conf
</syntaxhighlight>

BitFolk Wiki: IPv6/VPNs

Update for new #48s and 2020s


Chris Wallace: Moving from exist-db 3.0.1 to 6.2.0

That’s an awful lot of release notes to read through...

David Leadbeater: Restrict sftp with Linux user namespaces

A script to restrict SFTP to some directories, without needing chroot or other privileged configuration.

Andy Smith: Generating a link-local address from a MAC address in Perl

Example

On the host

$ ip address show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether aa:00:00:4b:a0:c1 brd ff:ff:ff:ff:ff:ff
[…]
    inet6 fe80::a800:ff:fe4b:a0c1/64 scope link 
       valid_lft forever preferred_lft forever

Generated by script

$ lladdr.pl aa:00:00:4b:a0:c1
fe80::a800:ff:fe4b:a0c1

Code

#!/usr/bin/env perl

use warnings;
use strict;
use 5.010;

if (not defined $ARGV[0]) {
    die "Usage: $0 MAC-ADDRESS"
}

my $mac = $ARGV[0];

if ($mac !~ /^
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}
    /ix) {
    die "'$mac' doesn't look like a MAC address";
}

my @octets = split(/:/, $mac);

# Algorithm:
# 1. Prepend 'fe80::' for the first 64 bits of the IPv6
# 2. Next 16 bits: Use first octet with 7th bit flipped, and second octet
#    appended
# 3. Next 16 bits: Use third octet with 'ff' appended
# 4. Next 16 bits: Use 'fe' with fourth octet appended
# 5. Next 16 bits: Use 5th octet with 6th octet appended
# = 128 bits.
printf "fe80::%x%02x:%x:%x:%x\n",
    hex($octets[0]) ^ 2,
    hex($octets[1]),
    hex($octets[2] . 'ff'),
    hex('fe' . $octets[3]),
    hex($octets[4] . $octets[5]);

See also

Jon Spriggs: This little #bash script will make capturing #output from lots of #scripts a lot easier

A while ago, I was asked to capture a LOT of data for a support case, where they wanted lots of commands to be run, like “kubectl get namespace” and then for each namespace, get all the pods with “kubectl get pods -n $namespace” and then describe each pod with “kubectl get pod -n $namespace $podname”. Then do the same with all the services, deployments, ingresses and endpoints.

I wrote this function, and a supporting script to execute the actual checks, and just found it while clearing up!

#!/bin/bash

filename="$(echo $* | sed -E -e 's~[ -/\\]~_~g').log"
echo "\$ $@" | tee "${filename}"
$@ 2>&1 | tee -a "${filename}"

This script is quite simple; it does three things:

  1. Take the command you’re about to run, strip all the non-acceptable-filename characters out and replace them with underscores, and turn that into the output filename.
  2. Write the command into the output file, replacing any prior versions of that file
  3. Execute the command, and append the log to the output file.

So, how do you use this? Simple

log_result my-command --with --all --the options

This will produce a file called my-command_--with_--all_--the_options.log that contains this content:

$ my-command --with --all --the options
Congratulations, you ran my-command and turned on the options "--with --all --the options". Nice one!

… oh, and the command I ran to capture the data for the support case?

log_result kubectl get namespace
for TYPE in pod ingress service deployment endpoints
do
  for ns in $(kubectl get namespace | grep -v NAME | awk '{print $1}' )
  do
    echo $ns
    for item in $(kubectl get $TYPE -n $ns | grep -v NAME | awk '{print $1}')
    do
      log_result kubectl get $TYPE -n $ns $item -o yaml
      log_result kubectl describe $TYPE -n $ns $item
    done
  done
done

Featured image is “Travel log texture” by “Mary Vican” on Flickr and is released under a CC-BY license.

BitFolk Issue Tracker: BitFolk - Feature #216 (New): Add phishing-resistant authentication for https://panel.bitfolk.com/

It would be a good thing to have the option of protecting one's https://panel.bitfolk.com/ account using a phishing-resistant form of authentication such as WebAuthn/Passkeys.

In the case of WebAuthn this could either be implemented as a second factor using "plain" WebAuthn or as the primary factor by relying on WebAuthn Discoverable keys. The latter is what has come to be referred to as Passkeys.

This request is somewhat of a follow-up to https://tools.bitfolk.com/redmine/issues/117. While TOTP 2FA offers good protection against weak and reused passwords, it tends to fall short against a modern phishing attack, where the provided TOTP value can be proxied together with the provided password.

Jon Spriggs: Why (and how) I’ve started writing my Shell Scripts in Python

I’ve been using Desktop Linux for probably 15 years, and Server Linux for more like 25 in one form or another. One of the things you learn to write pretty early on in Linux System Administration is Bash Scripting. Here’s a great example

#!/bin/bash

i=0
until [ $i -eq 10 ]
do
  print "Jon is the best!"
  (( i += 1 ))
done

Bash scripts are pretty easy to come up with, you just write the things you’d type into the interactive shell, and it does those same things for you! Yep, it’s pretty hard not to love Bash for a shell script. Oh, and it’s portable too! You can write the same Bash script for one flavour of Linux (like Ubuntu), and it’s probably going to work on another flavour of Linux (like RedHat Enterprise Linux, or Arch, or OpenWRT).

But. There comes a point where a Bash script needs to be more than just a few commands strung together.

At work, I started writing a “simple” installer for a Kubernetes cluster – it provisions the cloud components with Terraform, and then once they’re done, it then starts talking to the Kubernetes API (all using the same CLI tools I use day-to-day) to install other components and services.

When the basic stuff works, it’s great. When it doesn’t work, it’s a bit of a nightmare, so I wrote some functions to put logs in a common directory, and another function to gracefully stop the script running when something fails, and then write those log files out to the screen, so I know what went wrong. And then I gave it to a colleague, and he ran it, and things broke in a way that didn’t make sense for either of us, so I wrote some more functions to trap that type of error, and try to recover from them.

And each time, the way I tested where it was working (or not working) was to just… run the shell script, and see what it told me. There had to be a better way.

Enter Python

Python earns my vote for a couple of reasons (and they might not be right for you!)

  • I’ve been aware of the language for some time, and in fact, had patched a few code libraries in the past to use Ansible features I wanted.
  • My preferred IDE (Integrated Development Environment), Visual Studio Code, has a step-by-step debugger I can use to work out what’s going on during my programming
  • It’s still portable! In fact, if anything, it’s probably more portable than Bash, because the version of Bash on the Mac operating system – OS X – is really old, so lots of “modern” features I’d expect to be in Bash and associated tooling aren’t there! Python is Python everywhere.
  • There’s an argument parsing tool built into the core library, so if I want to handle things like ./myscript.py --some-long-feature "option-A" --some-long-feature "option-B" -a -s -h -o -r -t --argument I can do, without having to remember how to write that in Bash (which is a bit esoteric!)
  • And lastly, for now at least!, is that Python allows you to raise errors that can be surfaced up to other parts of your program

Given all this, my personal preference is to write my shell scripts now in Python.

If you’ve not written python before, variables are written without any prefix (like you might have seen $ in PHP) and any flow control (like if, while, for, until) as well as any functions and classes use white-space indentation to show where that block finishes, like this:

def do_something():
  pass

if some_variable == 1:
  do_something()
  and_something_else()
  while some_variable < 2:
    some_variable = some_variable * 2

Starting with Boilerplate

I start from a “standard” script I use. This has a lot of those functions I wrote previously for bash, but with cleaner code, and in a way that’s a bit more understandable. I’ll break down the pieces I use regularly.

Starting the script up

Here’s the first bit of code I always write, this goes at the top of everything

#!/usr/bin/env python3
import logging
logger = logging

This makes sure this code is portable, but is always using Python3 and not Python2. It also starts the logging engine.

At the bottom I create a block which the “main” code will go into, and then run it.

def main():
  logger.basicConfig(level=logging.DEBUG)
  logger.debug('Started main')

if __name__ == "__main__":
    main()

Adding argument parsing

There’s a standard library module which takes command line arguments and uses them in your script; it’s called argparse and it looks like this:

#!/usr/bin/env python3
# It's convention to put all the imports at the top of your files
import argparse
import logging
logger = logging

def process_args():
  parser=argparse.ArgumentParser(
    description="A script to say hello world"
  )

  parser.add_argument(
    '--verbose', # The stored variable can be found by getting args.verbose
    '-v',
    action="store_true",
    help="Be more verbose in logging [default: off]"
  )

  parser.add_argument(
    'who', # This is a non-optional, positional argument called args.who
    help="The target of this script"
  )
  args = parser.parse_args()

  if args.verbose:
      logger.basicConfig(level=logging.DEBUG)
      logger.debug('Setting verbose mode on')
  else:
      logger.basicConfig(level=logging.INFO)

  return args

def main():
  args=process_args()

  print(f'Hello {args.who}')
  # Using f'' means you can include variables in the string
  # You could instead do print('Hello %s' % args.who)
  # but I always struggle to remember in what order I wrote things!

if __name__ == "__main__":
    main()

The order you put things in makes a lot of difference. You need to have the if __name__ == "__main__": line after you’ve defined everything else, but then you can put the def main(): wherever you want in that file (as long as it’s before the if __name__). But by having everything in one file, it feels more like those bash scripts I was talking about before. You can have imports (a bit like calling out to other shell scripts) and use those functions and classes in your code, but for the “simple” shell scripts, this makes most sense.

So what else do we do in Shell scripts?

Running commands

This is a class in its own right. You can pass a class around in a variable, but it has functions and properties of its own. It’s a bit chunky, but it handles one of the biggest issues I have with bash scripts – capturing both the “normal” output (stdout) and the “error” output (stderr) without needing to put that into an external file you can read later to work out what you saw, as well as storing the return, exit or error code.

# Add these extra imports
import os
import subprocess

class RunCommand:
    command = ''
    cwd = ''
    running_env = {}
    stdout = []
    stderr = []
    exit_code = 999

    def __init__(
      self,
      command: list = [], 
      cwd: str = None,
      env: dict = None,
      raise_on_error: bool = True
    ):
        self.command = command
        self.cwd = cwd
        
        self.running_env = os.environ.copy()

        if env is not None and len(env) > 0:
            for env_item in env.keys():
                self.running_env[env_item] = env[env_item]

        logger.debug(f'exec: {" ".join(command)}')

        try:
            result = subprocess.run(
                command,
                cwd=cwd,
                capture_output=True,
                text=True,
                check=True,
                env=self.running_env
            )
            # Store the result because it worked just fine!
            self.exit_code = 0
            self.stdout = result.stdout.splitlines()
            self.stderr = result.stderr.splitlines()
        except subprocess.CalledProcessError as e:
            # Or store the result from the exception(!)
            self.exit_code = e.returncode
            self.stdout = e.stdout.splitlines()
            self.stderr = e.stderr.splitlines()

        # If verbose mode is on, output the results and errors from the command execution
        if len(self.stdout) > 0:
            logger.debug(f'stdout: {self.list_to_newline_string(self.stdout)}')
        if len(self.stderr) > 0:
            logger.debug(f'stderr: {self.list_to_newline_string(self.stderr)}')

        # If it failed and we want to raise an exception on failure, record the command and args
        # then Raise Away!
        if raise_on_error and self.exit_code > 0:
            command_string = None
            args = []
            for element in command:
                if not command_string:
                    command_string = element
                else:
                    args.append(element)

            raise Exception(
                f'Error ({self.exit_code}) running command {command_string} with arguments {args}\nstderr: {self.stderr}\nstdout: {self.stdout}')

    def __repr__(self) -> str: # Return a string representation of this class
        return "\n".join(
            [
               f"Command: {self.command}",
               f"Directory: {self.cwd if not None else '{current directory}'}",
               f"Env: {self.running_env}",
               f"Exit Code: {self.exit_code}",
               f"nstdout: {self.stdout}",
               f"stderr: {self.stderr}" 
            ]
        )

    def list_to_newline_string(self, list_of_messages: list):
        return "\n".join(list_of_messages)

So, how do we use this?

Well… you can do this: prog = RunCommand(['ls', '/tmp', '-l']) with which we’ll get back the prog object. If you literally then do print(prog) it will print the result of the __repr__() function:

Command: ['ls', '/tmp', '-l']
Directory: current directory
Env: <... a collection of things from your environment ...>
Exit Code: 0
stdout: total 1
drwx------ 1 root  root  0 Jan 1 01:01 somedir
stderr:

But you can also do things like:

for line in prog.stdout:
  print(line)

or:

try:
  prog = RunCommand(['false'], raise_on_error=True)
except Exception as e:
  logger.error(e)
  exit(1)

Putting it together

So, I wrote all this up into a git repo that you’re more than welcome to take your own inspiration from! It’s licensed under an exceptionally permissive license, so you can take it and use it without credit, but if you want to credit me in some way, feel free to point to this blog post, or the git repo, which would be lovely of you.

Github: JonTheNiceGuy/python_shell_script_template

Featured image is “The Conch” by “Kurtis Garbutt” on Flickr and is released under a CC-BY license.

Alan Pope: Where are Podcast Listener Communities

Parasocial chat

On Linux Matters we have a friendly and active, public Telegram channel linked on our Contact page, along with a Discord Channel. We also have links to Mastodon, Twitter (not that we use it that much) and email.

At the time of writing there are roughly this ⬇️ number of people (plus bots, sockpuppets and duplicates) in or following each Linux Matters “official” presence:

Channel Number
Telegram 796
Discord 683
Mastodon 858
Twitter 9919

Preponderance of chat

We chose to have a presence in lots of places, but primarily the talent presenters (Martin, Mark, and myself (and Joe)) only really hang out to chat on Telegram and Mastodon.

I originally created the Telegram channel on November 20th, 2015, when we were publishing the Ubuntu Podcast (RIP in Peace) A.K.A. Ubuntu UK Podcast. We co-opted and renamed the channel when Linux Matters launched in 2023.

Prior to the channel’s existence, we used the Ubuntu UK Local Community (LoCo) Team IRC channel on Freenode (also, RIP in Peace).

We also re-branded our existing Mastodon accounts from the old Ubuntu Podcast to Linux Matters.

We mostly continue using Telegram and Mastodon as our primary methods of communication because on the whole they’re fast, reliable, stay synced across devices, have the features we enjoy, and at least one of them isn’t run by a weird billionaire.

Other options

We link to a lot of other places at the top of the Linux Matters home page, where our listeners can chat, mostly to each other and not us.

Being over 16, I’m not a big fan of Discord, and I know Mark doesn’t even have an account there. None of us use Twitter much anymore, either.

Periodically I ponder if we (Linux Matters) should use something other than Telegram. I know some listeners really don’t like the platform, but prefer other places like Signal, Matrix or even IRC. I know for sure some non-listeners don’t like Telegram, but I care less about their opinions.

Part of the problem is that I don’t think any of us really enjoy the other realtime chat alternatives. Both Matrix and Signal have terrible user experience, and other flaws. Which is why you don’t tend to find us hanging out in either of those places.

There are further options I haven’t even considered, like Wire, WhatsApp, and likely more I don’t even know or care about.

So we kept using Telegram over any of the above alternative options.

Pondering Posting Polls

I have repeatedly considered asking the listeners about their preferred chat platforms via our existing channels. But that seems flawed, because we use what we like, and no matter how many people prefer something else, we’re unlikely to move. Unless something strange happens 👀 .

Plus, often times, especially on decentralised platforms, the audience can be somewhat “over-enthusiastic” about their preferred way being The Way™️ over the alternatives. It won’t do us any favours to get data saying 40% report we should use Signal, 40% suggest Matrix and 20% choose XMPP, if the four of us won’t use any of them.

Pursue Podcast Palaver Proposals

So rather than ask our audience, I thought I’d see what other podcasters promote for feedback and chatter on their websites.

I picked a random set from shows I have heard of, and may have listened to, plus a few extra ones I haven’t. None of this is endorsement or approval, I wanted the facts, just the fax, ma’am.

I collated the data in a json file for some reason, then generated the tables below. I don’t know what to do with this information, but it’s a bit of data we may use if we ever decide to move away from Telegram.

Presenting Pint-Sized Payoff

The table shows some nerdy podcasts along with their primary means (as far as I can tell) of community engagement. Data was gathered manually from podcast home pages and “about” pages. I generally didn’t go into the page content for each episode. I made an exception for “Dot Social” and “Linux OTC” because there’s nothing but episodes on their home page.

It doesn’t matter for this research, I just thought it was interesting that some podcasters don’t feel the need to break out their contact details to a separate page, or make it more obvious. Perhaps they feel that listeners are likely to be viewing an episode page, or looking at a specific show metadata, so it’s better putting the contact details there.

I haven’t included YouTube, where many shows publish and discuss, in addition to a podcast feed.

I am also aware that some people exclusively, or perhaps primarily publish on YouTube (or other video platforms). Those aren’t podcasts IMNSHO.

Key to the tables below. Column names have been shortened because it’s a w i d e table. The numbers indicate how many podcasts use that communication platform.

  • EM - Email address (13/18)
  • MA - Mastodon account (9/18)
  • TW - Twitter account (8/18)
  • DS - Discord server (8/18)
  • TG - Telegram channel (4/18)
  • IR - IRC channel (5/18)
  • DW - Discourse website (2/18)
  • SK - Slack channel (3/18)
  • LI - LinkedIn (2/18)
  • WF - Web form (2/18)
  • SG - Signal group (3/18)
  • WA - WhatsApp (1/18)
  • FB - FaceBook (1/18)

Linux

Show EM MA TW DS TG IR DW SK MX LI WF SG WA FB
Linux Matters ✅ ✅ ✅ ✅ ✅ ✅
Ask The Hosts ✅ ✅ ✅ ✅ ✅
Destination Linux ✅ ✅ ✅ ✅ ✅
Linux Dev Time ✅ ✅ ✅ ✅ ✅
Linux After Dark ✅ ✅ ✅ ✅ ✅
Linux Unplugged ✅ ✅ ✅ ✅
This Week in Linux ✅ ✅ ✅ ✅ ✅
Ubuntu Security Podcast ✅ ✅ ✅ ✅ ✅
Linux OTC ✅ ✅ ✅

Open Source Adjunct

Show EM MA TW DS TG IR DW SK MX LI WF SG WA FB
2.5 Admins ✅ ✅
Bad Voltage ✅ ✅ ✅ ✅
Coffee and Open Source ✅
Dot Social ✅ ✅
Open Source Security ✅ ✅ ✅
localfirst.fm ✅

Other Tech

Show EM MA TW DS TG IR DW SK MX LI WF SG WA FB
ATP ✅ ✅ ✅ ✅
BBC Newscast ✅ ✅ ✅
The Rest is Entertainment ✅

Point

Not entirely sure what to do with this data. But there it is.

Is Linux Matters going to move away from Telegram to something else? No idea.

Alun Jones: Messing with web spiders

Yesterday I read a Mastodon posting. Someone had noticed that their web site was getting huge amounts of traffic. When they looked into it, they discovered that it was OpenAI's - about 422 words

Alan Pope: Windows 3.11 on QEMU 5.2.0

This is mostly an informational PSA for anyone struggling to get Windows 3.11 working in modern versions of QEMU. Yeah, I know, not exactly a massively viral target audience.

Anyway, short answer, use QEMU 5.2.0 from December 2020 to run Windows 3.11 from November 1993.

Windows 3.11, at 1280x1024, running Internet Explorer 5, looking at a GitHub issue

An innocent beginning

I made a harmless jokey reply to a toot from Thom at OSNews, lamenting the lack of native Mastodon client for Windows 3.11.

When I saw Thom’s toot, I couldn’t resist, and booted a Windows 3.11 VM that I’d installed six weeks ago, manually from floppy disk images of MSDOS and Windows.

I already had Lotus Organiser installed to post a little bit of nostalgia-farming on threads - it’s what they do over there.

I thought it might be fun to post a jokey diary entry. I hurriedly made my silly post five minutes after Thom’s toot, expecting not to think about this again.

Incorrect, brain

I shut the VM down, then went to get coffee, chuckling to my smart, smug self about my successful nerdy rapid-response. While the kettle boiled, I started pondering - “Wait, if I really did want to make a Mastodon client for Windows 3.11, how would I do it?”

I pondered and dismissed numerous shortcuts, including, but not limited to:

  • Fake it with screenshots doctored in MS Paint
  • Run an existing DOS Mastodon Client in a Window
  • Use the Windows Telnet client to connect insecurely to my laptop running the Linux command-line Mastodon client, Toot
  • Set up a proxy through which I could get to a Mastodon web page

I pondered a different way, in which I’d build a very simple proof of concept native Windows client, and leverage the Mastodon API. I’m not proficient in (m)any programming languages, but felt something like Turbo Pascal was time-appropriate and roughly within my capabilities.

Diversion

My mind settled on Borland Delphi, which I’d never used, but looked similar enough for a silly project to Borland Turbo Pascal 7.0 for DOS, which I had. So I set about installing Borland Delphi 1.0 from fifteen (virtual) floppy disks, onto my Windows 3.11 “Workstation” VM.

Windows 3.11, with a Borland Delphi window open

Thank you, whoever added the change floppy0 option to the QEMU Monitor. That saved a lot of time, and the process was reduced to a loop of this, fourteen times:

"Please insert disk 2"
CTRL+ALT+2
(qemu) change floppy0 Disk02.img
CTRL+ALT+1
[ENTER]

During my research for this blog, I found a delightful, nearly decade-old video of David Intersimone (“David I”) running Borland Delphi 1 on Windows 3.11. David makes it all look so easy. Watch this to get a moving-pictures-with-sound idea of what I was looking at in my VM.

Once Delphi was installed, I started pondering the network design. But that thought wasn’t resident in my head for long, because it was immediately replaced with the reason why I didn’t use that Windows 3.11 VM much beyond the original base install.

The networking stack doesn’t work. Or at least, it didn’t.

That could be a problem.

Retro spelunking

I originally installed the VM by following this guide, which is notable as having additional flourishes like mouse, sound, and SVGA support, as well as TCP/IP networking. Unfortunately I couldn’t initially get the network stack working as Windows 3.11 would hang on a black screen after the familiar OS splash image.

Looking back to my silly joke, those 16-bit Windows-based Mastodon dreams quickly turned to dust when I realised I wouldn’t get far without an IP address in the VM.

Hopes raised

After some digging in the depths of retro forums, I stumbled on a four year-old repo maintained by Jaap Joris Vens.

Here’s a fully configured Windows 3.11 machine with a working internet connection and a load of software, games, and of course Microsoft BOB 🤓

Jaap Joris published this ready-to-go Windows 3.11 hard disk image for QEMU, chock full of games, utilities, and drivers. I thought that perhaps their image was configured differently, and thus worked.

However, after downloading it, I got the same “black screen after splash” as with my image. Other retro enthusiasts had the same issue, and reported the details on this issue, about a year ago.

does not work, black screen.

It works for me and many others. Have you followed the instructions? At which point do you see the black screen?

The key to finding the solution was a comment from Jaap Joris pointing out that the disk image “hasn’t changed since it was first committed 3 years ago”, implying it must have worked back then, but doesn’t now.

Joy of Open Source

I figured that if the original uploader had at least some success when the image was created and uploaded, it is indeed likely QEMU or some other component it uses may have (been) broken in the meantime.

So I went rummaging in the source archives, looking for the most recent release of QEMU, immediately prior to the upload. QEMU 5.2.0 looked like a good candidate, dated 8th December 2020, a solid month before 18th January 2021 when the hda.img file was uploaded.

If you build it, they will run

It didn’t take long to compile QEMU 5.2.0 on my ThinkPad Z13 running Ubuntu 24.04.1. It went something like this. I presumed that getting the build dependencies for whatever is the current QEMU version, in the Ubuntu repo today, will get me most of the requirements.

$ sudo apt-get build-dep qemu
$ mkdir qemu
$ cd qemu
$ wget https://download.qemu.org/qemu-5.2.0.tar.xz
$ tar xvf qemu-5.2.0.tar.xz
$ cd qemu-5.2.0
$ ./configure
$ make -j$(nproc)

That was pretty much it. The build ran for a while, and out popped binaries and the other stuff you need to emulate an old OS. I copied the bits required directly to where I already had put Jaap Joris’ hda.img and start script.

$ cd build
$ cp qemu-system-i386 efi-rtl8139.rom efi-e1000.rom efi-ne2k_pci.rom kvmvapic.bin vgabios-cirrus.bin vgabios-stdvga.bin vgabios-vmware.bin bios-256k.bin ~/VMs/windows-3.1/

I then tweaked the start script to launch the local home-compiled qemu-system-i386 binary, rather than the one in the path, supplied by the distro:

$ cat start
#!/bin/bash
./qemu-system-i386 -nic user,ipv6=off,model=ne2k_pci -drive format=raw,file=hda.img -vga cirrus -device sb16 -display gtk,zoom-to-fit=on

This worked a treat. You can probably make out in the screenshot below, that I’m using Internet Explorer 5 to visit the GitHub issue which kinda renders when proxied via FrogFind by Action Retro.

Windows 3.11, at 1280x1024, running Internet Explorer 5, looking at a GitHub issue

Share…

I briefly toyed with the idea of building a deb of this version of QEMU for a few modern Ubuntu releases, and throwing that in a Launchpad PPA then realised I’d need to make sure the name doesn’t collide with the packaged QEMU in Ubuntu.

I honestly couldn’t be bothered to go through the pain of effectively renaming (forking) QEMU to something like OLDQEMU so as not to damage existing installs. I’m sure someone could do it if they tried, but I suspect it’s quite a search and replace, or move the binaries somewhere under /opt. Too much effort for my brain.

I then started building a snap of qemu as oldqemu - which wouldn’t require any “real” forking or renaming. The snap could be called oldqemu but still contain qemu-system-i386 which wouldn’t clash with any existing binaries of the same name as they’d be self-contained inside the compressed snap, and would be launched as oldqemu.qemu-system-i386.

That would make for one package to maintain rather than one per release of Ubuntu. (Which is, as I am sure everyone is aware, one of the primary advantages of making snaps instead of debs in the first place.)

Anyway, I got stuck with another technical challenge in the time I allowed myself to make the oldqemu snap. I might re-visit it, especially as I could leverage the Launchpad Build farm to make multiple architecture builds for me to share.

…or not

In the meantime, the instructions are above, and also (roughly) in the comment I left on the issue, which has kindly been re-opened.

Now, about that Windows 3.11 Mastodon client…

Alan Pope: Virtual Zane Lowe for Spotify

tl;dr

I bodged together a Python script using Spotipy (not a typo) to feed me #NewMusicDaily in a Spotify playlist.

No AI/ML, all automated, “fresh” tunes every day. Tunes that I enjoy get preserved in a Keepers playlist; those I don’t like get relegated to the Sleepers playlist.

Any tracks older than eleven days are deleted from the main playlist, so I automatically get a constant flow of new stuff.

My personal Zane Lowe in a box

Nutshell

  1. The script automatically populates this Virtual Zane Lowe playlist with semi-randomly selected songs that were released within the last week or so, no older (or newer).
  2. I listen (exclusively?) to that list for a month, signaling songs I like by hitting a button on Spotify.
  3. Every day, the script checks for ’expired’ songs whose release date has passed by more than 11 days.
  4. The script moves songs I don’t like to the Sleepers playlist for archival (and later analysis), and to stop me hearing them.
  5. It moves songs I do like to the Keepers playlist, so I don’t lose them (and later analysis).
  6. Goto 1.

I can run the script at any time to “top up” the playlist or just let it run regularly to drip-feed me new music, a few tracks at a time.

Clearly, once I have stashed some favourites away in the Keepers pile, I can further investigate those artists, listen to their other tracks, and potentially discover more new music.

Below I explain at some length how and why.

NoCastAuGast

I spent an entire month without listening to a single podcast episode in August. I even unsubscribed from everything and deleted all the cached episodes.

Aside: Fun fact: The Apple Podcasts app really doesn’t like being empty and just keeps offering podcasts it knows I once listened to despite unsubscribing. Maybe I’ll get back into listening to these shows again, but music is on my mind for now.

While this is far from a staggering feat of human endeavour in the face of adversity, it was a challenge for me, given that I listened to podcasts all the time. This has been detailed in various issues of my personal email newsletter, which goes out on Fridays and is archived to read online or via RSS.

In August, instead, I re-listened to some audio books I previously enjoyed and re-listened to a lot of music already present on my existing Spotify playlists. This became a problem because I got bored with the playlists. Spotify has an algorithm that can feed me their idea of what I might want, but I decided to eschew their bot and make my own.

Note: I pay for Spotify Premium, then leveraged their API and built my “application” against that platform. I appreciate some people have Strong Opinions™️ about Spotify. I have no plans to stop using Spotify anytime soon. Feel free to use whatever music service you prefer, or self-host your 64-bit, 192 kHz Hi-Res Audio from HDTracks through an Elipson P1 Pre-Amp & DAC and Cary Audio Valve MonoBlok Power Amp in your listening room. I don’t care.

I’ll be here, listening on my Apple AirPods, or blowing the cones out of my car stereo. Anyway…

I spent the month listening to great (IMHO) music, predominantly released in the (distant) past on playlists I chronically mis-manage. On the other hand, my son is an expert playlist curator, a skill he didn’t inherit from me. I suspect he “gets the aux” while driving with friends, partly due to his Spotify playlist mastery.

As I’m not a playlist charmer, I inevitably got bored of the same old music during August, so I decided it was time for a change. During the month of September, my goal is to listen to as much new (to me) music as I can and eschew the crusty playlists of 1990s Brit-pop and late-70s disco.

How does one discover new music though?

Novel solutions

I wrote a Python script.

Hear me out. Back in the day, there was an excellent desktop music player for Linux called Banshee. One of the great features Banshee users loved was “Smart Playlists.” This gave users a lot of control over how a playlist was generated. There was no AI, no cloud, just simple signals from the way you listen to music that could feed into the playlist.

Watch a youthful Jorge Castro from 13 years ago do a quick demo.

Jorge Demonstrating the awesome power of Smart Playlists in Banshee (RIP in Peace)

Aside: Banshee was great, as were many other Mono applications like Tomboy and F-Spot. It’s a shame a bunch of blinkered, paranoid, noisy, and wrong Linux weirdos chased the developers away, effectively killing off those excellent applications. Good job, Linux community.

Hey ho. Moving on. Where was I…

Spotify clearly has some built-in, cloud-based “smarts” to create playlists, recommendations, and queues of songs that its engineers and algorithm think I might like. There’s a fly in the ointment, though, and her name is Alexa.

No, Alexa, NO!

We have a “Smart” speaker in the kitchen; the primary music consumers are not me. So “my” listening history is now somewhat tainted by all the Chase Atlantic & Central Cee my son listens to and Michael (fucking) Bublé, my wife, enjoys. She enjoys it so much that Bublé has featured on my end-of-year “Spotify Unwrapped” multiple times.

I’m sure he’s a delightful chap, but his stuff differs from my taste.

I had some ideas to work around all this nonsense. My goals here are two-fold.

  1. I want to find and enjoy some new music in my life, untainted by other house members.
  2. Feed the Spotify algorithm with new (to me) artists, genres and songs, so it can learn what else I may enjoy listening to.

Obviously, I also need to do something to muzzle the Amazon glossy screen of shopping recommendations and stupid questions.

The bonus side-quest is learning a bit more Python, which I completed. I spent a few hours one evening on this project. It was a fun and educational bit of hacking during time I might otherwise use for podcast listening. The result is four hundred or so lines of Python, including comments. My code, like my blog, tends to be a little verbose because I’m not an expert Python developer.

I’m pretty positive primarily professional programmers potentially produce petite Python.

Not me!

Noodling

My script uses the Spotify API via Spotipy to manage an initially empty, new, “dynamic” playlist. In a nutshell, here’s what the python script does with the empty playlist over time:

  • Use the Spotify search API to find tracks and albums released within the last eleven days to add to the playlist. I also imposed some simple criteria and filters.
    • Tracks must be accessible to me on a paid Spotify account in Great Britain.
    • The maximum number of tracks on the playlist is currently ninety-four, so there’s some variety, but not too much as to be unwieldy. Enough for me to skip some tracks I don’t like, but still have new things to listen to.
    • The maximum tracks per artist or album permitted on the playlist is three, again, for variety. Initially this was one, but I felt it’s hard to fully judge the appeal of an artist or album based off one song (not you: Black Lace), but I don’t want entire albums on the list. Three is a good middle-ground.
    • The maximum number of tracks to add per run is configurable and was initially set at twenty, but I’ll likely reduce that and run the script more frequently for drip-fed freshness.
  • If I use the “favourite” or “like” button on any track in the list before it gets reaped by the script after eleven days, the song gets added to a more permanent keepers playlist. This is so I can quickly build a collection of newer (to me) songs discovered via my script and curated by me with a single button-press.
  • Delete all tracks released more than eleven days ago if I haven’t favourited them. I chose eleven days to keep it modern (in theory) and fresh (foreshadowing). Technically, the script does this step first to make room for additional new songs.

None of this is set in stone, but it is configurable with variables at the start of the script. I’ll likely be fiddling with these through September until I get it “right,” whatever that means for me. Here’s a handy cut-out-and-keep block diagram in case that helps, but I suspect it won’t.

 +---------------------------------------------+
 |               Spotify (Cloud)               |
 |                                             |
 |             +-----------------+             |
 |             |  Main Playlist  |             |
 |             +-----------------+             |
 |           Like |           | Dislike        |
 |                v           v                |
 |  +-----------------+  +------------------+  |
 |  | Keeper Playlist |  | Sleeper Playlist |  |
 |  +-----------------+  +------------------+  |
 +----------------------+----------------------+
                        ^
                        |
                        v
 +---------------------------------------------+
 |                Python Script                |
 |         +-------------------------+         |
 |         |    Calls Spotify API    |         |
 |         |    and Manages Songs    |         |
 |         +-------------------------+         |
 +---------------------------------------------+

Next track

The expectation is to run this script automatically every day, multiple times a day, or as often as I like, and end up with a frequently changing list of songs to listen to in one handy playlist. If I don’t like a song, I’ll skip it, and when I do like a song, I’ll likely play it more than once, and maybe click the “Like” icon.

My theory is that the list becomes a mix of between thirty and ninety artists who have released albums over the previous rolling week. After the first test search on Tuesday, the playlist contained 22 tracks, which isn’t enough. I scaled the maximum up over the next few days. It’s now at ninety-four. If I exhaust all the music and get bored of repeats, I can always up the limit to get a few new songs.

In fact, on the very first run of the script, the test playlist completely filled with songs from one artist who had just released a new album. That triggered the implementation of the three-songs-per-artist/album rule, to reduce the chance of that happening.

I appreciate that listening to tracks out of sequence, rather than as a full album, is different from what the artist intended. But thankfully, I don’t listen to a lot of Adele, and the script no longer adds whole albums’ worth of songs to the list. So, no longer a “me” problem.

No AI

I said at the top I’m not using any “AI/ML” in my script, and while that’s true, I don’t control what goes on inside the Spotify datacentre. The script is entirely subject to the whims of the Spotify API as to which tracks get returned to my requests. There are some constraints to the search API query complexity, and limits on what the API returns.

The Spotify API documentation has been excellent so far, as has the Spotipy docs.

Popular songs and artists often organically feature prominently in the API responses. Plus (I presume) artists and labels have financial incentives or an active marketing campaign with Spotify, further skewing search results. Amusingly, the API has an optional “hipster” tag to show the bottom 10% of results (ranked by popularity). I did that once, didn’t much like it, and won’t do it again.

It’s also subject to the music industry publishing music regularly, and licensing it to be streamed via Spotify where I live.

Not quite

With the script as-is, initially, I did not get fresh new tunes every single day as expected, so I had a further fettle to increase my exposure to new songs beyond what’s popular, trending, or tagged “new”. I changed the script to scan the last year of my listening habits to find genres of music I (and the rest of the family) have listened to a lot.

I trimmed this list down (to remove the genre taint) and then fed these genres to the script. It then randomly picks a selection of those genres and queries the API for new releases in those categories.
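
A rough sketch of that genre-seeding step, again with Spotipy: the top-artists endpoint (scope user-top-read) stands in for “the last year of my listening habits”, and how well the genre: and tag:new search filters combine here is an assumption rather than something I can promise.

import random

import spotipy
from spotipy.oauth2 import SpotifyOAuth

sp = spotipy.Spotify(auth_manager=SpotifyOAuth(scope="user-top-read"))

def genres_from_history(keep=None):
    """Collect genres from my most-listened artists; optionally trim the set
    by hand to remove the household genre taint."""
    top = sp.current_user_top_artists(time_range="long_term", limit=50)
    genres = {g for artist in top["items"] for g in artist["genres"]}
    return sorted(genres if keep is None else genres & set(keep))

def new_albums_for_genres(genres, picks=3, limit=10):
    """Randomly pick a few genres and ask the search API for new releases."""
    albums = []
    for genre in random.sample(genres, k=min(picks, len(genres))):
        found = sp.search(q=f'genre:"{genre}" tag:new', type="album",
                          market="GB", limit=limit)
        albums.extend(found["albums"]["items"])
    return albums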

With these tweaks, I certainly think this script and the resulting playlist are worth listening to. It’s fresher and more dynamic than the 14-year-old playlist I currently listen to. Overall, the script works so that I now see songs and artists I’ve not listened to—or even heard of—before. Mission (somewhat) accomplished.

Indeed, with the genres feature enabled, I could add a considerable amount of new music to the list, but I am trying to keep it a manageable size, under a hundred tracks. Thankfully, I don’t need to worry about the script pulling “Death Metal,” “Rainy Day,” and “Disney” categories out of thin air because I can control which ones get chosen. Thus, I can coerce the selection while allowing plenty of randomness and newness.

I have limited the number of genre-specific songs so I don’t get overloaded with one music category over others.

Not new

There are a couple of wrinkles. One song that popped into the playlist this week is “Never Going Back Again” by Fleetwood Mac, recorded live at The Forum, Inglewood, in 1982. That’s older than the majority of what I listened to in all of August! It looks like Warner Records Inc. released that live album on 21st August 2024, well within my eleven-day boundary, so it’s technically within “The Rules” while also not being fresh, new music.

There’s also the compilation complication. Unfresh songs from the past re-released on “TOP HITS 2024” or “DANCE 2024 100 Hot Tracks” also appeared in my search criteria. For example, “Talk Talk” by Charli XCX, from her “Brat” album, released in June, is on the “DANCE 2024 100 Hot Tracks” compilation, released on 23rd August 2024, again, well within my eleven-day boundary.

I’m in two minds about these time-travelling playlist interlopers. I have never knowingly listened to Charli XCX’s “Brat” album by choice, nor have I heard live versions of Fleetwood Mac’s music. I enjoy their work, but it goes against the “new music” goal. But it is new to me, which is the whole point of this exercise.

The further problem with compilations is that they contain music by a variety of artists, so they don’t hit the “max-per-artist” limit but will hit the “max-per-album” rule. However, if the script finds multiple newly released compilations in one run, I might end up with a clutch of random songs spread over numerous “Various Artists” albums, maxing out the playlist with literal “filler.”

I initially allowed compilations, but I’m irrationally bothered that one day, the script will add “The Birdie Song” by Black Lace as part of “DEUTSCHE TOP DISCO 3000 POP GEBURTSTAG PARTY TANZ SONGS ZWANZIG VIERUNDZWANZIG”.

Nein.

I added a filter to omit any “album type: compilation,” which knocks that bopping-bird-based botherer squarely on the bonce.

No more retro Europop compilation complications in my playlist. Alles klar.

Not yet

Something else I had yet to consider is that some albums have release dates in the future. Like a fresh-faced newborn baby with an IDE and API documentation, I assumed that albums published would generally have release dates of today or older. There may be a typo in the release_date field, or maybe stuff gets uploaded and made public ahead of time in preparation for a big marketing push on release_date.

I clearly do not understand the music industry or publishing process, which is fine.
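
Both of those filters boil down to a couple of fields on the Spotify album object (album_type, release_date and release_date_precision). Here’s a sketch of a predicate covering the compilation rule and the future-dates wrinkle; the helper name is made up for illustration:

import datetime

def album_is_wanted(album):
    """Skip compilations, and skip anything whose release date is still in the future."""
    if album.get("album_type") == "compilation":
        return False  # no retro Europop compilation complications
    # release_date is "2024", "2024-08" or "2024-08-21" depending on precision.
    formats = {"year": "%Y", "month": "%Y-%m", "day": "%Y-%m-%d"}
    fmt = formats[album.get("release_date_precision", "day")]
    released = datetime.datetime.strptime(album["release_date"], fmt).date()
    return released <= datetime.date.today()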

Nuke it from orbit

I’ve been testing the script this week while prototyping it, leading up to the “Grand Launch” in September 2024 (next month/week). At the end of August I will wipe the slate (playlist) clean, and start again on 1st September with whatever rules and optimisations I’ve concocted this week. It will almost certainly re-add some of the same tracks after the 31st August “Grand Purge”, but that’s expected, working as designed. The rest will be pseudo-random genre-specific tracks.

I hope.

Newsletter

I will let this thing go mad each day with the playlist and regroup at the end of September to evaluate how this scheme is going. Expect a follow-up blog post detailing whether this was a fun and interesting excursion or pure folly. Along the way, I did learn a bit more Python, the Spotify API, and some other interesting stuff about music databases and JSON.

So it’s all good stuff, whether I enjoy the music or not.

You can get further, more timely updates in my weekly email newsletter, or view it in the newsletter archive, and via RSS, a little later.

Ken said he got “joy out of reading your newsletter”. YMMV. E&OE. HTH. HAND.

Nomenclature

Every good project needs a name. I initially called it my “Personal Dynamic Playlist of Sixty tracks over Eleven days,” or PDP-11/60 for short, because I’m a colossal nerd. Since bumping the max-tracks limit for the playlist, it could be re-branded PDP-11/94. However, this is a relatively niche and restrictive playlist naming system, so I sought other ideas.

My good friend Martin coined the term “Virtual Zane Lowe” (Zane is a DJ from New Zealand who is apparently renowned for sharing new music). That’s good enough for me. Below are links to all three playlists if you’d like to listen, laugh, live, love, or just look at them.

The “Keepers” and “Sleepers” lists will likely be relatively empty for a few days until the script migrates my preferred and disliked tracks over for safe-keeping & archival, respectively.

November approaches

Come back at the end of the month to see if: my script still works; the selections are good; I’m still listening to this playlist; and, most importantly, whether I enjoy doing so!

If it works, I’ll probably continue using it through October and into November as I commute to and from the office. If that happens, I’ll need to update the playlist artwork. Thankfully, there’s an API for that, too!

I may consider tidying up the script and sharing it online somewhere. It feels a bit niche and requires a paid Spotify account to even function, so I’m not sure what value others would get from it other than a hearty chuckle at my terribad Python “skills.”

One potentially interesting option would be to map the songs in Spotify to another service, such as Apple Music, or even to videos on YouTube. The YouTube API should enable me to manage video playlists that mirror the ones I manage directly on Spotify. That could be a fun further extension to this project.

Another option I considered was converting it to a web app, a service I (and other select individuals) can configure and manage in a browser. I’ll look into that at the end of the month. If the current iteration of the script turns out to be a complete bust, then this idea likely won’t go far, either.

Thanks for reading. AirPods in. Click “Shuffle”.

Alan PopeText Editors with decent Grammar Tools

This is another blog post lifted wholesale out of my weekly newsletter. I do this when I get a bit verbose to keep the newsletter brief. The newsletter is becoming a blog incubator, which I’m okay with.

A reminder about that newsletter

The newsletter is emailed every Friday - subscribe here, and is archived and available via RSS a few days later.

I talked a bit about the process of setting up the newsletter on episode 34 of Linux Matters Podcast. Have a listen if you’re interested.

Linux Matters 34

Patreon supporters of Linux Matters can get the show a day or so early and without adverts.

Multiple kind offers

Good news, everyone! I now have a crack team of fine volunteers who proofread the text that lands in your inbox/browser cache/RSS reader. Crucially, they’re doing that review before I send the mail, not after, as was previously the case. Thank you for volunteering, mystery proofreaders.

popey dreamland

Until now, my newsletter “workflow” (such as it was) involved hoping that I’d get it done and dusted by Friday morning. Then, ideally, it would spend some time “in review”, followed by saving to disk. But if necessary, it would be ready to be opened in an emergency text editor at a moment’s notice before emails were automatically sent by lunchtime.

I clearly don’t know me very well.

popey reality

What actually happened is that I would continue editing right up until the moment I sent it out, then bash through the various “post-processing” steps and schedule the emails for “5 minutes from now.” Boom! Done.

This often resulted in typos or other blemishes in my less-than-lovingly crafted emails to fabulous people. A few friends would ping me with corrections. But once the emails are sent, reaching out and fixing those silly mistakes is problematic.

Someone should investigate over-the-air updates to your email. Although zero-day patches and DLC for your inbox sound horrendous. Forget that.

In theory, I could tweak the archived version, but that is not straightforward.

Tool refresh?

Aside: Yes, I know it’s not the tools, but I should slow down, be more methodical and review every change to my document before publishing. I agree. Now, let’s move on.

While preparing the newsletter, I would initially write in Sublime Text (my desktop text editor of choice), with a Grammarly† (affiliate link) LSP extension, to catch my numerous blunders, and re-word my clumsy English.

Unfortunately, the Grammarly extension for Sublime broke a while ago, so I no longer have that available while I prepare the newsletter.

I could use Google Docs, I suppose, where Grammarly still works, augmenting the built-in spell and grammar checker. But I really like typing directly as Markdown in a lightweight editor, not a big fat browser. So I guess I need to figure something else out to check my spelling and grammar before the awesome review team gets it, to save at least some of my blushes.

I’m not looking for suggestions for a different text editor—or am I? Maybe I am. I might be.

Sure, that’ll fix it.

ZX81 -> Spectrum -> CPC -> edlin -> Edit -> Notepad -> TextPad -> Sublime -> ?

I’ve used a variety of text editors over the years. Yes, the ZX81 and Sinclair Spectrum count as text editors. Yes, I am old.

I love Sublime’s minimalism, speed, and flexibility. I use it for all my daily work notes, personal scribblings, blog posts, and (shock) even authoring (some) code.

I also value Sublime’s data-recovery features. If the editor is “accidentally” terminated or a power-loss event occurs, Sublime reliably recovers its state, retaining whatever you were previously editing.

I regularly use Windows, Linux, and macOS on any given day across multiple computers. So, a cross-platform editor is also essential for me, but only on the laptop/desktop, as I never edit on mobile‡ devices.

I typically just open a folder as a “workspace” in a window or an additional tab in one window. I frequently open many folders, each full of files across multiple displays and machines.

All my notes are saved in folders that use Syncthing to keep in sync across machines. I leave all of those notes open for days, perhaps weeks, so having a robust sync tool combined with an editor that auto-reloads when files change is key.

These notes are separately backed up, so cloud storage isn’t essential for my use case.

Something else?

Whatever else I pick, it’s really got to fit that model and requirements, or it’ll be quite a stretch for me to change. One option I considered and test-drove is NotepadNext. It’s an open-source re-implementation of Notepad++, written in C++ and Qt.

A while back, I packaged up and published it as a snap, to make it easy to install and update. It fits many of the above requirements already, with the bonus of being open-source, but sadly, there is no Grammarly support there either.

I’d prefer no :::: W I D E - L O A D :::: Electron monsters. Also, not Notion or Obsidian, as I’ve already tried them, and I’m not a fan. In addition, no, not Vim or Emacs.

Bonus points if you have a suggestion where one of the selling points isn’t “AI”§.

Perhaps there isn’t a great plain text editor that fulfills all my requirements. I’m open to hearing suggestions from readers of this blog or the newsletter. My contact details are here somewhere.


† - Please direct missives about how terrible Grammarly is to /dev/null. Thanks. Further, suggestions that I shouldn’t rely on Grammarly or other tools and should just “Git Gud” (as the youths say) may be inserted into the A1481 on the floor.

‡ - I know a laptop is technically a “mobile” device.

§ - Yes, I know that “Not wanting AI” and “Wanting a tool like Grammarly” are possibly conflicting requirements.

◇ - For this blog post I copied and pasted the entire markdown source into a Google doc, used all the spelling and grammar tools, then pasted it back into Sublime, pushed to git, and I’m done. Maybe that’s all I need to do? Keep my favourite editor, and do all the grammar in one chunk at the end in a tab of a browser I already had open anyway. Beat that!

Alan PopeApplication Screenshots on macOS

I initially started typing this as a short -[ Contrafibularities ]- segment for my free, weekly newsletter. But it got a bit long, so I broke it out into a blog post instead.

About that newsletter

The newsletter is emailed every Friday - subscribe here, and is archived and available via RSS a few days later. I talked a bit about the process of setting up the newsletter on episode 34 of Linux Matters Podcast. Have a listen if you’re interested.

Linux Matters 34

Patreon supporters of Linux Matters can get the show a day or so early, and without adverts.

Going live!

I have a work-supplied M3 MacBook Pro. It’s a lovely device with ludicrous battery endurance, a great screen and keyboard, and decent connectivity. As an ex-Windows user at work and a predominantly Linux enthusiast at home, I find macOS throws curveballs at me on a weekly basis. This week, screenshots.

I wrote a ‘going live’ shell script for my personal live streams. For the title card, I wanted the script to take a screenshot of the running terminal, Alacritty. I went looking for ways to do this on the command line, and learned that macOS has shipped a screencapture command-line tool for some time now. Amusingly, the man page for it says:

DESCRIPTION
 The screencapture utility is not very well documented to date.
 A list of options follows.

and..

BUGS
 Better documentation is needed for this utility.

This is 100% correct.

How hard can it be?

Perhaps I’m too used to scrot on X11, which I have used for over 20 years. If I want a screenshot of the currently running system, I just run scrot and bang, there’s a PNG in the current directory showing what’s on screen. Easy peasy.

On macOS, run screencapture image.png and you’ll get a screenshot alright, of the desktop, your wallpaper. Not the application windows carefully arranged on top. To me, this is somewhat obtuse. However, it is also possible to screenshot a window, if you know the <windowid>.

From the screencapture man page:

 -l <windowid> Captures the window with windowid.

There appears to be no straightforward way to actually get the <windowid> on macOS, though. So, to discover the <windowid> you might want the GetWindowID utility from smokris (easily installed using Homebrew).

That’s fine and relatively straightforward if there’s only one window for the application, but a tiny bit more complex if the app reports multiple windows - even when there’s only one. Alacritty announces multiple windows, for some reason.

$ GetWindowID Alacritty --list
"" size=500x500 id=73843
"(null)" size=0x0 id=73842
"alan@Alans-MacBook-Pro.local (192.168.1.170) - byobu" size=1728x1080 id=73841

FINE. We can deal with that:

$ GetWindowID Alacritty --list | grep byobu | awk -F '=' '{print $3}'
73841

You may then encounter the mysterious could not create image from window error. This threw me off a little, initially. Thankfully I’m far from the first to encounter this.

Big thanks to this rancher-sandbox, rancher-desktop pull request against their screenshot docs. Through that I discovered there’s a macOS security permission I had to enable, for the terminal application to be able to initiate screenshots of itself.

A big thank-you to both of the above projects for making their knowledge available. Now I have this monstrosity in my script, to take a screenshot of the running Alacritty window:

screencapture -l$(GetWindowID Alacritty --list | \
 grep byobu | \
 awk -F '=' '{print $3}') titlecard.png

If you watch any of my live streams, you may notice the title card. Now you know how it’s made, or at least how the screenshot is created, anyway.

Andy SmithDaniel Kitson – Collaborator (work in progress)

Collaborators

Last night we went to see Daniel Kitson's "Collaborator" (work in progress). I'd no idea what to expect but it was really good!

A photo of the central area of a small theatre in the round. There are four tiers of seating and then an upper balcony. Most seats are filled. The central stage area is empty except for four large stacks of paper.
The in-the-round setup of Collaborator at The Albany Theatre, Deptford, London

It has been reviewed at 4/5 stars in Chortle and positively in the Guardian, but I don't recommend reading any reviews because they'll spoil what you will experience. We went into it blind, as I always prefer that rather than thorough research of a show. I think that was the correct decision. I've been on Daniel's fan newsletter for ages but hadn't had the chance to see him live until now.

While I've seen some comedy gigs that resembled this, I've never seen anything quite like it.

At £12 a ticket this is an absolute bargain. We spent more getting there by public transport!

Shout out to the nerds

If you're a casual comedy enjoyer looking for something a bit different then that's all you need to know. If like me however you consider yourself a bit of a wanky appreciator of comedy as an art form, I have some additional thoughts!

Collaborator wasn't rolling-on-the-floor-in-tears funny, but was extremely enjoyable and Jenny and I spent the whole way home debating how Kitson designed it and what parts of it really meant. Not everyone wants that in comedy, and that's fine. I don't always want it either. But to get it sometimes is a rare treat.

It's okay to enjoy a McIntyre or Peter Kay crowd-pleaser about "do you have a kitchen drawer full of junk?" or "do you remember white dog poo?" but it's also okay to appreciate something that's very meta and deconstructive. Stewart Lee for example is often accused of being smug and arrogant when he critiques the work of other comedians, and his fans to some extent are also accused of enjoying feeling superior more than they enjoy a laugh - and some of them who miss the point definitely are like this.

But acts like Kitson and Lee are constructed personalities where what they claim to think and how they behave is a fundamental part of the performance. You are to some extent supposed to disagree with and be challenged by their views and behaviours — and I don't just mean they are edgelording with some "saying the things that can't be said" schtick. Sometimes it's fun to actually have thoughts about it. It's a different but no less valid (or more valid!) experience. A welcome one in this case!

I mean, I might have to judge you if you enjoy Mrs Brown's Boys, but I accept it has an audience as an art form.

White space

There was a comment on Fedi about how the crowd pictured here appears to be a sea of white faces, despite London being a fairly diverse city. This sort of thing hasn't escaped me. I've found it to be the case in most of the comedy gigs I've attended in person, where the performer is white. I don't know why. In fact, acts like Stewart Lee and Richard Herring will frequently make reference to the fact that their stereotypical audience member is a middle aged white male computer toucher with lefty London sensibilities. So, me then.

Don't get me wrong, I do try to see some diverse acts and have been in a demographic minority a few times. Sadly enough, just going to see a female act can be enough to put you in an audience of mostly women. That happened when we went to see Bridget Christie's Who Am I? ("a menopause laugh a minute with a confused, furious, sweaty woman who is annoyed by everything", 4 stars, Chortle), and it's a shame that people seem to stick in their lanes so much.

References

Andy SmithIncluding remote data in a MediaWiki article

A few months ago I needed to include some data — that was generated and held remotely — into a MediaWiki article.

Here's the solution I chose which enabled me to generate some tables populated with data that only exists in some remote YAML files:

Screenshot of a wiki article that describes three different contact methods: a mailing list, an IRC chat room and a Telegram group. Beside each method is a table of their activity. The mailing list shows 10 messages in the last 30 days. The IRC channel shows 25 messages in the last 30 days. The Telegram group shows 91 messages in the last 30 days.
Screenshot of the Community article showing tables of activity stats
INFO

I did actually do all this back in early April, but as I couldn't read my own blog site at the time I had to set up a new blog before I could write about it! 😀

Background

All the way back in March 2024 I'd decided that BitFolk probably should have some alternative chat venue to its IRC channel, which had been largely silent for quite some time. So, I'd opened a Telegram group and spruced up the Community article on BitFolk's wiki.

When writing about the new thing in the article I got to thinking how I feel when I see a project with a bunch of different contact methods listed.

I'm usually glad to see that a project has ways to contact them that I don't consider awful, but if all the ones that I consider non-awful are actually deserted, barren and disused, then I'd like to be able to decide whether I would actually want to hold my nose and go to a Discord (some service I would ordinarily dislike).

So, it's not just that these things exist — easy to just list off — but I decided I would like to also include some information about how active these things are (or not).

The problem

BitFolk's wiki is a MediaWiki site, so including any sort of dynamic content that isn't already implemented in the software would require code changes or an extension.

The one solution that doesn't involve developing something or using an existing extension would be to put a HTML <iframe> in a template that's set to allow raw HTML. <iframe>s aren't normally allowed in general articles due to the havoc they could cause with a population of untrusted authors, but putting them in templates would be okay since the content they would include could be locked down that way.

The appearance of such a thing, though, is just not very nice without a lot of styling work. That's basically a web site inside a web site. I had a hunch that there would be existing extensions for including structured remote data. And there is!

External_Data extension

The extension I settled on is called External_Data.

Description

Allows for using and displaying values retrieved from various sources: external URLs and SOAP services, local wiki pages and local files (in CSV, JSON, XML and other formats), database tables, LDAP servers and local programs output.

Just what I was looking for!

While this extension can just include plain text, there are other, simpler extensions I could have used if I just wanted to do that. You see, each of the sets of activity stats will have to be generated by a program specific to each service; counting mailing list posts is not like counting IRC messages, and so on.

I wanted to write programs that would store this information in a structured format like YAML and then External_Data would be used to turn each of those remote YAML files into a table.

Example YAML data

I structured the output of my programs like this:

---
bitfolk:
  messages_last_30day: 91
  messages_last_6hour: 0
  messages_last_day: 0
stats_at: 2024-06-29 21:02:03
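
The summarising programs themselves aren't shown here, and each one is specific to its service. Purely to illustrate the shape of one, here's a sketch that assumes you already have the message timestamps (as naive UTC datetimes) from wherever the service keeps them; the file path and function name are made up.

import datetime
import yaml

def write_stats(timestamps, path="tg.yaml", key="bitfolk"):
    """Count recent activity and write it out in the YAML shape shown above."""
    now = datetime.datetime.utcnow()
    def count_since(hours):
        cutoff = now - datetime.timedelta(hours=hours)
        return sum(1 for t in timestamps if t >= cutoff)
    stats = {
        key: {
            "messages_last_6hour": count_since(6),
            "messages_last_day": count_since(24),
            "messages_last_30day": count_since(24 * 30),
        },
        "stats_at": now.strftime("%Y-%m-%d %H:%M:%S"),
    }
    with open(path, "w") as f:
        yaml.safe_dump(stats, f, default_flow_style=False)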

Markup in the wiki article

In the wiki article that is formatted like this:

{{#get_web_data:url=https://ruminant.bitfolk.com/social-stats/tg.yaml
|format=yaml
|data=bftg_stats_at=stats_at,bftg_last_6hour=messages_last_6hour,bftg_last_day=messages_last_day,bftg_last_30day=messages_last_30day
}}

{| class="wikitable" style="float:right; width:25em; margin:1em"
|+ Usage stats as of {{#external_value:bftg_stats_at}} GMT
|-
!colspan="3" | Messages in the last…
|-
! 6 hours || 24 hours || 30 days
|- style="text-align:center"
| {{#external_value:bftg_last_6hour}}
| {{#external_value:bftg_last_day}}
| {{#external_value:bftg_last_30day}}
|}

How it works

  1. Data is requested from a remote URL (https://ruminant.bitfolk.com/social-stats/tg.yaml).
  2. It's parsed as YAML.
  3. Variables from the YAML are stored in variables in the article, e.g. bftg_stats_at is set to the value of stats_at from the YAML.
  4. A table in wiki syntax is made and the data inserted in to it with directives like {{#external_value:bftg_stats_at}}.

This could obviously be made cleaner by putting all the wiki markup in a template and just calling that with the variables.

Wrinkle: MediaWiki's caching

MediaWiki caches quite aggressively, which makes a lot of sense: it's expensive for some PHP to request wiki markup out of a database and convert it into HTML every time when it almost certainly hasn't changed since the last time someone looked at it. But that frustrates what I'm trying to do here. The remote data does update and MediaWiki doesn't know about that!

In theory it looks like it is possible to adjust cache times per article (or even per remote URI) but I didn't have much success getting that to work. It is possible to force an article's cache to be purged with just a POST request though, so I solved the problem by having each of my activity summarising programs issue such a request when their job is done. This will do it:

curl -s -X POST 'https://tools.bitfolk.com/wiki/Community?action=purge'

They only run once an hour anyway, so it's not a big deal.
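
In code that's one extra call at the end of each run. A sketch with Python's requests (assumed to be available to the summarising programs), hitting the same URL as the curl command above:

import requests

# Equivalent of the curl command above: ask MediaWiki to purge the article's
# parser cache once the fresh YAML is in place.
requests.post("https://tools.bitfolk.com/wiki/Community",
              params={"action": "purge"}, timeout=10)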

Concerns?

Isn't it dangerous to allow article authors to include arbitrary remote data?

Yes! The main wiki configuration can have a section added which sets an allowlist of domains or even URI prefixes for what is allowed to be included.

What if the remote data becomes unavailable?

The extension has settings for how stale data can be before it's rejected. In this case it's a trivial use so it doesn't really matter, of course.

Jon SpriggsMounting a damaged #ZFS Pool disk to recover data

TL;DR? zpool import -d /dev/sdb1 -o readonly=on -R /recovery/poolname poolname

I have a pair of Proxmox servers, each with a single ZFS drive attached, with GlusterFS over the top to provide storage to the VMs.

Last week I had a power outage which took both nodes offline. When the power came back on, one node’s system drive had failed entirely and during recovery the second machine refused to restart some of the VMs.

Rather than try to fix things properly, I decided to “Nuke-and-Pave”, a decision I’m now regretting a little!

I re-installed one of the nodes OK, set up the new ZFS drive, set up Gluster and then started transferring the content from the old machine to the new one.

During the file transfer, I saw a couple of messages about failed blocks, and finally got a message from the cluster about how the pool was considered degraded, but as this was largely performed while I was asleep, I didn’t notice until I woke up… when the new node was offline.

I connected a Keyboard and Monitor to the box and saw a kernel panic. I rebooted the node, and during the boot sequence, just after the Systemd service that scanned the ZFS pool, it panicked again.

Unplugging the data drive from the machine and rebooting it, the node came up just fine.

I plugged the drive into my laptop and ran zpool import -d /dev/sdb1 -R /recovery/poolname poolname and my laptop crashed (although, I was running this in GUI mode, so I don’t know if it was a kernel panic or “just” a crash.)

Finally, I ran zpool import -d /dev/sdb1 -o readonly=on -R /recovery/poolname poolname and the drive came up in /recovery/poolname, so I could transfer files off to another drive until I figure out what’s going on!

Once I was done, I ran zfs unmount poolname and was able to detach the disk from the device.

Featured image is “don’t panic orangutan” by “Esperluette” on Flickr and is released under a CC-BY license.

Andy SmithThoughts on commenting facilities for this site

This site, being a static one, presents some challenges with regard to accepting comments from its readers. There's also a bunch of comments that already exist on the legacy site. I have some thoughts about what I should do about this.

TL;DR

I think I will:

  • Try to set up Isso at least for the purpose of importing old comments.
  • Then I'll see what Isso is like generally.
  • If Isso didn't work for importing then I'll try the XML conversion myself.
  • Independently, I'll investigate the Fediverse conversation thing.

Updates

2024-06-24

Isso was implemented and comments from the old Wordpress blog were imported into it. I'm still unsure if I will continue to allow new comments through it though.

The problem

It's a bit tricky to accept comments onto a web site that is running off of static files. JavaScript is basically the only way to do it, for which there are a number of options.

There's a couple of hundred comments on the 300 or so articles that exist on the legacy Wordpress site as well, and at least some of them I think are worth moving over when I get around to moving over an article from there.

Is it really worth having comments?

I mean, the benefits are slim, but they definitely do exist. I've had a few really useful and interesting comments over the years and it would be a shame to do away with that feature even if it does make life a lot easier.

So, conclusions:

  • I should find a way to bring over some if not all of the comments that already exist.
  • I should provide at least one way1 to let people comment on new articles.

Other people's value judgements can and will differ. A lot of this is just going to be, like, my opinion, man.

In that case, what to do about…

Existing comments

I've got an XML export of the legacy blog which includes all the comment data along with the post data. The Wordpress-to-Markdown conversion program that I've used only converts the post body, though, so at the moment none of the articles I've migrated have had their comments brought along with them.

I think it will be enough to also add the existing comment data as static HTML. I don't think there's any real need to make it possible for people to interact with past comments. There's some personal information that commenters may have provided, like what they want to be known as and their web site if any. There has to be a means for that to be deleted upon request, but I think it will be okay to expect such requests to come in by email.

After a casual search I haven't managed to find existing software that will convert the comments in a Wordpress XML export into Markdown or static HTML. I might have missed something though because the search results are filled with a plethora of Wordpress plugins for static site export. One of those might actually be suitable. If you happen to know of something that may be suitable please let me know! I guess that would have to be by email or Fediverse right now (links at the bottom of the page).

It is claimed, however, that Isso (see below) can import comments from a Wordpress XML export!

The comments XML

Comments in the XML export look like this (omitting some uninteresting fields):

<item>
  <wp:post_id>16</wp:post_id>
  <!-- more stuff about the article in here -->
  <wp:comment>
    <wp:comment_id>119645</wp:comment_id>
    <wp:comment_parent>0</wp:comment_parent>
    <wp:comment_author><![CDATA[Greyhound lover]]></wp:comment_author>
    <wp:comment_date><![CDATA[2009-07-08 10:35:14]]></wp:comment_date>
    <wp:comment_content><![CDATA[What a nicely reasoned and well informed article.

The message about Greyhound rescue is getting through, but far too slowly.

I hope your post gets a lot of traffic.

Ray]]></wp:comment_content>
  </wp:comment>
</item>

If I can't find existing software to do it, I think my skills do stretch to using XSLT or something to transform the list of <wp:comment></wp:comment>s into Markdown or HTML for simple inclusion.

Wordpress does comment threading by having the <wp:comment_parent> be non-zero. I think that would be nice to replicate but if my skills end up not being up to it then it will be okay to just have a flat chronological list. I'll keep the data to leave the door open to improving it in future.
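
For what it's worth, a transform like that doesn't have to be XSLT. A sketch using Python's ElementTree could look like the following; the wp: namespace URI is an assumption (it depends on the WXR export version), and it emits a flat list in document order while keeping the parent ID around in case threading gets added later.

import xml.etree.ElementTree as ET

WP = "{http://wordpress.org/export/1.2/}"  # assumed WXR 1.2 namespace

def comments_to_markdown(export_path):
    """Pull every comment out of a WordPress export and emit simple Markdown."""
    tree = ET.parse(export_path)
    chunks = []
    for item in tree.getroot().iter("item"):
        post_id = item.findtext(f"{WP}post_id")
        for c in item.iter(f"{WP}comment"):
            author = c.findtext(f"{WP}comment_author", default="Anonymous")
            date = c.findtext(f"{WP}comment_date", default="")
            parent = c.findtext(f"{WP}comment_parent", default="0")
            body = c.findtext(f"{WP}comment_content", default="").strip()
            chunks.append(f"### {author} ({date})\n\n{body}\n\n"
                          f"<!-- post {post_id}, parent comment {parent} -->\n")
    return "\n".join(chunks)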

I haven't decided yet if it will be more important to bring over old comments, or to figure out a solution for new comments.

Future comments

All of the options for adding comments to a static site involve JavaScript. Whatever I choose to do, people who want to keep JS disabled are not going to be able to add comments and will just have to make do with email.

I'm aware of a few different options in this space.

Disqus

Just no. Surveillance capitalism.

giscus

giscus stores comments in GitHub discussions. It's got a nice user interface since the UI of GitHub itself is pretty fancy, but does mean that every commenter will require a GitHub account and anonymous comments aren't possible.

There's also utterances, which stores things in GitHub issues, but has fewer features than giscus and the same major downsides.

I am Not A Fan of requiring people to use GitHub.

Hyvor Talk

Hyvor Talk is a closed source paid service that's a bit fancier than giscus.

I'm still not particularly a fan of making people log in to some third party service.

Isso

Isso is a self-hosted open source service that's got quite a nice user interface, permits things like Markdown in comments, and optionally allows anonymous comments so that commenters don't need to maintain an account if they don't want to.

I think this one is a real contender!

Mastodon API

This isn't quite a commenting system, since it doesn't involve directly posting comments.

The idea is that each article has an associated toot ID which is the identifier for a post on a Mastodon server. The Mastodon API is then used to display all Fediverse replies to that post. So:

  1. You post on your Mastodon server about the article.
  2. You take the toot ID of that post and set it in a variable in the article's front matter.
  3. JavaScript on your site is then able to display all the comments on that Fediverse post.
NOTE

In this section I talk about "Fediverse" and "Mastodon".

I'm not an expert on this but my understanding is that Fediverse instances exchange data using the ActivityPub protocol, and Mastodon is a particular implementation of a Fediverse instance.

However, Mastodon's API is unique to itself (and derivative software), so this commenting system would rely on the article author having an account on a Mastodon server. Though, everyone else replying on the Fediverse would not necessarily be using Mastodon on their instances yet their replies would still show up.

The effect is that a Fediverse conversation about your article is placed on your article. This project is an example of such a thing.
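
The Mastodon side of that is a single API call: GET /api/v1/statuses/:id/context returns the ancestors and descendants of a post, and for public posts it needs no authentication. The real implementations do this in JavaScript in the reader's browser; purely to show the shape of the data, here is the same call sketched with Python's requests (the instance and toot ID are placeholders).

import requests

def fetch_replies(instance, toot_id):
    """Fetch the Fediverse replies to a public post via Mastodon's context API."""
    url = f"https://{instance}/api/v1/statuses/{toot_id}/context"
    r = requests.get(url, timeout=10)
    r.raise_for_status()
    for reply in r.json()["descendants"]:
        yield {
            "author": reply["account"]["acct"],
            "url": reply["url"],
            "html": reply["content"],  # sanitise before embedding anywhere
        }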

Of course, not everyone has a Fediverse account and not everyone wants one, but at least anyone potentially can, without having to deal with some central third party. And if no existing Fediverse instance suits them then they can set up their own. It's a decentralised solution.

The extremely niche nature of the Fediverse is pretty stark:

Active users:

  • Fediverse: ~1 million as of June 2024.
  • GitHub: ~100 million as of January 2023 (unclear how "active" is defined).

Fediverse comments are also basically just plain text and links. Unfortunately no way to express yourself better with Markdown or other styling.

Nevertheless, I like it. I think I want to pursue this. Maybe in combination with Isso, if that doesn't get too noisy.


1

"Email" doesn't count!

Jon SpriggsMaking .bashrc more manageable

How many times have you seen an instruction in a setup script which says “Now add source <(somescript completion bash) to your ~/.bashrc file” or “Add export SOMEVAR=abc123 to your .bashrc file”?

This is great when it’s one or two lines, but for a big chunk of them? Whew!

Instead, I created this block in mine:

if [ -d ~/.bash_extensions.d ]; then
    for extension in ~/.bash_extensions.d/[a-zA-Z0-9]*
    do
        . "$extension"
    done
fi

This dynamically loads all the files in ~/.bash_extensions.d/ which start with a letter or a digit, so it means I can manage when things get loaded in, or removed from my bash shell.

For example, I recently installed the pre-release of Atuin, so my ~/.bash_extensions.d/atuin file looks like this:

source $HOME/.atuin/bin/env
eval "$(atuin init bash --disable-up-arrow)"

And when I installed direnv, I created ~/.bash_extensions.d/direnv which has this in it:

eval "$(direnv hook bash)"

This is dead simple, and now I know that if I stop using direnv, I just need to remove that file, rather than hunting for a line in .bashrc.

Featured image is “Gears gears cogs bits n pieces” by “Les Chatfield” on Flickr and is released under a CC-BY license.

Josh HollandEven more on git scratch branches: using Jujutsu

Even more on git scratch branches: using Jujutsu

This is the third post in an impromptu series:

  1. Use a scratch branch in git
  2. More on git scratch branches: using stgit

It seems the main topic of this blog is now git scratch branches and ways to manage them, although the main prompt for this one is discovering someone else had exactly the same idea, as I found from a blog post extolling Jujutsu.

I don’t have much to add to the posts from qword and Sandy, beyond the fact that Jujutsu really is the perfect tool to make this workflow straightforward. The default change selection logic in jj rebase means that 9 times out of 10 it’s enough just to run jj rebase -d master to get everything up to date with the master branch, and the Jujutsu workflow as a whole really is a great experience.

So go forth, use Jujutsu to manage your dev branch, and hopefully I’ll never have to write another post on this, and you can have the traditional “I rewrote my blogging engine from scratch again” post that I’ve been owing for a month or two now.

Chris WallaceNon-Eulerian paths

I’ve been doing a bit of work on Non-Eulerian paths.  I haven't made any algorithmic progress...

Chris WallaceMore Turtle shapes

I’ve come across some classic curves using arcs which can be described in my variant of Turtle...

Chris WallaceMy father in Tairua

My father in Tairua in 1929. Paku is in the background. My father, Francis Brabazon Wallace came...

Chris Wallace“Characters” a Scroll Saw Project

Now that the Delta Scroll saw is working, I was looking for a project to build up my skill. Large...

Phil SpencerWho’s sick of this shit yet?

I find some headlines just make me angry these days, especially ones centered around hyper late stage capitalism.


This one about Apple and Microsoft just made me go “Who the fuck cares?” and seriously, why should I care? Those two idiot companies having insane and disgustingly huge market caps isn’t something I’m impressed by.

If anything it makes me furious.

Do something useful besides making iterations of the same ol junk. Make a few thousand houses, make an affordable grocery supply chain.

If you’re doing anything else you’re a waste of everyone’s time… as I type this on my Apple computer. Still, that bit of honesty aside, I don’t give a fuck about either company’s made-up valuation.

Phil SpencerNew year new…..This

I have made a new year’s goal to retire this server before March; the OS has been upgraded many, many times over the years and various software I’ve used has come and gone, so there is lots of cruft. This server/VM started in San Francisco, then my provider stopped offering VMs in CA and moved my VM to the UK, which is where it has been ever since. This VM started its life in Jan 2008 and it is time to die.

During my 2 week xmas break I have been updating web-facing software as much as I could so that when I do put the bullet in the head of this thing I can transfer my blog, wiki, and a couple of other still-active sites to the new OS with minimal tweaking in their new home.

So far the biggest issues I ran into were with my MediaWiki; that entire site is very old, from around 2006, two years before I started hosting it for someone, and then I inherited it entirely around 2009, so the database is very finicky to upgrade and some of the extensions are no longer maintained. What I ended up doing was setting up a Docker instance at home to test upgrading and work through the kinks, and I have put together a solid step-by-step on how to move/upgrade it to latest.

I have also gotten sick of running my own e-mail servers; the spam management, certificates, block lists… it’s annoying. I found out recently that iCloud, which I already have a subscription to, allows up to 5 custom e-mail domains, so I retired my Philtopia e-mail to it early in December and as of today I moved the vo-wiki domain to it as well. Much less hassle for me; I already work enough for work, I don’t need to work at home as well.

The other work continues, site by site, but I think I am on track to put an end to this ol’ server early in the year.

Phil Spencer8bit party

It’s been a few years… four? since my Commodore 64 collection started, and I’ve now got 2 working C64s and a C128 that functions, along with 2 disk drives, a tape drive and a collection of add-on hardware and boxed games.

That isn’t all I am collecting, however; I also have my Nintendo Entertainment System and, even more recently, I acquired a Sega Master System. The 8-bit era really seems to catch my eye far more than anything that came after. I suppose it’s because the whole era made it on hacks and luck.

In any case here are some pictures of my collection, I don’t collect for the sake of collecting. Everything I have I use or play cause otherwise why bother having it?

Enjoy

My desk
NES
Sega Master System
Commodore 64
Games

Phil SpencerI think it’s time the blog came back

It’s been a while since I’ve written a blog post, almost 4 years in fact but I think it is time for a comeback.

The reason for this being that social media has become so locked down you can’t actually give a valid opinion about something without someone flagging your comment or it being caught by a robot. Oddly enough it seems the right-wing folks can say whatever they want against the immigrant villain of the month or LGBTQIA+ issues without being flagged, but if you dare stand up to them or offer an opposing opinion: 30 day ban!

So it is time to dust off the ol’ blog and put my opinions to paper somewhere else, just like the olden days before social media! It isn’t all bad of course; I’ve found Mastodon quite open to opinions, but the fediverse is getting a lot of corporate attention these days and I’m sure it’s only a year or two before even that ends up a complete mess.

Crack open the blogs and let those opinions fly

BitFolk Issue TrackerPanel - Feature #215 (New): Sort DNS domains alphabetically

The secondary DNS domains at https://panel.bitfolk.com/dns/ are currently ordered alphabetically, grouped by TLD. When there are many domains this is not completely obvious. It would perhaps be better to default to straight alpha order, or at the very least have that as an option.

Paul RaynerPrint (only) my public IP

Every now and then, I need to know my public IP. The easiest way to find it is to visit one of the sites which will display it for you, such as https://whatismyip.com. Whilst useful, all of the ones I know (including that one) are chock full of adverts, and can’t easily be scraped as part of any automated scripting.

This has been a minor irritation for years, so the other night I decided to fix it.

http://ip.pr0.uk is my answer. It’s 50 lines of Rust, and is accessible via TCP on port 11111, and via HTTP on port 8080.

use std::io::Write;

use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr, TcpListener, TcpStream};
use chrono::Utc;
use threadpool::ThreadPool;

fn main() {
    let worker_count = 4;
    let pool = ThreadPool::new(worker_count);
    let tcp_port = 11111;
    let socket_v4_tcp = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), tcp_port);

    let http_port = 8080;
    let socket_v4_http = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), http_port);

    let socket_addrs = vec![socket_v4_tcp, socket_v4_http];
    let listener = TcpListener::bind(&socket_addrs[..]);
    if let Ok(listener) = listener {
        println!("Listening on {}:{}", listener.local_addr().unwrap().ip(), listener.local_addr().unwrap().port());
        for stream in listener.incoming() {
            let stream = stream.unwrap();
            let addr =stream.peer_addr().unwrap().ip().to_string();
            if stream.local_addr().unwrap_or(socket_v4_http).port() == tcp_port {
                pool.execute(move||send_tcp_response(stream, addr));
            } else {
                //http might be proxied via https so let anything which is not the tcp port be http
                pool.execute(move||send_http_response(stream, addr));
            }
        }
    } else {
        println!("Unable to bind to port")
    }
}

fn send_tcp_response(mut stream:TcpStream, addr:String) {
    stream.write_all(addr.as_bytes()).unwrap();
}

fn send_http_response(mut stream:TcpStream, addr:String) {

    let html = format!("<html><head><title>{}</title></head><body><h1>{}</h1></body></html>", addr, addr);
    let length = html.len();
    let response = format!("HTTP/1.1 200 OK\r\nContent-Length: {length}\r\n\r\n{html}" );
    stream.write_all(response.as_bytes()).unwrap();
    println!("{}\tHTTP\t{}",Utc::now().to_rfc2822(),addr)
}

A little explanation is needed on the array of SocketAddr. This came from an initial misreading of the docs, but I liked the result and decided to keep it that way. Calls to bind() with a slice of addresses will only bind to one port - the first one in the array which is free. The result is that when you run this program, it listens on port 11111. If you keep it running and start another copy, that one listens on port 8080 (because it can’t bind to port 11111). So to run this on my server, I just have systemd keep 2 copies alive at any time.

The code and binaries for Linux and Windows are available on Github.

Next steps

I might well leave it there. It works for me, so it’s done. Here are some things I could do though:

  1. Don’t hard-code the ports
  2. Proxy https
  3. Make a client (a trivial client sketch follows below)
  4. Make it available as a binary for anyone to run on crates.io
  5. Optionally print the TTL. This would be mostly useful to people running their own instance.
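
Item 3 barely needs anything: reading the raw TCP service from a script is a couple of lines (the equivalent of nc ip.pr0.uk 11111). A minimal Python sketch, just for illustration:

import socket

def my_public_ip(host="ip.pr0.uk", port=11111, timeout=5):
    """Connect to the raw TCP service and return whatever it sends back."""
    with socket.create_connection((host, port), timeout=timeout) as s:
        return s.recv(64).decode().strip()

if __name__ == "__main__":
    print(my_public_ip())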

Boring Details

Logging

I log the IP, port, and time of each connection. This is just in case it ever gets flooded and I need to block an IP/range. The code you see above is the code I run. No browser detection, user agent or anything like that is read or logged. Any data you send with the connection is discarded. If I proxied https via nginx, that might log a bit of extra data as a side effect.

Systemd setup

There’s not much to this either. I have a template file:

[Unit]
Description=Run the whatip binary. Instance %i
After=network.target

[Service]
ExecStart=/path/to/whatip
Restart=on-failure

StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=whatip%i

[Install]
WantedBy=multi-user.target

stored at /etc/systemd/system/whatip@.service and then set up two instances to run:

systemctl enable whatip@1
systemctl enable whatip@2

Thanks for reading

David Leadbeater"[31m"?! ANSI Terminal security in 2023 and finding 10 CVEs

A paper detailing how unescaped output to a terminal could result in unexpected code execution, on many terminal emulators. This research found around 10 CVEs in terminal emulators across common client platforms.

Alun JonesMessing with web spiders

You've surely heard of ChatGPT and its ilk. These are massive neural networks trained using vast swathes of text. The idea is that if you've trained a network on enough - about 467 words

BitFolk Issue TrackerPanel - Bug #214 (New): List of referrals includes closed accounts in the total

At the bottom of the customer's list of referrals is the line "That's about £xx.xx per year!" The "£xx.xx" is a total of all their referrals ever, even if those referrals are no longer active. It only makes sense to add up active referrals, while still showing all referrals.

Alun JonesI wrote a static site generator

Back in 2019, when Google+ was shut down, I decided to start writing this blog. It seemed better to take my ramblings under my own control, rather than posting content - about 716 words

Alun JonesSolar panels

After an 8 month wait, we finally had solar panels installed at the start of July. We originally ordered, from e-on, last November, and were told that there was a - about 542 words

Alun JonesVirtual WiFi

At work I've been developing a system which runs on an embedded Linux machine and provides a service via a "captive" WiFi access point. Much of my dev work on - about 243 words

BitFolk Issue TrackerMisc infrastructure - Feature #212: Publish a DKIM record for bitfolk.com and sign emails with it

Aggregate reports show that the Icinga host is sending mail as without DKIM signature, though SPF is already covered.
DKIM signatures added for this now.

Alex HudsonJobs in the AI Future

Everyone is talking about what AI can do right now, and the impact that it is likely to have on us. This weekend’s Semafor Flagship (which is an excellent newsletter; I recommend subscribing!) asks a great question: “What do we teach the AI generation?”. As someone who grew up with computers, knowing he wanted to write software, and knowing that tech was a growth area, I never had to grapple with this type of worry personally. But I do have kids now. And I do worry. I’m genuinely unsure what I would recommend a teenager to do today, right now. But here’s my current thinking.

Paul RudkinYour new post

Your new post

This is a new blog post. You can author it in Markdown, which is awesome.

David LeadbeaterNAT-Again: IRC NAT helper flaws

A Linux kernel bug allows unencrypted NAT'd IRC sessions to be abused to access resources behind NAT, or drop connections. Switch to TLS right now. Or read on.

Paul RaynerPutting dc in (chroot) jail

A little over 4 years ago, I set up a VM and configured it to offer dc over a network connection using xinetd. I set it up at http://dc.pr0.uk and made it available via a socket connection on port 1312.

Yesterday morning I woke to read a nice email from Sylvan Butler pointing out that users could run shell commands from dc…

I had set up the dc command to run as a user “dc”, but still, if someone could run a shell command they could, for example, put a key in the dc user’s .ssh config, run sendmail (if it was set up), try for privilege escalations to get root etc.

I’m not sure what the 2017 version of me was thinking (or wasn’t), but the 2022 version of me is not happy to leave it like this. So here’s how I put dc in jail.

Firstly, how do you run shell commands from dc? It’s very easy. Just prefix with a bang:

$ dc
!echo "I was here" > /tmp/foo
!cat /tmp/foo
I was here

So, really easy. Even if it was hard, it would still be bad.

This needed to be fixed. Firstly I thought about what else was on the VM - nothing that matters. This is a good thing because the helpful Sylvan might not have been the first person to spot the issue (although network dc is pretty niche). I still don’t want this vulnerability though as someone else getting access to this box could still use it to send spam, host malware or anything else they wanted to do to a cheap tiny vm.

I looked at restricting the dc user further (it had no login shell, and no home directory already), but it felt like I would always be missing something, so I turned to chroot jails.

A chroot jail lets you run a command, specifying a directory which is used as / for that command. The command (in theory) can’t escape that directory, so can’t see or touch anything outside it. Chroot is a kernel feature, and forms a basic security feature of Linux, so should be good enough to protect network dc if set up correctly, even if it’s not perfect.

Firstly, let’s set up the directory for the jail. We need the programs to run inside the jail, and their dependent libraries. The script to run a networked dc instance looks like this:

#!/bin/bash
dc --version
sed -u -e 's/\r/\n/g' | dc

Firstly, I’ve used bash here, but this script is trivial, so it can use sh instead. We also need to keep the sed (I’m sure there are plenty of ways to do the replace not using sed, but it’s working fine as it is). For each of the 3 programs we need to run the script, I ran ldd to get their dependencies:

$ ldd /usr/bin/dc
	linux-vdso.so.1 =>  (0x00007fffc85d1000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc816f8d000)
	/lib64/ld-linux-x86-64.so.2 (0x0000555cd93c8000)
$ ldd /bin/sh
	linux-vdso.so.1 =>  (0x00007ffdd80e0000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa3c4855000)
	/lib64/ld-linux-x86-64.so.2 (0x0000556443a1e000)
$ ldd /bin/sed
	linux-vdso.so.1 =>  (0x00007ffd7d38e000)
	libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007faf5337f000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faf52fb8000)
	libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007faf52d45000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faf52b41000)
	/lib64/ld-linux-x86-64.so.2 (0x0000562e5eabc000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faf52923000)
$

So we copy those files to the exact directory structure inside the jail directory. Afterwards it looks like this:

$ ls -alR
.:
total 292
drwxr-xr-x 4 root root   4096 Feb  5 10:13 .
drwxr-xr-x 4 root root   4096 Feb  5 09:42 ..
-rwxr-xr-x 1 root root  47200 Feb  5 09:50 dc
-rwxr-xr-x 1 root root     72 Feb  5 10:13 dctelnet
drwxr-xr-x 3 root root   4096 Feb  5 09:49 lib
drwxr-xr-x 2 root root   4096 Feb  5 09:50 lib64
-rwxr-xr-x 1 root root  72504 Feb  5 09:58 sed
-rwxr-xr-x 1 root root 154072 Feb  5 10:06 sh

./lib:
total 12
drwxr-xr-x 3 root root 4096 Feb  5 09:49 .
drwxr-xr-x 4 root root 4096 Feb  5 10:13 ..
drwxr-xr-x 2 root root 4096 Feb  5 10:01 x86_64-linux-gnu

./lib/x86_64-linux-gnu:
total 2584
drwxr-xr-x 2 root root    4096 Feb  5 10:01 .
drwxr-xr-x 3 root root    4096 Feb  5 09:49 ..
-rwxr-xr-x 1 root root 1856752 Feb  5 09:49 libc.so.6
-rw-r--r-- 1 root root   14608 Feb  5 10:00 libdl.so.2
-rw-r--r-- 1 root root  468920 Feb  5 10:00 libpcre.so.3
-rwxr-xr-x 1 root root  142400 Feb  5 10:01 libpthread.so.0
-rw-r--r-- 1 root root  146672 Feb  5 09:59 libselinux.so.1

./lib64:
total 168
drwxr-xr-x 2 root root   4096 Feb  5 09:50 .
drwxr-xr-x 4 root root   4096 Feb  5 10:13 ..
-rwxr-xr-x 1 root root 162608 Feb  5 10:01 ld-linux-x86-64.so.2
$
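
As an aside, that copying can be scripted. Here’s a rough Python sketch that mirrors the ldd output above into a jail directory; the jail path, the binary list and the choice to drop the binaries themselves into the jail root are assumptions based on the listing above.

#!/usr/bin/env python3
# Sketch: copy each binary into the jail root, and every library ldd reports
# into the same path under the jail. Paths here are assumptions.
import os
import re
import shutil
import subprocess

JAIL = "/home/dc"
BINARIES = ["/usr/bin/dc", "/bin/sh", "/bin/sed"]

def copy_into_jail(path, jail=JAIL):
    dest = os.path.join(jail, path.lstrip("/"))
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    shutil.copy2(path, dest)

for binary in BINARIES:
    # The binaries live in the jail root (dc, sh, sed), as in the listing above.
    shutil.copy2(binary, os.path.join(JAIL, os.path.basename(binary)))
    ldd_output = subprocess.run(["ldd", binary], capture_output=True, text=True).stdout
    # Pick out every absolute path ldd mentions, e.g. /lib/x86_64-linux-gnu/libc.so.6
    for lib in re.findall(r"(/\S+)", ldd_output):
        copy_into_jail(lib)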

and here is the modified dctelnet command:

#!/sh
#dc | dos2unix 2>&1
./dc --version
./sed -u -e 's/\r/\n/g' | ./dc

I’ve switched to using sh instead of bash, and all of the commands are now relative paths, as they are just in the root directory.

First attempt

Now I have a directory that I can use for a chrooted network dc. I need to set up the xinetd config to use chroot and the jail I have set up:

service dc
{
	disable		= no
	type		= UNLISTED
	id		= dc-stream
	socket_type	= stream
	protocol	= tcp
	server		= /usr/sbin/chroot
	server_args	= /home/dc/ ./dctelnet
	user		= root
	wait		= no
	port		= 1312
	rlimit_cpu	= 60
	env		= HOME=/ PATH=/
}

I needed to set the HOME and PATH environment variables, otherwise I got a segfault (I’m not sure whether it was sh, sed or dc causing it). To run chroot you need to be root, so I could no longer run the service as the user dc. This shouldn’t be a problem because the resulting process is constrained.

A bit more security

Chroot jails have a reputation for being easy to get wrong, and they are not something I have done a lot of work with, so I want to take a bit of time to think about whether I’ve left any glaring holes, and also try to improve on the simple option above a bit if I can.

Firstly, can dc still execute commands with the ! operation?

 ~> nc -v dc.pr0.uk 1312
Connection to dc.pr0.uk 1312 port [tcp/*] succeeded!
dc (GNU bc 1.06.95) 1.3.95

Copyright 1994, 1997, 1998, 2000, 2001, 2004, 2005, 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.
!ls
^C⏎

Nope. Ok, that’s good. The chroot jail has sh though, and has it in the PATH, so can it still get a shell and call dc, sh and sed?

 ~> nc -v dc.pr0.uk 1312
Connection to dc.pr0.uk 1312 port [tcp/*] succeeded!
dc (GNU bc 1.06.95) 1.3.95

Copyright 1994, 1997, 1998, 2000, 2001, 2004, 2005, 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.
!pwd
^C⏎

pwd is a builtin, so it looks like the answer is no, but why? Running strings on my version of dc, there is no mention of sh or exec, but there is a mention of system. From the man page of system:

The system() library function uses fork(2) to create a child process that executes the shell  command  specified in command using execl(3) as follows:

           execl("/bin/sh", "sh", "-c", command, (char *) 0);

So dc calls system() when you use !, which makes sense. system() calls /bin/sh, which does not exist in the jail, breaking the ! call.

For a system that I don’t care about, that is of little value to anyone else, and that sees very little traffic, that’s probably good enough. But I want to make it a bit better - if there was a problem with the dc program, or you could get it to pass something to sed and trigger an issue with that, you could mess with the jail filesystem, overwrite the dc binary, and likely break out of the jail, as the whole thing is running as root.

So I want to do two things. Firstly, I don’t want dc running as root in the jail. Secondly, I want to throw away the environment after each use, so if you figure out how to mess with it you don’t affect anyone else’s fun.

Here’s a bash script which I think does both of these things:

#!/bin/bash
set -e
DCDIR="$(mktemp -d /tmp/dc_XXXX)"
trap '/bin/rm -rf -- "$DCDIR"' EXIT
cp -R /home/dc/ $DCDIR/
cd $DCDIR/dc
PATH=/
HOME=/
export PATH
export HOME
/usr/sbin/chroot --userspec=1001:1001 . ./dctelnet
  • Line 2 - set -e causes the script to exit on the first error
  • Lines 3 & 4 - make a temporary directory to run in, then set a trap to clean it up when the script exits.
  • I then copy the required files for the jail to the new temp directory, set $HOME and $PATH and run the jail as an unprivileged user (uid 1001).

Now to make some changes to the xinetd file:

service dc
{
        disable         = no
        type            = UNLISTED
        id              = dc-stream
        socket_type     = stream
        protocol        = tcp
        server          = /usr/local/bin/dcinjail
        user            = root
        wait            = no
        port            = 1312
        rlimit_cpu      = 60
        log_type        = FILE /var/log/dctelnet.log
        log_on_success  = HOST PID DURATION
        log_on_failure  = HOST
}

The new version just runs the script from above. It still needs to run as root to be able to chroot.

I’ve also added some logging as this has piqued my interest and I want to see how many people (other than me) ever connect, and for how long.

As always, I’m interested in feedback or questions. I’m no expert in this setup so may not be able to answer questions, but if you see something that looks wrong (or that you know is wrong), please let me know. I’m also interested to hear of other ways of doing process isolation - I know I could have used containers, and think I could have used systemd or SELinux features (or both) to further lock down the dc user and achieve a similar result.

Thanks for reading.

Christopher Roberts: Fixing SVG Files in DokuWiki

Having upgraded a DokuWiki server from 16.04 to 18.04, I found that SVG images were no longer displaying in the browser. As I was unable to find any applicable answers on-line, I thought I should break my radio silence by detailing my solution.

Inspecting the file using the browser developer tools (Network tab) and refreshing the page showed that the file was being downloaded as application/octet-stream. Sure enough, using curl showed the same.

curl -Ik https://example.com/file.svg

All the advice on-line is to ensure that /etc/nginx/mime.types includes the line:

image/svg+xml   svg svgz;

But that was already in place.

I decided to try uploading the SVG file again, in case the Inkscape format was causing breakage. Yes, a long-shot indeed.

The upload was rejected by DokuWiki, as SVG was not in the list of allowed file extensions; so I added the following line to /var/www/dokuwiki/conf/mime.local.conf:

svg   image/svg_xml

Whereupon the images started working again. Presumably DokuWiki was seeing the mime type as image/svg instead of image/svg+xml, and this mismatch was preventing nginx from serving up the correct Content-Type.
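
If you want to script that check rather than eyeballing curl output, a quick sketch along these lines (reusing the placeholder URL from the curl example above) would do it:

#!/usr/bin/env python3
# Quick check that the SVG is now served with the expected Content-Type.
# The URL is the placeholder from the curl example, not a real endpoint.
import urllib.request

url = "https://example.com/file.svg"
with urllib.request.urlopen(url) as response:
    content_type = response.headers.get("Content-Type", "")
    print(content_type)
    if not content_type.startswith("image/svg+xml"):
        raise SystemExit("still not served as image/svg+xml")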

Hopefully this will help others, do let me know if it has helped you.

Paul Rayner: Snakes and Ladders, Estimation and Stats (or 'Sometimes It Takes Ages')

Snakes And Ladders

Simple kids game, roll a dice and move along a board. Up ladders, down snakes. Not much to it?

We’ve been playing snakes and ladders a bit (lot) as a family because my 5 year old loves it. Our board looks like this:

Some games on this board take a really long time. My son likes to play games till the end, so until all players have finished. It’s apparently really funny when everyone else has finished and I keep finding the snakes over and over. Sometimes one player finishes really quickly - they hit some good ladders, few or no snakes and they are done in no time.

This got me thinking. What’s the distribution of game lengths for snakes and ladders? How long should we expect a game to take? How long before we typically have a winner?

Fortunately for me, snakes and ladders is a very simple game to model with a bit of python code.

Firstly, here are the rules we play:

1) Each player rolls a normal 6 sided dice and moves their token that number of squares forward.
2) If a player lands on the head of a snake, they go down the snake.
3) If a player lands on the bottom of a ladder, they go up to the top of the ladder.
4) If a player rolls a 6, they get another roll.
5) On this board, some ladders and snakes interconnect - the bottom of a snake is the head of another, or the top of a ladder is also the head of a snake. When this happens, you do all of the actions in turn, so down both snakes or up the ladder, down the snake.
6) You don’t need an exact roll to finish, once you get 100 or more, you are done.

To model the board in python, all we really need are the coordinates of the snakes and the ladders - their starting and ending squares.

def get_snakes_and_ladders():

    snakes = [
        (96,27),
        (88,66),
        (89,46),
        (79,44),
        (76,19),
        (74,52),
        (57,3),
        (60,39),
        (52,17),
        (50,7),
        (32,15),
        (30,9)
    ]
    ladders = [
        (6,28),
        (10,12),
        (18,37),
        (40,42),
        (49,67),
        (55,92),
        (63,76),
        (61,81),
        (86,94)
    ]
    return snakes + ladders

Since snakes and ladders are both mappings from one point to another, we can combine them in one array as above.

The game is modelled with a few lines of Python:

class Game:

    def __init__(self) -> None:
        self.token = 1
        snakes_and_ladders_list = get_snakes_and_ladders()
        self.sl = {}
        for entry in snakes_and_ladders_list:
            self.sl[entry[0]] = entry[1]

    def move(self, howmany):
        self.token += howmany
        while (self.token in self.sl):
            self.token = self.sl[self.token]
        return self.token

    def turn(self):
        num = self.roll()
        self.move(num)
        if num == 6:
            self.turn()
        if self.token>=100:
            return True
        return False

    def roll(self):
        return randint(1,6)

A turn consists of all the actions taken by a player before the next player gets their turn. This can consist of multiple moves if the player rolls one or more sixes, as rolling a six gives you another move.

With this, we can run some games and plot them. Here’s what a sample looks like.

The Y axis is the position on the board, and the X axis is the number of turns. This small graphical representation of the game shows how variable it can be. The red player finishes in under 20 moves, whereas the orange player takes over 80.

To see how variable it is, we can run the simulation a large number of times and look at the results. Running for 10,000 games we get the following:

function   result
min        5
max        918
mean       90.32
median     65

So the fastest finish in 10,000 games was just 5 turns, and the slowest was an awful (if you were rolling the dice) 918 turns.
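
For anyone who wants to reproduce those numbers, here’s a minimal sketch of collecting them; it assumes the Game class (and its randint import) from earlier is in scope.

# Sketch: play 10,000 games and summarise how many turns each took.
# Assumes the Game class defined above is available in this module.
import statistics

def game_length():
    game = Game()
    turns = 0
    while True:
        turns += 1
        if game.turn():  # turn() returns True once the token reaches 100 or more
            return turns

lengths = [game_length() for _ in range(10_000)]
print("min   ", min(lengths))
print("max   ", max(lengths))
print("mean  ", round(statistics.mean(lengths), 2))
print("median", statistics.median(lengths))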

Here are some histograms for the distribution of game lengths, the distribution of number of turns for a player to win in a 3 person game, and the number of turns for all players to finish in a 3 person game.

The python code for this post is at snakes.py

Alex Hudson: Introduction to the Metaverse

You’ve likely heard the term “metaverse” many times over the past few years, and outside the realm of science fiction novels, it has tended to refer to some kind of computer-generated world. There’s often little distinction between a “metaverse” and a relatively interactive virtual reality world.

There are a huge number of people who think this is simply a marketing term, and Facebook’s recent rebranding of its holding company to “Meta” has only reinforced this view. However, I think this view is wrong, and I hope to explain why.

Alex Hudson: It's tough being an Azure fan

Azure has never been the #1 cloud provider - that spot continues to belong to AWS, which is the category leader. However, in most people’s minds, it has been a pretty reasonable #2, and while not necessarily vastly differentiated from AWS there are enough things to write home about.

However, even as a user and somewhat of a fan of the Azure technology, I am finding it increasingly difficult to recommend.

Josh Holland: More on git scratch branches: using stgit

I wrote a short post last year about a useful workflow for preserving temporary changes in git by using a scratch branch. Since then, I’ve come across stgit, which can be used in much the same way, but with a few little bells and whistles on top.

Let’s run through a quick example to show how it works. Let’s say I want to play around with the cool new programming language Zig and I want to build the compiler myself. The first step is to grab a source code checkout:

$ git clone https://github.com/ziglang/zig
        Cloning into 'zig'...
        remote: Enumerating objects: 123298, done.
        remote: Counting objects: 100% (938/938), done.
        remote: Compressing objects: 100% (445/445), done.
        remote: Total 123298 (delta 594), reused 768 (delta 492), pack-reused 122360
        Receiving objects: 100% (123298/123298), 111.79 MiB | 6.10 MiB/s, done.
        Resolving deltas: 100% (91169/91169), done.
        $ cd zig
        

Now, according to the instructions we’ll need to have CMake, GCC or clang and the LLVM development libraries to build the Zig compiler. On NixOS it’s usual to avoid installing things like this system-wide but instead use a file called shell.nix to specify your project-specific dependencies. So here’s the one ready for Zig (don’t worry if you don’t understand the Nix code, it’s the stgit workflow I really want to show off):

$ cat > shell.nix << EOF
        { pkgs ? import <nixpkgs> {} }:
        pkgs.mkShell {
          buildInputs = [ pkgs.cmake ] ++ (with pkgs.llvmPackages_12; [ clang-unwrapped llvm lld ]);
        }
        EOF
        $ nix-shell
        

Now we’re in a shell with all the build dependencies, and we can go ahead with the mkdir build && cd build && cmake .. && make install steps from the Zig build instructions[1].

But now what do we do with that shell.nix file?

$ git status
        On branch master
        Your branch is up to date with 'origin/master'.
        
        Untracked files:
          (use "git add <file>..." to include in what will be committed)
                shell.nix
        
        nothing added to commit but untracked files present (use "git add" to track)
        

We don’t really want to add it to the permanent git history, since it’s just a temporary file that is only useful to us. But the other options of just leaving it there untracked or adding it to .git/info/exclude are unsatisfactory as well: before I started using scratch branches and stgit, I often accidentally deleted my shell.nix files which were sometimes quite annoying to have to recreate when I needed to pin specific dependency versions and so on.

But now we can use stgit to take care of it!

$ stg init # stgit needs to store some metadata about the branch
        $ stg new -m 'add nix config'
        Now at patch "add-nix-config"
        $ stg add shell.nix
        $ stg refresh
        Now at patch "add-nix-config"
        

This little dance creates a new commit adding our shell.nix managed by stgit. You can stg pop it to unapply, stg push[2] to reapply, and stg pull to do a git pull and reapply the patch back on top. The main stgit documentation is helpful to explain all the possible operations.

This solves all our problems! We have basically recreated the scratch branch from before, but now we have pre-made tools to apply, un-apply and generally play around with it. The only problem is that it’s really easy to accidentally push your changes back to the upstream branch.

Let’s have another example. Say I’m sold on the stgit workflow, I have a patch at the bottom of my stack adding some local build tweaks and, on top of that, a patch that I’ve just finished working on that I want to push upstream.

$ cd /some/other/project
        $ stg series # show all my patches
        + add-nix-config
        > fix-that-bug
        

Now I can use stg commit to turn my stgit patch into a real immutable git commit that stgit isn’t going to mess around with any more:

$ stg commit fix-that-bug
        Popped fix-that-bug -- add-nix-config
        Pushing patch "fix-that-bug" ... done
        Committed 1 patch
        Pushing patch "add-nix-config ... done
        Now at patch "add-nix-config"
        

And now what we should do before git pushing is stg pop -a to make sure that we don’t push add-nix-config or any other local stgit patches upstream. Sadly it’s all too easy to forget that, and since stgit updates the current branch to point at the current patch, just doing git push here will include the commit representing the add-nix-config patch.

The way to prevent this is through git’s hook system. Save this as pre-push[3] (make sure it’s executable):

#!/bin/bash
        # An example hook script to verify what is about to be pushed.  Called by "git
        # push" after it has checked the remote status, but before anything has been
        # pushed.  If this script exits with a non-zero status nothing will be pushed.
        #
        # This hook is called with the following parameters:
        #
        # $1 -- Name of the remote to which the push is being done
        # $2 -- URL to which the push is being done
        #
        # If pushing without using a named remote those arguments will be equal.
        #
        # Information about the commits which are being pushed is supplied as lines to
        # the standard input in the form:
        #
        #   <local ref> <local sha1> <remote ref> <remote sha1>
        
        remote="$1"
        url="$2"
        
        z40=0000000000000000000000000000000000000000
        
        while read local_ref local_sha remote_ref remote_sha
        do
            if [ "$local_sha" = $z40 ]
            then
                # Handle delete
                :
            else
                # verify we are on a stgit-controlled branch
                git show-ref --verify --quiet "${local_ref}.stgit" || continue
                if [ $(stg series --count --applied) -gt 0 ]
                then
                    echo >&2 "Unapplied stgit patch found, not pushing"
                    exit 1
                fi
            fi
        done
        
        exit 0
        

Now we can’t accidentally[4] shoot ourselves in the foot:

$ git push
        Unapplied stgit patch found, not pushing
        error: failed to push some refs to <remote>
        

Happy stacking!


  1. At the time of writing, Zig depends on the newly-released LLVM 12 toolchain, but this hasn’t made it into the nixos-unstable channel yet, so this probably won’t work on your actual NixOS machine.↩︎

  2. an unfortunate naming overlap between pushing onto a stack and pushing a git repo↩︎

  3. A somewhat orthogonal but also useful tip here so that you don’t have to manually add this to every repository is to configure git’s core.hooksPath to something like ~/.githooks and put it there.↩︎

  4. You can always pass --no-verify if you want to bypass the hook.↩︎

Jon Fautley: Using the Grafana Cloud Agent with Amazon Managed Prometheus, across multiple AWS accounts

Observability is all the rage these days, and the process of collecting metrics is getting easier. Now, the big(ger) players are getting in on the action, with Amazon releasing a Managed Prometheus offering and Grafana now providing a simplified “all-in-one” monitoring agent. This is a quick guide to show how you can couple these two together on individual hosts, incorporating cross-account access control.

The Grafana Cloud Agent

Grafana Labs have taken (some of) the best bits of the Prometheus monitoring stack and created a unified deployment that wraps the individual moving parts up into a single binary.

Paul Rayner: Valentines Gift - a Tidy Computer Cupboard

Today, my lovely wife (who is far more practical than me) gave me this as a Valentine’s present (along with a nice new pair of Nano X1 trainers).

This is my nice new home server rack. It’s constructed from the finest pallet wood and repurposed chipboard, and has 8 caster wheels (cheaper than the Apple ones) on the bottom.

After three and a half years living in our house, the cupboard under the stairs was a mess of jumbled cables and computer bits. It all worked, but with things balanced on other things, held up by their cables, and three years of dust everywhere, it really needed an overhaul. We’ve recently had a new fibre connection go in (yay - 1Gbps at home!), so yet another cable, and yet another box to balance on top of other boxes.

This was the sorry mess that I called a home network this morning:

And thanks to my lovely gift, and some time to rewire everything (make new cables), it now looks like this:

and a close up:

In there I have my server, UPS, NAS, phone system, lighting system, FTTC broadband, Fibre broadband, router, main switch, and a cable going out and round the house into my office. Lovely and neat, and because it’s on wheels, I can pull it out to get round the back :-)

I am very happy with my new setup.

Paul Rudkin: Spring Festival Extra Bandwidth

Due to local restrictions on mass gatherings, my employer’s Spring Festival Ceremony this year will be held online for all employees (> 2000).

To support the peak in bandwidth demand, some of the mobile phone providers have added additional cells in the grounds of our company. They have been testing for the last few days, so in a few hours we will see how they will perform!!

Paul Rudkin

The roads of China have just got a little more dangerous. My wife has passed her China driving test today!

People of China, save yourself while you can!

Paul Rudkin: Shanghai reports 6 local COVID-19 cases, first outbreak since November - Global Times

Shanghai found six local confirmed COVID-19 cases in its most populous downtown Huangpu district on Thursday, two months since the last local case was reported in the city, local health authority said on Friday.

Source: Shanghai reports 6 local COVID-19 cases, first outbreak since November - Global Times
