Planet BitFolk

BitFolk Wiki: Sponsored hosting

Sponsored Projects: +BarCamp Surrey

Revision as of 23:53, 13 January 2025


* [https://57north.co/ 57North Hacklab]
* [https://barcampsurrey.org/ BarCamp Surrey]
* [https://edinburghhacklab.com/ Edinburgh Hacklab]
* [https://eof.org.uk/ EOF Hackspace]
* [http://www.somakeit.org.uk/ Southampton Makerspace]
* [http://www.surrey.lug.org.uk/ Surrey Linux User Group]
* [http://ubuntupodcast.org/ Ubuntu Podcast]

BitFolk Issue Tracker: Billing - Feature #219 (Closed): Add host name to data transfer reports

Seems to be working

BitFolk Issue Tracker: Billing - Feature #219 (In Progress): Add host name to data transfer reports

BitFolk Wiki: Booting

Boot process: Removed tag for lost animated gif from imgur. Very old info anyway

Revision as of 02:13, 10 January 2025


==Boot process==
<imgur thumb="yes" w="562">cufm6gd.gif</imgur>
On the right you should see an animated GIF of a terminal session where a Debian stretch VPS is booted with a GRUB-legacy config file. The bootloader config is viewed and booted. Then '''grub-pc''' package is installed to convert the VPS to GRUB2. ttyrec or ttygif seem to have introduced some corruption and offset characters, but you probably get the idea.
If you're looking at your console in the Xen Shell then the first thing that you should see is a list of boot methods that BitFolk's GRUB recognises. It checks for each of the following things, in order, on each of your block devices and every partition of those block devices:


Alun Jones: Managing load from abusive web bots

A few months back I created a small web application which generated a fake hierarchy of web pages, on the fly, using a Markov Chain to make gibberish content that… (about 1987 words)

David Leadbeater: Déjà vu: Ghostly CVEs in my terminal title

Exploring a security bug in Ghostty that is eerily familiar.

BitFolk Issue Tracker: Billing - Feature #219 (Closed): Add host name to data transfer reports

A customer has requested that rather than just the VPS account name being used in the data transfer emails, the host name (reverse DNS of the primary IPv4 address) should be shown too, as they found it confusing which VPS was the subject of the report.

So instead of:

From: BitFolk Data Transfer Monitor <xfer@bitfolk.com>
Subject: [4/4] BitFolk VPS 'tom' data transfer report

This is a weekly report of data transfer for domain 'tom' on
host 'tanqueray.bitfolk.com'. This report is informational only and not
an invoice or bill.

It would look more like:

From: BitFolk Data Transfer Monitor <xfer@bitfolk.com>
Subject: [4/4] BitFolk VPS 'tom' (tom.dogsitter.services) data transfer report

This is a weekly report of data transfer for domain 'tom'
(tom.dogsitter.services) on host 'tanqueray.bitfolk.com'. This report
is informational only and not an invoice or bill.

BitFolk Issue Tracker: BitFolk - Feature #216: Add phishing-resistant authentication for https://panel.bitfolk.com/

Thanks. I have started to give this a read now. 😀

Jon Spriggs: Quick Tip: Don’t use concat in your spreadsheet, use textjoin!

I found this on Threads today

CONCAT vs TEXTJOIN – The ultimate showdown! 🥊
TEXTJOIN is the GOAT:
=TEXTJOIN(“, “, TRUE, A1:A10)
● Adds delimiters automatically
● Ignores empty cells
● Works with ranges
Goodbye CONCAT, you won’t be missed!

And I’ve tested it this morning. I don’t have excel any more, but it works on Google Sheets, no worries!

BitFolk Issue Tracker: BitFolk - Feature #216: Add phishing-resistant authentication for https://panel.bitfolk.com/

Here's an excellent tour through everything WebAuthn.

https://www.imperialviolet.org/tourofwebauthn/tourofwebauthn.html

Andy Smith: I recommend avoiding the need to have panretinal photocoagulation (PRP) laser treatment

WARNING

This article contains descriptions of medical procedures on the eye. If that sort of thing makes you squeamish you may want to give it a miss.

Yesterday I had panretinal photocoagulation (PRP) laser treatment in both eyes, and it was quite unpleasant. I recommend trying really hard to avoid having to ever have that if possible.

PRP is used to manage symptoms of proliferative diabetic retinopathy. A laser is used to burn abnormal new blood vessels around the retina.

Having had a different kind of laser treatment before I wasn't expecting this to be a big deal. Unfortunately I was wrong and it was a bit of an ordeal.

As usual at these eye examinations I had drops to dilate my pupils and a bunch of different scans and photographs of the back of my eye taken so they knew what they were dealing with. Then in preparation for the procedure, some numbing eye drops. It's an odd sensation not being able to feel your eyelids or the skin around your eyes, but that part wasn't uncomfortable.

Next up the consultant held some sort of eyepiece firmly against the surface of my eyeball and applied a decent amount of pressure to keep it in place.

Then the laser pulses began. Many, many pulses. Each caused an unpleasant stabbing sensation in my eyeball with a dull ache following it. It wasn't so much that it was painful — Wikipedia describes this as "stinging" and in isolation I'd agree with that description. However while this was taking place my head was in a chin rest with an eyepiece thing pressed against my eyeball and the knowledge that if I moved unexpectedly then I risked having my vision destroyed by the laser. And these laser pulses were coming multiple times per second.

I was doing some grunting at the discomfort of each laser pulse when…

Consultant: What! I'm on 30% power. If I make it any lower it'll be homeopathy, know what I mean? It needs to be effective.

Me, through gritted teeth: Just do what you need to do.

Another thing I was not prepared for was total blindness during the procedure and for a few minutes after. He was telling me to look in certain directions, but my vision had gone completely black due to the laser so I couldn't actually tell which direction I was looking in.

Then when it was finally over for one eye, I still could not see anything and due to the anaesthetic could not even tell if my eye was open or not as I couldn't feel my eyelid. Thankfully that recovered after a couple of minutes so he could begin on the other eye…

Post procedure was not too bad. It's an outpatient procedure and I was immediately able to go home on the bus! My eyes just felt tired and took a lot longer to recover from the dilation drops than they usually do (I have vision tests several times a year and they always involve dilation drops). A headache between the temples did force me to go to bed early, but I feel fine today.

So… if you have diabetes then blood sugar control is important to help avoid having to go through something like this. If you lose a genetic lottery then after decades living with diabetes you may need it anyway, or if you win then perhaps you never do, but I just suggest doing what you can to improve your odds.

This is still only the second most unpleasant procedure I've had on my eye though!

Jon Spriggs: A few weird issues in the networking on our custom AWS EKS Workers, and how we worked around them

For “reasons”, at work we run AWS Elastic Kubernetes Service (EKS) with our own custom-built workers. These workers are based on Alma Linux 9, instead of AWS’ preferred Amazon Linux 2023. We manage the deployment of these workers using AWS Auto-Scaling Groups.

Our unusual configuration of these nodes means that we sometimes trip over configurations which are tricky to get support on from AWS (no criticism of their support team; if I was in their position, I wouldn’t want to try to provide support for a customer’s configuration that was so far outside the recommended configuration either!)

Over the past year, we’ve upgraded EKS1.23 to EKS1.27 and then on to EKS1.31, and we’ve stumbled over a few issues on the way. Here are a couple of notes on the subject, in case they help anyone else in their journey.

All three of the issues below were addressed by running an additional script on the worker nodes, driven by a systemd timer which triggers every minute.

Incorrect routing for the 12th IP address onwards

Something the team found really early on (around EKS 1.18 or somewhere around there) was that the AWS VPC-CNI wasn’t managing the routing tables on the node properly. We raised an issue on the AWS VPC CNI (we were on CentOS 7 at the time) and although AWS said they’d fixed the issue, we currently need to patch the routing tables every minute on our nodes.

What happens?

When you get past the number of IP addresses that a single ENI can have (typically ~12), the AWS VPC-CNI will attach a second interface to the worker, and start adding new IP addresses to that. The VPC-CNI should set up routing for that second interface, but for some reason, in our case, it doesn’t. You can see this happening because the traffic will come in on the second ENI, eth1, but then try to exit the node on the first ENI, eth0, as shown with a tcpdump like this:

[root@test-i-01234567890abcdef ~]# tcpdump -i any host 192.0.2.123
tcpdump: data link type LINUX_SLL2
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
09:38:07.331619 eth1  In  IP ip-192-168-1-100.eu-west-1.compute.internal.41856 > ip-192-0-2-123.eu-west-1.compute.internal.irdmi: Flags [S], seq 1128657991, win 64240, options [mss 1359,sackOK,TS val 2780916192 ecr 0,nop,wscale 7], length 0
09:38:07.331676 eni989c4ec4a56 Out IP ip-192-168-1-100.eu-west-1.compute.internal.41856 > ip-192-0-2-123.eu-west-1.compute.internal.irdmi: Flags [S], seq 1128657991, win 64240, options [mss 1359,sackOK,TS val 2780916192 ecr 0,nop,wscale 7], length 0
09:38:07.331696 eni989c4ec4a56 In  IP ip-192-0-2-123.eu-west-1.compute.internal.irdmi > ip-192-168-1-100.eu-west-1.compute.internal.41856: Flags [S.], seq 3367907264, ack 1128657992, win 26847, options [mss 8961,sackOK,TS val 1259768406 ecr 2780916192,nop,wscale 7], length 0
09:38:07.331702 eth0  Out IP ip-192-0-2-123.eu-west-1.compute.internal.irdmi > ip-192-168-1-100.eu-west-1.compute.internal.41856: Flags [S.], seq 3367907264, ack 1128657992, win 26847, options [mss 8961,sackOK,TS val 1259768406 ecr 2780916192,nop,wscale 7], length 0

The critical line here is the last one – it’s come in on eth1 and it’s going out of eth0. Another test here is to look at ip rule

[root@test-i-01234567890abcdef ~]# ip rule
0:	from all lookup local
512:	from all to 192.0.2.111 lookup main
512:	from all to 192.0.2.143 lookup main
512:	from all to 192.0.2.66 lookup main
512:	from all to 192.0.2.113 lookup main
512:	from all to 192.0.2.145 lookup main
512:	from all to 192.0.2.123 lookup main
512:	from all to 192.0.2.5 lookup main
512:	from all to 192.0.2.158 lookup main
512:	from all to 192.0.2.100 lookup main
512:	from all to 192.0.2.69 lookup main
512:	from all to 192.0.2.129 lookup main
1024:	from all fwmark 0x80/0x80 lookup main
1536:	from 192.0.2.123 lookup 2
32766:	from all lookup main
32767:	from all lookup default

Notice here that we have two entries from all to 192.0.2.123 lookup main and from 192.0.2.123 lookup 2. Let’s take a look at what lookup 2 gives us, in the routing table

[root@test-i-01234567890abcdef ~]# ip route show table 2
192.0.2.1 dev eth1 scope link

Fix the issue

Table 2 only has the link route for eth1 and no default route, which is why the reply traffic falls back to the main table and leaves via eth0. This is pretty easy to fix – we need to add a default route if one doesn’t already exist. Long before I got here, my boss created a script which first runs ip route show table main | grep default to get the gateway for that interface, then runs ip rule list, looks for each lookup <number> and finally runs ip route add to put the default route on that table, the same as on the main table.

ip route add default via "${GW}" dev "${INTERFACE}" table "${TABLE}"
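
For illustration, the whole thing boils down to something like the sketch below. This is not our production script, and the table-to-interface mapping is an assumption on my part:

#!/bin/bash
# Rough sketch only -- the real script has more error handling.
# Assumption: routing table N belongs to interface eth(N-1), e.g. table 2 -> eth1.

# Gateway used by the main table's default route.
GW=$(ip route show table main | awk '/^default/ {print $3; exit}')

for TABLE in $(ip rule list | grep -oE 'lookup [0-9]+' | awk '{print $2}' | sort -un); do
    # Add a default route to the numbered table if it doesn't have one yet,
    # using the same gateway as the main table.
    if ! ip route show table "$TABLE" | grep -q '^default'; then
        INTERFACE="eth$((TABLE - 1))"
        ip route add default via "${GW}" dev "${INTERFACE}" table "${TABLE}"
    fi
done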

Is this still needed?

I know this script was still needed when we upgraded our cluster from EKS1.23 to EKS1.27. However, I’ve just checked a worker running EKS1.31 which had been up for around 12 hours with a second interface attached, and the fix hadn’t been needed… so perhaps we can deprecate this script?

Dropping packets to the containers due to Martians

When we upgraded our cluster from EKS1.23 to EKS1.27 we also changed a lot of the infrastructure under the surface (AlmaLinux 9 from CentOS7, Containerd and Runc from Docker, CGroups v2 from CGroups v1, and so on). We also moved from using an AWS Elastic Load Balancer (ELB) or “Classic Load Balancer” to AWS Network Load Balancer (NLB).

Following the upgrade, we started seeing packets not arriving at our containers and the system logs on the node were showing a lot of martian source messages, particularly after we configured our NLB to forward original IP source addresses to the nodes.

What happens

One thing we noticed was that each time we added a new pod to the cluster, it added a new eni[0-9a-f]{11} interface, but the sysctl value for net.ipv4.conf.<interface>.rp_filter (return path filtering – basically, should we expect traffic from that source to be arriving at this interface?) was set to 1 or “Strict mode”, where the source MUST be coming from the best return path for the interface it arrived on. The AWS VPC-CNI is supposed to set this to 2 or “Loose mode”, where the source merely has to be reachable from any interface.

In this case you’d tell this because you’d see this in your system journal (assuming you’ve got net.ipv4.conf.all.log_martians=1 configured):

Dec 03 10:01:19 test-i-01234567890abcdef kernel: IPv4: martian source 192.168.1.100 from 192.0.2.123, on dev eth1

The net result is that packets would be dropped by the host at this point, and they’d never be received by the containers in the pods.

Fix the issue

This one is also pretty easy. We run sysctl -a and loop through any entries which match net.ipv4.conf.([^\.]+).rp_filter = (0|1) and then, if we find any, we run sysctl -w net.ipv4.conf.\1.rp_filter=2 to set them to the correct value.
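
A minimal sketch of that loop might look like this (assuming the usual key = value output format from sysctl -a):

#!/bin/bash
# Rough sketch of the per-minute rp_filter fix described above.

# Find every per-interface rp_filter that is set to 0 or 1...
sysctl -a 2>/dev/null |
  grep -E '^net\.ipv4\.conf\.[^.]+\.rp_filter = [01]$' |
  awk '{print $1}' |
  while read -r key; do
      # ...and switch it to 2 ("loose mode").
      sysctl -w "${key}=2"
  done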

Is this still needed?

Yep, absolutely. As of our latest upgrade to EKS1.31, if this value isn’t set, then it will drop packets. VPC-CNI should be fixing this, but for some reason it doesn’t. And setting net.ipv4.conf.all.rp_filter to 2 doesn’t seem to make a difference, which is contrary to the relevant kernel documentation.

After 12 IP addresses are assigned to a node, Kubernetes services stop working for some pods

This was pretty weird. When we upgraded to EKS1.31 on our smallest cluster we initially thought we had an issue with CoreDNS, in that it sometimes wouldn’t resolve IP addresses for services (DNS names for services inside the cluster are resolved by <servicename>.<namespace>.svc.cluster.local to an internal IP address for the cluster – in our case, in the range 172.20.0.0/16). We upgraded CoreDNS to the EKS1.31 recommended version, v1.11.3-eksbuild.2 and that seemed to fix things… until we upgraded our next largest cluster, and things REALLY went wrong, but only when we had increased to over 12 IP addresses assigned to the node.

You might see this as frequent restarts of a container, particularly if you’re reliant on another service to fulfil an init container or the liveness/readiness check.

What happens

EKS1.31 moves KubeProxy from iptables or ipvs mode to nftables – a shift we had to make internally as AlmaLinux 9 no longer supports iptables mode, and ipvs is often quite flaky, especially when you have a lot of pod movements.

With a single interface and up to 11 IP addresses assigned to that interface, everything runs fine, but the moment we move to that second interface, much like in the first case above, we start seeing those pods attached to the second+ interface being unable to resolve service addresses. On further investigation, doing a dig from a container inside that pod to the service address of the CoreDNS service 172.20.0.10 would timeout, but a dig against the actual pod address 192.0.2.53 would return a valid response.

Under the surface, on each worker, KubeProxy adds a rule to nftables to say “if you try and reach 172.20.0.10, please instead direct it to 192.0.2.53”. As the containers fluctuate inside the cluster, KubeProxy is constantly re-writing these rules. For whatever reason though, KubeProxy currently seems unable to determine that a second or subsequent interface has been added, and so these rules are not applied to the pods attached to that interface…. or at least, that’s what it looks like!

Fix the issue

In this case, we wrote a separate script which was also triggered every minute. This script looks to see if the interfaces have changed by running ip link and looking for any interfaces called eth[0-9]+ which have changed, and then if it has, it runs crictl pods (which lists all the running pods in Containerd), looks for the Pod ID of KubeProxy, and then runs crictl stopp <podID> [1] and crictl rmp <podID> [1] to stop and remove the pod, forcing kubelet to restart the KubeProxy on the node.

[1] Yes, they aren’t typos, stopp means “stop the pod” and rmp means “remove the pod”, and these are different to stop and rm which relate to the container.
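
A simplified sketch of that second script follows. The interface-change detection is boiled down to a checksum here, and the --name filter assumes your kube-proxy pod sandbox can be found by that name:

#!/bin/bash
# Rough sketch: restart kube-proxy when the set of eth* interfaces changes.
STATE_FILE=/var/run/eth-interfaces.sum

# Checksum the current list of ethN interfaces.
current=$(ip -o link show | awk -F': ' '{print $2}' | grep -E '^eth[0-9]+' | sort | md5sum)
previous=$(cat "$STATE_FILE" 2>/dev/null || true)

if [ "$current" != "$previous" ]; then
    echo "$current" > "$STATE_FILE"
    # Assumption: the kube-proxy pod sandbox is identifiable by name.
    pod_id=$(crictl pods --name kube-proxy -q | head -n1)
    if [ -n "$pod_id" ]; then
        crictl stopp "$pod_id"   # stop the pod sandbox
        crictl rmp "$pod_id"     # remove it so kubelet recreates kube-proxy
    fi
fi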

Is this still needed?

As this was what I was working on all-day yesterday, yep, I’d say so 😊 – in all seriousness though, if this hadn’t been a high-priority issue on the cluster, I might have tried to upgrade the AWS VPC-CNI and KubeProxy add-ons to a later version, to see if the issue was resolved, but at this time, we haven’t done that, so maybe I’ll issue a retraction later 😂

Featured image is “Apoptosis Network (alternate)” by “Simon Cockell” on Flickr and is released under a CC-BY license.

Ross Younger: Getting value from CI

Many of my peers within the software world know the value of Continuous Integration and don’t need convincing. This article is for everybody else.

Introduction

In my first job out of college we had what you’d recognise as CI, though the term wasn’t so popular then. It was powerful, very useful, but a source of Byzantine complexity.

I’ve also worked for people who didn’t think CI was worth doing because it was too expensive to set up and maintain. This is not totally unreasonable; the real question is to figure out where the value for your project might lie.

Recently, a friend wrote:

I don't really know very much about CI. I would be interested in knowing more and might even use some of the quick wins (...) I do not want to become completely reliant upon GitHub for anything.

So let’s start with a primer.

Terminology: What is CI?

Unfortunately the term “CI” is sometimes misused and/or confused.

The short answer is that it’s automation that regularly (continuously) does something useful with your codebase. These actions might take place on every commit, nightly, or be activated by some external trigger.

CI usually refers to a spectrum of practices, each step building on the last:

  • Continuous Build: Builds your code, usually to the unit or module level. Runs unit tests.
  • Continuous Integration: Assembles modules to a “finished application”, whatever that means. Runs integration tests.
  • Continuous Test: A full suite of automated tests. May include regression, performance, deployability and data migration.
  • Continuous Delivery: When the test suite passes, the latest version of the system is automatically released to a staging environment. This might involve building packages and putting them in a download area.
  • Continuous Deployment: When the automated tests pass, the software automatically goes live. Hold tight!

Exactly what these phases mean for your project, and how far you go with them, depends on your project.

  • What suits my embedded firmware probably won’t suit your cloud app or that other person’s desktop app.
  • The lines between the phases are blurry. For example, it may or may not make sense to build and integrate everything in one go.

Why CI

If deployed appropriately, CI can save time, reduce costs and improve quality. Even on a hobby project, there is often value in saving your time.

1. Automating stuff, so the humans don’t have to

You could use your engineers to do the repetitive drudge work of creating a release across multiple platforms. You could have them run a full barrage of tests before committing a code change… but should you? Engineers are expensive and generally dislike boring stuff, so the smart business move is usually to automate away the repetitive parts and have them focus where they can deliver most value.

If you’re not sure, consider this: how much time does your team spend per release cycle on the repetitive parts? Multiply that by your expected frequency of release cycles, and that should lead you to the answer.

2. Automatic analysis and status reporting

One place I worked had a release process which relied on an engineer reading multiple megabytes of log file to see if things had been successful. Many things could go wrong and leave the final output in a plausible but half-broken state. Worse, it wasn’t as simple as running the script in stop-on-error mode, because some of the steps were prone to false alarms.

You may be ahead of me here, but I didn’t think much of that setup.

Compilation failed? Show me the compiler output from the file that failed.

A test failed? I want to see the result of that test (expected & observed results).

Everything passed? Great, but don’t spend megabytes to convey one single bit of information.

At its simplest, a small project will have a single main branch, and the operational information you need can be boiled down to a small number of states:

  • Red traffic light: Something is broken
  • Yellow traffic light: Non-critical warning (not all projects use this)
  • Green traffic light: Everything is working

In a non-remote workplace it might make sense to set up some sort of status annunciator.

  • Some people use coloured lava lamps or similar.
  • At one place I worked the machinery in the factory had physical traffic light (andon) lamp sets. We set one of these up, driven by a Raspberry Pi wired in to the build server.
  • Some projects build more elaborate virtual dashboards that suit their needs. Multiple branches, multiple build configurations, whatever makes sense.

3. Improved quality

This one might be self-evident, but I’ll spell it out anyway.

A good CI system will let you incorporate tests of many different types, with variable pass/fail criteria. Think beyond unit and integration testing:

  • Regression (check that your bugs stay fixed)
  • Code quality (code/test coverage analysis; static analysis; dynamic memory leak analysis; automated code style checks)
  • Security analysis (are there any known issues in your dependencies?)
  • License/SBOM compliance
  • Fuzz testing (how does it handle randomised, unexpected inputs?)
  • Performance requirements
  • “Early warning” performance canaries
  • Standards compliance
  • System data migration
  • On-device testing (might be real, emulated or simulated hardware)
Performance canaries

Particularly where physical devices are involved, you might have a performance margin built in to your hardware spec. As the project evolves, inevitably new features will erode this margin. When you run out, things are going to go wrong, so you want to take action before you get there.

An early warning canary is some sort of metric with a threshold. Examples might include free memory, CPU/MCU consumption, or task processing time. When the threshold is passed, that's a sign that things are getting tight and it's time to take pre-emptive action. You might plan to spend some time on algorithmic optimisations, or to kick off a new hardware spin.
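
As a toy example (the metric and threshold here are invented), a canary can be as simple as a check script that fails the CI job when the margin gets thin:

#!/bin/bash
# Toy canary: warn well before the test target actually runs out of memory.
THRESHOLD_MB=64   # invented threshold, purely for illustration

free_mb=$(awk '/MemAvailable/ {print int($2 / 1024)}' /proc/meminfo)

if [ "$free_mb" -lt "$THRESHOLD_MB" ]; then
    echo "CANARY: only ${free_mb}MB available, threshold is ${THRESHOLD_MB}MB" >&2
    exit 1   # a non-zero exit fails the CI job, prompting pre-emptive action
fi
echo "CANARY OK: ${free_mb}MB available"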

If you can automate a really robust set of tests, you can have a lot of confidence in the state of your code at any given time. This gives incredible agility: you can release at any time, if the tests pass. This is the key to moving quickly, and is how a number of tech companies operate.

For a success story involving physical devices, check out the HP LaserJet team’s DevOps transformation.

4. Reduced time to resolve issues

If there’s one thing I’ve learned in the software business, it’s that it’s cheaper to find bugs closer to development - by orders of magnitude.

In other words, reduce the feedback cycle to reduce your costs. This is where automated tests and checks have great value.

  • If there is something wrong in code I modified a minute ago, I’m still in the right headspace and can usually fix it pretty quickly.
  • If it takes a few days to get a test result, I won’t remember all the detail and will have to refresh my memory.
  • If it takes several months to hear that something’s wrong, I may be working on a totally different part of the system and it will take longer to context switch.
  • If a bug report comes in from the field a year or two later, I might as well be starting again from scratch.

But - as ever - engineering is a trade-off. You can’t write a test to catch a bug you haven’t foreseen. It may be prohibitively expensive to test all possible combinations before release.

Why not CI

CI is not suitable for all software projects.

If you’re writing a scratch throw-away project that won’t live for very long, even simple CI may not be worth it.

If you have a legacy codebase that was written without testing in mind, it might be prohibitively expensive to refactor to set these up. Nevertheless, in such projects there is often still some value to be found in a continuous build.

Let’s be pragmatic.

Tests aren’t everything

On the face of it, more testing means greater quality, right? Well… maybe?

Keep the end goal in sight. It’s up to you to decide what makes sense for your situation; I recommend taking a whole-of-organisation view.

  • You need to balance test runtime against overall feedback cycles. If the tests take too long to run, you’re slowing people down.
  • Some tests are expensive in terms of time or consuming resources, so you might not want to run them daily.
  • Tests involving physical devices can be difficult to automate, and risk creating a process bottleneck. (Consider emulation and/or simulation where appropriate.)
  • Beware of over-testing; you may not need to exhaustively check all the combinations. Statistical techniques might help you out here.
  • Beware of making your black-box tests too strict; this can lead to brittle tests that are more hassle to maintain than they are worth.

Costs and maintenance

It will take time and effort to set up CI. How much time and effort, I can’t say.

In times past, CI was quite the bespoke effort.

These days there is good tooling support for many environments, so it is usually pretty quick to get something going. From there you can decide how far to go.

It might be too big for your platform

CI platforms are designed for small, lightweight processes. Think seconds to minutes, not hours.

If you need to build a large application or a full Yocto firmware image, it’s going to be tough to make that fit within the limits of a cloud-hosted CI platform. Don’t despair! There are ways out, but you need to be smart. Alternative options include:

  • self-hosting CI runners that take part in a cloud source repository’s CI;
  • self-hosting the CI environment (e.g. Gitlab, Jenkins, CircleCI), noting that most source code hosting platforms have integrations;
  • splitting up the task into multiple smaller CI jobs, making good use of artefacts between stages;
  • reconsidering what is truly worth automating anyway.

👷 Steps you can take

1. Build your units

In most projects you will already have had to set up a build system. Automating this is usually pretty cheap, though you will need to get the tooling right.

Tooling on cloud platforms

On-cloud CI (as provided by Github, Gitlab, Bitbucket and others) is generally containerised. What this means is that your project has to know how to install its own tooling, starting with a minimal (usually Linux) container image.

This is really good practice! Doing so means your required tools are themselves expressed in source code under revision control.

Where this might get tricky is if you have multiple build configurations (platforms or builds with different features). Don’t be surprised if automating reveals shortcomings in your setup.

If you have autogenerated documentation, consider running that too. (In Rust, for example, it could be as easy as adding a cargo doc step.)
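
To make that concrete, here is a hedged sketch of what a containerised build step might look like for a Rust project, starting from a minimal Debian-ish image. The package names and steps are illustrative, not prescriptive:

#!/bin/sh
# Sketch of a CI build step that installs its own tooling in a minimal container.
set -eu

apt-get update
apt-get install -y --no-install-recommends curl ca-certificates build-essential

# Install the Rust toolchain non-interactively.
curl -sSf https://sh.rustup.rs | sh -s -- -y
. "$HOME/.cargo/env"

cargo build --all-targets   # build the units
cargo doc --no-deps         # build the autogenerated documentation too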

2. Test your units

Adding unit tests to CI is usually pretty cheap though it will depend on the language and available test frameworks.

If you want to include language-intrinsic checks (e.g. code style, static analysis) this is a good time to build them in. Some analyses can be quite expensive so it may not make sense to run all the checks at the same frequency.

3. Integrate it

If you’re pulling multiple component parts (microservices, standalone executables) together to make an end result, that’s the next step. Do they play nicely? Do you want to run any isolated tests among them before you move to delivery-level tests?

4. Add more checks

I spoke about these above.

This is where things stop being cheap and you have to start thinking about building out supporting infrastructure.

5. Deliver it

Now we’re getting quite situation-specific. Think about what it means to deliver your project.

Are you building a package for an ecosystem (Rust crate / Python pypi / npm.js / …) ? You might be able to automate the packaging steps and that might be pretty cheap.

Are you building an application? Perhaps you can automate the process of building the installer / container / whatever shape it takes. If you have multiple build configurations or platforms, it could get very tedious to build them all by hand and there is often a win for automation.

Where there's code signing involved, you'll need to decide whether it makes sense to automate that or leave it as a manual release step. Never put private keys or other code signing secrets directly into source! Some platforms have secrets mechanisms that may be of use, but it pays to be cautious. If your secrets leak, how will you repair the situation?

Closing thoughts

  • Most projects will benefit from a little CI. You don’t need to have unit tests, though they are a good idea.
  • You’re going to have to maintain your CI, so build it for maintainability like you do your software.
  • Apply agile to your CI as you do to your deliverables. Perfect is the enemy of good enough. Build something, get feedback, iterate!
  • CI vendors want to lock you in to their platform. Keep your eyes open.
  • Don’t let CI become an all-consuming monster that prevents you from delivering in the first place!

Andy Smith: Check yo PTRs

Backstory

The other day I was looking through a log file and saw one of BitFolk's IP addresses doing something. I didn't recognise the address so I did a reverse lookup and got 2001-ba8-1f1-f284-0-0-0-2.autov6rev.bitfolk.space — which is a generic setting and not very useful.

It's quick to look this up and fix it of course, but I wondered how many other such addresses I had forgotten to take care of the reverse DNS for.

ptrcheck

In order to answer that question, automatically and in bulk, I wrote ptrcheck.

It was able to tell me that almost all of my domains had at least one reference to something without a suitable PTR record.

$ ptrcheck --server [::1] --zone strugglers.net
192.168.9.10 is pointed to by: intentionally-broken.strugglers.net.
Missing PTR for 192.168.9.10
1 missing/broken PTR record
$

Though it wasn't all bad news. 😀

$ ptrcheck --server [::1] --zone dogsitter.services -v
Connecting to ::1 port 53 for AXFR of zone dogsitter.services
Zone contains 57 records
Found 3 unique address (A/AAAA) records
2001:ba8:1f1:f113::80 is pointed to by: dogsitter.services., dev.dogsitter.services., www.dogsitter.services.
Found PTR: www.dogsitter.services.
85.119.84.147 is pointed to by: dogsitter.services., dev.dogsitter.services., tom.dogsitter.services., www.dogsitter.services.
Found PTR: dogsitter.services.
2001:ba8:1f1:f113::2 is pointed to by: tom.dogsitter.services.
Found PTR: tom.dogsitter.services.
100.0% good PTRs! Good job!
$

How it works

See the repository for full details, but briefly: ptrcheck does a zone transfer of the zone you specify and keeps track of every address (A / AAAA) record. It then does a PTR query for each unique address record to make sure it

  1. exists
  2. is "acceptable"

You can provide a regular expression for what you deem to be "unacceptable", otherwise any PTR content at all is good enough.

Why might a PTR record be "unacceptable"?

I am glad you asked.

A lot of hosting providers generate generic PTR records when the customer doesn't set their own. They're not a lot better than having no PTR at all.

Failure to comply is no longer an option (for me)

The program runs silently (unless you use --verbose) so I was able to make a cron job that runs once a day and complains at me if any of my zones ever refer to a missing or unacceptable PTR ever again!
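
For example (the schedule and mail address are hypothetical; the zones are the ones shown above), a crontab like this relies on cron emailing any unexpected output:

# Hypothetical crontab entries: ptrcheck is silent when all PTRs are present
# and acceptable, so any mail from cron means something needs fixing.
MAILTO=andy@example.com
30 6 * * * ptrcheck --server [::1] --zone strugglers.net
35 6 * * * ptrcheck --server [::1] --zone dogsitter.services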

By the way, I ran it against all BitFolk customer zones; 26.5% of them had at least one missing or generic PTR record.

BitFolk Wiki: Monitoring

Setup: NRPE example config

Revision as of 11:17, 19 November 2024


These sorts of checks can work without an agent (i.e. without anything installed on your VPS). More complicated checks such as disk space, load or anything else that you can check with a script will need some sort of agent such as an [https://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE--2D-Nagios-Remote-Plugin-Executor/details NRPE] daemon or [[Wikipedia:SNMP|SNMP]] daemon.
===NRPE===
NRPE is a typical agent you would run that would allow BitFolk's monitoring system to execute health checks on your VPS. On Debian/Ubuntu systems it can be installed from the package '''nagios-nrpe-server'''. This will normally pull in the package '''monitoring-plugins-basic''' which contains the check plugins.
Check plugins end up in the '''/usr/lib/nagios/plugins/''' directory. NRPE can run any of these when asked and feed the info back to BitFolk's Icinga. All of the existing ones should support a <code>--help</code> argument to let you know how to use them, e.g.
<syntaxhighlight lang="text">
$ /usr/lib/nagios/plugins/check_tcp --help
</syntaxhighlight>
You can run check plugins from the command line:
<syntaxhighlight lang="text">
$ /usr/lib/nagios/plugins/check_tcp -H 85.119.82.70 -p 443
TCP OK - 0.000 second response time on 85.119.82.70 port 443|time=0.000322s;;;0.000000;10.000000
</syntaxhighlight>
There are a large number of Nagios-compatible check plugins in existence so you should be able to find one that does what you want. If there isn't one, it's easy to write one. Here's an example of using '''check_disk''' to check the disk space of your root filesystem.
<syntaxhighlight lang="text">
$ /usr/lib/nagios/plugins/check_disk -w '10%' -c '4%' -p /
DISK OK - free space: / 631 MB (11% inode=66%);| /=5035MB;5381;5739;0;5979
</syntaxhighlight>
Once you have that working, you put it in an NRPE config file such as '''/etc/nagios/nrpe.d/xvda1.cfg'''.
<syntaxhighlight lang="text">
command[check_xvda1]=/usr/lib/nagios/plugins/check_disk -w '10%' -c '4%' -p /
</syntaxhighlight>
You should then tell BitFolk (in a support ticket) what the name of it is ("'''check_xvda1'''"). It will then get added to BitFolk's Icinga.
By this means you can check anything you can script.


==Alerts==

BitFolk Wiki: User:Equinox/WireGuard

Revision as of 20:33, 13 November 2024


'''STOP !!!!!!!!'''


'''This is NOT READY ! I guarantee it won't work yet. (Mainly the routing, also table=off needs research, I can't remember exactly what it does).'''
'''STOP !!!!!'''


'''It is untried / untested. It's a first draft fished out partly from my running system and partly my notes.'''
'''This is a first draft! If you're a hardy network type who can recover from errors / omissions in this page then go for it (and fix this page!)'''
 
'''HOWEVER...''' If you are a hardy network type and you want a go ... Have at it.




PrivateKey = # Insert the contents of the file server-private-key generated above
ListenPort = # Pick an empty UDP port to listen on. Remember to open it in your firewall
Address = 10.254.1.254/24
Address = 10.254.1.254/32
Address = 2a0a:1100:1018:1::fe/64
Address = 2a0a:1100:1018:1::fe/128
Table = off
</syntaxhighlight>


Address = 10.254.1.1/24
Address = 2a0a:1100:1018:1::1/64
Table = off
</syntaxhighlight>


=== Add the Client Information to the Server ===


Append the following to the server /etc/wireguard/wg0.conf, inserting your generated information where appropriate:
Append the following to the '''server''' /etc/wireguard/wg0.conf, inserting your generated information where appropriate:


<syntaxhighlight lang="text">
=== Add the Server Information to the Client ===


Append the following to the server /etc/wireguard/wg0.conf, inserting your generated information where appropriate:
Append the following to the '''client''' /etc/wireguard/wg0.conf, inserting your generated information where appropriate:


<syntaxhighlight lang="text">
# systemctl start wg-quick@wg0
# systemctl enable wg-quick@wg0      # Optional - start VPN at startup
</syntaxhighlight>
If all went well you should now have a working tunnel. Confirm by running:
<syntaxhighlight lang="text">
# wg
</syntaxhighlight>
If both sides have a reasonable looking "latest handshake" line then the tunnel is up.
The wg-quick scripts automatically set up routes / default routes based on the contents of the wg0.conf files, so at this point you can test the link by pinging addresses from either side.
Two further things may/will need to be configured to allow full routing...
==== Enable IP Forwarding ====
Edit /etc/sysctl.conf, or a local conf file in /etc/sysctl.d/ and enable IPv4 and/or IPv6 forwarding
<syntaxhighlight lang="text">
net.ipv4.ip_forward=1
net.ipv6.conf.all.forwarding=1
</syntaxhighlight>
Reload the kernel variables:
<syntaxhighlight lang="text">
systemctl reload procps
</syntaxhighlight>
==== WireGuard Max MTU Size ====
If some websites don't work properly over IPv6 (Netflix) you may be running into MTU size problems. If using nftables, this can be entirely fixed with the following line in the forward chain:
<syntaxhighlight lang="text">
oifname "wg0" tcp flags syn tcp option maxseg size set rt mtu
</syntaxhighlight>
==== Enable Forwarding In Your Firewall ====
You will have to figure out how to do this for your flavour of firewall. If you happen to be using nftables, the following snippet is an example (by no means a full config!) of how to forward IPv6 back and forth. This snippet allows all outbound traffic but throws incoming traffic to the table "ipv6-incoming-firewall" for further filtering. (For testing you could just "accept" but don't leave it like that!)
<syntaxhighlight lang="text">
chain ip6-forwarding {
    type filter hook forward priority 0; policy drop;
    oifname "wg0" tcp flags syn tcp option maxseg size set rt mtu
    ip6 saddr 2a0a:1100:1018:1::/64 accept
    ip6 daddr 2a0a:1100:1018:1::/64 jump ipv6-incoming-firewall
}
</syntaxhighlight>


Jon Spriggs: Talk Summary – OggCamp ’24 – Kubernetes, A Guide for Docker users

Format: Theatre Style room. ~30 attendees.

Slides: Available to view (Firefox/Chrome recommended – press “S” to see the required speaker notes)

Video: Not recorded. I’ll try to record it later, if I get a chance.

Slot: Graphine 1, 13:30-14:00

Notes: Apologies for the delay on posting this summary. The talk was delivered to a very busy room. Lots of amazing questions. The presenter notes were extensive, but entirely unused when delivered. One person asked a question, I said I’d follow up with them later, but didn’t find them before the end of the conference. One person asked about the benefits of EKS over ECS in AWS… as I’ve not used ECS, I couldn’t answer, but it sounds like they largely do the same thing.

Ross Younger: Announcing qcp

The QUIC Copier (qcp) is an experimental high-performance remote file copy utility for long-distance internet connections.

Source repository: https://github.com/crazyscot/qcp

📋 Features

  • 🔧 Drop-in replacement for scp
  • 🛡️ Similar security to scp, using existing, well-known mechanisms
  • 🚀 Better throughput on congested networks

📖 About qcp

qcp is a hybrid protocol combining ssh and QUIC.

We use ssh to establish a control channel to the target machine, then spin up the QUIC protocol to transfer data.

This has the following useful properties:

  • User authentication is handled entirely by ssh
  • Data is transmitted over UDP, avoiding known issues with TCP over “long, fat pipe” connections
  • Data in transit is protected by TLS using ephemeral keys
  • The security mechanisms all use existing, well-known cryptographic algorithms

For full documentation refer to qcp on docs.rs.

Motivation

I needed to copy multiple large (3+ GB) files from a server in Europe to my home in New Zealand.

I’ve got nothing against ssh or scp. They’re brilliant. I’ve been using them since the 1990s. However they run on top of TCP, which does not perform very well when the network is congested. With a fast fibre internet connection, a long round-trip time and noticeable packet loss, I was right in the sour spot. TCP did its thing and slowed down, but when the congestion cleared it was very slow to get back up to speed.

If you’ve ever been frustrated by download performance from distant websites, you might have been experiencing this same issue. Friends with satellite (pre-Starlink) internet connections seem to be particularly badly affected.

💻 Getting qcp

The project is a Rust binary crate.

You can install it:

  • as a Debian package or pre-compiled binary from the latest qcp release page (N.B. the Linux builds are static musl binaries);
  • with cargo install qcp (you will need to have a rust toolchain and capnpc installed);
  • by cloning and building the source repository.

You will need to install qcp on both machines. Please refer to the README for more.
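
As it's billed as a drop-in replacement for scp, usage should look familiar; something like the following, with the host and file names being purely illustrative:

# Hypothetical example, assuming scp-like syntax:
qcp bigfile.tar.gz me@server.example.org:/data/
# ...and pulling in the other direction:
qcp me@server.example.org:/data/bigfile.tar.gz .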

See also

Andy Smith: Protecting URIs from Tor nodes with the Apache HTTP Server

Recently I found one of my web services under attack from clients using Tor.

For the most part I am okay with the existence of Tor, but if you're being attacked largely or exclusively through Tor then you might need to take actions like:

  • Temporarily or permanently blocking access entirely.
  • Taking away access to certain privileged functions.

Here's how I did it.

Step 1: Obtain a list of exit nodes

Tor exit nodes are the last hop before reaching regular Internet services, so traffic coming through Tor will always have a source IP of an exit node.

Happily there are quite a few services that list Tor nodes. I like https://www.dan.me.uk/tornodes which can provide a list of exit nodes, updated hourly.

This comes as a list of IP addresses one per line so in order to turn it into an httpd access control list:

$ curl -s 'https://www.dan.me.uk/torlist/?exit' |
    sed 's/^/Require not ip /' |
    sudo tee /etc/apache2/tor-exit-list.conf >/dev/null

This results in a file like:

$ head -10 /etc/apache2/tor-exit-list.conf
Require not ip 102.130.113.9
Require not ip 102.130.117.167
Require not ip 102.130.127.117
Require not ip 103.109.101.105
Require not ip 103.126.161.54
Require not ip 103.163.218.11
Require not ip 103.164.54.199
Require not ip 103.196.37.111
Require not ip 103.208.86.5
Require not ip 103.229.54.107

Step 2: Configure httpd to block them

Totally blocking traffic from these IPs would be easier than what I decided to do. If you just wanted to totally block traffic from Tor then the easy and efficient answer would be to insert all these IPs into an nftables set or an iptables IP set.

For me, it's only some URIs on my web service that I don't want these IPs accessing and I wanted to preserve the ability of Tor's non-abusive users to otherwise use the rest of the service. An httpd access control configuration is necessary.

Inside the virtualhost configuration file I added:

    <Location /some/sensitive/thing>
        <RequireAll>
            Require all granted
            Include /etc/apache2/tor-exit-list.conf
        </RequireAll>
    </Location>

Step 3: Test configuration and reload

It's a good idea to check the correctness of the httpd configuration now. Aside from syntax errors in the list of IP addresses, this might catch if you forgot any modules necessary for these directives. Although I think they are all pretty core.

Assuming all is well then a graceful reload will be needed to make httpd see the new configuration.

$ sudo apache2ctl configtest
Syntax OK
$ sudo apache2ctl graceful

Step 4: Further improvements

Things can't be left there, but I haven't got around to any of this yet.

  1. Script the repeated download of the Tor exit node list. The list of active Tor nodes will change over time.
  2. Develop some checks on the list such as:
    1. Does it contain only valid IP addresses?
    2. Does it contain at least min number of addresses and less than max number?
  3. If the list changed, do the config test and reload again. httpd will not include the altered config file without a reload.
  4. If the list has not changed in x number of days, consider the data source stale and think about emptying the list.
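
A rough sketch of items 1–3 (with invented sanity thresholds, and no claim to production quality) might look like:

#!/bin/bash
# Rough sketch of a refresh job for the Tor exit node list.
set -eu

LIST=/etc/apache2/tor-exit-list.conf
NEW=$(mktemp)

curl -s 'https://www.dan.me.uk/torlist/?exit' | sed 's/^/Require not ip /' > "$NEW"

# Basic sanity checks: every line well-formed, and a plausible number of entries
# (the 500/20000 bounds are invented for illustration).
COUNT=$(wc -l < "$NEW")
if grep -qvE '^Require not ip [0-9a-fA-F:.]+$' "$NEW" || [ "$COUNT" -lt 500 ] || [ "$COUNT" -gt 20000 ]; then
    echo "Tor exit list looks wrong (${COUNT} lines); keeping the old one" >&2
    rm -f "$NEW"
    exit 1
fi

# Only test the config and reload if the list actually changed.
if ! cmp -s "$NEW" "$LIST"; then
    mv "$NEW" "$LIST"
    apache2ctl configtest && apache2ctl graceful
else
    rm -f "$NEW"
fi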

Performance thoughts

I have not checked how much this impacts performance. My service is not under enough load for this to be noticeable for me.

At the moment the Tor exit node list is around 2,100 addresses and I don't know how efficient the Apache HTTP Server is about a large list of Require not ip directives. Worst case is that for every request to that URI it will be scanning sequentially through to the end of the list.

I think that using httpd's support for DBM files in RewriteMaps might be quite efficient but this comes with the significant issue that IPv6 addresses have multiple formats, while a DBM lookup will be doing a literal text comparison.

For example, all of the following represent the same IPv6 address:

  • 2001:db8::
  • 2001:0DB8::
  • 2001:Db8:0000:0000:0000:0000:0000:0000
  • 2001:db8:0:0:0:0:0:0

httpd does have built-in functions to upper- or lower-case things, but not to compress or expand an IPv6 address. httpd access control directives are also able to match the request IP against a CIDR net block, although at the moment Dan's Tor node list does only contain individual IP addresses. At a later date one might like to try to aggregate those individual IP addresses into larger blocks.

httpd's RewriteMaps can also query an SQL server. Querying a competent database implementation like PostgreSQL could be made to alleviate some of those concerns if the data were represented properly, though this does start to seem like an awful lot of work just for an access control list!

Over on Fedi, it was suggested that a firewall rule — presumably using an nftables set or iptables IP set, which are very efficient — could redirect matching source IPs to a separate web server on a different port, which would then do the URI matching as necessary.

<nerdsnipe>There does not seem to be an Apache HTTP Server authz module for IP sets. That would be the best of both worlds!</nerdsnipe>

BitFolk Wiki: IPv6/VPNs

Using WireGuard

Revision as of 16:31, 31 October 2024
== Using WireGuard ==
Probably the more sensible choice in the 2020s, but, help?
[[User:Equinox/WireGuard|Not ready yet, but it's a start]]


== Using tincd ==

Chris Wallace: Moving from exist-db 3.0.1 to 6.0.1 6.2.0

That’s an awful lot of release notes to read through...

David Leadbeater: Restrict sftp with Linux user namespaces

A script to restrict SFTP to some directories, without needing chroot or other privileged configuration.

Andy Smith: Generating a link-local address from a MAC address in Perl

Example

On the host

$ ip address show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether aa:00:00:4b:a0:c1 brd ff:ff:ff:ff:ff:ff
[…]
    inet6 fe80::a800:ff:fe4b:a0c1/64 scope link 
       valid_lft forever preferred_lft forever

Generated by script

$ lladdr.pl aa:00:00:4b:a0:c1
fe80::a800:ff:fe4b:a0c1

Code

#!/usr/bin/env perl

use warnings;
use strict;
use 5.010;

if (not defined $ARGV[0]) {
    die "Usage: $0 MAC-ADDRESS"
}

my $mac = $ARGV[0];

if ($mac !~ /^
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}
    /ix) {
    die "'$mac' doesn't look like a MAC address";
}

my @octets = split(/:/, $mac);

# Algorithm:
# 1. Prepend 'fe80::' for the first 64 bits of the IPv6
# 2. Next 16 bits: Use first octet with 7th bit flipped, and second octet
#    appended
# 3. Next 16 bits: Use third octet with 'ff' appended
# 4. Next 16 bits: Use 'fe' with fourth octet appended
# 5. Next 16 bits: Use 5th octet with 6th octet appended
# = 128 bits.
printf "fe80::%x%02x:%x:%x:%x\n",
    hex($octets[0]) ^ 2,
    hex($octets[1]),
    hex($octets[2] . 'ff'),
    hex('fe' . $octets[3]),
    hex($octets[4] . $octets[5]);

See also

Jon Spriggs: This little #bash script will make capturing #output from lots of #scripts a lot easier

A while ago, I was asked to capture a LOT of data for a support case, where they wanted lots of commands to be run, like “kubectl get namespace” and then for each namespace, get all the pods with “kubectl get pods -n $namespace” and then describe each pod with “kubectl get pod -n namespace $podname”. Then do the same with all the services, deployments, ingresses and endpoints.

I wrote this function, and a supporting script to execute the actual checks, and just found it while clearing up!

#!/bin/bash

filename="$(echo $* | sed -E -e 's~[ -/\\]~_~g').log"
echo "\$ $@" | tee "${filename}"
"$@" 2>&1 | tee -a "${filename}"

This script is quite simple, it does three things

  1. Take the command you’re about to run, strip all the non-acceptable-filename characters out and replace them with underscores, and turn that into the output filename.
  2. Write the command into the output file, replacing any prior versions of that file
  3. Execute the command, and append the log to the output file.

So, how do you use this? Simple

log_result my-command --with --all --the options

This will produce a file called my-command_--with_--all_--the_options.log that contains this content:

$ my-command --with --all --the options
Congratulations, you ran my-command and turned on the options "--with --all --the options". Nice one!

… oh, and the command I ran to capture the data for the support case?

log_result kubectl get namespace
for TYPE in pod ingress service deployment endpoints
do
  for ns in $(kubectl get namespace | grep -v NAME | awk '{print $1}' )
  do
    echo $ns
    for item in $(kubectl get $TYPE -n $ns | grep -v NAME | awk '{print $1}')
    do
      log_result kubectl get $TYPE -n $ns $item -o yaml
      log_result kubectl describe $TYPE -n $ns $item
    done
  done
done

Featured image is “Travel log texture” by “Mary Vican” on Flickr and is released under a CC-BY license.

Jon Spriggs: Why (and how) I’ve started writing my Shell Scripts in Python

I’ve been using Desktop Linux for probably 15 years, and Server Linux for more like 25 in one form or another. One of the things you learn to write pretty early on in Linux System Administration is Bash Scripting. Here’s a great example

#!/bin/bash

i=0
until [ $i -eq 10 ]
do
  echo "Jon is the best!"
  (( i += 1 ))
done

Bash scripts are pretty easy to come up with, you just write the things you’d type into the interactive shell, and it does those same things for you! Yep, it’s pretty hard not to love Bash for a shell script. Oh, and it’s portable too! You can write the same Bash script for one flavour of Linux (like Ubuntu), and it’s probably going to work on another flavour of Linux (like RedHat Enterprise Linux, or Arch, or OpenWRT).

But. There comes a point where a Bash script needs to be more than just a few commands strung together.

At work, I started writing a “simple” installer for a Kubernetes cluster – it provisions the cloud components with Terraform, and then once they’re done, it then starts talking to the Kubernetes API (all using the same CLI tools I use day-to-day) to install other components and services.

When the basic stuff works, it’s great. When it doesn’t work, it’s a bit of a nightmare, so I wrote some functions to put logs in a common directory, and another function to gracefully stop the script running when something fails, and then write those log files out to the screen, so I know what went wrong. And then I gave it to a colleague, and he ran it, and things broke in a way that didn’t make sense for either of us, so I wrote some more functions to trap that type of error, and try to recover from them.

And each time, the way I tested where it was working (or not working) was to just… run the shell script, and see what it told me. There had to be a better way.

Enter Python

Python earns my vote for a couple of reasons (and they might not be right for you!)

  • I’ve been aware of the language for some time, and in fact, had patched a few code libraries in the past to use Ansible features I wanted.
  • My preferred IDE (Integrated Development Environment), Visual Studio Code, has a step-by-step debugger I can use to work out what’s going on during my programming
  • It’s still portable! In fact, if anything, it’s probably more portable than Bash, because the version of Bash on the Mac operating system – OS X – is really old, so lots of “modern” features I’d expect to be in bash and associated tooling aren’t there! Python is Python everywhere.
  • There’s an argument parsing tool built into the core library, so if I want to handle things like ./myscript.py --some-long-feature "option-A" --some-long-feature "option-B" -a -s -h -o -r -t --argument I can do, without having to remember how to write that in Bash (which is a bit esoteric!)
  • And lastly, for now at least!, is that Python allows you to raise errors that can be surfaced up to other parts of your program

Given all this, my personal preference is to write my shell scripts now in Python.

If you’ve not written python before, variables are written without any prefix (like you might have seen $ in PHP) and any flow control (like if, while, for, until) as well as any functions and classes use white-space indentation to show where that block finishes, like this:

def do_something():
  pass

if some_variable == 1:
  do_something()
  and_something_else()
  while some_variable < 2:
    some_variable = some_variable * 2

Starting with Boilerplate

I start from a “standard” script I use. This has a lot of those functions I wrote previously for bash, but with cleaner code, and in a way that’s a bit more understandable. I’ll break down the pieces I use regularly.

Starting the script up

Here’s the first bit of code I always write, this goes at the top of everything

#!/usr/bin/env python3
import logging
logger = logging

This makes sure this code is portable, and always uses Python 3 rather than Python 2. It also gives us a handle on the logging engine, which gets configured later with basicConfig.

At the bottom I create a block which the “main” code will go into, and then run it.

def main():
  logger.basicConfig(level=logging.DEBUG)
  logger.debug('Started main')

if __name__ == "__main__":
    main()

Adding argument parsing

There’s a standard library which takes command line arguments and uses them in your script, it’s called argparse and it looks like this:

#!/usr/bin/env python3
# It's convention to put all the imports at the top of your files
import argparse
import logging
logger = logging

def process_args():
  parser=argparse.ArgumentParser(
    description="A script to say hello world"
  )

  parser.add_argument(
    '--verbose', # The stored variable can be found by getting args.verbose
    '-v',
    action="store_true",
    help="Be more verbose in logging [default: off]"
  )

  parser.add_argument(
    'who', # This is a non-optional, positional argument called args.who
    help="The target of this script"
  )
  args = parser.parse_args()

  if args.verbose:
      logger.basicConfig(level=logging.DEBUG)
      logger.debug('Setting verbose mode on')
  else:
      logger.basicConfig(level=logging.INFO)

  return args

def main():
  args=process_args()

  print(f'Hello {args.who}')
  # Using f'' means you can include variables in the string
  # You could instead do print('Hello %s' % args.who)
  # but I always struggle to remember in what order I wrote things!

if __name__ == "__main__":
    main()

The order you put things in makes a lot of difference. You need to have the if __name__ == "__main__": line after you’ve defined everything else, but then you can put the def main(): wherever you want in that file (as long as it’s before the if __name__). But by having everything in one file, it feels more like those bash scripts I was talking about before. You can have imports (a bit like calling out to other shell scripts) and use those functions and classes in your code, but for the “simple” shell scripts, this makes most sense.
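
As a tiny illustration of that import point (the file and function names here are invented for the example), a second file can hold shared helpers, a bit like sourcing a library of shell functions:

# helpers.py - a hypothetical second file sitting next to your script
def greet(who: str) -> str:
    return f'Hello {who}'

# myscript.py - imports the helper and uses it
from helpers import greet

print(greet('world'))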

So what else do we do in Shell scripts?

Running commands

This is a class in its own right. You can pass a class around in a variable, but it has functions and properties of its own. It’s a bit chunky, but it handles one of the biggest issues I have with bash scripts – capturing both the “normal” output (stdout) and the “error” output (stderr) without needing to put that into an external file you can read later to work out what you saw, as well as storing the return, exit or error code.

# Add these extra imports
import os
import subprocess

class RunCommand:
    command = ''
    cwd = ''
    running_env = {}
    stdout = []
    stderr = []
    exit_code = 999

    def __init__(
      self,
      command: list = [], 
      cwd: str = None,
      env: dict = None,
      raise_on_error: bool = True
    ):
        self.command = command
        self.cwd = cwd
        
        self.running_env = os.environ.copy()

        if env is not None and len(env) > 0:
            for env_item in env.keys():
                self.running_env[env_item] = env[env_item]

        logger.debug(f'exec: {" ".join(command)}')

        try:
            result = subprocess.run(
                command,
                cwd=cwd,
                capture_output=True,
                text=True,
                check=True,
                env=self.running_env
            )
            # Store the result because it worked just fine!
            self.exit_code = 0
            self.stdout = result.stdout.splitlines()
            self.stderr = result.stderr.splitlines()
        except subprocess.CalledProcessError as e:
            # Or store the result from the exception(!)
            self.exit_code = e.returncode
            self.stdout = e.stdout.splitlines()
            self.stderr = e.stderr.splitlines()

        # If verbose mode is on, output the results and errors from the command execution
        if len(self.stdout) > 0:
            logger.debug(f'stdout: {self.list_to_newline_string(self.stdout)}')
        if len(self.stderr) > 0:
            logger.debug(f'stderr: {self.list_to_newline_string(self.stderr)}')

        # If it failed and we want to raise an exception on failure, record the command and args
        # then Raise Away!
        if raise_on_error and self.exit_code > 0:
            command_string = None
            args = []
            for element in command:
                if not command_string:
                    command_string = element
                else:
                    args.append(element)

            raise Exception(
                f'Error ({self.exit_code}) running command {command_string} with arguments {args}\nstderr: {self.stderr}\nstdout: {self.stdout}')

    def __repr__(self) -> str: # Return a string representation of this class
        return "\n".join(
            [
               f"Command: {self.command}",
               f"Directory: {self.cwd if not None else '{current directory}'}",
               f"Env: {self.running_env}",
               f"Exit Code: {self.exit_code}",
               f"nstdout: {self.stdout}",
               f"stderr: {self.stderr}" 
            ]
        )

    def list_to_newline_string(self, list_of_messages: list):
        return "\n".join(list_of_messages)

So, how do we use this?

Well… you can do this: prog = RunCommand(['ls', '/tmp', '-l']) with which we’ll get back the prog object. If you literally then do print(prog) it will print the result of the __repr__() function:

Command: ['ls', '/tmp', '-l']
Directory: current directory
Env: <... a collection of things from your environment ...>
Exit Code: 0
stdout: total 1
drwx------ 1 root  root  0 Jan 1 01:01 somedir
stderr:

But you can also do things like:

for line in prog.stdout:
  print(line)

or:

try:
  prog = RunCommand(['false'], raise_on_error=True)
except Exception as e:
  logger.error(e)
  exit(1)  # the exit code is embedded in the exception message
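
Because the class also accepts cwd and env, you can point a single command at a working directory and layer extra environment variables on top, which is handy for things like the Terraform steps mentioned earlier. A hypothetical invocation (the path and variable below are illustrative, and it assumes the RunCommand class and logger from above):

plan = RunCommand(
    ['terraform', 'plan', '-no-color'],
    cwd='/home/jon/my-cluster/terraform',   # hypothetical checkout path
    env={'TF_IN_AUTOMATION': 'true'},       # merged on top of os.environ
    raise_on_error=False                    # inspect the exit code ourselves
)

if plan.exit_code != 0:
    logger.error(plan)  # __repr__ prints the command, env, stdout and stderr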

Putting it together

So, I wrote all this up into a git repo, that you’re more than welcome to take your own inspiration from! It’s licensed under an exceptionally permissive licence, so you can take it and use it without credit, but if you want to credit me in some way, feel free to point to this blog post, or the git repo, which would be lovely of you.

Github: JonTheNiceGuy/python_shell_script_template

Featured image is “The Conch” by “Kurtis Garbutt” on Flickr and is released under a CC-BY license.

Alan PopeWhere are Podcast Listener Communities

Parasocial chat

On Linux Matters we have a friendly and active, public Telegram channel linked on our Contact page, along with a Discord Channel. We also have links to Mastodon, Twitter (not that we use it that much) and email.

At the time of writing there are roughly this ⬇️ number of people (plus bots, sockpuppets and duplicates) in or following each Linux Matters “official” presence:

Channel   Number
Telegram  796
Discord   683
Mastodon  858
Twitter   9919

Preponderance of chat

We chose to have a presence in lots of places, but primarily the talent presenters (Martin, Mark, and myself (and Joe)) only really hang out to chat on Telegram and Mastodon.

I originally created the Telegram channel on November 20th, 2015, when we were publishing the Ubuntu Podcast (RIP in Peace) A.K.A. Ubuntu UK Podcast. We co-opted and renamed the channel when Linux Matters launched in 2023.

Prior to the channel’s existence, we used the Ubuntu UK Local Community (LoCo) Team IRC channel on Freenode (also, RIP in Peace).

We also re-branded our existing Mastodon accounts from the old Ubuntu Podcast to Linux Matters.

We mostly continue using Telegram and Mastodon as our primary methods of communication because on the whole they’re fast, reliable, stay synced across devices, have the features we enjoy, and at least one of them isn’t run by a weird billionaire.

Other options

We link to a lot of other places at the top of the Linux Matters home page, where our listeners can chat, mostly to each other and not us.

Being over 16, I’m not a big fan of Discord, and I know Mark doesn’t even have an account there. None of us use Twitter much anymore, either.

Periodically I ponder if we (Linux Matters) should use something other than Telegram. I know some listeners really don’t like the platform, but prefer other places like Signal, Matrix or even IRC. I know for sure some non-listeners don’t like Telegram, but I care less about their opinions.

Part of the problem is that I don’t think any of us really enjoy the other realtime chat alternatives. Both Matrix and Signal have terrible user experience, and other flaws. Which is why you don’t tend to find us hanging out in either of those places.

There are further options I haven’t even considered, like Wire, WhatsApp, and likely more I don’t even know or care about.

So we kept using Telegram over any of the above alternative options.

Pondering Posting Polls

I have repeatedly considered asking the listeners about their preferred chat platforms via our existing channels. But that seems flawed, because we use what we like, and no matter how many people prefer something else, we’re unlikely to move. Unless something strange happens 👀 .

Plus, often times, especially on decentralised platforms, the audience can be somewhat “over-enthusiastic” about their preferred way being The Way™️ over the alternatives. It won’t do us any favours to get data saying 40% report we should use Signal, 40% suggest Matrix and 20% choose XMPP, if the four of us won’t use any of them.

Pursue Podcast Palaver Proposals

So rather than ask our audience, I thought I’d see what other podcasters promote for feedback and chatter on their websites.

I picked a random set from shows I have heard of, and may have listened to, plus a few extra ones I haven’t. None of this is endorsement or approval, I wanted the facts, just the fax, ma’am.

I collated the data in a json file for some reason, then generated the tables below. I don’t know what to do with this information, but it’s a bit of data we may use if we ever decide to move away from Telegram.

Presenting Pint-Sized Payoff

The table shows some nerdy podcasts along with their primary means (as far as I can tell) of community engagement. Data was gathered manually from podcast home pages and “about” pages. I generally didn’t go into the page content for each episode. I made an exception for “Dot Social” and “Linux OTC” because there’s nothing but episodes on their home page.

It doesn’t matter for this research, I just thought it was interesting that some podcasters don’t feel the need to break out their contact details to a separate page, or make it more obvious. Perhaps they feel that listeners are likely to be viewing an episode page, or looking at a specific show metadata, so it’s better putting the contact details there.

I haven’t included YouTube, where many shows publish and discuss, in addition to a podcast feed.

I am also aware that some people exclusively, or perhaps primarily publish on YouTube (or other video platforms). Those aren’t podcasts IMNSHO.

Key to the tables below. Column names have been shortened because it’s a w i d e table. The numbers indicate how many podcasts use that communication platform.

  • EM - Email address (13/18)
  • MA - Mastodon account (9/18)
  • TW - Twitter account (8/18)
  • DS - Discord server (8/18)
  • TG - Telegram channel (4/18)
  • IR - IRC channel (5/18)
  • DW - Discourse website (2/18)
  • SK - Slack channel (3/18)
  • LI - LinkedIn (2/18)
  • WF - Web form (2/18)
  • SG - Signal group (3/18)
  • WA - WhatsApp (1/18)
  • FB - FaceBook (1/18)

Linux

Show EM MA TW DS TG IR DW SK MX LI WF SG WA FB
Linux Matters ✅ ✅ ✅ ✅ ✅ ✅
Ask The Hosts ✅ ✅ ✅ ✅ ✅
Destination Linux ✅ ✅ ✅ ✅ ✅
Linux Dev Time ✅ ✅ ✅ ✅ ✅
Linux After Dark ✅ ✅ ✅ ✅ ✅
Linux Unplugged ✅ ✅ ✅ ✅
This Week in Linux ✅ ✅ ✅ ✅ ✅
Ubuntu Security Podcast ✅ ✅ ✅ ✅ ✅
Linux OTC ✅ ✅ ✅

Open Source Adjunct

Show EM MA TW DS TG IR DW SK MX LI WF SG WA FB
2.5 Admins ✅ ✅
Bad Voltage ✅ ✅ ✅ ✅
Coffee and Open Source ✅
Dot Social ✅ ✅
Open Source Security ✅ ✅ ✅
localfirst.fm ✅

Other Tech

Show EM MA TW DS TG IR DW SK MX LI WF SG WA FB
ATP ✅ ✅ ✅ ✅
BBC Newscast ✅ ✅ ✅
The Rest is Entertainment ✅

Point

Not entirely sure what to do with this data. But there it is.

Is Linux Matters going to move away from Telegram to something else? No idea.

Alun JonesMessing with web spiders

Yesterday I read a Mastodon posting. Someone had noticed that their web site was getting huge amounts of traffic. When they looked into it, they discovered that it was OpenAI's - about 422 words

Alan PopeWindows 3.11 on QEMU 5.2.0

This is mostly an informational PSA for anyone struggling to get Windows 3.11 working in modern versions of QEMU. Yeah, I know, not exactly a massively viral target audience.

Anyway, short answer, use QEMU 5.2.0 from December 2020 to run Windows 3.11 from November 1993.

Windows 3.11, at 1280x1024, running Internet Explorer 5, looking at a GitHub issue

An innocent beginning

I made a harmless jokey reply to a toot from Thom at OSNews, lamenting the lack of native Mastodon client for Windows 3.11.

When I saw Thom’s toot, I couldn’t resist, and booted a Windows 3.11 VM that I’d installed six weeks ago, manually from floppy disk images of MSDOS and Windows.

I already had Lotus Organiser installed to post a little bit of nostalgia-farming on threads - it’s what they do over there.

Post by @popey
View on Threads

I thought it might be fun to post a jokey diary entry. I hurriedly made my silly post five minutes after Thom’s toot, expecting not to think about this again.

Incorrect, brain

I shut the VM down, then went to get coffee, chuckling to my smart, smug self about my successful nerdy rapid-response. While the kettle boiled, I started pondering - “Wait, if I really did want to make a Mastodon client for Windows 3.11, how would I do it?”

I pondered and dismissed numerous shortcuts, including, but not limited to:

  • Fake it with screenshots doctored in MS Paint
  • Run an existing DOS Mastodon Client in a Window
  • Use the Windows Telnet client to connect insecurely to my laptop running the Linux command-line Mastodon client, Toot
  • Set up a proxy through which I could get to a Mastodon web page

I pondered a different way, in which I’d build a very simple proof of concept native Windows client, and leverage the Mastodon API. I’m not proficient in (m)any programming languages, but felt something like Turbo Pascal was time-appropriate and roughly within my capabilities.

Diversion

My mind settled on Borland Delphi, which I’d never used, but looked similar enough for a silly project to Borland Turbo Pascal 7.0 for DOS, which I had. So I set about installing Borland Delphi 1.0 from fifteen (virtual) floppy disks, onto my Windows 3.11 “Workstation” VM.

Windows 3.11, with a Borland Delphi window open

Thank you, whoever added the change floppy0 option to the QEMU Monitor. That saved a lot of time, and was reduced down to a loop of this fourteen times:

"Please insert disk 2"
CTRL+ALT+2
(qemu) change floppy0 Disk02.img
CTRL+ALT+1
[ENTER]

During my research for this blog, I found a delightful, nearly decade-old video of David Intersimone (“David I”) running Borland Delphi 1 on Windows 3.11. David makes it all look so easy. Watch this to get a moving-pictures-with-sound idea of what I was looking at in my VM.

Once Delphi was installed, I started pondering the network design. But that thought wasn’t resident in my head for long, because it was immediately replaced with the reason why I didn’t use that Windows 3.11 VM much beyond the original base install.

The networking stack doesn’t work. Or at least, it didn’t.

That could be a problem.

Retro spelunking

I originally installed the VM by following this guide, which is notable as having additional flourishes like mouse, sound, and SVGA support, as well as TCP/IP networking. Unfortunately I couldn’t initially get the network stack working as Windows 3.11 would hang on a black screen after the familiar OS splash image.

Looking back to my silly joke, those 16-bit Windows-based Mastodon dreams quickly turned to dust when I realised I wouldn’t get far without an IP address in the VM.

Hopes raised

After some digging in the depths of retro forums, I stumbled on a four year-old repo maintained by Jaap Joris Vens.

Here’s a fully configured Windows 3.11 machine with a working internet connection and a load of software, games, and of course Microsoft BOB 🤓

Jaap Joris published this ready-to-go Windows 3.11 hard disk image for QEMU, chock full of games, utilities, and drivers. I thought that perhaps their image was configured differently, and thus worked.

However, after downloading it, I got the same “black screen after splash” as with my image. Other retro enthusiasts had the same issue, and reported the details on this issue, about a year ago.

does not work, black screen.

It works for me and many others. Have you followed the instructions? At which point do you see the black screen?

The key to finding the solution was a comment from Jaap Joris pointing out that the disk image “hasn’t changed since it was first committed 3 years ago”, implying it must have worked back then, but doesn’t now.

Joy of Open Source

I figured that if the original uploader had at least some success when the image was created and uploaded, it is indeed likely QEMU or some other component it uses may have (been) broken in the meantime.

So I went rummaging in the source archives, looking for the most recent release of QEMU, immediately prior to the upload. QEMU 5.2.0 looked like a good candidate, dated 8th December 2020, a solid month before 18th January 2021 when the hda.img file was uploaded.

If you build it, they will run

It didn’t take long to compile QEMU 5.2.0 on my ThinkPad Z13 running Ubuntu 24.04.1. It went something like this. I presumed that getting the build dependencies for whatever is the current QEMU version, in the Ubuntu repo today, will get me most of the requirements.

$ sudo apt-get build-dep qemu
$ mkdir qemu
$ cd qemu
$ wget https://download.qemu.org/qemu-5.2.0.tar.xz
$ tar xvf qemu-5.2.0.tar.xz
$ cd qemu-5.2.0
$ ./configure
$ make -j$(nproc)

That was pretty much it. The build ran for a while, and out popped binaries and the other stuff you need to emulate an old OS. I copied the bits required directly to where I already had put Jaap Joris’ hda.img and start script.

$ cd build
$ cp qemu-system-i386 efi-rtl8139.rom efi-e1000.rom efi-ne2k_pci.rom kvmvapic.bin vgabios-cirrus.bin vgabios-stdvga.bin vgabios-vmware.bin bios-256k.bin ~/VMs/windows-3.1/

I then tweaked the start script to launch the local home-compiled qemu-system-i386 binary, rather than the one in the path, supplied by the distro:

$ cat start
#!/bin/bash
./qemu-system-i386 -nic user,ipv6=off,model=ne2k_pci -drive format=raw,file=hda.img -vga cirrus -device sb16 -display gtk,zoom-to-fit=on

This worked a treat. You can probably make out in the screenshot below, that I’m using Internet Explorer 5 to visit the GitHub issue which kinda renders when proxied via FrogFind by Action Retro.

Windows 3.11, at 1280x1024, running Internet Explorer 5, looking at a GitHub issue

Share…

I briefly toyed with the idea of building a deb of this version of QEMU for a few modern Ubuntu releases, and throwing that in a Launchpad PPA then realised I’d need to make sure the name doesn’t collide with the packaged QEMU in Ubuntu.

I honestly couldn’t be bothered to go through the pain of effectively renaming (forking) QEMU to something like OLDQEMU so as not to damage existing installs. I’m sure someone could do it if they tried, but I suspect it’s quite a search and replace, or move the binaries somewhere under /opt. Too much effort for my brain.

I then started building a snap of qemu as oldqemu - which wouldn’t require any “real” forking or renaming. The snap could be called oldqemu but still contain qemu-system-i386 which wouldn’t clash with any existing binaries of the same name as they’d be self-contained inside the compressed snap, and would be launched as oldqemu.qemu-system-i386.

That would make for one package to maintain rather than one per release of Ubuntu. (Which is, as I am sure everyone is aware, one of the primary advantages of making snaps instead of debs in the first place.)

Anyway, I got stuck with another technical challenge in the time I allowed myself to make the oldqemu snap. I might re-visit it, especially as I could leverage the Launchpad Build farm to make multiple architecture builds for me to share.

…or not

In the meantime, the instructions are above, and also (roughly) in the comment I left on the issue, which has kindly been re-opened.

Now, about that Windows 3.11 Mastodon client…

Alan PopeVirtual Zane Lowe for Spotify

tl;dr

I bodged together a Python script using Spotipy (not a typo) to feed me #NewMusicDaily in a Spotify playlist.

No AI/ML, all automated, “fresh” tunes every day. Tunes that I enjoy get preserved in a Keepers playlist; those I don’t like to get relegated to the Sleepers playlist.

Any tracks older than eleven days are deleted from the main playlist, so I automatically get a constant flow of new stuff.

My personal Zane Lowe in a box

Nutshell

  1. The script automatically populates this Virtual Zane Lowe playlist with semi-randomly selected songs that were released within the last week or so, no older (or newer).
  2. I listen (exclusively?) to that list for a month, signaling songs I like by hitting a button on Spotify.
  3. Every day, the script checks for ’expired’ songs whose release date has passed by more than 11 days.
  4. The script moves songs I don’t like to the Sleepers playlist for archival (and later analysis), and to stop me hearing them.
  5. It moves songs I do like to the Keepers playlist, so I don’t lose them (and later analysis).
  6. Goto 1.

I can run the script at any time to “top up” the playlist or just let it run regularly to drip-feed me new music, a few tracks at a time.

Clearly, once I have stashed some favourites away in the Keepers pile, I can further investigate those artists, listen to their other tracks, and potentially discover more new music.

Below I explain at some length how and why.

NoCastAuGast

I spent an entire month without listening to a single podcast episode in August. I even unsubscribed from everything and deleted all the cached episodes.

Aside: Fun fact: The Apple Podcasts app really doesn’t like being empty and just keeps offering podcasts it knows I once listened to despite unsubscribing. Maybe I’ll get back into listening to these shows again, but music is on my mind for now.

While this is far from a staggering feat of human endeavour in the face of adversity, it was a challenge for me, given that I listened to podcasts all the time. This has been detailed in various issues of my personal email newsletter, which goes out on Fridays and is archived to read online or via RSS.

In August, instead, I re-listened to some audio books I previously enjoyed and re-listened to a lot of music already present on my existing Spotify playlists. This became a problem because I got bored with the playlists. Spotify has an algorithm that can feed me their idea of what I might want, but I decided to eschew their bot and make my own.

Note: I pay for Spotify Premium, then leveraged their API and built my “application” against that platform. I appreciate some people have Strong Opinions™️ about Spotify. I have no plans to stop using Spotify anytime soon. Feel free to use whatever music service you prefer, or self-host your 64-bit, 192 kHz Hi-Res Audio from HDTracks through an Elipson P1 Pre-Amp & DAC and Cary Audio Valve MonoBlok Power Amp in your listening room. I don’t care.

I’ll be here, listening on my Apple AirPods, or blowing the cones out of my car stereo. Anyway…

I spent the month listening to great (IMHO) music, predominantly released in the (distant) past on playlists I chronically mis-manage. On the other hand, my son is an expert playlist curator, a skill he didn’t inherit from me. I suspect he “gets the aux” while driving with friends, partly due to his Spotify playlist mastery.

As I’m not a playlist charmer, I inevitably got bored of the same old music during August, so I decided it was time for a change. During the month of September, my goal is to listen to as much new (to me) music as I can and eschew the crusty playlists of 1990s Brit-pop and late-70s disco.

How does one discover new music though?

Novel solutions

I wrote a Python script.

Hear me out. Back in the day, there was an excellent desktop music player for Linux called Banshee. One of the great features Banshee users loved was “Smart Playlists.” This gave users a lot of control over how a playlist was generated. There was no AI, no cloud, just simple signals from the way you listen to music that could feed into the playlist.

Watch a youthful Jorge Castro from 13 years ago do a quick demo.

Jorge Demonstrating the awesome power of Smart Playlists in Banshee (RIP in Peace)

Aside: Banshee was great, as were many other Mono applications like Tomboy and F-Spot. It’s a shame a bunch of blinkered, paranoid, noisy, and wrong Linux weirdos chased the developers away, effectively killing off those excellent applications. Good job, Linux community.

Hey ho. Moving on. Where was I…

Spotify clearly has some built-in, cloud-based “smarts” to create playlists, recommendations, and queues of songs that its engineers and algorithm think I might like. There’s a fly in the ointment, though, and her name is Alexa.

No, Alexa, NO!

We have a “Smart” speaker in the kitchen; the primary music consumers are not me. So “my” listening history is now somewhat tainted by all the Chase Atlantic & Central Cee my son listens to and Michael (fucking) Bublé, my wife, enjoys. She enjoys it so much that Bublé has featured on my end-of-year “Spotify Unwrapped” multiple times.

I’m sure he’s a delightful chap, but his stuff differs from my taste.

I had some ideas to work around all this nonsense. My goals here are two-fold.

  1. I want to find and enjoy some new music in my life, untainted by other house members.
  2. Feed the Spotify algorithm with new (to me) artists, genres and songs, so it can learn what else I may enjoy listening to.

Obviously, I also need to do something to muzzle the Amazon glossy screen of shopping recommendations and stupid questions.

The bonus side-quest is learning a bit more Python, which I completed. I spent a few hours one evening on this project. It was a fun and educational bit of hacking during time I might otherwise use for podcast listening. The result is four hundred or so lines of Python, including comments. My code, like my blog, tends to be a little verbose because I’m not an expert Python developer.

I’m pretty positive primarily professional programmers potentially produce petite Python.

Not me!

Noodling

My script uses the Spotify API via Spotipy to manage an initially empty, new, “dynamic” playlist. In a nutshell, here’s what the python script does with the empty playlist over time:

  • Use the Spotify search API to find tracks and albums released within the last eleven days to add to the playlist. I also imposed some simple criteria and filters.
    • Tracks must be accessible to me on a paid Spotify account in Great Britain.
    • The maximum number of tracks on the playlist is currently ninety-four, so there’s some variety, but not too much as to be unwieldy. Enough for me to skip some tracks I don’t like, but still have new things to listen to.
    • The maximum tracks per artist or album permitted on the playlist is three, again, for variety. Initially this was one, but I felt it’s hard to fully judge the appeal of an artist or album based off one song (not you: Black Lace), but I don’t want entire albums on the list. Three is a good middle-ground.
    • The maximum number of tracks to add per run is configurable and was initially set at twenty, but I’ll likely reduce that and run the script more frequently for drip-fed freshness.
  • If I use the “favourite” or “like” button on any track in the list before it gets reaped by the script after eleven days, the song gets added to a more permanent keepers playlist. This is so I can quickly build a collection of newer (to me) songs discovered via my script and curated by me with a single button-press.
  • Delete all tracks released more than eleven days ago if I haven’t favourited them. I chose eleven days to keep it modern (in theory) and fresh (foreshadowing). Technically, the script does this step first to make room for additional new songs.

None of this is set in stone, but it is configurable with variables at the start of the script. I’ll likely be fiddling with these through September until I get it “right,” whatever that means for me. Here’s a handy cut-out-and-keep block diagram in case that helps, but I suspect it won’t.

 +---------------------------------------------+
 |                Spotify (Cloud)              |
 |           +-----------------+               |
 |           |  Main Playlist  |               |
 |           +-----------------+               |
 |         Like |           | Dislike          |
 |              v           v                  |
 |  +------------------+  +------------------+ |
 |  | Keeper Playlist  |  | Sleeper Playlist | |
 |  +------------------+  +------------------+ |
 +----------------------+----------------------+
                        ^
                        |
                        v
 +---------------------------------------------+
 |                Python Script                |
 |          +---------------------+            |
 |          | Calls Spotify API   |            |
 |          | and Manages Songs   |            |
 |          +---------------------+            |
 +---------------------------------------------+
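
If the diagram doesn’t help, perhaps code will. Here’s a heavily trimmed-down sketch of the prune-and-top-up loop, not the real four-hundred-line script: the playlist IDs are placeholders, it assumes the usual SPOTIPY_CLIENT_ID / SPOTIPY_CLIENT_SECRET / SPOTIPY_REDIRECT_URI environment variables are set, and it ignores pagination and most of the filtering rules described above.

import datetime
import spotipy
from spotipy.oauth2 import SpotifyOAuth

MAIN, KEEPERS, SLEEPERS = 'main-id', 'keepers-id', 'sleepers-id'  # placeholder playlist IDs
MAX_AGE_DAYS = 11

sp = spotipy.Spotify(auth_manager=SpotifyOAuth(
    scope='playlist-modify-private user-library-read'))

# Prune: tracks released more than MAX_AGE_DAYS ago move to Keepers (if liked)
# or Sleepers (if not), then leave the main playlist.
today = datetime.date.today()
for item in sp.playlist_items(MAIN)['items']:
    track = item['track']
    released = track['album']['release_date']
    try:
        age = (today - datetime.date.fromisoformat(released)).days
    except ValueError:
        continue  # some releases only carry a year or month, not a full date
    if age > MAX_AGE_DAYS:
        liked = sp.current_user_saved_tracks_contains(tracks=[track['id']])[0]
        sp.playlist_add_items(KEEPERS if liked else SLEEPERS, [track['id']])
        sp.playlist_remove_all_occurrences_of_items(MAIN, [track['id']])

# Top up: ask the search API for freshly tagged albums and take up to
# three tracks from each one.
for album in sp.search(q='tag:new', type='album', limit=10, market='GB')['albums']['items']:
    track_ids = [t['id'] for t in sp.album_tracks(album['id'])['items'][:3]]
    if track_ids:
        sp.playlist_add_items(MAIN, track_ids)

The real script layers the per-artist limits, the compilation filter, and the genre searches on top of this skeleton.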

Next track

The expectation is to run this script automatically every day, multiple times a day, or as often as I like, and end up with a frequently changing list of songs to listen to in one handy playlist. If I don’t like a song, I’ll skip it, and when I do like a song, I’ll likely play it more than once, and maybe click the “Like” icon.

My theory is that the list becomes a mix between thirty and ninety artists who have released albums over the previous rolling week. After the first test search on Tuesday, the playlist contained 22 tracks, which isn’t enough. I scaled the maximum up over the next few days. It’s now at ninety-four. If I exhaust all the music and get bored of repeats, I can always up the limit to get a few new songs.

In fact, on the very first run of the script, the test playlist completely filled with songs from one artist who had just released a new album. That triggered the implementation of the three songs per artist/album rule to reduce that happening.

I appreciate that listening to tracks out of sequence, rather than as a full album, is different from what the artist intended. But thankfully, I don’t listen to a lot of Adele, and the script no longer adds whole albums full of songs to the list. So, no longer a “me” problem.

No AI

I said at the top I’m not using any “AI/ML” in my script, and while that’s true, I don’t control what goes on inside the Spotify datacentre. The script is entirely subject to the whims of the Spotify API as to which tracks get returned to my requests. There are some constraints to the search API query complexity, and limits on what the API returns.

The Spotify API documentation has been excellent so far, as has the Spotipy docs.

Popular songs and artists often organically feature prominently in the API responses. Plus (I presume) artists and labels have financial incentives or an active marketing campaign with Spotify, further skewing search results. Amusingly, the API has an optional “hipster” tag to show the bottom 10% of results (ranked by popularity). I did that once, didn’t much like it, and won’t do it again.

It’s also subject to the music industry publishing music regularly, and licensing it to be streamed via Spotify where I live.

Not quite

With the script as-is, initially, I did not get fresh new tunes every single day as expected, so I had a further fettle to increase my exposure to new songs beyond what’s popular, trending, or tagged “new”. I changed the script to scan the last year of my listening habits to find genres of music I (and the rest of the family) have listened to a lot.

I trimmed this list down (to remove the genre taint) and then fed these genres to the script. It then randomly picks a selection of those genres and queries the API for new releases in those categories.

With these tweaks, I certainly think this script and the resulting playlist are worth listening to. It’s fresher and more dynamic than the 14-year-old playlist I currently listen to. Overall, the script works so that I now see songs and artists I’ve not listened to—or even heard of—before. Mission (somewhat) accomplished.

Indeed, with the genres feature enabled, I could add a considerable amount of new music to the list, but I am trying to keep it a manageable size, under a hundred tracks. Thankfully, I don’t need to worry about the script pulling “Death Metal,” “Rainy Day,” and “Disney” categories out of thin air because I can control which ones get chosen. Thus, I can coerce the selection while allowing plenty of randomness and newness.

I have limited the number of genre-specific songs so I don’t get overloaded with one music category over others.

Not new

There are a couple of wrinkles. One song that popped into the playlist this week is “Never Going Back Again” by Fleetwood Mac, recorded live at The Forum, Inglewood, in 1982. That’s older than the majority of what I listened to in all of August! It looks like Warner Records Inc. released that live album on 21st August 2024, well within my eleven-day boundary, so it’s technically within “The Rules” while also not being fresh, new music.

There’s also the compilation complication. Unfresh songs from the past re-released on “TOP HITS 2024” or “DANCE 2024 100 Hot Tracks” also appeared in my search criteria. For example, “Talk Talk” by Charli XCX, from her “Brat” album, released in June, is on the “DANCE 2024 100 Hot Tracks” compilation, released on 23rd August 2024, again, well within my eleven-day boundary.

I’m in two minds about these time-travelling playlist interlopers. I have never knowingly listened to Charli XCX’s “Brat” album by choice, nor have I heard live versions of Fleetwood Mac’s music. I enjoy their work, but it goes against the “new music” goal. But it is new to me which is the whole point of this exercise.

The further problem with compilations is that they contain music by a variety of artists, so they don’t hit the “max-per-artist” limit but will hit the “max-per-album” rule. However, if the script finds multiple newly released compilations in one run, I might end up with a clutch of random songs spread over numerous “Various Artists” albums, maxing out the playlist with literal “filler.”

I initially allowed compilations, but I’m irrationally bothered that one day, the script will add “The Birdie Song” by Black Lace as part of “DEUTSCHE TOP DISCO 3000 POP GEBURTSTAG PARTY TANZ SONGS ZWANZIG VIERUNDZWANZIG”.

Nein.

I added a filter to omit any “album type: compilation,” which knocks that bopping-bird-based botherer squarely on the bonce.

No more retro Europop compilation complications in my playlist. Alles klar.

Not yet

Something else I had yet to consider is that some albums have release dates in the future. Like a fresh-faced newborn baby with an IDE and API documentation, I assumed that albums published would generally have release dates of today or older. There may be a typo in the release_date field, or maybe stuff gets uploaded and made public ahead of time in preparation for a big marketing push on release_date.

I clearly do not understand the music industry or publishing process, which is fine.

Nuke it from orbit

I’ve been testing the script while I prototyped it, this week, leading up to the “Grand Launch” in September 2024 (next month/week). At the end of August I will wipe the slate (playlist) clean, and start again on 1st September with whatever rules and optimisations I’ve concocted this week. It will almost certainly re-add some of the same tracks back-in after the 31st August “Grand Purge”, but that’s as expected, working as designed. The rest will be pseudo-random genre-specific tracks.

I hope.

Newsletter

I will let this thing go mad each day with the playlist and regroup at the end of September to evaluate how this scheme is going. Expect a follow-up blog post detailing whether this was a fun and interesting excursion or pure folly. Along the way, I did learn a bit more Python, the Spotify API, and some other interesting stuff about music databases and JSON.

So it’s all good stuff, whether I enjoy the music or not.

You can get further, more timely updates in my weekly email newsletter, or view it in the newsletter archive, and via RSS, a little later.

Ken said he got “joy out of reading your newsletter”. YMMV. E&OE. HTH. HAND.

Nomenclature

Every good project needs a name. I initially called it my “Personal Dynamic Playlist of Sixty tracks over Eleven days,” or PDP-11/60 for short, because I’m a colossal nerd. Since bumping the max-tracks limit for the playlist, it could be re-branded PDP-11/94. However, this is a relatively niche and restrictive playlist naming system, so I sought other ideas.

My good friend Martin coined the term “Virtual Zane Lowe” (Zane is a DJ from New Zealand who is apparently renowned for sharing new music). That’s good enough for me. Below are links to all three playlists if you’d like to listen, laugh, live, love, or just look at them.

The “Keepers” and “Sleepers” lists will likely be relatively empty for a few days until the script migrates my preferred and disliked tracks over for safe-keeping & archival, respectively.

November approaches

Come back at the end of the month to see if my script still works, if the selections are good, if I’m still listening to this playlist, and, most importantly, whether I enjoy doing so!

If it works, I’ll probably continue using it through October and into November as I commute to and from the office. If that happens, I’ll need to update the playlist artwork. Thankfully, there’s an API for that, too!

I may consider tidying up the script and sharing it online somewhere. It feels a bit niche and requires a paid Spotify account to even function, so I’m not sure what value others would get from it other than a hearty chuckle at my terribad Python “skills.”

One potentially interesting option would be to map the songs in Spotify to another, such as Apple Music or even videos on YouTube. The YouTube API should enable me to manage video playlists that mirror the ones I manage directly on Spotify. That could be a fun further extension to this project.

Another option I considered was converting it to a web app, a service I (and other select individuals) can configure and manage in a browser. I’ll look into that at the end of the month. If the current iteration of the script turns out to be a complete bust, then this idea likely won’t go far, either.

Thanks for reading. AirPods in. Click “Shuffle”.

Ross YoungerBroadcast graphics for fencing

I created a TV graphics package for fencing tournaments.

Earlier this year, Christchurch played host to the Commonwealth Junior & Cadet fencing tournament.

Selected parts of the tournament were livestreamed, with a package broadcast on Sky TV (NZ). The broadcast and finals streams had a live graphics package fed from the scoreboard.

The programmes were produced using a broadcast-spec OB truck supplied by Kiwi Outside Broadcast. The truck graphics PC used Captivate to generate its graphics, which output as key+fill SDI signals. These were fed to the vision mixer and keyed onto the picture in the usual way.

The package can be seen in action on the Commonwealth Junior & Cadet 2024 programmes.

The details read from the scoreboard cover the “hit” lamps, scores, clock (including fractional seconds in the last 10s), period, red/yellow cards, and the priority indicator. On top of that, the package provides a place to enter the fencer names and nationalities, and set colours for them.

This is all made possible by the scoreboard, a Favero FA-07, offering a data feed over an RS-422 interface. I wrote a Python script to parse the data feed, turn it into a JSON dictionary and pass on to Captivate.
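
For flavour, the shape of that bridge script is roughly as follows. This is only a sketch under assumptions: the serial settings, frame length and field offsets are placeholders rather than the real FA-07 protocol, and it assumes pyserial is available.

import json
import serial  # pyserial, reading the RS-422 feed via a serial adapter

PORT = '/dev/ttyUSB0'    # placeholder device name
FRAME_LENGTH = 10        # placeholder size; the scoreboard sends fixed-length frames

with serial.Serial(PORT, baudrate=2400, timeout=1) as feed:
    while True:
        frame = feed.read(FRAME_LENGTH)
        if len(frame) < FRAME_LENGTH:
            continue  # incomplete read; wait for the next frame
        # The offsets below are illustrative only; the real script decodes the
        # documented fields (scores, clock, hit lamps, cards, priority).
        state = {
            'score_left': frame[1],
            'score_right': frame[2],
            'clock': f'{frame[3]:02d}:{frame[4]:02d}',
        }
        print(json.dumps(state))  # the real script hands this JSON on to Captivate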

Alan PopeText Editors with decent Grammar Tools

This is another blog post lifted wholesale out of my weekly newsletter. I do this when I get a bit verbose to keep the newsletter brief. The newsletter is becoming a blog incubator, which I’m okay with.

A reminder about that newsletter

The newsletter is emailed every Friday - subscribe here, and is archived and available via RSS a few days later.

I talked a bit about the process of setting up the newsletter on episode 34 of Linux Matters Podcast. Have a listen if you’re interested.

Linux Matters 34

Patreon supporters of Linux Matters can get the show a day or so early and without adverts.

Multiple kind offers

Good news, everyone! I now have a crack team of fine volunteers who proofread the text that lands in your inbox/browser cache/RSS reader. Crucially, they’re doing that review before I send the mail, not after, as was previously the case. Thank you for volunteering, mystery proofreaders.

popey dreamland

Until now, my newsletter “workflow” (such as it was) involved hoping that I’d get it done and dusted by Friday morning. Then, ideally, it would spend some time “in review”, followed by saving to disk. But if necessary, it would be ready to be opened in an emergency text editor at a moment’s notice before emails were automatically sent by lunchtime.

I clearly don’t know me very well.

popey reality

What actually happened is that I would continue editing right up until the moment I sent it out, then bash through the various “post-processing” steps and schedule the emails for “5 minutes from now.” Boom! Done.

This often resulted in typos or other blemishes in my less-than-lovingly crafted emails to fabulous people. A few friends would ping me with corrections. But once the emails are sent, reaching out and fixing those silly mistakes is problematic.

Someone should investigate over-the-air updates to your email. Although zero-day patches and DLC for your inbox sound horrendous. Forget that.

In theory, I could tweak the archived version, but that is not straightforward.

Tool refresh?

Aside: Yes, I know it’s not the tools, but I should slow down, be more methodical and review every change to my document before publishing. I agree. Now, let’s move on.

While preparing the newsletter, I would initially write in Sublime Text (my desktop text editor of choice), with a Grammarly† (affiliate link) LSP extension, to catch my numerous blunders, and re-word my clumsy English.

Unfortunately, the Grammarly extension for Sublime broke a while ago, so I no longer have that available while I prepare the newsletter.

I could use Google Docs, I suppose, where Grammarly still works, augmenting the built-in spell and grammar checker. But I really like typing directly as Markdown in a lightweight editor, not a big fat browser. So I guess I need to figure something else out to check my spelling and grammar prior to the awesome review team getting it to save at least some of my blushes.

I’m not looking for suggestions for a different text editor—or am I? Maybe I am. I might be.

Sure, that’ll fix it.

ZX81 -> Spectrum -> CPC -> edlin -> Edit -> Notepad -> TextPad -> Sublime -> ?

I’ve used a variety of text editors over the years. Yes, the ZX81 and Sinclair Spectrum count as text editors. Yes, I am old.

I love Sublime’s minimalism, speed, and flexibility. I use it for all my daily work notes, personal scribblings, blog posts, and (shock) even authoring (some) code.

I also value Sublime’s data-recovery features. If the editor is “accidentally” terminated or a power-loss event occurs, Sublime reliably recovers its state, retaining whatever you were previously editing.

I regularly use Windows, Linux, and macOS on any given day across multiple computers. So, a cross-platform editor is also essential for me, but only on the laptop/desktop, as I never edit on mobile‡ devices.

I typically just open a folder as a “workspace” in a window or an additional tab in one window. I frequently open many folders, each full of files across multiple displays and machines.

All my notes are saved in folders that use Syncthing to keep in sync across machines. I leave all of those notes open for days, perhaps weeks, so having a robust sync tool combined with an editor that auto-reloads when files change is key.

The notes are separately backed up, so cloud storage isn’t essential for my use case.

Something else?

Whatever else I pick, it’s really got to fit that model and requirements, or it’ll be quite a stretch for me to change. One option I considered and test-drove is NotepadNext. It’s an open-source re-implementation of Notepad++, written in C++ and Qt.

A while back, I packaged up and published it as a snap, to make it easy to install and update. It fits many of the above requirements already, with the bonus of being open-source, but sadly, there is no Grammarly support there either.

I’d prefer no :::: W I D E - L O A D :::: electron monsters. Also, not Notion or Obsidian, as I’ve already tried them, and I’m not a fan. In addition, no, not Vim or Emacs.

Bonus points if you have a suggestion where one of the selling points isn’t “AI”§.

Perhaps there isn’t a great plain text editor that fulfills all my requirements. I’m open to hearing suggestions from readers of this blog or the newsletter. My contact details are here somewhere.


† - Please direct missives about how terrible Grammarly is to /dev/null. Thanks. Further, suggestions that I shouldn’t rely on Grammarly or other tools and should just “Git Gud” (as the youths say) may be inserted into the A1481 on the floor.

‡ - I know a laptop is technically a “mobile” device.

§ - Yes, I know that “Not wanting AI” and “Wanting a tool like Grammarly” are possibly conflicting requirements.

◇ - For this blog post I copy and pasted the entire markdown source into a Google doc, used all the spelling and grammar tools, then pasted it back into Sublime, pushed to git, and I’m done. Maybe that’s all I need to do? Keep my favourite editor, and do all the grammar in one chunk at the end in a tab of a browser I already had open anyway. Beat that!

Alan PopeApplication Screenshots on macOS

I initially started typing this as short -[ Contrafibularities ]- segment for my free, weekly newsletter. But it got a bit long, so I broke it out into a blog post instead.

About that newsletter

The newsletter is emailed every Friday - subscribe here, and is archived and available via RSS a few days later. I talked a bit about the process of setting up the newsletter on episode 34 of Linux Matters Podcast. Have a listen if you’re interested.

Linux Matters 34

Patreon supporters of Linux Matters can get the show a day or so early, and without adverts.

Going live!

I have a work-supplied M3 MacBook Pro. It’s a lovely device with ludicrous battery endurance, great screen, keyboard and decent connectivity. As an ex-Windows user at work, and predominantly Linux enthusiast at home, macOS throws curveballs at me on a weekly basis. This week, screenshots.

I wrote a ‘going live’ shell script for my personal live streams. For the title card, I wanted the script to take a screenshot of the running terminal, Alacritty. I went looking for ways to do this on the command line, and learned that macOS has shipped a screencapture command-line tool for some time now. Amusingly the man page for it says:

DESCRIPTION
 The screencapture utility is not very well documented to date.
 A list of options follows.

and..

BUGS
 Better documentation is needed for this utility.

This is 100% correct.

How hard can it be?

Perhaps I’m too used to scrot on X11, that I have used for over 20 years. If I want a screenshot of the current running system, just run scrot and bang there’s a PNG in the current directory showing what’s on screen. Easy peasy.

On macOS, run screencapture image.png and you’ll get a screenshot alright, of the desktop, your wallpaper. Not the application windows carefully arranged on top. To me, this is somewhat obtuse. However, it is also possible to screenshot a window, if you know the <windowid>.

From the screencapture man page:

 -l <windowid> Captures the window with windowid.

There appears to be no straightforward way to actually get the <windowid> on macOS, though. So, to discover the <windowid> you might want the GetWindowID utility from smokris (easily installed using Homebrew).

That’s fine and relatively straightforward if there’s only one Window of the application, but a tiny bit more complex if the app reports multiple windows - even when there’s only one. Alacritty announces multiple windows, for some reason.

$ GetWindowID Alacritty --list
"" size=500x500 id=73843
"(null)" size=0x0 id=73842
"alan@Alans-MacBook-Pro.local (192.168.1.170) - byobu" size=1728x1080 id=73841

FINE. We can deal with that:

$ GetWindowID Alacritty --list | grep byobu | awk -F '=' '{print $3}'
73841

You may then encounter the mysterious could not create image from window error. This threw me off a little, initially. Thankfully I’m far from the first to encounter this.

Big thanks to this rancher-sandbox, rancher-desktop pull request against their screenshot docs. Through that I discovered there’s a macOS security permission I had to enable, for the terminal application to be able to initiate screenshots of itself.

A big thank-you to both of the above projects for making their knowledge available. Now I have this monstrosity in my script, to take a screenshot of the running Alacritty window:

screencapture -l$(GetWindowID Alacritty --list | \
 grep byobu | \
 awk -F '=' '{print $3}') titlecard.png

If you watch any of my live streams, you may notice the title card. Now you know how it’s made, or at least how the screenshot is created, anyway.

Andy SmithDaniel Kitson – Collaborator (work in progress)

Collaborators

Last night we went to see Daniel Kitson's "Collaborator" (work in progress). I'd no idea what to expect but it was really good!

A photo of the central area of a small theatre in the round. There are four tiers of seating and then an upper balcony. Most seats are filled. The central stage area is empty except for four large stacks of paper.
The in-the-round setup of Collaborator at The Albany Theatre, Deptford, London

It has been reviewed at 4/5 stars in Chortle and positively in the Guardian, but I don't recommend reading any reviews because they'll spoil what you will experience. We went in to it blind as I always prefer that rather than thorough research of a show. I think that was the correct decision. I've been on Daniel's fan newsletter for ages but hadn't had chance to see him live until now.

While I've seen some comedy gigs that resembled this, I've never seen anything quite like it.

At £12 a ticket this is an absolute bargain. We spent more getting there by public transport!

Shout out to the nerds

If you're a casual comedy enjoyer looking for something a bit different then that's all you need to know. If like me however you consider yourself a bit of a wanky appreciator of comedy as an art form, I have some additional thoughts!

Collaborator wasn't rolling-on-the-floor-in-tears funny, but was extremely enjoyable and Jenny and I spent the whole way home debating how Kitson designed it and what parts of it really meant. Not everyone wants that in comedy, and that's fine. I don't always want it either. But to get it sometimes is a rare treat.

It's okay to enjoy a McIntyre or Peter Kay crowd-pleaser about "do you have a kitchen drawer full of junk?" or "do you remember white dog poo?" but it's also okay to appreciate something that's very meta and deconstructive. Stewart Lee for example is often accused of being smug and arrogant when he critiques the work of other comedians, and his fans to some extent are also accused of enjoying feeling superior more than they enjoy a laugh - and some of them who miss the point definitely are like this.

But acts like Kitson and Lee are constructed personalities where what they claim to think and how they behave is a fundamental part of the performance. You are to some extent supposed to disagree with and be challenged by their views and behaviours — and I don't just mean they are edgelording with some "saying the things that can't be said" schtick. Sometimes it's fun to actually have thoughts about it. It's a different but no less valid (or more valid!) experience. A welcome one in this case!

I mean, I might have to judge you if you enjoy Mrs Brown's Boys, but I accept it has an audience as an art form.

White space

There was a comment on Fedi about how the crowd pictured here appears to be a sea of white faces, despite London being a fairly diverse city. This sort of thing hasn't escaped me. I've found it to be the case in most of the comedy gigs I've attended in person, where the performer is white. I don't know why. In fact, acts like Stewart Lee and Richard Herring will frequently make reference to the fact that their stereotypical audience member is a middle aged white male computer toucher with lefty London sensibilities. So, me then.

Don't get me wrong, I do try to see some diverse acts and have been in a demographic minority a few times. Sadly enough, just going to see a female act can be enough to put you in an audience of mostly women. That happened when we went to see Bridget Christie's Who Am I? ("a menopause laugh a minute with a confused, furious, sweaty woman who is annoyed by everything", 4 stars, Chortle), and it's a shame that people seem to stick in their lanes so much.


Josh HollandEven more on git scratch branches: using Jujutsu


This is the third post in an impromptu series:

  1. Use a scratch branch in git
  2. More on git scratch branches: using stgit

It seems the main topic of this blog is now git scratch branches and ways to manage them, although the main prompt for this one is discovering someone else had exactly the same idea, as I found from a blog post extolling Jujutsu.

I don’t have much to add to the posts from qword and Sandy, beyond the fact that Jujutsu really is the perfect tool to make this workflow straightforward. The default change selection logic in jj rebase means that 9 times out of 10 it’s enough just to run jj rebase -d master to get everything up to date with the master branch, and the Jujutsu workflow as a whole really is a great experience.

So go forth, use Jujutsu to manage your dev branch, and hopefully I’ll never have to write another post on this, and you can have the traditional “I rewrote my blogging engine from scratch again” post that I’ve been owing for a month or two now.

Ross YoungerFault-finding at the ends of the earth

This is a tale from many months ago, working on an embedded ARM target.

In my private journal I wrote:

Today I feel like I saddled up and rode my horse to the literal ends of the earth. I was fault-finding in the setting-up-of-the-universe that happens before your program starts up, and in the tearing-it-down-again that happens after you declare you’re finished.

If you know C++, you might guess that this was a story about static object allocation and deallocation. You’d be right. So, destructors belonging to static-allocated objects. You’d never think they’d run on a bare-metal embedded target.

Well, they can. If your target supports exit() - e.g. if you are running with newlib - then an atexit handler is set up for you, and that will be set up to run the static destructors. If your program then calls exit() (as, say, your on-silicon unit tests might, at the end of a test run) then things are at risk of turning to custard.

You might have enabled an interrupt for some peripheral on the silicon. In order to do anything really useful, the ISR might reference a static object. If you do this, you’d damn well better make sure the object has a static destructor that disables the interrupt, or hilarity is one day going to ensue. You know, the sort of hilarity that involves being savaged by a horde of angry rampaging badgers, or your socks catching fire.

But wait, I hear you say, it called exit! The program no longer exists! Well, sure it doesn’t; but what happens on exit? On this particular ARM target, running tests via a debugger as part of a CI chain, when the atexit handlers have run the process signals final completion with a semihosting call, which is a special flavour of debug breakpoint. It is… not fast. If your interrupt happens regularly, the goblins are going to get you before the pseudo-system-call completes. Your test framework will fail the test executable for hanging, despite somehow passing all of its tests.

There was an actual bug in there, and it was mine. Class X, which contained an RTOS queue and enabled an interrupt, only had a default destructor. On exit, somewhen between static destructors and completion of the semihosting exit call, the ISR fired. It duly failed to insert an item into the now-destroyed queue, so jumped to the internal panic routine. That routine contained a breakpoint and then went nowhere fast, waiting for a debugger command that was never going to arrive — hence the time-out. Maybe it would have been useful to have a library option to skip the static destructors, but I probably wouldn’t have been aware of it ahead of time anyway.

The static destructor ordering fiasco can also be yours for the taking, but thankfully that hadn’t bitten me. Nevertheless, it was a rough day.

Cover image: Cyber Bug Search, by juicy_fish on Freepik

Chris WallaceNon-Eulerian paths

I’ve been doing a bit of work on Non-Eulerian paths.  I haven't made any algorithmic progress...

Chris WallaceMore Turtle shapes

I’ve come across some classic curves using arcs which can be described in my variant of Turtle...

Chris WallaceMy father in Tairua

My father in Tairua in 1929. Paku is in the background. My father, Francis Brabazon Wallace came...

Chris Wallace“Characters” a Scroll Saw Project

Now that the Delta Scroll saw is working, I was looking for a project to build up my skill. Large...

Phil SpencerWho’s sick of this shit yet?

I find some headlines just make me angry these days, especially ones centered around hyper late stage capitalism.


This one about Apple and Microsoft just made me go “Who the fuck cares?”, and seriously, why should I care? Those two idiot companies having insane and disgustingly huge market caps isn’t something I’m impressed by.

If anything it makes me furious.

Do something useful besides making iterations of the same ol junk. Make a few thousand houses, make an affordable grocery supply chain.

If you’re doing anything else you’re a waste of everyone’s time… as I type this on my Apple computer. Still, that bit of honesty aside, I don’t give a fuck about either company’s made-up valuation.

Ross YoungerThe Road Less Travelled

I’ve started an occasional YouTube series about quirky, off-the-beaten-track places.

At the time of writing there are 5 episodes online. I’m still experimenting with style and techniques. My inspiration is a combination of The Tim Traveller and Tom Scott.

So far it consists of a few random places around South Island. There’s no particular timescale for publishing new episodes; I have many demands on my time, and this project isn’t yielding an income at the moment. (It may never; I’m doing it for fun.) Presenting is taking some getting used to!

Here’s the playlist:

Phil SpencerNew year new…..This

I have made a new year’s goal to retire this server before March. The OS has been upgraded many, many times over the years and various software I’ve used has come and gone, so there is lots of cruft. This server/VM started in San Francisco, and then my provider stopped offering VMs in CA and moved my VM to the UK, which is where it has been ever since. This VM started its life in Jan 2008 and it is time for it to die.

During my 2 week xmas break I have been updating web facing software as much as I could, so that when I do put the bullet in the head of this thing I can transfer my blog, wiki, and a couple of other still active sites to the new OS with minimal tweaking in their new home.

So far the biggest issues I ran into were with my MediaWiki. That entire site is very old, dating from around 2006, two years before I started hosting it for someone; I then inherited it entirely around 2009, so the database is very finicky to upgrade and some of the extensions are no longer maintained. What I ended up doing was setting up a Docker instance at home to test upgrading and work through the kinks, and I have put together a solid step-by-step on how to move/upgrade it to the latest version.

I have also gotten sick of running my own e-mail servers: the spam management, certificates, block lists… it’s annoying. I found out recently that iCloud, which I already have a subscription to, allows up to 5 custom e-mail domains, so I retired my Philtopia e-mail to it early in December and as of today I moved the vo-wiki domain to it as well. Much less hassle for me; I already work enough for work, I don’t need to work at home as well.

The other work continues, site by site but I think I am on track to put an end to this ol server early in the year.

Phil Spencer8bit party

It’s been a few years (four?) since my Commodore 64 collection started, and I’ve now got two working C64s and a functioning C128, along with two disk drives, a tape drive and a collection of add-on hardware and boxed games.

That isn’t all I am collecting, however: I also have my Nintendo Entertainment System and, even more recently, I acquired a Sega Master System. The 8-bit era really seems to catch my eye far more than anything that came after. I suppose it’s because the whole era made it on hacks and luck.

In any case, here are some pictures of my collection. I don’t collect for the sake of collecting; everything I have I use or play, because otherwise why bother having it?

Enjoy

[Photos: my desk, the NES, the Sega Master System, the Commodore 64, and boxed games]

Phil SpencerI think it’s time the blog came back

It’s been a while since I’ve written a blog post, almost 4 years in fact but I think it is time for a comeback.

The reason for this is that social media has become so locked down you can’t actually give a valid opinion about something without someone flagging your comment or it being caught by a robot. Oddly enough, it seems the right-wing folks can say whatever they want against the immigrant villain of the month or LGBTQIA+ issues without being flagged, but if you dare stand up to them or offer an opposing opinion: 30 day ban!

So it is time to dust off the ol’ blog and put my opinions to paper somewhere else, just like the olden days before social media! It isn’t all bad of course: I’ve found Mastodon quite open to opinions, but the fediverse is getting a lot of corporate attention these days and I’m sure it’s only a year or two before even that ends up a complete mess.

Crack open the blogs and let those opinions fly

Paul RaynerPrint (only) my public IP

Every now and then, I need to know my public IP. The easiest way to find it is to visit one of the sites which will display it for you, such as https://whatismyip.com. Whilst useful, all of the ones I know (including that one) are chock full of adverts, and can’t easily be scraped as part of any automated scripting.

This has been a minor irritation for years, so the other night I decided to fix it.

http://ip.pr0.uk is my answer. It’s 50 lines of Rust, and is accessible via TCP on port 11111 and via HTTP on port 8080.

use std::io::Write;

use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr, TcpListener, TcpStream};
use chrono::Utc;
use threadpool::ThreadPool;

fn main() {
    let worker_count = 4;
    let pool = ThreadPool::new(worker_count);
    let tcp_port = 11111;
    let socket_v4_tcp = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), tcp_port);

    let http_port = 8080;
    let socket_v4_http = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), http_port);

    let socket_addrs = vec![socket_v4_tcp, socket_v4_http];
    let listener = TcpListener::bind(&socket_addrs[..]);
    if let Ok(listener) = listener {
        println!("Listening on {}:{}", listener.local_addr().unwrap().ip(), listener.local_addr().unwrap().port());
        for stream in listener.incoming() {
            let stream = stream.unwrap();
            let addr =stream.peer_addr().unwrap().ip().to_string();
            if stream.local_addr().unwrap_or(socket_v4_http).port() == tcp_port {
                pool.execute(move||send_tcp_response(stream, addr));
            } else {
                //http might be proxied via https so let anything which is not the tcp port be http
                pool.execute(move||send_http_response(stream, addr));
            }
        }
    } else {
        println!("Unable to bind to port")
    }
}

fn send_tcp_response(mut stream:TcpStream, addr:String) {
    stream.write_all(addr.as_bytes()).unwrap();
}

fn send_http_response(mut stream:TcpStream, addr:String) {

    let html = format!("<html><head><title>{}</title></head><body><h1>{}</h1></body></html>", addr, addr);
    let length = html.len();
    let response = format!("HTTP/1.1 200 OK\r\nContent-Length: {length}\r\n\r\n{html}" );
    stream.write_all(response.as_bytes()).unwrap();
    println!("{}\tHTTP\t{}",Utc::now().to_rfc2822(),addr)
}

A little explanation is needed on the array of SocketAddr. This came from an initial misreading of the docs, but I liked the result and decided to keep it that way. A call to bind() with a slice of addresses will only bind to one port - the first one in the list which is free. The result is that when you run this program, it listens on port 11111. If you keep it running and start another copy, that one listens on port 8080 (because it can’t bind to port 11111). So to run this on my server, I just have systemd keep 2 copies alive at any time.

The code and binaries for Linux and Windows are available on Github.

Next steps

I might well leave it there. It works for me, so it’s done. Here are some things I could do though:

1) Don’t hard code the ports
2) Proxy https
3) Make a client (see the sketch below)
4) Make it available as a binary for anyone to run on crates.io
5) Optionally print the TTL. This would be mostly useful to people running their own instance.
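
For item 3, a client barely needs any code at all, since the service just writes your address straight back to you. As a rough, untested sketch (in Python rather than Rust, and using the host and port from above - swap in your own if you self-host):

import socket

# Connect to the plain-TCP endpoint and print whatever it sends back,
# which should just be your public IP address.
HOST, PORT = "ip.pr0.uk", 11111

with socket.create_connection((HOST, PORT), timeout=5) as sock:
    data = sock.recv(64)
    print(data.decode().strip())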

Boring Details

Logging

I log the IP, port, and time of each connection. This is just in case it ever gets flooded and I need to block an IP/range. The code you see above is the code I run. No browser detection, user agent or anything like that is read or logged. Any data you send with the connection is discarded. If I proxied https via nginx, that might log a bit of extra data as a side effect.

Systemd setup

There’s not much to this either. I have a template file:

[Unit]
Description=Run the whatip binary. Instance %i
After=network.target

[Service]
ExecStart=/path/to/whatip
Restart=on-failure

StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=whatip%i

[Install]
WantedBy=multi-user.target

stored at /etc/systemd/system/whatip@.service and then set up two instances to run:

systemctl enable whatip@1
systemctl enable whatip@2

Thanks for reading

David Leadbeater"[31m"?! ANSI Terminal security in 2023 and finding 10 CVEs

A paper detailing how unescaped output to a terminal could result in unexpected code execution, on many terminal emulators. This research found around 10 CVEs in terminal emulators across common client platforms.

Alun JonesMessing with web spiders

You've surely heard of ChatGPT and its ilk. These are massive neural networks trained using vast swathes of text. The idea is that if you've trained a network on enough - about 467 words

Alun JonesI wrote a static site generator

Back in 2019, when Google+ was shut down, I decided to start writing this blog. It seemed better to take my ramblings under my own control, rather than posting content - about 716 words

Alun JonesSolar panels

After an 8 month wait, we finally had solar panels installed at the start of July. We originally ordered, from e-on, last November, and were told that there was a - about 542 words

Alex HudsonJobs in the AI Future

Everyone is talking about what AI can do right now, and the impact that it is likely to have on us. This weekend’s Semafor Flagship (which is an excellent newsletter; I recommend subscribing!) asks a great question: “What do we teach the AI generation?”. As someone who grew up with computers, knowing he wanted to write software, and knowing that tech was a growth area, I never had to grapple with this type of worry personally. But I do have kids now. And I do worry. I’m genuinely unsure what I would recommend a teenager do today, right now. But here’s my current thinking.

Paul RudkinYour new post

Your new post

This is a new blog post. You can author it in Markdown, which is awesome.

David LeadbeaterNAT-Again: IRC NAT helper flaws

A Linux kernel bug allows unencrypted NAT'd IRC sessions to be abused to access resources behind NAT, or drop connections. Switch to TLS right now. Or read on.

Paul RaynerPutting dc in (chroot) jail

A little over 4 years ago, I set up a VM and configured it to offer dc over a network connection using xinetd. I set it up at http://dc.pr0.uk and made it available via a socket connection on port 1312.

Yesterday morning I woke to read a nice email from Sylvan Butler pointing out that users could run shell commands from dc…

I had set up the dc command to run as a user “dc”, but still, if someone could run a shell command they could, for example, put a key in the dc user’s .ssh config, run sendmail (if it was set up), try for privilege escalation to get root, etc.

I’m not sure what the 2017 version of me was thinking (or wasn’t), but the 2022 version of me is not happy to leave it like this. So here’s how I put dc in jail.

Firstly, how do you run shell commands from dc? It’s very easy. Just prefix with a bang:

$ dc
!echo "I was here" > /tmp/foo
!cat /tmp/foo
I was here

So, really easy. Even if it was hard, it would still be bad.

This needed to be fixed. Firstly I thought about what else was on the VM - nothing that matters. This is a good thing because the helpful Sylvan might not have been the first person to spot the issue (although network dc is pretty niche). I still don’t want this vulnerability though as someone else getting access to this box could still use it to send spam, host malware or anything else they wanted to do to a cheap tiny vm.

I looked at restricting the dc user further (it had no login shell, and no home directory already), but it felt like I would always be missing something, so I turned to chroot jails.

A chroot jail lets you run a command, specifying a directory which is used as / for that command. The command (in theory) can’t escape that directory, so can’t see or touch anything outside it. Chroot is a kernel feature, and forms a basic security feature of Linux, so should be good enough to protect network dc if set up correctly, even if it’s not perfect.

Firstly, let’s set up the directory for the jail. We need the programs to run inside the jail, and their dependent libraries. The script to run a networked dc instance looks like this:

#!/bin/bash
dc --version
sed -u -e 's/\r/\n/g' | dc

Firstly, I’ve used bash here, but this script is trivial, so it can use sh instead. We also need to keep the sed (I’m sure there are plenty of ways to do the replace without using sed, but it’s working fine as it is). For each of the three programs needed to run the script, I ran ldd to get their dependencies:

$ ldd /usr/bin/dc
	linux-vdso.so.1 =>  (0x00007fffc85d1000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc816f8d000)
	/lib64/ld-linux-x86-64.so.2 (0x0000555cd93c8000)
$ ldd /bin/sh
	linux-vdso.so.1 =>  (0x00007ffdd80e0000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa3c4855000)
	/lib64/ld-linux-x86-64.so.2 (0x0000556443a1e000)
$ ldd /bin/sed
	linux-vdso.so.1 =>  (0x00007ffd7d38e000)
	libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007faf5337f000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faf52fb8000)
	libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007faf52d45000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faf52b41000)
	/lib64/ld-linux-x86-64.so.2 (0x0000562e5eabc000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faf52923000)
$

So we copy those files to the exact directory structure inside the jail directory. Afterwards it looks like this:

$ ls -alR
.:
total 292
drwxr-xr-x 4 root root   4096 Feb  5 10:13 .
drwxr-xr-x 4 root root   4096 Feb  5 09:42 ..
-rwxr-xr-x 1 root root  47200 Feb  5 09:50 dc
-rwxr-xr-x 1 root root     72 Feb  5 10:13 dctelnet
drwxr-xr-x 3 root root   4096 Feb  5 09:49 lib
drwxr-xr-x 2 root root   4096 Feb  5 09:50 lib64
-rwxr-xr-x 1 root root  72504 Feb  5 09:58 sed
-rwxr-xr-x 1 root root 154072 Feb  5 10:06 sh

./lib:
total 12
drwxr-xr-x 3 root root 4096 Feb  5 09:49 .
drwxr-xr-x 4 root root 4096 Feb  5 10:13 ..
drwxr-xr-x 2 root root 4096 Feb  5 10:01 x86_64-linux-gnu

./lib/x86_64-linux-gnu:
total 2584
drwxr-xr-x 2 root root    4096 Feb  5 10:01 .
drwxr-xr-x 3 root root    4096 Feb  5 09:49 ..
-rwxr-xr-x 1 root root 1856752 Feb  5 09:49 libc.so.6
-rw-r--r-- 1 root root   14608 Feb  5 10:00 libdl.so.2
-rw-r--r-- 1 root root  468920 Feb  5 10:00 libpcre.so.3
-rwxr-xr-x 1 root root  142400 Feb  5 10:01 libpthread.so.0
-rw-r--r-- 1 root root  146672 Feb  5 09:59 libselinux.so.1

./lib64:
total 168
drwxr-xr-x 2 root root   4096 Feb  5 09:50 .
drwxr-xr-x 4 root root   4096 Feb  5 10:13 ..
-rwxr-xr-x 1 root root 162608 Feb  5 10:01 ld-linux-x86-64.so.2
$

and here is the modified dctelnet command:

#!/sh
#dc | dos2unix 2>&1
./dc --version
./sed -u -e 's/\r/\n/g' | ./dc

I’ve switched to using sh instead of bash, and all of the commands are now relative paths, as they are just in the root directory.

First attempt

Now I have a directory that I can use for a chrooted network dc. I need to set up the xinetd config to use chroot and the jail I have set up:

service dc
{
	disable		= no
	type		= UNLISTED
	id		= dc-stream
	socket_type	= stream
	protocol	= tcp
	server		= /usr/sbin/chroot
	server_args	= /home/dc/ ./dctelnet
	user		= root
	wait		= no
	port		= 1312
	rlimit_cpu	= 60
	env		= HOME=/ PATH=/
}

I needed to set the HOME and PATH environment variables, otherwise I got a segfault (I’m not sure whether it was sh, sed or dc causing it), and to run chroot you need to be root, so I could no longer run the service as the user dc. This shouldn’t be a problem because the resulting process is constrained.

A bit more security

Chroot jails have a reputation for being easy to get wrong, and they are not something I have done a lot of work with, so I want to take a bit of time to think about whether I’ve left any glaring holes, and also try to improve on the simple option above a bit if I can.

Firstly, can dc still execute commands with the ! operation?

 ~> nc -v dc.pr0.uk 1312
Connection to dc.pr0.uk 1312 port [tcp/*] succeeded!
dc (GNU bc 1.06.95) 1.3.95

Copyright 1994, 1997, 1998, 2000, 2001, 2004, 2005, 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.
!ls
^C⏎

Nope. Ok, that’s good. The chroot jail has sh though, and has it in the PATH, so can it still get a shell and call dc, sh and sed?

 ~> nc -v dc.pr0.uk 1312
Connection to dc.pr0.uk 1312 port [tcp/*] succeeded!
dc (GNU bc 1.06.95) 1.3.95

Copyright 1994, 1997, 1998, 2000, 2001, 2004, 2005, 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.
!pwd
^C⏎

pwd is a builtin, so it looks like the answer is no, but why? Running strings on my version of dc, there is no mention of sh or exec, but there is a mention of system. From the man page of system:

The system() library function uses fork(2) to create a child process that executes the shell  command  specified in command using execl(3) as follows:

           execl("/bin/sh", "sh", "-c", command, (char *) 0);

So dc calls system() when you use !, which makes sense. system() calls /bin/sh, which does not exist in the jail, breaking the ! call.

For a system that I don’t care about, that is of little value to anyone else, and that sees very little traffic, that’s probably good enough. But I want to make it a bit better: if there were a problem with the dc program, or you could get it to pass something to sed and trigger an issue with that, you could mess with the jail file system, overwrite the dc application, and likely break out of jail, as the whole thing is running as root.

So I want to do two things. Firstly, I don’t want dc running as root in the jail. Secondly, I want to throw away the environment after each use, so if you figure out how to mess with it you don’t affect anyone else’s fun.

Here’s a bash script which I think does both of these things:

#!/bin/bash
set -e
DCDIR="$(mktemp -d /tmp/dc_XXXX)"
trap '/bin/rm -rf -- "$DCDIR"' EXIT
cp -R /home/dc/ $DCDIR/
cd $DCDIR/dc
PATH=/
HOME=/
export PATH
export HOME
/usr/sbin/chroot --userspec=1001:1001 . ./dctelnet
  • Line 2 - set -e causes the script to exit on the first error
  • Lines 3 & 4 - make a temporary directory to run in, then set a trap to clean it up when the script exits.
  • I then copy the required files for the jail to the new temp directory, set $HOME and $PATH, and run the jail as an unprivileged user (uid 1001).

Now to make some changes to the xinetd file:

service dc
{
        disable         = no
        type            = UNLISTED
        id              = dc-stream
        socket_type     = stream
        protocol        = tcp
        server          = /usr/local/bin/dcinjail
        user            = root
        wait            = no
        port            = 1312
        rlimit_cpu      = 60
        log_type        = FILE /var/log/dctelnet.log
        log_on_success  = HOST PID DURATION
        log_on_failure  = HOST
}

The new version just runs the script from above. It still needs to run as root to be able to chroot.

I’ve also added some logging as this has piqued my interest and I want to see how many people (other than me) ever connect, and for how long.

As always, I’m interested in feedback or questions. I’m no expert in this setup so may not be able to answer questions, but if you see something that looks wrong (or that you know is wrong), please let me know. I’m also interested to hear about other ways of doing process isolation - I know I could have used containers, and think I could have used systemd or SELinux features (or both) to further lock down the dc user and achieve a similar result.

Thanks for reading.

Christopher RobertsFixing SVG Files in DokuWiki

Having upgraded a DokuWiki server from 16.04 to 18.04, I found that SVG images were no longer displaying in the browser. As I was unable to find any applicable answers on-line, I thought I should break my radio silence by detailing my solution.

Inspecting the file using the browser’s developer tools (Network tab) and refreshing the page showed that the file was being downloaded as octet-stream. Sure enough, using curl showed the same.

curl -Ik https://example.com/file.svg

All the advice on-line is to ensure that /etc/nginx/mime-types includes the line:

image/svg+xml   svg svgz;

But that was already in place.

I decided to try uploading the SVG file again, in case the Inkscape format was causing breakage. Yes, a long-shot indeed.

The upload was rejected by DokuWiki, as SVG was not in the list of allowed file extensions; so I added the following line to /var/www/dokuwiki/conf/mime.local.conf:

svg   image/svg_xml

Whereupon the images started working again. Presumably DokuWiki was seeing the mime-type as image/svg instead of image/svg+xml, and this mismatch was preventing nginx from serving up the correct content-type.

Hopefully this will help others, do let me know if it has helped you.

Paul RaynerSnakes and Ladders, Estimation and Stats (or 'Sometimes It Takes Ages')

Snakes And Ladders

Simple kids game, roll a dice and move along a board. Up ladders, down snakes. Not much to it?

We’ve been playing snakes and ladders a bit (lot) as a family because my 5 year old loves it. Our board looks like this:

Some games on this board take a really long time. My son likes to play games till the end, so until all players have finished. It’s apparently really funny when everyone else has finished and I keep finding the snakes over and over. Sometimes one player finishes really quickly - they hit some good ladders, few or no snakes and they are done in no time.

This got me thinking. What’s the distribution of game lengths for snakes and ladders? How long should we expect a game to take? How long before we typically have a winner?

Fortunately for me, snakes and ladders is a very simple game to model with a bit of python code.

Firstly, here are the rules we play:

1) Each player rolls a normal 6 sided dice and moves their token that number of squares forward.
2) If a player lands on the head of a snake, they go down the snake.
3) If a player lands on the bottom of a ladder, they go up to the top of the ladder.
4) If a player rolls a 6, they get another roll.
5) On this board, some ladders and snakes interconnect - the bottom of a snake is the head of another, or the top of a ladder is also the head of a snake. When this happens, you do all of the actions in turn, so down both snakes or up the ladder, down the snake.
6) You don’t need an exact roll to finish; once you get to 100 or more, you are done.

To model the board in python, all we really need are the coordinates of the snakes and the ladders - their starting and ending squares.

def get_snakes_and_ladders():

    snakes = [
        (96,27),
        (88,66),
        (89,46),
        (79,44),
        (76,19),
        (74,52),
        (57,3),
        (60,39),
        (52,17),
        (50,7),
        (32,15),
        (30,9)
    ]
    ladders = [
        (6,28),
        (10,12),
        (18,37),
        (40,42),
        (49,67),
        (55,92),
        (63,76),
        (61,81),
        (86,94)
    ]
    return snakes + ladders

Since snakes and ladders are both mappings from one point to another, we can combine them in one array as above.

The game is modelled with a few lines of Python:

from random import randint  # used for the dice roll in Game.roll()

class Game:

    def __init__(self) -> None:
        self.token = 1
        snakes_and_ladders_list = get_snakes_and_ladders()
        self.sl = {}
        for entry in snakes_and_ladders_list:
            self.sl[entry[0]] = entry[1]

    def move(self, howmany):
        self.token += howmany
        while (self.token in self.sl):
            self.token = self.sl[self.token]
        return self.token

    def turn(self):
        num = self.roll()
        self.move(num)
        if num == 6:
            self.turn()
        if self.token>=100:
            return True
        return False

    def roll(self):
        return randint(1,6)

A turn consists of all the actions taken by a player before the next player gets their turn. This can consist of multiple moves if the player rolls one or more sixes, as rolling a six gives you another move.

With this, we can run some games and plot them. Here’s what a sample looks like.

The Y axis is the position on the board, and the X axis is the number of turns. This small graphical representation of the game shows how variable it can be. The red player finishes in under 20 moves, whereas the orange player takes over 80.

To see how variable it is, we can run the simulation a large number of times and look at the results. Running for 10,000 games we get the following:

statistic   result
min         5
max         918
mean        90.32
median      65

So the fastest finish in 10,000 games was just 5 turns, and the slowest was an awful (if you were rolling the dice) 918 turns.
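
Gathering those numbers only needs a small driver loop around the Game class above. Something along these lines would do it (a rough sketch, not necessarily identical to the code linked below):

from statistics import mean, median

def game_length():
    # Play one single-player game and return how many turns it took.
    game = Game()
    turns = 1
    while not game.turn():
        turns += 1
    return turns

lengths = [game_length() for _ in range(10_000)]
print("min", min(lengths), "max", max(lengths))
print("mean", mean(lengths), "median", median(lengths))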

Here are some histograms for the distribution of game lengths, the distribution of number of turns for a player to win in a 3 person game, and the number of turns for all players to finish in a 3 person game.

The python code for this post is at snakes.py

Alex HudsonIntroduction to the Metaverse

You’ve likely heard the term “metaverse” many times over the past few years, and outside the realm of science fiction novels, it has tended to refer to some kind of computer-generated world. There’s often little distinction between a “metaverse” and a relatively interactive virtual reality world.

There are a huge number of people who think this is simply a marketing term, and Facebook’s recent rebranding of its holding company to “Meta” has only reinforced this view. However, I think this view is wrong, and I hope to explain why.

Alex HudsonIt's tough being an Azure fan

Azure has never been the #1 cloud provider - that spot continues to belong to AWS, which is the category leader. However, in most people’s minds, it has been a pretty reasonable #2, and while not necessarily vastly differentiated from AWS there are enough things to write home about.

However, even as a user and somewhat of a fan of the Azure technology, it is proving increasingly difficult to recommend.

Josh HollandMore on git scratch branches: using stgit

More on git scratch branches: using stgit

I wrote a short post last year about a useful workflow for preserving temporary changes in git by using a scratch branch. Since then, I’ve come across stgit, which can be used in much the same way, but with a few little bells and whistles on top.

Let’s run through a quick example to show how it works. Let’s say I want to play around with the cool new programming language Zig and I want to build the compiler myself. The first step is to grab a source code checkout:

$ git clone https://github.com/ziglang/zig
        Cloning into 'zig'...
        remote: Enumerating objects: 123298, done.
        remote: Counting objects: 100% (938/938), done.
        remote: Compressing objects: 100% (445/445), done.
        remote: Total 123298 (delta 594), reused 768 (delta 492), pack-reused 122360
        Receiving objects: 100% (123298/123298), 111.79 MiB | 6.10 MiB/s, done.
        Resolving deltas: 100% (91169/91169), done.
        $ cd zig
        

Now, according to the instructions we’ll need to have CMake, GCC or clang and the LLVM development libraries to build the Zig compiler. On NixOS it’s usual to avoid installing things like this system-wide but instead use a file called shell.nix to specify your project-specific dependencies. So here’s the one ready for Zig (don’t worry if you don’t understand the Nix code, it’s the stgit workflow I really want to show off):

$ cat > shell.nix << EOF
        { pkgs ? import <nixpkgs> {} }:
        pkgs.mkShell {
          buildInputs = [ pkgs.cmake ] ++ (with pkgs.llvmPackages_12; [ clang-unwrapped llvm lld ]);
        }
        EOF
        $ nix-shell
        

Now we’re in a shell with all the build dependencies, and we can go ahead with the mkdir build && cd build && cmake .. && make install steps from the Zig build instructions1.

But now what do we do with that shell.nix file?

$ git status
        On branch master
        Your branch is up to date with 'origin/master'.
        
        Untracked files:
          (use "git add <file>..." to include in what will be committed)
                shell.nix
        
        nothing added to commit but untracked files present (use "git add" to track)
        

We don’t really want to add it to the permanent git history, since it’s just a temporary file that is only useful to us. But the other options of just leaving it there untracked or adding it to .git/info/exclude are unsatisfactory as well: before I started using scratch branches and stgit, I often accidentally deleted my shell.nix files which were sometimes quite annoying to have to recreate when I needed to pin specific dependency versions and so on.

But now we can use stgit to take care of it!

$ stg init # stgit needs to store some metadata about the branch
        $ stg new -m 'add nix config'
        Now at patch "add-nix-config"
        $ stg add shell.nix
        $ stg refresh
        Now at patch "add-nix-config"
        

This little dance creates a new commit adding our shell.nix managed by stgit. You can stg pop it to unapply, stg push2 to reapply, and stg pull to do a git pull and reapply the patch back on top. The main stgit documentation is helpful to explain all the possible operations.

This solves all our problems! We have basically recreated the scratch branch from before, but now we have pre-made tools to apply, un-apply and generally play around with it. The only problem is that it’s really easy to accidentally push your changes back to the upstream branch.

Let’s have another example. Say I’m sold on the stgit workflow, I have a patch at the bottom of my stack adding some local build tweaks and, on top of that, a patch that I’ve just finished working on that I want to push upstream.

$ cd /some/other/project
        $ stg series # show all my patches
        + add-nix-config
        > fix-that-bug
        

Now I can use stg commit to turn my stgit patch into a real immutable git commit that stgit isn’t going to mess around with any more:

$ stg commit fix-that-bug
        Popped fix-that-bug -- add-nix-config
        Pushing patch "fix-that-bug" ... done
        Committed 1 patch
        Pushing patch "add-nix-config ... done
        Now at patch "add-nix-config"
        

And now what we should do before git pushing is stg pop -a to make sure that we don’t push add-nix-config or any other local stgit patches upstream. Sadly it’s all too easy to forget that, and since stgit updates the current branch to point at the current patch, just doing git push here will include the commit representing the add-nix-config patch.

The way to prevent this is through git’s hook system. Save this as pre-push3 (make sure it’s executable):

#!/bin/bash
        # An example hook script to verify what is about to be pushed.  Called by "git
        # push" after it has checked the remote status, but before anything has been
        # pushed.  If this script exits with a non-zero status nothing will be pushed.
        #
        # This hook is called with the following parameters:
        #
        # $1 -- Name of the remote to which the push is being done
        # $2 -- URL to which the push is being done
        #
        # If pushing without using a named remote those arguments will be equal.
        #
        # Information about the commits which are being pushed is supplied as lines to
        # the standard input in the form:
        #
        #   <local ref> <local sha1> <remote ref> <remote sha1>
        
        remote="$1"
        url="$2"
        
        z40=0000000000000000000000000000000000000000
        
        while read local_ref local_sha remote_ref remote_sha
        do
            if [ "$local_sha" = $z40 ]
            then
                # Handle delete
                :
            else
                # verify we are on a stgit-controlled branch
                git show-ref --verify --quiet "${local_ref}.stgit" || continue
                if [ $(stg series --count --applied) -gt 0 ]
                then
                    echo >&2 "Unapplied stgit patch found, not pushing"
                    exit 1
                fi
            fi
        done
        
        exit 0
        

Now we can’t accidentally4 shoot ourselves in the foot:

$ git push
        Unapplied stgit patch found, not pushing
        error: failed to push some refs to <remote>
        

Happy stacking!


  1. At the time of writing, Zig depends on the newly-released LLVM 12 toolchain, but this hasn’t made it into the nixos-unstable channel yet, so this probably won’t work on your actual NixOS machine.↩︎

  2. an unfortunate naming overlap between pushing onto a stack and pushing a git repo↩︎

  3. A somewhat orthogonal but also useful tip, so that you don’t have to manually add this to every repository, is to configure git’s core.hooksPath to something like ~/.githooks and put it there.↩︎

  4. You can always pass --no-verify if you want to bypass the hook.↩︎

Jon FautleyUsing the Grafana Cloud Agent with Amazon Managed Prometheus, across multiple AWS accounts

Observability is all the rage these days, and the process of collecting metrics is getting easier. Now, the big(ger) players are getting in on the action, with Amazon releasing a Managed Prometheus offering and Grafana now providing a simplified “all-in-one” monitoring agent. This is a quick guide to show how you can couple these two together on individual hosts, incorporating cross-account access control.

The Grafana Cloud Agent

Grafana Labs have taken (some of) the best bits of the Prometheus monitoring stack and created a unified deployment that wraps the individual moving parts up into a single binary.
