Planet BitFolk

Jon Spriggs: Talk Summary – A Eulogy for Auntie Pat

Format: Theatre Style room. ~30 attendees.

Slides: No slides provided (nothing to present on!), but the script is here

Video: Not recorded.

Slot: 11 AM, 10th February 2025, 10 minutes

Notes: This is a little unusual, both because I’m posting it as a “Talk Summary” and because it was a eulogy. Auntie Pat died in December. The talk I delivered was my memories of her, augmented by a few comments from her next nearest relative, the daughter of her cousin. The room was mostly filled with people I didn’t know, except for one row with my brother and his family. Following the funeral, several people suggested I’d done very well. One person remarked they hadn’t heard the talk because they forgot to wear their hearing aid; I guess when someone passes away in their 80s, most of their friends will be of a similar age. Several people expressed sadness that they hadn’t known all the things I shared about her. We all enjoyed the memories of her.

Jon Spriggs: Building a Linux Firewall with AlmaLinux 9, NetworkManager, BGP, DHCP and NFTables with Puppet

I’m in the process of building a Network Firewall for a work environment. This blog post is based on that work, but with all the identifying marks stripped off.

For this particular project, we standardised on AlmaLinux 9 as the OS base. We did some testing and proved that the Red Hat default firewalling product, firewalld, is not appropriate for this platform, but determined that NFTables, or NetFilter Tables (the successor to IPTables), is.

I’ll warn you, I’m pretty prone to long and waffling posts, but there’s a LOT of technical content in this one. There is also a Git repository with the final code. I hope that you find something of use in here.

This document explains how I use Vagrant with Virtualbox to build a test environment, how I install a Puppet Server, and how that server works out what settings to push to its clients. With that puppet server, I show how to build and configure a firewall using Linux tools and services, setting up an NFTables policy and routing between firewalls using FRR to provide BGP, and then I show how to deploy a DHCP server.

Let’s go!

The scenario

A network diagram, showing a WAN network attached to the top of firewall devices and out via the Host machine, a transit network linking the bottom of the firewall devices, and attached to the side, networks identified as "Prod", "Dev" and "DHCP" each with IP allocations indicated.

To prove the concept, I have built two Firewall machines (A and B), plus six hosts, one attached to each of the A and B side subnets called “Prod”, “Dev” and “Shared”.

Any host on any of the “Prod” networks should be able to speak to any host on any of the other “Prod” networks, or back to the “Shared” networks. Any host on any of the “Dev” networks should be able to speak to any host on the other “Dev” networks, or back to the “Shared” networks.

Any host in Prod, Dev or Shared should be able to reach the internet, and shared can reach any of the other networks.

To ensure I can guarantee the MAC addresses I will be using, I am using a standard Virtual Machine prefix: 16:0D:EC:AF: followed by an octet to identify the firewall ID, fwA is 11 and fwB is 12, and then the interface ID as the last octet. The WAN interface gets 01, prod gets 02, dev gets 03, shared 04 and transit 05. This also means that when I move from deploying this on my laptop with Vagrant, to deploying it on my actual lab environment, I can apply the same MAC addressing scheme, and guarantee that I’ll know which interface is which, no matter what order they’re detected by the guest VM.
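As a rough illustration of how that scheme decodes, here is a throwaway bash sketch (not part of the repository) that picks the firewall ID and interface role back out of an address:

mac="16:0D:EC:AF:11:02"
fw_id="${mac:12:2}"   # 5th octet: 11 = fwA, 12 = fwB
if_id="${mac:15:2}"   # 6th octet: 01 wan, 02 prod, 03 dev, 04 shared, 05 transit
case "$if_id" in
  01) role="wan" ;;
  02) role="prod" ;;
  03) role="dev" ;;
  04) role="shared" ;;
  05) role="transit" ;;
esac
echo "Firewall ${fw_id}, interface ${role}"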

A note on IP addresses and DNS names used in this document

In this blog post, “Private” IP addresses use the “Inter-networking” network assignment from IANA (198.18.0.0/15), as documented in RFC2544, while the “Public” IP addresses use the default Vagrant “Host” network of 10.0.2.0/24, with the host assigned 10.0.2.2 and providing the default gateway to the guests.

In the actual lab environment, these addresses would be replaced by assigned network segments in RFC1918 “Private” address spaces, or by ranges allocated by the upstream ISP. Please *DO NOT* build your network assuming these addresses are appropriate for your use! In addition, DNS names will use example.org following the advice of RFC2606.

Building the Proof of Concept

I’m using Vagrant with Virtualbox to build my firewall and some test boxes. The “WAN” interface will be simulated by the NAT interface provided by Vagrant’s first interface (which is required for provisioning anyway), and will receive a DHCP address. All other interfaces will be private, host-only networks using the Virtualbox network manager. Once the firewall is built and running, it will serve DHCP to all downstream clients.

All of the following code can be found on my Github repository: JonTheNiceGuy/vagrant-puppet-firewall

Working from a common base

To start with, I build out my Vagrantfile (link to the code). A Vagrantfile is used to define how Vagrant will build one or more virtual machines, similar to how you might use a Terraform HCL file to deploy some cloud assets. I’ll show several sections from this file as we go along, but here’s the start of it. This part won’t be used to provision any virtual machines, and is instead just Boilerplate for the hosts which follow.

############################################################
############################## Define variables up-front
############################################################
vms_A_number    = 11
vms_B_number    = 12
global_mac_base = "160DECAF"
vms_A_mac_base  = "#{global_mac_base}#{vms_A_number < 10 ? '0' : ''}#{vms_A_number}"
vms_B_mac_base  = "#{global_mac_base}#{vms_B_number < 10 ? '0' : ''}#{vms_B_number}"
############################################################
############################## Standard VM Settings
############################################################
Vagrant.configure("2") do |config|
  ############################ Default options for all hosts
  config.vm.box = "almalinux/9"
  config.vm.synced_folder ".", "/vagrant", type: :nfs, mount_options: ['rw', 'tcp', 'nolock']
  config.vm.synced_folder "../..", "/etc/puppetlabs/code/environments/production/src_modules/", type: :nfs, mount_options: ['rw', 'tcp', 'nolock']
  config.vm.provision "shell", path: 'client/make_mount.py'
  config.vm.provider :virtualbox do |vb|
    vb.memory = 2048
    vb.cpus = 2
    vb.linked_clone = true
  end
  ############################ Install nginx to host a simple webserver
  config.vm.provision "shell", inline: <<-SCRIPT
    # Setup useful tools
    if ! command -v fping >/dev/null
    then
      dnf install -y epel-release && dnf install -y fping mtr nano nginx && systemctl enable --now nginx
      # Configure web server to reply with servername
      printf '<!DOCTYPE html><head><title>%s</title></head><body><h1>\n%s\n</h1></body></html>' "$(hostname -f)" > /usr/share/nginx/html/index.html
    fi
SCRIPT
  ############################ Vagrant Cachier Setup
  if Vagrant.has_plugin?("vagrant-cachier")
    config.cache.scope = :box
    # Note that the DNF plugin was only finalised after the last
    # release of vagrant-cachier before it was discontinued. As such
    # you must do `vagrant plugin install vagrant-cachier` and then
    # find where it has been installed (usually
    # ~/.vagrant/gems/*/gems/vagrant-cachier-*) and replace it with
    # the latest commit from the upstream git project. Or uninstall
    # vagrant-cachier :)
    config.cache.enable :dnf
    config.cache.synced_folder_opts = {
      type: :nfs,
      mount_options: ['rw', 'tcp', 'nolock']
    }
  end
end

This does several key things. Firstly, it defines the size of the virtual machines which will be deployed and installs some common testing tools. It also sets up some variables for use later in the script (around MAC addresses and IP offsets), and it makes sure that mounted directories are always remounted (because Vagrant isn’t very good at doing that following a reboot).

There’s one script in here called make_mount.py, which I won’t go into in detail, but essentially it just re-creates, on subsequent reboots, all the NFS mounts that Vagrant set up. Unfortunately, I couldn’t do something similar for the Virtualbox Shared Folders. Feel free to bring this up in the comments if you want to know more.
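As a rough idea of the approach, this is a hypothetical sketch only (the real logic lives in client/make_mount.py in the repository): copy any NFS mounts Vagrant has made into /etc/fstab so they come back after a reboot.

awk '$3 ~ /^nfs/' /proc/mounts | while read -r src dst fstype opts _; do
  # Only add an fstab entry if one for this mount point doesn't already exist
  grep -qs "[[:space:]]${dst}[[:space:]]" /etc/fstab || \
    echo "${src} ${dst} ${fstype} ${opts} 0 0" >> /etc/fstab
done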

Building a Puppetserver for testing your module

As I do more with Puppet, I’ve realised that being able to test a manual deployment of a set of modules with puppet apply /path/to/manifest.pp doesn’t actually test how your manifests will work in a real environment. To solve this, each of the test environments I build deploys a puppet server as well as the test machine or machines, and then I join the devices to that puppet server and let them deploy from it.

Let’s set up that puppet server, starting with the Vagrantfile definition. This snippet goes inside the Vagrant.configure("2") do |config| block, at the end of the code snippet I pasted before.

  config.vm.define "puppet" do |config|
    config.vm.hostname = "puppet"
    # \/ The puppetserver needs more memory
    config.vm.provider "virtualbox" do |vb|
        vb.memory = 4096
    end
    # \/ Fixed IP address needed for Vagrant-Cachier-NFS
    config.vm.network "private_network", ip: "192.168.56.254", name: "vboxnet0"
    # \/ Install and configure the Puppet server, plus the ENC.
    config.vm.provision "shell", path: "puppetserver/setup.sh"
  end

This showcases some really useful parts of Vagrant. Firstly, you can override the memory allocation, going from the 2048 MB we had set as a default up to 4096 MB, and you can also define new networks to attach the VMs to. In this case, we have a “private_network”, configured as a “host only network” in Virtualbox lingo, which means it’s attached not only to the virtual machine, but also to the host machine.

When we run vagrant up with just this machine defined (see the command below), it will run the scripts defined before, and then start this setup script. Let’s dig into that for a second.
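For example, to bring up only the puppet server rather than the whole lab:

vagrant up puppet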

Setting up a Puppet Server

A puppet server is basically just a Certificate Authority, plus a web server to return the contents of your manifests to your client, plus some settings to use with that. Here’s a simple setup for that.

#!/bin/bash
START_PUPPET=0
################################################################
######### Install the Puppet binary and configure it as a server
################################################################
if ! command -v puppetserver >/dev/null
then
    rpm -Uvh https://yum.puppet.com/puppet8-release-el-9.noarch.rpm
    dnf install -y puppetserver puppet-agent
    alternatives --set java "$(alternatives --list | grep -E 'jre_17.*java-17' | awk '{print $3}')/bin/java"
    /opt/puppetlabs/bin/puppet config set server puppet --section main
    /opt/puppetlabs/bin/puppet config set runinterval 60 --section main
    /opt/puppetlabs/bin/puppet config set autosign true --section server
    START_PUPPET=1
fi

Set up like this, the puppet server will automatically accept any connecting client. There are security implications here!
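If that blanket autosigning worries you outside of a throwaway lab, the alternative is to leave autosign off and sign agents by hand. A minimal sketch (the certname below is just an example):

# Disable blanket autosigning, then approve each agent after checking its certname
/opt/puppetlabs/bin/puppet config set autosign false --section server
puppetserver ca list
puppetserver ca sign --certname vms11fw11.example.org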

Getting modules into the server

In a real-world deployment, you’ll have your Puppet Server with modules full of manifests attached to it. You may use some sort of automation to install or refresh those manifests; for example, in our lab, we use a tool called r10k to update the puppet modules on the host.

Instead of doing that for this test, in the Vagrantfile I mounted my “puppet modules” directory from the host machine into the puppet server, and then I link each directory from the mounted path into where the puppet modules reside. This means we can also install publicly released modules, like puppetlabs-stdlib (which provides a series of standard resources), into the puppet server without impacting my puppet modules directory. Here’s that code:

cd /etc/puppetlabs/code/environments/production/src_modules || exit 1
for dirname in puppet-module*
do
    TARGET="/etc/puppetlabs/code/environments/production/modules/$(echo "$dirname" | sed -E -e 's/.*puppet-module-//')"
    if [ ! -e "$TARGET" ]
    then
        ln -s "/etc/puppetlabs/code/environments/production/src_modules/${dirname}" "$TARGET"
    fi
done

################################################################
######### Install common modules
################################################################
/opt/puppetlabs/bin/puppet module install puppetlabs-stdlib

Defining what the clients will get

The puppet server then needs to know which manifests and settings to deploy to any node which connects to it. This is called an “External Node Classifier” or ENC.

The ENC receives the certificate name of the connecting host, and matches that against some internal logic to work out what manifests, in what environment they are coming from, and what settings to ship to the node. It then returns this as a JSON string for the Puppet Server to compile and send to the client.

The ENC defined in this dummy puppet server is extremely naive, and basically just reads a JSON file from disk. Here’s how it’s installed from the setup script:

if ! [ -e /opt/puppetlabs/enc.sh ]
then
    cp /vagrant/puppetserver/enc.sh /opt/puppetlabs/enc.sh && chmod +x /opt/puppetlabs/enc.sh
    /opt/puppetlabs/bin/puppet config set node_terminus exec --section master
    /opt/puppetlabs/bin/puppet config set external_nodes /opt/puppetlabs/enc.sh --section master
    START_PUPPET=1
fi

Then here is the enc.sh script

#!/bin/bash

if [ -e "/vagrant/enc.${1}.json" ]
then
    cat "/vagrant/enc.${1}.json"
    exit 0
fi
if [ -e "/vagrant/enc.json" ]
then
    cat "/vagrant/enc.json"
    exit 0
fi
printf '{"classes": {}, "environment": "production", "parameters": {}}'

And finally, here’s the enc.json for this test environment:

{
    "classes": {
        "nftablesfirewall": {},
        "basevm": {},
        "hardening": {}
    },
    "environment": "production",
    "parameters": {}
}
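You can exercise the ENC by hand on the puppet server to see exactly what a given certname would be handed back (the certname below is just an example; with no matching per-host file, the script falls through to the enc.json above):

/opt/puppetlabs/enc.sh vms11fw11.example.org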

So, we now have enough to provision a device connecting into the puppet server. Now we need to build our first Firewall.

Building a firewall

First we need the Virtual Machine to build. Again, we’re using the Vagrantfile to define this.

  config.vm.define :fwA do |config|
    # eth0 mgmt via vagrant ssh, simulating "WAN", DHCP to 10.0.2.x                        # eth0 wan
    config.vm.network "private_network", auto_config: false, virtualbox__intnet: "prodA"   # eth1 prod
    config.vm.network "private_network", auto_config: false, virtualbox__intnet: "devA"    # eth2 dev
    config.vm.network "private_network", auto_config: false, virtualbox__intnet: "sharedA" # eth3 prod
    config.vm.network "private_network", auto_config: false, virtualbox__intnet: "transit" # eth4 transit
    config.vm.provider "virtualbox" do |vb|
      vb.customize ["modifyvm", :id, "--macaddress1", "#{vms_A_mac_base}01"] # wan
      vb.customize ["modifyvm", :id, "--macaddress2", "#{vms_A_mac_base}02"] # prod
      vb.customize ["modifyvm", :id, "--macaddress3", "#{vms_A_mac_base}03"] # dev
      vb.customize ["modifyvm", :id, "--macaddress4", "#{vms_A_mac_base}04"] # shared
      vb.customize ["modifyvm", :id, "--macaddress5", "#{vms_A_mac_base}05"] # transit
    end
    config.vm.network "private_network", ip: "192.168.56.#{vms_A_number}", name: "vboxnet0" # Only used in this Vagrant environment for Puppet
    config.vm.hostname = "vms#{vms_A_number}fw#{vms_A_number}"
    config.vm.provision "shell", path: "puppetagent/setup-and-apply.sh"
  end

This gives us enough to build Firewall A. To build Firewall B, replace any “A” string (like “sharedA”, “vms_A_mac_base” or “vms_A_number”) with “B” (so “sharedB” and so on). The firewall has 5 interfaces, which are:

  • wan; technically a NAT interface in Vagrant, but in our lab would be completely exposed to the internet for ingress and egress traffic.
  • transit; used to pass traffic between VLANs (shared, prod and dev)
  • shared, prod and dev which carry the traffic for the machines classified as “production” or “development”, or for the shared management and access to them.

The puppet manifests we’ll see in a minute rely on those interfaces having the last 4 hexadecimal digits of the MAC address defined with specific values in order to identify the machine ID and the interface association. Fortunately, Virtualbox can assign these interfaces specific MAC addresses! Another win for Vagrant+Virtualbox. As before, we also add the private network which gives access to Puppet, which would normally be accessed over the WAN interface.

In here we have another shell script, this time puppetagent/setup-and-apply.sh. This one joins the puppet agent to the server, links the build modules (like we did with the puppet server) to replicate the build process with Packer, and then applies “standard” configuration from the local machine. Finally, it asks the server to apply the server configuration (using the ENC script we set up before). I won’t go into the local build modules (called “basevm” and “hardening”) here, because in this context they basically just say “I ran” and then end. But let’s take a look at the puppet module itself.

Initialising the Puppet Module

There are six files in the puppet manifests, starting with init.pp. If you’ve not written any Puppet before, a module is defined as a manifest class with some optional parameters passed to it. You can also define default values using hiera to retrieve values from the data directory. The manifest can call out to subclasses, and can also transfer files and build templates. Let’s take a look at that init.pp file.

# @summary Load various sub-manifests
class nftablesfirewall {
  # Setup interfaces
  class { 'nftablesfirewall::interfaces': }

  # make this server route traffic
  class { 'nftablesfirewall::routing':
    require => Class['nftablesfirewall::interfaces'],
  }
  class { 'nftablesfirewall::bgp':
    require => Class['nftablesfirewall::interfaces'],
  }

  # Allow traffic flows across the firewall
  class { 'nftablesfirewall::policy':
    require => Class['nftablesfirewall::interfaces'],
  }

  # make this server assign IP addresses
  class { 'nftablesfirewall::dhcpd':
    require => Class['nftablesfirewall::interfaces'],
  }
}

The class calls subclasses by using the construct class { 'class::subclass': } and, in some cases, uses the “metaparameters” require, before or notify to establish the order they run in. The subclass manifests live in files named after each subclass, so let’s take a look at these.

Defining the interfaces

The later subclasses need the interfaces to be defined properly first, so when we take a look at interfaces.pp, it does one of three things. Let’s pull these apart one at a time.

If there is an interface called eth0, then we’ve not renamed these interfaces, so we need to do that first of all. Let’s take a look at that:

  if ($facts['networking']['interfaces']['eth0']) {
    #################################################################
    ######## Using the MAC address we've configured, define each
    ######## network interface. On cloud platforms, we'd need to
    ######## figure out a better way of doing this!
    #################################################################
    # This relies HEAVILY on the mac address for the device on eth0 
    # following this format:       16:0D:EC:AF:xx:01
    # The first 8 hex digits (160DECAF) don't really matter, but the
    # 9th and 10th are the VM number and the 11th and 12 are the 
    # interface ID. This MAC prefix I found is a purposefully 
    # unallocated prefix for virtual machines.
    #
    # Puppet magic to turn desired interface names etc into MAC
    # addresses, thanks to ChatGPT.
    #
    # https://chatgpt.com/share/67ae1617-a398-8002-807b-4bc4298b40bb
    $interface_map = {
      'wan'     => '01',
      'prod'    => '02',
      'dev'     => '03',
      'shared'  => '04',
      'transit' => '05',
    }
    $interfaces = $interface_map.map |$role, $suffix| {
      $match = $facts['networking']['interfaces'].filter |$iface, $details| {
        $details['mac'] and $details['mac'] =~ "${suffix}$"
      }

      if !empty($match) {
        { $role => $match.values()[0]['mac'] }  # Store the MAC address
      } else {
        {}
      }
    }.reduce |$acc, $entry| {
      $acc + $entry  # Merge all key-value pairs into a final hash
    }

    file { '/etc/udev/rules.d/70-persistent-net.rules':
      ensure  => present,
      owner   => root,
      group   => root,
      mode    => '0644',
      content => template('nftablesfirewall/etc/udev/rules.d/70-persistent-net.rules.erb'),
      notify  => Exec['Reboot'],
    } -> exec { 'Reboot':
      command     => '/bin/bash -c "(sleep 30 && reboot) &"',
      # We delay 30 seconds so the reboot doesn't kill puppet and report an error.
      refreshonly => true
    }
  }

I’m unashamed to say that I asked ChatGPT for some help here! I wanted to figure out how to name the interfaces without knowing the exact MAC address. Fortunately, Puppet identifies lots of details about the system, referred to as facts (you can read all of the facts your Puppet system knows about a node by running facter -p on a system with Puppet installed). In this case, we’re asking Puppet to parse all of the interfaces and check the MAC address of each to figure out which one is which. Once it knows that, it creates a file for udev, a system which identifies how to initialise components and, in some cases, renames how they are seen by the system. The system won’t recognise the changes until it’s been rebooted, so if we create or modify that file, we sleep for 30 seconds (to let the puppet run finish) and then reboot.
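If you want to poke at those facts yourself on one of the firewalls, something like this works (facter ships with the puppet-agent package, so it lives under /opt/puppetlabs; the eth1 fact path is just an example):

# Dump the whole networking.interfaces structured fact, or ask for a single leaf
/opt/puppetlabs/bin/facter -p networking.interfaces
/opt/puppetlabs/bin/facter -p networking.interfaces.eth1.mac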

What does the template for that udev file look like? Pretty simple actually.

<% @interfaces.each do |interface,mac| -%>
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="<%= mac %>", NAME="<%= interface %>"
<% end %>

Once that’s run, it looks like this:

SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="16:0D:EC:AF:11:01", NAME="wan"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="16:0D:EC:AF:11:02", NAME="prod"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="16:0D:EC:AF:11:03", NAME="dev"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="16:0D:EC:AF:11:04", NAME="shared"
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="16:0D:EC:AF:11:05", NAME="transit"

Once the system comes back up, Puppet will run immediately, and we take advantage of this! If the interface is no longer called eth0 (the other branch of that check), we can set some IP addresses. To do that, we use the MAC address allocation again! This time we’re using the second-from-last pair of hex digits to work out the firewall ID, and we use that firewall ID to identify the subnets to use, by adding it to the base value for the third IP octet in the local subnets (shared, prod, dev) and to the last IP octet in the connecting subnets (wan and transit). Let’s take a look at just this bit. It starts at the top of the file, where we pass some parameters into the class:

class nftablesfirewall::interfaces (
  String  $network_base   = '198.18',
  Integer $prod_base      = 32, # Start of Supernet
  Integer $prod_mask      = 24,
  Integer $dev_base       = 64, # Start of Supernet
  Integer $dev_mask       = 24,
  Integer $shared_base    = 96, # Start of Supernet
  Integer $shared_mask    = 24,
  Integer $transit_actual = 255,
  Integer $transit_mask   = 24,
) {

And then later, we have this:

    #################################################################
    # This block here works out which host we are, based on the 5th
    # octet of the MAC address
    #################################################################
    $vm_offset = Integer(
      regsubst(
        $facts['networking']['interfaces']['wan']['mac'],
        '.*:([0-9A-Fa-f]{2}):[0-9A-Fa-f]{2}$',
        '\1'
      )
    )

    #################################################################
    # Next calculate the IP addresses to assign to each NIC
    #################################################################
    $transit_ip    = "${network_base}.${transit_actual}.${vm_offset}/${transit_mask}"
    $dev_actual    = $dev_base + $vm_offset
    $dev_ip        = "${network_base}.${dev_actual}.1/${dev_mask}"
    $prod_actual   = $prod_base + $vm_offset
    $prod_ip       = "${network_base}.${prod_actual}.1/${prod_mask}"
    $shared_actual = $shared_base + $vm_offset
    $shared_ip     = "${network_base}.${shared_actual}.1/${shared_mask}"

Once we have these values, we can start assigning IP addresses. In the diagram at the top of the page, I used the offsets 11 for fwA and 12 for fwB, and the diagram shows the IP addresses allocated to each of those networks; for fwA, wan gets a DHCP address, prod gets 198.18.43.1/24, dev gets 198.18.75.1/24, shared gets 198.18.107.1/24 and transit gets 198.18.255.11/24. These are all offset from the supernet allocation. If you were expecting more than 32 firewalls in your supernet (the numbering starts at “0”, so offsets of 0 to 31), then you could allocate different ranges!
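A quick sanity check of that arithmetic for fwA (vm_offset 11), using the class defaults above:

vm_offset=11
echo "prod    198.18.$((32 + vm_offset)).1/24"     # 198.18.43.1/24
echo "dev     198.18.$((64 + vm_offset)).1/24"     # 198.18.75.1/24
echo "shared  198.18.$((96 + vm_offset)).1/24"     # 198.18.107.1/24
echo "transit 198.18.255.${vm_offset}/24"          # 198.18.255.11/24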

Anyway, to allocate the addresses to the interfaces, I want to use NetworkManager, as it’s built into these systems, and has some pretty good tooling around it. You can either mangle text files and re-apply them, or interact with a command line tool called nmcli. Rather than putting a whole load of work into building the text files, or executing lots of nested nmcli commands, I wrote a single python script, called configure_nm_if.py, and we execute this from the manifest, both as a test, to see if we need to make any changes, and to make the change itself.

    exec { 'Configure WAN Interface': # wan interface uses DHCP, so set to auto
      require => File['/usr/local/sbin/configure_nm_if.py'],
      command => '/usr/local/sbin/configure_nm_if.py wan auto',
      unless  => '/usr/local/sbin/configure_nm_if.py wan auto --test',
      notify  => Exec['Reboot'],
    }
    exec { 'Configure Dev Interface':
      require => File['/usr/local/sbin/configure_nm_if.py'],
      command => "/usr/local/sbin/configure_nm_if.py dev ${dev_ip}",
      unless  => "/usr/local/sbin/configure_nm_if.py dev ${dev_ip} --test",
      notify  => Exec['Reboot'],
    }

The script starts by working out which interfaces are configured by checking all the files in /etc/NetworkManager/system-connections and /run/NetworkManager/system-connections. In each of those files, lines are generally split into a key (like “interface” or “uuid”) and a value, which is what we’re looking for. Here’s that bit of code:

# Imports added here for clarity; the custom exception classes (ArgumentException,
# ProfileNotFound, NmcliFailed) are defined elsewhere in the full script.
import pathlib
import re
import subprocess

class nm_profile:
    def __init__(self, search_string: str):
        # The matching profile file and its calculated settings are per-instance state
        self.file = None
        self.settings = {}
        if search_string is None:
            raise ArgumentException('Invalid Search String')
        search = re.compile(r'^([^=]+)=(.*)\s*$')
        nm_dir = pathlib.Path("/run/NetworkManager/system-connections")
        for file_path in nm_dir.glob("*.nmconnection"):
            # Compare as a string; file_path is a pathlib.Path object
            if str(file_path) == search_string:
                self.file = file_path
            else:
                with open(file_path, "r") as f:
                    lines = f.readlines()
                    for line in lines:
                        compare = search.match(line)
                        if compare and compare.group(2) == search_string:
                            self.file = file_path
                            break
            if self.file is not None:
                break
        # Do the same thing for /etc/NetworkManager (cropped for brevity)
        if self.file is None:
            raise ProfileNotFound(
                f'Unable to find a profile matching the search string "{search_string}"')

        nmcli = subprocess.run(
            ["/bin/nmcli", "--terse", "connection", "show", self.file],
            capture_output=True, text=True
        )
        for line in nmcli.stdout.splitlines():
            data = line.split(":", 1)
            value = data[1].strip()
            if value == '':
                value = None
            self.settings[data[0].strip()] = value

This means that when we find the file with the configuration we want, we run nmcli to get the full, calculated collection of settings for that file. Next we work out whether anything would change between what it currently is (from the nmcli connection show command) and what we want it to be (from the arguments we pass into the script). That’s here:

def main():
    parser = argparse.ArgumentParser(
        description="Modify NetworkManager connection settings.")
    parser.add_argument("ifname", help="Interface name")
    parser.add_argument("ip", help="IP address or 'auto'")
    parser.add_argument("--dryrun", action="store_true",
                        help="Enable dry run mode")
    parser.add_argument("--test", action="store_true",
                        help="Enable test mode")
    args = parser.parse_args()

    actions = {}

    nm = nm_profile(args.ifname)

    current_id = nm.settings.get("connection.id")
    next_id = args.ifname
    if current_id != next_id:
        logging.debug(f'Change id from "{current_id}" to "{next_id}"')
        actions['connection.id'] = next_id

    current_method = nm.settings.get("ipv4.method")
    next_method = "manual" if args.ip != "auto" else "auto"

    if current_method != next_method:
        logging.debug(f'Change method from {current_method} to {next_method}')
        actions['ipv4.method'] = next_method

    current_ip = nm.settings.get("ipv4.addresses")
    next_ip = args.ip if args.ip != "auto" else None
    if next_ip is None and current_ip is not None:
        logging.debug(f'Change ipv4.address from {current_ip} to ""')
        actions['ipv4.addresses'] = ""
    elif next_ip != current_ip:
        logging.debug(
            f'Change ipv4.address from {current_ip if current_ip is not None else "None"} to {next_ip}')
        actions['ipv4.addresses'] = next_ip

Then, based on whether or not there are changes to be made, we either return a “success” or a “failure” if we’re testing for those changes (a failure provokes the manifest to trigger the change), or we make the change. That’s here:

    if len(actions) > 0:
        if args.test:
            logging.debug('There are outstanding actions, exit rc 1')
            sys.exit(1)

        command = [
            '/bin/nmcli', 'connection', 'modify', 
            nm.settings.get('connection.uuid', str(nm.file))
        ]

        for action in actions.keys():
            command.append(action)
            command.append(actions[action])
        logging.info(f'About to run the following command: {command}')

        if not args.dryrun:
            nmcli = subprocess.run(
                command,
                capture_output=True, text=True
            )
            if nmcli.returncode > 0:
                raise NmcliFailed(
                    f'Failed to run command {command}, RC: {nmcli.returncode} StdErr: {nmcli.stderr} StdOut: {nmcli.stdout}')

And then if we’ve made changes, we restart the connection, which provides us with a test that the change is a valid one!

        command = [
            '/bin/nmcli', 'connection', 'down', nm.settings.get(
                'connection.uuid', str(nm.file))
        ]
        logging.info(f'About to run the following command: {command}')

        if not args.dryrun:
            nmcli = subprocess.run(
                command,
                capture_output=True, text=True
            )
            if nmcli.returncode > 0:
                raise NmcliFailed(
                    f'Failed to run command {command}, RC: {nmcli.returncode} StdErr: {nmcli.stderr} StdOut: {nmcli.stdout}')

            command = [
                '/bin/nmcli', 'connection', 'up', nm.settings.get(
                    'connection.uuid', str(nm.file))
            ]
            logging.info(f'About to run the following command: {command}')
            nmcli = subprocess.run(
                command,
                capture_output=True, text=True
            )
            if nmcli.returncode > 0:
                raise NmcliFailed(
                    f'Failed to run command {command}, RC: {nmcli.returncode} StdErr: {nmcli.stderr} StdOut: {nmcli.stdout}')

Once that script has executed for each of the interfaces, we trigger a reboot (30 seconds after the Puppet agent has finished running, again). This is because the Puppet agent only gathers the details of the interfaces when it first runs, and so the subsequent manifests need these interfaces to be detected properly.

I mentioned before that the interfaces subclass needed to do one of three things. The third thing it “should” do is nothing, because this subclass is heavily reliant on reboots! If there are no changes to make, it just lets the code carry on so we can start working with the other aspects, and we’ll go next to BGP.

A brief note on my understanding of BGP

I want to take a quick diversion before I get started on the puppet code. I’m not hugely comfortable with BGP or, in fact, any of the dynamic routing protocols. I do understand that it’s a core and key part of the internet, and without it networking teams across the world would be lost!

That said, I’ve relied heavily on advice from a colleague at this point, so while this file does work, it may not be best practice. Please speak to someone more competent and confident with routing to help you if you have ANY issues whatsoever at this point!

Routing with BGP and FRR

I’m using FRR to set up BGP peers. Each peer advertises its own network segment to all of its peers. The BGP subclass manifest calculates the network segments in the same way as the interfaces subclass manifest did. We also build a list of all of the peers (the other firewalls in the supernets).

  if ($facts['networking']['interfaces']['transit'] and $facts['networking']['interfaces']['transit']['ip']) {
    $vm_lan_ip_address = $facts['networking']['interfaces']['transit']['ip']

    #################################################################
    ######## Work out the offset to get the firewall ID
    #################################################################
    $split_ip = split($vm_lan_ip_address, '[.]')
    # Extract the last octet, ensuring it exists
    if $split_ip and size($split_ip) == 4 {
      $vm_last_octet = Integer($split_ip[3])

      # Time to add the other important addresses for this device
      $dev_address    = "${network_base}.${$dev_offset + $vm_last_octet}.0/24"
      $prod_address   = "${network_base}.${$prod_offset + $vm_last_octet}.0/24"
      $shared_address = "${network_base}.${$shared_offset + $vm_last_octet}.0/24"

      # Calculate the peers from the range 0..31 (excluding this one)
      $peer_addresses = range(0, 31).map |$i| {
        "${network_base}.${transit_octet}.${i}"
      }.filter |$ip| { $ip != $vm_lan_ip_address }

We can start to build our configuration file… after we’ve defined a handful of initial variables:

class nftablesfirewall::bgp (
  String  $bgp_our_asn            = '65513',
  Boolean $bgp_our_peer_enabled   = true,
  Boolean $bgp_advertise_networks = true,
  Boolean $bgp_cloud_peer_enabled = false,
  String  $bgp_cloud_peer_asn     = '65511',
  Array   $bgp_cloud_peer_ips     = ['198.18.0.2', '198.18.0.3'],
  String  $network_base           = '198.18',
  Integer $transit_octet          = 255,
  Integer $prod_offset            = 32,
  Integer $dev_offset             = 64,
  Integer $shared_offset          = 96,
) {

The ASNs are in the range of “Private ASNs” from 64512-65535 allocated by IANA in RFC1930, and are roughly equivalent to the IP allocation 10.0.0.0/8.

FRR configuration looks a little like Cisco router configuration, and starts off as a template, like this:

! ######################################################
! # Basic Setup
! ######################################################
!
log syslog informational
frr defaults traditional
!
! ######################################################
! # Our BGP side
! ######################################################
!
router bgp <%= @bgp_our_asn %>
no bgp ebgp-requires-policy
bgp router-id <%= @vm_lan_ip_address %>
!
<%- if @bgp_our_peer_enabled -%>
! ######################################################
! # Firewall BGP peers (how we find our own routes)
! ######################################################
!
neighbor FW-PEERS peer-group
neighbor FW-PEERS remote-as <%= @bgp_our_asn %>
<% @peer_addresses.each do |ip| -%>
neighbor <%= ip %> peer-group FW-PEERS
<% end -%>
!
<%- end -%>
<%- if @bgp_cloud_peer_enabled -%>
! ######################################################
! # Cloud BGP peers (how Cloud finds us)
! ######################################################
!
neighbor CLOUD-PEERS peer-group
neighbor CLOUD-PEERS remote-as <%= @bgp_cloud_peer_asn %>
<% @bgp_cloud_peer_ips.each do |ip| -%>
neighbor <%= ip %> peer-group CLOUD-PEERS
<% end -%>
!
<%- end -%>
<%- if @bgp_advertise_networks -%>
! ######################################################
! # Our local networks
! ######################################################
!
address-family ipv4 unicast
    network <%= @dev_address %>
    network <%= @prod_address %>
    network <%= @shared_address %>
!
<%- end -%>
<%- if @bgp_our_peer_enabled -%>
! ######################################################
! Firewall BGP peers
! ######################################################
!
neighbor FW-PEERS activate
!
<%- end -%>
<%- if @bgp_cloud_peer_enabled -%>
! ######################################################
! Cloud BGP peers
! ######################################################
!
neighbor CLOUD-PEERS activate
!
<%- end -%>
exit-address-family
!
! ######################################################
! We don't use IPv6 yet
! ######################################################
!
address-family ipv6 unicast
exit-address-family
!

When rendered down for fwA (using the default parameters above, and abbreviating the long peer list), it looks something like this:

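! (a sketch of the rendered result for fwA; the peer list is abbreviated)
! ######################################################
! # Basic Setup
! ######################################################
!
log syslog informational
frr defaults traditional
!
! ######################################################
! # Our BGP side
! ######################################################
!
router bgp 65513
no bgp ebgp-requires-policy
bgp router-id 198.18.255.11
!
! ######################################################
! # Firewall BGP peers (how we find our own routes)
! ######################################################
!
neighbor FW-PEERS peer-group
neighbor FW-PEERS remote-as 65513
neighbor 198.18.255.0 peer-group FW-PEERS
neighbor 198.18.255.1 peer-group FW-PEERS
neighbor 198.18.255.2 peer-group FW-PEERS
! ... one "neighbor x.x.x.x peer-group FW-PEERS" line for each remaining
! ... transit address up to 198.18.255.31, skipping our own .11 ...
neighbor 198.18.255.31 peer-group FW-PEERS
!
! ######################################################
! # Our local networks
! ######################################################
!
address-family ipv4 unicast
    network 198.18.75.0/24
    network 198.18.43.0/24
    network 198.18.107.0/24
!
! ######################################################
! Firewall BGP peers
! ######################################################
!
neighbor FW-PEERS activate
!
exit-address-family
!
! ######################################################
! We don't use IPv6 yet
! ######################################################
!
address-family ipv6 unicast
exit-address-family
!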

When FRR is running we can access the “virtual TTY” interface of FRR by running vtysh and issuing commands. The main one I’ve been using is show ip bgp summary which tells you if your peer is connected, like this:

[root@vms11fw11 frr]# vtysh 

Hello, this is FRRouting (version 8.5.3).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

vms11fw11# show ip bgp summary

IPv4 Unicast Summary (VRF default):
BGP router identifier 198.18.255.11, local AS number 65513 vrf-id 0
BGP table version 6
RIB entries 11, using 2112 bytes of memory
Peers 31, using 22 MiB of memory
Peer groups 1, using 64 bytes of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
198.18.255.0    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.1    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.2    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.3    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.4    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.5    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.6    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.7    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.8    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.9    4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.10   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.12   4      65513         4         4        0    0    0 00:00:27            3        3 N/A
198.18.255.13   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.14   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.15   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.16   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.17   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.18   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.19   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.20   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.21   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.22   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.23   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.24   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.25   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.26   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.27   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.28   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.29   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.30   4      65513         0         0        0    0    0    never       Active        0 N/A
198.18.255.31   4      65513         0         0        0    0    0    never       Active        0 N/A

Total number of neighbors 31
vms11fw11#
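A couple of other vtysh commands that are worth knowing for the same sort of checking, run non-interactively here (both are standard FRR show commands):

vtysh -c 'show ip bgp'         # the full BGP table
vtysh -c 'show ip route bgp'   # which BGP routes made it into the kernel routing table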

There is a separate block of configuration which allows for the upstream cloud provider to also offer BGP, but this is not available in Vagrant!

What’s next?

Defining your Firewall policy

I mentioned at the top of the post that I was using NFTables, which is the successor to IPTables. The policy we are defining is very simple, but you can see quite quickly how this policy can be enhanced. This isn’t a template (although it could be), it’s just a plain file that Puppet installs via the policy subclass manifest, and then configures the default value in sysconfig to load that policy file.

How does that policy look? It has four pieces: variable definitions, the input policy, the forward policy and the postrouting masquerading (or NAT) chain. Let’s pick these apart separately.

In each block (table inet filter for policy elements and table ip nat for Masquerading) we can define some variables. They are separate and distinct from each other. Here I’ll specify all of the supernets (both “this cloud” and “another cloud”) in the policy and just the relevant local supernets for the masquerading.

#!/usr/sbin/nft -f

flush ruleset

table inet filter {

    ##############################################################################
    # Define network objects to be used later
    ##############################################################################
    set management_networks {
      type ipv4_addr
      flags interval
      ##############################################
      # NOTES BELOW ON WHY EACH NETWORK IS SPECIFIED
      ##############################################
      #            Vagrant      Local          Cloud
      elements = { 10.0.2.0/24, 198.18.0.0/19, 198.19.0.0/19 }
    }

    set prod_networks {
      type ipv4_addr
      flags interval
      ##############################################
      # NOTES BELOW ON WHY EACH NETWORK IS SPECIFIED
      ##############################################
      #            Local           Cloud
      elements = { 198.18.32.0/19, 198.19.32.0/19 }
    }

    set dev_networks {
      type ipv4_addr
      flags interval
      ##############################################
      # NOTES BELOW ON WHY EACH NETWORK IS SPECIFIED
      ##############################################
      #            Local           Cloud
      elements = { 198.18.64.0/19, 198.19.64.0/19 }
    }

    set shared_networks {
      type ipv4_addr
      flags interval
      ##############################################
      # NOTES BELOW ON WHY EACH NETWORK IS SPECIFIED
      ##############################################
      #            Local           Cloud
      elements = { 198.18.96.0/19, 198.19.96.0/19 }
    }

    # ...... Chains follow

    chain output {
        type filter hook output priority 0; policy accept;
    }
}

table ip nat {
    ##############################################################################
    # Define network objects to be used later
    ##############################################################################
    set masq_networks {
      type ipv4_addr
      flags interval
      ##############################################
      # NOTES BELOW ON WHY EACH NETWORK IS SPECIFIED
      ##############################################
      #            Prod            Dev             Shared
      elements = { 198.18.32.0/19, 198.18.64.0/19, 198.18.96.0/19 }
    }

    # ...... Chain follows
}

Input refers to what traffic is connecting to the host in question. As it’s a firewall, we want as little available as possible: ICMP, SSH from management addresses, DHCP assignment for VMs attached to this firewall, and BGP, to allow the peers to see each other. We should also allow established traffic to flow, and the “loopback” lo interface should be allowed to talk to anything on this host. This is actually combined with the previous code block, and I’ll indicate where that has happened, like I did before.

table inet filter {
    # ...... Variables as before
    chain input {
        type filter hook input priority 0; policy drop;

        # Allow loopback traffic
        iifname "lo" accept

        # Allow established and related connections
        ct state { established, related } accept

        # Allow ICMP traffic
        ip protocol icmp accept

        # Allow SSH (TCP/22) from specific subnets
        ip saddr @management_networks tcp dport 22 log prefix "A-NFT-input.management: " accept
        ip saddr @shared_networks     tcp dport 22 log prefix "A-NFT-input.shared: " accept

        # Allow DHCP and BOOTP traffic
        # This means that the nodes attached to this device can get IP addresses.
        ip protocol udp udp sport 68 udp dport 67 accept
        ip protocol udp udp sport 67 udp dport 68 accept

        # Allow BGP across the Transit interface
        iifname "transit" ip protocol tcp tcp dport 179 accept
        oifname "transit" ip protocol tcp tcp dport 179 accept

        # Drop everything else
        log prefix "DROP_ALL-NFT-input: " drop
    }
    # ...... Forward chain follows
}

Forwarding relates to what passes over this box. We want:

  • all established traffic to be allowed to pass
  • almost all ICMP traffic to be permitted
  • the shared supernet to be able to talk to any host
  • the dev supernet to be able to talk to any other host in the dev supernet, or to any host in the shared supernet
  • the prod supernet to be able to talk to any other host in the prod supernet, or to any host in the shared supernet
  • any host in the shared, dev and prod supernets to be able to talk to any host on the internet (except excluded network ranges)
  • excluded network ranges to be dropped

Let’s take a look at that.

table inet filter {
    # ...... Variables as before
    # ...... Input chain as before
    chain forward {                                         # Forward is "What can go THROUGH this host"
        type filter hook forward priority 0; policy drop;

        # Allow established and related connections
        ct state { established, related } accept

        # ICMP rules
        ip protocol icmp icmp type { echo-reply, echo-request, time-exceeded, destination-unreachable } accept

        # Shared network can talk out to anything
        ip saddr @shared_networks log prefix "A-NFT-forward.shared-any: " accept
        
        # Allow intra-segment traffic
        ip saddr @dev_networks    ip daddr @dev_networks  log prefix "A-NFT-forward.dev-dev: "   accept
        ip saddr @prod_networks   ip daddr @prod_networks log prefix "A-NFT-forward.prod-prod: " accept
        
        # Allow Prod, Dev access to Shared
        ip saddr @dev_networks    ip daddr @shared_networks log prefix "A-NFT-forward.dev-shared: " accept
        ip saddr @prod_networks   ip daddr @shared_networks log prefix "A-NFT-forward.prod-shared: " accept

        # Allow all segments access to the Internet, block the following subnets
        ip daddr != {
          0.0.0.0/8,                                      # RFC1700 (local network)
          10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16,      # RFC1918 (private networks)
          169.254.0.0/16,                                 # RFC3927 (link local)
          192.0.0.0/24,                                   # RFC5736 ("special purpose") 
          192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24,  # RFC5737 ("TEST-NET")
          192.88.99.0/24,                                 # RFC3068 ("6to4 relay")
          198.18.0.0/15,                                  # RFC2544 ("Inter-networking tests")
          224.0.0.0/4, 240.0.0.0/4                        # RFC1112, RFC6890 ("Special Purpose" and Multicast)
        } log prefix "A-NFT-forward.all-internet: " accept

        # Drop everything else
        log prefix "DROP_ALL-NFT-forward: " drop
    }
}

And lastly, we take a look at the Masquerading part of this. Here we want to masquerade (or “Hide NAT”) any traffic leaving on the WAN interface.

table ip nat {
    # ...... Variables as before
    chain postrouting {
        type nat hook postrouting priority 100; policy accept;

        # Masquerade all traffic going out of the WAN interface
        ip saddr @masq_networks oifname "wan" masquerade
    }
}

As you can see, the language of these policies is quite easy:

  • iifname the interface the traffic came in on
  • oifname the interface the traffic exits on
  • ip saddr the IP address, subnets or variable name the source address is in
  • ip daddr the IP address, subnets or variable name the destination address is in
  • ip protocol {udp|tcp|icmp} the protocol that the service travels over
  • {tcp|udp} sport The source TCP or UDP port
  • {tcp|udp} dport The destination TCP or UDP port
  • ct state the connection status of a packet
  • accept Permit the traffic to flow
  • drop Stop the traffic from flowing
  • log prefix "Some String" Add a prefix to the log line

Because the file is executable and starts with #!/usr/sbin/nft -f, applying this policy is just a case of executing it! Dead simple.
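For checking the file before it goes live, plain nft is enough (the path here is just an assumption; use wherever the policy subclass installs the file):

nft -c -f /etc/nftables/firewall.nft   # parse-check only, nothing is loaded
nft list ruleset                       # show what the kernel is actually running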

The only thing left to do is to set up DHCP for the nodes and to test it!

DHCPd

DHCP is a protocol for automatically assigning IP addresses to nodes. In this case, we’re using dnsmasq, which is a small server that performs DNS resolution, as well as DHCP and, if we need it later, TFTP. This is a simple package-install away, and a very simple configuration file template too.

dhcp-option=option:dns-server,<%= @dns_servers %>

# Listen only on the specified interfaces
interface=<%= @dev_nic %>
dhcp-range=<%= @dev_nic %>,<%= @dev_subnet %>.10,<%= @dev_subnet %>.250,255.255.255.0,6h
dhcp-option=<%= @dev_nic %>,option:router,<%= @dev_gateway %>

interface=<%= @prod_nic %>
dhcp-range=<%= @prod_nic %>,<%= @prod_subnet %>.10,<%= @prod_subnet %>.250,255.255.255.0,6h
dhcp-option=<%= @prod_nic %>,option:router,<%= @prod_gateway %>

interface=<%= @shared_nic %>
dhcp-range=<%= @shared_nic %>,<%= @shared_subnet %>.10,<%= @shared_subnet %>.250,255.255.255.0,6h
dhcp-option=<%= @shared_nic %>,option:router,<%= @shared_gateway %>

Here we define the network interface to listen on (ending _nic), the subnet range to allocate (ending _subnet) and the gateway address of this host (ending _gateway). We’ve also told it where to get its DNS records from (dns_servers).
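A couple of quick checks once dnsmasq is running on a firewall (the lease file path below is the usual EL9 default, but may vary by distro):

ss -ulpn | grep dnsmasq                # confirm it is listening on the DNS/DHCP ports
cat /var/lib/dnsmasq/dnsmasq.leases    # leases handed out so far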

Building the testing hosts

Back to our Vagrantfile. We define the “VM” entries in the diagram at the top, attached to each of the networks (Prod, Dev and Shared) on the A and B sides. The configuration is largely the same between each of these items, so I’ll only show one of them:

  config.vm.define :prodA do |config|
    config.vm.network "private_network", auto_config: false, virtualbox__intnet: "prodA"
    config.vm.network "private_network", ip: "192.168.56.#{vms_A_number + 10}", name: "vboxnet0"
    config.vm.hostname = "prod-#{vms_A_number}"
    config.vm.provision "shell", path: "client/manage_routes.sh"
  end

Honestly, the hostname didn’t need to be set, but it makes life easier, and the private_network on vboxnet0 is just there for the DNF cache, as we’re not using Puppet here. The only thing the client/manage_routes.sh script does is remove the default route that Vagrant puts in to connect the node to the host for outbound NAT, ensuring that traffic all goes through the firewall instead!
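The gist of that script is something like the following sketch (the real version is client/manage_routes.sh in the repository; the interface name is an assumption): drop the NAT default route on the Vagrant NIC so that the default route handed out by the firewall’s DHCP (option:router) wins.

ip route del default dev eth0 2>/dev/null || true   # remove Vagrant's NAT default route
ip route show default                               # what's left should point at the firewall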

So, once we’ve got all of that, we can test it!

Testing your Lab

Running vagrant up will start all the VMs. Each node has 2GB of RAM, plus the puppet server which has 4GB, so make sure your host OS has at least 20GB RAM. Once you’re done with your test, destroy it with vagrant destroy and it will ask you if you’re sure. If you’ve done some tweaking, and need to re-provision something, run vagrant provision or vagrant up --provision. You can also just do vagrant up hostname (like vagrant up puppet, vagrant up puppet fwA or vagrant up puppet fwA fwB, for example) or vagrant destroy hostname to manage individual nodes.

Because of how Puppet works, if you do this, be aware you may need to remove puppet certificates with puppetserver ca clean --certname hostname.as.fqdn (you’ll see the hostnames when puppet agent is run). Honestly, I ended up recreating everything if I was doing that much tweaking!

Once you’ve got nodes up and running, you can run vagrant ssh hostname (like vagrant ssh prodA) and execute commands on there. Remember up near the top of this, I created an nginx server? With this, and running for node in prodA prodB devA devB sharedA sharedB; do echo $node ; vagrant ssh $node -- ip -4 -br a ; done to get a list of the IP addresses, you can run vagrant ssh prodA -- curl http://ip.for.sharedA.node (like vagrant ssh prodA -- curl http://198.18.107.123) to make sure that your traffic across the firewalls is working right.

You can also do vagrant ssh fwA and then run sudo journalctl -k | grep A-NFT-forward to see packets flowing across the firewall, sudo journalctl -k | grep DROP_ALL-NFT to see packets being dropped, and sudo journalctl -k | grep A-NFT-input to see packets destined for the firewall. Beware with that last one, you’ll also see all your new SSH connections into it!


Wow! This is a BIG one! I hope you’ve found it useful. It took a while to build, and even longer to test! Enjoy!!

Chris Wallace: Art at Southmead Hospital

During my short stay with a bout of pneumonia, I spent each of four nights in a different ward. I...

Alan Pope: Spotlighting Community Stories

tl;dr I’m hosting a Community Spotlight Webinar today at Anchore featuring Nicolas Vuilamy from the MegaLinter project. Register here.


Throughout my career, I’ve had the privilege of working with organizations that create widely-used open source tools. The popularity of these tools is evident through their impressive download statistics, strong community presence, and engagement both online and at events.

During my time at Canonical, we saw the tremendous reach of Ubuntu, along with tools like LXD, cloud-init, and yes, even Snapcraft.

At Influxdata, I was part of the Telegraf team, where we witnessed substantial adoption through downloads and active usage, reflected in our vibrant bug tracker.

Now at Anchore, we see widespread adoption of Syft for SBOM generation and Grype for vulnerability scanning.

What makes Syft and Grype particularly exciting, beyond their permissive licensing, consistent release cycle, dedicated developer team, and distinctive mascots, is how they serve as building blocks for other tools and services.

Syft isn’t just a standalone SBOM generator - it’s a library that developers can integrate into their own tools. Some organizations even build their own SBOM generators and vulnerability tools directly from our open source foundation!

$ docker-scout version
 [ASCII-art Docker Scout banner omitted]

version: v1.13.0 (go1.22.5 - darwin/arm64)
git commit: 7a85bab58d5c36a7ab08cd11ff574717f5de3ec2

$ syft /usr/local/bin/docker-scout | grep syft
 ✔ Indexed file system /usr/local/bin/docker-scout
 ✔ Cataloged contents f247ef0423f53cbf5172c34d2b3ef23d84393bd1d8e05f0ac83ec7d864396c1b
 ├── ✔ Packages [274 packages]
 ├── ✔ File digests [1 files]
 ├── ✔ File metadata [1 locations]
 └── ✔ Executables [1 executables]
github.com/anchore/syft v1.10.0 go-module

(I find it delightfully meta to discover syft inside other tools using syft itself)

A silly meme that isn't true at all :)

This collaborative building upon existing tools mirrors how Linux distributions often build upon other Linux distributions. Like Ubuntu and Telegraf, we see countless individuals and organizations creating innovative solutions that extend beyond the core capabilities of Syft and Grype. It’s the essence of open source - a multiplier effect that comes from creating accessible, powerful tools.

While we may not always know exactly how and where these tools are being used (and sometimes, rightfully so, it’s not our business), there are many cases where developers and companies want to share their innovative implementations.

I’m particularly interested in these stories because they deserve to be shared. I’ve been exploring public repositories like the GitHub network dependents for syft, grype, sbom-action, and scan-action to discover where our tools are making an impact.

The adoption has been remarkable!

I reached out to several open source projects to learn about their implementations, and Nicolas Vuilamy from MegaLinter was the first to respond - which brings us full circle.

Today, I’m hosting our first Community Spotlight Webinar with Nicolas to share MegaLinter’s story. Register here to join us!

If you’re building something interesting with Anchore Open Source and would like to share your story, please get in touch. 🙏

Alun JonesLithium Ion Discharge Curve

Note: I started writing this post in July 2023, and forgot to finish it. The live battery graphs, linked below, give a bunch of extra info for guesstimating discharge curves - about 476 words

Chris WallaceTrees of Essaouira

Palms and Norfolk pines dominate the street scene. The Norfolk pines (Araucaria heterophylla) do...

BitFolk WikiSponsored hosting

Sponsored Projects: +BarCamp Surrey

← Older revision Revision as of 23:53, 13 January 2025
(One intermediate revision by the same user not shown)
Line 43: Line 43:


* [https://57north.co/ 57North Hacklab]
* [https://57north.co/ 57North Hacklab]
* [https://barcampsurrey.org/ BarCamp Surrey]
* [https://edinburghhacklab.com/ Edinburgh Hacklab]
* [https://edinburghhacklab.com/ Edinburgh Hacklab]
* [https://eof.org.uk/ EOF Hackspace]
* [https://eof.org.uk/ EOF Hackspace]
Line 54: Line 55:
* [http://www.somakeit.org.uk/ Southampton Makerspace]
* [http://www.somakeit.org.uk/ Southampton Makerspace]
* [http://www.surrey.lug.org.uk/ Surrey Linux User Group]
* [http://www.surrey.lug.org.uk/ Surrey Linux User Group]
* [http://ubuntupodcast.org/ Ubuntu Podcast]

BitFolk Issue TrackerBilling - Feature #219 (Closed): Add host name to data transfer reports

Seems to be working

BitFolk Issue TrackerBilling - Feature #219 (In Progress): Add host name to data transfer reports

BitFolk WikiBooting

Boot process: Removed tag for lost animated gif from imgur. Very old info anyway

← Older revision Revision as of 02:13, 10 January 2025
Line 28: Line 28:


==Boot process==
==Boot process==
<imgur thumb="yes" w="562">cufm6gd.gif</imgur>
On the right you should see an animated GIF of a terminal session where a Debian stretch VPS is booted with a GRUB-legacy config file. The bootloader config is viewed and booted. Then '''grub-pc''' package is installed to convert the VPS to GRUB2. ttyrec or ttygif seem to have introduced some corruption and offset characters, but you probably get the idea.
If you're looking at your console in the Xen Shell then the first thing that you should see is a list of boot methods that BitFolk's GRUB recognises. It checks for each of the following things, in order, on each of your block devices and every partition of those block devices:
If you're looking at your console in the Xen Shell then the first thing that you should see is a list of boot methods that BitFolk's GRUB recognises. It checks for each of the following things, in order, on each of your block devices and every partition of those block devices:


Alun JonesManaging load from abusive web bots

A few months back I created a small web application which generated a fake hierarchy of web pages, on the fly, using a Markov Chain to make gibberish content that - about 1987 words

David LeadbeaterDéjà vu: Ghostly CVEs in my terminal title

Exploring a security bug in Ghostty that is eerily familiar.

BitFolk Issue TrackerBilling - Feature #219 (Closed): Add host name to data transfer reports

A customer has requested that rather than just the VPS account name being used in the data transfer emails, the host name (reverse DNS of the primary IPv4 address) should be shown too, as they found it confusing which VPS was the subject of the report.

So instead of:

From: BitFolk Data Transfer Monitor <xfer@bitfolk.com>
Subject: [4/4] BitFolk VPS 'tom' data transfer report

This is a weekly report of data transfer for domain 'tom' on
host 'tanqueray.bitfolk.com'. This report is informational only and not
an invoice or bill.

It would look more like:

From: BitFolk Data Transfer Monitor <xfer@bitfolk.com>
Subject: [4/4] BitFolk VPS 'tom' (tom.dogsitter.services) data transfer report

This is a weekly report of data transfer for domain 'tom'
(tom.dogsitter.services) on host 'tanqueray.bitfolk.com'. This report
is informational only and not an invoice or bill.

BitFolk Issue TrackerBitFolk - Feature #216: Add phishing-resistant authentication for https://panel.bitfolk.com/

Thanks. I have started to give this a read now. 😀

Jon SpriggsQuick Tip: Don’t use concat in your spreadsheet, use textjoin!

I found this on Threads today

CONCAT vs TEXTJOIN – The ultimate showdown! 🥊
TEXTJOIN is the GOAT:
=TEXTJOIN(“, “, TRUE, A1:A10)
● Adds delimiters automatically
● Ignores empty cells
● Works with ranges
Goodbye CONCAT, you won’t be missed!

And I’ve tested it this morning. I don’t have Excel any more, but it works in Google Sheets, no worries!

BitFolk Issue TrackerBitFolk - Feature #216: Add phishing-resistant authentication for https://panel.bitfolk.com/

Here's an excellent tour through everything WebAuthn.

https://www.imperialviolet.org/tourofwebauthn/tourofwebauthn.html

Andy SmithI recommend avoiding the need to have panretinal photocoagulation (PRP) laser treatment

WARNING

This article contains descriptions of medical procedures on the eye. If that sort of thing makes you squeamish you may want to give it a miss.

Yesterday I had panretinal photocoagulation (PRP) laser treatment in both eyes, and it was quite unpleasant. I recommend trying really hard to avoid having to ever have that if possible.

PRP is used to manage symptoms of proliferative diabetic retinopathy. A laser is used to burn abnormal new blood vessels around the retina.

Having had a different kind of laser treatment before I wasn't expecting this to be a big deal. Unfortunately I was wrong and it was a bit of an ordeal.

As usual at these eye examinations I had drops to dilate my pupils and a bunch of different scans and photographs of the back of my eye taken so they knew what they were dealing with. Then in preparation for the procedure, some numbing eye drops. It's an odd sensation not being able to feel your eyelids or the skin around your eyes, but that part wasn't uncomfortable.

Next up the consultant held some sort of eyepiece firmly against the surface of my eyeball and applied a decent amount of pressure to keep it in place.

Then the laser pulses began. Many, many pulses. Each caused an unpleasant stabbing sensation in my eyeball with a dull ache following it. It wasn't so much that it was painful — Wikipedia describes this as "stinging" and in isolation I'd agree with that description. However while this was taking place my head was in a chin rest with an eyepiece thing pressed against my eyeball and the knowledge that if I moved unexpectedly then I risked having my vision destroyed by the laser. And these laser pulses were coming multiple times per second.

I was doing some grunting at the discomfort of each laser pulse when…

Consultant: What! I'm on 30% power. If I make it any lower it'll be homeopathy, know what I mean? It needs to be effective.

Me, through gritted teeth: Just do what you need to do.

Another thing I was not prepared for was total blindness during the procedure and for a few minutes after. He was telling me to look in certain directions, but my vision had gone completely black due to the laser so I couldn't actually tell which direction I was looking in.

Then when it was finally over for one eye, I still could not see anything and due to the anaesthetic could not even tell if my eye was open or not as I couldn't feel my eyelid. Thankfully that recovered after a couple of minutes so he could begin on the other eye…

Post procedure was not too bad. It's an outpatient procedure and I was immediately able to go home on the bus! My eyes just felt tired and took a lot longer to recover from the dilation drops than they usually do (I have vision tests several times a year and they always involve dilation drops). A headache between the temples did force me to go to bed early, but feel fine today.

So… if you have diabetes then blood sugar control is important to help avoid having to go through something like this. If you lose a genetic lottery then after decades living with diabetes you may need it anyway, or if you win then perhaps you never do, but I just suggest doing what you can to improve your odds.

This is still only the second most unpleasant procedure I've had on my eye though!

Jon SpriggsA few weird issues in the networking on our custom AWS EKS Workers, and how we worked around them

For “reasons”, at work we run AWS Elastic Kubernetes Service (EKS) with our own custom-built workers. These workers are based on Alma Linux 9, instead of AWS’ preferred Amazon Linux 2023. We manage the deployment of these workers using AWS Auto-Scaling Groups.

Our unusual configuration of these nodes means that we sometimes trip over configurations which are tricky to get support on from AWS (no criticism of their support team; if I was in their position, I wouldn’t want to try to provide support for a customer’s configuration that was so far outside the recommended setup either!)

Over the past year, we’ve upgraded EKS1.23 to EKS1.27 and then on to EKS1.31, and we’ve stumbled over a few issues on the way. Here are a few notes on the subject, in case they help anyone else in their journey.

All three of the issues below were addressed by running an additional script on the worker nodes, triggered every minute by a systemd timer.

Incorrect routing for the 12th IP address onwards

Something the team found really early on (around EKS 1.18 or somewhere around there) was that the AWS VPC-CNI wasn’t managing the routing tables on the node properly. We raised an issue on the AWS VPC CNI (we were on CentOS 7 at the time) and although AWS said they’d fixed the issue, we currently need to patch the routing tables every minute on our nodes.

What happens?

When you get past the number of IP addresses that a single ENI can have (typically ~12), the AWS VPC-CNI will attach a second interface to the worker and start adding new IP addresses to that. The VPC-CNI should set up routing for that second interface, but for some reason, in our case, it doesn’t. You can see this happening with a tcpdump: the traffic comes in on the second ENI, eth1, but then tries to exit the node on the first ENI, eth0, like this:

[root@test-i-01234567890abcdef ~]# tcpdump -i any host 192.0.2.123
tcpdump: data link type LINUX_SLL2
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
09:38:07.331619 eth1  In  IP ip-192-168-1-100.eu-west-1.compute.internal.41856 > ip-192-0-2-123.eu-west-1.compute.internal.irdmi: Flags [S], seq 1128657991, win 64240, options [mss 1359,sackOK,TS val 2780916192 ecr 0,nop,wscale 7], length 0
09:38:07.331676 eni989c4ec4a56 Out IP ip-192-168-1-100.eu-west-1.compute.internal.41856 > ip-192-0-2-123.eu-west-1.compute.internal.irdmi: Flags [S], seq 1128657991, win 64240, options [mss 1359,sackOK,TS val 2780916192 ecr 0,nop,wscale 7], length 0
09:38:07.331696 eni989c4ec4a56 In  IP ip-192-0-2-123.eu-west-1.compute.internal.irdmi > ip-192-168-1-100.eu-west-1.compute.internal.41856: Flags [S.], seq 3367907264, ack 1128657992, win 26847, options [mss 8961,sackOK,TS val 1259768406 ecr 2780916192,nop,wscale 7], length 0
09:38:07.331702 eth0  Out IP ip-192-0-2-123.eu-west-1.compute.internal.irdmi > ip-192-168-1-100.eu-west-1.compute.internal.41856: Flags [S.], seq 3367907264, ack 1128657992, win 26847, options [mss 8961,sackOK,TS val 1259768406 ecr 2780916192,nop,wscale 7], length 0

The critical line here is the last one – it’s come in on eth1 and it’s going out of eth0. Another test here is to look at ip rule

[root@test-i-01234567890abcdef ~]# ip rule
0:	from all lookup local
512:	from all to 192.0.2.111 lookup main
512:	from all to 192.0.2.143 lookup main
512:	from all to 192.0.2.66 lookup main
512:	from all to 192.0.2.113 lookup main
512:	from all to 192.0.2.145 lookup main
512:	from all to 192.0.2.123 lookup main
512:	from all to 192.0.2.5 lookup main
512:	from all to 192.0.2.158 lookup main
512:	from all to 192.0.2.100 lookup main
512:	from all to 192.0.2.69 lookup main
512:	from all to 192.0.2.129 lookup main
1024:	from all fwmark 0x80/0x80 lookup main
1536:	from 192.0.2.123 lookup 2
32766:	from all lookup main
32767:	from all lookup default

Notice here that we have two entries from all to 192.0.2.123 lookup main and from 192.0.2.123 lookup 2. Let’s take a look at what lookup 2 gives us, in the routing table

[root@test-i-01234567890abcdef ~]# ip route show table 2
192.0.2.1 dev eth1 scope link

Fix the issue

This is pretty easy – we need to add a default route if one doesn’t already exist. Long before I got here, my boss created a script which first runs ip route show table main | grep default to get the gateway for that interface, then runs ip rule list, looks for each lookup <number> and finally runs ip route add to put the default route on that table, the same as on the main table.

ip route add default via "${GW}" dev "${INTERFACE}" table "${TABLE}"
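For illustration, here’s a minimal sketch of that logic (not the original script; it assumes bash, root, and that the gateway and interface from the main table are what every extra table should use):

#!/usr/bin/env bash
# Make sure every policy-routing table referenced by "ip rule" has a
# default route, copied from the main table (a sketch of the idea above).
set -euo pipefail

# Gateway and interface from the main table's default route,
# e.g. "default via 10.0.0.1 dev eth0 proto dhcp ..."
read -r _ _ GW _ INTERFACE _ < <(ip route show table main | grep '^default')

# Every "lookup <number>" table referenced by the policy rules
for TABLE in $(ip rule list | grep -o 'lookup [0-9]\+' | awk '{print $2}' | sort -u); do
    # Only add a default route if the table doesn't already have one
    if ! ip route show table "$TABLE" | grep -q '^default'; then
        ip route add default via "${GW}" dev "${INTERFACE}" table "${TABLE}"
    fi
done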

Is this still needed?

I know when we upgraded our cluster from EKS1.23 to EKS1.27, this script was still needed. When I’ve just checked a worker running EKS1.31, after around 12 hours of running, and a second interface being up, it’s not been needed… so perhaps we can deprecate this script?

Dropping packets to the containers due to Martians

When we upgraded our cluster from EKS1.23 to EKS1.27 we also changed a lot of the infrastructure under the surface (AlmaLinux 9 from CentOS7, Containerd and Runc from Docker, CGroups v2 from CGroups v1, and so on). We also moved from using an AWS Elastic Load Balancer (ELB) or “Classic Load Balancer” to AWS Network Load Balancer (NLB).

Following the upgrade, we started seeing packets not arriving at our containers and the system logs on the node were showing a lot of martian source messages, particularly after we configured our NLB to forward original IP source addresses to the nodes.

What happens

One thing we noticed was that each time we added a new pod to the cluster, it added a new eni[0-9a-f]{11} interface, but the sysctl value for net.ipv4.conf.<interface>.rp_filter (return path filtering – basically, should we expect traffic from that source to be arriving at this interface?) was set to 1, or “Strict mode”, where the source MUST be reachable via the best return path for the interface it arrived on. The AWS VPC-CNI is supposed to set this to 2, or “Loose mode”, where the source only has to be reachable from any interface.

In this case you can tell it’s happening because you’ll see entries like this in your system journal (assuming you’ve got net.ipv4.conf.all.log_martians=1 configured):

Dec 03 10:01:19 test-i-01234567890abcdef kernel: IPv4: martian source 192.168.1.100 from 192.0.2.123, on dev eth1

The net result is that packets would be dropped by the host at this point, and they’d never be received by the containers in the pods.

Fix the issue

This one is also pretty easy. We run sysctl -a and loop through any entries which match net.ipv4.conf.([^\.]+).rp_filter = (0|1) and then, if we find any, we run sysctl -w net.ipv4.conf.\1.rp_filter = 2 to set it to the correct value.
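As a minimal sketch of that loop (again assuming bash and root, run from the same per-minute timer):

# Switch strict (1) or disabled (0) reverse-path filtering to loose
# mode (2) on every per-interface conf entry, as described above.
sysctl -a 2>/dev/null |
    grep -E '^net\.ipv4\.conf\.[^.]+\.rp_filter = [01]$' |
    cut -d' ' -f1 |
    while read -r key; do
        sysctl -w "${key}=2"
    done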

Is this still needed?

Yep, absolutely. As of our latest upgrade to EKS1.31, if this value isn’t set, then it will drop packets. VPC-CNI should be fixing this, but for some reason it doesn’t. And setting net.ipv4.conf.all.rp_filter to 2 doesn’t seem to make a difference, which is contrary to the relevant kernel documentation.

After 12 IP addresses are assigned to a node, Kubernetes services stop working for some pods

This was pretty weird. When we upgraded to EKS1.31 on our smallest cluster we initially thought we had an issue with CoreDNS, in that it sometimes wouldn’t resolve IP addresses for services (DNS names for services inside the cluster are resolved by <servicename>.<namespace>.svc.cluster.local to an internal IP address for the cluster – in our case, in the range 172.20.0.0/16). We upgraded CoreDNS to the EKS1.31 recommended version, v1.11.3-eksbuild.2 and that seemed to fix things… until we upgraded our next largest cluster, and things REALLY went wrong, but only when we had increased to over 12 IP addresses assigned to the node.

You might see this as frequent restarts of a container, particularly if you’re reliant on another service to fulfil an init container or the liveness/readiness check.

What happens

EKS1.31 moves KubeProxy from iptables or ipvs mode to nftables – a shift we had to make internally as AlmaLinux 9 no longer supports iptables mode, and ipvs is often quite flaky, especially when you have a lot of pod movements.

With a single interface and up to 11 IP addresses assigned to that interface, everything runs fine, but the moment we move to that second interface, much like in the first case above, we start seeing those pods attached to the second+ interface being unable to resolve service addresses. On further investigation, doing a dig from a container inside that pod to the service address of the CoreDNS service 172.20.0.10 would time out, but a dig against the actual pod address 192.0.2.53 would return a valid response.

Under the surface, on each worker, KubeProxy adds a rule to nftables to say “if you try and reach 172.20.0.10, please instead direct it to 192.0.2.53”. As the containers fluctuate inside the cluster, KubeProxy is constantly re-writing these rules. For whatever reason though, KubeProxy currently seems unable to determine that a second or subsequent interface has been added, and so these rules are not applied to the pods attached to that interface… or at least, that’s what it looks like!

Fix the issue

In this case, we wrote a separate script which was also triggered every minute. This script checks whether the set of eth[0-9]+ interfaces has changed, by running ip link; if it has, it runs crictl pods (which lists all the running pods in Containerd), looks for the Pod ID of KubeProxy, and then runs crictl stopp <podID> [1] and crictl rmp <podID> [1] to stop and remove the pod, forcing kubelet to restart the KubeProxy on the node.

[1] Yes, they aren’t typos, stopp means “stop the pod” and rmp means “remove the pod”, and these are different to stop and rm which relate to the container.
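A minimal sketch of that check (assumptions: the kube-proxy pod sandbox is named kube-proxy, so crictl’s --name filter finds it, and the previous interface state is kept in a file between runs of the timer):

# Compare the current set of eth* interfaces with the last run; if it
# changed, bounce the KubeProxy pod sandbox so kubelet recreates it.
STATE=/var/run/eks-eth-interfaces
CURRENT=$(ip -o link show | awk -F': ' '$2 ~ /^eth[0-9]+$/ {print $2}' | sort)

if [ ! -f "$STATE" ] || [ "$CURRENT" != "$(cat "$STATE")" ]; then
    printf '%s\n' "$CURRENT" > "$STATE"
    POD_ID=$(crictl pods --name kube-proxy -q | head -n 1)
    if [ -n "$POD_ID" ]; then
        crictl stopp "$POD_ID"
        crictl rmp "$POD_ID"
    fi
fi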

Is this still needed?

As this was what I was working on all day yesterday, yep, I’d say so 😊 – in all seriousness though, if this hadn’t been a high-priority issue on the cluster, I might have tried to upgrade the AWS VPC-CNI and KubeProxy add-ons to a later version, to see if the issue was resolved, but at this time, we haven’t done that, so maybe I’ll issue a retraction later 😂

Featured image is “Apoptosis Network (alternate)” by “Simon Cockell” on Flickr and is released under a CC-BY license.

Ross YoungerGetting value from CI

Many of my peers within the software world know the value of Continuous Integration and don’t need convincing. This article is for everybody else.

Introduction

In my first job out of college we had what you’d recognise as CI, though the term wasn’t so popular then. It was powerful, very useful, but a source of Byzantine complexity.

I’ve also worked for people who didn’t think CI was worth doing because it was too expensive to set up and maintain. This is not totally unreasonable; the real question is to figure out where the value for your project might lie.

Recently, a friend wrote:

I don't really know very much about CI. I would be interested in knowing more and might even use some of the quick wins (...) I do not want to become completely reliant upon GitHub for anything.

So let’s start with a primer.

Terminology: What is CI?

Unfortunately the term “CI” is sometimes misused and/or confused.

The short answer is that it’s automation that regularly (continuously) does something useful with your codebase. These actions might take place on every commit, nightly, or be activated by some external trigger.

CI usually refers to a spectrum of practices, each step building on the last:

  • Continuous Build: builds your code, usually to the unit or module level. Runs unit tests.
  • Continuous Integration: assembles modules into a “finished application”, whatever that means. Runs integration tests.
  • Continuous Test: a full suite of automated tests. May include regression, performance, deployability and data migration.
  • Continuous Delivery: when the test suite passes, the latest version of the system is automatically released to a staging environment. This might involve building packages and putting them in a download area.
  • Continuous Deployment: when the automated tests pass, the software automatically goes live. Hold tight!

Exactly what these phases mean, and how far you go with them, depends on your project.

  • What suits my embedded firmware probably won’t suit your cloud app or that other person’s desktop app.
  • The lines between the phases are blurry. For example, it may or may not make sense to build and integrate everything in one go.

Why CI

If deployed appropriately, CI can save time, reduce costs and improve quality. Even on a hobby project, there is often value in saving your time.

1. Automating stuff, so the humans don’t have to

You could use your engineers to do the repetitive drudge work of creating a release across multiple platforms. You could have them run a full barrage of tests before committing a code change… but should you? Engineers are expensive and generally dislike boring stuff, so the smart business move is usually to automate away the repetitive parts and have them focus where they can deliver most value.

If you’re not sure, consider this: how much time does your team spend per release cycle on the repetitive parts? Multiply that by your expected frequency of release cycles, and that should lead you to the answer.

2. Automatic analysis and status reporting

One place I worked had a release process which relied on an engineer reading multiple megabytes of log file to see if things had been successful. Many things could go wrong and leave the final output in a plausible but half-broken state. Worse, it wasn’t as simple as running the script in stop-on-error mode, because some of the steps were prone to false alarms.

You may be ahead of me here, but I didn’t think much of that setup.

Compilation failed? Show me the compiler output from the file that failed.

A test failed? I want to see the result of that test (expected & observed results).

Everything passed? Great, but don’t spend megabytes to convey one single bit of information.

At its simplest, a small project will have a single main branch, and the operational information you need can be boiled down to a small number of states:

  • Red traffic light: something is broken
  • Yellow traffic light: non-critical warning (not all projects use this)
  • Green traffic light: everything is working

In a non-remote workplace it might make sense to set up some sort of status annunciator.

  • Some people use coloured lava lamps or similar.
  • At one place I worked the machinery in the factory had physical traffic light (andon) lamp sets. We set one of these up, driven by a Raspberry Pi wired in to the build server.
  • Some projects build more elaborate virtual dashboards that suit their needs. Multiple branches, multiple build configurations, whatever makes sense.

3. Improved quality

This one might be self-evident, but I’ll spell it out anyway.

A good CI system will let you incorporate tests of many different types, with variable pass/fail criteria. Think beyond unit and integration testing:

  • Regression (check that your bugs stay fixed)
  • Code quality (code/test coverage analysis; static analysis; dynamic memory leak analysis; automated code style checks)
  • Security analysis (are there any known issues in your dependencies?)
  • License/SBOM compliance
  • Fuzz testing (how does it handle randomised, unexpected inputs?)
  • Performance requirements
  • “Early warning” performance canaries
  • Standards compliance
  • System data migration
  • On-device testing (might be real, emulated or simulated hardware)
Performance canaries

Particularly where physical devices are involved, you might have a performance margin built in to your hardware spec. As the project evolves, new features will inevitably erode this margin. When you run out, things are going to go wrong, so you want to take action before you get there.

An early warning canary is some sort of metric with a threshold. Examples might include free memory, CPU/MCU consumption, or task processing time. When the threshold is passed, that's a sign that things are getting tight and it's time to take pre-emptive action. You might plan to spend some time on algorithmic optimisations, or to kick off a new hardware spin.
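As a toy illustration (not from any particular project), a canary can be as small as a threshold check that fails the job:

# Fail the job when available memory on the test device drops below an
# arbitrary threshold.
THRESHOLD_MB=64
FREE_MB=$(awk '/^MemAvailable:/ {print int($2/1024)}' /proc/meminfo)

if [ "$FREE_MB" -lt "$THRESHOLD_MB" ]; then
    echo "Canary tripped: only ${FREE_MB}MB available (threshold ${THRESHOLD_MB}MB)" >&2
    exit 1
fi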

If you can automate a really robust set of tests, you can have a lot of confidence in the state of your code at any given time. This gives incredible agility: you can release at any time, if the tests pass. This is the key to moving quickly, and is how a number of tech companies operate.

For a success story involving physical devices, check out the HP LaserJet team’s DevOps transformation.

4. Reduced time to resolve issues

If there’s one thing I’ve learned in the software business, it’s that it’s cheaper to find bugs closer to development - by orders of magnitude.

In other words, reduce the feedback cycle to reduce your costs. This is where automated tests and checks have great value.

  • If there is something wrong in code I modified a minute ago, I’m still in the right headspace and can usually fix it pretty quickly.
  • If it takes a few days to get a test result, I won’t remember all the detail and will have to refresh my memory.
  • If it takes several months to hear that something’s wrong, I may be working on a totally different part of the system and it will take longer to context switch.
  • If a bug report comes in from the field a year or two later, I might as well be starting again from scratch.

But - as ever - engineering is a trade-off. You can’t write a test to catch a bug you haven’t foreseen. It may be prohibitively expensive to test all possible combinations before release.

Why not CI

CI is not suitable for all software projects.

If you’re writing a scratch throw-away project that won’t live for very long, even simple CI may not be worth it.

If you have a legacy codebase that was written without testing in mind, it might be prohibitively expensive to refactor to set these up. Nevertheless, in such projects there is often still some value to be found in a continuous build.

Let’s be pragmatic.

Tests aren’t everything

On the face of it, more testing means greater quality, right? Well… maybe?

Keep the end goal in sight. It’s up to you to decide what makes sense for your situation; I recommend taking a whole-of-organisation view.

  • You need to balance test runtime against overall feedback cycles. If the tests take too long to run, you’re slowing people down.
  • Some tests are expensive in terms of time or consuming resources, so you might not want to run them daily.
  • Tests involving physical devices can be difficult to automate, and risk creating a process bottleneck. (Consider emulation and/or simulation where appropriate.)
  • Beware of over-testing; you may not need to exhaustively check all the combinations. Statistical techniques might help you out here.
  • Beware of making your black-box tests too strict; this can lead to brittle tests that are more hassle to maintain than they are worth.

Costs and maintenance

It will take time and effort to set up CI. How much time and effort, I can’t say.

In times past, CI was quite the bespoke effort.

These days there is good tooling support for many environments, so it is usually pretty quick to get something going. From there you can decide how far to go.

It might be too big for your platform

Cloud-hosted CI platforms are designed for small, lightweight jobs. Think seconds to minutes, not hours.

If you need to build a large application or a full Yocto firmware image, it’s going to be tough to make that fit within the limits of a cloud-hosted CI platform. Don’t despair! There are ways out, but you need to be smart. Alternative options include:

  • self-hosting CI runners that take part in a cloud-hosted source repository;
  • self-hosting the CI environment (e.g. Gitlab, Jenkins, CircleCI), noting that most source code hosting platforms have integrations;
  • splitting up the task into multiple smaller CI jobs, making good use of artefacts between stages;
  • reconsidering what is truly worth automating anyway.

👷 Steps you can take

1. Build your units

In most projects you’ve already had to set up a build system. Automating this is usually pretty cheap, though you will need to get the tooling right.

Tooling on cloud platforms

On-cloud CI (as provided by Github, Gitlab, Bitbucket and others) is generally containerised. What this means is that your project has to know how to install its own tooling, starting with a minimal (usually Linux) container image.

This is really good practice! Doing so means your required tools are themselves expressed in source code under revision control.

Where this might get tricky is if you have multiple build configurations (platforms or builds with different features). Don’t be surprised if automating reveals shortcomings in your setup.

If you have autogenerated documentation, consider running that too. (In Rust, for example, it could be as easy as adding a cargo doc step.)
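To make that concrete, a build-stage job often boils down to a handful of commands; here’s a sketch for a Rust project (the CI platform wrapping of runners, YAML and caching is deliberately left out):

set -e
cargo build --all-targets   # build the units
cargo test                  # run the unit tests
cargo doc --no-deps         # build the autogenerated documentation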

2. Test your units

Adding unit tests to CI is usually pretty cheap, though it will depend on the language and available test frameworks.

If you want to include language-intrinsic checks (e.g. code style, static analysis) this is a good time to build them in. Some analyses can be quite expensive so it may not make sense to run all the checks at the same frequency.

3. Integrate it

If you’re pulling multiple component parts (microservices, standalone executables) together to make an end result, that’s the next step. Do they play nicely? Do you want to run any isolated tests among them before you move to delivery-level tests?

4. Add more checks

I spoke about these above.

This is where things stop being cheap and you have to start thinking about building out supporting infrastructure.

5. Deliver it

Now we’re getting quite situation-specific. Think about what it means to deliver your project.

Are you building a package for an ecosystem (Rust crate / Python PyPI / npm / …)? You might be able to automate the packaging steps, and that might be pretty cheap.

Are you building an application? Perhaps you can automate the process of building the installer / container / whatever shape it takes. If you have multiple build configurations or platforms, it could get very tedious to build them all by hand and there is often a win for automation.

Where there's code signing involved, you'll need to decide whether it makes sense to automate that or leave it as a manual release step. Never put private keys or other code signing secrets directly into source! Some platforms have secrets mechanisms that may be of use, but it pays to be cautious. If your secrets leak, how will you repair the situation?

Closing thoughts

  • Most projects will benefit from a little CI. You don’t need to have unit tests, though they are a good idea.
  • You’re going to have to maintain your CI, so build it for maintainability like you do your software.
  • Apply agile to your CI as you do to your deliverables. Perfect is the enemy of good enough. Build something, get feedback, iterate!
  • CI vendors want to lock you in to their platform. Keep your eyes open.
  • Don’t let CI become an all-consuming monster that prevents you from delivering in the first place!

Andy SmithCheck yo PTRs

Backstory

The other day I was looking through a log file and saw one of BitFolk's IP addresses doing something. I didn't recognise the address so I did a reverse lookup and got 2001-ba8-1f1-f284-0-0-0-2.autov6rev.bitfolk.space — which is a generic setting and not very useful.

It's quick to look this up and fix it of course, but I wondered how many other such addresses I had forgotten to take care of the reverse DNS for.

ptrcheck

In order to answer that question, automatically and in bulk, I wrote ptrcheck.

It was able to tell me that almost all of my domains had at least one reference to something without a suitable PTR record.

$ ptrcheck --server [::1] --zone strugglers.net
192.168.9.10 is pointed to by: intentionally-broken.strugglers.net.
  Missing PTR for 192.168.9.10
1 missing/broken PTR record
$

Though it wasn't all bad news. 😀

$ ptrcheck --server [::1] --zone dogsitter.services -v
Connecting to ::1 port 53 for AXFR of zone dogsitter.services
Zone contains 57 records
Found 3 unique address (A/AAAA) records
2001:ba8:1f1:f113::80 is pointed to by: dogsitter.services., dev.dogsitter.services., www.dogsitter.services.
  Found PTR: www.dogsitter.services.
85.119.84.147 is pointed to by: dogsitter.services., dev.dogsitter.services., tom.dogsitter.services., www.dogsitter.services.
  Found PTR: dogsitter.services.
2001:ba8:1f1:f113::2 is pointed to by: tom.dogsitter.services.
  Found PTR: tom.dogsitter.services.
100.0% good PTRs! Good job!
$

How it works

See the repository for full details, but briefly: ptrcheck does a zone transfer of the zone you specify and keeps track of every address (A / AAAA) record. It then does a PTR query for each unique address record to make sure it

  1. exists
  2. is "acceptable"

You can provide a regular expression for what you deem to be "unacceptable", otherwise any PTR content at all is good enough.
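The same flow as a rough shell sketch (not ptrcheck itself; ZONE and SERVER are placeholders, and the "acceptable" regex check is left out):

ZONE=example.com
SERVER=::1

# AXFR the zone, collect the unique A/AAAA addresses, then check each has a PTR
dig @"$SERVER" "$ZONE" AXFR +onesoa |
    awk '$4 == "A" || $4 == "AAAA" {print $5}' | sort -u |
    while read -r addr; do
        ptr=$(dig +short -x "$addr")
        if [ -z "$ptr" ]; then
            echo "Missing PTR for $addr"
        fi
    done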

Why might a PTR record be "unacceptable"??

I am glad you asked.

A lot of hosting providers generate generic PTR records when the customer doesn't set their own. They're not a lot better than having no PTR at all.

Failure to comply is no longer an option (for me)

The program runs silently (unless you use --verbose) so I was able to make a cron job that runs once a day and complains at me if any of my zones ever refer to a missing or unacceptable PTR ever again!

By the way, I ran it against all BitFolk customer zones; 26.5% of them had at least one missing or generic PTR record.

BitFolk WikiMonitoring

Setup: NRPE example config

← Older revision Revision as of 11:17, 19 November 2024
Line 21: Line 21:


These sorts of checks can work without an agent (i.e. without anything installed on your VPS). More complicated checks such as disk space, load or anything else that you can check with a script will need some sort of agent such as an [https://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE--2D-Nagios-Remote-Plugin-Executor/details NRPE] daemon or [[Wikipedia:SNMP|SNMP]] daemon.
These sorts of checks can work without an agent (i.e. without anything installed on your VPS). More complicated checks such as disk space, load or anything else that you can check with a script will need some sort of agent such as an [https://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE--2D-Nagios-Remote-Plugin-Executor/details NRPE] daemon or [[Wikipedia:SNMP|SNMP]] daemon.
===NRPE===
NRPE is a typical agent you would run that would allow BitFolk's monitoring system to execute health checks on your VPS. On Debian/Ubuntu systems it can be installed from the package '''nagios-nrpe-server'''. This will normally pull in the package '''monitoring-plugins-basic''' which contains the check plugins.
Check plugins end up in the '''/usr/lib/nagios/plugins/''' directory. NRPE can run any of these when asked and feed the info back to BitFolk's Icinga. All of the existing ones should support a <code>--help</code> argument to let you know how to use them, e.g.
<syntaxhighlight lang="text">
$ /usr/lib/nagios/plugins/check_tcp --help
</syntaxhighlight>
You can run check plugins from the command line:
<syntaxhighlight lang="text">
$ /usr/lib/nagios/plugins/check_tcp -H 85.119.82.70 -p 443
TCP OK - 0.000 second response time on 85.119.82.70 port 443|time=0.000322s;;;0.000000;10.000000
</syntaxhighlight>
There are a large number of Nagios-compatible check plugins in existence so you should be able to find one that does what you want. If there isn't, it's easy to write one. Here's an example of using '''check_disk''' to check the disk space of your root filesystem.
<syntaxhighlight lang="text">
$ /usr/lib/nagios/plugins/check_disk -w '10%' -c '4%' -p /
DISK OK - free space: / 631 MB (11% inode=66%);| /=5035MB;5381;5739;0;5979
</syntaxhighlight>
Once you have that working, you put it in an NRPE config file such as '''/etc/nagios/nrpe.d/xvda1.cfg'''.
<syntaxhighlight lang="text">
command[check_xvda1]=/usr/lib/nagios/plugins/check_disk -w '10%' -c '4%' -p /
</syntaxhighlight>
You should then tell BitFolk (in a support ticket) what the name of it is ("'''check_xvda1'''"). It will then get added to BitFolk's Icinga.
By this means you can check anything you can script.


==Alerts==
==Alerts==

BitFolk WikiUser:Equinox/WireGuard

← Older revision Revision as of 20:33, 13 November 2024
Line 1: Line 1:


'''STOP !!!!!!!!'''


'''This is NOT READY ! I guarantee it won't work yet. (Mainly the routing, also table=off needs research, I can't remember exactly what it does).'''
'''STOP !!!!!'''


'''Is is untried / untested. It's a first draft fished out partly from my running system and partly my notes.'''
'''This is a first draft! If you're a hardy network type who can recover from errors / omissions in this page then go for it (and fix this page!)'''
 
'''HOWEVER...''' If you are a hardy network type and you want a go ... Have at it.




Line 32: Line 29:
PrivateKey = # Insert the contents of the file server-private-key generated above
PrivateKey = # Insert the contents of the file server-private-key generated above
ListenPort = # Pick an empty UDP port to listen on. Remember to open it in your firewall
ListenPort = # Pick an empty UDP port to listen on. Remember to open it in your firewall
Address = 10.254.1.254/24
Address = 10.254.1.254/32
Address = 2a0a:1100:1018:1::fe/64
Address = 2a0a:1100:1018:1::fe/128
Table = off
</syntaxhighlight>
</syntaxhighlight>


Line 53: Line 49:
Address = 10.254.1.1/24
Address = 10.254.1.1/24
Address = 2a0a:1100:1018:1::1/64
Address = 2a0a:1100:1018:1::1/64
Table = off
</syntaxhighlight>
</syntaxhighlight>


Line 60: Line 55:
=== Add the Client Information to the Server ===
=== Add the Client Information to the Server ===


Append the following to the server /etc/wireguard/wg0.conf, inserting your generated information where appropriate:
Append the following to the '''server''' /etc/wireguard/wg0.conf, inserting your generated information where appropriate:


<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
Line 73: Line 68:
=== Add the Server Information to the Client ===
=== Add the Server Information to the Client ===


Append the following to the server /etc/wireguard/wg0.conf, inserting your generated information where appropriate:
Append the following to the '''client''' /etc/wireguard/wg0.conf, inserting your generated information where appropriate:


<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
Line 93: Line 88:
# systemctl start wg-quick@wg0
# systemctl start wg-quick@wg0
# systemctl enable wg-quick@wg0      # Optional - start VPN at startup
# systemctl enable wg-quick@wg0      # Optional - start VPN at startup
</syntaxhighlight>
If all went well you should now have a working tunnel. Confirm by running:
<syntaxhighlight lang="text">
# wg
</syntaxhighlight>
If both sides have a reasonable looking "latest handshake" line then the tunnel is up.
The wg-quick scripts automatically set up routes / default routes based on the contents of the wg0.conf files, so at this point you can test the link by pinging addresses from either side.
Two further things may/will need to be configured to allow full routing...
==== Enable IP Forwarding ====
Edit /etc/sysctl.conf, or a local conf file in /etc/sysctl.d/ and enable IPv4 and/or IPv6 forwarding
<syntaxhighlight lang="text">
net.ipv4.ip_forward=1
net.ipv6.conf.all.forwarding=1
</syntaxhighlight>
Reload the kernel variables:
<syntaxhighlight lang="text">
systemctl reload procps
</syntaxhighlight>
==== WireGuard Max MTU Size ====
If some websites don't work properly over IPv6 (Netflix) you may be running into MTU size problems. If using nftables, this can be entirely fixed with the following line in the forward table:
<syntaxhighlight lang="text">
oifname "wg0" tcp flags syn tcp option maxseg size set rt mtu
</syntaxhighlight>
==== Enable Forwarding In Your Firewall ====
You will have to figure out how to do this for your flavour of firewall. If you happen to be using nftables, the following snippet is an example (by no means a full config!) of how to forward IPv6 back and forth. This snippet allows all outbound traffic but throws incoming traffic to the table "ipv6-incoming-firewall" for further filtering. (For testing you could just "accept" but don't leave it like that!)
<syntaxhighlight lang="text">
chain ip6-forwarding {
    type filter hook forward priority 0; policy drop;
    oifname "wg0" tcp flags syn tcp option maxseg size set rt mtu
    ip6 saddr 2a0a:1100:1018:1::/64 accept
    ip6 daddr 2a0a:1100:1018:1::/64 jump ipv6-incoming-firewall
}
</syntaxhighlight>
</syntaxhighlight>


Jon SpriggsTalk Summary – OggCamp ’24 – Kubernetes, A Guide for Docker users

Format: Theatre Style room. ~30 attendees.

Slides: Available to view (Firefox/Chrome recommended – press “S” to see the required speaker notes)

Video: Not recorded. I’ll try to record it later, if I get a chance.

Slot: Graphine 1, 13:30-14:00

Notes: Apologies for the delay on posting this summary. The talk was delivered to a very busy room. Lots of amazing questions. The presenter notes were extensive, but entirely unused when delivered. One person asked a question, I said I’d follow up with them later, but didn’t find them before the end of the conference. One person asked about the benefits of EKS over ECS in AWS… as I’ve not used ECS, I couldn’t answer, but it sounds like they largely do the same thing.

Ross YoungerAnnouncing qcp

The QUIC Copier (qcp) is an experimental high-performance remote file copy utility for long-distance internet connections.

Source repository: https://github.com/crazyscot/qcp

📋 Features

  • 🔧 Drop-in replacement for scp
  • 🛡️ Similar security to scp, using existing, well-known mechanisms
  • 🚀 Better throughput on congested networks

📖 About qcp

qcp is a hybrid protocol combining ssh and QUIC.

We use ssh to establish a control channel to the target machine, then spin up the QUIC protocol to transfer data.

This has the following useful properties:

  • User authentication is handled entirely by ssh
  • Data is transmitted over UDP, avoiding known issues with TCP over “long, fat pipe” connections
  • Data in transit is protected by TLS using ephemeral keys
  • The security mechanisms all use existing, well-known cryptographic algorithms

For full documentation refer to qcp on docs.rs.

Motivation

I needed to copy multiple large (3+ GB) files from a server in Europe to my home in New Zealand.

I’ve got nothing against ssh or scp. They’re brilliant. I’ve been using them since the 1990s. However they run on top of TCP, which does not perform very well when the network is congested. With a fast fibre internet connection, a long round-trip time and noticeable packet loss, I was right in the sour spot. TCP did its thing and slowed down, but when the congestion cleared it was very slow to get back up to speed.

If you’ve ever been frustrated by download performance from distant websites, you might have been experiencing this same issue. Friends with satellite (pre-Starlink) internet connections seem to be particularly badly affected.

💻 Getting qcp

The project is a Rust binary crate.

You can install it:

  • as a Debian package or pre-compiled binary from the latest qcp release page (N.B. the Linux builds are static musl binaries);
  • with cargo install qcp (you will need to have a rust toolchain and capnpc installed);
  • by cloning and building the source repository.

You will need to install qcp on both machines. Please refer to the README for more.

See also

Andy SmithProtecting URIs from Tor nodes with the Apache HTTP Server

Recently I found one of my web services under attack from clients using Tor.

For the most part I am okay with the existence of Tor, but if you're being attacked largely or exclusively through Tor then you might need to take actions like:

  • Temporarily or permanently blocking access entirely.
  • Taking away access to certain privileged functions.

Here's how I did it.

Step 1: Obtain a list of exit nodes

Tor exit nodes are the last hop before reaching regular Internet services, so traffic coming through Tor will always have a source IP of an exit node.

Happily there are quite a few services that list Tor nodes. I like https://www.dan.me.uk/tornodes which can provide a list of exit nodes, updated hourly.

This comes as a list of IP addresses one per line so in order to turn it into an httpd access control list:

$ curl -s 'https://www.dan.me.uk/torlist/?exit' |
    sed 's/^/Require not ip /' |
    sudo tee /etc/apache2/tor-exit-list.conf >/dev/null

This results in a file like:

$ head -10 /etc/apache2/tor-exit-list.conf
Require not ip 102.130.113.9
Require not ip 102.130.117.167
Require not ip 102.130.127.117
Require not ip 103.109.101.105
Require not ip 103.126.161.54
Require not ip 103.163.218.11
Require not ip 103.164.54.199
Require not ip 103.196.37.111
Require not ip 103.208.86.5
Require not ip 103.229.54.107

Step 2: Configure httpd to block them

Totally blocking traffic from these IPs would be easier than what I decided to do. If you just wanted to totally block traffic from Tor then the easy and efficient answer would be to insert all these IPs into an nftables set or an iptables IP set.
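For completeness, a sketch of what that blunt approach might look like with an nftables set (table, chain and set names are made up; this is not what I did):

nft add table inet filter
nft add chain inet filter input '{ type filter hook input priority 0; policy accept; }'
nft add set inet filter tor_exits '{ type ipv4_addr; }'
nft add rule inet filter input ip saddr @tor_exits drop
# ...then feed the downloaded (IPv4) list into the set, e.g.:
# nft add element inet filter tor_exits "{ $(curl -s 'https://www.dan.me.uk/torlist/?exit' | grep -v : | paste -sd, -) }"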

For me, it's only some URIs on my web service that I don't want these IPs accessing and I wanted to preserve the ability of Tor's non-abusive users to otherwise use the rest of the service. An httpd access control configuration is necessary.

Inside the virtualhost configuration file I added:

    <Location /some/sensitive/thing>
        <RequireAll>
            Require all granted
            Include /etc/apache2/tor-exit-list.conf
        </RequireAll>
    </Location>

Step 3: Test configuration and reload

It's a good idea to check the correctness of the httpd configuration now. Aside from syntax errors in the list of IP addresses, this might catch if you forgot any modules necessary for these directives. Although I think they are all pretty core.

Assuming all is well then a graceful reload will be needed to make httpd see the new configuration.

$ sudo apache2ctl configtest
Syntax OK
$ sudo apache2ctl graceful

Step 4: Further improvements

Things can't be left there, but I haven't got around to any of this yet (a rough sketch of the first three items follows the list).

  1. Script the repeated download of the Tor exit node list. The list of active Tor nodes will change over time.
  2. Develop some checks on the list such as:
    1. Does it contain only valid IP addresses?
    2. Does it contain at least min number of addresses and less than max number?
  3. If the list changed, do the config test and reload again. httpd will not include the altered config file without a reload.
  4. If the list has not changed in x number of days, consider the data source stale and think about emptying the list.
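A rough sketch of the first three items (run from root's crontab; the minimum/maximum list sizes are arbitrary examples):

#!/usr/bin/env bash
set -euo pipefail

LIST=/etc/apache2/tor-exit-list.conf
NEW=$(mktemp)
trap 'rm -f "$NEW"' EXIT

curl -fsS 'https://www.dan.me.uk/torlist/?exit' |
    grep -E '^[0-9a-fA-F:.]+$' |            # crude "does this look like an IP?" check
    sed 's/^/Require not ip /' > "$NEW"

COUNT=$(wc -l < "$NEW")
if [ "$COUNT" -lt 500 ] || [ "$COUNT" -gt 10000 ]; then
    echo "Suspicious list size ($COUNT), refusing to install it" >&2
    exit 1
fi

# Only touch httpd if the list actually changed
if ! cmp -s "$NEW" "$LIST"; then
    cp "$LIST" "$LIST.bak"
    cp "$NEW" "$LIST"
    if apache2ctl configtest; then
        apache2ctl graceful
    else
        cp "$LIST.bak" "$LIST"              # roll back a broken list
        exit 1
    fi
fi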

Performance thoughts

I have not checked how much this impacts performance. My service is not under enough load for this to be noticeable for me.

At the moment the Tor exit node list is around 2,100 addresses and I don't know how efficient the Apache HTTP Server is about a large list of Require not ip directives. Worst case is that for every request to that URI it will be scanning sequentially through to the end of the list.

I think that using httpd's support for DBM files in RewriteMaps might be quite efficient but this comes with the significant issue that IPv6 addresses have multiple formats, while a DBM lookup will be doing a literal text comparison.

For example, all of the following represent the same IPv6 address:

  • 2001:db8::
  • 2001:0DB8::
  • 2001:Db8:0000:0000:0000:0000:0000:0000
  • 2001:db8:0:0:0:0:0:0

httpd does have built-in functions to upper- or lower-case things, but not to compress or expand an IPv6 address. httpd access control directives are also able to match the request IP against a CIDR net block, although at the moment Dan's Tor node list does only contain individual IP addresses. At a later date one might like to try to aggregate those individual IP addresses into larger blocks.

httpd's RewriteMaps can also query an SQL server. Querying a competent database implementation like PostgreSQL could be made to alleviate some of those concerns if the data were represented properly, though this does start to seem like an awful lot of work just for an access control list!

Over on Fedi, it was suggested that a firewall rule — presumably using an nftables set or iptables IP set, which are very efficient — could redirect matching source IPs to a separate web server on a different port, which would then do the URI matching as necessary.

<nerdsnipe>There does not seem to be an Apache HTTP Server authz module for IP sets. That would be the best of both worlds!</nerdsnipe>

BitFolk WikiIPv6/VPNs

Using WireGuard

← Older revision Revision as of 16:31, 31 October 2024
Line 19: Line 19:
== Using WireGuard ==
== Using WireGuard ==
Probably the more sensible choice in the 2020s, but, help?
Probably the more sensible choice in the 2020s, but, help?
[[User:Equinox/WireGuard|Not ready yet, but it's a start]]


== Using tincd ==
== Using tincd ==

Chris WallaceMoving from exist-db 3.0.1 to 6.0.1 6.2.0

Moving from exist-db 3.0.1 to 6.0.1 6.2.0. That’s an awful lot of release notes to read through...

David LeadbeaterRestrict sftp with Linux user namespaces

A script to restrict SFTP to some directories, without needing chroot or other privileged configuration.

Andy SmithGenerating a link-local address from a MAC address in Perl

Example

On the host

$ ip address show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether aa:00:00:4b:a0:c1 brd ff:ff:ff:ff:ff:ff
[…]
    inet6 fe80::a800:ff:fe4b:a0c1/64 scope link 
       valid_lft forever preferred_lft forever

Generated by script

$ lladdr.pl aa:00:00:4b:a0:c1
fe80::a800:ff:fe4b:a0c1

Code

#!/usr/bin/env perl

use warnings;
use strict;
use 5.010;

if (not defined $ARGV[0]) {
    die "Usage: $0 MAC-ADDRESS"
}

my $mac = $ARGV[0];

if ($mac !~ /^
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}:
    \p{PosixXDigit}{2}
    /ix) {
    die "'$mac' doesn't look like a MAC address";
}

my @octets = split(/:/, $mac);

# Algorithm:
# 1. Prepend 'fe80::' for the first 64 bits of the IPv6
# 2. Next 16 bits: Use first octet with 7th bit flipped, and second octet
#    appended
# 3. Next 16 bits: Use third octet with 'ff' appended
# 4. Next 16 bits: Use 'fe' with fourth octet appended
# 5. Next 16 bits: Use 5th octet with 6th octet appended
# = 128 bits.
printf "fe80::%x%02x:%x:%x:%x\n",
    hex($octets[0]) ^ 2,
    hex($octets[1]),
    hex($octets[2] . 'ff'),
    hex('fe' . $octets[3]),
    hex($octets[4] . $octets[5]);

See also

Alan PopeWhere are Podcast Listener Communities

Parasocial chat

On Linux Matters we have a friendly and active, public Telegram channel linked on our Contact page, along with a Discord Channel. We also have links to Mastodon, Twitter (not that we use it that much) and email.

At the time of writing there are roughly this ⬇️ number of people (plus bots, sockpuppets and duplicates) in or following each Linux Matters “official” presence:

Channel Number
Telegram 796
Discord 683
Mastodon 858
Twitter 9919

Preponderance of chat

We chose to have a presence in lots of places, but primarily the presenters (Martin, Mark, and myself (and Joe)) only really hang out to chat on Telegram and Mastodon.

I originally created the Telegram channel on November 20th, 2015, when we were publishing the Ubuntu Podcast (RIP in Peace) A.K.A. Ubuntu UK Podcast. We co-opted and renamed the channel when Linux Matters launched in 2023.

Prior to the channel’s existence, we used the Ubuntu UK Local Community (LoCo) Team IRC channel on Freenode (also, RIP in Peace).

We also re-branded our existing Mastodon accounts from the old Ubuntu Podcast to Linux Matters.

We mostly continue using Telegram and Mastodon as our primary methods of communication because on the whole they’re fast, reliable, stay synced across devices, have the features we enjoy, and at least one of them isn’t run by a weird billionaire.

Other options

We link to a lot of other places at the top of the Linux Matters home page, where our listeners can chat, mostly to each other and not us.

Being over 16, I’m not a big fan of Discord, and I know Mark doesn’t even have an account there. None of us use Twitter much anymore, either.

Periodically I ponder if we (Linux Matters) should use something other than Telegram. I know some listeners really don’t like the platform, but prefer other places like Signal, Matrix or even IRC. I know for sure some non-listeners don’t like Telegram, but I care less about their opinions.

Part of the problem is that I don’t think any of us really enjoy the other realtime chat alternatives. Both Matrix and Signal have terrible user experience, and other flaws. Which is why you don’t tend to find us hanging out in either of those places.

There are further options I haven’t even considered, like Wire, WhatsApp, and likely more I don’t even know or care about.

So we kept using Telegram over any of the above alternative options.

Pondering Posting Polls

I have repeatedly considered asking the listeners about their preferred chat platforms via our existing channels. But that seems flawed, because we use what we like, and no matter how many people prefer something else, we’re unlikely to move. Unless something strange happens 👀 .

Plus, often times, especially on decentralised platforms, the audience can be somewhat “over-enthusiastic” about their preferred way being The Way™️ over the alternatives. It won’t do us any favours to get data saying 40% report we should use Signal, 40% suggest Matrix and 20% choose XMPP, if the four of us won’t use any of them.

Pursue Podcast Palaver Proposals

So rather than ask our audience, I thought I’d see what other podcasters promote for feedback and chatter on their websites.

I picked a random set from shows I have heard of, and may have listened to, plus a few extra ones I haven’t. None of this is endorsement or approval, I wanted the facts, just the fax, ma’am.

I collated the data in a json file for some reason, then generated the tables below. I don’t know what to do with this information, but it’s a bit of data we may use if we ever decide to move away from Telegram.
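
Generating the tables from a JSON file needs nothing fancy. Here is a minimal sketch of the idea; the podcasts.json filename and its layout are made up for illustration, not the actual file I used:

import json

# Hypothetical layout: {"Linux Matters": ["EM", "MA", "TW", "DS", "TG", "MX"], ...}
COLUMNS = ["EM", "MA", "TW", "DS", "TG", "IR", "DW", "SK", "MX", "LI", "WF", "SG", "WA", "FB"]

with open("podcasts.json") as f:
    shows = json.load(f)

# Print one row of check marks per show, in the same column order as the tables.
print("Show " + " ".join(COLUMNS))
for show, platforms in shows.items():
    row = ["✅" if col in platforms else "  " for col in COLUMNS]
    print(show + " " + " ".join(row))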

Presenting Pint-Sized Payoff

The table shows some nerdy podcasts along with their primary means (as far as I can tell) of community engagement. Data was gathered manually from podcast home pages and “about” pages. I generally didn’t go into the page content for each episode. I made an exception for “Dot Social” and “Linux OTC” because there’s nothing but episodes on their home page.

It doesn’t matter for this research, I just thought it was interesting that some podcasters don’t feel the need to break out their contact details to a separate page, or make it more obvious. Perhaps they feel that listeners are likely to be viewing an episode page, or looking at a specific show metadata, so it’s better putting the contact details there.

I haven’t included YouTube, where many shows publish and discuss, in addition to a podcast feed.

I am also aware that some people exclusively, or perhaps primarily publish on YouTube (or other video platforms). Those aren’t podcasts IMNSHO.

Key to the tables below. Column names have been shortened because it’s a w i d e table. The numbers indicate how many podcasts use that communication platform.

  • EM - Email address (13/18)
  • MA - Mastodon account (9/18)
  • TW - Twitter account (8/18)
  • DS - Discord server (8/18)
  • TG - Telegram channel (4/18)
  • IR - IRC channel (5/18)
  • DW - Discourse website (2/18)
  • SK - Slack channel (3/18)
  • LI - LinkedIn (2/18)
  • WF - Web form (2/18)
  • SG - Signal group (3/18)
  • WA - WhatsApp (1/18)
  • FB - FaceBook (1/18)

Linux

Show EM MA TW DS TG IR DW SK MX LI WF SG WA FB
Linux Matters ✅ ✅ ✅ ✅ ✅ ✅
Ask The Hosts ✅ ✅ ✅ ✅ ✅
Destination Linux ✅ ✅ ✅ ✅ ✅
Linux Dev Time ✅ ✅ ✅ ✅ ✅
Linux After Dark ✅ ✅ ✅ ✅ ✅
Linux Unplugged ✅ ✅ ✅ ✅
This Week in Linux ✅ ✅ ✅ ✅ ✅
Ubuntu Security Podcast ✅ ✅ ✅ ✅ ✅
Linux OTC ✅ ✅ ✅

Open Source Adjunct

Show EM MA TW DS TG IR DW SK MX LI WF SG WA FB
2.5 Admins ✅ ✅
Bad Voltage ✅ ✅ ✅ ✅
Coffee and Open Source ✅
Dot Social ✅ ✅
Open Source Security ✅ ✅ ✅
localfirst.fm ✅

Other Tech

Show EM MA TW DS TG IR DW SK MX LI WF SG WA FB
ATP ✅ ✅ ✅ ✅
BBC Newscast ✅ ✅ ✅
The Rest is Entertainment ✅

Point

Not entirely sure what to do with this data. But there it is.

Is Linux Matters going to move away from Telegram to something else? No idea.

Alun JonesMessing with web spiders

Yesterday I read a Mastodon posting. Someone had noticed that their web site was getting huge amounts of traffic. When they looked into it, they discovered that it was OpenAI's - about 422 words

Alan PopeWindows 3.11 on QEMU 5.2.0

This is mostly an informational PSA for anyone struggling to get Windows 3.11 working in modern versions of QEMU. Yeah, I know, not exactly a massively viral target audience.

Anyway, short answer, use QEMU 5.2.0 from December 2020 to run Windows 3.11 from November 1993.

Windows 3.11, at 1280x1024, running Internet Explorer 5, looking at a GitHub issue

An innocent beginning

I made a harmless jokey reply to a toot from Thom at OSNews, lamenting the lack of native Mastodon client for Windows 3.11.

When I saw Thom’s toot, I couldn’t resist, and booted a Windows 3.11 VM that I’d installed six weeks ago, manually from floppy disk images of MSDOS and Windows.

I already had Lotus Organiser installed to post a little bit of nostalgia-farming on threads - it’s what they do over there.

Post by @popey

I thought it might be fun to post a jokey diary entry. I hurriedly made my silly post five minutes after Thom’s toot, expecting not to think about this again.

Incorrect, brain

I shut the VM down, then went to get coffee, chuckling to my smart, smug self about my successful nerdy rapid-response. While the kettle boiled, I started pondering - “Wait, if I really did want to make a Mastodon client for Windows 3.11, how would I do it?”

I pondered and dismissed numerous shortcuts, including, but not limited to:

  • Fake it with screenshots doctored in MS Paint
  • Run an existing DOS Mastodon Client in a Window
  • Use the Windows Telnet client to connect insecurely to my laptop running the Linux command-line Mastodon client, Toot
  • Set up a proxy through which I could get to a Mastodon web page

I pondered a different way, in which I’d build a very simple proof of concept native Windows client, and leverage the Mastodon API. I’m not proficient in (m)any programming languages, but felt something like Turbo Pascal was time-appropriate and roughly within my capabilities.

Diversion

My mind settled on Borland Delphi, which I’d never used, but looked similar enough for a silly project to Borland Turbo Pascal 7.0 for DOS, which I had. So I set about installing Borland Delphi 1.0 from fifteen (virtual) floppy disks, onto my Windows 3.11 “Workstation” VM.

Windows 3.11, with a Borland Delphi window open

Thank you, whoever added the change floppy0 option to the QEMU Monitor. That saved a lot of time, and reduced the process down to repeating this loop fourteen times:

"Please insert disk 2"
CTRL+ALT+2
(qemu) change floppy0 Disk02.img
CTRL+ALT+1
[ENTER]

During my research for this blog, I found a delightful, nearly decade-old video of David Intersimone (“David I”) running Borland Delphi 1 on Windows 3.11. David makes it all look so easy. Watch this to get a moving-pictures-with-sound idea of what I was looking at in my VM.

Once Delphi was installed, I started pondering the network design. But that thought wasn’t resident in my head for long, because it was immediately replaced with the reason why I didn’t use that Windows 3.11 VM much beyond the original base install.

The networking stack doesn’t work. Or at least, it didn’t.

That could be a problem.

Retro spelunking

I originally installed the VM by following this guide, which is notable as having additional flourishes like mouse, sound, and SVGA support, as well as TCP/IP networking. Unfortunately I couldn’t initially get the network stack working as Windows 3.11 would hang on a black screen after the familiar OS splash image.

Looking back to my silly joke, those 16-bit Windows-based Mastodon dreams quickly turned to dust when I realised I wouldn’t get far without an IP address in the VM.

Hopes raised

After some digging in the depths of retro forums, I stumbled on a four-year-old repo maintained by Jaap Joris Vens.

Here’s a fully configured Windows 3.11 machine with a working internet connection and a load of software, games, and of course Microsoft BOB 🤓

Jaap Joris published this ready-to-go Windows 3.11 hard disk image for QEMU, chock full of games, utilities, and drivers. I thought that perhaps their image was configured differently, and thus worked.

However, after downloading it, I got the same “black screen after splash” as with my image. Other retro enthusiasts had the same issue, and reported the details on this issue, about a year ago.

does not work, black screen.

It works for me and many others. Have you followed the instructions? At which point do you see the black screen?

The key to finding the solution was a comment from Jaap Joris pointing out that the disk image “hasn’t changed since it was first committed 3 years ago”, implying it must have worked back then, but doesn’t now.

Joy of Open Source

I figured that if the original uploader had at least some success when the image was created and uploaded, it was likely that QEMU, or some other component it uses, had broken (or been broken) in the meantime.

So I went rummaging in the source archives, looking for the most recent release of QEMU, immediately prior to the upload. QEMU 5.2.0 looked like a good candidate, dated 8th December 2020, a solid month before 18th January 2021 when the hda.img file was uploaded.

If you build it, they will run

It didn’t take long to compile QEMU 5.2.0 on my ThinkPad Z13 running Ubuntu 24.04.1. It went something like this. I presumed that getting the build dependencies for whatever is the current QEMU version in the Ubuntu repo today would get me most of the requirements.

$ sudo apt-get build-dep qemu
$ mkdir qemu
$ cd qemu
$ wget https://download.qemu.org/qemu-5.2.0.tar.xz
$ tar xvf qemu-5.2.0.tar.xz
$ cd qemu-5.2.0
$ ./configure
$ make -j$(nproc)

That was pretty much it. The build ran for a while, and out popped binaries and the other stuff you need to emulate an old OS. I copied the bits required directly to where I already had put Jaap Joris’ hda.img and start script.

$ cd build
$ cp qemu-system-i386 efi-rtl8139.rom efi-e1000.rom efi-ne2k_pci.rom kvmvapic.bin vgabios-cirrus.bin vgabios-stdvga.bin vgabios-vmware.bin bios-256k.bin ~/VMs/windows-3.1/

I then tweaked the start script to launch the local home-compiled qemu-system-i386 binary, rather than the one in the path, supplied by the distro:

$ cat start
#!/bin/bash
./qemu-system-i386 -nic user,ipv6=off,model=ne2k_pci -drive format=raw,file=hda.img -vga cirrus -device sb16 -display gtk,zoom-to-fit=on

This worked a treat. You can probably make out in the screenshot below, that I’m using Internet Explorer 5 to visit the GitHub issue which kinda renders when proxied via FrogFind by Action Retro.

Windows 3.11, at 1280x1024, running Internet Explorer 5, looking at a GitHub issue

Share…

I briefly toyed with the idea of building a deb of this version of QEMU for a few modern Ubuntu releases and throwing that in a Launchpad PPA, then realised I’d need to make sure the name doesn’t collide with the packaged QEMU in Ubuntu.

I honestly couldn’t be bothered to go through the pain of effectively renaming (forking) QEMU to something like OLDQEMU so as not to damage existing installs. I’m sure someone could do it if they tried, but I suspect it’s quite a lot of search and replace, or a case of moving the binaries somewhere under /opt. Too much effort for my brain.

I then started building a snap of qemu as oldqemu - which wouldn’t require any “real” forking or renaming. The snap could be called oldqemu but still contain qemu-system-i386 which wouldn’t clash with any existing binaries of the same name as they’d be self-contained inside the compressed snap, and would be launched as oldqemu.qemu-system-i386.

That would make for one package to maintain rather than one per release of Ubuntu. (Which is, as I am sure everyone is aware, one of the primary advantages of making snaps instead of debs in the first place.)

Anyway, I got stuck with another technical challenge in the time I allowed myself to make the oldqemu snap. I might re-visit it, especially as I could leverage the Launchpad Build farm to make multiple architecture builds for me to share.

…or not

In the meantime, the instructions are above, and also (roughly) in the comment I left on the issue, which has kindly been re-opened.

Now, about that Windows 3.11 Mastodon client…

Alan PopeVirtual Zane Lowe for Spotify

tl;dr

I bodged together a Python script using Spotipy (not a typo) to feed me #NewMusicDaily in a Spotify playlist.

No AI/ML, all automated, “fresh” tunes every day. Tunes that I enjoy get preserved in a Keepers playlist; those I don’t like get relegated to the Sleepers playlist.

Any tracks older than eleven days are deleted from the main playlist, so I automatically get a constant flow of new stuff.

My personal Zane Lowe in a box

Nutshell

  1. The script automatically populates this Virtual Zane Lowe playlist with semi-randomly selected songs that were released within the last week or so, no older (or newer).
  2. I listen (exclusively?) to that list for a month, signaling songs I like by hitting a button on Spotify.
  3. Every day, the script checks for ‘expired’ songs whose release date has passed by more than 11 days.
  4. The script moves songs I don’t like to the Sleepers playlist for archival (and later analysis), and to stop me hearing them.
  5. It moves songs I do like to the Keepers playlist, so I don’t lose them (and later analysis).
  6. Goto 1.

I can run the script at any time to “top up” the playlist or just let it run regularly to drip-feed me new music, a few tracks at a time.

Clearly, once I have stashed some favourites away in the Keepers pile, I can further investigate those artists, listen to their other tracks, and potentially discover more new music.

Below I explain at some length how and why.

NoCastAuGast

I spent an entire month without listening to a single podcast episode in August. I even unsubscribed from everything and deleted all the cached episodes.

Aside: Fun fact: The Apple Podcasts app really doesn’t like being empty and just keeps offering podcasts it knows I once listened to despite unsubscribing. Maybe I’ll get back into listening to these shows again, but music is on my mind for now.

While this is far from a staggering feat of human endeavour in the face of adversity, it was a challenge for me, given that I listened to podcasts all the time. This has been detailed in various issues of my personal email newsletter, which goes out on Fridays and is archived to read online or via RSS.

In August, instead, I re-listened to some audio books I previously enjoyed and re-listened to a lot of music already present on my existing Spotify playlists. This became a problem because I got bored with the playlists. Spotify has an algorithm that can feed me their idea of what I might want, but I decided to eschew their bot and make my own.

Note: I pay for Spotify Premium, then leveraged their API and built my “application” against that platform. I appreciate some people have Strong Opinions™️ about Spotify. I have no plans to stop using Spotify anytime soon. Feel free to use whatever music service you prefer, or self-host your 64-bit, 192 kHz Hi-Res Audio from HDTracks through an Elipson P1 Pre-Amp & DAC and Cary Audio Valve MonoBlok Power Amp in your listening room. I don’t care.

I’ll be here, listening on my Apple AirPods, or blowing the cones out of my car stereo. Anyway…

I spent the month listening to great (IMHO) music, predominantly released in the (distant) past on playlists I chronically mis-manage. On the other hand, my son is an expert playlist curator, a skill he didn’t inherit from me. I suspect he “gets the aux” while driving with friends, partly due to his Spotify playlist mastery.

As I’m not a playlist charmer, I inevitably got bored of the same old music during August, so I decided it was time for a change. During the month of September, my goal is to listen to as much new (to me) music as I can and eschew the crusty playlists of 1990s Brit-pop and late-70s disco.

How does one discover new music though?

Novel solutions

I wrote a Python script.

Hear me out. Back in the day, there was an excellent desktop music player for Linux called Banshee. One of the great features Banshee users loved was “Smart Playlists.” This gave users a lot of control over how a playlist was generated. There was no AI, no cloud, just simple signals from the way you listen to music that could feed into the playlist.

Watch a youthful Jorge Castro from 13 years ago do a quick demo.

Jorge Demonstrating the awesome power of Smart Playlists in Banshee (RIP in Peace)

Aside: Banshee was great, as were many other Mono applications like Tomboy and F-Spot. It’s a shame a bunch of blinkered, paranoid, noisy, and wrong Linux weirdos chased the developers away, effectively killing off those excellent applications. Good job, Linux community.

Hey ho. Moving on. Where was I…

Spotify clearly has some built-in, cloud-based “smarts” to create playlists, recommendations, and queues of songs that its engineers and algorithm think I might like. There’s a fly in the ointment, though, and her name is Alexa.

No, Alexa, NO!

We have a “Smart” speaker in the kitchen; the primary music consumers are not me. So “my” listening history is now somewhat tainted by all the Chase Atlantic & Central Cee my son listens to, and the Michael (fucking) Bublé my wife enjoys. She enjoys it so much that Bublé has featured on my end-of-year “Spotify Unwrapped” multiple times.

I’m sure he’s a delightful chap, but his stuff differs from my taste.

I had some ideas to work around all this nonsense. My goals here are two-fold.

  1. I want to find and enjoy some new music in my life, untainted by other house members.
  2. Feed the Spotify algorithm with new (to me) artists, genres and songs, so it can learn what else I may enjoy listening to.

Obviously, I also need to do something to muzzle the Amazon glossy screen of shopping recommendations and stupid questions.

The bonus side-quest is learning a bit more Python, which I completed. I spent a few hours one evening on this project. It was a fun and educational bit of hacking during time I might otherwise use for podcast listening. The result is four hundred or so lines of Python, including comments. My code, like my blog, tends to be a little verbose because I’m not an expert Python developer.

I’m pretty positive primarily professional programmers potentially produce petite Python.

Not me!

Noodling

My script uses the Spotify API via Spotipy to manage an initially empty, new, “dynamic” playlist. In a nutshell, here’s what the Python script does with the empty playlist over time (there’s a simplified sketch of this just after the list):

  • Use the Spotify search API to find tracks and albums released within the last eleven days to add to the playlist. I also imposed some simple criteria and filters.
    • Tracks must be accessible to me on a paid Spotify account in Great Britain.
    • The maximum number of tracks on the playlist is currently ninety-four, so there’s some variety, but not too much as to be unwieldy. Enough for me to skip some tracks I don’t like, but still have new things to listen to.
    • The maximum tracks per artist or album permitted on the playlist is three, again, for variety. Initially this was one, but I felt it’s hard to fully judge the appeal of an artist or album based on one song (not you, Black Lace), and I don’t want entire albums on the list. Three is a good middle-ground.
    • The maximum number of tracks to add per run is configurable and was initially set at twenty, but I’ll likely reduce that and run the script more frequently for drip-fed freshness.
  • If I use the “favourite” or “like” button on any track in the list before it gets reaped by the script after eleven days, the song gets added to a more permanent keepers playlist. This is so I can quickly build a collection of newer (to me) songs discovered via my script and curated by me with a single button-press.
  • Delete all tracks released more than eleven days ago if I haven’t favourited them. I chose eleven days to keep it modern (in theory) and fresh (foreshadowing). Technically, the script does this step first to make room for additional new songs.
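
For a flavour of what that looks like in code, here is a heavily simplified Spotipy sketch. The playlist ID and limits are placeholders, and the use of the tag:new album search filter is just one way to find recent releases; the real script is longer and fussier:

import datetime
import spotipy
from spotipy.oauth2 import SpotifyOAuth

PLAYLIST_ID = "your_playlist_id"   # placeholder
MAX_AGE_DAYS = 11
MAX_TO_ADD = 20

# SpotifyOAuth picks up SPOTIPY_CLIENT_ID/SECRET/REDIRECT_URI from the environment.
sp = spotipy.Spotify(auth_manager=SpotifyOAuth(scope="playlist-modify-public"))

# tag:new asks the search API for albums released in roughly the last two weeks.
results = sp.search(q="tag:new", type="album", market="GB", limit=MAX_TO_ADD)

cutoff = datetime.date.today() - datetime.timedelta(days=MAX_AGE_DAYS)
to_add = []
for album in results["albums"]["items"]:
    # Only compare full dates; release_date precision can also be year or month.
    if album["release_date_precision"] != "day":
        continue
    if datetime.date.fromisoformat(album["release_date"]) < cutoff:
        continue
    tracks = sp.album_tracks(album["id"], limit=3)["items"]  # max three per album
    to_add.extend(t["uri"] for t in tracks)

if to_add:
    sp.playlist_add_items(PLAYLIST_ID, to_add[:MAX_TO_ADD])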

None of this is set in stone, but it is configurable with variables at the start of the script. I’ll likely be fiddling with these through September until I get it “right,” whatever that means for me. Here’s a handy cut-out-and-keep block diagram in case that helps, but I suspect it won’t.

 +---------------------------------+
 |         Spotify (Cloud)         |
 |   +---------------------+       |
 |   |    Main Playlist    |       |
 |   +---------------------+       |
 |      |                 |        |
 |      | Like            | Dislike|
 |      v                 |        |
 |  +-------------------+ |        |
 |  |  Keeper Playlist  | |        |
 |  +-------------------+ |        |
 |                        |        |
 |                        v        |
 |  +---------------------+        |
 |  |  Sleeper Playlist   |        |
 |  +---------------------+        |
 +----------------+----------------+
                  ^
                  |
                  v
    +---------------------------+
    |       Python Script       |
    |  +---------------------+  |
    |  |  Calls Spotify API  |  |
    |  |  and Manages Songs  |  |
    |  +---------------------+  |
    +---------------------------+

Next track

The expectation is to run this script automatically every day, multiple times a day, or as often as I like, and end up with a frequently changing list of songs to listen to in one handy playlist. If I don’t like a song, I’ll skip it, and when I do like a song, I’ll likely play it more than once, and maybe click the “Like” icon.

My theory is that the list becomes a mix of between thirty and ninety artists who have released albums over the previous rolling week. After the first test search on Tuesday, the playlist contained 22 tracks, which isn’t enough. I scaled the maximum up over the next few days. It’s now at ninety-four. If I exhaust all the music and get bored of repeats, I can always up the limit to get a few new songs.

In fact, on the very first run of the script, the test playlist completely filled with songs from one artist who had just released a new album. That triggered the implementation of the three-songs-per-artist/album rule to reduce the chance of that happening.

I appreciate that listening to tracks out of sequence, rather than as a full album, is different from what the artist intended. But thankfully, I don’t listen to a lot of Adele, and the script no longer adds whole albums full of songs to the list. So, no longer a “me” problem.

No AI

I said at the top I’m not using any “AI/ML” in my script, and while that’s true, I don’t control what goes on inside the Spotify datacentre. The script is entirely subject to the whims of the Spotify API as to which tracks get returned to my requests. There are some constraints to the search API query complexity, and limits on what the API returns.

The Spotify API documentation has been excellent so far, as has the Spotipy docs.

Popular songs and artists often organically feature prominently in the API responses. Plus (I presume) artists and labels have financial incentives or an active marketing campaign with Spotify, further skewing search results. Amusingly, the API has an optional “hipster” tag to show the bottom 10% of results (ranked by popularity). I did that once, didn’t much like it, and won’t do it again.

It’s also subject to the music industry publishing music regularly, and licensing it to be streamed via Spotify where I live.

Not quite

With the script as-is, initially, I did not get fresh new tunes every single day as expected, so I had a further fettle to increase my exposure to new songs beyond what’s popular, trending, or tagged “new”. I changed the script to scan the last year of my listening habits to find genres of music I (and the rest of the family) have listened to a lot.

I trimmed this list down (to remove the genre taint) and then fed these genres to the script. It then randomly picks a selection of those genres and queries the API for new releases in those categories.

With these tweaks, I certainly think this script and the resulting playlist are worth listening to. It’s fresher and more dynamic than the 14-year-old playlist I currently listen to. Overall, the script works so that I now see songs and artists I’ve not listened to—or even heard of—before. Mission (somewhat) accomplished.

Indeed, with the genres feature enabled, I could add a considerable amount of new music to the list, but I am trying to keep it a manageable size, under a hundred tracks. Thankfully, I don’t need to worry about the script pulling “Death Metal,” “Rainy Day,” and “Disney” categories out of thin air because I can control which ones get chosen. Thus, I can coerce the selection while allowing plenty of randomness and newness.

I have limited the number of genre-specific songs so I don’t get overloaded with one music category over others.

Not new

There are a couple of wrinkles. One song that popped into the playlist this week is “Never Going Back Again” by Fleetwood Mac, recorded live at The Forum, Inglewood, in 1982. That’s older than the majority of what I listened to in all of August! It looks like Warner Records Inc. released that live album on 21st August 2024, well within my eleven-day boundary, so it’s technically within “The Rules” while also not being fresh, new music.

There’s also the compilation complication. Unfresh songs from the past re-released on “TOP HITS 2024” or “DANCE 2024 100 Hot Tracks” also appeared in my search criteria. For example, “Talk Talk” by Charli XCX, from her “Brat” album, released in June, is on the “DANCE 2024 100 Hot Tracks” compilation, released on 23rd August 2024, again, well within my eleven-day boundary.

I’m in two minds about these time-travelling playlist interlopers. I have never knowingly listened to Charli XCX’s “Brat” album by choice, nor have I heard live versions of Fleetwood Mac’s music. I enjoy their work, but it goes against the “new music” goal. But it is new to me which is the whole point of this exercise.

The further problem with compilations is that they contain music by a variety of artists, so they don’t hit the “max-per-artist” limit but will hit the “max-per-album” rule. However, if the script finds multiple newly released compilations in one run, I might end up with a clutch of random songs spread over numerous “Various Artists” albums, maxing out the playlist with literal “filler.”

I initially allowed compilations, but I’m irrationally bothered that one day, the script will add “The Birdie Song” by Black Lace as part of “DEUTSCHE TOP DISCO 3000 POP GEBURTSTAG PARTY TANZ SONGS ZWANZIG VIERUNDZWANZIG”.

Nein.

I added a filter to omit any “album type: compilation,” which knocks that bopping-bird-based botherer squarely on the bonce.

No more retro Europop compilation complications in my playlist. Alles klar.

Not yet

Something else I had yet to consider is that some albums have release dates in the future. Like a fresh-faced newborn baby with an IDE and API documentation, I assumed that albums published would generally have release dates of today or older. There may be a typo in the release_date field, or maybe stuff gets uploaded and made public ahead of time in preparation for a big marketing push on release_date.

I clearly do not understand the music industry or publishing process, which is fine.

Nuke it from orbit

I’ve been testing the script while I prototyped it, this week, leading up to the “Grand Launch” in September 2024 (next month/week). At the end of August I will wipe the slate (playlist) clean, and start again on 1st September with whatever rules and optimisations I’ve concocted this week. It will almost certainly re-add some of the same tracks after the 31st August “Grand Purge”, but that’s expected, and working as designed. The rest will be pseudo-random genre-specific tracks.

I hope.

Newsletter

I will let this thing go mad each day with the playlist and regroup at the end of September to evaluate how this scheme is going. Expect a follow-up blog post detailing whether this was a fun and interesting excursion or pure folly. Along the way, I did learn a bit more about Python, the Spotify API, and some other interesting stuff about music databases and JSON.

So it’s all good stuff, whether I enjoy the music or not.

You can get further, more timely updates in my weekly email newsletter, or view it in the newsletter archive, and via RSS, a little later.

Ken said he got “joy out of reading your newsletter”. YMMV. E&OE. HTH. HAND.

Nomenclature

Every good project needs a name. I initially called it my “Personal Dynamic Playlist of Sixty tracks over Eleven days,” or PDP-11/60 for short, because I’m a colossal nerd. Since bumping the max-tracks limit for the playlist, it could be re-branded PDP-11/94. However, this is a relatively niche and restrictive playlist naming system, so I sought other ideas.

My good friend Martin coined the term “Virtual Zane Lowe” (Zane is a DJ from New Zealand who is apparently renowned for sharing new music). That’s good enough for me. Below are links to all three playlists if you’d like to listen, laugh, live, love, or just look at them.

The “Keepers” and “Sleepers” lists will likely be relatively empty for a few days until the script migrates my preferred and disliked tracks over for safe-keeping & archival, respectively.

November approaches

Come back at the end of the month to see if my script still works, if the selections are good, if I’m still listening to this playlist, and, most importantly, whether I enjoy doing so!

If it works, I’ll probably continue using it through October and into November as I commute to and from the office. If that happens, I’ll need to update the playlist artwork. Thankfully, there’s an API for that, too!

I may consider tidying up the script and sharing it online somewhere. It feels a bit niche and requires a paid Spotify account to even function, so I’m not sure what value others would get from it other than a hearty chuckle at my terribad Python “skills.”

One potentially interesting option would be to map the songs in Spotify to another service, such as Apple Music, or even videos on YouTube. The YouTube API should enable me to manage video playlists that mirror the ones I manage directly on Spotify. That could be a fun further extension to this project.

Another option I considered was converting it to a web app, a service I (and other select individuals) can configure and manage in a browser. I’ll look into that at the end of the month. If the current iteration of the script turns out to be a complete bust, then this idea likely won’t go far, either.

Thanks for reading. AirPods in. Click “Shuffle”.

Ross YoungerBroadcast graphics for fencing

I created a TV graphics package for fencing tournaments.

Earlier this year, Christchurch played host to the Commonwealth Junior & Cadet fencing tournament.

Selected parts of the tournament were livestreamed, with a package broadcast on Sky TV (NZ). The broadcast and finals streams had a live graphics package fed from the scoreboard.

The programmes were produced using a broadcast-spec OB truck supplied by Kiwi Outside Broadcast. The truck graphics PC used Captivate to generate its graphics, which were output as key+fill SDI signals. These were fed to the vision mixer and keyed onto the picture in the usual way.

The package can be seen in action on the Commonwealth Junior & Cadet 2024 programmes.

The details read from the scoreboard cover the “hit” lamps, scores, clock (including fractional seconds in the last 10s), period, red/yellow cards, and the priority indicator. On top of that, the package provides a place to enter the fencer names and nationalities, and set colours for them.

This is all made possible by the scoreboard, a Favero FA-07, offering a data feed over an RS-422 interface. I wrote a Python script to parse the data feed, turn it into a JSON dictionary and pass it on to Captivate.
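
For a rough idea of the shape of that script, here is a minimal sketch. The serial settings and the frame layout below are placeholders for illustration, not the real Favero FA-07 protocol:

import json
import serial  # pyserial

PORT = "/dev/ttyUSB0"  # placeholder device name for the RS-422 adapter

with serial.Serial(PORT, baudrate=2400, timeout=1) as link:
    while True:
        frame = link.read(10)          # hypothetical fixed-length frame
        if len(frame) < 10:
            continue
        state = {                      # byte positions are placeholders
            "score_left": frame[1],
            "score_right": frame[2],
            "clock": "%02d:%02d" % (frame[3], frame[4]),
            "lamps": frame[5],
        }
        print(json.dumps(state))       # handed on to Captivate in the real setup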

Alan PopeText Editors with decent Grammar Tools

This is another blog post lifted wholesale out of my weekly newsletter. I do this when I get a bit verbose to keep the newsletter brief. The newsletter is becoming a blog incubator, which I’m okay with.

A reminder about that newsletter

The newsletter is emailed every Friday - subscribe here, and is archived and available via RSS a few days later.

I talked a bit about the process of setting up the newsletter on episode 34 of Linux Matters Podcast. Have a listen if you’re interested.

Linux Matters 34

Patreon supporters of Linux Matters can get the show a day or so early and without adverts.

Multiple kind offers

Good news, everyone! I now have a crack team of fine volunteers who proofread the text that lands in your inbox/browser cache/RSS reader. Crucially, they’re doing that review before I send the mail, not after, as was previously the case. Thank you for volunteering, mystery proofreaders.

popey dreamland

Until now, my newsletter “workflow” (such as it was) involved hoping that I’d get it done and dusted by Friday morning. Then, ideally, it would spend some time “in review”, followed by saving to disk. But if necessary, it would be ready to be opened in an emergency text editor at a moment’s notice before emails were automatically sent by lunchtime.

I clearly don’t know me very well.

popey reality

What actually happened is that I would continue editing right up until the moment I sent it out, then bash through the various “post-processing” steps and schedule the emails for “5 minutes from now.” Boom! Done.

This often resulted in typos or other blemishes in my less-than-lovingly crafted emails to fabulous people. A few friends would ping me with corrections. But once the emails are sent, reaching out and fixing those silly mistakes is problematic.

Someone should investigate over-the-air updates to your email. Although zero-day patches and DLC for your inbox sound horrendous. Forget that.

In theory, I could tweak the archived version, but that is not straightforward.

Tool refresh?

Aside: Yes, I know it’s not the tools, but I should slow down, be more methodical and review every change to my document before publishing. I agree. Now, let’s move on.

While preparing the newsletter, I would initially write in Sublime Text (my desktop text editor of choice), with a Grammarly† (affiliate link) LSP extension, to catch my numerous blunders, and re-word my clumsy English.

Unfortunately, the Grammarly extension for Sublime broke a while ago, so I no longer have that available while I prepare the newsletter.

I could use Google Docs, I suppose, where Grammarly still works, augmenting the built-in spell and grammar checker. But I really like typing directly as Markdown in a lightweight editor, not a big fat browser. So I guess I need to figure something else out to check my spelling and grammar before the awesome review team gets it, to save at least some of my blushes.

I’m not looking for suggestions for a different text editor—or am I? Maybe I am. I might be.

Sure, that’ll fix it.

ZX81 -> Spectrum -> CPC -> edlin -> Edit -> Notepad -> TextPad -> Sublime -> ?

I’ve used a variety of text editors over the years. Yes, the ZX81 and Sinclair Spectrum count as text editors. Yes, I am old.

I love Sublime’s minimalism, speed, and flexibility. I use it for all my daily work notes, personal scribblings, blog posts, and (shock) even authoring (some) code.

I also value Sublime’s data-recovery features. If the editor is “accidentally” terminated or a power-loss event occurs, Sublime reliably recovers its state, retaining whatever you were previously editing.

I regularly use Windows, Linux, and macOS on any given day across multiple computers. So, a cross-platform editor is also essential for me, but only on the laptop/desktop, as I never edit on mobile‡ devices.

I typically just open a folder as a “workspace” in a window or an additional tab in one window. I frequently open many folders, each full of files across multiple displays and machines.

All my notes are saved in folders that use Syncthing to keep in sync across machines. I leave all of those notes open for days, perhaps weeks, so having a robust sync tool combined with an editor that auto-reloads when files change is key.

The notes are separately backed up, so cloud storage isn’t essential for my use case.

Something else?

Whatever else I pick, it’s really got to fit that model and requirements, or it’ll be quite a stretch for me to change. One option I considered and test-drove is NotepadNext. It’s an open-source re-implementation of Notepad++, written in C++ and Qt.

A while back, I packaged up and published it as a snap, to make it easy to install and update. It fits many of the above requirements already, with the bonus of being open-source, but sadly, there is no Grammarly support there either.

I’d prefer no :::: W I D E - L O A D :::: electron monsters. Also, not Notion or Obsidian, as I’ve already tried them, and I’m not a fan. In addition, no, not Vim or Emacs.

Bonus points if you have a suggestion where one of the selling points isn’t “AI”§.

Perhaps there isn’t a great plain text editor that fulfills all my requirements. I’m open to hearing suggestions from readers of this blog or the newsletter. My contact details are here somewhere.


† - Please direct missives about how terrible Grammarly is to /dev/null. Thanks. Further, suggestions that I shouldn’t rely on Grammarly or other tools and should just “Git Gud” (as the youths say) may be inserted into the A1481 on the floor.

‡ - I know a laptop is technically a “mobile” device.

§ - Yes, I know that “Not wanting AI” and “Wanting a tool like Grammarly” are possibly conflicting requirements.

◇ - For this blog post I copy and pasted the entire markdown source into a Google doc, used all the spelling and grammar tools, then pasted it back into Sublime, pushed to git, and I’m done. Maybe that’s all I need to do? Keep my favourite editor, and do all the grammar in one chunk at the end in a tab of a browser I already had open anyway. Beat that!

Andy SmithDaniel Kitson – Collaborator (work in progress)

Collaborators

Last night we went to see Daniel Kitson's "Collaborator" (work in progress). I'd no idea what to expect but it was really good!

A photo of the central area of a small theatre in the round. There are four tiers of seating and then an upper balcony. Most seats are filled. The central stage area is empty except for four large stacks of paper.
The in-the-round setup of Collaborator at The Albany Theatre, Deptford, London

It has been reviewed at 4/5 stars in Chortle and positively in the Guardian, but I don't recommend reading any reviews because they'll spoil what you will experience. We went into it blind, as I always prefer that to doing thorough research of a show. I think that was the correct decision. I've been on Daniel's fan newsletter for ages but hadn't had a chance to see him live until now.

While I've seen some comedy gigs that resembled this, I've never seen anything quite like it.

At £12 a ticket this is an absolute bargain. We spent more getting there by public transport!

Shout out to the nerds

If you're a casual comedy enjoyer looking for something a bit different then that's all you need to know. If like me however you consider yourself a bit of a wanky appreciator of comedy as an art form, I have some additional thoughts!

Collaborator wasn't rolling-on-the-floor-in-tears funny, but was extremely enjoyable and Jenny and I spent the whole way home debating how Kitson designed it and what parts of it really meant. Not everyone wants that in comedy, and that's fine. I don't always want it either. But to get it sometimes is a rare treat.

It's okay to enjoy a McIntyre or Peter Kay crowd-pleaser about "do you have a kitchen drawer full of junk?" or "do you remember white dog poo?" but it's also okay to appreciate something that's very meta and deconstructive. Stewart Lee for example is often accused of being smug and arrogant when he critiques the work of other comedians, and his fans to some extent are also accused of enjoying feeling superior more than they enjoy a laugh - and some of them who miss the point definitely are like this.

But acts like Kitson and Lee are constructed personalities where what they claim to think and how they behave is a fundamental part of the performance. You are to some extent supposed to disagree with and be challenged by their views and behaviours — and I don't just mean they are edgelording with some "saying the things that can't be said" schtick. Sometimes it's fun to actually have thoughts about it. It's a different but no less valid (or more valid!) experience. A welcome one in this case!

I mean, I might have to judge you if you enjoy Mrs Brown's Boys, but I accept it has an audience as an art form.

White space

There was a comment on Fedi about how the crowd pictured here appears to be a sea of white faces, despite London being a fairly diverse city. This sort of thing hasn't escaped me. I've found it to be the case in most of the comedy gigs I've attended in person, where the performer is white. I don't know why. In fact, acts like Stewart Lee and Richard Herring will frequently make reference to the fact that their stereotypical audience member is a middle aged white male computer toucher with lefty London sensibilities. So, me then.

Don't get me wrong, I do try to see some diverse acts and have been in a demographic minority a few times. Sadly enough, just going to see a female act can be enough to put you in an audience of mostly women. That happened when we went to see Bridget Christie's Who Am I? ("a menopause laugh a minute with a confused, furious, sweaty woman who is annoyed by everything", 4 stars, Chortle), and it's a shame that people seem to stick in their lanes so much.


Josh HollandEven more on git scratch branches: using Jujutsu

Even more on git scratch branches: using Jujutsu

This is the third post in an impromptu series:

  1. Use a scratch branch in git
  2. More on git scratch branches: using stgit

It seems the main topic of this blog is now git scratch branches and ways to manage them, although the main prompt for this one is discovering someone else had exactly the same idea, as I found from a blog post extolling Jujutsu.

I don’t have much to add to the posts from qword and Sandy, beyond the fact that Jujutsu really is the perfect tool to make this workflow straightforward. The default change selection logic in jj rebase means that 9 times out of 10 it’s enough just to run jj rebase -d master to get everything up to date with the master branch, and the Jujutsu workflow as a whole really is a great experience.

So go forth, use Jujutsu to manage your dev branch, and hopefully I’ll never have to write another post on this, and you can have the traditional “I rewrote my blogging engine from scratch again” post that I’ve been owing for a month or two now.

Ross YoungerFault-finding at the ends of the earth

This is a tale from many months ago, working on an embedded ARM target.

In my private journal I wrote:

Today I feel like I saddled up and rode my horse to the literal ends of the earth. I was fault-finding in the setting-up-of-the-universe that happens before your program starts up, and in the tearing-it-down-again that happens after you declare you’re finished.

If you know C++, you might guess that this was a story about static object allocation and deallocation. You’d be right. So, destructors belonging to static-allocated objects. You’d never think they’d run on a bare-metal embedded target.

Well, they can. If your target supports exit() - e.g. if you are running with newlib - then an atexit handler is set up for you, and that will be set up to run the static destructors. If your program then calls exit() (as, say, your on-silicon unit tests might, at the end of a test run) then things are at risk of turning to custard.

You might have enabled an interrupt for some peripheral on the silicon. In order to do anything really useful, the ISR might reference a static object. If you do this, you’d damn well better make sure the object has a static destructor that disables the interrupt, or hilarity is one day going to ensue. You know, the sort of hilarity that involves being savaged by a horde of angry rampaging badgers, or your socks catching fire.

But wait, I hear you say, it called exit! The program no longer exists! Well, sure it doesn’t; but what happens on exit? On this particular ARM target, running tests via a debugger as part of a CI chain, when the atexit handlers have run the process signals final completion with a semihosting call, which is a special flavour of debug breakpoint. It is… not fast. If your interrupt happens regularly, the goblins are going to get you before the pseudo-system-call completes. Your test framework will fail the test executable for hanging, despite somehow passing all of its tests.

There was an actual bug in there, and it was mine. Class X, which contained an RTOS queue and enabled an interrupt, only had a default destructor. On exit, somewhen between static destructors and completion of the semihosting exit call, the ISR fired. It duly failed to insert an item into the now-destroyed queue, so jumped to the internal panic routine. That routine contained a breakpoint and then went nowhere fast, waiting for a debugger command that was never going to arrive — hence the time-out. Maybe it would have been useful to have a library option to skip the static destructors, but I probably wouldn’t have been aware of it ahead of time anyway.

The static destructor ordering fiasco can also be yours for the taking, but thankfully that hadn’t bitten me. Nevertheless, it was a rough day.

Cover image: Cyber Bug Search, by juicy_fish on Freepik

Chris WallaceNon-Eulerian paths

I’ve been doing a bit of work on Non-Eulerian paths.  I haven't made any algorithmic progress...

Chris WallaceMore Turtle shapes

I’ve come across some classic curves using arcs which can be described in my variant of Turtle...

Phil SpencerWho’s sick of this shit yet?

I find some headlines just make me angry these days, especially ones centered around hyper late stage capitalism.


This one about Apple and Microsoft just made me go “Who the fuck cares?” and seriously, why should I care? Those two idiot companies having insane and disgustingly huge market caps isn’t something I’m impressed by.

If anything it makes me furious.

Do something useful besides making iterations of the same ol junk. Make a few thousand houses, make an affordable grocery supply chain.

If you’re doing anything else you’re a waste of everyone’s time… as I type this on my Apple computer. Still, that bit of honesty aside, I don’t give a fuck about either company’s made-up valuation.

Ross YoungerThe Road Less Travelled

I’ve started an occasional YouTube series about quirky, off-the-beaten-track places.

At the time of writing there are 5 episodes online. I’m still experimenting with style and techniques. My inspiration is a combination of The Tim Traveller and Tom Scott.

So far it consists of a few random places around South Island. There’s no particular timescale for publishing new episodes; I have many demands on my time, and this project isn’t yielding an income at the moment. (It may never; I’m doing it for fun.) Presenting is taking some getting used to!

Here’s the playlist:

Phil SpencerNew year new…..This

I have made a new year’s goal to retire this server before March. The OS has been upgraded many, many times over the years and various software I’ve used has come and gone, so there is lots of cruft. This server/VM started in San Francisco, and then my provider stopped offering VMs in CA and moved my VM to the UK, which is where it has been ever since. This VM started its life in Jan 2008 and it is time for it to die.

During my 2 week xmas break I have been updating web facing software as much as I could, so that when I do put the bullet in the head of this thing I can transfer my blog, wiki, and a couple of other still-active sites to the new OS with minimal tweaking in the new home.

So far the biggest issues I ran into were with my MediaWiki. That entire site is very old, from around 2006, two years before I started hosting it for someone, and then I inherited it entirely around 2009, so the database is very finicky to upgrade and some of the extensions are no longer maintained. What I ended up doing was setting up a Docker instance at home to test upgrading and work through the kinks, and I have put together a solid step-by-step on how to move/upgrade it to the latest version.

I have also gotten sick of running my own e-mail servers; the spam management, certificates, block lists… it’s annoying. I found out recently that iCloud, which I already have a subscription to, allows up to 5 custom e-mail domains, so I retired my Philtopia e-mail to it early in December, and as of today I moved the vo-wiki domain to it as well. Much less hassle for me; I already work enough at work, I don’t need to work at home as well.

The other work continues, site by site but I think I am on track to put an end to this ol server early in the year.

Phil Spencer8bit party

It’s been a few years… four? since my Commodore 64 collection started, and I’ve now got 2 working C64s and a C128 that functions, along with 2 disk drives, a tape drive and a collection of add-on hardware and boxed games.

That isn’t all I am collecting, however; I also have my Nintendo Entertainment System, and even more recently I acquired a Sega Master System. The 8bit era really seems to catch my eye far more than anything that came after. I suppose it’s because the whole era made it on hacks and luck.

In any case, here are some pictures of my collection. I don’t collect for the sake of collecting; everything I have I use or play, because otherwise why bother having it?

Enjoy

My desk
NES
Sega Master System
Commodore 64
Games

Phil SpencerI think it’s time the blog came back

It’s been a while since I’ve written a blog post, almost 4 years in fact but I think it is time for a comeback.

The reason for this being that social media has become so locked down you can’t actually give a valid opinion about something without someone flagging your comment or it being caught by a robot. Oddly enough, it seems the right wing folks can say whatever they want against the immigrant villain of the month or LGBTQIA+ issues without being flagged, but if you dare stand up to them or offer an opposing opinion: 30 day ban!

So it is time to dust off the ol’ blog and put my opinions to paper somewhere else, just like the olden days before social media! It isn’t all bad of course; I’ve found Mastodon quite open to opinions, but the fediverse is getting a lot of corporate attention these days and I’m sure it’s only a year or two before even that ends up a complete mess.

Crack open the blogs and let those opinions fly

Paul RaynerPrint (only) my public IP

Every now and then, I need to know my public IP. The easiest way to find it is to visit one of the sites which will display it for you, such as https://whatismyip.com. Whilst useful, all of the ones I know (including that one) are chock full of adverts, and can’t easily be scraped as part of any automated scripting.

This has been a minor irritation for years, so the other night I decided to fix it.

http://ip.pr0.uk is my answer. It’s 50 lines of Rust, and is accessible via TCP on port 11111, and via HTTP on port 8080.

use std::io::Write;

use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr, TcpListener, TcpStream};
use chrono::Utc;
use threadpool::ThreadPool;

fn main() {
    let worker_count = 4;
    let pool = ThreadPool::new(worker_count);
    let tcp_port = 11111;
    let socket_v4_tcp = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), tcp_port);

    let http_port = 8080;
    let socket_v4_http = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), http_port);

    let socket_addrs = vec![socket_v4_tcp, socket_v4_http];
    let listener = TcpListener::bind(&socket_addrs[..]);
    if let Ok(listener) = listener {
        println!("Listening on {}:{}", listener.local_addr().unwrap().ip(), listener.local_addr().unwrap().port());
        for stream in listener.incoming() {
            let stream = stream.unwrap();
            let addr =stream.peer_addr().unwrap().ip().to_string();
            if stream.local_addr().unwrap_or(socket_v4_http).port() == tcp_port {
                pool.execute(move||send_tcp_response(stream, addr));
            } else {
                //http might be proxied via https so let anything which is not the tcp port be http
                pool.execute(move||send_http_response(stream, addr));
            }
        }
    } else {
        println!("Unable to bind to port")
    }
}

fn send_tcp_response(mut stream:TcpStream, addr:String) {
    stream.write_all(addr.as_bytes()).unwrap();
}

fn send_http_response(mut stream:TcpStream, addr:String) {

    let html = format!("<html><head><title>{}</title></head><body><h1>{}</h1></body></html>", addr, addr);
    let length = html.len();
    let response = format!("HTTP/1.1 200 OK\r\nContent-Length: {length}\r\n\r\n{html}" );
    stream.write_all(response.as_bytes()).unwrap();
    println!("{}\tHTTP\t{}",Utc::now().to_rfc2822(),addr)
}

A little explanation is needed on the array of SocketAddr. This came from an initial misreading of the docs, but I liked the result and decided to keep it that way. The call to bind() will only listen on one port - the first one in the array which it can bind to. The result is that when you run this program, it listens on port 11111. If you keep it running and start another copy, that one listens on port 8080 (because it can’t bind to port 11111). So to run this on my server, I just have systemd keep 2 copies alive at any time.
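
As a usage example, grabbing your address from the TCP service only takes a few lines; here is a minimal Python sketch using the hostname and port mentioned above:

import socket

# Connect to the TCP service on port 11111 and read back our public IP.
with socket.create_connection(("ip.pr0.uk", 11111), timeout=5) as s:
    print(s.recv(64).decode())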

The code and binaries for Linux and Windows are available on Github.

Next steps

I might well leave it there. It works for me, so it’s done. Here are some things I could do though:

1) Don’t hard code the ports
2) Proxy https
3) Make a client
4) Make it available as a binary for anyone to run on crates.io
5) Optionally print the TTL. This would be mostly useful to people running their own instance.

Boring Details

Logging

I log the IP, port, and time of each connection. This is just in case it ever gets flooded and I need to block an IP/range. The code you see above is the code I run. No browser detection, user agent or anything like that is read or logged. Any data you send with the connection is discarded. If I proxied https via nginx, that might log a bit of extra data as a side effect.

Systemd setup

There’s not much to this either. I have a template file:

[Unit]
Description=Run the whatip binary. Instance %i
After=network.target

[Service]
ExecStart=/path/to/whatip
Restart=on-failure

StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=whatip%i

[Install]
WantedBy=multi-user.target

stored at /etc/systemd/system/whatip@.service and then set up two instances to run:

systemctl enable whatip@1
systemctl enable whatip@2

Thanks for reading

David Leadbeater"[31m"?! ANSI Terminal security in 2023 and finding 10 CVEs

A paper detailing how unescaped output to a terminal could result in unexpected code execution, on many terminal emulators. This research found around 10 CVEs in terminal emulators across common client platforms.

Alun JonesMessing with web spiders

You've surely heard of ChatGPT and its ilk. These are massive neural networks trained using vast swathes of text. The idea is that if you've trained a network on enough - about 467 words

Alun JonesI wrote a static site generator

Back in 2019, when Google+ was shut down, I decided to start writing this blog. It seemed better to take my ramblings under my own control, rather than posting content - about 716 words

Alex HudsonJobs in the AI Future

Everyone is talking about what AI can do right now, and the impact that it is likely to have on us. This weekends’s Semafor Flagship (which is an excellent newsletter; I recommend subscribing!) asks a great question: “What do we teach the AI generation?”. As someone who grew up with computers, knowing he wanted to write software, and knowing that tech was a growth area, I never had to grapple with this type of worry personally. But I do have kids now. And I do worry. I’m genuinely unsure what I would recommend a teenager to do today, right now. But here’s my current thinking.

Paul RudkinYour new post

Your new post

This is a new blog post. You can author it in Markdown, which is awesome.

David LeadbeaterNAT-Again: IRC NAT helper flaws

A Linux kernel bug allows unencrypted NAT'd IRC sessions to be abused to access resources behind NAT, or drop connections. Switch to TLS right now. Or read on.

Paul RaynerPutting dc in (chroot) jail

A little over 4 years ago, I set up a VM and configured it to offer dc over a network connection using xinetd. I set it up at http://dc.pr0.uk and made it available via a socket connection on port 1312.

Yesterday morning I woke to read a nice email from Sylvan Butler pointing out that users could run shell commands from dc…

I had set up the dc command to run as a user “dc”, but still, if someone could run a shell command they could, for example, put a key in the dc user’s .ssh config, run sendmail (if it was set up), try for privilege escalation to get root, etc.

I’m not sure what the 2017 version of me was thinking (or wasn’t), but the 2022 version of me is not happy to leave it like this. So here’s how I put dc in jail.

Firstly, how do you run shell commands from dc? It’s very easy. Just prefix with a bang:

$ dc
!echo "I was here" > /tmp/foo
!cat /tmp/foo
I was here

So, really easy. Even if it was hard, it would still be bad.

This needed to be fixed. Firstly I thought about what else was on the VM - nothing that matters. This is a good thing because the helpful Sylvan might not have been the first person to spot the issue (although network dc is pretty niche). I still don’t want this vulnerability though as someone else getting access to this box could still use it to send spam, host malware or anything else they wanted to do to a cheap tiny vm.

I looked at restricting the dc user further (it had no login shell, and no home directory already), but it felt like I would always be missing something, so I turned to chroot jails.

A chroot jail lets you run a command with a specified directory used as / for that command. The command (in theory) can’t escape that directory, so it can’t see or touch anything outside it. Chroot is a kernel feature and a basic security building block of Linux, so it should be good enough to protect network dc if set up correctly, even if it’s not perfect.
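
As a minimal illustration of the mechanism (using a hypothetical /srv/jail directory; the real jail is built below), chroot just takes the new root followed by the command to run inside it:

# /srv/jail becomes / for the command; anything the command needs
# (binaries, libraries) must already exist inside /srv/jail
sudo chroot /srv/jail /bin/sh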

Firstly, let’s set up the directory for the jail. We need the programs to run inside the jail, and their dependent libraries. The script to run a networked dc instance looks like this:

#!/bin/bash
dc --version
sed -u -e 's/\r/\n/g' | dc

Firstly, I’ve used bash here, but this script is trivial, so it can use sh instead. We also need to keep the sed (I’m sure there are plenty of ways to do the replace without sed, but it’s working fine as it is). For each of the three programs needed to run the script, I ran ldd to get their dependencies:

$ ldd /usr/bin/dc
	linux-vdso.so.1 =>  (0x00007fffc85d1000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc816f8d000)
	/lib64/ld-linux-x86-64.so.2 (0x0000555cd93c8000)
$ ldd /bin/sh
	linux-vdso.so.1 =>  (0x00007ffdd80e0000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa3c4855000)
	/lib64/ld-linux-x86-64.so.2 (0x0000556443a1e000)
$ ldd /bin/sed
	linux-vdso.so.1 =>  (0x00007ffd7d38e000)
	libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007faf5337f000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faf52fb8000)
	libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007faf52d45000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faf52b41000)
	/lib64/ld-linux-x86-64.so.2 (0x0000562e5eabc000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faf52923000)
$

So we copy those files into the same directory structure inside the jail directory (a sketch of the copy commands follows the listing). Afterwards it looks like this:

$ ls -alR
.:
total 292
drwxr-xr-x 4 root root   4096 Feb  5 10:13 .
drwxr-xr-x 4 root root   4096 Feb  5 09:42 ..
-rwxr-xr-x 1 root root  47200 Feb  5 09:50 dc
-rwxr-xr-x 1 root root     72 Feb  5 10:13 dctelnet
drwxr-xr-x 3 root root   4096 Feb  5 09:49 lib
drwxr-xr-x 2 root root   4096 Feb  5 09:50 lib64
-rwxr-xr-x 1 root root  72504 Feb  5 09:58 sed
-rwxr-xr-x 1 root root 154072 Feb  5 10:06 sh

./lib:
total 12
drwxr-xr-x 3 root root 4096 Feb  5 09:49 .
drwxr-xr-x 4 root root 4096 Feb  5 10:13 ..
drwxr-xr-x 2 root root 4096 Feb  5 10:01 x86_64-linux-gnu

./lib/x86_64-linux-gnu:
total 2584
drwxr-xr-x 2 root root    4096 Feb  5 10:01 .
drwxr-xr-x 3 root root    4096 Feb  5 09:49 ..
-rwxr-xr-x 1 root root 1856752 Feb  5 09:49 libc.so.6
-rw-r--r-- 1 root root   14608 Feb  5 10:00 libdl.so.2
-rw-r--r-- 1 root root  468920 Feb  5 10:00 libpcre.so.3
-rwxr-xr-x 1 root root  142400 Feb  5 10:01 libpthread.so.0
-rw-r--r-- 1 root root  146672 Feb  5 09:59 libselinux.so.1

./lib64:
total 168
drwxr-xr-x 2 root root   4096 Feb  5 09:50 .
drwxr-xr-x 4 root root   4096 Feb  5 10:13 ..
-rwxr-xr-x 1 root root 162608 Feb  5 10:01 ld-linux-x86-64.so.2
$
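
For completeness, a sketch of the copy commands that produce the layout above (paths taken from the ldd output, using /home/dc/ as the jail directory, as in the xinetd config further down; linux-vdso is provided by the kernel, so there is nothing to copy for it):

mkdir -p /home/dc/lib/x86_64-linux-gnu /home/dc/lib64
cp /usr/bin/dc /bin/sh /bin/sed /home/dc/
cp /lib/x86_64-linux-gnu/{libc.so.6,libselinux.so.1,libpcre.so.3,libdl.so.2,libpthread.so.0} \
   /home/dc/lib/x86_64-linux-gnu/
cp /lib64/ld-linux-x86-64.so.2 /home/dc/lib64/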

and here is the modified dctelnet command:

#!/sh
#dc | dos2unix 2>&1
./dc --version
./sed -u -e 's/\r/\n/g' | ./dc

I’ve switched to using sh instead of bash, and all of the commands are now relative paths, as they are just in the root directory.

First attempt

Now I have a directory that I can use for a chrooted network dc. I need to set up the xinetd config to use chroot and the jail I have set up:

service dc
{
	disable		= no
	type		= UNLISTED
	id		= dc-stream
	socket_type	= stream
	protocol	= tcp
	server		= /usr/sbin/chroot
	server_args	= /home/dc/ ./dctelnet
	user		= root
	wait		= no
	port		= 1312
	rlimit_cpu	= 60
	env		= HOME=/ PATH=/
}

I needed to set the HOME and PATH environment variables, otherwise I got a segfault (I’m not sure whether it was sh, sed or dc causing it), and to run chroot you need to be root, so I could no longer run the service as the user dc. This shouldn’t be a problem because the resulting process is constrained.

A bit more security

Chroot jails have a reputation for being easy to get wrong, and they are not something I have done a lot of work with, so I want to take a bit of time to think about whether I’ve left any glaring holes, and also try to improve on the simple option above a bit if I can.

Firstly, can dc still execute commands with the ! operation?

 ~> nc -v dc.pr0.uk 1312
Connection to dc.pr0.uk 1312 port [tcp/*] succeeded!
dc (GNU bc 1.06.95) 1.3.95

Copyright 1994, 1997, 1998, 2000, 2001, 2004, 2005, 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.
!ls
^C⏎

Nope. Ok, that’s good. The chroot jail has sh though, and has it in the PATH, so can it still get a shell and call dc, sh and sed?

 ~> nc -v dc.pr0.uk 1312
Connection to dc.pr0.uk 1312 port [tcp/*] succeeded!
dc (GNU bc 1.06.95) 1.3.95

Copyright 1994, 1997, 1998, 2000, 2001, 2004, 2005, 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.
!pwd
^C⏎

pwd is a builtin, so it looks like the answer is no, but why? Running strings on my version of dc, there is no mention of sh or exec, but there is a mention of system. From the man page of system:

The system() library function uses fork(2) to create a child process that executes the shell  command  specified in command using execl(3) as follows:

           execl("/bin/sh", "sh", "-c", command, (char *) 0);

So dc calls system() when you use !, which makes sense. system() calls /bin/sh, which does not exist in the jail, breaking the ! call.

For a system that I don’t care about, that is of little value to anyone else, and that sees very little traffic, that’s probably good enough. But I want to make it a bit better: if there were a problem with the dc program, or you could get it to pass something to sed and trigger an issue there, you could mess with the jail file system, overwrite the dc binary, and likely break out of the jail, since the whole thing is running as root.

So I want to do two things. Firstly, I don’t want dc running as root in the jail. Secondly, I want to throw away the environment after each use, so if you figure out how to mess with it you don’t affect anyone else’s fun.

Here’s a bash script which I think does both of these things:

#!/bin/bash
set -e
DCDIR="$(mktemp -d /tmp/dc_XXXX)"
trap '/bin/rm -rf -- "$DCDIR"' EXIT
cp -R /home/dc/ $DCDIR/
cd $DCDIR/dc
PATH=/
HOME=/
export PATH
export HOME
/usr/sbin/chroot --userspec=1001:1001 . ./dctelnet
  • Line 2 - set -e causes the script to exit on the first error
  • Lines 3 & 4 - make a temporary directory to run in, then set a trap to clean it up when the script exits.
  • I then copy the required files for the jail to the new temp directory, set $HOME and $PATH, and run the jail as an unprivileged user (uid 1001).

Now to make some changes to the xinetd file:

service dc
{
        disable         = no
        type            = UNLISTED
        id              = dc-stream
        socket_type     = stream
        protocol        = tcp
        server          = /usr/local/bin/dcinjail
        user            = root
        wait            = no
        port            = 1312
        rlimit_cpu      = 60
        log_type        = FILE /var/log/dctelnet.log
        log_on_success  = HOST PID DURATION
        log_on_failure  = HOST
}

The new version just runs the script from above. It still needs to run as root to be able to chroot.

I’ve also added some logging as this has piqued my interest and I want to see how many people (other than me) ever connect, and for how long.
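
One easy step to forget: xinetd only picks up the edited service definition after a reload or restart, so something along these lines is needed (the exact command depends on the init system in use):

sudo systemctl restart xinetd    # or: sudo service xinetd restart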

As always, I’m interested in feedback or questions. I’m no expert in this setup, so I may not be able to answer questions, but if you see something that looks wrong (or that you know is wrong), please let me know. I’m also interested to hear about other ways of doing process isolation - I know I could have used containers, and I think I could have used systemd or SELinux features (or both) to further lock down the dc user and achieve a similar result.

Thanks for reading.

Christopher RobertsFixing SVG Files in DokuWiki

Having upgraded a DokuWiki server from Ubuntu 16.04 to 18.04, I found that SVG images were no longer displaying in the browser. As I was unable to find any applicable answers online, I thought I should break my radio silence by detailing my solution.

Inspecting the file using the browser’s developer tools (Network tab) and refreshing the page showed that the file was being served as application/octet-stream. Sure enough, using curl showed the same:

curl -Ik https://example.com/file.svg
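
A quick way to look at just the header in question (the same check as above, filtered; before the fix this reported application/octet-stream, and after the fix below it should report image/svg+xml):

curl -sIk https://example.com/file.svg | grep -i '^content-type'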

All the advice online is to ensure that /etc/nginx/mime.types includes the line:

image/svg+xml   svg svgz;

But that was already in place.

I decided to try uploading the SVG file again, in case the Inkscape format was causing breakage. Yes, a long-shot indeed.

The upload was rejected by DokuWiki, as SVG was not in the list of allowed file extensions; so I added the following line to /var/www/dokuwiki/conf/mime.local.conf:

svg   image/svg_xml

Whereupon the images started working again. Presumably DokuWiki was seeing the mime type as image/svg instead of image/svg+xml, and this mismatch was preventing nginx from serving up the correct content type.

Hopefully this will help others, do let me know if it has helped you.

Paul RaynerSnakes and Ladders, Estimation and Stats (or 'Sometimes It Takes Ages')

Snakes And Ladders

Simple kids game, roll a dice and move along a board. Up ladders, down snakes. Not much to it?

We’ve been playing snakes and ladders a bit (lot) as a family because my 5 year old loves it. Our board looks like this:

Some games on this board take a really long time. My son likes to play games till the end, so until all players have finished. It’s apparently really funny when everyone else has finished and I keep finding the snakes over and over. Sometimes one player finishes really quickly - they hit some good ladders, few or no snakes and they are done in no time.

This got me thinking. What’s the distribution of game lengths for snakes and ladders? How long should we expect a game to take? How long before we typically have a winner?

Fortunately for me, snakes and ladders is a very simple game to model with a bit of python code.

Firstly, here are the rules we play:

1) Each player rolls a normal 6-sided dice and moves their token that number of squares forward.
2) If a player lands on the head of a snake, they go down the snake.
3) If a player lands on the bottom of a ladder, they go up to the top of the ladder.
4) If a player rolls a 6, they get another roll.
5) On this board, some ladders and snakes interconnect - the bottom of a snake is the head of another, or the top of a ladder is also the head of a snake. When this happens, you do all of the actions in turn, so down both snakes, or up the ladder and then down the snake.
6) You don’t need an exact roll to finish; once you get to 100 or more, you are done.

To model the board in python, all we really need are the coordinates of the snakes and the ladders - their starting and ending squares.

def get_snakes_and_ladders():
    snakes = [
        (96,27),
        (88,66),
        (89,46),
        (79,44),
        (76,19),
        (74,52),
        (57,3),
        (60,39),
        (52,17),
        (50,7),
        (32,15),
        (30,9)
    ]
    ladders = [
        (6,28),
        (10,12),
        (18,37),
        (40,42),
        (49,67),
        (55,92),
        (63,76),
        (61,81),
        (86,94)
    ]
    return snakes + ladders

Since snakes and ladders are both mappings from one point to another, we can combine them in one array as above.

The game is modelled with a few lines of Python:

from random import randint

class Game:

    def __init__(self) -> None:
        self.token = 1
        snakes_and_ladders_list = get_snakes_and_ladders()
        self.sl = {}
        for entry in snakes_and_ladders_list:
            self.sl[entry[0]] = entry[1]

    def move(self, howmany):
        self.token += howmany
        while (self.token in self.sl):
            self.token = self.sl[self.token]
        return self.token

    def turn(self):
        num = self.roll()
        self.move(num)
        if num == 6:
            self.turn()
        if self.token>=100:
            return True
        return False

    def roll(self):
        return randint(1,6)

A turn consists of all the actions taken by a player before the next player gets their turn. This can consist of multiple moves if the player rolls one or more sixes, as rolling a six gives you another move.

With this, we can run some games and plot them. Here’s what a sample looks like.

The Y axis is the position on the board, and the X axis is the number of turns. This small graphical representation of the game shows how variable it can be. The red player finishes in under 20 moves, whereas the orange player takes over 80.

To see how variable it is, we can run the simulation a large number of times and look at the results. Running for 10,000 games we get the following:

Statistic   Result (turns)
min         5
max         918
mean        90.32
median      65

So the fastest finish in 10,000 games was just 5 turns, and the slowest was an awful (if you were rolling the dice) 918 turns.

Here are some histograms for the distribution of game lengths, the distribution of number of turns for a player to win in a 3 person game, and the number of turns for all players to finish in a 3 person game.

The python code for this post is at snakes.py

Alex HudsonIntroduction to the Metaverse

You’ve likely heard the term “metaverse” many times over the past few years, and outside the realm of science fiction novels, it has tended to refer to some kind of computer-generated world. There’s often little distinction between a “metaverse” and a relatively interactive virtual reality world.

There are a huge number of people who think this is simply a marketing term, and Facebook’s recent rebranding of its holding company to “Meta” has only reinforced this view. However, I think this view is wrong, and I hope to explain why.

Alex HudsonIt's tough being an Azure fan

Azure has never been the #1 cloud provider - that spot continues to belong to AWS, which is the category leader. However, in most people’s minds, it has been a pretty reasonable #2, and while not necessarily vastly differentiated from AWS there are enough things to write home about.

However, even as a user and somewhat of a fan of the Azure technology, it is proving increasingly difficult to recommend.

Josh HollandMore on git scratch branches: using stgit

More on git scratch branches: using stgit

I wrote a short post last year about a useful workflow for preserving temporary changes in git by using a scratch branch. Since then, I’ve come across stgit, which can be used in much the same way, but with a few little bells and whistles on top.

Let’s run through a quick example to show how it works. Let’s say I want to play around with the cool new programming language Zig and I want to build the compiler myself. The first step is to grab a source code checkout:

$ git clone https://github.com/ziglang/zig
Cloning into 'zig'...
remote: Enumerating objects: 123298, done.
remote: Counting objects: 100% (938/938), done.
remote: Compressing objects: 100% (445/445), done.
remote: Total 123298 (delta 594), reused 768 (delta 492), pack-reused 122360
Receiving objects: 100% (123298/123298), 111.79 MiB | 6.10 MiB/s, done.
Resolving deltas: 100% (91169/91169), done.
$ cd zig

Now, according to the instructions we’ll need to have CMake, GCC or clang and the LLVM development libraries to build the Zig compiler. On NixOS it’s usual to avoid installing things like this system-wide but instead use a file called shell.nix to specify your project-specific dependencies. So here’s the one ready for Zig (don’t worry if you don’t understand the Nix code, it’s the stgit workflow I really want to show off):

$ cat > shell.nix << EOF
{ pkgs ? import <nixpkgs> {} }:
pkgs.mkShell {
  buildInputs = [ pkgs.cmake ] ++ (with pkgs.llvmPackages_12; [ clang-unwrapped llvm lld ]);
}
EOF
$ nix-shell

Now we’re in a shell with all the build dependencies, and we can go ahead with the mkdir build && cd build && cmake .. && make install steps from the Zig build instructions1.

But now what do we do with that shell.nix file?

$ git status
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        shell.nix

nothing added to commit but untracked files present (use "git add" to track)

We don’t really want to add it to the permanent git history, since it’s just a temporary file that is only useful to us. But the other options of just leaving it there untracked or adding it to .git/info/exclude are unsatisfactory as well: before I started using scratch branches and stgit, I often accidentally deleted my shell.nix files which were sometimes quite annoying to have to recreate when I needed to pin specific dependency versions and so on.

But now we can use stgit to take care of it!

$ stg init # stgit needs to store some metadata about the branch
$ stg new -m 'add nix config'
Now at patch "add-nix-config"
$ stg add shell.nix
$ stg refresh
Now at patch "add-nix-config"

This little dance creates a new commit adding our shell.nix managed by stgit. You can stg pop it to unapply, stg push2 to reapply, and stg pull to do a git pull and reapply the patch back on top. The main stgit documentation is helpful to explain all the possible operations.

This solves all our problems! We have basically recreated the scratch branch from before, but now we have pre-made tools to apply, un-apply and generally play around with it. The only problem is that it’s really easy to accidentally push your changes back to the upstream branch.

Let’s have another example. Say I’m sold on the stgit workflow, I have a patch at the bottom of my stack adding some local build tweaks and, on top of that, a patch that I’ve just finished working on that I want to push upstream.

$ cd /some/other/project
$ stg series # show all my patches
+ add-nix-config
> fix-that-bug

Now I can use stg commit to turn my stgit patch into a real immutable git commit that stgit isn’t going to mess around with any more:

$ stg commit fix-that-bug
Popped fix-that-bug -- add-nix-config
Pushing patch "fix-that-bug" ... done
Committed 1 patch
Pushing patch "add-nix-config" ... done
Now at patch "add-nix-config"

And now what we should do before git pushing is stg pop -a to make sure that we don’t push add-nix-config or any other local stgit patches upstream. Sadly it’s all too easy to forget that, and since stgit updates the current branch to point at the current patch, just doing git push here will include the commit representing the add-nix-config patch.
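
In other words, the safe manual routine looks something like this (patch names as in the example above):

stg pop -a     # unapply every local stgit patch
git push       # only real commits go upstream
stg push -a    # reapply the local patches afterwards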

The way to prevent this is through git’s hook system. Save this as pre-push3 (make sure it’s executable):

#!/bin/bash
# An example hook script to verify what is about to be pushed.  Called by "git
# push" after it has checked the remote status, but before anything has been
# pushed.  If this script exits with a non-zero status nothing will be pushed.
#
# This hook is called with the following parameters:
#
# $1 -- Name of the remote to which the push is being done
# $2 -- URL to which the push is being done
#
# If pushing without using a named remote those arguments will be equal.
#
# Information about the commits which are being pushed is supplied as lines to
# the standard input in the form:
#
#   <local ref> <local sha1> <remote ref> <remote sha1>

remote="$1"
url="$2"

z40=0000000000000000000000000000000000000000

while read local_ref local_sha remote_ref remote_sha
do
    if [ "$local_sha" = $z40 ]
    then
        # Handle delete
        :
    else
        # verify we are on a stgit-controlled branch
        git show-ref --verify --quiet "${local_ref}.stgit" || continue
        if [ $(stg series --count --applied) -gt 0 ]
        then
            echo >&2 "Unapplied stgit patch found, not pushing"
            exit 1
        fi
    fi
done

exit 0

Now we can’t accidentally4 shoot ourselves in the foot:

$ git push
Unapplied stgit patch found, not pushing
error: failed to push some refs to <remote>

Happy stacking!


  1. At the time of writing, Zig depends on the newly-released LLVM 12 toolchain, but this hasn’t made it into the nixos-unstable channel yet, so this probably won’t work on your actual NixOS machine.↩︎

  2. an unfortunate naming overlap between pushing onto a stack and pushing a git repo↩︎

  3. A somewhat orthogonal but also useful tip here so that you don’t have to manually add this to every repository is to configure git’s core.hooksDir to something like ~/.githooks and put it there.↩︎

  4. You can always pass --no-verify if you want to bypass the hook.↩︎

Jon FautleyUsing the Grafana Cloud Agent with Amazon Managed Prometheus, across multiple AWS accounts

Observability is all the rage these days, and the process of collecting metrics is getting easier. Now, the big(ger) players are getting in on the action, with Amazon releasing a Managed Prometheus offering and Grafana now providing a simplified “all-in-one” monitoring agent. This is a quick guide to show how you can couple these two together, on individual hosts, and incorporating cross-account access control. The Grafana Cloud Agent Grafana Labs have taken (some of) the best bits of the Prometheus monitoring stack and created a unified deployment that wraps the individual moving parts up into a single binary.
