Toolbx is a release blocker for Fedora 39 onwards
This is the second instalment of my 2023 retrospective series on Toolbx.
One very important thing that we did behind the scenes was to make Toolbx a release blocker for Fedora 39 and onwards. This means that the registry.fedoraproject.org/fedora-toolbox OCI image is considered a release-blocking deliverable, and there are release-blocking test criteria to ensure that the toolbox RPM is usable.
Why do that?
Earlier, there was no formal requirement for Toolbx to be usable when a new Fedora was released. That was a problem for a tool that’s so popular and provides something as fundamental as an interactive command line environment for software development and troubleshooting the host operating system. Everybody expects their CLI environment to just work even under very adverse conditions, and Toolbx should be no different. Except that Toolbx is slightly more complicated than running Bash or Z shell directly on the host OS, and, therefore, requires a bit more diligence.
Toolbx has two parts: an OCI image, which defaults to registry.fedoraproject.org/fedora-toolbox on Fedora hosts, and the toolbox RPM. The OCI image is pulled by the RPM to set up a containerized interactive CLI environment.
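To make the split between the two parts concrete, here is a minimal Python sketch that only assembles the image reference and the toolbox command line, without running anything. The helper names image_reference and create_command are made up for illustration, and the sketch assumes that the image is tagged with the Fedora release number, as in fedora-toolbox:39.

```python
import shlex

# Default image repository on Fedora hosts, as named in this post.
IMAGE_REPOSITORY = "registry.fedoraproject.org/fedora-toolbox"


def image_reference(release):
    """Return the OCI image reference for a Fedora release, e.g. 39."""
    # Assumption: images are tagged with the release number.
    return f"{IMAGE_REPOSITORY}:{release}"


def create_command(release):
    """Return the argv that asks the toolbox RPM to create a container."""
    return ["toolbox", "create", "--distro", "fedora", "--release", str(release)]


if __name__ == "__main__":
    # Only print what would be used; nothing is pulled or created here.
    print("image:  ", image_reference(39))
    print("command:", shlex.join(create_command(39)))
```

Passing the list from create_command to subprocess.run on a Fedora host would hand the work over to the toolbox RPM, which in turn pulls the image.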
Let’s look at each separately.
The image
First, we wanted to ensure that there is an up-to-date fedora-toolbox OCI image published on registry.fedoraproject.org as a release-blocking deliverable at critical points in the development schedule, just like the installation ISOs for the Editions from download.fedoraproject.org. Such points include the moment when an upcoming Fedora release is branched from Rawhide, and the Beta and Final releases.
One of the recurring complaints that we used to get was from users of Fedora Rawhide Toolbx containers, when Rawhide gets branched in preparation for the Beta of the next Fedora release. At this point, the previous Rawhide version becomes the Branched version, and the current Rawhide version increases by one. If the fedora-toolbox images aren’t part of the mass branching performed by Fedora Release Engineering, then someone has to quickly step in after they have finished to refresh the images to ensure consistency. This sort of ad hoc manual co-ordination rarely works, and it left users in the lurch.
With this change, the fedora-toolbox image is part of the nightly Fedora composes, and the branching is handled by Fedora Release Engineering just like any other release-blocking deliverable. This makes the image as readily available and updated as the fedora and fedora-minimal OCI images or any other deliverable, and we hope that it will improve the user experience for Rawhide Toolbx containers.
If someone installs the Fedora Beta or the Final on their host, and creates a Toolbx container using the default image, then, barring exceptions, the host and the container now have the same RPM versions for all packages, just as Fedora Silverblue and Workstation are released with the same versions. This ensures greater consistency in terms of bug-fixes, features and pending updates.
In the past, this wasn’t the case and it led to occasional surprises. For example, the change to make RPM use a Sequoia-based OpenPGP parser made it impossible to install third-party RPMs in the fedora-toolbox image, even long after the actual bug was fixed.
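One way to check this kind of consistency is to compare the package lists from the host and from the container. The Python sketch below assumes that each list was captured as text with something like rpm -qa --queryformat '%{NAME} %{VERSION}-%{RELEASE}\n', on the host directly and inside the container through toolbox run. The helper names and the version strings in the example are invented for illustration.

```python
def parse_rpm_list(text):
    """Parse 'NAME VERSION-RELEASE' lines into a dict keyed by package name."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            name, version_release = parts
            # Simplification: if a package is installed in several versions
            # (kernels, for instance), only the last one listed is kept.
            versions[name] = version_release
    return versions


def version_mismatches(host, container):
    """Return packages present on both sides whose versions differ."""
    shared = host.keys() & container.keys()
    return {
        name: (host[name], container[name])
        for name in sorted(shared)
        if host[name] != container[name]
    }


if __name__ == "__main__":
    # Made-up sample data standing in for real rpm output.
    host = parse_rpm_list("bash 5.2.21-1.fc39\nrpm 4.19.0-1.fc39\n")
    container = parse_rpm_list(
        "bash 5.2.21-1.fc39\nrpm 4.18.90-1.fc39\nvim 9.0-1.fc39\n"
    )
    for name, (on_host, in_container) in version_mismatches(host, container).items():
        print(f"{name}: host {on_host}, container {in_container}")
```

Packages that exist on only one side are ignored on purpose, because the host and the image are not expected to ship the same package set, only the same versions of what they share.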
The RPM
Second, we wanted to have release-blocking test criteria to ensure that the toolbox RPM is usable at critical points in the development schedule. This is to ensure that changes in the Toolbx stack, and future changes in other parts of the operating system, do not break Toolbx, at least not for the Beta and Final releases. It’s good to have the fedora-toolbox image be more readily available and updated, but it’s better if Toolbx works more reliably as a whole.
Examples of changes in the Toolbx stack causing breakage include FUSE preventing RPMs with file capabilities from being installed inside Toolbx containers, and Toolbx bind mounts preventing RPMs with %attr() from being installed or causing systemd-tmpfiles(8) to throw errors. Examples of changes in other parts of the OS include changes to Fedora’s Kerberos stack causing Kerberos to stop working inside Toolbx containers, changes to the sysctl(8) configuration breaking ping(8), and changes in Mutter breaking graphical applications.
The test criteria for the toolbox RPM also implicitly test the fedora-toolbox image, and co-ordinate several disparate groups of developers to ensure that the containerized interactive command line Toolbx environments on Fedora are just as reliable as those running directly on the host OS.
Tooling changes
This does come with a significant tooling change that isn’t obvious at first. The fedora-toolbox OCI image is no longer defined as a layered image through a Container/Dockerfile. Instead, it’s built as a base image through Kickstarts and Pungi, just like the fedora and fedora-minimal images.
This was necessary because the nightly Fedora composes work with Kickstarts and Pungi, not Container/Dockerfiles. Moreover, building Fedora OCI images from a Dockerfile with fedpkg container-build uses an ancient unmaintained version of OpenShift Build Service that requires equally unmaintained ancient versions of Fedora to run, and the fedora-toolbox image was the only thing using Container/Dockerfiles in Fedora.
We had to either update the Fedora infrastructure to use OpenShift Build Service 2.x, or use Kickstarts and Pungi, which use Image Factory, to build the fedora-toolbox image. We chose the latter, because updating the infrastructure would be a significant effort, and by using Kickstarts and Pungi we get to stay close to the fedora and fedora-minimal images and simplify the infrastructure.
The Fedora Flatpaks were also being built using the same ancient and unmaintained version of OpenShift Build Service, and they too are in the process of being migrated. However, that’s outside the scope of this post.
One big benefit of fedora-toolbox not being a layered image based on top of the fedora image is that it removes the constant fight against the efforts to minimize the size of the latter. The fedora-toolbox image is designed for interactive command line use in long-lived containers, and not for deploying server-side applications and services in ephemeral ones. This means that dictionaries, documentation, locales, iconv converter modules, translations, etc. are more important than reducing the size of the images. Now that the image is built from scratch, it has full control over what goes into it.
Unfortunately, Image Factory is weakly maintained and setting it up on one’s local machine is a lot more complicated than using podman build. One can do scratch builds on the Fedora infrastructure with koji image-build --scratch, but only if they have been explicitly granted permissions, and then they have to download the tarball and use skopeo copy to place it in containers-storage so that Podman can see it. All that is again more complicated than doing a podman build.
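For the last step of that workflow, here is a rough Python sketch that assembles the skopeo copy and toolbox create command lines for a tarball that has already been downloaded. It assumes that the tarball is in a layout that skopeo can read through its docker-archive transport. The tarball path, the local image name and the helper names are all hypothetical, and the koji image-build invocation is left out because its arguments depend on the Fedora infrastructure.

```python
import shlex


def import_command(tarball, image_name):
    """Return the argv that copies a downloaded tarball into containers-storage."""
    # Assumption: the tarball can be read with the docker-archive transport.
    return [
        "skopeo",
        "copy",
        f"docker-archive:{tarball}",
        f"containers-storage:{image_name}",
    ]


def create_command(image_name):
    """Return the argv that creates a Toolbx container from the imported image."""
    return ["toolbox", "create", "--image", image_name]


if __name__ == "__main__":
    # Hypothetical path and name, purely for illustration.
    tarball = "/tmp/fedora-toolbox-scratch.tar"
    image_name = "localhost/fedora-toolbox:scratch"
    # Print the commands instead of running them.
    print(shlex.join(import_command(tarball, image_name)))
    print(shlex.join(create_command(image_name)))
```

Compared with a single podman build, this is at least two extra steps after the scratch build itself has finished.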
Due to this difficulty of untangling the image build from the Fedora infrastructure, we haven’t published the sources of the fedora-toolbox image for recent Fedora versions upstream. We do have a fedora-toolbox:39 image defined through a Container/Dockerfile, but that was done purely as a contingency during the Fedora 39 development cycle.
This does degrade the developer experience of working on the fedora-toolbox image, but, given all the other advantages, we think that it’s worth it.
As of this writing, there’s a Fedora 40 Change to switch to using KIWI to build the OCI images, including fedora-toolbox, instead of Image Factory. KIWI seems more strongly maintained and a lot easier to set up locally, which is fantastic. So, it should be all rainbows and unicorns, once we soldier through another port of the fedora-toolbox image to different tooling and a different source language.
Acknowledgements
Last but not least, getting all this done on time required a good deal of co-ordination and help from several different individuals. I must thank Sumantro for leading the effort; Kevin, Tomáš and Samyak for all the infrastructure and release engineering work; and Adam and Kamil for all the testing and validation.