Nix on the argon hpc
How to set up nix on the University of Iowa’s high performance compute cluster.
Preface
I rather like nix as a tool for managing software dependencies. I use it for all of my projects because I like the isolation it provides and because it is mostly the case that “if it works today it will work tomorrow”1. I recently needed to use the University of Iowa’s compute cluster, Argon, and really wanted to use nix to manage my dependencies there. There were some little gotchas, so I figured I would document this for future travelers.
If you are an argon user and don’t want to read more, I have a script! It may work for other clusters, but I do not know.
Local Nix
So the biggest problem is that we do not have root permissions on argon, so we cannot create /nix. Luckily, since version 2.10, nix will automatically fall back to a chroot store:
On Linux, if /nix doesn’t exist and cannot be created and you’re not running as root, Nix will automatically use ~/.local/share/nix/root as a chroot store. This enables non-root users to download the statically linked Nix binary and have it work out of the box…
Great, so all we have to do is download the statically linked Nix that they refer to. Except there is no mention (that I can find anyway) of where to find this statically linked nix. There is an issue open to try and make this more discoverable, but I wouldn’t hold my breath. Long story short: you have to go to hydra, their CI engine, and download it from there. In particular:
https://hydra.nixos.org/job/nix/maintenance-2.20/buildStatic.x86_64-linux/latest/download-by-type/file/binary-dist
seems to be the url which resolves to the latest 2.20 build.
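A minimal sketch of the download step, assuming curl is available on the login node. The helper name fetch_static_nix and the destination directory are hypothetical; only the URL comes from above.

```shell
# URL from the text above; it redirects to the latest 2.20 static build.
NIX_STATIC_URL="https://hydra.nixos.org/job/nix/maintenance-2.20/buildStatic.x86_64-linux/latest/download-by-type/file/binary-dist"

# Hypothetical helper: download the static binary into a directory
# and mark it executable.
fetch_static_nix() {
    dest_dir="$1"
    mkdir -p "$dest_dir" || return 1
    # -f: fail on HTTP errors, -L: follow redirects
    curl -fL "$NIX_STATIC_URL" -o "$dest_dir/nix" || return 1
    chmod +x "$dest_dir/nix"
}
```

Something like `fetch_static_nix "$HOME/bin"` followed by `export PATH="$HOME/bin:$PATH"` would then put the binary on the path, with `$HOME/bin` being just one possible location.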
So we download this, put it somewhere on our $PATH, and everything just works… right? Wrong. Here is a rapid-fire list of problems and their solutions.
- On argon, for reasons, our home directory is a symlink. nix does not like this. To fix this we need to export HOME=$(realpath "$HOME")
- nix has to know where to find ssl certs for some reason. This is done with the NIX_SSL_CERT_FILE environment variable. On argon, and probably for many others, these certs are in /etc/ssl/certs/ca-bundle.crt
- We need to symlink all the nix tools to point to the nix binary we download. For example we need nix-collect-garbage to point to nix
- A modern git is not available by default, which is an issue if you want to use flakes. The easy solution is to enable the git module
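The fixes above can be sketched as a snippet to source before using nix. This is an illustration under assumptions, not the script mentioned earlier: the helper link_nix_tools, the `$HOME/bin` location, the subset of tool names, and the module name `git` are all placeholders to adapt.

```shell
# Resolve the symlinked home directory so nix sees a real path.
export HOME="$(realpath "$HOME")"

# Certificate bundle location on argon; other clusters may differ.
export NIX_SSL_CERT_FILE=/etc/ssl/certs/ca-bundle.crt

# Hypothetical helper: create nix-* symlinks next to the nix binary.
# The tool list is an illustrative subset, not exhaustive.
link_nix_tools() {
    bin_dir="$1"
    for tool in nix-build nix-shell nix-env nix-store \
                nix-instantiate nix-collect-garbage; do
        # Relative link, so the directory can be moved as a unit.
        ln -sf nix "$bin_dir/$tool" || return 1
    done
}

# Only link when the binary is there ($HOME/bin is an assumed location).
if [ -x "$HOME/bin/nix" ]; then
    link_nix_tools "$HOME/bin"
fi

# The module name "git" is an assumption; check `module avail` first.
if command -v module >/dev/null 2>&1; then
    module load git || echo "could not load the git module" >&2
fi
```

Because the static binary decides what to do from the name it was invoked under, the symlinks are all that is needed for commands like nix-collect-garbage to work.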
If you don’t want to manage all of that, the aforementioned script to do all this still exists.
Issues
This solution is good enough for what I’m doing at the moment, however there are some downsides you should probably know about:
- The nix store is in your home directory. Accessing it is slow on compute nodes because it happens over the network
- The compilers in nix will not take advantage of the hardware as well as compilers provided by the cluster might be able to
- The chroot is only activated when a nix command is called. This is… fine, but it does mean if you submit a job from within a nix shell the job has to also call nix. There might be a clever way around this by doing something in your bashrc or writing an lmod module but I have not tried
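One way to have a job call nix itself is a small wrapper used inside the job script. This is only a sketch: run_in_nix and NIX_BIN are hypothetical names, and it assumes the job runs in a directory containing a flake with a dev shell and that the nix-command and flakes experimental features are enabled.

```shell
# Hypothetical wrapper: run a command inside the project's nix dev shell.
# NIX_BIN is an illustrative override, defaulting to the nix on $PATH.
run_in_nix() {
    "${NIX_BIN:-nix}" develop --command "$@"
}
```

A job script would then call, for example, `run_in_nix python analysis.py` (a made-up command) after setting HOME and NIX_SSL_CERT_FILE as described earlier, so the chroot store is entered on the compute node.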
My biggest complaint is caching. I want to be able to build stuff on my local machine and upload it to Argon. I tried several times to make this a reality but it just wouldn’t work. I have put a pin in that and hope to figure it out eventually, but for now Argon has to compile code or download it itself.
Other solutions
Overall I am pretty happy with this solution and intend to stick with it for the foreseeable future. However, there are other possible solutions for getting code managed by nix to run on a system where you do not have permission to install nix. I leave you with a short sampling of others I ran into while figuring this all out.
Nix bundle
Nix bundle sounds like the perfect solution. In theory the workflow should look like:
- Run nix bundle
- Copy the resulting file to argon
- Run the file
- Profit
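As a rough sketch, the first two steps could look like the helper below. The name bundle_and_copy, the NIX_BIN and SCP_BIN overrides, and the output name are hypothetical, and it assumes the nix-command and flakes experimental features are enabled on the local machine.

```shell
# Hypothetical helper: bundle an installable locally, then copy the
# result to a remote host. NIX_BIN and SCP_BIN are illustrative overrides.
bundle_and_copy() {
    installable="$1"   # e.g. a flake reference such as nixpkgs#hello
    remote="$2"        # e.g. user@host:some/dir/ (placeholder)
    out="./bundle-out"
    # --out-link names the symlink that points at the bundled executable.
    "${NIX_BIN:-nix}" bundle --out-link "$out" "$installable" || return 1
    # scp follows the symlink and copies the file it points to.
    "${SCP_BIN:-scp}" "$out" "$remote"
}
```

The copied file is then run directly on the cluster.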
And it does work! The problem is that the resulting executable is sandboxed away from the system. This is probably the right behavior, but it stopped me from doing what I needed to, and I couldn’t find any way around it. This issue talks more about it. Oh, also the resulting executable takes forever to start.
Nix portable
Local nix requires that user namespaces be enabled. If user namespaces are not enabled one could use nix-portable or hpc-nix. Without user namespaces you pay a performance cost on building derivations.
Containers
Argon supports apptainer, which itself supports OCI images. Nix can build OCI images2. In theory it should be relatively simple (famous last words) to build a containerized version of your workflow with nix. I have not personally gone down this road so I do not know how bumpy it is.
Nixie
My understanding is nixie tries to give you a nix script which you can ship with your code which will defer to nix if it is installed and if not go through the song and dance I put above. I have not tried it but it came up in my search so I figured I would mention it.
1. If you ever have to run ancient software, nix can even make this possible: https://blinry.org/nix-time-travel/
2. There is even an official tutorial for building said containers.