WSL Setup — systemd, journald, .wslconfig, kernel

A hygiene guide for running GPU / ML workloads on Windows 11 + WSL2 (Ubuntu 22.04) + NVIDIA GPU. Skipping these steps means crashes leave no diagnostic trail and the VM may be killed under memory pressure mid-run.


1. Per-distro /etc/wsl.conf — enable systemd

Why systemd is required

Without systemd as PID 1, journald cannot run, nvidia-persistenced cannot be managed as a service, and tools like systemctl hard-fail:

System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down

That message is the canonical sign that systemd is off.

Steps

Inside Ubuntu (requires sudo):

sudo tee /etc/wsl.conf > /dev/null <<'EOF'
[boot]
systemd=true
EOF

Then, from PowerShell on the Windows side (no elevation is needed), shut down the WSL VM, which stops every running distro, and re-enter:

wsl --shutdown
wsl

Verification

ps -p 1 -o comm=

Expected output: systemd
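
The PID 1 check confirms the running state. A complementary check is whether /etc/wsl.conf itself carries the setting, which is also worth knowing before running the tee command from Steps, since tee replaces the whole file. The following is a minimal sketch; the function name wslconf_systemd_enabled is hypothetical, and it assumes the exact lowercase spelling systemd=true inside a [boot] section.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of WSL): succeeds if FILE contains
# systemd=true inside its [boot] section. Defaults to /etc/wsl.conf.
wslconf_systemd_enabled() {
    local file="${1:-/etc/wsl.conf}"
    [ -r "$file" ] || return 1
    awk '
        { sub(/\r$/, "") }                      # tolerate CRLF line endings
        /^[ \t]*\[/ {                           # any section header
            in_boot = ($0 ~ /^[ \t]*\[boot\][ \t]*$/)
            next
        }
        in_boot && /^[ \t]*systemd[ \t]*=[ \t]*true[ \t]*$/ { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$file"
}
```

Usage: wslconf_systemd_enabled && echo "systemd already enabled in /etc/wsl.conf". If the file already holds other sections, editing it by hand avoids losing them to the overwrite.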


2. Persistent journald

Why

With journald's default Storage=auto, entries are written to /run/log/journal, a RAM-backed tmpfs, whenever /var/log/journal does not exist, so the logs vanish on reboot or VM restart. To read the previous boot's log with journalctl -b -1 after a crash, journald needs a persistent directory on the WSL virtual disk.

Steps

sudo mkdir -p /var/log/journal
sudo systemd-tmpfiles --create --prefix /var/log/journal
sudo systemctl restart systemd-journald
journalctl --disk-usage   # confirm output shows > 0 B

Verification

After the next reboot:

journalctl -b -1 --no-pager | tail -20

This should return log lines rather than an error such as "No such boot ID in journal".
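
To see at a glance which storage mode is in effect without waiting for a reboot, the directory test can be scripted. This is a rough sketch that assumes journald's default Storage=auto behaviour; the function name journal_storage_location and its optional root-prefix argument are hypothetical and exist only to make the check easy to try against a scratch directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report where journald would keep logs under ROOT
# (default: the real filesystem root), assuming Storage=auto.
journal_storage_location() {
    local root="${1:-}"
    if [ -d "$root/var/log/journal" ]; then
        echo "persistent"      # survives reboot / wsl --shutdown
    elif [ -d "$root/run/log/journal" ]; then
        echo "volatile"        # tmpfs only, lost on restart
    else
        echo "none"            # journald has no journal directory here
    fi
}
```

Usage: journal_storage_location prints persistent once the steps above have been applied.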


3. Per-host %USERPROFILE%\.wslconfig baseline

This file lives on the Windows side and controls VM-level resource limits for all WSL2 distros on the host.

Reference path: C:\Users\<YourUserName>\.wslconfig

[wsl2]
vmIdleTimeout=-1
memory=28GB
swap=16GB

Adjust memory= based on your total RAM — the example above reserves ~4 GB for Windows on a 32 GB system.
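
The arithmetic behind memory= can be sketched as a small generator. The function name suggest_wslconfig is hypothetical, and the 4 GB reserve and 16 GB swap defaults simply mirror the example above; they are starting points, not tuned recommendations.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a [wsl2] block sized from total host RAM.
#   $1 = total host RAM in GB (required)
#   $2 = GB to leave for Windows (default 4, as in the example above)
#   $3 = swap size in GB (default 16, as in the example above)
suggest_wslconfig() {
    local total_gb="$1" reserve_gb="${2:-4}" swap_gb="${3:-16}"
    local mem_gb=$(( total_gb - reserve_gb ))
    if [ "$mem_gb" -lt 1 ]; then
        echo "error: a ${reserve_gb} GB reserve leaves no memory for WSL" >&2
        return 1
    fi
    printf '[wsl2]\nvmIdleTimeout=-1\nmemory=%dGB\nswap=%dGB\n' "$mem_gb" "$swap_gb"
}
```

Usage: suggest_wslconfig 32 reproduces the block above, and suggest_wslconfig 64 8 would leave 8 GB for Windows on a 64 GB host. The output is meant to be reviewed and pasted into .wslconfig by hand.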

Why each line matters

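| Setting | Value | Rationale |
| --- | --- | --- |
| vmIdleTimeout=-1 | Never | By default WSL shuts the VM down after it has been idle for 60 s (the setting is in milliseconds; the documented default is 60000). During a multi-minute GPU inference run the generator process may keep the GPU busy while the CPU stays quiet, and WSL can treat the VM as idle and stop it mid-run. Setting -1 disables the timeout. |
| memory=28GB | 28 GB | WSL's documented default cap is 50% of host RAM (16 GB on a 32 GB machine). Setting the cap explicitly lets ML workloads use more while still reserving ~4 GB for Windows, so the VM cannot grow to the point where Windows itself is starved of memory. |
| swap=16GB | 16 GB | Loading model weight tensors can temporarily spike above the memory= cap. A swap file lets these spills land on disk instead of triggering the Linux OOM killer. |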

After editing .wslconfig, apply from PowerShell:

wsl --shutdown
wsl

4. Crash-forensics quick reference

Use these commands immediately after a crash or unexpected VM restart.

Inside WSL

# Last 100 kernel messages with human-readable timestamps
dmesg --ctime | tail -100

# Previous boot log (requires persistent journald from section 2)
journalctl -b -1 --no-pager | tail -200

# OOM / killed-process probe
dmesg | grep -iE 'oom|killed process|out of memory'
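
When the OOM probe returns many lines, it can help to reduce them to just the victims. The filter below is a minimal sketch that assumes the kernel's usual "Killed process <pid> (<name>)" wording; the function name oom_victims is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read kernel log text on stdin and print
# "<pid> <process-name>" for every OOM-killer victim it finds.
oom_victims() {
    sed -nE 's/.*Killed process ([0-9]+) \(([^)]*)\).*/\1 \2/p'
}
```

Usage: dmesg | oom_victims, or journalctl -k -b -1 --no-pager | oom_victims for the previous boot once the journal is persistent.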

From PowerShell (Hyper-V / WSL host events)

Get-WinEvent -FilterHashtable @{
    LogName = 'Microsoft-Windows-Hyper-V-Worker-Operational'
    Level   = 1, 2, 3   # Critical, Error, Warning
} | Select-Object -First 20 | Format-List TimeCreated, Message

dxgkio errors in dmesg

If dmesg shows lines containing dxgkio_query_adapter_info or dxgkio_reserve_gpu_va, they indicate a GPU-passthrough negotiation error between the WSL kernel and the Windows dxgkrnl driver stack. A stale WSL kernel is the most common cause; see section 5.
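
To gauge how often each of these errors occurs, the matching symbols can be tallied. This is a rough sketch; the function name dxgkio_summary is hypothetical, and it counts any dxgkio_* symbol in the input, not only the two named above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read kernel log text on stdin and print
# "<dxgkio symbol> <count>" per distinct symbol, sorted by name.
# Prints nothing when no dxgkio_* symbol appears.
dxgkio_summary() {
    grep -oE 'dxgkio_[a-z0-9_]+' | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Usage: dmesg | dxgkio_summary. Comparing the counts before and after a kernel update gives a quick indication of whether the update helped.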


5. WSL kernel currency

An outdated WSL kernel is a common silent crash cause when paired with newer NVIDIA drivers. The 5.15.x series pre-dates the GPU-passthrough stability fixes that the 6.6.x line delivers.

Field-tested note: updating the kernel from 5.15.146.1 → 6.6.87.2 via wsl --update stabilised SDXL GPU passthrough that was previously crashing silently.

Check and update

# Check current kernel version (run these from PowerShell on the Windows side)
wsl --version

# Pull the latest WSL kernel from Microsoft
wsl --update

# Restart to load the new kernel
wsl --shutdown
wsl

Verification

wsl --version

The Kernel version: line should report 6.6.x or later.
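
Inside the distro, uname -r reports the same kernel, and the "6.6.x or later" test can be scripted. The sketch below compares major and minor numerically, since a plain string comparison would rank 6.10 below 6.6; the function name kernel_at_least is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if VERSION (e.g. the output of uname -r)
# is at least WANT_MAJOR.WANT_MINOR. Returns 2 if VERSION cannot be parsed.
kernel_at_least() {
    local version="$1" want_major="$2" want_minor="$3"
    local major rest minor
    major="${version%%.*}"            # text before the first dot
    rest="${version#*.}"              # text after the first dot
    minor="${rest%%[!0-9]*}"          # leading digits of the remainder
    case "$major" in ''|*[!0-9]*) return 2 ;; esac
    [ -n "$minor" ] || return 2
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}
```

Usage: kernel_at_least "$(uname -r)" 6 6 || echo "WSL kernel older than 6.6, run wsl --update from PowerShell".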


Combined verification block

Run all checks in one pass after completing sections 1–5:

# 1. Kernel version (expect 6.6.x or later)
uname -r

# 2. systemd as PID 1 (expect: systemd)
ps -p 1 -o comm=

# 3. Persistent journal disk usage (expect: > 0 B)
journalctl --disk-usage

# 4. vmIdleTimeout present in .wslconfig (this reads the Windows-side file,
#    not the running VM's state). %USERPROFILE% is resolved through cmd.exe
#    and converted with wslpath, so a profile outside C:\Users also works.
grep -i vmIdleTimeout "$(wslpath "$(cmd.exe /C "echo %USERPROFILE%" 2>/dev/null | tr -d '\r')" 2>/dev/null)/.wslconfig" 2>/dev/null \
  || echo "WARNING: .wslconfig not found or vmIdleTimeout not set"

All four checks should succeed before running any GPU-heavy workload.
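
Check 4 only confirms that the key is present. To read back the configured value as well, a small parser can help, since a .wslconfig edited on Windows may have CRLF line endings. This is a minimal sketch; the function name wslconfig_get is hypothetical, it ignores section headers, and it treats lines starting with # as comments.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of KEY from a .wslconfig-style FILE.
# Key matching is case-insensitive; exits non-zero if the key is absent.
wslconfig_get() {
    local file="$1" key="$2"
    awk -F= -v key="$key" '
        { sub(/\r$/, "") }                 # strip Windows line endings
        /^[ \t]*#/ { next }                # skip comment lines
        {
            k = $1
            gsub(/[ \t]/, "", k)           # key with whitespace removed
            if (tolower(k) == tolower(key)) {
                v = $0
                sub(/^[^=]*=/, "", v)      # drop "key="
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
                found = 1
                exit                       # jumps to END
            }
        }
        END { exit(found ? 0 : 1) }
    ' "$file"
}
```

Usage: pass the /mnt/c path of your .wslconfig, for example wslconfig_get /mnt/c/Users/<YourUserName>/.wslconfig vmIdleTimeout, and compare the result against -1.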