linux / local root exploit / module vetting

  • Written by
    Walter Doekes
  • Published on

Recently, we were greeted with the Copy Fail Linux kernel vulnerability. Mitigating this was a matter of denylisting a module. But, only eight days later, there was another exploit, also (ab)using AF_ALG and kernel module autoloading. I'm betting this is not the last, now that the kernel is scrutinized using AI models that keep getting more advanced.

Luckily, we had our machine inventory up to date. So when CVE-2026-31431 ("Copy Fail") came along, deploying a mitigation was a matter of:

  • Creating /etc/modprobe.d/cve-2026-31431.conf everywhere, with:

    install algif_aead /bin/false
    
  • checking our loaded module inventory (the os.kernel GoCollect collector collects this for us) to see if af_alg, algif_aead or authencesn was already loaded anywhere;

  • and lastly, testing that the specific exploit is now mitigated:

    $ python -c 'from socket import *;s=socket(AF_ALG,SOCK_SEQPACKET);s.bind(("aead","authencesn(hmac(sha256),cbc(aes))"));print("metsys elbarenluv a si siht ,tihs"[::-1])'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    FileNotFoundError: [Errno 2] No such file or directory
    

    An error is good: it means the exploit won't work.
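
On a host without a central inventory, the loaded-module check from the second step can be approximated locally. This is a minimal sketch, assuming the usual /proc/modules layout with the module name in the first column. The function name loaded_any and the MODULES_FILE override are made up for this example.

```sh
#!/bin/sh
# Hypothetical helper: print which of the given modules are currently loaded.
# MODULES_FILE is an assumption added for testing; default is /proc/modules.
loaded_any() {
    modules_file=${MODULES_FILE:-/proc/modules}
    for wanted in "$@"; do
        # The first column of /proc/modules is the module name.
        while read -r name _rest; do
            if test "$name" = "$wanted"; then
                echo "$name"
                break
            fi
        done < "$modules_file"
    done
}
```

Calling loaded_any af_alg algif_aead authencesn prints one line per module that is loaded; no output means none of them are.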

Locking down autoloading

I remembered we had been discussing locking down the kernels further, and specifically locking down the loading of (normally) unused modules. Because we expect more bugs to be found in other modules, we'd rather stay ahead of the game and reject them beforehand.

Right now, there seem to be two ways to handle auto-loading:

  1. Disabling all module loading, explicit and implicit alike, using the kernel.modules_disabled sysctl;
  2. Allowing all module loading, including implicit module loading by unprivileged users: any user calling e.g. socket(AF_ALG) can get certain modules loaded into the kernel.

Obviously, disabling all unneeded modules or disabling module loading altogether seems like the most secure fix. But, we never got around to the tedious work of figuring out which modules we actually need.

And, locking down module loading once the system is up is nice. But do you really know when it is fully up? Maybe your Ceph daemonsets inside your Kubernetes cluster hadn't started yet, and now you've locked down the modules before loading ceph.
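
One way to reduce that risk is to flip the switch only after a known set of modules shows up as loaded. The sketch below is rough and still assumes you know that set up front, which is the hard part. The function name and the MODULES_FILE and TOGGLE_FILE overrides are illustrative only. Keep in mind that kernel.modules_disabled cannot be set back to 0 without a reboot.

```sh
#!/bin/sh
# Hypothetical sketch: set kernel.modules_disabled=1 only once every
# required module is loaded. Both paths can be overridden for testing.
lockdown_when_ready() {
    modules_file=${MODULES_FILE:-/proc/modules}
    toggle_file=${TOGGLE_FILE:-/proc/sys/kernel/modules_disabled}
    for required in "$@"; do
        # Match the module name in the first column of /proc/modules.
        if ! grep -q "^$required " "$modules_file"; then
            echo "not locking down: $required is not loaded yet" >&2
            return 1
        fi
    done
    # One-way switch: stays at 1 until the next reboot.
    echo 1 > "$toggle_file"
}
```

Something like lockdown_when_ready ceph libceph could then run from a timer or a late boot unit, retrying until it succeeds.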

Disallowing at least non-root users from (implicit) module loading sounds like a useful mitigation, but the kernel does not support any modules_autoload_mode. Apparently Linus decided against it. And maybe it is too hard to reason about these permissions when there are also namespaces at play.

So, is there another middle ground?

Module vetting

Can we allowlist modules without loading them beforehand?

Yes, we can. If we put install * /bin/false in /etc/modprobe.d/zz-denylist.conf, that gets loaded last and rejects anything that is not previously allowed.

Allowlisting modules is then a matter of adding many, many lines of this:

install foo /sbin/modprobe --ignore-install foo
install bar /sbin/modprobe --ignore-install bar
install baz /sbin/modprobe --ignore-install baz

Make sure they are loaded earlier, by using a lexicographically earlier filename, like /etc/modprobe.d/00-allowlist.conf.
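
Writing those lines by hand gets old quickly, so they could be generated from a plain list of module names. This is a small sketch, not part of the original setup. The function name make_allowlist and the one-name-per-line input format are assumptions.

```sh
#!/bin/sh
# Hypothetical generator: turn module names on stdin (one per line) into
# modprobe.d allowlist entries on stdout.
make_allowlist() {
    while read -r module; do
        # Skip blank lines and comment lines.
        case "$module" in
            ''|'#'*) continue;;
        esac
        echo "install $module /sbin/modprobe --ignore-install $module"
    done
}
```

For example, make_allowlist < modules.txt > /etc/modprobe.d/00-allowlist.conf, where modules.txt is whatever list you maintain.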

The hard part

The hard part is knowing which modules we need. As mentioned above, we get os.kernel loaded-module info from GoCollect, so we have a good idea which modules we probably need.

Figuring out which modules we need is a tedious task. But if we simply look at the currently loaded modules on our fleet, we see that fewer than 600 distinct modules are loaded in total across all machines, of differing types. In the most pessimistic scenario, a single machine would still use only 10% of the total modules available. So allowing those while denying the rest greatly reduces the number of modules available to attack.
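
If you have no inventory tool, a similar fleet-wide list can be built from copies of /proc/modules collected per host. This is a minimal sketch under that assumption. The function name union_of_modules and the idea of one dump file per host are hypothetical.

```sh
#!/bin/sh
# Hypothetical sketch: merge per-host copies of /proc/modules into one
# sorted list of unique module names.
union_of_modules() {
    # Field 1 of each line is the module name.
    cat "$@" | cut -d' ' -f1 | sort -u
}
```

Piping the result through wc -l and comparing it with find /lib/modules/"$(uname -r)" -name '*.ko*' | wc -l gives a rough idea of how much of the available module set is actually in use.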

Assuming we now covered which modules we need, can we make it smarter?

kernel.modprobe

Yes, instead of hardcoding the list in configuration files, we can put them in a script. By using the kernel.modprobe sysctl setting, we can create a wrapper that does the vetting for our allowlist.

This wrapper script denies auto-load of certain modules: it does not disable insmod or (explicit) modprobe directly. This way it exactly targets the nonprivileged users we're trying to block, while still allowing the admin to load additional modules by hand if needed.

When the kernel tries to auto-load a module, it doesn't necessarily call /sbin/modprobe. It calls the executable in the kernel.modprobe sysctl — which we override as /usr/local/sbin/vetted-modprobe. That script gets called with arguments -q -- some_module and it can decide whether to honour the request or not.

Note that the kernel calls the script. You cannot decide which process or user gets permissions, but you can choose which module is allowed.

/usr/local/sbin/vetted-modprobe

Instead of doing many lines in /etc/modprobe.d/00-allowlist.conf, we create a /usr/local/sbin/vetted-modprobe wrapper:

#!/bin/sh
# Requires: sysctl kernel.modprobe=/usr/local/sbin/vetted-modprobe
set -u

log() {
    if test -t 2; then echo "$0: $*" >&2; fi
    logger -t vetted-modprobe -p auth.notice "$*"
}

# We assume we're called as "-q -- MODULE_LIST" -- process them one by one.
if test $# -lt 3 || test "$1" != '-q' || test "$2" != '--'; then
    log "unexpected args: $0 $*"
    exit 1
fi
shift; shift  # drop "-q --"

# This may either give us an error:
# - modprobe: FATAL: Module foobar not found in directory /lib/modules/6.8.0...
# Or one or more suggested modules to load:
# - insmod /lib/modules/6.8.0-87-generic/kernel/crypto/af_alg.ko.zst
# - insmod /lib/modules/6.8.0-87-generic/kernel/crypto/algif_aead.ko.zst
plan=$(/sbin/modprobe -n -v -- "$@" 2>&1)
ret=$?

if test $ret -ne 0; then
    log "modprobe -n failed for '$*': $plan"
    exit $ret
fi

if test -z "$plan"; then
    exit 0
fi

# Collect the names of all to-be-loaded modules that are not on the allowlist.
unvetted=$(printf '%s\n' "$plan" | while read -r action filename; do
    test "$action" = insmod || continue
    filename=${filename##*/}; filename=${filename%.ko*}
    case "$filename" in
        # NOTE: Any aliases have been resolved (like net-pf-38 => af_alg).
        #
        # vv-------- EXAMPLES HERE --------vv
        # Some modules:
        allowed_module1|allowed_module2);;
        # More modules:
        mod_foo|mod_bar|mod_baz);;
        #
        # Explicitly _not_ allowed:
        # - "copy.fail"
        # algif_aead);;
        # - "dirty.frag"
        # esp4|esp6|rxrpc);;
        # ^^-------- EXAMPLES HERE --------^^
        #
        # NOTE: The unmatched (unvetted) modules are echoed.
        *) echo "$filename";;
    esac
done)

if test -n "$unvetted"; then
    log "deny 'modprobe -q -- $*'; because unvetted '$unvetted'"
    exit 1
fi

exec /sbin/modprobe -q -- "$@"

That's the gist of the script. Only auto-loading of the modules in the case statement is allowed. If you try to load an unvetted module, it gets rejected with the following log message:

$ sudo journalctl -t vetted-modprobe --facility auth
deny 'modprobe -q -- algif-skcipher'; because unvetted 'algif_skcipher'
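
The name extraction in the wrapper relies on two parameter expansions. The sketch below isolates that step so it can be tried on its own; the function name module_names_from_plan is made up, and the sample input is the modprobe -n -v output quoted in the script comments.

```sh
#!/bin/sh
# Sketch of the name extraction used in the wrapper: keep only "insmod"
# lines, then strip the directory and the ".ko" suffix (including any
# compression suffix) from the path.
module_names_from_plan() {
    while read -r action filename; do
        test "$action" = insmod || continue
        filename=${filename##*/}    # drop leading directories
        filename=${filename%.ko*}   # drop .ko, .ko.zst, .ko.xz, ...
        echo "$filename"
    done
}
```

Feeding it the line insmod /lib/modules/6.8.0-87-generic/kernel/crypto/af_alg.ko.zst yields af_alg, which is the form matched by the case statement.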

Which modules are used?

As mentioned, the hard part is deciding which modules to allow. The script itself is easy. The list I compiled today has fewer than 600 modules in it (including modules that are not available in all kernels), so it cuts down the number of loadable modules by a big margin.

The following list goes as contents of the case statement above. You should tweak this to your liking. OBSERVE: The allowlisted modules are matched without action. The rest gets the echo "$filename" treatment and gets rejected.

CAVEAT EMPTOR: These modules are NOT necessarily safe from exploits. But they are actively in use (in our systems), and they account for less than 10% of total modules, so we massively cut down the attack space.

        # NOTE: Any aliases have been resolved (like net-pf-38 => af_alg).
        # Seen everywhere:
        8250_dw|acpi_ipmi|acpi_pad|acpi_power_meter|acpi_tad|aesni_intel);;
        af_packet_diag|ahci|amd64_edac|ast|autofs4|binfmt_misc|bonding);;
        br_netfilter|bridge|btrfs|ccp|cdc_ether|cec|cfg80211|coretemp);;
        crc32_pclmul|crct10dif_pclmul|cryptd|crypto_simd|dmi_sysfs);;
        drm|drm_kms_helper|drm_ttm_helper|drm_vram_helper|edac_mce_amd);;
        ee1004|efi_pstore|failover|fb_sys_fops|floppy|ghash_clmulni_intel);;
        hid|hid_generic|i2c_algo_bit|i2c_i801|i2c_piix4|i2c_smbus);;
        ib_core|ib_uverbs|icp|idma64|ie31200_edac|inet_diag|input_leds);;
        intel_cstate|intel_lpss|intel_lpss_pci|intel_pch_thermal);;
        intel_powerclamp|intel_rapl_common|intel_rapl_msr|intel_tcc_cooling);;
        ip6_tables|ip6_udp_tunnel|ip6t_REJECT|ip6table_filter);;
        ip6table_mangle|ip6table_raw|ip_set|ip_set_hash_ip|ip_set_hash_net);;
        ip_tables|ipmi_devintf|ipmi_msghandler|ipmi_si|ipmi_ssif|ipt_REJECT);;
        ipt_rpfilter|iptable_filter|iptable_mangle|iptable_nat|iptable_raw);;
        irqbypass|joydev|k10temp|kvm|kvm_amd|kvm_intel|libahci|libcrc32c|llc);;
        mac_hid|macsec|mei|mei_me|mii);;
        mlx5_core|mlx5_dpll|mlx5_ib|mlxfw|mptcp_diag);;
        net_failover|netlink_diag|nf_conntrack|nf_conntrack_netlink);;
        nf_defrag_ipv4|nf_defrag_ipv6|nf_log_syslog|nf_nat|nf_reject_ipv4);;
        nf_reject_ipv6|nf_socket_ipv4|nf_socket_ipv6|nf_tables);;
        nf_tproxy_ipv4|nf_tproxy_ipv6|nfnetlink|nfnetlink_acct|nfnetlink_log);;
        nft_chain_nat|nft_compat|nft_counter|nls_iso8859_1);;
        nvme|nvme_auth|nvme_core|nvme_fabrics|nvme_keyring|overlay);;
        pci_hyperv_intf|pinctrl_cannonlake|polyval_clmulni|polyval_generic);;
        psample|psmouse|ptdma|raid6_pq|rapl|raw_diag|rc_core|rndis_host);;
        sch_fq_codel|serio_raw|sha1_ssse3|sha256_ssse3|spl|stp);;
        syscopyarea|sysfillrect|sysimgblt|tcp_diag|tls|ttm);;
        udp_diag|udp_tunnel|unix_diag|usbhid|usbnet|veth|video|wmi|wmi_bmof);;
        x86_pkg_temp_thermal|x_tables|xfrm_algo|xfrm_user);;
        xhci_pci|xhci_pci_renesas|xor|xsk_diag|zavl|zcommon);;
        zfs|zlua|znvpair|zunicode|zzstd);;
        # Seen on many systems (30+):
        8021q|amdgpu|amdxcp|async_memcpy|async_pq|async_raid6_recov|async_tx);;
        async_xor|blake2b_generic|bnxt_en|bochs|bpfilter|chacha_x86_64);;
        cls_bpf|cmdlinepart|curve25519_x86_64|dca|drm_buddy);;
        drm_display_helper|drm_exec|drm_suballoc_helper|dummy);;
        ebtable_filter|ebtables|garp|glue_helper|gpu_sched|igb);;
        intel_pmc_core|intel_uncore_frequency|intel_uncore_frequency_common);;
        intel_vsec|ioatdma|ip6table_nat|ip_tunnel|ipip|jc42|libceph);;
        libchacha|libchacha20poly1305|libcurve25519_generic|linear);;
        lp|lpc_ich|mrp|mtd|multipath|nbd|nfit|nft_limit|nft_log);;
        parport|pata_acpi|pcspkr|pmt_class|pmt_telemetry|poly1305_x86_64);;
        qemu_fw_cfg|raid0|raid1|raid10|raid456|rbd|sb_edac|sch_ingress);;
        sctp|skx_edac_common|softdog|spi_intel|spi_intel_pci|spi_nor|sunrpc);;
        tap|tunnel4|usbmouse|vga16fb|vgastate|vhost|vhost_iotlb|vhost_net);;
        vmgenid|vxlan|wireguard|xfs|xhci_hcd);;
        # Seen on GPU systems:
        drm_gpuvm|nvidia|nvidia_drm|nvidia_modeset|nvidia_uvm);;
        # Seen on mgmt/storage systems:
        aufs|authenc|bluetooth|bochs_drm|ceph|cpuid|crc8|ecc|ecdh_generic);;
        ftdi_sio|fscache|gnss|i40e|ice|intel_qat|irdma|isci|isst_if_common);;
        libsas|mgag200|msr|netfs|qat_c62x);;
        scsi_transport_iscsi|scsi_transport_sas|ses|usbserial|vmd);;
        iommufd|pl2303|pnd2_edac|qat_c3xxx|vfio|vfio_iommu_type1);;
        vfio_pci|vfio_pci_core);;
        # Seen on storage systems:
        cxl_acpi|cxl_core|cxl_port|dax_hmem|enclosure|iaa_crypto);;
        idxd|idxd_bus|intel_ifs|intel_sdsi|mpt3sas|pfr_telemetry|pfr_update);;
        pinctrl_emmitsburg|qat_4xxx);;
        # Seen on NAT gateways or load balancers:
        cls_matchall|cls_u32|tcp_bbr);;
        # Seen on ci-runners (why?):
        af_alg|algif_rng);;
        # Seen on older Cumulus switches (common):
        ablk_helper|accton_as7326_56x_platform|acpi_cpufreq|aes_x86_64|at24);;
        cpr4011|crc32c_intel|cumulus_platform|dm_mod|ebt_police|ebt_setclass);;
        eeprom_class|efivarfs|efivars|fuse|gf128mul|gpio_ich|hwmon);;
        i2c_core|i2c_dev|i2c_ismt|i2c_mux|i2c_mux_pca954x);;
        iTCO_vendor_support|iTCO_wdt|ixgbe|kernel_bde|knet|lm75);;
        loop|lrw|mdio|mfd_core|mpls_iptunnel|mpls_router);;
        nf_conntrack_ipv4|nf_nat_ipv4|pmbus_core|sff_8436_eeprom|shpchp|tg3);;
        tpm|tpm_tis|tun|user_bde|vrf);;
        # Seen on older Cumulus switches (rare):
        accton_as7726_32x_platform|arp_tables|arptable_filter);;
        delta_ag5648v1_platform|delta_ag9032v2_platform|dps460|emc2305);;
        gpio_pca953x|ipmi_poweroff|quanta_ix7_cpld|quanta_ix7_platform);;
        quanta_ix8_cpld|quanta_ix8_platform|quanta_ly4r_platform|thermal);;
        tpm_crb|vhwmon);;
        # Seen on IPsec:
        # (Do check if esp4 makes you vulnerable to "Dirty Frag".)
        echainiv|esp4|nf_conntrack_ftp|nf_conntrack_irc|tunnel6);;
        xfrm6_tunnel|xfrm_interface|xt_policy);;
        # Seen on PVEs:
        act_police|amd_atl|bnxt_re|cls_basic|drm_panel_backlight_quirks);;
        drm_shmem_helper|ehci_hcd|ehci_pci|fwctl|i10nm_edac|isst_if_mbox_pci);;
        iscsi_tcp|isst_if_mmio);;
        libiscsi|libiscsi_tcp|mlx5_fwctl|nvme_common|raid_class|ramoops);;
        scsi_common|scsi_dh_alua|scsi_dh_emc|scsi_dh_rdac|scsi_mod);;
        sch_htb|sctp_diag|sdhci|sdhci_pci|sdhci_uhs2);;
        sg|simplefb|skx_edac|spd5118);;
        usbkbd|xt_connmark|xt_mac);;
        # Seen on older systems:
        reed_solomon|zstd_compress);;
        pstore_blk|pstore_zone);;
        # Seen on VPN:
        ovpn);;
        # iptables (heavy use)
        xt_CT|xt_LOG|xt_MASQUERADE|xt_NFLOG|xt_POLICE|xt_SETCLASS|xt_TPROXY);;
        xt_addrtype|xt_comment|xt_conntrack|xt_hashlimit|xt_length|xt_limit);;
        xt_mark|xt_multiport|xt_nat|xt_nfacct|xt_physdev|xt_recent);;
        xt_set|xt_socket|xt_state|xt_statistic|xt_tcpudp);;
        # iptables (rare)
        ip_set_bitmap_port|ip_set_hash_ipport|ip_set_hash_ipportip);;
        ip_set_hash_ipportnet);;
        xt_CHECKSUM|xt_REDIRECT|xt_hl|xt_owner|xt_string|xt_tcpmss|xt_u32);;
        # netfilter (rare)
        nf_conntrack_pptp);;  # only rs420 tunnel
        nf_log_common|nf_log_ipv4|nf_log_ipv6|nf_nat_ftp|nf_nat_ipv6);;
        nf_nat_irc);;
        nft_masq);;
        # virtio (common)
        virtio_blk|virtio_net|virtio_scsi);;
        # virtio (rare)
        virtio|virtio_balloon|virtio_pci|virtio_pci_legacy_dev);;
        virtio_pci_modern_dev|virtio_ring|virtio_rng);;
        # Other (very rare.. leftovers):
        aacraid|amd64_edac_mod|apex|ata_generic|ata_piix);;
        button|cdrom|configfs|cqhci);;
        crc16|crc32c_generic|crc64|crc64_rocksoft|crc_t10dif);;
        crct10dif_common|crct10dif_generic|dm_multipath|e1000e);;
        ebtable_nat|einj|evdev|ext4|gasket|geneve|hfs|hfsplus|hpilo);;
        ib_cm|ib_iser|intel_pmc_ssram_telemetry);;
        intel_pmt|intel_th|intel_th_gth|intel_th_pci);;
        ip6t_rt|ip_vs|ip_vs_rr|ip_vs_sh|ip_vs_wrr|iw_cm|jbd2|jfs);;
        kheaders|libata|mbcache|megaraid_sas|minix|msdos|mxm_wmi);;
        nouveau|ntfs|pmbus|pmt_discovery|qnx4|qrtr|rdma_cm|regmap_i2c);;
        rfkill|sd_mod|sfc|sha512_generic|sha512_ssse3|spi_intel_platform);;
        sr_mod|t10_pi|ts_bm|uas|ufs|uhci_hcd);;
        usb_common|usb_storage|usbcore);;
        vhost_vsock|vmw_vsock_virtio_transport_common|vmwgfx);;
        vsock|vsock_diag);;
        # copy.fail
        #algif_aead);;
        # Seen on desktop systems, do not include these:
        #algif_hash|algif_skcipher|amd_pmc|amd_pmf|amd_sfh|amdtee);;
        #amdxdna|asus_wmi|auth_rpcgss|bnep|btbcm|btintel|btmtk|btrtl|btusb);;
        #cdc_acm|cmac|cp210x|cros_ec|cros_ec_chardev|cros_ec_debugfs);;
        #cros_ec_dev|cros_ec_hwmon|cros_ec_lpcs|cros_ec_proto|cros_ec_sysfs);;
        #dm_crypt|eeepc_wmi|gpio_cros_ec|gpio_keys|grace|i915);;
        #led_class_multicolor|leds_cros_ec|ledtrig_audio|libarc4|lockd);;
        #mac80211|mei_hdcp|mei_pxp|mfd_aaeon|mt76|mt76_connac_lib);;
        #mt7925_common|mt7925e|mt792x_lib|nfs_acl|nfsd);;
        #parport_pc|platform_profile|ppdev|r8169|realtek|rfcomm);;
        #snd|snd_hda_codec|snd_hda_codec_alc269|snd_hda_codec_atihdmi);;
        #snd_hda_codec_generic|snd_hda_codec_hdmi|snd_hda_codec_realtek);;
        #snd_hda_codec_realtek_lib);;
        #snd_hda_core|snd_hda_intel|snd_hda_scodec_component|snd_hrtimer);;
        #snd_hwdep|snd_intel_dspcfg|snd_intel_sdw_acpi|snd_pcm|snd_rawmidi);;
        #snd_seq|snd_seq_device|snd_seq_dummy);;
        #snd_seq_midi|snd_seq_midi_event);;
        #snd_timer|soc_button_array|soundcore|sparse_keymap|tee|thunderbolt);;
        #typec|typec_ucsi|ucsi_acpi|uhid);;
        *) echo "$filename";;

The full script can be downloaded from vetted-modprobe.

Don't forget to make /usr/local/sbin/vetted-modprobe executable, to put kernel.modprobe=/usr/local/sbin/vetted-modprobe in /etc/sysctl.d/92-vetted-modprobe.conf, and to apply it with sysctl -p /etc/sysctl.d/92-vetted-modprobe.conf.
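
Those install steps can be wrapped in a small helper. This is a sketch only: the function name and the DESTDIR variable are assumptions, added so the steps can be tried in a scratch directory before touching a live system.

```sh
#!/bin/sh
# Hypothetical install helper. DESTDIR is an assumption that allows a dry
# run into a scratch directory; leave it empty to install for real.
install_vetted_modprobe() {
    src=$1
    destdir=${DESTDIR:-}
    wrapper=/usr/local/sbin/vetted-modprobe
    conf=/etc/sysctl.d/92-vetted-modprobe.conf

    mkdir -p "$destdir/usr/local/sbin" "$destdir/etc/sysctl.d" || return 1
    cp "$src" "$destdir$wrapper" || return 1
    chmod 0755 "$destdir$wrapper" || return 1
    echo "kernel.modprobe=$wrapper" > "$destdir$conf" || return 1

    # Only touch the running kernel for a real (non-DESTDIR) install.
    if test -z "$destdir"; then
        sysctl -p "$conf"
    fi
}
```

Run as root, install_vetted_modprobe ./vetted-modprobe would copy the script, write the sysctl file and apply it. Afterwards, cat /proc/sys/kernel/modprobe should show the wrapper path.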

And, because it is a script, you can complicate it all you want, with includes and excludes and auto-updates and whatever floats your boat. Maybe you only want to allow esp4 if ipsec is in the hostname. The possibilities are endless.
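
As an illustration of the esp4 idea, a hostname-dependent rule could look roughly like this. It is a sketch, not taken from the downloadable script: the function name vet_module, the optional hostname argument and the two always-allowed example entries are assumptions. It follows the same convention as the case statement above, where vetted modules produce no output and unvetted ones are echoed.

```sh
#!/bin/sh
# Hypothetical sketch: vet esp4 only on hosts with "ipsec" in the hostname.
# Prints nothing for vetted modules and echoes unvetted ones.
vet_module() {
    module=$1
    host=${2:-$(uname -n)}   # second argument is only there for testing
    case "$module" in
        esp4)
            case "$host" in
                *ipsec*) ;;          # allowed on IPsec hosts only
                *) echo "$module";;
            esac;;
        xfs|kvm) ;;                  # example entries, always allowed
        *) echo "$module";;
    esac
}
```

With this, vet_module esp4 web1.example.com echoes esp4 (rejected), while the same call on a host named ipsec-gw1.example.com prints nothing (allowed).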

Summarizing

If you can set kernel.modules_disabled=1, then please do. If you can't, then maybe try the vetted-modprobe above.

