CVE-2026-45907

Published: Mag 27, 2026 Last Modified: Mag 27, 2026
ExploitDB:
Other exploit source:
Google Dorks:

Description

AI Translation Available

In the Linux kernel, the following vulnerability has been resolved:

net/mlx5e: Fix deadlocks between devlink and netdev instance locks

In the mentioned 'Fixes' commit, various work tasks triggering devlink
health reporter recovery were switched to use netdev_trylock to protect
against concurrent tear down of the channels being recovered. But this
had the side effect of introducing potential deadlocks because of
incorrect lock ordering.

The correct lock order is described by the init flow:
probe_one -> mlx5_init_one (acquires devlink lock)
-> mlx5_init_one_devl_locked -> mlx5_register_device
-> mlx5_rescan_drivers_locked -...-> mlx5e_probe -> _mlx5e_probe
-> register_netdev (acquires rtnl lock)
-> register_netdevice (acquires netdev lock)
=> devlink lock -> rtnl lock -> netdev lock.

But in the current recovery flow, the order is wrong:
mlx5e_tx_err_cqe_work (acquires netdev lock)
-> mlx5e_reporter_tx_err_cqe -> mlx5e_health_report
-> devlink_health_report (acquires devlink lock => boom!)
-> devlink_health_reporter_recover
-> mlx5e_tx_reporter_recover -> mlx5e_tx_reporter_recover_from_ctx
-> mlx5e_tx_reporter_err_cqe_recover

The same pattern exists in:
mlx5e_reporter_rx_timeout
mlx5e_reporter_tx_ptpsq_unhealthy
mlx5e_reporter_tx_timeout

Fix these by moving the netdev_trylock calls from the work handlers
lower in the call stack, in the respective recovery functions, where
they are actually necessary.

https://git.kernel.org/stable/c/4329514c61abefe4961541b128c549b017bab5ad
https://git.kernel.org/stable/c/63f9d5fb4d8040077df801ca3270e2f02d55e0d9
https://git.kernel.org/stable/c/83ac0304a2d77519dae1e54c9713cbe1aedf19c9