High
CVE-2021-33909
Description
fs/seq_file.c in the Linux kernel 3.16 through 5.13.x before 5.13.4 does not properly restrict seq buffer allocations, leading to an integer overflow, an Out-of-bounds Write, and escalation to root by an unprivileged user, aka CID-8cae8cd89f05.
Ratings
- Attacker Value: High
- Exploitability: High
Technical Analysis
An unprivileged local attacker can exploit this vulnerability by creating, mounting, and deleting a deep directory structure whose total path length exceeds 1GB, and thereby escalate privileges to root.
Ratings
- Attacker Value: High
- Exploitability: Medium
Technical Analysis
CVE-2021-33909 has high attacker value because it is a root privilege escalation in core functionality of the Linux kernel itself. Exploitability is a little lower, since it involves kernel memory corruption with particular requirements, but Qualys has reported successful exploitation on several Linux distributions and versions, noting that distributions they haven’t tested may be equally exploitable out of the box. Mitigations do exist but do not fix the root cause. You’ll want to patch this one before a full exploit drops. (A crash PoC has already been released.)
Technical Analysis
TLDR Version
- size_t to int conversion vulnerability leading to an integer overflow in the Linux kernel’s filesystem layer.
- Exploitable by creating, mounting, and deleting a deep directory structure whose total path length exceeds 1 GB.
- Exploiting it allows an attacker to write the string //deleted to an offset of exactly -2GB-10B below the beginning of a vmalloc() allocated kernel buffer (the extra 10 bytes are there because //deleted is 10 bytes long once you include the NULL terminator).
- Qualys exploited this uncontrolled OOB write to obtain full root privileges on default installations of Ubuntu 20.04, Ubuntu 20.10, Ubuntu 21.04, Debian 11, and Fedora 34 Workstation. That said, other Linux distributions are vulnerable and likely exploitable.
- The exploit requires about 5GB of memory and 1M inodes. According to their blog, Qualys will publish it sometime in the near future.
- The vulnerability was introduced in July 2014 (Linux 3.16) by commit 058504ed (“fs/seq_file: fallback to vmalloc allocation”) and was fixed in Linux kernel 5.13.4 by https://github.com/torvalds/linux/commit/8cae8cd89f05f6de223d63e6d15e31c8ba9cf53b.
Preliminary Warning :)
This commentary may get a bit technical, as that is my preferred style of writing. If you want the nutshell version, take a look at @NinjaOperator’s or @wvu-r7’s reviews, or at the TLDR section above if you only need the pertinent details and aren’t interested in a deeper dive into this bug (you’ll miss out on some good info though :D). Alright, ready? Let’s dive into this bug.
Exploit Background and Some History
So according to Qualys, this bug was first introduced in July 2014 (Linux 3.16) by commit 058504ed (“fs/seq_file: fallback to vmalloc allocation”), which can be found at https://gitlab.raptorengineering.com/meklort/talos-obmc-linux/-/commit/058504edd02667eef8fac9be27ab3ea74332e9b4. This was when the original kmalloc(m->size <<= 1, GFP_KERNEL); call was switched to a seq_buf_alloc(m->size <<= 1); call.
This is interesting because when we look at the earlier source code for kmalloc from, say, version 3.15 of the Linux source code, we find that the maximum size of memory that kmalloc can allocate is noted at https://elixir.bootlin.com/linux/v3.15/source/include/linux/slab.h#L455 as KMALLOC_MAX_CACHE_SIZE. This is defined at https://elixir.bootlin.com/linux/v3.15/source/include/linux/slab.h#L234 as #define KMALLOC_MAX_CACHE_SIZE (1UL << KMALLOC_SHIFT_HIGH), or the unsigned number 1 left shifted by KMALLOC_SHIFT_HIGH. KMALLOC_SHIFT_HIGH is defined multiple ways, depending on the backend allocator in use by the OS: it’s either a max of 25 (for SLAB allocators), PAGE_SHIFT, which is defined as 12 for x86/x64 systems, or PAGE_SHIFT+1 aka 13.
Or in other words, to make a long story short, the maximum size that kmalloc() may allocate is 32 MB aka 1<<25. This is far below the maximum value of a signed 32 bit number. However, once the kernel changed to calling seq_buf_alloc(), the allocation can fall back to vmalloc, as can be seen at https://elixir.bootlin.com/linux/v3.16/source/fs/seq_file.c#L41, which does not have this same limitation and can hand out far larger allocations, bounded mostly by available memory. Which means that size could technically be a number that is larger than what can be represented by a signed 32 bit integer.
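To put numbers on the kmalloc() ceiling, here is a minimal sketch of the arithmetic. The demo_ names are mine and not kernel identifiers, and the shift values are just the three cases quoted above (25, 12, and 13), so treat it as an illustration rather than the kernel's actual logic.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: the allocation cap implied by a given
 * KMALLOC_SHIFT_HIGH value, i.e. 1UL << shift. */
unsigned long demo_kmalloc_cap(unsigned shift_high)
{
    return 1UL << shift_high;
}

/* Hypothetical helper: 1 if that cap still fits in a signed 32 bit int. */
int demo_cap_fits_int(unsigned shift_high)
{
    return demo_kmalloc_cap(shift_high) <= (unsigned long)INT_MAX;
}
```

With a shift of 25 the cap is 33554432 bytes (32 MB), roughly 64 times smaller than INT_MAX, so as long as the buffer came from kmalloc() its size could never stop fitting in an int.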
This leads us into the actual code itself, which I’ll explain below.
The Vulnerable Code Explanation
The Linux kernel has a seq_file interface that produces virtual files that contain sequences of records. Each record must fit into a seq_file buffer, which is enlarged as needed by freeing the existing allocation and then making a new seq_buf_alloc() call where the size is the previous size bit shifted left by 1, effectively doubling the size allocated. We can see this if we take a look at https://elixir.bootlin.com/linux/v5.13.3/source/fs/seq_file.c#L242, though the relevant parts of the code are shown below:
    ssize_t seq_read_iter(struct kiocb *iocb, struct iov_iter *iter)
    {
        struct seq_file *m = iocb->ki_filp->private_data;
        ...
        /* grab buffer if we didn't have one */
        if (!m->buf) {
            m->buf = seq_buf_alloc(m->size = PAGE_SIZE);
            ...
        }
        ...
        // get a non-empty record in the buffer
        ...
        while (1) {
            ...
            err = m->op->show(m, p);
            ...
            if (!seq_has_overflowed(m)) // got it
                goto Fill;
            // need a bigger buffer
            ...
            kvfree(m->buf);
            ...
            m->buf = seq_buf_alloc(m->size <<= 1);
            ...
        }
Note that the m in the seq_read_iter() function is the seq_file structure corresponding to the virtual file that we are currently operating on. Anyway, now that we have allocated memory, the next question might be “well, if we can control m->size, couldn’t we do an overflow here?” Well, not really: as Qualys notes, either the allocation will fail or you will run out of memory long before you overflow m->size, since it is of type size_t as noted at https://elixir.bootlin.com/linux/v5.13.3/source/include/linux/seq_file.h#L18.
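The doubling behaviour is easy to model in userland. The sketch below is a simplified stand-in for the retry loop (the demo_ names and the fixed 4 KB page size are my assumptions) that counts how many times the buffer has to double before a record of a given length fits. It treats a completely full buffer as overflowed, which I believe mirrors how seq_has_overflowed() behaves.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u /* assumption: 4 KB pages */

/*
 * Hypothetical model of the seq_read_iter() retry loop: start at one page
 * and double (size <<= 1) until the record fits with room to spare.
 * Returns the number of doublings and stores the final size in *final_size.
 * record_len is expected to be far below 2^63, otherwise the shift wraps.
 */
unsigned demo_doublings_needed(uint64_t record_len, uint64_t *final_size)
{
    uint64_t size = DEMO_PAGE_SIZE;
    unsigned doublings = 0;

    while (size <= record_len) { /* a full buffer counts as overflowed */
        size <<= 1;
        doublings++;
    }
    if (final_size)
        *final_size = size;
    return doublings;
}
```

For a record of just over 1 GB (1073741825 bytes) this gives 19 doublings and a final size of 2147483648 bytes, which is why a path slightly longer than 1 GB is enough to push m->size to 2 GB.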
The problem, however, is that m->size is actually used in functions that expect an int value, aka a signed 32 bit integer, not size_t, which is a 64 bit unsigned integer on 64-bit systems. Which leads us to the show() call at https://elixir.bootlin.com/linux/v5.13.3/source/fs/seq_file.c#L269, which according to Qualys ends up calling show_mountinfo().
You see, show_mountinfo() will end up calling seq_dentry(m, mnt->mnt_root, " \t\n\\"); as shown at https://elixir.bootlin.com/linux/v5.13.3/source/fs/proc_namespace.c#L150. Note that m here will be the seq_file object containing a buf pointer to the buffer we allocated earlier. seq_dentry() then gets the size of the allocated buffer that m->buf points to, aka the buffer we allocated earlier, and sets the local variable size to its size, as can be seen at https://elixir.bootlin.com/linux/v5.13.3/source/fs/seq_file.c#L544. Note however that size is of type size_t, aka an unsigned 64 bit number.
And this is where things go really wrong, as dentry_path(dentry, buf, size); is then called, which leads to the code starting at https://elixir.bootlin.com/linux/v5.13.3/source/fs/d_path.c#L385. Note however that dentry_path expects its size argument to be an int, aka a signed 32 bit integer, yet we passed it a size_t. So if we allocated a buffer of 2GB or greater, aka 2147483648 bytes or more, the value would exceed the limits of a signed 32 bit integer, as signed 32 bit numbers can only represent values in the range -2147483648 to 2147483647. So in effect the number 2147483648 in size would be converted inside dentry_path to the value -2147483648. Whoops!
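The conversion itself can be reproduced in a few lines of userland C. This is only a sketch: demo_callee_sees() is a made-up stand-in for a function that, like dentry_path(), takes its length as an int, and uint64_t stands in for a 64 bit size_t. Strictly speaking the out-of-range conversion is implementation-defined in C, but GCC and Clang both reduce the value modulo 2^32.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical callee that receives its buffer length as a signed int. */
int demo_callee_sees(int buflen)
{
    return buflen;
}

/*
 * Hypothetical caller holding a 64 bit size. The kernel relies on the
 * implicit conversion at the call site; the explicit cast here only keeps
 * compiler warnings quiet and does the same thing.
 */
int demo_pass_size(uint64_t size)
{
    return demo_callee_sees((int)size);
}
```

Passing 2147483647 comes through unchanged, while 2147483648 arrives in the callee as -2147483648, which is the exact value swap described above.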
This then leads to p = buf + buflen; pointing to the address of the allocated buffer, aka buf, plus buflen, which will now be -2147483648 assuming that size was 2GB aka 2147483648. So p will in effect point to a memory location 2GB before where our vmalloc() allocated buffer is.
We then end up calling prepend(&p, &buflen, "//deleted", 10), and we can see the code for prepend at https://elixir.bootlin.com/linux/v5.13.3/source/fs/d_path.c#L11, though the interesting part starts at https://elixir.bootlin.com/linux/v5.13.3/source/fs/d_path.c#L16. Here we can see that buffer, aka the pointer to the memory 2GB before the vmalloc() allocated buffer, is decremented by 10, making it point an additional 10 bytes earlier in memory. Following this, the 10 bytes of the //deleted string, aka the //deleted string plus the null terminator, are written to memory.
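Here is a rough model of that pointer math using plain offsets relative to buf, so nothing actually gets written out of bounds. The struct and function names are hypothetical. One detail goes beyond the description above: as I understand it, prepend() subtracts the name length from buflen and bails out if the result is negative, and I am assuming wrapping signed arithmetic (which is how I believe the kernel is built), so -2147483648 minus 10 wraps to a large positive number and the check passes.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define DEMO_DELETED_LEN 10 /* strlen("//deleted") + 1 for the NUL */

struct demo_prepend_result {
    int     ok;     /* 1 if the "remaining length >= 0" check passed */
    int     buflen; /* remaining length after the subtraction */
    int64_t offset; /* where the copy would start, relative to buf */
};

/* Hypothetical model of `p = buf + buflen` followed by prepend(). */
struct demo_prepend_result demo_prepend(int buflen, int namelen)
{
    struct demo_prepend_result r;
    int64_t p = (int64_t)buflen; /* p = buf + buflen, with buf at offset 0 */

    /* Wrapping subtraction done in unsigned math to avoid UB in this demo. */
    r.buflen = (int)((uint32_t)buflen - (uint32_t)namelen);
    r.ok = r.buflen >= 0;
    r.offset = r.ok ? p - namelen : p;
    return r;
}

/* The case from the write-up: a 2 GB size that arrived as INT_MIN. */
int64_t demo_deleted_write_offset(void)
{
    return demo_prepend(INT_MIN, DEMO_DELETED_LEN).offset;
}
```

demo_deleted_write_offset() returns -2147483658, i.e. the -2GB-10B offset from the TLDR, and the remaining length comes out as 2147483638, so nothing in the callee notices that anything is wrong.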
This effectively gives the attacker an OOB write in kernel memory, as they can adjust the length of the string allocated to adjust where they write in memory. Now typically this wouldn’t lead to much, however Qualys was able to use this OOB write to overwrite the instructions of an eBPF program after it has been validated by the kernel but before it has been JIT compiled, and use this to transform the uncontrolled OOB write into an information disclosure and then into a limited but controlled OOB write. They then used Manfred Paul’s btf and map_push_elem techniques from https://www.thezdi.com/blog/2020/4/8/cve-2020-8835-linux-kernel-privilege-escalation-via-improper-ebpf-program-verification to transform this limited controlled OOB write into a full arbitrary kernel read and write, and used this to set modprobe_path to their current executable, a technique that has been described in more detail than I can describe here in places like https://lkmidas.github.io/posts/20210223-linux-kernel-pwn-modprobe/, so that the kernel ends up running their executable as root.
Official Patch And Some Important Notes
https://github.com/torvalds/linux/commit/8cae8cd89f05f6de223d63e6d15e31c8ba9cf53b is the official patch for this issue, which fixes the issue by ensuring that seq_buf_alloc doesn’t allocate memory that is larger than MAX_RW_COUNT. Looking at where MAX_RW_COUNT is defined, we see https://elixir.bootlin.com/linux/v5.13.4/source/include/linux/fs.h#L2572, where it is defined as #define MAX_RW_COUNT (INT_MAX & PAGE_MASK). This basically rounds INT_MAX, the maximum value of a signed 32 bit integer, down to a page boundary, so with the typical 4KB page size the max value allowed is 0x7ffff000, or INT_MAX minus 4095.
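As a quick sanity check of that limit, the sketch below recomputes it in userland. The DEMO_ macros mirror the kernel definitions but are my own, and the 4 KB page size is an assumption. demo_alloc_allowed() is a hypothetical stand-in for the size check the patch adds, not the kernel function itself.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE    4096UL                  /* assumption: 4 KB pages */
#define DEMO_PAGE_MASK    (~(DEMO_PAGE_SIZE - 1)) /* clears the low 12 bits */
#define DEMO_MAX_RW_COUNT (INT_MAX & DEMO_PAGE_MASK)

/* Hypothetical model of the patched check: 1 if the allocation would be
 * attempted, 0 if the request is rejected for being too large. */
int demo_alloc_allowed(uint64_t size)
{
    return size <= DEMO_MAX_RW_COUNT;
}
```

Since the buffer only ever grows by doubling from one page, the largest size that passes is 1 GB (1 << 30). The next doubling to 2 GB is rejected, so the size handed to the int-based functions stays representable.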
If the size is over this amount then the allocation will fail and we will never hit the vulnerable code. However, this doesn’t really solve the root issue per se. Assuming users can find another way to reach the same vulnerable code and abuse the fact that the kernel is still passing unsigned 64 bit integers to functions that expect signed 32 bit integers, it’s likely that someone could bypass this patch via alternative means. Whether or not this is possible remains to be seen, but in my opinion, whilst it would be more work, the appropriate solution would be to update the functions to pass the appropriate data types to one another, whilst also taking care not to perform casts between signed and unsigned numbers without appropriate checks.
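To make that last suggestion a bit more concrete, this is the kind of checked conversion I have in mind. It is a hypothetical userland helper, not an existing kernel API, and a real fix would more likely change the function signatures themselves.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: convert a size_t to int only when it fits.
 * Returns 0 and stores the value in *out on success; returns -1 and
 * leaves *out untouched when size is larger than INT_MAX.
 */
int demo_size_to_int(size_t size, int *out)
{
    if (size > (size_t)INT_MAX)
        return -1;
    *out = (int)size;
    return 0;
}
```

A caller sitting in seq_dentry()'s position could then return an error when the helper fails instead of handing a negative length down the call chain.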
CVSS V3 Severity and Metrics
General Information
Vendors
- debian
- fedoraproject
- linux
- netapp
- oracle
- sonicwall
Products
- communications session border controller 8.2
- communications session border controller 8.3
- communications session border controller 8.4
- communications session border controller 9.0
- debian linux 10.0
- debian linux 9.0
- fedora 34
- hci management node -
- linux kernel
- sma1000 firmware
- solidfire -