CVE-2007-4559 Python vulnerability ignored for 15 years puts 350000 projects at risk of code execution
A vulnerability in the Python programming language that has been overlooked for 15 years is drawing renewed attention because of its potential impact on more than 350,000 open source repositories.
Security researchers disclosed the flaw, tracked as CVE-2007-4559, as early as 2007.
Unfortunately, it never received an official fix.
The only mitigation was an update to the developer documentation that warns of the risk.
Now, however, researchers have found that the vulnerability can be used for code execution.
According to Bleeping Computer, the vulnerability resides in Python's tarfile module.
This path traversal vulnerability could potentially be exploited to overwrite arbitrary files in code that uses the raw tarfile.extract() function, or the built-in defaults of tarfile.extractall().
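To make the path traversal concrete, the following is a minimal Python sketch, not an exploit: it builds a small archive in memory and shows where each member would land if its name were simply joined to the destination directory. The helper names, the member names, and the /srv/app/uploads destination are hypothetical, chosen only for illustration. Nothing is extracted or written to disk.

```python
import io
import os
import tarfile


def build_demo_archive() -> bytes:
    """Build an in-memory tar with one benign and one traversal-style member.

    Both member names are hypothetical examples.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in ("notes.txt", "../../escaped.txt"):
            data = b"demo"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def resolved_targets(archive: bytes, dest: str) -> dict:
    """Map each member name to the normalized path it would resolve to under dest."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        return {
            member.name: os.path.normpath(os.path.join(dest, member.name))
            for member in tar.getmembers()
        }


if __name__ == "__main__":
    # Print where each member would end up; no files are created.
    for name, target in resolved_targets(build_demo_archive(), "/srv/app/uploads").items():
        print(f"{name!r} -> {target}")
```

With a destination of /srv/app/uploads, the member named ../../escaped.txt resolves to /srv/escaped.txt on a POSIX system, two levels above the intended directory.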
Although there have been no reports of exploits related to CVE-2007-4559 since it was first reported in August 2007, the flaw highlights a long-overlooked risk in the software supply chain.
Earlier this year, a Trellix security researcher rediscovered CVE-2007-4559 while investigating another security issue.
Trellix, a new company offering Extended Detection and Response (XDR) solutions, was formed from the merger of McAfee Enterprise and FireEye.
Charles McFarland from the Trellix Advanced Threat Research Team stated:
Failure to write any safe code to clean up member files before calling tarfile.extract() and tarfile.extractall() could allow this directory traversal vulnerability to be exploited by bad actors to access the file system.
The flaw stems from code in the extract function of Python's tarfile module, which explicitly trusts the information in the TarInfo object and joins the path passed to the extract function with the name in the TarInfo object.
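The kind of member check described above can be sketched as follows. This is one common pattern rather than an official fix or Trellix's own code, and the function names (is_within_directory, safe_extractall, make_archive) are hypothetical. It rejects any member whose joined path would resolve outside the destination directory before anything is extracted.

```python
import io
import os
import tarfile
import tempfile


def is_within_directory(base: str, target: str) -> bool:
    """Return True if target resolves to a location inside base."""
    base = os.path.realpath(base)
    target = os.path.realpath(target)
    return os.path.commonpath([base, target]) == base


def safe_extractall(tar: tarfile.TarFile, dest: str) -> None:
    """Extract every member only if all of them stay inside dest."""
    for member in tar.getmembers():
        target = os.path.join(dest, member.name)
        if not is_within_directory(dest, target):
            raise ValueError(f"blocked path traversal attempt: {member.name!r}")
    tar.extractall(dest)


def make_archive(names) -> bytes:
    """Build an in-memory tar containing one small file per name (demo only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            data = b"demo"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as dest:
        with tarfile.open(fileobj=io.BytesIO(make_archive(["../evil.txt"]))) as tar:
            try:
                safe_extractall(tar, dest)
            except ValueError as exc:
                print(exc)
```

This sketch only validates member names. It does not inspect symbolic or hard link members, which can also point outside the destination, so it should be read as an illustration of the idea rather than a complete defense.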
Less than a week after the disclosure, a message on the Python bug tracker said the issue had been closed.
The updated documentation states that the maintainers are working on a fix and warns users never to extract archives from untrusted sources, because doing so carries considerable risk.
Through analysis, Trellix researchers found that the vulnerability affects thousands of open-source and closed-source software projects.
Out of a random batch of 257 repositories with a high probability of containing vulnerable code, they manually inspected 175 and found that 61% were vulnerable.
After an automated inspection of the remaining repositories, the share rose to 65%, indicating that the problem is fairly widespread, and that is only for GitHub, a code-hosting platform.
Charles McFarland added: “With the help of GitHub, we obtained a larger dataset: 588,840 unique repositories that included ‘import tarfile’ in their Python code.”
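As a rough check on scale, applying the 61% manual verification rate to this dataset gives a figure a little above 350,000. The short Python sketch below shows that arithmetic. It assumes the rate from the manually inspected sample carries over to the whole dataset, and the variable names are illustrative. The exact method behind the published estimate is not described here.

```python
# Figures reported in the article.
repos_importing_tarfile = 588_840   # unique repositories containing "import tarfile"
manual_vulnerable_rate = 0.61       # share found vulnerable in the manual inspection

# Assumption: the sample rate applies uniformly to the full dataset.
estimated_vulnerable = round(repos_importing_tarfile * manual_vulnerable_rate)

print(estimated_vulnerable)  # 359192
```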
With the manually verified vulnerability rate as high as 61%, Trellix estimates that more than 350,000 repositories are vulnerable. Many machine learning tools, such as GitHub Copilot, also help developers complete projects faster.
This AI programming aid relies on code from hundreds of thousands of repositories to provide a convenient “autocomplete” programming experience. But if the reference code itself is not secure enough, the problem will spread unwittingly to more innocent new projects.
Through in-depth research, Trellix found open source code vulnerable to CVE-2007-4559 across numerous industries.
As expected, Development was at the top of the list, followed by Artificial Intelligence (AI)/Machine Learning (ML), and then by categories such as Web, Security, and Admin Tools.