-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python: Decompression Bombs #13557
Python: Decompression Bombs #13557
Conversation
QHelp previews: python/ql/src/experimental/Security/CWE-409/DecompressionBombs.qhelpDecompression BombExtracting Compressed files with any compression algorithm like gzip can cause to denial of service attacks. Attackers can compress a huge file which created by repeated similiar byte and convert it to a small compressed file. RecommendationWhen you want to decompress a user-provided compressed file you must be careful about the decompression ratio or read these files within a loop byte by byte to be able to manage the decompressed size in each cycle of the loop. Examplepython ZipFile library is vulnerable by default import zipfile
def Bad(zip_path):
zipfile.ZipFile(zip_path, "r").extractall() By checking the decompressed size from input zipped file you can check the decompression ratio. attackers can forge this decompressed size header too. So can't rely on file_size attribute of ZipInfo class. this is recommended to use "ZipFile.open" method to be able to manage decompressed size. Reading decompressed file byte by byte and verifying the total current size in each loop cycle in recommended to use in any decompression library. import zipfile
def safeUnzip(zipFileName):
'''
safeUnzip reads each file inside the zipfile 1 MB by 1 MB
and during extraction or reading of these files it checks the total decompressed size
doesn't exceed the SIZE_THRESHOLD
'''
buffer_size = 1024 * 1024 * 1 # 1 MB
total_size = 0
SIZE_THRESHOLD = 1024 * 1024 * 10 # 10 MB
with zipfile.ZipFile(zipFileName) as myzip:
for fileinfo in myzip.infolist():
with myzip.open(fileinfo.filename, mode="r") as myfile:
content = b''
chunk = myfile.read(buffer_size)
total_size += buffer_size
if total_size > SIZE_THRESHOLD:
print("Bomb detected")
return False # it isn't a successful extract or read
content += chunk
# reading next bytes of uncompressed data
while chunk:
chunk = myfile.read(buffer_size)
total_size += buffer_size
if total_size > SIZE_THRESHOLD:
print("Bomb detected")
return False # it isn't a successful extract or read
content += chunk
# An example of extracting or reading each decompressed file here
print(bytes.decode(content, 'utf-8'))
return True # it is a successful extract or read References
|
Hi, I've completed the work on this query and I don't have any further updates/commits here. |
Hi @RasmusWL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things, besides the inline comments:
- Have you applied the feedback from Ruby/Go to this PR as well?
- My understanding of the
safeUnzip
function is that it will return early if a file is larger thanSIZE_THRESHOLD
, which is set to 100 kb. Overall that doesn't sound super useful in practice, since I imagine many zip files to contain images (which are usually in the 1-10mb filesize). I also noted that it doesn't actually extract anything, but only checks if a zip-file is safe, it should probably also be calledis_zipfile_safe
and return true/false. - If the
open
call has amode
argument that is not known, and later an.extractall()
call is made, I would have assumed we can deduct this is a problem... but your code currently doesn't seem to allow this. Can you explain why you added this restriction? - When you have looked through results for this query, what is your own impression on number of FPs?
NIT: After being promoted we do care to keep the .ql file and .qll file in separate directories, but for Python we have adopted a simpler approach for experimental queries, so it's just fine to place python/ql/src/experimental/semmle/python/security/DecompressionBomb.qll
right next to the .ql file 😊
NIT: It would be nice if you could adopt DataflowQueryTest.ql, like we do here for path-problem tests in the future, since it makes reading the expected good/bad results much easier.
python/ql/src/experimental/Security/CWE-409/DecompressionBombs.ql
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/Security/CWE-409/DecompressionBombs.ql
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/Security/CWE-409/DecompressionBombs.ql
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/semmle/python/security/DecompressionBomb.qll
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/semmle/python/security/DecompressionBomb.qll
Outdated
Show resolved
Hide resolved
I'll add inline tests, but I think I can't import decompression bomb sinks into inline tests I put the DecompressionBomb.qll file into the current DecompressionBomb.ql directory? |
I tried to add the inline test but I failed :
|
Since the helper predicate had nice qldocs
The old title/id matches how we used to write queries, but I think just using the normal conversational name is easier for everyone :)
All those newlines are not good for inline expectations
You're only allowed to have `result=OK` if there is a sink on that line...
Thanks for making these updates. I've taken the liberty to fix up a few things in this PR, instead of merging it as-is and then filing a PR to do it afterwards. Overall I think some of the additional taint-steps looks a little too broad, and it would have been more clear if they were split up a bit more (example). So something to keep in mind for future PRs 😊 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
taking my comments above into account, I think it's good enough to merge as-is (at least for experimental)
@RasmusWL you are the best, thanks a lot. |
python/ql/src/experimental/semmle/python/security/FileAndFormRemoteFlowSource.qll
Fixed
Show fixed
Hide fixed
python/ql/src/experimental/semmle/python/security/FileAndFormRemoteFlowSource.qll
Fixed
Show fixed
Hide fixed
python/ql/src/experimental/semmle/python/security/FileAndFormRemoteFlowSource.qll
Fixed
Show fixed
Hide fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ready to merge to experimental (see other comments above)
This is part of All for one, one for all query submission, I'm going to submit an issue in github/securitylab for this pull request too.
I've also added
FileAndFormRemoteFlowSource
module which is related to all queries that need a file as their user provided remote flow sources. in this module I added support of file and form of FastAPI framework.I tried my best to add as much sanitizers as possible. if you have another idea about sanitizers I can write more sanitizers too.