Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Decompression Bombs #13560

Merged
merged 57 commits into from
Sep 5, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
57 commits
Select commit Hold shift + click to select a range
4a37da3
V1
am0o0 Jun 25, 2023
430375e
fix a commit mistake
am0o0 Jun 25, 2023
ae98510
add more source and sinks and sanitizers
am0o0 Jun 25, 2023
3ddc9a8
fix warnings, more sinks,sources,comments
am0o0 Jun 25, 2023
f715a34
better examples
am0o0 Jun 25, 2023
042133a
add queries for more popular libs
am0o0 Jul 2, 2023
d4d505d
complete the minizip query
am0o0 Jul 3, 2023
56bc32f
add libarchive
am0o0 Jul 3, 2023
16be908
add Miniz
am0o0 Jul 3, 2023
065c527
update Miniz
am0o0 Jul 3, 2023
e0798b2
stash: change sinks to zip handles and sources to the zip handle init…
am0o0 Jul 4, 2023
e37ceac
merge all query files into one query file
am0o0 Jun 7, 2024
a5c9dc7
Merge branch 'github:main' into amammad-cpp-bombs
am0o0 Jun 7, 2024
184aa04
Merge branch 'amammad-cpp-bombs' of https://github.com/amammad/codeql…
am0o0 Jun 7, 2024
a536328
add implicit this
am0o0 Jun 7, 2024
273848c
remove old comments
am0o0 Jun 7, 2024
11a416e
add FlowSources as a common source for all sinks, so we don't need St…
am0o0 Jun 13, 2024
13f697c
relocate the query
am0o0 Jun 25, 2024
656dc4e
use abstract class for decompression sinks
am0o0 Jun 25, 2024
361ad6b
use abstract class for decompression flow steps
am0o0 Jun 26, 2024
87b6495
add zlib tests with stubs :)
am0o0 Jul 14, 2024
a10b502
fix tests, it is not fixed 100%
am0o0 Jul 15, 2024
6f8eec2
Merge branch 'github:main' into amammad-cpp-bombs
am0o0 Jul 28, 2024
f97b103
update test files, add one more additional flow step for inflate func…
am0o0 Jul 30, 2024
89e842b
finilize tests for zlib
am0o0 Sep 3, 2024
8c1c537
finilize tests for zlib
am0o0 Sep 3, 2024
49eaaf5
Merge branch 'amammad-cpp-bombs' of https://github.com/am0o0/codeql i…
am0o0 Sep 3, 2024
e85ca79
add tests for brotli
am0o0 Sep 3, 2024
9531701
delete miniz support because there is no good documents and i don't h…
am0o0 Sep 3, 2024
6c97096
remove unused imports, add tests for libarchive
am0o0 Sep 3, 2024
4fc971d
remove xz(lzma)
am0o0 Sep 3, 2024
81283d5
remove more unused imports, add tests for zstd, add flow steps for zstd
am0o0 Sep 3, 2024
386e45a
delete bzip2 as it is not updated for more than three years so it is …
am0o0 Sep 3, 2024
50d9e77
C++: Move experimental files into the correct locations
jketema Sep 4, 2024
d526f1d
C++: Disentangle confusing test results by declaring only a single `m…
jketema Sep 4, 2024
751e7e6
C++: Remove useless function bodies from tests
jketema Sep 4, 2024
d8a70d8
C++: Add test annotations
jketema Sep 4, 2024
ad3605c
C++: Minor test clean up
jketema Sep 4, 2024
078e635
C++: Remove code that is irrelevant for the zlib test
jketema Sep 4, 2024
09f6576
C++: Simplify libarchive test
jketema Sep 4, 2024
0f98e29
C++: Cleanup minizip test
jketema Sep 4, 2024
c048401
C++: Clean up Brotli test
jketema Sep 4, 2024
084dbc4
C++: Rename qhelp file to match ql file
jketema Sep 4, 2024
65fafbf
C++: Fix QL-for-QL warnings
jketema Sep 4, 2024
8d22d14
C++: Clean up QLDoc
jketema Sep 4, 2024
8fe0d0a
C++: Improve query output
jketema Sep 4, 2024
2369b18
C++: Make additional flow steps more uniform
jketema Sep 4, 2024
92c6170
C++: Simplify QLhelp
jketema Sep 4, 2024
238895e
C++: Fix formatting
jketema Sep 4, 2024
9b905d5
C++: Set precision to low
jketema Sep 4, 2024
4fa4624
Merge pull request #1 from jketema/amammad-cpp-bombs
am0o0 Sep 4, 2024
3aa68b3
C++: Fix zstd and clean up test
jketema Sep 4, 2024
05bdce1
Merge pull request #2 from jketema/amammad-cpp-bombs
am0o0 Sep 5, 2024
faef635
add '// BAD' comment for the zstd sink
am0o0 Sep 5, 2024
401bb24
remove redundent `zStreamAccess` in flow steps
am0o0 Sep 5, 2024
e891c5a
C++: Fix expected test results
jketema Sep 5, 2024
a226bdf
Merge pull request #3 from jketema/amammad-cpp-bombs
am0o0 Sep 5, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<overview>
<p>Extracting Compressed files with any compression algorithm like gzip can cause to denial of service attacks.</p>
<p>Attackers can compress a huge file which created by repeated similiar byte and convert it to a small compressed file.</p>

</overview>
<recommendation>

<p>When you want to decompress a user-provided compressed file you must be careful about the decompression ratio or read these files within a loop byte by byte to be able to manage the decompressed size in each cycle of the loop.</p>

</recommendation>
<example>

<p>
Reading uncompressed Gzip file within a loop and check for a threshold size in each cycle.
</p>
<sample src="example_good.cpp"/>

<p>
An Unsafe Approach can be this example which we don't check for uncompressed size.
</p>
<sample src="example_bad.cpp" />

</example>
<references>

<li>
<a href="https://www.bamsoftware.com/hacks/zipbomb/">A great research to gain more impact by this kind of attacks</a>
</li>

</references>
</qhelp>
Original file line number Diff line number Diff line change
@@ -0,0 +1,139 @@
/**
* @name User-controlled file decompression
* @description User-controlled data that flows into decompression library APIs without checking the compression rate is dangerous
* @kind path-problem
* @problem.severity error
* @security-severity 7.8
* @precision high
* @id cpp/user-controlled-file-decompression1
* @tags security
* experimental
* external/cwe/cwe-409
*/

import cpp
import semmle.code.cpp.ir.dataflow.TaintTracking
import semmle.code.cpp.security.FlowSources

/**
* A gzFile Variable as a Flow source
*/
private class GzFileVar extends VariableAccess {
GzFileVar() { this.getType().hasName("gzFile") }
}

/**
* The `gzopen` function as a Flow source
*/
private class GzopenFunction extends Function {
GzopenFunction() { hasGlobalName("gzopen") }
Fixed Show fixed Hide fixed
}

/**
* The `gzdopen` function as a Flow source
*/
private class GzdopenFunction extends Function {
GzdopenFunction() { hasGlobalName("gzdopen") }
Fixed Show fixed Hide fixed
}

/**
* The `gzfread` function is used in Flow sink
*/
private class GzfreadFunction extends Function {
GzfreadFunction() { hasGlobalName("gzfread") }
Fixed Show fixed Hide fixed
}

/**
* The `gzread` function is used in Flow sink
*/
private class GzreadFunction extends Function {
GzreadFunction() { hasGlobalName("gzread") }
Fixed Show fixed Hide fixed
}

module ZlibTaintConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) {
// gzopen(const char *path, const char *mode);
exists(FunctionCall fc | fc.getTarget() instanceof GzopenFunction |
fc.getArgument(0) = source.asExpr() and
// arg 0 can be a path string whichwe must do following check
not fc.getArgument(0).isConstant()
)
or
//gzdopen(int fd, const char *mode);
// IDK whether it is good to use all file decriptors function returns as source or not
// because we can do more sanitization from fd function sources
exists(FunctionCall fc | fc.getTarget() instanceof GzdopenFunction |
fc.getArgument(0) = source.asExpr()
)
or
source.asExpr() instanceof GzFileVar
}

predicate isSink(DataFlow::Node sink) {
// gzread(gzFile file, voidp buf, unsigned len));
exists(FunctionCall fc | fc.getTarget() instanceof GzreadFunction |
fc.getArgument(0) = sink.asExpr() and
not sanitizer(fc)
// TODO: and not sanitizer2(fc.getArgument(2))
)
or
// gzfread((voidp buf, z_size_t size, z_size_t nitems, gzFile file));
exists(FunctionCall fc | fc.getTarget() instanceof GzfreadFunction |
sink.asExpr() = fc.getArgument(3)
)
}

predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
// gzopen(const char *path, const char *mode);
// gzdopen(int fd, const char *mode);
exists(FunctionCall fc |
fc.getTarget() instanceof GzopenFunction or fc.getTarget() instanceof GzdopenFunction
|
node1.asExpr() = fc.getArgument(0) and
node2.asExpr() = fc
)
or
// gzread(gzFile file, voidp buf, unsigned len);
exists(FunctionCall fc | fc.getTarget() instanceof GzreadFunction |
node1.asExpr() = fc.getArgument(0) and
node2.asExpr() = fc.getArgument(1)
)
or
// gzfread(voidp buf, z_size_t size, z_size_t nitems, gzFile file);
exists(FunctionCall fc | fc.getTarget() instanceof GzfreadFunction |
node1.asExpr() = fc.getArgument(3) and
node2.asExpr() = fc.getArgument(0)
)
}
}

predicate sanitizer(FunctionCall fc) {
exists(Expr e |
// a RelationalOperation which isn't compared with a Literal that using for end of read
TaintTracking::localExprTaint(fc, e.(RelationalOperation).getAChild*()) and
not e.getAChild*().(Literal).getValue() = ["0", "1", "-1"]
)
}

// TODO:
// predicate sanitizer2(Expr arg) {
// exists(Expr e |
// // a RelationalOperation which isn't compared with a Literal that using for end of read
// TaintTracking::localExprTaint(arg, e.(RelationalOperation).getAChild*())
// )
// }
// predicate test(FunctionCall fc, Expr sink) {
// exists( | fc.getTarget() instanceof GzreadFunction |
// // a RelationalOperation which isn't compared with a Literal that using for end of read
// TaintTracking::localExprTaint(fc.getArgument(2).(VariableAccess), sink)
// ) and
// sink.getFile().getLocation().toString().matches("%main.cpp%")
// }
module ZlibTaint = TaintTracking::Global<ZlibTaintConfig>;

import ZlibTaint::PathGraph

from ZlibTaint::PathNode source, ZlibTaint::PathNode sink
where ZlibTaint::flowPath(source, sink)
select sink.getNode(), source, sink, "This Decompressiondepends on a $@.", source.getNode(),
"potentially untrusted source"
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
/**
* @name User-controlled file decompression
* @description User-controlled data that flows into decompression library APIs without checking the compression rate is dangerous
* @kind path-problem
* @problem.severity error
* @security-severity 7.8
* @precision high
* @id cpp/user-controlled-file-decompression2
* @tags security
* experimental
* external/cwe/cwe-409
*/

import cpp
import semmle.code.cpp.ir.dataflow.TaintTracking
import semmle.code.cpp.security.FlowSources

/**
* A z_stream Variable as a Flow source
*/
private class ZStreamVar extends VariableAccess {
ZStreamVar() { this.getType().hasName("z_stream") }
}

/**
* The `inflate`/`inflateSync` function is used in Flow sink
*/
private class DeflateFunction extends Function {
DeflateFunction() { hasGlobalName(["inflate", "inflateSync"]) }
Fixed Show fixed Hide fixed
}

module ZlibTaintConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) { source.asExpr() instanceof ZStreamVar }

predicate isSink(DataFlow::Node sink) {
exists(FunctionCall fc | fc.getTarget() instanceof DeflateFunction |
fc.getArgument(0) = sink.asExpr()
)
}

predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
exists(FunctionCall fc | fc.getTarget() instanceof DeflateFunction |
node1.asExpr() = fc.getArgument(0) and
node2.asExpr() = fc
)
}
}

module ZlibTaint = TaintTracking::Global<ZlibTaintConfig>;

import ZlibTaint::PathGraph

from ZlibTaint::PathNode source, ZlibTaint::PathNode sink
where ZlibTaint::flowPath(source, sink)
select sink.getNode(), source, sink, "This Decompression depends on a $@.", source.getNode(),
"potentially untrusted source"
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
/**
* @name User-controlled file decompression
* @description User-controlled data that flows into decompression library APIs without checking the compression rate is dangerous
* @kind path-problem
* @problem.severity error
* @security-severity 7.8
* @precision high
* @id cpp/user-controlled-file-decompression3
* @tags security
* experimental
* external/cwe/cwe-409
*/

import cpp
import semmle.code.cpp.ir.dataflow.TaintTracking
import semmle.code.cpp.security.FlowSources

/**
* A Bytef Variable as a Flow source
*/
private class BytefVar extends VariableAccess {
BytefVar() { this.getType().hasName("Bytef") }
}

/**
* The `uncompress`/`uncompress2` function is used in Flow sink
*/
private class UncompressFunction extends Function {
UncompressFunction() { hasGlobalName(["uncompress", "uncompress2"]) }
Fixed Show fixed Hide fixed
}

module ZlibTaintConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) { source.asExpr() instanceof BytefVar }

predicate isSink(DataFlow::Node sink) {
exists(FunctionCall fc | fc.getTarget() instanceof UncompressFunction |
fc.getArgument(0) = sink.asExpr()
)
}

predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
exists(FunctionCall fc | fc.getTarget() instanceof UncompressFunction |
node1.asExpr() = fc.getArgument(0) and
node2.asExpr() = fc
)
}
}

module ZlibTaint = TaintTracking::Global<ZlibTaintConfig>;

import ZlibTaint::PathGraph

from ZlibTaint::PathNode source, ZlibTaint::PathNode sink
where ZlibTaint::flowPath(source, sink)
select sink.getNode(), source, sink, "This Decompression depends on a $@.", source.getNode(),
"potentially untrusted source"
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
#include <iostream>
#include <vector>
#include "zlib.h"
int UnsafeRead() {
std::cout << "enter compressed file name!\n" << std::endl;
char fileName[100];
std::cin >> fileName;
gzFile inFileZ = gzopen(fileName, "rb");
if (inFileZ == nullptr) {
printf("Error: Failed to gzopen %s\n", fileName);
exit(0);
}
unsigned char unzipBuffer[8192];
unsigned int unzippedBytes;
std::vector<unsigned char> unzippedData;
while (true) {
unzippedBytes = gzread(inFileZ, unzipBuffer, 8192);
if (unzippedBytes > 0) {
unzippedData.insert(unzippedData.end(), unzipBuffer, unzipBuffer + unzippedBytes);
} else {
break;
}
}

for ( auto &&i: unzippedData)
std::cout << i;
gzclose(inFileZ);

return 0;
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
#include <iostream>
#include <vector>
#include "zlib.h"
int SafeRead() {
std::cout << "enter compressed file name!\n" << std::endl;
char fileName[100];
std::cin >> fileName;
gzFile inFileZ = gzopen(fileName, "rb");
if (inFileZ == nullptr) {
printf("Error: Failed to gzopen %s\n", fileName);
exit(0);
}
unsigned char unzipBuffer[8192];
unsigned int unzippedBytes;
uint totalRead = 0;
std::vector<unsigned char> unzippedData;
while (true) {
unzippedBytes = gzread(inFileZ, unzipBuffer, 8192);
totalRead += unzippedBytes;
if (unzippedBytes > 0) {
unzippedData.insert(unzippedData.end(), unzipBuffer, unzipBuffer + unzippedBytes);
if (totalRead > 1024 * 1024 * 4) {
std::cout << "Bombs!" << totalRead;
exit(1);
} else {
std::cout << "not Bomb yet!!" << totalRead << std::endl;
}
} else {
break;
}
}

for (auto &&i: unzippedData)
std::cout << i;
gzclose(inFileZ);

return 0;
}