ExecHash hashing algorithm choice #1352
Replies: 6 comments
-
Just out of curiosity, does the result change if you use an entropy generator? For example, if you install "haveged" and run the same test, is hashing still a bottleneck? If the bottleneck is waiting for entropy during the hash calculation, then a best practice might be to make sure enough entropy is available, either with a HAVEGE-like algorithm or with specific hardware support for random number generation.
-
Nope, not yet.
-
What do you mean? Have you tested it and found that it doesn't change anything, or have you not tested it yet?
-
We haven't tested that yet. If and when we do, I'll update the issue.
-
Although this code path is indeed slow, keep in mind that it runs infrequently.
2+3 combined shouldn't happen too frequently. Another consideration when choosing the hashing algorithm is that the resulting hash is fed to static analysis tools (AVs), which must support the selected algorithm.
-
Converting this issue to a discussion.
-
@mtcherni95 and I are currently researching tracee's performance. One of the code paths we noticed to be something of a bottleneck in tracee-ebpf's event processing is the hashing done by the ExecHash feature.
In particular, the getFileHash function seems to spend most of its time writing the SHA256 hash, as shown here:
You can see from the graph that getFileHash uses io.Copy, which calls the Write implementation of sha256, and that this is the bottleneck in this code path.
We were considering the possibility of using a different hashing algorithm, for example xxhash (which is non-cryptographic) or SHA-1 if we need the hashing algorithm to be cryptographic.
I'd like to get some opinions so we can consider if we should change the algorithm and which one to choose if so.