BPF Design Decisions #593
I've been working with BPF for a while, both writing and reading it. tracee.bpf.c is some of the best BPF code I've ever seen! I just wanted to express that to start out :)
I do have some questions about the design decisions on the kernel side though; some of the paradigms are a bit unfamiliar to me in the context of BPF programming. I noticed in
if (event_chosen(RAW_SYS_ENTER)) {
    buf_t *submit_p = get_buf(SUBMIT_BUF_IDX);
    if (submit_p == NULL) {
        return 0;
    }
    set_buf_off(SUBMIT_BUF_IDX, sizeof(context_t));
    context_t context = init_and_save_context(ctx, submit_p, RAW_SYS_ENTER, 1, 0);
    u64 *tags = bpf_map_lookup_elem(&params_names_map, &context.eventid);
    if (!tags) {
        return -1;
    }
    save_to_submit_buf(submit_p, (void*)&id, sizeof(int), INT_T, DEC_ARG(0, *tags));
    events_perf_submit(ctx);
}
So I read this as:
Those last two bits don't make sense to me. Why submit to the perf buffer prior to hitting the Then you hit the Next is the So I decided to take a different approach and looked at some of the helper functions and came across
Again, I just wanted to compliment you on the high-quality code! I've reviewed a bunch of the bigger BPF applications out there (and a bunch of smaller ones too) and this has been, by far, my favorite to learn about. It's really opened my eyes to what's possible with BPF.
Replies: 2 comments
Hi @rtkaratekid,
You got most of the things right, and I think that what you are missing is that Tracee (more specifically tracee-ebpf) is capable of showing not only system call events. If you are using the latest docker image (v0.5.0), you can list all of the events that are supported by Tracee ebpf using the -l flag:
docker run -it --rm aquasec/tracee:latest trace -l
By listing the events, you will see events like security_bprm_check and security_file_open. These are actually bpf programs by themselves, which are not called by any tail call you've seen in the sys_enter code (but are attached using kprobes, as you can see at the end of the big EventsIDToEvent table in the consts.go file). You can actually choose just one of these events, which will load the relevant bpf program you've seen in the c code by, for example:
Getting back to your first question, note that when you list the events, there are also two events named
Now you can understand why there are three perf buffer submissions for each syscall. Usually, when the sys_enter and sys_exit events are not chosen (this is the default behavior), two of these submissions will never happen!
Now for your last (second) question. Note that bufs and bufs_off are percpu maps. This means that each cpu gets its own copy of the buffer, and no inter-cpu race conditions can happen. Also, we aggregate the data on the exit from the syscall (in the case of syscall events), and our tracing programs cannot be preempted before they finish, so no race conditions should occur.
Hope that this answers your questions. You might also find the cmdline help useful to explore other features of Tracee (like capturing file writes and some other stuff).
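To make the "three submissions, two of which usually never happen" point concrete, here is a minimal user-space sketch in plain C. It is not Tracee's code: the event IDs, the bitmask standing in for the chosen-events configuration, and the helpers choose_event, reset_sketch, submit_if_chosen and trace_one_syscall are all hypothetical. Only the idea of gating each submission behind an event_chosen check comes from the snippet in the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event IDs for this sketch; Tracee's real IDs differ. */
enum { EV_RAW_SYS_ENTER = 0, EV_RAW_SYS_EXIT = 1, EV_SYSCALL = 2, EV_COUNT = 3 };

/* One bit per event the user selected (stand-in for a config map). */
static uint32_t chosen_mask;
/* Counts how many times we would have written to the perf buffer. */
static int submissions;

static void reset_sketch(void) { chosen_mask = 0; submissions = 0; }

static void choose_event(int ev) {
    assert(ev >= 0 && ev < EV_COUNT);
    chosen_mask |= 1u << ev;
}

static bool event_chosen(int ev) { return (chosen_mask >> ev) & 1u; }

/* Every submission point is guarded by the same check. */
static void submit_if_chosen(int ev) {
    if (event_chosen(ev))
        submissions++;
}

/* One syscall passes three possible submission points.
 * Returns how many of them actually submitted. */
static int trace_one_syscall(void) {
    int before = submissions;
    submit_if_chosen(EV_RAW_SYS_ENTER); /* raw sys_enter event */
    submit_if_chosen(EV_RAW_SYS_EXIT);  /* raw sys_exit event */
    submit_if_chosen(EV_SYSCALL);       /* the syscall's own event */
    return submissions - before;
}
```

With only the syscall's own event chosen, trace_one_syscall reports a single submission; it reports three only when the raw sys_enter and sys_exit events are selected as well.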
@yanivagman thank you so much for the great answer!
I'm facepalming a bit here because I generally neglected reading the userspace code aside from the parts that deal with libbpf. I'm sure if I had just taken the time to better explore the cli options or the userspace code I would have seen this, sorry for the lack of effort!
Ah that makes so much more sense to me now! For some reason I assumed that all the code was meant to gather data and the arguments, but knowing this is just raw data collection makes the code so so much clearer for me.
This is a great example of a use of these maps then. I just haven't found a use case for very many of the BPF map types, but perhaps it's time for me to rethink that and go back to fundamentals to review map options. Thanks again, that answer was extremely helpful!
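For anyone else reviewing map options, the per-CPU buffer idea from the answer can be pictured with a small user-space simulation in plain C. This is only a sketch under assumptions: NUM_CPUS, BUF_SIZE, struct cpu_buf and the three helper functions are hypothetical names, and the CPU index is passed in by hand. In a real BPF program the kernel selects the current CPU's slot of a BPF_MAP_TYPE_PERCPU_ARRAY map for you.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_CPUS 4   /* hypothetical CPU count for the simulation */
#define BUF_SIZE 64  /* hypothetical scratch buffer size in bytes */

/* One scratch buffer plus its write offset, as in bufs / bufs_off. */
struct cpu_buf {
    unsigned char data[BUF_SIZE];
    size_t off;
};

/* One private copy per CPU: code running on CPU 0 never touches CPU 1's slot. */
static struct cpu_buf bufs[NUM_CPUS];

static struct cpu_buf *get_cpu_buf(int cpu) {
    assert(cpu >= 0 && cpu < NUM_CPUS);
    return &bufs[cpu];
}

/* Append len bytes to this CPU's buffer. Returns 0 on success, -1 if full. */
static int save_to_cpu_buf(int cpu, const void *src, size_t len) {
    struct cpu_buf *b = get_cpu_buf(cpu);
    if (len > BUF_SIZE - b->off)
        return -1;
    memcpy(b->data + b->off, src, len);
    b->off += len;
    return 0;
}

/* "Submit" this CPU's buffer: report the bytes gathered and reset the offset. */
static size_t submit_cpu_buf(int cpu) {
    struct cpu_buf *b = get_cpu_buf(cpu);
    size_t n = b->off;
    b->off = 0;
    return n;
}
```

Because each CPU writes only to its own slot, no locking is needed between CPUs. The remaining risk would be two programs interleaving on the same CPU, which is the case the answer addresses separately.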
Hi @rtkaratekid ,
Many thanks for the compliments! Knowing that people find your code helpful is very satisfying.
You got most of the things right, and I think that what you are missing is that Tracee (more specifically tracee-ebpf) is capable of showing not only system call events. If you are using the latest docker image (v0.5.0), you can list all of the events that are supported by Tracee ebpf using the -l flag:
docker run -it --rm aquasec/tracee:latest trace -l
Note: In previous versions of Tracee, there was no need to provide the trace command, but this is now necessary as tracee-ebpf became just one of the components of Tracee (the other major component is tracee-rules).
By listing …