-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat/better seed debug logging #1240
base: master
Are you sure you want to change the base?
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
05c958c
to
27d669e
Compare
let max_shift = seed_matches.iter().map(|sm| sm.offset.abs()).max().unwrap_or(0); | ||
// Need to iterate in pairs | ||
let max_unmatched_ref_stretch = seed_matches | ||
.iter() | ||
.tuple_windows() | ||
.map(|(sm1, sm2)| sm2.ref_pos - (sm1.ref_pos + sm1.length)) | ||
.max() | ||
.unwrap_or(0); | ||
let max_unmatched_qry_stretch = seed_matches | ||
.iter() | ||
.tuple_windows() | ||
.map(|(sm1, sm2)| sm2.qry_pos - (sm1.qry_pos + sm1.length)) | ||
.max() | ||
.unwrap_or(0); | ||
let first_match = seed_matches.first().unwrap(); | ||
let last_match = seed_matches.last().unwrap(); | ||
let first_match_ref_pos = first_match.ref_pos; | ||
let first_match_qry_pos = first_match.qry_pos; | ||
let last_match_ref_pos = ref_seq.len() - last_match.ref_pos - last_match.length; | ||
let last_match_qry_pos = qry_seq.len() - last_match.qry_pos - last_match.length; | ||
|
||
let n_matches = seed_matches.len(); | ||
debug!("Chained seed stats. max indel: {max_shift}, # matches: {n_matches}, first/last match distance from start/end [start: (ref: {first_match_ref_pos}, qry: {first_match_qry_pos}), end: (ref: {last_match_ref_pos}, qry: {last_match_qry_pos})], max unmatched stretch (ref: {max_unmatched_ref_stretch}, qry stretch: {max_unmatched_qry_stretch})"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please extract the body of the condition into a separate function (keeping the condition itself where it is), so that it does not distract from the main algo code.
Note that the conditional jump will always be executed in production. (And every log macro invocation also adds a similar conditional jump underneath, which might hurt performance in tight loops. This particular place seems to be executed rarely though.)
Is this thing required in release builds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point to refactor into a function.
I know that this will be always executed in production - but it's only executed once per sequence, I don't think we need to think about perf at things so rarely executed.
Not required in release builds but it's nice if once can always get this info - we also include lots of trace statements in release builds. What's the argument against including it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. No particular argument other than it becomes unnecessary instructions for most executions. Let's hope branch predictor figures it out :)
93f31a9
to
edefa32
Compare
27d4a31
to
9010d85
Compare
We're currently not really using
debug
level logging for anything. I think it'd be useful to output some basic stats for each sequence like what proportion of the query are covered by chained matches, mean band width, max indel, and things like that.This is what this PR outputs at debug level (
-vv
):