-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Variable inclusion control #75
base: main
Are you sure you want to change the base?
Conversation
…averse/metatools into variable_inclusion_control
…averse/metatools into variable_inclusion_control
…averse/metatools into variable_inclusion_control
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Matt!
This looks really good there are just a few tweaks to be made
R/build.R
Outdated
#' - "PREREQUISITE": columns are kept if they are required for future | ||
#' derivations in the specification. For example, if | ||
#' a derivation references VSSTDTC despite this not | ||
#' being present in the ADaM specification, the column | ||
#' will be kept. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand what this means. Because it seems to be both saying that if it is in the specification it will be kept even if it isn't the specifications... Also can you make sure to add in what the pattern is we use. So something like "Variables will be retained if they are either in the derivation column of the metacore object in the form of DATASET.VARIABLE. So in the case of "BMI = ADSL.HEIGHTBL/WEIGHTBL^2" both HEIGHTBL and WEIGHTBL will be retrained delisted not being a predecessor (this will need to be edited cause I can don't think that is the right bmi calculation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have updated to be more specific and include examples of predecessors/prerequisites with definitions. Let me know if you think it's clearer!
R/build.R
Outdated
# Deprecate KEEP = TRUE | ||
keep <- match.arg(as.character(keep), c("TRUE", "FALSE", "ALL", "PREREQUISITE")) | ||
if (keep == "TRUE"){ | ||
warning("Setting 'keep' = TRUE has been superseded, and will be unavailable in future releases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you switch this to a cli_alert_warning. At some point I am going to improve the error messaging in this package and change everything over to cli
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have done so, and updated tests/NAMESPACE/DESCRIPTION to import CLI
…averse/metatools into variable_inclusion_control # Conflicts: # R/build.R
out <- data %>% | ||
select(x$col_name) %>% | ||
rename(all_of(rename_vec)) | ||
mutate(across(all_of(rename_vec))) %>% | ||
select(x$variable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the change to fix the issue with one predecessor to multiple columns - not sure if this would break anything, figure you probably understand how people might use it better,
Added new options for the "keep" parameter of build_from_derived():
Added supporting function prepare_join(), which deals with conflics when adding duplicated non-key columns from different source datasets
Added tests to account for new functionality
Fixed bug when two columns are remained from the same source column, and only one is included