Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variable inclusion control #75

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
Open

Conversation

MattBearham
Copy link
Collaborator

Added new options for the "keep" parameter of build_from_derived():

  • PREREQUISITE: Additionally keep columns referenced in derivation specifications
  • ALL: Brings through all columns present in original datasets
  • Deprecated keep = TRUE

Added supporting function prepare_join(), which deals with conflics when adding duplicated non-key columns from different source datasets

Added tests to account for new functionality

Fixed bug when two columns are remained from the same source column, and only one is included

Copy link
Collaborator

@statasaurus statasaurus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Matt!
This looks really good there are just a few tweaks to be made

R/build.R Outdated
Comment on lines 25 to 29
#' - "PREREQUISITE": columns are kept if they are required for future
#' derivations in the specification. For example, if
#' a derivation references VSSTDTC despite this not
#' being present in the ADaM specification, the column
#' will be kept.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand what this means. Because it seems to be both saying that if it is in the specification it will be kept even if it isn't the specifications... Also can you make sure to add in what the pattern is we use. So something like "Variables will be retained if they are either in the derivation column of the metacore object in the form of DATASET.VARIABLE. So in the case of "BMI = ADSL.HEIGHTBL/WEIGHTBL^2" both HEIGHTBL and WEIGHTBL will be retrained delisted not being a predecessor (this will need to be edited cause I can don't think that is the right bmi calculation)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have updated to be more specific and include examples of predecessors/prerequisites with definitions. Let me know if you think it's clearer!

R/build.R Outdated
# Deprecate KEEP = TRUE
keep <- match.arg(as.character(keep), c("TRUE", "FALSE", "ALL", "PREREQUISITE"))
if (keep == "TRUE"){
warning("Setting 'keep' = TRUE has been superseded, and will be unavailable in future releases.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you switch this to a cli_alert_warning. At some point I am going to improve the error messaging in this package and change everything over to cli

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have done so, and updated tests/NAMESPACE/DESCRIPTION to import CLI

R/build.R Outdated Show resolved Hide resolved
R/build.R Outdated Show resolved Hide resolved
R/build.R Outdated Show resolved Hide resolved
out <- data %>%
select(x$col_name) %>%
rename(all_of(rename_vec))
mutate(across(all_of(rename_vec))) %>%
select(x$variable)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the change to fix the issue with one predecessor to multiple columns - not sure if this would break anything, figure you probably understand how people might use it better,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants