
Code review for contributions to MAMBO project #30

Open · metazool opened this issue Jul 29, 2024 · 4 comments

metazool (Collaborator) commented Jul 29, 2024
As a group we've had a request to offer code review of work contributed to the MAMBO project.

For now this is a placeholder issue while we discover:

  • How much of our time the project budget has earmarked for the engagement
  • What the volume and quality of the code looks like in outline, and whether a considered review fits in the maximum time available
  • Any specific needs (documentation, process) that make a code review a condition of the project

I've reached out to the PI and am trying to close the loop by pointing the code contributors here. Links to repos and docs will be added as they are discovered. The date for completing whatever can be done is likely to be the end of November 2024.

What we anticipate getting out of this exercise:

  • Identifying through practice where workshops or hands-on support would benefit research coders (testing, CI, MLOps, other)
  • An opportunity for more than one person to work on this together and learn from one another
  • Insight into the domain: what's happening Europe-wide in biodiversity monitoring?
metazool (Collaborator, Author) commented

Quick update on this - we've received links to three projects, one of which is accessible to read.

As submitted it needs more documentation; I've pushed back on accepting it for review until it reaches a certain threshold, and we should be involved in helping to define what that threshold is in future projects that look like this one. The bulk of my response by email is included below:

https://book.the-turing-way.org/reproducible-research/reviewing/reviewing-checklist - the Turing Institute produces a decent, if idealised, guide to the process of reviewing research code that we would be looking to go through.

Their checklist goes into a lot of detail, not all of which we would insist upon. For example, if someone is working in a setting where they have never been expected to produce automated tests that run and check their code, and they hadn’t been explicitly asked to do this at the project outset, then we couldn’t demand that of them (and would consider contributing some tests ourselves as part of the review process).

How much capacity do the project partners have to ask the contributors to do a bit more work to make their code reusable? As a minimum threshold for the results to be reproducible, code included in a research project needs:

• A description of how to build the software, and guidance for any software it depends on
• A description of how to run it against the sample data provided, including any specific conditions for the environment it runs in
• A description of what the intended outcome is

This doesn't need to be a lot of overhead, or a formal document; it could be a few paragraphs in a README.txt file. I would push back against accepting anything that doesn't have this in place. We could, and ideally should, ask for more thoroughness and go further into that Turing checklist: documentation within the code itself about what each function does; a standard layout to aid readability; and, crucially, a set of code tests that check its output, check that it handles bad data properly, and illustrate its inner workings. But if this wasn't requested up front, and would cause extra work the project partners don't have resources for, then we probably can't expect it now.
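To give a flavour of the kind of lightweight tests we might contribute ourselves as part of the review, here is a minimal pytest sketch. The `estimate_biomass` function is a toy stand-in invented for illustration, not anything from the submitted code.

```python
# Minimal sketch of the sort of tests described above: check the output on a
# hand-checked case, and check that bad data fails loudly. The function below
# is a hypothetical stand-in so the example runs on its own.
import pytest


def estimate_biomass(points):
    """Toy stand-in: sums point heights, purely to give the tests a target."""
    if not points:
        raise ValueError("no points supplied")
    return sum(z for _, _, z in points)


def test_known_input_gives_expected_output():
    # A small, hand-checked case documents the intended outcome.
    points = [(0.0, 0.0, 1.5), (1.0, 1.0, 2.0)]
    assert estimate_biomass(points) == pytest.approx(3.5)


def test_bad_data_is_rejected():
    # Empty or malformed input should raise rather than return nonsense.
    with pytest.raises(ValueError):
        estimate_biomass([])
```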

metazool (Collaborator, Author) commented Dec 3, 2024

https://github.com/barbedorafael/shrub-prepro
https://github.com/barbedorafael/att-unet-shrub-id

  • A couple of repositories in @barbedorafael's personal namespace which are part of the same project and would ideally form stages of the same pipeline (a U-Net for segmentation masks, then algorithmic methods for biomass estimation from LIDAR). Rafael's other projects were prioritised above this one, so it's worth finding a more institutional, sustainable home for the repositories.

@mattjbr123 - tagging you again as you've been in the loop before, and if we check https://github.com/Jinhu-Wang/Tree_Classification_and_Individualization_In_Marsh_Area/tree/main/src there have been some very nice improvements made by @Jinhu-Wang.
