Category inconsistent with paper on NC #19

Open
PonderLi opened this issue Jul 20, 2024 · 4 comments

@PonderLi

Hello,
I recently read your paper published in Nature Communications, in which you mention that the script getElementClassifications.R (https://github.com/clb21565/mobileOGdb/blob/main/scripts/getElementClassifications.R) was used to classify MGEs.

In the paper, the MGEs were profiled according to the following rules:
MGE marker hits were subclassified into element classes of plasmid (sequences derived from COMPASS [64] or NCBI Plasmid RefSeq [65]), transposable element (sequences derived from ISfinder [66]), integrative (sequences derived from ICEberg [67] and integration/excision category proteins not included in ISfinder), or conjugative types (sequences with the transfer major mobileOG category and conjugation minor category) using the script getElementClassifications.R.

But in getElementClassifications.R, the rules seem to be inconsistent with this description.

Could you help me choose the correct rules? Thanks for your help!

@clb21565
Owner

clb21565 commented Jul 20, 2024

Hi there, thanks for using mobileOG-db!

I'm unclear what you are referring to here. The getElementClassifications.R script just creates a new file using the mobileOG-db metadata with classifications. As stated in the paper, proteins can be assigned to multiple classes (as in the figure on the tool readme page) if they are found in more than one database or meet more than one of those conditions. IOW, it is possible to have a protein that is labeled as a plasmid and insertion sequence, for example.

Please do send an example if this doesn't clear things up!

@PonderLi
Author

Thanks for your quick and kind reply!

For example, in the paper integrative elements were defined as sequences derived from ICEberg and integration/excision category proteins not included in ISfinder.
In the script, the integrative elements only include sequences derived from ICEberg/immedb that are not in ISfinder. The integration/excision category is not considered, which particularly affects some MGEs in ACLAME that have this feature. The code is shown below:
IGE=mobileOGs_classes%>%subset(AICE!=0|ICE!=0|CIME!=0|immedb!=0|IME!=0)%>%subset(ISFinder==0)
IGE$MGE_Class="Integrative Element (IGE)"

When I tried to classify the MGEs with the rule (sequences derived from integration/excision category proteins not included in ISfinder), some MGEs were annotated with more than one class, such as phage and integrative element. I know this is OK according to your explanation. Thanks a lot!
I am currently mainly confused about the classification method for integrative elements. Should I follow the method described in the paper or the one used in the script?
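
To make the difference concrete, here is a minimal sketch of how the paper's wording could be expressed, using base R on a toy table. The database columns (AICE, ICE, CIME, immedb, IME, ISFinder) come from the script snippet above; the column name Major_Category, its values, and the mobileOG_ID column are hypothetical names I made up for illustration, and I am assuming the "not included in ISfinder" condition applies to the whole class, as the script's filter does.

```r
# Toy stand-in for the mobileOG-db metadata table (not real data).
# Major_Category and mobileOG_ID are hypothetical column names.
mobileOGs_classes <- data.frame(
  mobileOG_ID    = c("p1", "p2", "p3", "p4"),
  AICE           = c(0, 0, 0, 0),
  ICE            = c(1, 0, 0, 0),
  CIME           = c(0, 0, 0, 0),
  immedb         = c(0, 0, 0, 0),
  IME            = c(0, 0, 0, 0),
  ISFinder       = c(0, 0, 1, 0),
  Major_Category = c("transfer", "integration/excision",
                     "integration/excision", "phage"),
  stringsAsFactors = FALSE
)

# Condition already in the script: hit in an ICEberg/immedb-type source.
from_ice_db <- with(mobileOGs_classes,
                    AICE != 0 | ICE != 0 | CIME != 0 | immedb != 0 | IME != 0)

# Extra condition from the paper's wording: integration/excision category.
is_int_exc <- mobileOGs_classes$Major_Category == "integration/excision"

# Keep rows meeting either condition, then exclude anything found in ISfinder.
IGE <- subset(mobileOGs_classes, (from_ice_db | is_int_exc) & ISFinder == 0)
IGE$MGE_Class <- "Integrative Element (IGE)"

print(IGE[, c("mobileOG_ID", "MGE_Class")])
```

In this toy table, p3 has the integration/excision category but is also in ISfinder, so it is left out of the integrative class, while p2 is included only because of its category.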

@clb21565
Owner

Ah, nice catch! I suggest following the paper's definition. I will push an update to the script shortly.

@PonderLi
Author

Thank you for your answer and help. I wish you a joyful day every day.
