modification support for DORADO #46

Open
lpryszcz opened this issue Mar 4, 2024 · 0 comments

Collaborator

lpryszcz commented Mar 4, 2024

Hi Luca, this is mostly copied from @soniacruciani, with some of my own comments:

dorado encodes modification calls (e.g. when using the RNA004 m6A DRACH model) in the MM and ML tags of its BAM output.
To support the dorado m6A model in the upcoming release, we have two options:

  1. Do the mapping with dorado, so the BAM output is already mapped. This is not ideal because dorado offers little control over alignment, e.g. it lacks spliced alignment.
  2. Do the mapping after basecalling and carry the modification info (MM and ML tags) through FastQ into the BAM using the following commands:
dorado basecall > unmapped.BAM_with_mods_from_dorado
samtools view -F3840 -bu unmapped.BAM_with_mods_from_dorado | samtools fastq -T MM,ML,mv,pt,ts > fastq_with_mods
minimap2 -y ... $ref fastq_with_mods | samtools sort --write-index -o mapped.BAM.with_mods
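
As a quick sanity check on the intermediate file, here is a minimal Python sketch that tests whether a FastQ header line still carries the expected tags. It assumes that `samtools fastq -T` appends the tags to the header as tab-separated SAM-style fields (`TAG:TYPE:VALUE`), which `minimap2 -y` then copies into the alignment. The function names `parse_fastq_header` and `missing_tags` and the example header are hypothetical, for illustration only.

```python
def parse_fastq_header(header: str) -> tuple[str, dict[str, str]]:
    """Split a FastQ header line into the read name and its SAM-style tags.

    Assumes tags are tab-separated TAG:TYPE:VALUE fields after the read name.
    """
    line = header.rstrip("\n")
    if line.startswith("@"):
        line = line[1:]
    fields = line.split("\t")
    name = fields[0].split(" ")[0]
    tags: dict[str, str] = {}
    for field in fields[1:]:
        parts = field.split(":", 2)
        if len(parts) == 3:  # ignore anything that is not TAG:TYPE:VALUE
            tag, typ, value = parts
            tags[tag] = f"{typ}:{value}"
    return name, tags


def missing_tags(header: str, required=("MM", "ML")) -> list[str]:
    """Return the required tags that are absent from the header."""
    _, tags = parse_fastq_header(header)
    return [tag for tag in required if tag not in tags]


# Made-up header, not real dorado output.
example = "@read1\tMM:Z:A+a,0,1;\tML:B:C,200,10\tpt:i:87"
print(missing_tags(example, ("MM", "ML", "mv", "pt", "ts")))
```

If the list is non-empty for reads that should carry modifications, the tags were most likely dropped somewhere before mapping.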

Note that the second option has a drawback: dorado/remora uses the reference sequence to make better modification calls (reference-anchored calling), so if we go with it, expect lower modification calling accuracy.

If we go with the 2nd option, I'd recommend including a few more useful tags in the final BAM:

  • mv - move table
  • pt - estimated polyA tail length
  • ts - template start
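
To verify that the modification calls in the final BAM are still interpretable, the MM and ML tags can be decoded back to read positions. The sketch below follows the SAM tags specification (MM holds skip counts over occurrences of the canonical base, ML holds 0-255 scaled probabilities) under simplifying assumptions: the sequence is in its original basecalled orientation, only `+` strand entries with a single modification code each are handled, and the midpoint of each ML bin is reported as the probability. The function name `decode_mods` and the example values are hypothetical.

```python
def decode_mods(seq: str, mm: str, ml: list[int]) -> list[tuple[int, str, float]]:
    """Decode MM/ML values into (0-based position, mod code, probability).

    Simplified sketch: forward-orientation sequence, one code per MM entry.
    """
    calls = []
    ml_values = iter(ml)
    for entry in mm.rstrip(";").split(";"):
        if not entry:
            continue
        head, *deltas = entry.split(",")
        base = head[0]                  # canonical base, e.g. "A"
        code = head[2:].rstrip(".?")    # modification code, e.g. "a" for m6A
        # Positions of the canonical base in the read ("N" matches any base).
        candidates = [i for i, b in enumerate(seq) if base == "N" or b == base]
        idx = -1
        for delta in deltas:
            idx += int(delta) + 1       # skip `delta` occurrences, take the next
            q = next(ml_values)
            calls.append((candidates[idx], code, (q + 0.5) / 256))
    return calls


# Made-up read: A occurs at positions 1, 2, 4 and 6.
calls = decode_mods("GAACATA", "A+a,0,1;", [200, 10])
for pos, code, prob in calls:
    print(pos, code, round(prob, 3))
```

For real data a library such as pysam is likely a better fit, since it also handles reverse-strand reads; this sketch is only meant to show what the two tags encode.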