Somewhat unclear references to types in SAMtags #798

Open
cmdcolin opened this issue Oct 14, 2024 · 1 comment
@cmdcolin
Contributor

"Optional fields are usually displayed as TAG:TYPE:VALUE; the type may be one of A (character), B (general
array), f (real number), H (hexadecimal array), i (integer), or Z (string)."

However, there are also C, I, and S, and it is not immediately obvious from the sentence that B is combined with those types. It is also somewhat unclear what H is.

Digging deeper into the cross-references from the BAM/CRAM docs might provide more clarity, but perhaps explicitly linking to those docs from SAMtags would help.

@jkbonfield
Contributor

c, C, i, I and s, S are (or were) BAM encoding specific. So in SAM we could have AB:i:7 while in BAM it would be e.g. AB C \007.
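
To make the AB:i:7 example concrete, here is a minimal Python sketch of the text-to-binary step. The helper name `encode_small_uint_tag` is hypothetical and it only covers the uint8 (`C`) case described above; it is an illustration of the layout (two tag bytes, one type byte, then the value), not how any particular library does it.

```python
import struct


def encode_small_uint_tag(tag: str, value: int) -> bytes:
    """Pack a tag as BAM would store a uint8 value: TAG, 'C', one byte.

    Hypothetical helper for illustration; real encoders handle every type.
    """
    if len(tag) != 2:
        raise ValueError("tags are exactly two characters")
    if not 0 <= value <= 0xFF:
        raise ValueError("this sketch only handles values that fit in uint8")
    return tag.encode("ascii") + b"C" + struct.pack("<B", value)


# SAM text form: TAG:TYPE:VALUE, where the type is always 'i' for integers.
sam_text = "AB:i:7"
tag, sam_type, value = sam_text.split(":", 2)

# BAM binary form: the encoder is free to pick the narrower 'C' type.
bam_bytes = encode_small_uint_tag(tag, int(value))
```

Here `bam_bytes` is the four bytes `A`, `B`, `C`, `\x07`, matching the `AB C \007` notation above.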

Once "B" was added as a byte array, the internal C,I,S representation was exposed to the text format. Although arguably it's not needed and i could have sufficed, I am guessing this was to aid rapid conversion to BAM and to avoid the need for multiple passes through the data to work out the minimum and maximum values.
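
The extra pass mentioned above can be sketched as follows. This assumes a hypothetical converter that only knows "array of integers" and has to choose the narrowest BAM integer type itself; the function names and the preference for unsigned types at each width are assumptions for illustration, not taken from any implementation.

```python
# (type code, minimum, maximum) for the six BAM integer types,
# ordered narrowest first, unsigned before signed at each width.
INT_TYPES = [
    ("C", 0, 2**8 - 1),
    ("c", -(2**7), 2**7 - 1),
    ("S", 0, 2**16 - 1),
    ("s", -(2**15), 2**15 - 1),
    ("I", 0, 2**32 - 1),
    ("i", -(2**31), 2**31 - 1),
]


def smallest_int_type(lo: int, hi: int) -> str:
    """Return the first type code whose range covers [lo, hi]."""
    for code, tmin, tmax in INT_TYPES:
        if tmin <= lo and hi <= tmax:
            return code
    raise ValueError("range does not fit in 32 bits")


def format_b_array(tag: str, values: list[int]) -> str:
    """Render an integer array as a SAM B field with an explicit subtype."""
    if not values:
        raise ValueError("this sketch expects at least one value")
    # min() and max() are the scan over the data that an explicit
    # subtype in the text format lets a SAM-to-BAM converter skip.
    code = smallest_int_type(min(values), max(values))
    return f"{tag}:B:{code}," + ",".join(str(v) for v in values)
```

For example, `format_b_array("XY", [1, 300, 2])` gives `XY:B:S,1,300,2`. When the subtype is already written in the SAM text, a converter can copy it through without inspecting the values first.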

PS. I'm not sure I like "are usually displayed". For SAM it's mandatory, and there's really no "display" for CRAM or BAM. It's a bit of a woolly definition. We should probably copy the table from SAMv1.tex where it defines them more precisely using a regular expression.
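
As a rough illustration of what a regular-expression definition buys, here is a Python sketch of a validator for TAG:TYPE:VALUE fields. The patterns are written from my reading of the optional-field table in SAMv1.tex and should be checked against the current spec before being relied on; `is_valid_optional_field` is a hypothetical name.

```python
import re

# A number as allowed in 'f' values and in 'B' array elements.
_NUM = r"[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?"

# One pattern per SAM text type (assumed to mirror the SAMv1 table).
VALUE_PATTERNS = {
    "A": re.compile(r"[!-~]"),                         # printable character
    "i": re.compile(r"[-+]?[0-9]+"),                   # signed integer
    "f": re.compile(_NUM),                             # floating-point number
    "Z": re.compile(r"[ !-~]*"),                       # printable string
    "H": re.compile(r"(?:[0-9A-F][0-9A-F])*"),         # hex-encoded bytes
    "B": re.compile(r"[cCsSiIf](?:," + _NUM + r")*"),  # subtype + array
}

TAG_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]")


def is_valid_optional_field(field: str) -> bool:
    """Check a single TAG:TYPE:VALUE field against the patterns above."""
    parts = field.split(":", 2)  # VALUE may itself contain ':'
    if len(parts) != 3:
        return False
    tag, type_code, value = parts
    pattern = VALUE_PATTERNS.get(type_code)
    if pattern is None or TAG_PATTERN.fullmatch(tag) is None:
        return False
    return pattern.fullmatch(value) is not None
```

With these patterns, `AB:i:7` and `XY:B:S,1,2,3` are accepted, while `AB:C:7` is rejected because `C` only appears as a subtype inside a `B` array, which is the distinction the original sentence in SAMtags leaves implicit.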
