- Added function `sort_site_list` for numerically sorting a string of comma-separated integers, as returned by `group_concat()`.
- INTERNAL: Removed the redundant inline function `split_by_substr_view` and used the already-existing `split_by_substr` function instead.
- Optimized functions `sort_alleles` and `sort_list_unique` for efficiency.
- Updated documentation.
- Added distance function `tn_93` for calculating the Tamura-Nei (TN-93) distance between two aligned nucleotide sequences, with an optional parameter for gamma correction for variability of mutations among sites.
- Added aggregate functions `alphanumeric_entropy`, `nt_entropy`, `aa_entropy`, and `codon_entropy`, which calculate the Shannon information entropy for alphanumeric strings, nucleotides, amino acids, and codons, respectively (the latter three expect only one per record).
- Added scalar function `alnum_entropy` to calculate the Shannon entropy of an alphanumeric string.
- Changed the annotation for glycosylation in `mutation_list_gly` and `mutation_list_indel_gly` from reporting `-ADD-GLY` and `-LOSS-GLY` to the more conventionally accepted `(CHO+)`, `(CHO+/-)`, and `(CHO-)`. `mutation_list_gly` and `mutation_list_indel_gly` will ignore `.` (which represents an unresolved base at the 3' or 5' end) in a sequence but will correctly treat `-` as a gap.
- Unit tests now check that `.` is correctly ignored in `mutation_list_gly` and `mutation_list_indel_gly`.
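
For readers unfamiliar with two of the behaviors listed above, the following Python sketch shows what numeric sorting of a comma-separated integer string and Shannon entropy over observed symbols could look like. It is not the library's implementation: the names `sort_site_list_sketch` and `shannon_entropy_sketch` are hypothetical, and the choices to keep duplicates, to skip empty tokens, and to use a base-2 logarithm are assumptions that the changelog does not confirm.

```python
import math
from collections import Counter


def sort_site_list_sketch(sites: str) -> str:
    """Numerically sort a comma-separated string of integers.

    Hypothetical stand-in for the behavior described for sort_site_list.
    Assumption: duplicates are kept and empty tokens are skipped.
    """
    values = [int(tok) for tok in sites.split(",") if tok.strip()]
    return ",".join(str(v) for v in sorted(values))


def shannon_entropy_sketch(symbols) -> float:
    """Shannon entropy of the observed symbol frequencies.

    Assumption: entropy is reported in bits (log base 2); the real
    functions may use a different base.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = -sum(p * log2(p)) over each distinct symbol's frequency p
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    print(sort_site_list_sketch("10,9,100,2"))  # 2,9,10,100
    print(shannon_entropy_sketch("ACGT"))       # 2.0
```

A plain string sort would order `"10,9,100,2"` as `10,100,2,9`, which is why the values are converted to integers before sorting. As a side effect, parsing through `int` normalizes tokens such as `07` to `7`.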