Line length limit on input FASTA file: 65,536 characters (limit imposed by bioperl) #56
Comments
Incidentally, if anyone bumps into the same issue, you can use FASTX-Toolkit to reformat your FASTA (see http://hannonlab.cshl.edu/fastx_toolkit/commandline.html#fasta_formatter_usage). It can be installed using conda.
Yes, I could add a patch to reformat the FASTA file in such cases, but I would prefer that this type of fix be handled within BioPerl directly.
See here for the discussion with the BioPerl team: bioperl/bioperl-live#345
I see BioPerl does not have a plan to fix this issue. Here is a Perl alternative to the fastx_toolkit, written by Ning Jiang: https://github.com/oushujun/LTR_retriever/blob/master/bin/fasta-reformat.pl. It's slower but free of third-party dependencies.
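For anyone who would rather not install anything, the same reformatting can be sketched in a few lines of Python. This is only an illustration of the workaround discussed above, not part of AGAT, BioPerl, or either tool linked here; the function names `wrap_fasta` and `reformat_file` and the default width of 60 are assumptions made for the example. It holds one full sequence in memory at a time, which should be fine for most genomes but is worth keeping in mind for very large records.

```python
def wrap_fasta(lines, width=60):
    """Yield FASTA lines with each sequence re-wrapped to at most `width` characters.

    `lines` is any iterable of strings (for example an open file handle).
    Header lines (starting with '>') are passed through unchanged.
    """
    seq_parts = []

    def flush():
        # Join the buffered sequence pieces and emit them in fixed-width chunks.
        seq = "".join(seq_parts)
        seq_parts.clear()
        for i in range(0, len(seq), width):
            yield seq[i:i + width]

    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            yield from flush()   # finish the previous record, if any
            yield line
        elif line:
            seq_parts.append(line)
    yield from flush()           # finish the last record


def reformat_file(src_path, dst_path, width=60):
    """Hypothetical helper: rewrite `src_path` as a multi-line FASTA at `dst_path`."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for out_line in wrap_fasta(src, width):
            dst.write(out_line + "\n")
```

Calling `reformat_file("input.fa", "input.wrapped.fa")` (file names are placeholders) before running the AGAT script should keep every sequence line well under the 65,536-character limit.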
Hello,
I'm trying to run the following command:
agat_sp_extract_sequences.pl -g JU2526_Y39G10AR.22.gff -f JU2526*_region.fa -p
And it throws the following error:
It would appear that the use of BioPerl means your scripts won't accept single-line FASTAs with sequences longer than 65,536 characters. Would it be possible to pre-process the FASTAs within your scripts (i.e. convert them from single-line to multi-line) so that they work regardless of the input format? While it's straightforward enough to convert the FASTA file before running your scripts, it would be far more convenient to have the script do it. It would probably save you a tonne of time with confused users, too.
Thanks,
Lewis
PS: I've only begun using AGAT but it seems like it will largely solve the constant pain of working with GFF3 files. Huge thanks for developing it!