
Significant increase in unknown/unassigned reads - change in GPU architecture??? #28

Closed
DepledgeLab opened this issue Jan 12, 2024 · 4 comments


@DepledgeLab

Hello,

The HPC that I use for deeplexicon recently received a hardware upgrade that broke my installation. I have since rebuilt it and am now performing the demultiplexing step on an NVIDIA A100 80GB PCIe (previously Tesla V100-SXM2-16GB). However, I am getting a huge number of unknown/unassigned reads, even when repeating previous demultiplexing runs...

For instance,

| Barcode | Original Tesla V100 run | New NVIDIA A100 run |
| ------- | ----------------------: | ------------------: |
| bc_1    | 225803 | 86091 |
| bc_2    | 124918 | 29855 |
| bc_3    | 172721 | 6 |
| bc_4    | 91739  | 49 |
| unknown | 19497  | 518677 |
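
For reference, the tallies above appear to come from counting the values in the Barcode column of the deeplexicon demux output (the "Barcode" header shows up once in the counts). A minimal sketch of the same tally in Python; the filename is a placeholder:

```python
# Tally reads per barcode from the deeplexicon demux output.
# Assumes a TSV with a 'Barcode' column, as the header visible in the
# counts above suggests; "demux_output.tsv" is a hypothetical filename.
import pandas as pd

calls = pd.read_csv("demux_output.tsv", sep="\t")
print(calls["Barcode"].value_counts())
```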

I guess I don't understand enough about GPU processing to know whether the GPU itself is causing the problem or whether something else is. I would really appreciate any help you can give on this front.

The (rebuilt) virtual environment I am using has the following packages installed (pip freeze).

absl-py==0.7.1
astor==0.8.0
cycler==0.10.0
gast==0.2.2
google-pasta==0.1.7
grpcio==1.22.0
h5py==2.10.0
joblib==0.13.2
Keras==2.2.4
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.1.0
llvmlite==0.36.0
Markdown==3.1.1
matplotlib==3.1.1
mock==5.1.0
numba==0.53.0
numpy==1.17.0
ont-fast5-api==3.3.0
packaging==23.2
pandas==0.25.0
progressbar33==2.4
protobuf==3.9.1
pyparsing==2.4.2
python-dateutil==2.8.0
pyts==0.8.0
pytz==2019.2
PyYAML==5.1.2
scikit-learn==0.21.3
scipy==1.3.1
six==1.12.0
tensorboard==1.13.1
tensorflow==1.13.1
tensorflow-estimator==1.13.0
tensorflow-gpu==1.13.1
termcolor==1.1.0
Werkzeug==0.15.5
wrapt==1.11.2
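
Worth noting: tensorflow-gpu==1.13.1 in the list above was built against CUDA 10, which predates the A100's compute capability (8.0), so this build may not be able to use the card at all. A minimal sketch for checking whether this TensorFlow actually sees and uses the GPU (run inside the same venv; the tiny matmul is purely illustrative):

```python
# Sanity check, assuming the venv from the pip freeze above: does this
# TensorFlow 1.13 build register the GPU, and do ops actually run on it?
import tensorflow as tf
from tensorflow.python.client import device_lib

# List every device TensorFlow can see; on a card whose compute capability
# is newer than the CUDA toolkit TF was built with, the GPU may be missing
# here or kernels may fail to launch.
print(device_lib.list_local_devices())

# Run a tiny op with device placement logging so it is obvious whether the
# computation lands on /device:GPU:0 or silently falls back to the CPU.
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[0.5, 0.0], [0.0, 0.5]])
    print(sess.run(tf.matmul(a, b)))
```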

@DepledgeLab (Author)

A small update. I managed to find a node on the HPC that was still running the older Tesla V100 and the results now look 'normal' again in terms of most reads being assigned a specific barcode.

However, I am still concerned about why this doesn't happen with the A100 cards and would appreciate your thoughts on this. Is there any reason to suspect that the 'normal' results from the V100 might also be compromised?

@DepledgeLab (Author)

I have belatedly discovered that this issue has been reported in other forms here and in biocorecrg/BioNextflow#17.

A warning on the GitHub page (and in other places) about not using newer NVIDIA GPU architectures is definitely needed.

@enovoa (Collaborator) commented Jan 12, 2024

Hi @DepledgeLab, yes indeed, we ran into the same issue ourselves when we moved to testing newer GPU architectures, and the issue you refer to above is one we opened ourselves. I fully agree that we should put a warning in the README; I will fix this now.

PS: To solve the GPU issue and also to improve demultiplexing speed, we've been working on an alternative approach to demultiplexing direct RNA runs. The code is embedded in the new version of MasterOfPores (version 3), which we will release once we fix a couple of final things. It should be ready very soon; I will comment here once it is public. Thanks!

@enovoa (Collaborator) commented Jan 12, 2024

Updated the README with a WARNING about the required CUDA version.
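
For anyone checking a node before a run: since numba is already in the environment listed earlier in this thread, the card's compute capability can be queried directly. A small sketch under that assumption (it requires a working CUDA driver on the node):

```python
# Quick check of which card the CUDA driver exposes and its compute
# capability, using numba (already in the environment above). TensorFlow
# 1.13 was built against CUDA 10, which predates compute capability 8.0
# (A100), so a result of (8, 0) or higher flags an unsupported combination.
from numba import cuda

dev = cuda.get_current_device()
print(dev.name, dev.compute_capability)
```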
