This is the official implementation, with training code, for the thesis Cell Morphology Based Diagnosis of Cancer using Convolutional Neural Networks: CellNet. For technical details, please refer to:
Cell Morphology Based Diagnosis of Cancer using Convolutional Neural Networks: CellNet
Qiang Li*, Corin Otesteanu*, Manfred Claassen*
Paper: in preparation
[Software Report] [CellNetSoftware Video] [Research Grant Page]
These are the reproduction results from this repository. All training/testing LSF log files from the ETH Zurich Leonhard cluster can be downloaded from our lsf file, and all original data used to generate the data-analysis graphs can be downloaded from the all data file.
Comparison in terms of detection/segmentation accuracy with YOLO-based methods (Redmon & Farhadi, 2018) (He et al., 2019). We selected 850 representative images from the Sezary syndrome dataset for training, consisting of noise images and typical cell images (manually labeled HD cell images and SS cell images). In the evaluation stage, we used 723 images (308 HD cell images, 306 SS cell images, and 109 noise images). This split is intended to approximate the actual cell data distribution, since noise images are less frequent than cell images in the real Sezary dataset. It is worth noting that AttentionNet* denotes the combination of the algorithms mentioned above, including GBCIOU segmentation, K-means++ clustering in pre-processing, and 13 × 13 and 26 × 26 YOLO output layers, in contrast to the original YOLO, which is widely used only for detection or object localization without segmentation. TP means a cell detected as a cell, FP means noise detected as a cell, and TN means a noise image correctly labeled as noise. mAP refers to mean Average Precision.
Model | TP (cell detected as cell) | FP (noise detected as cell) | TN (noise detected as noise) | No detection | mAP |
---|---|---|---|---|---|
YOLOV3-tiny | 63.19% | 0.91% | 87.16% | 33.05% | 0.55 |
AttentionNet* Solution | 96.25% | 11% | 80.73% | 1.93% | 0.88 |
TF-Yolo with Kmean++ Clustering | 91.20% | 9.17% | 66.05% | 11.20% | 0.73 |
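For reference, the K-means++ clustering mentioned above is typically used to derive YOLO-style anchor priors from the labeled bounding boxes. Below is a minimal sketch using scikit-learn over box widths and heights; this is an illustration under assumptions (placeholder box values, plain Euclidean distance, 6 anchors split across the two output scales), not the exact preprocessing script in this repository, which may use an IoU-based distance instead.

```python
# Hedged sketch: derive YOLO-style anchor priors with k-means++ over box sizes.
# Not the repository's exact preprocessing; boxes below are placeholder values.
import numpy as np
from sklearn.cluster import KMeans

# (width, height) of labeled cell bounding boxes, normalized to the input size.
# In practice this array would contain every labeled box in the training set.
boxes = np.array([
    [0.10, 0.12], [0.08, 0.09], [0.22, 0.25], [0.18, 0.20],
    [0.30, 0.28], [0.12, 0.15], [0.09, 0.11], [0.25, 0.27],
    [0.16, 0.17], [0.28, 0.30], [0.11, 0.13], [0.20, 0.22],
])

n_anchors = 6  # e.g. 3 anchors for each of the 13x13 and 26x26 output layers
kmeans = KMeans(n_clusters=n_anchors, init="k-means++", n_init=10, random_state=0)
kmeans.fit(boxes)

# Sort anchors by area so they can be assigned to the two output scales.
anchors = kmeans.cluster_centers_
anchors = anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]
print(anchors)
```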
Evaluate CellNet performance on CIFAR-10
This is the boxplot of ResNet-18, OurNet, and GhostNet on CIFAR-10 without AttentionNet. Because every image in this dataset is only 32 × 32 pixels, it is hard to train a good segmentor with AttentionNet to filter out the other artifacts in the image. As illustrated, even without AttentionNet preprocessing, our net already achieves the best performance.
Model | Weights (million) | Top-1 Val Acc. (%) | FLOPs (million) |
---|---|---|---|
VGG-16 | 15 | 93.6 | 313 |
ResNet-18 | 11 | 91.96 | 180 |
GhostNet | 5.18 | 91.45 | 141 |
OurNet | 2.91 | 92.45 | 41.7 |
The CIFAR-10 dataset consists of 60,000 32 × 32 color images in 10 classes, with 50,000 training images and 10,000 test images. A common data augmentation scheme, including random crop and mirroring, is adopted as well.
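A standard way to implement this augmentation scheme (random crop with padding plus horizontal mirroring) with torchvision is sketched below; this is the common recipe with commonly used normalization constants, not necessarily the exact pipeline used in this repository.

```python
# Hedged sketch of the standard CIFAR-10 augmentation: random crop + mirroring.
import torchvision
import torchvision.transforms as transforms

train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),      # random 32x32 crop from a padded image
    transforms.RandomHorizontalFlip(),         # random mirroring
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),   # commonly used CIFAR-10 mean
                         (0.2470, 0.2435, 0.2616)),  # commonly used CIFAR-10 std
])

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=train_transform)
```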
Note:
- Speed is tested on the ETH Zurich Leonhard cluster.
- You will see AttentionNet and ghostresNet in several places in the paper; do not be confused: ghostresNet = CellNet in the paper, just a nickname :).
Evaluate CellNet performance on Pneumonia Dataset
On the benchmark pneumonia dataset, the Pneumonia/Normal classification validation accuracy of our net converges to nearly 91.785%, better than GhostNet and ResNet-18. In addition, our net converges after around 80 epochs, compared to Inception V3, which reaches 88.0% after 7000 epochs.
Model | Weights (million) | Top-1 Val Acc. (%) | FLOPs (million) |
---|---|---|---|
InceptionV3 | 23.81 | 88 | 540 |
ResNet-18 | 11 | 87.50 | 180 |
GhostNet | 5.18 | 88.69 | 141 |
OurNet | 2.91 | 91.78 | 41.7 |
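The Top-1 Val Acc. figures reported in these tables correspond to a standard validation loop; a minimal sketch is shown below, with the model, dataloader, and device names as placeholders rather than this repository's actual evaluation script.

```python
# Hedged sketch of a top-1 validation accuracy loop (names are placeholders).
import torch

@torch.no_grad()
def top1_accuracy(model, val_loader, device="cuda"):
    model.eval()
    correct, total = 0, 0
    for images, labels in val_loader:
        images, labels = images.to(device), labels.to(device)
        logits = model(images)
        preds = logits.argmax(dim=1)              # top-1 prediction per image
        correct += (preds == labels).sum().item()
        total += labels.size(0)
    return 100.0 * correct / total                # accuracy in percent
```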
Evaluate CellNet performance on Sezary Syndrome Dataset
ResNet-18 [17] and ShuffleNet V2 [25] have so far been the most representative, best-performing models on the Sezary syndrome dataset. Our net achieves higher classification performance (e.g. 95.638% top-1 accuracy) than ResNet-18 [17], ShuffleNet V2 [25], and GhostNet [16], with fewer weights and lower computational cost.
Model | Weights (million) | Top-1 Val Acc. (%) | FLOPs (million) |
---|---|---|---|
ResNet-18 | 11 | 95.28 | 180 |
GhostNet | 5.18 | 93.411 | 141 |
OurNet | 2.91 | 95.638 | 41.7 |
ShuffleNet V2 | 1.4 | 83.868 | 41 |
Note:
- Speed is tested on the ETH Zurich Leonhard cluster.
- Performance is tested with AttentionNet preprocessing.
- This is the I-chart (summary report) of OurNet, ResNet-18, and ShuffleNet without AttentionNet.
- This is the time-series plot of ShuffleNet V2, ResNet-18, and GhostNet validation accuracy on the Sezary syndrome dataset with AttentionNet preprocessing.
Evaluate CellNet performance on COVID-19 Dataset
To help medical scientists, we built this COVID-19 CT dataset. It is based on the initial COVID-19 Image Data Collection, which contains only 123 frontal-view X-rays. We also collected data from the newest publications in the European Journal of Radiology and gathered nearly 1583 healthy lung CT/X-ray images as comparative data from recently available resources and publications.
Model | Weights (million) | Top-1 Val Acc. (%) | FLOPs (million) |
---|---|---|---|
ResNet-18 | 11 | 94.389 | 180 |
GhostNet | 5.18 | 92.739 | 141 |
OurNet | 2.91 | 94.719 | 41.7 |
MobileNet V2 | 3.4 | 95.38 | 301 |
Vgg11_BN | 13.28 | 87.129 | 132.87 |
DenseNet121 | 7.98 | 95.71 | 283 |
AlexNet | 60.95 | 0 | 727 |
SqueezeNet V2 | -- | 0 | 40 |
Note:
- -- denotes not provided.
- Speed is tested on the ETH Zurich Leonhard cluster.
- Performance is tested without AttentionNet preprocessing.
- This is the I-chart (summary report) of OurNet, ResNet-18, and ShuffleNet without AttentionNet.
Comparison of state-of-the-art methods trained on the COVID-19 dataset. Our model has 2.91 million weights, compared to DenseNet121 with 7.98 million weights and MobileNet V2 with 3.4 million weights and 301 million FLOPs; considering the higher complexity and parameter counts of the other SOTA nets, our net is very competitive on classification tasks for biomedical datasets.
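The weight counts above can be checked directly from a model's parameters; FLOPs/MACs usually require an external profiler. The sketch below uses a torchvision model as a stand-in and the third-party thop package as an assumption; it is not part of this repository.

```python
# Hedged sketch: count trainable weights (in millions) for any PyTorch model.
import torch
import torchvision.models as models

model = models.mobilenet_v2(num_classes=2)   # stand-in for CellNet / OurNet
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"weights: {n_params / 1e6:.2f} M")

# MACs/FLOPs typically need a profiler, e.g. the third-party `thop` package
# (assumption: thop is installed; it is not bundled with this repository).
try:
    from thop import profile
    macs, params = profile(model, inputs=(torch.randn(1, 3, 224, 224),))
    print(f"MACs: {macs / 1e6:.1f} M")
except ImportError:
    pass
```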
Evaluate AttentionNet performance on Sezary Syndrome Dataset with Saliency Map
To better visualize the performance of AttentionNet and demonstrate its necessity, we wrote a saliency script to generate attention maps. ResNet-18 puts more attention outside the ROI, while VGG and our net focus more on the ROI. AttentionNet plays a vital role in eliminating artifacts, forcing the models to focus more on the cell itself.
Note:
- For more attention maps see saliencymap folder.
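For illustration, a vanilla gradient-based saliency map can be generated as sketched below. This is a minimal sketch of the general technique, not necessarily the exact script in the saliencymap folder; the function name and arguments are placeholders.

```python
# Hedged sketch of a vanilla-gradient saliency map for one input image.
import torch

def saliency_map(model, image, target_class=None):
    """image: tensor of shape (1, C, H, W), already normalized."""
    model.eval()
    image = image.clone().requires_grad_(True)
    logits = model(image)
    if target_class is None:
        target_class = logits.argmax(dim=1).item()
    logits[0, target_class].backward()        # gradient of class score w.r.t. pixels
    # Saliency = maximum absolute gradient across color channels.
    return image.grad.abs().max(dim=1)[0].squeeze(0)   # shape (H, W)
```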
Saliency-map comparison grid. Columns: original picture, after AttentionNet segmentation, OurNet with AttentionNet, OurNet without AttentionNet, ResNet-18 with AttentionNet, ResNet-18 without AttentionNet, VGG-16 with AttentionNet. Original pictures: hd070916_2 (7102).png, hd070916_2 (7558).png, hd1 (3697).png, hd1 (4550).png, hd17_5 (1876).png, hd1 (4400).png, hd3 (1).png, ss2_8 (117).png, ss1_2 (270).png, ss2_8 (142).png, ss2_8 (468).png.
Prediction with our best CellNet weights trained so far on the Non-cerebriform dataset. As shown in the figure, TP and TN achieve the highest overall scores on the HD/SS folders with larger image counts. Moreover, average accuracy reaches 99.53%-96.51% on HD images and 92.19%-98.78% on SS images, but some small folders obtain only 38.29%-37.48% on SS1 and SS2, and 40.17% on the SS6_B folder.
After further fine-tuning (starting from the best weights trained so far plus a new subset of the Non-cerebriform data, with mini-batch size 679, trained for around 100 epochs), we tested the performance again. As shown, the accuracy on the SS1, SS2, and SS6_B folders surprisingly improves to 64.34%, 82.64%, and 96.91%, respectively. A sketch of this fine-tuning step is given after the next paragraph.
This is the comparison between CellNet and ResNet-18 on the Non-cerebriform dataset with fine-tuning. As illustrated, our net has comparable accuracy, even higher on some folders.
Prediction with our best CellNet weights trained so far on the cerebriform dataset. As shown in the figure, TP and TN achieve accuracy (in %) comparable to ResNet-18.
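The fine-tuning step described above follows the standard pattern of resuming from the best checkpoint and continuing training on the new subset. The sketch below is illustrative only: `build_cellnet`, `finetune_loader`, the checkpoint path, and the optimizer settings are placeholders, not this repository's actual training script.

```python
# Hedged sketch of the fine-tuning step: resume from the best checkpoint and
# continue training on the new subset (names and paths are placeholders).
import torch
import torch.nn as nn

model = build_cellnet()                                   # placeholder constructor
model.load_state_dict(torch.load("best_cellnet.pth"))     # best weights so far
model.train()

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

for epoch in range(100):                                  # ~100 fine-tuning epochs
    # finetune_loader yields mini-batches of the new Non-cerebriform subset
    # (the mini-batch size, e.g. 679, is set when constructing the loader).
    for images, labels in finetune_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```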
Our software is now uploaded to the nash cloud as well and supports further training from pretrained weights. All the prediction log files can be checked here: lsg file for you to check.
Want to try it on your own dataset with our CellNet? No problem! These are all the commands:
Take a look at our CellNet software framework. Our CellNet won 2nd place in the Top AI Camp DeeCamp2020 Medical Track.
With the power of Qt and the high efficiency of Python, using PyQt/PySide for desktop development is a wonderful plus for demonstrating our software. The common Qt/PyQt/PySide-based GUI development methods are: QWidget + QSS, QtWebkit + HTML + CSS + JS, and Qt Quick. All three technologies can efficiently and quickly produce cross-platform desktop software. Qt's recommended development method is Qt Quick, which uses the JSON-like language QML for rapid development. It is easy to learn, extensible, and widely used in Ubuntu, Linux Deepin, and other Linux desktop application development. It gives the developer a rapid development framework, lets them put more effort into the corresponding business logic, and makes it easy to build framework prototypes quickly.
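For orientation, a minimal Qt Quick entry point driven from Python looks roughly like the sketch below. This is a generic PySide2 example under assumptions (the file main.qml is a placeholder), not the actual launcher of the CellNet software.

```python
# Hedged sketch of a minimal Qt Quick launcher in Python (PySide2).
# "main.qml" is a placeholder; this is not the CellNet software's entry point.
import sys
from PySide2.QtGui import QGuiApplication
from PySide2.QtQml import QQmlApplicationEngine
from PySide2.QtCore import QUrl

app = QGuiApplication(sys.argv)
engine = QQmlApplicationEngine()
engine.load(QUrl.fromLocalFile("main.qml"))   # the QML file declares the UI
if not engine.rootObjects():                  # QML failed to load
    sys.exit(-1)
sys.exit(app.exec_())
```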
The proposed software structure diagram. To better demonstrate our model's diagnostic performance, we selected classic medical benchmark datasets from Kaggle competitions, such as the melanoma dataset, the diabetic retinopathy dataset, and the actinic keratosis, vascular lesion, dermatofibroma, and squamous cell carcinoma datasets. Meanwhile, we selected nearly 11 representative classification networks, enabling users to choose the diagnostic network that fits their own dataset. Besides, we inherit both computer vision classification networks and classic NLP classification networks. We develop desktop applications and open APIs to facilitate a better user experience, and ETH Leonhard and MegEngine jointly provide our computing power.
All software copyright licensed by Qiang Li.
@inproceedings{Qiang21ICLRW,
author = {Qiang Li and Lily Xu and Corin Otesteanu},
title = {All you need is Cell Attention: A Cell Annotation Tool for Single-Cell Morphology Data},
booktitle = {AI4PH Workshop on ICLR},
year = {2021}
}