AttentionNet and CellNet Software

This is the official implementation, including training code, for the thesis "Cell Morphology Based Diagnosis of Cancer using Convolutional Neural Networks: CellNet". For technical details, please refer to:

Cell Morphology Based Diagnosis of Cancer using Convolutional Neural Networks: CellNet
Qiang Li*, Otesteanu Corin*, Manfred Claassen*
Paper in preparation
[Software Report] [CellNetSoftware Video] [Research Grant Page]

Project workflow

General workflow of the proposed project. After image flow cytometry (morphological identification of tumor T-cells in the blood), the generated images can be categorized into six typical artifact classes: lighting artifacts, out-of-focus cells, debris, contaminated cells, cells outside the field of view (FOV), and multiple cells concatenated together. Using AttentionNet as an automatic detector and segmenter, we filter out most artifacts in the images and keep only the morphological characteristics of the cell for CellNet classification.

Results

These are the reproduction results from this repository. All the training/testing LSF log files from the ETH Zurich Leonhard cluster can be downloaded from our lsf file, and all the original data used to generate the data analysis graphs can be downloaded from all data file.

Evaluate AttentionNet performance on Sezary Syndrome Dataset

Comparison of detection/segmentation accuracy with YOLO-based methods (Redmon & Farhadi, 2018; He et al., 2019). We selected 850 representative images from the Sezary syndrome dataset for training, consisting of noise images and typical cell images (manually labeled HD cell images and SS cell images). In the evaluation stage, we used 723 images (308 HD cell images, 306 SS cell images, and 109 noise images). We tried to simulate the actual data distribution, since noise images are less frequent than cell images in the real Sezary dataset. Note that AttentionNet* denotes the combination of the algorithms mentioned above, including GBCIOU segmentation, K-means++ clustering in pre-processing, and 13 × 13 and 26 × 26 YOLO output layers, whereas the original YOLO is widely used only for detection or object localization, without segmentation. TP means a cell detected as a cell, FP means noise detected as a cell, and TN means a noise image correctly labeled as noise. mAP is the mean Average Precision.

| Model | TP (cell detected as cell) | FP (noise detected as cell) | TN (noise detected as noise) | No detection | mAP |
|---|---|---|---|---|---|
| YOLOV3-tiny | 63.19% | 0.91% | 87.16% | 33.05% | 0.55 |
| AttentionNet* Solution | 96.25% | 11% | 80.73% | 1.93% | 0.88 |
| TF-Yolo with Kmean++ Clustering | 91.20% | 9.17% | 66.05% | 11.20% | 0.73 |
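
The TP, FP, and TN columns are rates rather than raw counts. The sketch below shows one way such percentages could be derived from raw counts. It assumes that TP is normalized by the number of cell images and that FP and TN are normalized by the number of noise images; the class `DetectionCounts`, the function `detection_rates`, and the example counts are hypothetical and are not taken from the repository.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    """Raw counts from one evaluation run (the values used below are hypothetical)."""
    cells_as_cell: int   # cell images in which a cell was detected
    noise_as_cell: int   # noise images in which a cell was detected
    noise_as_noise: int  # noise images labeled as noise


def detection_rates(counts, n_cell_images, n_noise_images):
    """Return TP, FP and TN as percentages.

    Assumption: TP is normalized by the number of cell images,
    FP and TN by the number of noise images.
    """
    return {
        "TP": 100.0 * counts.cells_as_cell / n_cell_images,
        "FP": 100.0 * counts.noise_as_cell / n_noise_images,
        "TN": 100.0 * counts.noise_as_noise / n_noise_images,
    }


# Evaluation split from the text: 308 HD + 306 SS cell images, 109 noise images.
N_CELL, N_NOISE = 308 + 306, 109

# Hypothetical counts, chosen only to exercise the function.
example = DetectionCounts(cells_as_cell=600, noise_as_cell=10, noise_as_noise=90)
rates = detection_rates(example, N_CELL, N_NOISE)
print({name: round(value, 2) for name, value in rates.items()})
```

Because FP and TN share the noise-image denominator, any gap between their sum and 100% corresponds to noise images that fall into neither category under this reading.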

Evaluate CellNet performance on CIFAR-10

This is the box plot of ResNet-18, OurNet, and GhostNet on CIFAR-10 without AttentionNet. Because every image in this dataset is only 32 × 32 pixels, it is hard to train AttentionNet as a good segmenter to filter out the other artifacts in the image. As the plot illustrates, even without AttentionNet preprocessing, our net already achieves the best performance among these three networks.

| Model | Weights (million) | Top-1 Val Acc. (%) | FLOPs (million) |
|---|---|---|---|
| VGG-16 | 15 | 93.6 | 313 |
| ResNet-18 | 11 | 91.96 | 180 |
| GhostNet | 5.18 | 91.45 | 141 |
| OurNet | 2.91 | 92.45 | 41.7 |

CIFAR-10 dataset consists of 60,000 32 × 32 color images in 10 classes, with 50,000 training images and 10,000 test images. A common data augmentation scheme including random crop and mirroring is adopted as well.
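
As a rough illustration of the augmentation mentioned above, the following sketch applies a random crop and a horizontal mirror to a single-channel 32 × 32 image stored as a list of rows. The padding of 4 pixels and the flip probability of 0.5 are assumptions based on the commonly used CIFAR-10 recipe, not values confirmed by this repository, and the function names are hypothetical. For a color image, the same crop offsets and flip decision would be applied to every channel.

```python
import random


def pad_image(image, pad):
    """Zero-pad a 2-D image (list of rows) by `pad` pixels on every side."""
    width = len(image[0])
    blank_row = [0] * (width + 2 * pad)
    padded = [blank_row[:] for _ in range(pad)]
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded.extend(blank_row[:] for _ in range(pad))
    return padded


def random_crop_and_mirror(image, pad=4, flip_prob=0.5, rng=random):
    """Pad, take a random crop of the original size, then maybe mirror it."""
    height, width = len(image), len(image[0])
    padded = pad_image(image, pad)
    top = rng.randint(0, 2 * pad)   # randint is inclusive on both ends
    left = rng.randint(0, 2 * pad)
    crop = [row[left:left + width] for row in padded[top:top + height]]
    if rng.random() < flip_prob:
        crop = [row[::-1] for row in crop]  # horizontal mirror
    return crop


# One 32 x 32 single-channel image with distinct pixel values.
image = [[r * 32 + c for c in range(32)] for r in range(32)]
augmented = random_crop_and_mirror(image, rng=random.Random(0))
print(len(augmented), len(augmented[0]))
```

The output keeps the original 32 × 32 size, so the augmented image can be fed to the network without any further resizing.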

Note:

  • Speed is tested on an ETH Zurich Leonhard cluster.
  • You will see AttentionNet and ghostresNet in several places. Please do not be confused: ghostresNet is the same network as CellNet in the paper, just a nickname :)

Comparison of state-of-the-art methods on the CIFAR-10 dataset

Evaluate CellNet performance on Pneumonia Dataset

On the benchmark pneumonia dataset, the Pneumonia/Normal validation accuracy of our net converges to nearly 91.785%, better than GhostNet and ResNet-18. In addition, the accuracy of our net converged after around 80 epochs, whereas Inception V3 reaches 88.0% after 7000 epochs.

| Model | Weights (million) | Top-1 Val Acc. (%) | FLOPs (million) |
|---|---|---|---|
| InceptionV3 | 23.81 | 88 | 540 |
| ResNet-18 | 11 | 87.50 | 180 |
| GhostNet | 5.18 | 88.69 | 141 |
| OurNet | 2.91 | 91.78 | 41.7 |

Pneumonia Dataset

Evaluate CellNet performance on Sezary Syndrome Dataset

ResNet18 [17] and ShuffleNetV2 [25] have so far been the most representative, best-performing networks on the Sezary Syndrome dataset. Our net, however, achieves higher classification performance (e.g. 95.638% top-1 accuracy) than ResNet18 [17], ShuffleNetV2 [25], and GhostNet [16], with fewer weights and lower computational cost than ResNet18 and GhostNet.

| Model | Weights (million) | Top-1 Val Acc. (%) | FLOPs (million) |
|---|---|---|---|
| ResNet-18 | 11 | 95.28 | 180 |
| GhostNet | 5.18 | 93.411 | 141 |
| OurNet | 2.91 | 95.638 | 41.7 |
| ShuffleNet V2 | 1.4 | 83.868 | 41 |

Note:

  • Speed is tested on an ETH Zurich Leonhard cluster.
  • Performance is tested with AttentionNet preprocessing.
  • This is the I chart of OurNet, ResNet-18, and ShuffleNet without AttentionNet (summary report).

  • This is the time series plot of Shufflenet V, ResNet18 Val, and GhostNet18 V on the Sezary syndrome dataset with AttentionNet preprocessing.

Sezary Syndrome-Dataset

Evaluate CellNet performance on COVID-19 Dataset

To help medical scientists, we built this COVID-19 CT dataset. It is based on the initial COVID-19 Image Data Collection, which contains only 123 frontal-view X-rays. We additionally collected data from the newest publications in the European Journal of Radiology, along with nearly 1583 healthy lung CT/X-ray images as comparative data from recently available resources and publications.

| Model | Weights (million) | Top-1 Val Acc. (%) | FLOPs (million) |
|---|---|---|---|
| ResNet-18 | 11 | 94.389 | 180 |
| GhostNet | 5.18 | 92.739 | 141 |
| OurNet | 2.91 | 94.719 | 41.7 |
| MobileNet V2 | 3.4 | 95.38 | 301 |
| Vgg11_BN | 13.28 | 87.129 | 132.87 |
| DenseNet121 | 7.98 | 95.71 | 283 |
| AlexNet | 60.95 | 0 | 727 |
| SqueezeNet V2 | -- | 0 | 40 |

Note:

  • -- denotes a value that was not provided.
  • Speed is tested on an ETH Zurich Leonhard cluster.
  • Performance is tested without AttentionNet preprocessing.
  • This is the I chart of OurNet, ResNet-18, and ShuffleNet without AttentionNet (summary report).

COVID-19 Dataset

Comparison of state-of-the-art methods trained on the COVID-19 dataset. Our model has 2.91 million weights, compared with 7.98 million weights for DenseNet121 and 3.4 million weights and 301 million FLOPs for MobileNet V2. Considering the higher complexity and parameter count of the other SOTA nets, our net is very competitive on classification tasks for biomedical datasets.
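
To make the size comparison concrete, the short sketch below expresses each network's weight count and FLOPs as a multiple of OurNet. The numbers are copied from the COVID-19 table in this section; the helper `relative_cost` is a hypothetical name introduced only for this illustration.

```python
# (weights in millions, FLOPs in millions), copied from the COVID-19 table.
MODELS = {
    "ResNet-18": (11.0, 180.0),
    "GhostNet": (5.18, 141.0),
    "OurNet": (2.91, 41.7),
    "MobileNet V2": (3.4, 301.0),
    "DenseNet121": (7.98, 283.0),
}


def relative_cost(models, baseline="OurNet"):
    """Return (weight ratio, FLOPs ratio) of every model relative to the baseline."""
    base_weights, base_flops = models[baseline]
    return {
        name: (weights / base_weights, flops / base_flops)
        for name, (weights, flops) in models.items()
        if name != baseline
    }


for name, (weight_ratio, flops_ratio) in relative_cost(MODELS).items():
    print(f"{name}: {weight_ratio:.2f}x weights, {flops_ratio:.2f}x FLOPs")
```

Reading the ratios next to the accuracy column shows how much extra size and compute the slightly more accurate networks require.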

Sezary Syndrome-Dataset

Evaluate AttentionNet performance on Sezary Syndrome Dataset with Saliency Map

To better visualize the performance of AttentionNet and demonstrate its necessity, we wrote a saliency script to generate attention maps. ResNet18 puts more attention outside the ROI, while VGG and our net focus more on the ROI. AttentionNet plays a vital role in eliminating artifacts, forcing the models to focus more on the cell itself.
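
The saliency script itself is not reproduced here, and its exact method is not stated. As a minimal sketch of the idea, the code below uses occlusion sensitivity, a simple model-agnostic technique that may differ from the one used in the script: each patch is blanked out in turn and the drop in the classifier score is recorded. The functions `occlusion_saliency`, `roi_fraction`, and `toy_score` are hypothetical, and `toy_score` is only a stand-in for a real network.

```python
def occlusion_saliency(image, score_fn, patch=2):
    """Score drop observed when each patch x patch block is zeroed out."""
    height, width = len(image), len(image[0])
    base = score_fn(image)
    saliency = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in image]
            for r in rows:
                for c in cols:
                    occluded[r][c] = 0.0
            drop = base - score_fn(occluded)
            for r in rows:
                for c in cols:
                    saliency[r][c] = drop
    return saliency


def roi_fraction(saliency, roi):
    """Share of positive saliency inside roi = (top, left, bottom, right)."""
    top, left, bottom, right = roi
    total = inside = 0.0
    for r, row in enumerate(saliency):
        for c, value in enumerate(row):
            value = max(value, 0.0)
            total += value
            if top <= r < bottom and left <= c < right:
                inside += value
    return inside / total if total else 0.0


def toy_score(image):
    """Toy stand-in for a classifier: mean brightness of the central 4 x 4 region."""
    return sum(image[r][c] for r in range(2, 6) for c in range(2, 6)) / 16.0


image = [[1.0] * 8 for _ in range(8)]
saliency = occlusion_saliency(image, toy_score, patch=2)
print(roi_fraction(saliency, roi=(2, 2, 6, 6)))
```

A fraction close to 1 means the model's evidence lies almost entirely inside the ROI, while a low fraction corresponds to the behavior described above where attention falls outside the cell.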

Note:

Saliency maps for eleven sample images. For each original picture, the panels show, in order: the original picture, the result after AttentionNet segmentation, Ournet with AttentionNet, Ournet without AttentionNet, Res18 with AttentionNet, Res18 without AttentionNet, and Vgg16 with AttentionNet.

Original pictures:

  • hd070916_2 (7102).png
  • hd070916_2 (7558).png
  • hd1 (3697).png
  • hd1 (4550).png
  • hd17_5 (1876).png
  • hd1 (4400).png
  • hd3 (1).png
  • ss2_8 (117).png
  • ss1_2 (270).png
  • ss2_8 (142).png
  • ss2_8 (468).png

Generalization performance with our best weights (with/without finetuning)

Non-cerebriform dataset

CellNet before finetuning

Prediction on the non-cerebriform dataset with the best CellNet weights trained so far. As shown in the figure, TP and TN are generally highest on the HD/SS folders that contain more images. Average accuracy ranges from 96.51% to 99.53% on HD images and from 92.19% to 98.78% on SS images, but some small folders score much lower: 38.29%-37.48% on SS1 and SS2, and 40.17% on the SS6_B folder.
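
Per-folder figures like these can be computed by grouping predictions by their source folder. The sketch below assumes each prediction is available as a `(folder, true_label, predicted_label)` tuple; that record layout, the function name `per_folder_accuracy`, and the sample records are hypothetical and only illustrate the bookkeeping.

```python
from collections import defaultdict


def per_folder_accuracy(records):
    """records: iterable of (folder, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for folder, true_label, predicted_label in records:
        total[folder] += 1
        correct[folder] += int(true_label == predicted_label)
    return {folder: 100.0 * correct[folder] / total[folder] for folder in total}


# Hypothetical predictions; folder names mimic the HD/SS naming in the text.
records = [
    ("HD1", "HD", "HD"),
    ("HD1", "HD", "HD"),
    ("SS1", "SS", "HD"),
    ("SS1", "SS", "SS"),
    ("SS6_B", "SS", "SS"),
]
accuracy = per_folder_accuracy(records)

# List the weakest folders first, since those are the candidates for finetuning.
for folder, value in sorted(accuracy.items(), key=lambda item: item[1]):
    print(f"{folder}: {value:.2f}%")
```

Sorting by accuracy makes small, low-scoring folders easy to spot before deciding which subset to use for further training.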

CellNet after finetuning

We then finetuned further, starting from the best weights trained so far and using a new subset of the non-cerebriform dataset, with a mini-batch size of 679 for around 100 epochs, and tested the performance again. As shown, the accuracy on the SS1, SS2, and SS6_B folders improved, surprisingly, to 64.34%, 82.64%, and 96.91%, respectively.

This is the comparison between CellNet and ResNet18 on the non-cerebriform dataset with finetuning. As illustrated, our net has comparable accuracy, and even higher accuracy on some folders.

Cerebriform dataset

Prediction on the cerebriform dataset with the best CellNet weights trained so far. As shown in the figure, TP and TN reach accuracy (in %) comparable to ResNet18.

Our software is now also uploaded to nash cloud and supports further training from pretrained_weight. All the prediction lsg files can be checked here: lsg file for you to check

How to train with your data

Want to try CellNet on your own dataset? No problem! These are all the commands:

Take a look at our CellNet software framework. Our CellNet won 2nd place in the Medical Track of the top AI camp DeeCamp 2020.

Combining the power of Qt with the high efficiency of Python, PyQt/PySide desktop development is a good fit for demonstrating our software. The common Qt/PyQt/PySide-based GUI development approaches are QWidget + QSS, QtWebkit + HTML + CSS + JS, and Qt Quick. All three can be used to develop cross-platform desktop software efficiently and quickly. Qt's formal development method is Qt Quick, which uses the JSON-like language QML for rapid development. It is easy to learn, extensible, and widely used in Ubuntu, LinuxDeepin, and other Linux desktop application development. It gives the developer a rapid development framework, making it easy to build prototypes quickly and to put more effort into the corresponding business logic.

The proposed software structure diagram. To better demonstrate our model's diagnostic performance, we selected classic medical benchmark datasets from Kaggle competitions, such as the melanoma dataset, the diabetic retinopathy dataset, the actinic keratosis dataset, the vascular lesion dataset, the dermatofibroma dataset, and the squamous cell carcinoma dataset. Meanwhile, we selected nearly 11 representative classification networks, enabling users to choose the diagnostic network that fits their own dataset. Besides, we incorporate both computer vision classification networks and classic NLP classification networks. We developed desktop applications and open APIs to provide a better user experience, and ETH Leonhard and MegEngine jointly provide our computing power.

License

All software is copyrighted and licensed by Qiang Li.
@inproceedings{Qiang21ICLRW,
 author = {Qiang Li and Lily Xu and Corin Otesteanu},
 title = {All you need is Cell Attention: A Cell Annotation Tool for Single-Cell Morphology Data},
 booktitle = {AI4PH Workshop on ICLR},
 year = {2021}
}