This project experiments with implementing a modular architecture, an early-exit model, and testing it with TensorFlow.

giulio-derasmo/Experimenting-with-modularity-in-deep-learning

Experimenting with modularity in deep learning

This project was carried out for the course Neural Networks for Data Science at Sapienza, with the aim of gaining hands-on experience with the recent field of modular networks. Modular networks aim to provide an alternative to techniques such as distillation for reducing training and inference time, or the overall computational budget, which keeps growing as deep architectures become ever deeper.

The model and training

The type of modular network I implement is an early-exit model built on a VGG11 written from scratch in TensorFlow Keras, for image classification on the flower dataset. More specifically, I add fixed early exits to the VGG11 architecture and train the new model with a joint cross-entropy loss that uses all the early-exit predictions together with the final one. Inference applies a threshold to the entropy of the prediction at early exit e, which decides whether to exit early or continue to the next block. Each early-exit layer is a small classifier made of two sequential layers: a convolution and a fully connected layer. An early-exit branch is placed after every convolutional block in the model, for a total of 5, in order to explore this modular architecture more thoroughly.
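The training loss and the inference rule described above can be illustrated with a small, framework-agnostic sketch. It is not the repository's TensorFlow code: the function names (`joint_cross_entropy`, `early_exit_predict`), the unweighted sum over exits, and the "exit when entropy is below the threshold" convention are assumptions made for illustration. For simplicity the logits of every exit are passed in precomputed, whereas a real early-exit model would stop computing deeper blocks once it exits.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax of a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution over the logits."""
    log_probs = log_softmax(logits)
    return -sum(math.exp(lp) * lp for lp in log_probs)


def joint_cross_entropy(exit_logits, label):
    """Joint loss for one sample: sum of the cross-entropy of every exit.

    `exit_logits` holds one logit vector per exit, early exits first and the
    final classifier last. Equal weighting of the exits is an assumption.
    """
    return sum(-log_softmax(logits)[label] for logits in exit_logits)


def early_exit_predict(exit_logits, threshold):
    """Return (exit_index, predicted_class) for one sample.

    The sample leaves at the first exit whose prediction entropy is below
    `threshold` (assumed convention); the final exit always answers.
    """
    last = len(exit_logits) - 1
    for i, logits in enumerate(exit_logits):
        if i == last or entropy(logits) < threshold:
            predicted = max(range(len(logits)), key=logits.__getitem__)
            return i, predicted


# Hypothetical logits for one image: two early exits and the final classifier.
sample = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [0.0, 9.0, 0.0]]
print(early_exit_predict(sample, threshold=0.5))
print(joint_cross_entropy(sample, label=0))
```

A lower threshold makes the model more conservative: fewer samples leave early, so more computation is spent per sample in exchange for predictions from deeper exits.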

References

[1] Why should we add early exits to neural networks?

[2] DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference

[3] Going deeper with convolutions
