A Two-stage Regression Framework for Automated Cephalometric Landmark Detection Incorporating Semantically Fused Anatomical Features and Multi-head Refinement Loss
Muhammad Anwaar Khalid · Atif Khurshid · Kanwal Zulfiqar · Ulfat Bashir · Muhammad Moazam Fraz
This repository contains the implementation of CEPHMark-Net, a novel, end-to-end trainable two-stage regression framework for accurate cephalometric landmark detection. This work has been published in Expert Systems with Applications. The framework aims to streamline the detection process and improve clinical workflows by providing precise localization of landmarks with reduced computational overhead.
Cephalometric analysis plays a crucial role in orthodontic treatment planning and maxillofacial surgeries. Traditionally, this process involves manually tracing anatomical landmarks on two-dimensional (2D) radiographs, known as cephalograms. This manual process is time-consuming and subject to inter- and intra-observer variability. This repository presents a state-of-the-art automatic cephalometric landmark detection framework that utilizes a novel multi-head Convolutional Neural Network (CNN). The framework consists of two primary modules, landmark detection and landmark refinement, both leveraging a shared backbone neural network. This backbone network functions as a feature extractor, providing rich, high-dimensional features to both modules. The landmark detection module (LDM) uses these features to simultaneously regress coordinates for all landmarks, capturing the global geometric relationships among them. Our framework incorporates a cropping mechanism to extract high-dimensional features from the multi-resolution feature tensors generated during the backbone network's forward pass. To bridge the semantic gap between these features, we use a semantic fusion block (SFB) that integrates high-resolution, semantically weak features with low-resolution, semantically rich features. This fusion yields a single, high-level feature map with fine resolution, which the landmark refinement module (LRM) uses to refine the initial landmark estimates.
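To make the cropping step more concrete, here is a minimal sketch of how a fixed-size window could be placed around a coarse landmark estimate and mapped onto a downsampled feature map. It is not taken from the repository: the function names `crop_box` and `to_feature_coords`, the choice to center the crop on the coarse estimate, the 256-pixel crop size, and the stride of 8 are all assumptions made for illustration.

```python
def crop_box(center_xy, crop_size, image_wh):
    """Return (left, top, right, bottom) of a square window around a point.

    The window is shifted, not shrunk, when it would cross the image border.
    Assumes crop_size is no larger than either image dimension.
    """
    cx, cy = center_xy
    width, height = image_wh
    half = crop_size // 2
    # Clamp the top-left corner so the whole window stays inside the image.
    left = min(max(int(round(cx)) - half, 0), width - crop_size)
    top = min(max(int(round(cy)) - half, 0), height - crop_size)
    return left, top, left + crop_size, top + crop_size


def to_feature_coords(box, stride):
    """Map an image-space box onto a feature map downsampled by `stride`."""
    return tuple(v // stride for v in box)
```

For a coarse estimate at (1000, 1200) on a 1935 × 2400 image, `crop_box((1000, 1200), 256, (1935, 2400))` gives the window `(872, 1072, 1128, 1328)`, and `to_feature_coords` with a stride of 8 maps it to `(109, 134, 141, 166)` on the feature map.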
- Multi-head CNN Architecture: A two-stage regression framework with a shared backbone feature extractor across two integrated modules.
- End-to-end Trainable: A unified architecture enabling simultaneous detection and refinement, making the system efficient and scalable.
- Semantic Fusion Block: Integrates high- and low-resolution features to capture both global geometric relations and local tissue characteristics.
- Real-time Inter-model Communication: Enables the system to learn from and correct each module's predictions, boosting overall accuracy.
- Multi-head Refinement Loss: Aggregates predictions from multiple CNN heads, each specializing in different aspects of landmark detection.
- Python 3.8
- TensorFlow 2.10
- OpenCV 4.9
- Clone the repository:
git clone https://github.com/manwaarkhd/CEPHMark-Net.git
cd CEPHMark-Net
- Set up a Python virtual environment (optional but recommended):
python3 -m venv pyenv
source pyenv/bin/activate  # On Windows use `pyenv\Scripts\activate`
- Install the required packages:
pip install -r requirements.txt
We utilized the publicly available ISBI 2015 Dataset by Wang et al. (2015), which consists of 400 high-resolution X-ray images. Each image has spatial dimensions of 1935 × 2400 pixels, with a spatial resolution of 0.1 mm/pixel in both directions. We employed the same 150 images for training as used in the ISBI Grand Challenge 2015. The remaining 250 images are reserved for evaluation and further partitioned into two distinct subsets: Test1 and Test2. Test1 serves as our validation set for assessing the accuracy of our method during the development phase, and Test2 is used as our test set for the final evaluation of our proposed method.
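The split and the pixel spacing described above can be expressed in a few lines. The sketch below is illustrative only: the helper names are hypothetical, the ID ranges are inferred from the image counts and the file numbering in the folder layout (001 onward for training, 151 onward for Test1, 301 onward for Test2), and the radial distance is shown simply as one way to convert a pixel error into millimetres using the stated 0.1 mm/pixel resolution.

```python
import math

PIXEL_SIZE_MM = 0.1  # spatial resolution stated for the ISBI 2015 images


def subset_for_image(image_id: int) -> str:
    """Return the subset an image ID belongs to (ranges inferred, see lead-in)."""
    if 1 <= image_id <= 150:
        return "Training"
    if 151 <= image_id <= 300:
        return "Test1"
    if 301 <= image_id <= 400:
        return "Test2"
    raise ValueError(f"image id {image_id} is outside 1..400")


def radial_error_mm(predicted_xy, reference_xy) -> float:
    """Euclidean distance between two pixel coordinates, in millimetres."""
    dx = predicted_xy[0] - reference_xy[0]
    dy = predicted_xy[1] - reference_xy[1]
    return math.hypot(dx, dy) * PIXEL_SIZE_MM
```

For instance, a prediction that is 3 pixels off horizontally and 4 pixels off vertically is 5 pixels away from the reference, which corresponds to 0.5 mm.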
After downloading the files, create a folder named `datasets` and organize it as follows:
datasets/
└── ISBI Dataset/
    ├── Dataset/
    │   ├── Training/
    │   │   ├── 001.bmp
    │   │   ├── 002.bmp
    │   │   └── ...
    │   └── Testing/
    │       ├── Test1/
    │       │   ├── 151.bmp
    │       │   ├── 152.bmp
    │       │   └── ...
    │       └── Test2/
    │           ├── 301.bmp
    │           ├── 302.bmp
    │           └── ...
    └── Annotations/
        ├── Junior Orthodontist/
        │   ├── 001.txt
        │   ├── 002.txt
        │   ├── ...
        │   └── 400.txt
        └── Senior Orthodontist/
            ├── 001.txt
            ├── 002.txt
            ├── ...
            └── 400.txt
- Download the dataset from link.
- Organize the dataset as shown in the Dataset section above.
- Ensure that annotation files are placed correctly within the `Annotations` directory.
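Before training, it can help to confirm that the folders are where the scripts expect them. The following is a small, optional check and not part of the repository: the helper `missing_dirs` is hypothetical, and the directory names are copied from the layout in the Dataset section.

```python
from pathlib import Path

# Sub-directories expected under "datasets/ISBI Dataset", per the layout above.
EXPECTED_DIRS = [
    "Dataset/Training",
    "Dataset/Testing/Test1",
    "Dataset/Testing/Test2",
    "Annotations/Junior Orthodontist",
    "Annotations/Senior Orthodontist",
]


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_dirs("datasets/ISBI Dataset")` from the repository root returns an empty list when every folder is in place, and otherwise lists the ones still missing.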
The configuration of the CEPHMark-Net framework is managed through the `config.py` file. This file contains various settings, hyper-parameters, and heuristics that can be adjusted to fine-tune the model's performance and adapt it to different datasets. To customize the configuration for your specific use case, edit the `config.py` file accordingly. Here are some common modifications you might consider:
- Adjusting Image Dimensions: If your dataset has different image dimensions, modify `config.ORIGINAL_HEIGHT` and `config.ORIGINAL_WIDTH`.
- Region of Interest (ROI) Pooling: Adjust the size of the pooling region used for extracting features from the detected regions via `config.ROI_POOL_SIZE`.
- Updating Training Parameters: Modify `config.TRAIN.EPOCHS` and `config.TRAIN.OPTIMIZER` to set the number of training epochs and choose a different optimizer or learning rate.
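As a rough illustration of the kind of changes listed above, the sketch below uses a stand-in object in place of the real `config.py`, whose actual structure and default values may differ. Only the attribute names come from this README; the helper `adapt_to_dataset`, the pooling size, the epoch count, and the optimizer value are placeholders, and the assignment of 2400 to height and 1935 to width is an assumption about how the stated 1935 × 2400 dimensions are oriented.

```python
from types import SimpleNamespace

# Stand-in for the repository's config module (structure and values assumed).
config = SimpleNamespace(
    ORIGINAL_HEIGHT=2400,   # assumed orientation of the 1935 x 2400 images
    ORIGINAL_WIDTH=1935,
    ROI_POOL_SIZE=(7, 7),   # placeholder value
    TRAIN=SimpleNamespace(EPOCHS=100, OPTIMIZER="adam"),  # placeholder values
)


def adapt_to_dataset(cfg, height, width, epochs):
    """Override image dimensions and epoch count on a config-like object."""
    cfg.ORIGINAL_HEIGHT = height
    cfg.ORIGINAL_WIDTH = width
    cfg.TRAIN.EPOCHS = epochs
    return cfg
```

In practice the same assignments would be made by editing `config.py` directly, as described above.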
To train the model on the Train dataset, use:
python train.py
To run inference on the Test1 dataset using pre-trained weights:
python valid.py
To evaluate the model on the Test2 dataset:
python test.py
For a comprehensive analysis of the results, including quantitative metrics such as mean squared error (MSE) and landmark detection accuracy, as well as qualitative comparisons with traditional methods and other state-of-the-art approaches, please refer to our paper. The paper provides detailed tables, charts, and visualizations illustrating the performance improvements and validation of our method.
If you find our work useful in your research, please consider citing our paper:
@article{khalid2024two,
title={A two-stage regression framework for automated cephalometric landmark detection incorporating semantically fused anatomical features and multi-head refinement loss},
author={Khalid, Muhammad Anwaar and Khurshid, Atif and Zulfiqar, Kanwal and Bashir, Ulfat and Fraz, Muhammad Moazam},
journal={Expert Systems with Applications},
pages={124840},
year={2024},
publisher={Elsevier}
}