Include a new function to improve quality of training data using BERT. #1256

Open
gilbertocamara opened this issue Dec 16, 2024 · 0 comments

Comments

@gilbertocamara
Copy link
Contributor

Selecting high-quality training samples is crucial for enhancing the accuracy of land classification using remote sensing data. A significant challenge in this area is the limited availability of large datasets that contain good quality training samples. As a result, the deep learning community has explored various techniques to maximize the potential of small training datasets. One such method is SITS-BERT, which applies BERT (Bidirectional Encoder Representations from Transformers) to satellite image time series.

BERT, a technique developed in the first generation of large language models (LLMs), combines unsupervised pretraining with supervised fine-tuning. It uses a "Masked Language Model" as its pretext task: a fraction of the tokens in a sentence is randomly masked, and the model learns to predict the missing tokens from the surrounding context. This approach allows BERT to learn word relationships from a plain text corpus.
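
The sketch below illustrates the masked-token pretext task in generic PyTorch. It is not BERT itself; the model, vocabulary size, masking ratio, and all names are illustrative assumptions, and positional encodings are omitted for brevity.

```python
# Minimal sketch of a masked-token pretext task: mask ~15% of positions
# and train a small Transformer encoder to predict the original tokens.
import torch
import torch.nn as nn

VOCAB_SIZE, MASK_ID, SEQ_LEN = 1000, 0, 32   # illustrative sizes

class TinyMaskedModel(nn.Module):
    def __init__(self, d_model=64):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, VOCAB_SIZE)   # predicts the original token

    def forward(self, tokens):
        return self.head(self.encoder(self.embed(tokens)))

tokens = torch.randint(1, VOCAB_SIZE, (8, SEQ_LEN))   # toy "sentences"
mask = torch.rand(tokens.shape) < 0.15                # choose ~15% of positions
inputs = tokens.masked_fill(mask, MASK_ID)            # replace them with a mask id

model = TinyMaskedModel()
logits = model(inputs)
# The loss is computed only at the masked positions, as in the BERT pretext task.
loss = nn.functional.cross_entropy(logits[mask], tokens[mask])
loss.backward()
```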

When applied to satellite image time series (SITS), the BERT method begins by training a deep learning model to recover missing observations within the time series. A second, supervised phase then fine-tunes this pretrained model on the available labelled training samples. It is worth investigating whether the SITS-BERT approach can enhance the quality of the models used for land classification.
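
As a rough illustration of the two phases described above, the following PyTorch sketch pretrains a small Transformer encoder to reconstruct randomly masked observations of a time series, and then fine-tunes the same encoder with a classification head on labelled samples. This is neither the SITS-BERT implementation nor a proposed sits API; all dimensions, names, and hyperparameters are assumptions.

```python
# Phase 1: self-supervised pretraining on masked observations.
# Phase 2: supervised fine-tuning of the pretrained encoder.
import torch
import torch.nn as nn

N_SAMPLES, N_TIMES, N_BANDS, N_CLASSES = 256, 23, 10, 8   # illustrative sizes

class SITSEncoder(nn.Module):
    def __init__(self, d_model=64):
        super().__init__()
        self.proj = nn.Linear(N_BANDS, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x):                      # x: (batch, N_TIMES, N_BANDS)
        return self.encoder(self.proj(x))      # (batch, N_TIMES, d_model)

series = torch.rand(N_SAMPLES, N_TIMES, N_BANDS)       # toy unlabelled time series
labels = torch.randint(0, N_CLASSES, (N_SAMPLES,))     # toy labels for phase 2

encoder = SITSEncoder()
recon_head = nn.Linear(64, N_BANDS)                    # phase 1: reconstruct bands
cls_head = nn.Linear(64, N_CLASSES)                    # phase 2: classify samples

# Phase 1: hide ~30% of the dates and reconstruct them from context.
mask = torch.rand(N_SAMPLES, N_TIMES) < 0.3
masked = series.masked_fill(mask.unsqueeze(-1), 0.0)
recon = recon_head(encoder(masked))
pretrain_loss = nn.functional.mse_loss(recon[mask], series[mask])
pretrain_loss.backward()

# Phase 2: fine-tune the pretrained encoder on the labelled samples.
features = encoder(series).mean(dim=1)                 # pool over time
finetune_loss = nn.functional.cross_entropy(cls_head(features), labels)
finetune_loss.backward()
```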

Reference:

Yuan, Yuan, and Lei Lin. 2021. ‘Self-Supervised Pretraining of Transformers for Satellite Image Time Series Classification’. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 14:474–87. https://doi.org/10.1109/JSTARS.2020.3036602.
