new website
ruoccoma committed Nov 17, 2023
1 parent 09e4505 commit 3b9a44d
Showing 1,568 changed files with 155,875 additions and 0 deletions.
5 changes: 5 additions & 0 deletions archetypes/default.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
+++
title = '{{ replace .File.ContentBaseName "-" " " | title }}'
date = {{ .Date }}
draft = true
+++
8 changes: 8 additions & 0 deletions archetypes/podcast.md
@@ -0,0 +1,8 @@
---
title: "{{ replace .Name "-" " " | title }}"
date: {{ .Date }}
description: ""
season:
episode:
draft: true
---
8 changes: 8 additions & 0 deletions archetypes/rersearch_papers.md
@@ -0,0 +1,8 @@
---
#date: '{{ .Date }}'
draft: true
title: '{{ replace .File.ContentBaseName `-` ` ` | title }}'
---

## Abstract

44 changes: 44 additions & 0 deletions config.toml
@@ -0,0 +1,44 @@
baseURL = 'https://ml4its.github.io/' # include the scheme so Hugo can build absolute URLs
languageCode = 'en-us'
title = 'ML4ITS - Machine Learning for Irregular Time Series'
theme = 'mainroad'

[Params.widgets.social]
cached = false # activate cache if true
# Enable parts of social widget
github = "ml4its"
email = "massimiliano.ruocco@sintef.no"

[Params]
description = "ML4ITS website"
copyright = "Massimiliano Ruocco"
#post_meta = ["author", "date", "categories", "translations"] # Order of post meta information
post_meta = ["categories", "translations"]

[Params.logo]
image = "images/logoml4its.png" # Logo image. Path relative to "static"
title = "ML4ITS" # Logo title, otherwise will use site title
subtitle = "Machine Learning for Irregular Time Series" # Logo subtitle


[Params.thumbnail]
#visibility = ["list", "post"] # Control thumbnail visibility
visibility = ["list"] # Control thumbnail visibility

[menu]
[[menu.main]]
name = 'Home'
pageRef = '/home'
weight = 1
[[menu.main]]
name = 'Publications'
pageRef = '/publications'
weight = 2
[[menu.main]]
name = 'Thesis'
pageRef = '/thesis'
weight = 3
[[menu.main]]
name = 'Team'
pageRef = '/team'
weight = 4
37 changes: 37 additions & 0 deletions content/home/_index.md
@@ -0,0 +1,37 @@
+++
title = 'About the project'
date = 2023-11-16T16:24:37+01:00
draft = true
sidebar = "right" # Enable sidebar (on the right side) per page

#thumbnail = "images/claudio.jpeg"
tags = ["about"]
lead = "Example lead - highlighted near the title" # Lead text
widgets = ["taglist","social"]
description = "Example article description"
toc = false # Enable Table of Contents for specific page
comments = true # Enable Disqus comments for specific page



#authorbox= true # Enable authorbox for specific page

+++

**Time series** are everywhere. Data recorded from **sensors** in **mobile phones**, financial data such as accounting figures, and climate indicators are all examples of time series that society and individuals are exposed to daily. Understanding such time series is essential for technological advances and for making informed decisions.

Many of these time series are **irregular** in some sense. They may have **missing data**, which may occur if sensors fail, if a person forgets to insert a number in a spreadsheet, or if the phenomenon we are interested in can only be observed at certain points in time. They may also be **very noisy**: for example, cheap sensors let us collect data from more locations, at the expense of noisier measurements than a more expensive sensor would provide.

The project **Machine Learning for Irregular Time Series (ML4ITS)** addresses some core challenges for irregular time series. In particular, the project develops methodology that handles irregular time series for the following tasks:

- *Forecasting*: predicting the future values of the time series based on current/past data.
- *Imputation/denoising*: creating “clean” data when the observed data are missing or noisy.
- *Anomaly detection* and *failure prediction*: identifying which observations are unusual or indicate that a system is in a critical state.
- *Synthetic data creation*.
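As a minimal illustration of the imputation task (a toy sketch, not the project's own methodology), short gaps in a series can be filled by linear interpolation over the observed time indices:

```python
import numpy as np

# Toy series with missing values (NaN marks sensor dropouts).
x = np.array([1.0, 2.0, np.nan, np.nan, 5.0, 6.0])

# Linear interpolation over the observed time indices.
t = np.arange(len(x))
mask = ~np.isnan(x)
imputed = np.interp(t, t[mask], x[mask])

print(imputed)  # [1. 2. 3. 4. 5. 6.]
```

Real irregular series typically need model-based imputation rather than interpolation, but the input/output contract is the same: a series with gaps in, a complete series out.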

The last point addresses the need for creating datasets that are **privacy preserving**. For example, the sensor data on a cell phone may not be anonymous, but it may be possible to create a synthetic dataset that behaves like the original data in a statistical sense while at the same time preserving privacy. Furthermore, the project aims to conduct **reproducible research** and to develop **open source** software that will benefit the research ecosystem.
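The idea of "behaving like the original data in a statistical sense" can be sketched with a deliberately simple example (an illustration only, not the project's methods): fit an AR(1) model to a series and sample a new series that shares its dynamics but none of its actual values.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Original" series: an AR(1) process x_t = phi * x_{t-1} + noise.
phi_true = 0.8
x = np.zeros(2000)
for t in range(1, len(x)):
    x[t] = phi_true * x[t - 1] + rng.normal()

# Estimate phi as the lag-1 autocorrelation, then sample a synthetic
# series driven by fresh noise: same statistics, no original values.
phi_hat = np.corrcoef(x[:-1], x[1:])[0, 1]
synth = np.zeros_like(x)
for t in range(1, len(synth)):
    synth[t] = phi_hat * synth[t - 1] + rng.normal()

assert abs(phi_hat - phi_true) < 0.1  # dynamics were recovered
```

Modern generative models replace the AR(1) fit with learned deep models, but the privacy argument is the same: only aggregate statistical structure is carried over.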

The project is a collaboration between [Sintef Digital]() and three departments at [NTNU](): [Department of Computer Science](), [Department of Mathematical Sciences]() and [Department of Electronic Systems]().


We are grateful for funding from [The Research Council of Norway](https://www.forskningsradet.no/) within the IKTPLUSS initiative.
14 changes: 14 additions & 0 deletions content/publications/_index.md
@@ -0,0 +1,14 @@
+++
title = 'Publications'
date = 2023-11-14T10:24:37+01:00
draft = true
author = "Massimiliano Ruocco"
#sidebar = "right" # Enable sidebar (on the right side) per page

#thumbnail = "images/claudio.jpeg"
lead = "Example lead - highlighted near the title" # Lead text
#widgets = ["taglist","social", "categories"]
description = "Example article description"
+++

List of publications
17 changes: 17 additions & 0 deletions content/publications/pub0.md
@@ -0,0 +1,17 @@
+++
title = 'Global Transformer Architecture for Indoor Room Temperature Forecasting (2023)'
date = 2023-10-15T19:17:52+01:00
draft = true
tags = ["forecasting","transformer","timeseries"]

#lead = "Alfredo V. Clemente, Alessandro Nocente, Massimiliano Ruocco" # Lead text
description = "Example article description"
#categories = ["paper", "2023", "CISBAT"]
lead = "CISBAT 2023"
categories = ["Alfredo V. Clemente", "Alessandro Nocente", "Massimiliano Ruocco"]
thumbnail = "images/globaltransformer.png"
post_meta = ["author","categories", "translations"]
+++

## Abstract
A thorough regulation of building energy systems translates into relevant energy savings and better comfort for the occupants. Algorithms that predict the thermal state of a building over a certain time horizon with good confidence are essential for the implementation of effective control systems. This work presents a global Transformer architecture for indoor temperature forecasting in multi-room buildings, aiming at optimizing energy consumption and reducing greenhouse gas emissions associated with HVAC systems. Recent advancements in deep learning have enabled the development of more sophisticated forecasting models compared to traditional feedback control systems. The proposed global Transformer architecture can be trained on the entire dataset encompassing all rooms, eliminating the need for multiple room-specific models, significantly improving predictive performance, and simplifying deployment and maintenance. Notably, this study is the first to apply a Transformer architecture for indoor temperature forecasting in multi-room buildings. The proposed approach provides a novel solution to enhance the accuracy and efficiency of temperature forecasting, serving as a valuable tool to optimize energy consumption and decrease greenhouse gas emissions in the building sector.
18 changes: 18 additions & 0 deletions content/publications/pub1.md
@@ -0,0 +1,18 @@
+++
title = 'Persistence initialization: A novel adaptation of the transformer architecture for time series forecasting (2023)'
date = 2023-11-15T19:17:52+01:00
draft = true
tags = ["forecasting","transformer","timeseries"]

#lead = "Espen Haugsdal, Massimiliano Ruocco" # Lead text
lead = "Applied Intelligence"
categories = ["Espen Haugsdal", "Massimiliano Ruocco"]
description = "Example article description"
#categories = ["paper", "2023", "Applied Intelligence"]
thumbnail = "images/pitransformer.jpeg"
post_meta = ["categories", "translations"]
+++

## Abstract
Time series forecasting is an important problem, with many real world applications. Transformer models have been successfully applied to natural language processing tasks, but have received relatively little attention for time series forecasting. Motivated by the differences between classification tasks and forecasting, we propose PI-Transformer, an adaptation of the Transformer architecture designed for time series forecasting, consisting of three parts: First, we propose a novel initialization method called Persistence Initialization, with the goal of increasing training stability of forecasting models by ensuring that the initial outputs of an untrained model are identical to the outputs of a simple baseline model. Second, we use ReZero normalization instead of Layer Normalization, in order to further tackle issues related to training stability. Third, we use Rotary positional encodings to provide a better inductive bias for forecasting. Multiple ablation studies show that the PI-Transformer is more accurate, learns faster, and scales better than regular Transformer models. Finally, PI-Transformer achieves competitive performance on the challenging M4 dataset, both when compared to the current state of the art, and to recently proposed Transformer models for time series forecasting.
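The core of Persistence Initialization can be sketched in a few lines (an illustrative sketch with made-up names such as `residual_net` and `pi_forecast`, not the paper's implementation): the forecast is the persistence baseline (repeat the last observed value) plus a residual scaled by a ReZero-style gate initialized to zero, so an untrained model reproduces the baseline exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_net(x, w):
    # Stand-in for the Transformer: any function of the input window.
    return np.tanh(x @ w)

def pi_forecast(x, w, alpha):
    # Persistence baseline: repeat the last observed value.
    horizon = w.shape[1]
    baseline = np.repeat(x[:, -1:], horizon, axis=1)
    # ReZero-style gate: alpha starts at 0, so the untrained model
    # outputs exactly the persistence baseline.
    return baseline + alpha * residual_net(x, w)

x = rng.normal(size=(4, 8))   # batch of 4 windows, length 8
w = rng.normal(size=(8, 3))   # maps each window to a 3-step forecast
alpha = 0.0                   # initial gate value

y0 = pi_forecast(x, w, alpha)
assert np.allclose(y0, np.repeat(x[:, -1:], 3, axis=1))
```

During training, `alpha` is a learnable scalar, so the model departs from the baseline only as far as the data justifies, which is the claimed source of the improved training stability.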
15 changes: 15 additions & 0 deletions content/publications/pub10.md
@@ -0,0 +1,15 @@
+++
title = 'Data-Driven Classifiers for Early Meal Detection Using ECG (2023)'
date = 2023-10-15T19:17:52+01:00
draft = true
tags = ["Signal Processing","sensors"]

categories = ["Muhammad A. Cheema", "Pallavi Patil", "Salman I. Siddiqui", "Pierluigi Salvo Rossi", "Øyvind Stavdahl", "Anders Lyngvi Fougner"] # Lead text
description = "Example article description"
lead = "IEEE Sensors Letters"
thumbnail = "images/ECGClass.png"
post_meta = ["categories", "translations"]
+++

## Abstract
This letter investigates the potential of the electrocardiogram to perform early meal detection, which is critical for developing a fully-functional automatic artificial pancreas. The study was conducted in a group of healthy subjects with different ages and genders. Two classifiers were trained: one based on neural networks (NNs) and working on features extracted from the signals and one based on convolutional NNs (CNNs) and working directly on raw data. During the test phase, both classifiers correctly detected all the meals, with the CNN outperforming the NN in terms of misdetected meals and detection time (DT). Reliable meal onset detection with short DT has significant practical implications: It reduces the risk of postprandial hyperglycemia and hypoglycemia, and it reduces the mental burden of meal documentation for patients and related stress.
16 changes: 16 additions & 0 deletions content/publications/pub2.md
@@ -0,0 +1,16 @@
+++
title = 'Navigating the Metric Maze: A Taxonomy of Evaluation Metrics for Anomaly Detection in Time Series (2023)'
date = 2023-11-15T19:17:52+01:00
draft = true
tags = ["anomaly detection","timeseries"]
#lead = "Sondre Sørbø, Massimiliano Ruocco" # Lead text
lead = "Data Mining and Knowledge Discovery"
description = "Example article description"
author = "Massimiliano Ruocco"
#categories = ["paper", "2023"]
categories = ["Sondre Sørbø", "Massimiliano Ruocco"]
thumbnail = "images/maze.jpeg"
+++

## Abstract
The field of time series anomaly detection is constantly advancing, with several methods available, making it a challenge to determine the most appropriate method for a specific domain. The evaluation of these methods is facilitated by the use of metrics, which vary widely in their properties. Despite the existence of new evaluation metrics, there is limited agreement on which metrics are best suited for specific scenarios and domains, and the most commonly used metrics have faced criticism in the literature. This paper provides a comprehensive overview of the metrics used for the evaluation of time series anomaly detection methods, and also defines a taxonomy of these metrics based on how they are calculated. By defining a set of properties for evaluation metrics and a set of specific case studies and experiments, twenty metrics are analyzed and discussed in detail, highlighting the unique suitability of each for specific tasks. Through extensive experimentation and analysis, this paper argues that the choice of evaluation metric must be made with care, taking into account the specific requirements of the task at hand.
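Among the commonly used (and commonly criticized) metrics the paper surveys is point-wise F1, where every time step is scored independently regardless of how anomalies group into events. A generic sketch (not code from the paper):

```python
def pointwise_f1(y_true, y_pred):
    """Point-wise F1: each time step scored independently (1 = anomaly)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Detector flags one of the two anomalous steps, plus one false alarm.
y_true = [0, 0, 1, 1, 0, 0]
y_pred = [0, 1, 1, 0, 0, 0]
print(pointwise_f1(y_true, y_pred))  # 0.5
```

A key criticism is that this score ignores event structure: detecting one step of a long anomalous segment and missing an entire short segment are treated identically, which is exactly the kind of property the taxonomy makes explicit.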
17 changes: 17 additions & 0 deletions content/publications/pub3.md
@@ -0,0 +1,17 @@
+++
title = 'Circle Attention: Forecasting Network Traffic by Learning Interpretable Spatial Relationships from Intersecting Circles (2023)'
date = 2023-11-15T19:17:52+01:00
draft = true
tags = ["forecasting","transformer","timeseries"]
#lead = "Espen Haugsdal, Sara Malacarne, Massimiliano Ruocco" # Lead text
lead = "ECML-PKDD 2023"
description = "Example article description"
author = "Massimiliano Ruocco"
#categories = ["paper", "2023", "ECML-PAKDD"]
categories = ["Espen Haugsdal","Sara Malacarne","Massimiliano Ruocco"]
thumbnail = "images/towers.png"
+++

## Abstract
Accurately forecasting traffic in telecommunication networks is essential for operators to efficiently allocate resources, provide better services, and save energy. We propose Circle Attention, a novel spatial attention mechanism for telecom traffic forecasting, which directly models the area of effect of neighboring cell towers. Cell towers typically point in three different geographical directions, called sectors. Circle Attention models the relationships between sectors of neighboring cell towers by assigning a circle with learnable parameters to each sector, which are: the azimuth of the sector, the distance from the cell tower to the center of the circle, and the radius of the circle. To model the effects of neighboring time series, we compute attention weights based on the intersection of circles relative to their area. These attention weights serve as multiplicative gating parameters for the neighboring time series, allowing our model to focus on the most important time series when making predictions. The circle parameters are learned automatically through back-propagation, with the only signal available being the errors made in the traffic forecasting of each sector. To validate the effectiveness of our approach, we train a Transformer to forecast the number of attempted calls to sectors in the Copenhagen area, and show that Circle Attention outperforms the baseline methods of including either all or none of the neighboring time series. Furthermore, we perform an ablation study to investigate the importance of the three learnable parameters of the circles, and show that performance deteriorates if any of the parameters are kept fixed. Our method has practical implications for telecommunication operators, as it can provide more accurate and interpretable models for forecasting network traffic, allowing for better resource allocation and improved service provision.
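The geometric core of the attention weights is the intersection area of two circles; the paper normalizes these intersections relative to circle areas to obtain gating weights. A self-contained sketch of the intersection computation (helper name is ours, not the paper's):

```python
import math

def circle_intersection_area(r1, r2, d):
    """Area of intersection of two circles with radii r1, r2 at center distance d."""
    if d >= r1 + r2:                  # circles are disjoint
        return 0.0
    if d <= abs(r1 - r2):             # one circle lies inside the other
        return math.pi * min(r1, r2) ** 2
    # Standard lens-area formula: two circular-segment terms minus
    # the area of the triangle spanned by the centers and crossings.
    a1 = r1**2 * math.acos((d**2 + r1**2 - r2**2) / (2 * d * r1))
    a2 = r2**2 * math.acos((d**2 + r2**2 - r1**2) / (2 * d * r2))
    a3 = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                         * (d - r1 + r2) * (d + r1 + r2))
    return a1 + a2 - a3

# A sector fully overlapping its own circle gets full weight;
# a distant sector gets zero.
own = circle_intersection_area(1.0, 1.0, 0.0)
far = circle_intersection_area(1.0, 1.0, 3.0)
assert math.isclose(own, math.pi)
assert far == 0.0
```

Because the azimuth, center offset, and radius of each circle are learnable, gradients flow through these areas during training, which is what makes the spatial relationships both learned and interpretable.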

17 changes: 17 additions & 0 deletions content/publications/pub4.md
@@ -0,0 +1,17 @@
+++
title = 'Masked Generative Modeling with Enhanced Sampling Scheme (2023)'
date = 2023-10-15T19:17:52+01:00
draft = true
tags = ["generative","VAE","timeseries", "Synthetic Data"]

#lead = "Daesoo Lee, Erlend Aune, Sara Malacarne" # Lead text
lead = "arXiv"
description = "Example article description"
#categories = ["paper", "2023", "arXiv"]
categories = ["Daesoo Lee", "Erlend Aune","Sara Malacarne"]
thumbnail = "images/masked.png"
post_meta = ["categories", "translations"]
+++

## Abstract
This paper presents a novel sampling scheme for masked non-autoregressive generative modeling. We identify the limitations of TimeVQVAE, MaskGIT, and Token-Critic in their sampling processes, and propose Enhanced Sampling Scheme (ESS) to overcome these limitations. ESS explicitly ensures both sample diversity and fidelity, and consists of three stages: Naive Iterative Decoding, Critical Reverse Sampling, and Critical Resampling. ESS starts by sampling a token set using the naive iterative decoding as proposed in MaskGIT, ensuring sample diversity. Then, the token set undergoes the critical reverse sampling, masking tokens leading to unrealistic samples. After that, critical resampling reconstructs masked tokens until the final sampling step is reached to ensure high fidelity. Critical resampling uses confidence scores obtained from a self-Token-Critic to better measure the realism of sampled tokens, while critical reverse sampling uses the structure of the quantized latent vector space to discover unrealistic sample paths. We demonstrate significant performance gains of ESS in both unconditional sampling and class-conditional sampling using all the 128 datasets in the UCR Time Series archive.
15 changes: 15 additions & 0 deletions content/publications/pub5.md
@@ -0,0 +1,15 @@
+++
title = 'Vector Quantized Time Series Generation with a Bidirectional Prior Model (2023)'
date = 2023-10-15T19:17:52+01:00
draft = true
tags = ["generative","timeseries", "Synthetic Data"]

categories = ["Daesoo Lee", "Erlend Aune", "Sara Malacarne"] # Lead text
description = "Example article description"
lead = "AISTATS 2023"
thumbnail = "images/VQBidirectional.png"
post_meta = ["categories", "translations"]
+++

## Abstract
This paper presents a novel sampling scheme for masked non-autoregressive generative modeling. We identify the limitations of TimeVQVAE, MaskGIT, and Token-Critic in their sampling processes, and propose Enhanced Sampling Scheme (ESS) to overcome these limitations. ESS explicitly ensures both sample diversity and fidelity, and consists of three stages: Naive Iterative Decoding, Critical Reverse Sampling, and Critical Resampling. ESS starts by sampling a token set using the naive iterative decoding as proposed in MaskGIT, ensuring sample diversity. Then, the token set undergoes the critical reverse sampling, masking tokens leading to unrealistic samples. After that, critical resampling reconstructs masked tokens until the final sampling step is reached to ensure high fidelity. Critical resampling uses confidence scores obtained from a self-Token-Critic to better measure the realism of sampled tokens, while critical reverse sampling uses the structure of the quantized latent vector space to discover unrealistic sample paths. We demonstrate significant performance gains of ESS in both unconditional sampling and class-conditional sampling using all the 128 datasets in the UCR Time Series archive.
15 changes: 15 additions & 0 deletions content/publications/pub6.md
@@ -0,0 +1,15 @@
+++
title = 'Vnibcreg: Vicreg with neighboring-invariance and better-covariance evaluated on non-stationary seismic signal time series (2022)'
date = 2022-10-15T19:17:52+01:00
draft = true
tags = ["Selfsupervised","timeseries"]

categories = ["Daesoo Lee", "Erlend Aune", "Nadège Langet", "Jo Eidsvik"]
description = "Example article description"
lead = "arXiv"
thumbnail = "images/VNIBCREG.png"
post_meta = ["categories", "translations"]
+++

## Abstract
This paper presents a novel sampling scheme for masked non-autoregressive generative modeling. We identify the limitations of TimeVQVAE, MaskGIT, and Token-Critic in their sampling processes, and propose Enhanced Sampling Scheme (ESS) to overcome these limitations. ESS explicitly ensures both sample diversity and fidelity, and consists of three stages: Naive Iterative Decoding, Critical Reverse Sampling, and Critical Resampling. ESS starts by sampling a token set using the naive iterative decoding as proposed in MaskGIT, ensuring sample diversity. Then, the token set undergoes the critical reverse sampling, masking tokens leading to unrealistic samples. After that, critical resampling reconstructs masked tokens until the final sampling step is reached to ensure high fidelity. Critical resampling uses confidence scores obtained from a self-Token-Critic to better measure the realism of sampled tokens, while critical reverse sampling uses the structure of the quantized latent vector space to discover unrealistic sample paths. We demonstrate significant performance gains of ESS in both unconditional sampling and class-conditional sampling using all the 128 datasets in the UCR Time Series archive.