ADD: Learning-MNL (#171)
* ADD: LearningMNL

* ADD: foundation of L-MNL notebook
VincentAuriau authored Oct 29, 2024
1 parent 8810592 commit 85bc388
Showing 10 changed files with 743 additions and 25 deletions.
4 changes: 2 additions & 2 deletions .github/actions/publish/action.yaml
@@ -111,8 +111,8 @@ runs:
then
git checkout -b ${{ inputs.PUSH_BRANCH }}
git add ${{ inputs.PACKAGE_DIRECTORY }}__init__.py ./pyproject.toml
-git config user.name 'github-actions[bot]'
-git config user.email 'github-actions[bot]@users.noreply.github.com'
+git config user.name 'VincentAuriau'
+git config user.email '22350719+VincentAuriau@users.noreply.github.com'
git commit -m "Change version to ${{ github.event.release.tag_name }}" --allow-empty
git push origin HEAD:${{ inputs.PUSH_BRANCH }}
fi
32 changes: 18 additions & 14 deletions README.md
Expand Up @@ -37,6 +37,7 @@ Choice-Learn uses NumPy and pandas as data backend engines and TensorFlow for mo
- [Documentation](#trident-documentation)
- [Contributing](#trident-contributing)
- [Citation](#trident-citation)
+- [References](#trident-references)

## :trident: Introduction - Discrete Choice modeling

@@ -49,27 +50,28 @@ If you are new to choice modeling, you can check this [resource](https://www.pub
### Data
- Generic dataset handling with the ChoiceDataset class [[Example]](notebooks/introduction/2_data_handling.ipynb)
- Ready-To-Use datasets:
-  - [SwissMetro](./choice_learn/datasets/data/swissmetro.csv.gz) [[2]](#citation)
-  - [ModeCanada](./choice_learn/datasets/data/ModeCanada.csv.gz) [[3]](#citation)
-  - The [Train](./choice_learn/datasets/data/train_data.csv.gz) dataset [[5]](#citation)
+  - [SwissMetro](./choice_learn/datasets/data/swissmetro.csv.gz) [[2]](#trident-references)
+  - [ModeCanada](./choice_learn/datasets/data/ModeCanada.csv.gz) [[3]](#trident-references)
+  - The [Train](./choice_learn/datasets/data/train_data.csv.gz) dataset [[5]](#trident-references)
   - The [Heating](./choice_learn/datasets/data/heating_data.csv.gz), [HC](./choice_learn/datasets/data/HC.csv.gz) & [Electricity](./choice_learn/datasets/data/electricity.csv.gz) datasets from Kenneth Train described [here](https://rdrr.io/cran/mlogit/man/Electricity.html), [here](https://cran.r-project.org/web/packages/mlogit/vignettes/e2nlogit.html) and [here](https://rdrr.io/cran/mlogit/man/Heating.html)
-  - [Stated car preferences](./choice_learn/datasets/data/car.csv.gz) [[9]](#citation)
+  - [Stated car preferences](./choice_learn/datasets/data/car.csv.gz) [[9]](#trident-references)
   - The [TaFeng](./choice_learn/datasets/data/ta_feng.csv.zip) dataset from [Kaggle](https://www.kaggle.com/datasets/chiranjivdas09/ta-feng-grocery-dataset)
-  - The ICDM-2013 [Expedia](./choice_learn/datasets/expedia.py) dataset from [Kaggle](https://www.kaggle.com/c/expedia-personalized-sort) [[6]](#citation)
-  - The London Passenger Mode Choice dataset [[11]](#citation)
+  - The ICDM-2013 [Expedia](./choice_learn/datasets/expedia.py) dataset from [Kaggle](https://www.kaggle.com/c/expedia-personalized-sort) [[6]](#trident-references)
+  - The London Passenger Mode Choice dataset [[11]](#trident-references)

### Model estimation
- Ready-to-use models:
-  - Conditional MultiNomialLogit [[4]](#citation)[[Example]](notebooks/introduction/3_model_clogit.ipynb)
-  - Nested Logit [[10]](#citation) [[Example]](notebooks/models/nested_logit.ipynb)
+  - Conditional MultiNomialLogit [[4]](#trident-references)[[Example]](notebooks/introduction/3_model_clogit.ipynb)
+  - Nested Logit [[10]](#trident-references) [[Example]](notebooks/models/nested_logit.ipynb)
   - Latent Class MultiNomialLogit [[Example]](notebooks/models/latent_class_model.ipynb)
-  - RUMnet [[1]](#citation)[[Example]](notebooks/models/rumnet.ipynb)
-  - TasteNet [[7]](#citation)[[Example]](notebooks/models/tastenet.ipynb)
-  - ResLogit [[12]](#citation)[[Example]](notebooks/models/reslogit.ipynb)
+  - RUMnet [[1]](#trident-references)[[Example]](notebooks/models/rumnet.ipynb)
+  - TasteNet [[7]](#trident-references)[[Example]](notebooks/models/tastenet.ipynb)
+  - Learning-MNL [[13]](#trident-references)[[Example]](notebooks/models/learning_mnl.ipynb)
+  - ResLogit [[12]](#trident-references)[[Example]](notebooks/models/reslogit.ipynb)
- Custom modeling is made easy by subclassing the ChoiceModel class [[Example]](notebooks/introduction/4_model_customization.ipynb)

### Auxiliary tools
-- Assortment & Pricing optimization algorithms [[Example]](notebooks/auxiliary_tools/assortment_example.ipynb) [[8]](#citation)
+- Assortment & Pricing optimization algorithms [[Example]](notebooks/auxiliary_tools/assortment_example.ipynb) [[8]](#trident-references)

## :trident: Getting Started

@@ -281,12 +283,14 @@ Choice-Learn has been developed through a collaboration between researchers at t
[9] [Stated Preferences for Car Choice in Mixed MNL models for discrete response.](https://www.jstor.org/stable/2678603), McFadden, D. and Kenneth Train (2000)\
[10] [Modeling the Choice of Residential Location](https://onlinepubs.trb.org/Onlinepubs/trr/1978/673/673-012.pdf), McFadden, D. (1978)\
[11] [Recreating passenger mode choice-sets for transport simulation: A case study of London, UK](https://www.icevirtuallibrary.com/doi/10.1680/jsmic.17.00018), Hillel, T.; Elshafie, M. Z. E. B.; Jin, Y. (2018)\
-[12] [ResLogit: A residual neural network logit model for data-driven choice modelling](https://doi.org/10.1016/j.trc.2021.103050), Wong, M.; Farooq, B. (2021)
+[12] [ResLogit: A residual neural network logit model for data-driven choice modelling](https://doi.org/10.1016/j.trc.2021.103050), Wong, M.; Farooq, B. (2021)\
+[13] [Enhancing Discrete Choice Models with Representation Learning](https://arxiv.org/abs/1812.09747), Sifringer, B.; Lurkin, V.; Alahi, A. (2018)

### Code and Repositories

*Official model implementations:*

[1] [RUMnet](https://github.com/antoinedesir/rumnet)\
[7] TasteNet [[Repo1](https://github.com/YafeiHan-MIT/TasteNet-MNL)] [[Repo2](https://github.com/deborahmit/TasteNet-MNL)]\
-[12] [ResLogit](https://github.com/LiTrans/reslogit)
+[12] [ResLogit](https://github.com/LiTrans/reslogit)\
+[13] [Learning-MNL](https://github.com/BSifringer/EnhancedDCM)
11 changes: 10 additions & 1 deletion choice_learn/models/__init__.py
@@ -5,6 +5,7 @@
import tensorflow as tf

from .conditional_logit import ConditionalLogit
+from .learning_mnl import LearningMNL
from .nested_logit import NestedLogit
from .reslogit import ResLogit
from .simple_mnl import SimpleMNL
@@ -18,4 +19,12 @@

logging.info("No GPU detected, importing CPU version of RUMnet.")

__all__ = ["ConditionalLogit", "RUMnet", "SimpleMNL", "TasteNet", "NestedLogit", "ResLogit"]
__all__ = [
"ConditionalLogit",
"RUMnet",
"SimpleMNL",
"TasteNet",
"NestedLogit",
"ResLogit",
"LearningMNL",
]
16 changes: 16 additions & 0 deletions choice_learn/models/base_model.py
@@ -95,6 +95,7 @@ def __init__(
            self.optimizer = tf.keras.optimizers.Adamax(lr)
        elif optimizer.lower() == "lbfgs" or optimizer.lower() == "l-bfgs":
            print("Using L-BFGS optimizer, setting up .fit() function")
+            self.optimizer = "lbfgs"
            self.fit = self._fit_with_lbfgs
        else:
            print(f"Optimizer {optimizer} not implemented, switching for default Adam")
@@ -801,3 +802,18 @@ def _fit_with_lbfgs(self, choice_dataset, sample_weight=None, verbose=0):
f"Algorithm converged before reaching max iterations: {results[0].numpy()}",
)
return {"train_loss": func.history}

+    def assign_lr(self, lr):
+        """Change value of learning rate.
+
+        Parameters
+        ----------
+        lr : float
+            New learning rate value to be assigned.
+        """
+        if isinstance(self.optimizer, tf.keras.optimizers.Optimizer):
+            self.optimizer.lr = lr
+        else:
+            raise NotImplementedError(
+                f"Learning rate cannot be changed for optimizer: {self.optimizer}"
+            )
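
Since the commit adds `assign_lr` without a usage example, here is a minimal sketch of the intended call pattern, assuming `SimpleMNL` forwards `optimizer` and `lr` to this base class, and given some already-built `ChoiceDataset` named `choice_dataset` (a placeholder):

```python
from choice_learn.models import SimpleMNL

model = SimpleMNL(optimizer="Adam", lr=0.01)  # wraps a tf.keras optimizer
model.fit(choice_dataset)                     # first estimation round
model.assign_lr(0.001)                        # shrink the step size in place
model.fit(choice_dataset)                     # continue with the smaller learning rate

lbfgs_model = SimpleMNL(optimizer="lbfgs")    # optimizer now stored as the string "lbfgs"
# lbfgs_model.assign_lr(0.001)                # would raise NotImplementedError
```

The string branch is why the diff above also sets `self.optimizer = "lbfgs"`: without it, `assign_lr` would have no value to type-check against for L-BFGS models.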
195 changes: 195 additions & 0 deletions choice_learn/models/learning_mnl.py
@@ -0,0 +1,195 @@
"""
Implementation of Enhancing Discrete Choice Models with Representation Learning.
https://arxiv.org/abs/1812.09747 .
"""

import logging

import tensorflow as tf

from .conditional_logit import ConditionalLogit


class LearningMNL(ConditionalLogit):
    """Learning MNL from paper https://arxiv.org/abs/1812.09747 .

    Arguments:
    ----------
    coefficients: dict or MNLCoefficients
        Specification of the model to be estimated.
    """

    def __init__(
        self,
        coefficients=None,
        nn_features=[],
        nn_layers_widths=[10],
        nn_activation="relu",
        add_exit_choice=False,
        optimizer="Adam",
        lr=0.001,
        **kwargs,
    ):
        """Initialize the Learning-MNL model.

        Parameters
        ----------
        coefficients : dict or MNLCoefficients
            Dictionary containing the coefficients parametrization of the model.
            The dictionary must have the following structure:
            {feature_name_1: mode_1, feature_name_2: mode_2, ...}
            mode must be among "constant", "item", "item-full" for now
            (same specifications as torch-choice).
        nn_features : list of str
            List of feature names that will be used in the neural network.
            Features used as NN inputs MUST BE shared_features!
        nn_layers_widths : list of int
            List of integers representing the width of each hidden layer in the neural network.
        nn_activation : str, optional
            Activation function of the neural network's hidden layers, by default "relu".
        add_exit_choice : bool, optional
            Whether or not to normalize the probabilities computation with an exit choice
            whose utility would be 1, by default False.
        """
        super().__init__(add_exit_choice=add_exit_choice, optimizer=optimizer, lr=lr, **kwargs)
        self.coefficients = coefficients
        self.nn_features = nn_features
        self.nn_layers_widths = nn_layers_widths
        self.nn_activation = nn_activation
        self.instantiated = False

    def instantiate(self, choice_dataset):
        """Instantiate the model using the features in the choice_dataset.

        Parameters
        ----------
        choice_dataset : ChoiceDataset
            Used to match the features names with the model coefficients.
        """
        if not self.instantiated:
            # Instantiate NN
            nn_input = tf.keras.Input(shape=(len(self.nn_features), 1, 1))
            nn_output = tf.keras.layers.Conv2D(
                filters=self.nn_layers_widths[0],
                kernel_size=[len(self.nn_features), 1],
                activation=self.nn_activation,
                padding="valid",
                name="Dense_NN_per_frame",
            )(nn_input)
            nn_output = tf.keras.layers.Dropout(0.2, name="Regularizer")(nn_output)
            nn_output = tf.keras.layers.Reshape((self.nn_layers_widths[0],))(nn_output)

            for i in range(len(self.nn_layers_widths) - 1):
                nn_output = tf.keras.layers.Dense(
                    units=self.nn_layers_widths[i + 1],
                    activation=self.nn_activation,
                    name="Dense{}".format(i),
                )(nn_output)
                nn_output = tf.keras.layers.Dropout(0.2, name="Drop{}".format(i))(nn_output)
            nn_output = tf.keras.layers.Dense(
                units=choice_dataset.get_n_items(), name="Output_new_feature"
            )(nn_output)

            # Commented-out alternative: a plain Dense network over the shared features.
            # nn_input = tf.keras.Input(shape=(len(self.nn_features), ))
            # x = nn_input
            # for width in self.nn_layers_widths:
            #     x = tf.keras.layers.Dense(width, activation=self.nn_activation)(x)
            #     x = tf.keras.layers.Dropout(0.2, name="Regularizer")(x)
            # nn_output = tf.keras.layers.Dense(choice_dataset.get_n_items())(x)
            self.nn_model = tf.keras.Model(inputs=nn_input, outputs=nn_output)

        super().instantiate(choice_dataset)

    @property
    def trainable_weights(self):
        """Trainable weights of the model."""
        return self._trainable_weights + self.nn_model.trainable_variables

    def compute_batch_utility(
        self,
        shared_features_by_choice,
        items_features_by_choice,
        available_items_by_choice,
        choices,
        verbose=1,
    ):
        """Compute the utility when the model is constructed from a MNLCoefficients object.

        Parameters
        ----------
        shared_features_by_choice : tuple of np.ndarray (choices_features)
            A batch of shared features.
            Shape must be (n_choices, n_shared_features).
        items_features_by_choice : tuple of np.ndarray (choices_items_features)
            A batch of items features.
            Shape must be (n_choices, n_items_features).
        available_items_by_choice : np.ndarray
            A batch of items availabilities.
            Shape must be (n_choices, n_items).
        choices : np.ndarray
            Choices.
            Shape must be (n_choices, ).
        verbose : int, optional
            Parametrization of the logging outputs, by default 1.

        Returns
        -------
        tf.Tensor
            Utilities of shape (n_choices, n_items).
        """
        if not isinstance(shared_features_by_choice, tuple):
            shared_features_by_choice = (shared_features_by_choice,)
        if not isinstance(items_features_by_choice, tuple):
            items_features_by_choice = (items_features_by_choice,)
        knowledge_driven_utilities = super().compute_batch_utility(
            shared_features_by_choice,
            items_features_by_choice,
            available_items_by_choice,
            choices,
            verbose=verbose,
        )
        data_driven_inputs = []
        if self._shared_features_by_choice_names is not None:
            for nn_feature in self.nn_features:
                for i, feat_tuple in enumerate(self._shared_features_by_choice_names):
                    for j, feat in enumerate(feat_tuple):
                        if feat == nn_feature:
                            data_driven_inputs.append(shared_features_by_choice[i][:, j])
        else:
            logging.warning("No shared features found in the dataset.")
        data_driven_utilities = self.nn_model(
            tf.expand_dims(tf.expand_dims(tf.stack(data_driven_inputs, axis=1), axis=-1), axis=-1)
        )
        return knowledge_driven_utilities + data_driven_utilities

    def clone(self):
        """Return a clone of the model."""
        clone = LearningMNL(
            coefficients=self.coefficients,
            add_exit_choice=self.add_exit_choice,
            optimizer=self.optimizer_name,
            nn_features=self.nn_features,
            nn_layers_widths=self.nn_layers_widths,
            nn_activation=self.nn_activation,
        )
        if hasattr(self, "history"):
            clone.history = self.history
        if hasattr(self, "is_fitted"):
            clone.is_fitted = self.is_fitted
        if hasattr(self, "instantiated"):
            clone.instantiated = self.instantiated
        clone.loss = self.loss
        clone.label_smoothing = self.label_smoothing
        if hasattr(self, "report"):
            clone.report = self.report
        if hasattr(self, "trainable_weights"):
            clone._trainable_weights = self.trainable_weights
        if hasattr(self, "nn_model"):
            clone.nn_model = self.nn_model
        if hasattr(self, "lr"):
            clone.lr = self.lr
        if hasattr(self, "_shared_features_by_choice_names"):
            clone._shared_features_by_choice_names = self._shared_features_by_choice_names
        if hasattr(self, "_items_features_by_choice_names"):
            clone._items_features_by_choice_names = self._items_features_by_choice_names
        return clone
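
For orientation, a minimal usage sketch of the new class, not part of the commit: the `load_swissmetro` call and the feature names are assumptions for illustration, while the constructor arguments, the coefficients-dict structure, and the shared-features constraint come from the docstrings above:

```python
from choice_learn.datasets import load_swissmetro  # assumed to return a ChoiceDataset
from choice_learn.models import LearningMNL

dataset = load_swissmetro()

model = LearningMNL(
    # Knowledge-driven part: one mode per feature, among "constant", "item", "item-full".
    coefficients={"travel_time": "constant", "cost": "constant", "intercept": "item"},
    nn_features=["AGE", "INCOME"],  # data-driven inputs: must be shared features
    nn_layers_widths=[100, 50],     # Conv2D "frame" layer width, then one Dense layer
    optimizer="Adam",
    lr=0.001,
)
history = model.fit(dataset)            # fit() is expected to instantiate the NN first
probas = model.predict_probas(dataset)  # assuming the usual ChoiceModel prediction API
```

As in the paper, `compute_batch_utility` returns the sum of the interpretable conditional-logit term and the neural-network term, so the linear coefficients keep their econometric reading while the NN captures what the specification misses.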
7 changes: 2 additions & 5 deletions choice_learn/models/rumnet.py
@@ -529,7 +529,7 @@ def __init__(
label_smoothing : float, optional
Value of smoothing to apply in CrossEntropy loss computation, by default 0.0
"""
-        super().__init__(add_exit_choice=add_exit_choice, optimizer=optimizer, **kwargs)
+        super().__init__(add_exit_choice=add_exit_choice, optimizer=optimizer, lr=lr, **kwargs)
# Number of features
if num_customer_features <= 0:
raise ValueError("Number of customer features must be at least 1.")
@@ -671,9 +671,8 @@ def compute_batch_utility(
]
)
utilities[-1].append(self.u_model(_u))

# Reshape utilities: (batch_size, num_items, heterogeneity)
-        return tf.squeeze(tf.stack(utilities, axis=1), -1)
+        return tf.squeeze(tf.transpose(tf.stack(utilities, axis=1)), 0)

@tf.function
def train_step(
@@ -720,13 +719,11 @@ def train_step(
available_items_by_choice=available_items_by_choice,
choices=choices,
)

# Iterate over heterogeneities
eps_probabilities = tf.nn.softmax(all_u, axis=1)

# Average probabilities over heterogeneities
probabilities = tf.reduce_mean(eps_probabilities, axis=-1)

# It is not in the paper, but let's normalize with availabilities
probabilities = tf.multiply(probabilities, available_items_by_choice)
probabilities = tf.divide(
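
The `train_step` excerpt above is cut off at `tf.divide`. As a reading aid, here is a hedged, self-contained reconstruction of the probability post-processing the comments describe; the shapes follow the reshape comment above, and the final renormalization is an assumption since the original statement is truncated:

```python
import tensorflow as tf

# Toy stand-ins: utilities of shape (batch, items, heterogeneity); third item unavailable.
all_u = tf.random.normal((4, 3, 5))
available_items_by_choice = tf.constant([[1.0, 1.0, 0.0]] * 4)

eps_probabilities = tf.nn.softmax(all_u, axis=1)            # softmax over the items axis
probabilities = tf.reduce_mean(eps_probabilities, axis=-1)  # average over heterogeneities
probabilities = tf.multiply(probabilities, available_items_by_choice)
# Assumed completion of the truncated tf.divide(...): renormalize rows to sum to 1.
probabilities = tf.divide(
    probabilities, tf.reduce_sum(probabilities, axis=1, keepdims=True)
)
```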