Challenge Summary: In this challenge, we take the role of a data scientist at a video streaming company. Our task is to build a machine learning model that predicts which existing subscribers will continue their subscriptions for another month and which will cancel, a problem known as churn prediction. Churn can happen for various reasons, and the company wants to identify the customers at highest risk of canceling so that it can intervene before they leave.
Dataset: We will have access to two datasets: "train.csv" and "test.csv." The "train.csv" dataset contains information about 70% of past subscriptions, including whether each subscription continued into the next month (the "ground truth"). The "test.csv" dataset contains the same kind of information for the remaining 30% of subscriptions but does not disclose whether they continued. Our task is to predict the outcome for each subscription in "test.csv" based on patterns learned from the "train.csv" data.
Objective: We will use our machine learning skills to predict, for each subscription in the "test.csv" dataset, whether it will continue for another month, helping the company direct its retention resources toward the customers most likely to cancel.
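
Because "test.csv" does not include the ground truth, we cannot score our own predictions on it directly. One common way to work around this is to hold back part of the labeled data as a validation set. The sketch below illustrates that workflow under loud assumptions: the challenge does not specify a model or column names, so the logistic regression, the features `months_active` and `hours_watched`, and the helper `make_synthetic_rows` are all hypothetical, and the rows are synthetic stand-ins for "train.csv" rather than real data. It uses only the Python standard library so that it runs anywhere.

```python
import math
import random


def make_synthetic_rows(n, seed=0):
    """Hypothetical stand-in for labeled train.csv rows: (features, continued)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        months_active = rng.uniform(0, 1)  # hypothetical feature, scaled to [0, 1]
        hours_watched = rng.uniform(0, 1)  # hypothetical feature, scaled to [0, 1]
        # Synthetic rule (an assumption): engaged, long-standing subscribers
        # are more likely to continue for another month.
        score = 4.0 * (months_active + hours_watched - 1.0)
        continued = 1 if rng.random() < 1.0 / (1.0 + math.exp(-score)) else 0
        rows.append(([months_active, hours_watched], continued))
    return rows


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, epochs=200, lr=0.5):
    """Fit logistic regression with full-batch gradient descent."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in rows:
            # Prediction error for this row: predicted probability minus label.
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_proba(model, x):
    """Estimated probability that the subscription continues."""
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def accuracy(model, rows):
    """Fraction of rows where thresholding at 0.5 matches the label."""
    hits = sum((predict_proba(model, x) >= 0.5) == bool(y) for x, y in rows)
    return hits / len(rows)


# Hold back 30% of the labeled rows for validation (rows are already random).
rows = make_synthetic_rows(1000, seed=42)
split = int(0.7 * len(rows))
train_rows, holdout_rows = rows[:split], rows[split:]

model = fit_logistic(train_rows)
holdout_accuracy = accuracy(model, holdout_rows)
print(f"Holdout accuracy: {holdout_accuracy:.3f}")
```

In a real attempt we would replace the synthetic rows with the actual columns of "train.csv", most likely use an established library instead of a hand-written model, and then apply the fitted model to "test.csv". Since the company wants to rank customers by risk, the predicted probability from `predict_proba` is often more useful than the hard 0/1 label.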