Tutorial on Continuous Monitoring using ML Pipelines (#3644)
# Description

A tutorial about creating a pipeline for continuous monitoring.
It describes an advanced use case of [running flows in ML
pipelines](https://github.com/microsoft/promptflow/tree/main/examples/tutorials/run-flow-with-pipeline).

# All Promptflow Contribution checklist:
- [x] **The pull request does not introduce [breaking changes].**
- [x] **CHANGELOG is updated for new features, bug fixes or other
significant changes.**
- [x] **I have read the [contribution
guidelines](https://github.com/microsoft/promptflow/blob/main/CONTRIBUTING.md).**
- [x] **I confirm that all new dependencies are compatible with the MIT
license.**
- [x] **Create an issue and link to the pull request to get dedicated
review from promptflow team. Learn more: [suggested
workflow](../CONTRIBUTING.md#suggested-workflow).**

## General Guidelines and Best Practices
- [x] Title of the pull request is clear and informative.
- [x] There are a small number of commits, each of which has an
informative message. This means that previously merged commits do not
appear in the history of the PR. For more information on cleaning up the
commits in your PR, [see this
page](https://github.com/Azure/azure-powershell/blob/master/documentation/development-docs/cleaning-up-commits.md).

### Testing Guidelines
- [ ] Pull request includes test coverage for the included changes.

Co-authored-by: Philip Gao <yigao@microsoft.com>
Co-authored-by: Brynn Yin <24237253+brynn-code@users.noreply.github.com>
3 people authored Aug 14, 2024
1 parent 027dbd9 commit f9936e5
Showing 15 changed files with 1,023 additions and 0 deletions.
@@ -0,0 +1,69 @@
# Continuous Monitoring Pipeline

This tutorial describes an advanced use case of [running flows in Azure ML Pipelines](https://github.com/microsoft/promptflow/blob/main/examples/tutorials/run-flow-with-pipeline/pipeline.ipynb).
Detailed explanations of the prerequisites and underlying principles can be found in that tutorial.
Continuous monitoring is necessary to maintain the quality, performance, and efficiency of Generative AI applications.
These factors directly impact the user experience and operational costs.

We will run evaluations on a basic chatbot flow, then aggregate the results to export and visualize the metrics.
The flows used in this pipeline are described below:
- [Basic Chat](https://github.com/microsoft/promptflow/tree/main/examples/flows/chat/chat-basic)
- [Q&A Evaluation](https://github.com/microsoft/promptflow/tree/main/examples/flows/evaluation/eval-qna-rag-metrics)
- [Perceived Intelligence Evaluation](https://github.com/microsoft/promptflow/tree/main/examples/flows/evaluation/eval-perceived-intelligence)
- [Summarization Evaluation](https://github.com/microsoft/promptflow/tree/main/examples/flows/evaluation/eval-summarization)

Connections used in this flow:
- `azure_open_ai_connection` (Azure OpenAI).

## Prerequisites

### Prompt flow SDK:
- Azure cloud setup:
  - An Azure account with an active subscription - [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F)
  - Create an Azure ML resource from the Azure portal - [Create an Azure ML workspace](https://ms.portal.azure.com/#view/Microsoft_Azure_Marketplace/MarketplaceOffersBlade/searchQuery/machine%20learning)
  - Connect to your workspace, then set up a basic compute cluster - [Configure workspace](https://github.com/microsoft/promptflow/blob/main/examples/configuration.ipynb)
- Local environment setup:
  - A Python environment
  - The Azure Machine Learning Python SDK v2 installed - [install instructions](https://github.com/microsoft/promptflow/blob/main/examples/README.md) - check the getting started section and make sure the installed version of `azure-ai-ml` is higher than `1.12.0` (a quick check is sketched below)

Note: when using the Prompt flow SDK, it may also be useful to install the [Prompt flow for VS Code](https://marketplace.visualstudio.com/items?itemName=prompt-flow.prompt-flow) extension if you work in VS Code.
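
As an optional sanity check of the local environment, a sketch like the following (standard library only) can confirm the installed `azure-ai-ml` version:

```python
# Verify that the installed azure-ai-ml SDK meets the minimum version noted above.
from importlib.metadata import version

installed = version("azure-ai-ml")
print(f"azure-ai-ml {installed}")
assert tuple(int(p) for p in installed.split(".")[:2]) >= (1, 12), "please upgrade azure-ai-ml"
```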

### Azure AI/ML Studio:
Start a compute session.
Then follow the installation steps described in the notebook.

## Setup connections
Ensure that you have a connection to Azure OpenAI with the following deployments:
- `gpt-35-turbo`
- `gpt-4`
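
If you are running the flows locally with the Prompt flow SDK, a connection with this name can be registered as sketched below (endpoint and key are placeholders; for pipeline runs in Azure ML, create the equivalent workspace connection in the Studio UI instead):

```python
# Minimal sketch for registering the Azure OpenAI connection locally.
# The endpoint and key are placeholders; the resource must host the
# gpt-35-turbo and gpt-4 deployments listed above.
from promptflow.client import PFClient
from promptflow.entities import AzureOpenAIConnection

pf = PFClient()
connection = AzureOpenAIConnection(
    name="azure_open_ai_connection",
    api_base="https://<your-resource>.openai.azure.com/",  # placeholder endpoint
    api_key="<your-api-key>",                              # placeholder key
    api_type="azure",
)
pf.connections.create_or_update(connection)
```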

## Run pipeline

Run the notebook's steps until `3.2.2 Submit the job` to start the pipeline in Azure ML Studio.
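
For reference, the submission step boils down to something like the sketch below, assuming `pipeline_job` has already been assembled as in the notebook and a workspace `config.json` is available; the experiment name is a placeholder:

```python
# Minimal sketch of submitting the assembled pipeline to Azure ML.
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Submit the pipeline job and stream its logs until it finishes in Azure ML Studio.
run = ml_client.jobs.create_or_update(pipeline_job, experiment_name="continuous-monitoring")
ml_client.jobs.stream(run.name)
```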

## Pipeline description
The first node reads the evaluation dataset.
The second node is the main flow to be monitored; it takes the output of the evaluation dataset as its `data` input.
After the main flow's node has completed, its output is passed to three nodes:
- Q&A Evaluation
- Perceived Intelligence Evaluation
- Simple Summarization

The outputs of the Simple Summarization node and of the main flow node then become the inputs of the Summarization Evaluation node.

Finally, all evaluation metrics are aggregated and displayed in the Azure ML pipeline interface.

![continuous_monitoring_pipeline.png](./monitoring/media/continuous_monitoring_pipeline.png)
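
As a rough illustration of how this graph can be assembled with the `azure-ai-ml` SDK, each flow folder is loaded as a pipeline component through its `flow.dag.yaml`. The notebook in this folder is the authoritative version; the paths, compute name, data asset, and the column-mapping inputs (omitted here for brevity) are placeholders:

```python
# Illustrative sketch only: paths, input/output names, and the compute target are
# placeholders; see the notebook for the exact wiring.
from azure.ai.ml import Input, dsl, load_component

# Prompt flow folders are loaded as pipeline components through their flow.dag.yaml.
chat_flow = load_component("../../flows/chat/chat-basic/flow.dag.yaml")
eval_qna_rag = load_component("../../flows/evaluation/eval-qna-rag-metrics/flow.dag.yaml")
eval_perceived_intelligence = load_component(
    "../../flows/evaluation/eval-perceived-intelligence/flow.dag.yaml"
)
simple_summarization = load_component("./monitoring/simple_summarization/flow.dag.yaml")
eval_summarization = load_component("../../flows/evaluation/eval-summarization/flow.dag.yaml")
convert_to_parquet = load_component("./monitoring/components/convert_parquet.yaml")


@dsl.pipeline(default_compute="cpu-cluster")  # placeholder compute cluster name
def continuous_monitoring(eval_data: Input):
    # Main flow under monitoring, fed by the evaluation dataset.
    chat = chat_flow(data=eval_data)

    # Fan the chat outputs out to the evaluation and summarization branches.
    qna_metrics = eval_qna_rag(data=chat.outputs.flow_outputs)
    perceived_intelligence = eval_perceived_intelligence(data=chat.outputs.flow_outputs)
    summary = simple_summarization(data=chat.outputs.flow_outputs)
    summarization_metrics = eval_summarization(data=summary.outputs.flow_outputs)

    # Aggregation node: merges the metric files on line_number and logs averages to MLflow.
    convert_to_parquet(
        eval_qna_rag_metrics_output_folder=qna_metrics.outputs.flow_outputs,
        eval_perceived_intelligence_output_folder=perceived_intelligence.outputs.flow_outputs,
        eval_summarization_output_folder=summarization_metrics.outputs.flow_outputs,
    )


pipeline_job = continuous_monitoring(
    eval_data=Input(type="uri_file", path="azureml:eval-dataset:1")  # placeholder data asset
)
```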

## Metrics visualization
The aggregated metrics are displayed in the Metrics tab of the `Convert evaluation results to parquet` node.

![metrics_tab.png](./monitoring/media/metrics_tab.png)

The evolution of the metrics can be monitored by comparing multiple pipeline runs:

![compare_button.png](./monitoring/media/compare_button.png)

![compare_metrics.png](./monitoring/media/compare_metrics.png)
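
The same comparison can be done programmatically through MLflow, as sketched below (assumes `azureml-mlflow` is installed and that the experiment name matches the one used at submission time; the name here is a placeholder):

```python
# Sketch: pull the aggregated metrics of past pipeline runs for comparison.
import mlflow
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient.from_config(credential=DefaultAzureCredential())
mlflow.set_tracking_uri(
    ml_client.workspaces.get(ml_client.workspace_name).mlflow_tracking_uri
)

# Each pipeline run logs one averaged value per evaluation metric.
runs = mlflow.search_runs(experiment_names=["continuous-monitoring"])  # placeholder name
print(runs.filter(regex=r"^metrics\.|^start_time$").sort_values("start_time"))
```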

## Contact
Please reach out to Lou Bigard (<loubigard@microsoft.com>) with any issues.
@@ -0,0 +1,45 @@
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Flow.schema.json
environment:
  python_requirements_txt: requirements.txt
inputs:
  answer:
    type: string
outputs:
  summary:
    type: string
    reference: ${summarize_text_content.output}
nodes:
- name: summarize_text_content
  use_variants: true
node_variants:
  summarize_text_content:
    default_variant_id: variant_0
    variants:
      variant_0:
        node:
          type: llm
          source:
            type: code
            path: summarize_text_content.jinja2
          inputs:
            deployment_name: gpt-35-turbo
            model: gpt-3.5-turbo
            max_tokens: 128
            temperature: 0.2
            text: ${inputs.answer}
          connection: open_ai_connection
          api: chat
      variant_1:
        node:
          type: llm
          source:
            type: code
            path: summarize_text_content__variant_1.jinja2
          inputs:
            deployment_name: gpt-35-turbo
            model: gpt-3.5-turbo
            max_tokens: 256
            temperature: 0.3
            text: ${inputs.answer}
          connection: open_ai_connection
          api: chat
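
The flow definition above declares two variants of the `summarize_text_content` node (different prompts, token limits, and temperatures). As a rough sketch (the flow folder and data paths are placeholders), a specific variant can be selected when running the flow with the Prompt flow SDK:

```python
# Sketch: run the summarization flow with its second variant via the Prompt flow SDK.
from promptflow.client import PFClient

pf = PFClient()
run = pf.run(
    flow="./monitoring/simple_summarization",        # placeholder: folder containing the flow.dag.yaml above
    data="./data/answers.jsonl",                     # placeholder JSONL with an `answer` column
    column_mapping={"answer": "${data.answer}"},
    variant="${summarize_text_content.variant_1}",   # the 256-token, temperature-0.3 variant
)
pf.stream(run)
```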
@@ -0,0 +1,2 @@
promptflow[azure]>=1.7.0
promptflow-tools
@@ -0,0 +1,7 @@
# system:
Please summarize the following text in one paragraph. 100 words.
Do not add any information that is not in the text.

# user:
Text: {{text}}
Summary:
@@ -0,0 +1,7 @@
# system:
Please summarize some keywords of this paragraph and include some details for each keyword.
Do not add any information that is not in the text.

# user:
Text: {{text}}
Summary:
@@ -0,0 +1,10 @@
name: convert_to_parquet
channels:
  - defaults
dependencies:
  - python=3.10
  - pip=22.2
  - pip:
      - azureml-mlflow==1.56.0
      - pandas==2.2.2
      - pyarrow
@@ -0,0 +1,116 @@
from pathlib import Path
import pandas as pd
import mlflow
import argparse
import datetime
from functools import reduce


def parse_args():
    # setup argparse
    parser = argparse.ArgumentParser()

    # add arguments
    parser.add_argument(
        "--eval_qna_rag_metrics_output_folder",
        type=str,
        help="path containing data for qna rag evaluation metrics",
    )
    parser.add_argument(
        "--eval_perceived_intelligence_output_folder",
        type=str,
        default="./",
        help="input path for perceived intelligence evaluation metrics",
    )

    parser.add_argument(
        "--eval_summarization_output_folder",
        type=str,
        default="./",
        help="input path for summarization evaluation metrics",
    )

    parser.add_argument(
        "--eval_results_output",
        type=str,
        default="./",
        help="output path for aggregated metrics",
    )

    # parse args
    args = parser.parse_args()

    # return args
    return args


def get_file(f):
    # Each evaluation output is mounted as a folder containing a single JSONL file;
    # return that file (or the path itself if it already points to a file).
    f = Path(f)
    if f.is_file():
        return f
    else:
        files = list(f.iterdir())
        if len(files) == 1:
            return files[0]
        else:
            raise Exception("********This path contains more than one file*******")


def convert_to_parquet(
    eval_qna_rag_metrics_output_folder,
    eval_perceived_intelligence_output_folder,
    eval_summarization_output_folder,
    eval_results_output,
):
    now = f"{datetime.datetime.now():%Y%m%d%H%M%S}"

    eval_qna_rag_metrics_file = get_file(eval_qna_rag_metrics_output_folder)
    eval_qna_rag_metrics_data = pd.read_json(eval_qna_rag_metrics_file, lines=True)

    eval_perceived_intelligence_file = get_file(
        eval_perceived_intelligence_output_folder
    )
    eval_perceived_intelligence_data = pd.read_json(
        eval_perceived_intelligence_file, lines=True
    )

    eval_summarization_file = get_file(eval_summarization_output_folder)
    eval_summarization_data = pd.read_json(eval_summarization_file, lines=True)

    # Join the three evaluation outputs row by row on the flow's line_number column.
    all_dataframes = [
        eval_qna_rag_metrics_data,
        eval_perceived_intelligence_data,
        eval_summarization_data,
    ]
    eval_results_data = reduce(
        lambda left, right: pd.merge(left, right, on="line_number"), all_dataframes
    )

    eval_results_data["timestamp"] = pd.Timestamp("now")

    eval_results_data.to_parquet(eval_results_output + f"/{now}_eval_results.parquet")

    # Log the average of each metric to MLflow so it appears in the node's Metrics tab.
    eval_results_data_mean = eval_results_data.mean(numeric_only=True)

    for metric, avg in eval_results_data_mean.items():
        if metric == "line_number":
            continue
        mlflow.log_metric(metric, avg)


def main(args):
    convert_to_parquet(
        args.eval_qna_rag_metrics_output_folder,
        args.eval_perceived_intelligence_output_folder,
        args.eval_summarization_output_folder,
        args.eval_results_output,
    )


# run script
if __name__ == "__main__":
    # parse args
    args = parse_args()

    # call main function
    main(args)
@@ -0,0 +1,20 @@
$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
type: command

name: convert_to_parquet
display_name: Convert evaluation results to parquet
inputs:
  eval_qna_rag_metrics_output_folder:
    type: uri_folder
  eval_perceived_intelligence_output_folder:
    type: uri_folder
  eval_summarization_output_folder:
    type: uri_folder
outputs:
  eval_results_output:
    type: uri_folder
code: ./
command: python convert_parquet.py --eval_qna_rag_metrics_output_folder ${{inputs.eval_qna_rag_metrics_output_folder}} --eval_perceived_intelligence_output_folder ${{inputs.eval_perceived_intelligence_output_folder}} --eval_summarization_output_folder ${{inputs.eval_summarization_output_folder}} --eval_results_output ${{outputs.eval_results_output}}
environment:
  conda_file: ./conda.yaml
  image: mcr.microsoft.com/azureml/inference-base-2004:latest