update example flow for llmlingua prompt compression tool (#3320)
# Description

Please add an informative description that covers the changes made by
the pull request and link all relevant issues.

# All Promptflow Contribution checklist:
- [x] **The pull request does not introduce [breaking changes].**
- [x] **CHANGELOG is updated for new features, bug fixes or other
significant changes.**
- [x] **I have read the [contribution guidelines](../CONTRIBUTING.md).**
- [ ] **Create an issue and link to the pull request to get dedicated
review from promptflow team. Learn more: [suggested
workflow](../CONTRIBUTING.md#suggested-workflow).**

## General Guidelines and Best Practices
- [x] Title of the pull request is clear and informative.
- [x] There are a small number of commits, each of which has an
informative message. This means that previously merged commits do not
appear in the history of the PR. For more information on cleaning up the
commits in your PR, [see this
page](https://github.com/Azure/azure-powershell/blob/master/documentation/development-docs/cleaning-up-commits.md).

### Testing Guidelines
- [x] Pull request includes test coverage for the included changes.
SiyunZhao authored May 21, 2024
1 parent 114b7a3 commit f80a4d9
Showing 12 changed files with 257 additions and 1 deletion.
4 changes: 3 additions & 1 deletion .cspell.json
@@ -109,7 +109,9 @@
"Entra",
"uvicorn",
"attribited",
-"MistralAI"
+"MistralAI",
+"llmlingua",
+"myconn"
],
"ignoreWords": [
"openmpi",
@@ -0,0 +1,70 @@
# Few-shot example compression in GSM8K

## Flow description

A flow to test the accuracy of an LLM (Large Language Model) in answering questions when its context has been compressed.

GSM8K (Grade School Math 8K) is a dataset of 8.5K high-quality, linguistically diverse grade school math word problems. The dataset was created to support the task of question answering on basic mathematical problems that require multi-step reasoning. The following steps are performed in this flow:
1. Read the `.txt` file of few-shot examples.
2. Use the LLMLingua prompt compression tool to compress the GSM8K few-shot examples.
3. Test the LLM by using the compressed few-shot examples as context and checking whether its answers are correct.

See the [`llmlingua-promptflow`](https://pypi.org/project/llmlingua-promptflow/) tool package reference documentation for further information.
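
The correctness check in step 3 can be sketched as follows. This is a minimal, standalone approximation of the logic in the `answer_eval.py` file added by this commit, not a replacement for it: the helper names `last_number` and `is_correct` and the sample strings are illustrative assumptions. It keeps the model response up to the first line containing "answer is", then compares the last number found there with the last number in the reference answer.

```python
import re

# Same pattern as in answer_eval.py: digits with an optional decimal point.
NUMBER_PATTERN = r"\d*\.?\d+"


def last_number(text: str) -> str:
    """Return the last number-like token in `text`, or "" if there is none."""
    matches = re.findall(NUMBER_PATTERN, text)
    return matches[-1] if matches else ""


def is_correct(llm_response: str, reference: str) -> bool:
    """Compare the final number of the response with that of the reference."""
    kept = []
    for line in llm_response.split("\n"):
        kept.append(line)
        if "answer is" in line:
            break  # ignore anything the model generates after its answer
    prediction = "\n".join(kept).replace("Q:", "").replace("A:", "")
    return last_number(prediction) == last_number(reference)


# Hypothetical sample strings, for illustration only.
response = "She sells 16 - 3 - 4 = 9 eggs.\nThe answer is 18\nQ: What is 2 + 5?"
reference = "She makes 9 * 2 = $<<9*2=18>>18 every day.\n#### 18"
print(is_correct(response, reference))  # True
```

Note that the comparison is textual, so a response ending in `18.0` would not match a reference of `18`.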

Tools used in this flow:
- `python` tool.
- `LLMLingua Prompt Compression Tool` from the `llmlingua-promptflow` package.
- `prompt` tool.
- `LLM` tool.

Connections used in this flow:
- `Custom` connection.
- `AzureOpenAI` connection.

## Prerequisites

### Prompt flow SDK:
Install the Prompt flow SDK and other dependencies:
```
pip install -r requirements.txt
```

Note: when using the Prompt flow SDK, it may be useful to also install the [`Prompt flow for VS Code`](https://marketplace.visualstudio.com/items?itemName=prompt-flow.prompt-flow) extension (if using VS Code).

### Azure AI/ML Studio:
Start a compute session. Required packages will automatically be installed from the `requirements.txt` file.

## Setup connections

### Custom connection
Create a connection to a MaaS (model as a service) resource in the Azure model catalog, which is used to calculate log probabilities. You can use Llama, GPT-2, or another language model.

Taking the Llama model as an example, you can learn how to deploy and consume Meta Llama models with model as a service from [the guidance for Azure AI Studio](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-llama?tabs=llama-three#deploy-meta-llama-models-with-pay-as-you-go) or [the guidance for Azure Machine Learning Studio](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-models-llama?view=azureml-api-2&tabs=llama-three#deploy-meta-llama-models-with-pay-as-you-go).

The required keys to set are:
1. **api_url**
    - This value can be found at the previously created inferencing endpoint.
2. **api_key**
    - Set this as a secret value.
    - This value can be found at the previously created inferencing endpoint.

Create a Custom connection with `api_url` and `api_key`.

### AzureOpenAI connection
To use the `LLM` tool, you must have an [Azure OpenAI Service Resource](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal). Create one if necessary. From your Azure OpenAI Service Resource, obtain its `api_key` and `endpoint`.

Create a connection to your Azure OpenAI Service Resource.

## Run flow

### Prompt flow SDK:
```
# Test with default input values in flow.dag.yaml:
pf flow test --flow .
```
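
The same test run can also be started from Python. The following is a rough sketch that assumes the `promptflow` package from `requirements.txt` is installed and that the `maas_connection` and `your_open_ai_connection` connections already exist; the wrapper name `run_flow_test` is hypothetical.

```python
from typing import Optional


def run_flow_test(flow_dir: str = ".", inputs: Optional[dict] = None):
    """Run a single test of the flow, similar to `pf flow test --flow .`."""
    # Imported lazily so this sketch can be loaded without promptflow installed.
    from promptflow.client import PFClient

    client = PFClient()
    # Assumption: with inputs=None, the defaults in flow.dag.yaml are used.
    return client.test(flow=flow_dir, inputs=inputs)
```

Calling `run_flow_test()` from the flow directory is expected to return the flow outputs, here the `Accuracy` value.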

### Azure AI/ML Studio:
Run the flow.

## Contact
Please reach out to LLMLingua Team (<llmlingua@microsoft.com>) with any issues.
@@ -0,0 +1,44 @@

from promptflow.core import tool
import re

# The inputs section will change based on the arguments of the tool function, after you save the code
# Adding type to arguments and return value will help the system show the types properly
# Please update the function name/signature per need


def extract_ans(ans_model):
    ans_model = ans_model.split("\n")
    ans = []
    residual = []
    for li, al in enumerate(ans_model):
        ans.append(al)
        if "answer is" in al:
            break
    residual = list(ans_model[li + 1 :])
    ans = "\n".join(ans)
    residual = "\n".join(residual)
    return ans, residual


def get_result(text: str):
    pattern = r"\d*\.?\d+"
    res = re.findall(pattern, text)
    # return res[-1].replace(".00", "") if res else ""
    return res[-1] if res else ""


def test_answer(pred_str, ans_str):
    pred, gold = get_result(pred_str), get_result(ans_str)
    return pred == gold


@tool
def my_python_tool(llm_response: str, answer: str) -> int:
    print("LLM response: ", llm_response)
    print("Ground Truth Answer: ", answer)
    ans_, residual = extract_ans(llm_response)
    model_ans = ans_.replace("Q:", "").replace("A:", "")
    if test_answer(model_ans, answer):
        return 1
    return 0
@@ -0,0 +1,6 @@
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/AzureOpenAIConnection.schema.json
name: your_open_ai_connection
type: azure_open_ai
api_key: "<user-input>"
api_base: "aoai-api-endpoint"
api_type: "azure"
@@ -0,0 +1,7 @@
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/CustomConnection.schema.json
name: maas_connection
type: custom
configs:
  api_url: "<maas-endpoint>"
secrets:
  api_key: "<to-be-replaced>"
@@ -0,0 +1 @@
Question: Angelo and Melanie want to plan how many hours over the next week they should study together for their test next week. They have 2 chapters of their textbook to study and 4 worksheets to memorize. They figure out that they should dedicate 3 hours to each chapter of their textbook and 1.5 hours for each worksheet. If they plan to study no more than 4 hours each day, how many days should they plan to study total over the next week if they take a 10-minute break every hour, include 3 10-minute snack breaks each day, and 30 minutes for lunch each day?\nLet's think step by step\nAngelo and Melanie think they should dedicate 3 hours to each of the 2 chapters, 3 hours x 2 chapters = 6 hours total.\nFor the worksheets they plan to dedicate 1.5 hours for each worksheet, 1.5 hours x 4 worksheets = 6 hours total.\nAngelo and Melanie need to start with planning 12 hours to study, at 4 hours a day, 12 / 4 = 3 days.\nHowever, they need to include time for breaks and lunch. Every hour they want to include a 10-minute break, so 12 total hours x 10 minutes = 120 extra minutes for breaks.\nThey also want to include 3 10-minute snack breaks, 3 x 10 minutes = 30 minutes.\nAnd they want to include 30 minutes for lunch each day, so 120 minutes for breaks + 30 minutes for snack breaks + 30 minutes for lunch = 180 minutes, or 180 / 60 minutes per hour = 3 extra hours.\nSo Angelo and Melanie want to plan 12 hours to study + 3 hours of breaks = 15 hours total.\nThey want to study no more than 4 hours each day, 15 hours / 4 hours each day = 3.75\nThey will need to plan to study 4 days to allow for all the time they need.\nThe answer is 4\n\nQuestion: Mark's basketball team scores 25 2 pointers, 8 3 pointers and 10 free throws. Their opponents score double the 2 pointers but half the 3 pointers and free throws. 
What's the total number of points scored by both teams added together?\nLet's think step by step\nMark's team scores 25 2 pointers, meaning they scored 25*2= 50 points in 2 pointers.\nHis team also scores 6 3 pointers, meaning they scored 8*3= 24 points in 3 pointers\nThey scored 10 free throws, and free throws count as one point so they scored 10*1=10 points in free throws.\nAll together his team scored 50+24+10= 84 points\nMark's opponents scored double his team's number of 2 pointers, meaning they scored 50*2=100 points in 2 pointers.\nHis opponents scored half his team's number of 3 pointers, meaning they scored 24/2= 12 points in 3 pointers.\nThey also scored half Mark's team's points in free throws, meaning they scored 10/2=5 points in free throws.\nAll together Mark's opponents scored 100+12+5=117 points\nThe total score for the game is both team's scores added together, so it is 84+117=201 points\nThe answer is 201\n\nQuestion: Bella has two times as many marbles as frisbees. She also has 20 more frisbees than deck cards. 
If she buys 2/5 times more of each item, what would be the total number of the items she will have if she currently has 60 marbles?\nLet's think step by step\nWhen Bella buys 2/5 times more marbles, she'll have increased the number of marbles by 2/5*60 = 24\nThe total number of marbles she'll have is 60+24 = 84\nIf Bella currently has 60 marbles, and she has two times as many marbles as frisbees, she has 60/2 = 30 frisbees.\nIf Bella buys 2/5 times more frisbees, she'll have 2/5*30 = 12 more frisbees.\nThe total number of frisbees she'll have will increase to 30+12 = 42\nBella also has 20 more frisbees than deck cards, meaning she has 30-20 = 10 deck cards\nIf she buys 2/5 times more deck cards, she'll have 2/5*10 = 4 more deck cards.\nThe total number of deck cards she'll have is 10+4 = 14\nTogether, Bella will have a total of 14+42+84 = 140 items\nThe answer is 140\n\nQuestion: A group of 4 fruit baskets contains 9 apples, 15 oranges, and 14 bananas in the first three baskets and 2 less of each fruit in the fourth basket. How many fruits are there?\nLet's think step by step\nFor the first three baskets, the number of apples and oranges in one basket is 9+15=24\nIn total, together with bananas, the number of fruits in one basket is 24+14=38 for the first three baskets.\nSince there are three baskets each having 38 fruits, there are 3*38=114 fruits in the first three baskets.\nThe number of apples in the fourth basket is 9-2=7\nThere are also 15-2=13 oranges in the fourth basket\nThe combined number of oranges and apples in the fourth basket is 13+7=20\nThe fourth basket also contains 14-2=12 bananas.\nIn total, the fourth basket has 20+12=32 fruits.\nThe four baskets together have 32+114=146 fruits.\nThe answer is 146\n\nQuestion: You can buy 4 apples or 1 watermelon for the same price. You bought 36 fruits evenly split between oranges, apples and watermelons, and the price of 1 orange is $0.50. 
How much does 1 apple cost if your total bill was $66?\nLet's think step by step\nIf 36 fruits were evenly split between 3 types of fruits, then I bought 36/3 = 12 units of each fruit\nIf 1 orange costs $0.50 then 12 oranges will cost $0.50 * 12 = $6\nIf my total bill was $66 and I spent $6 on oranges then I spent $66 - $6 = $60 on the other 2 fruit types.\nAssuming the price of watermelon is W, and knowing that you can buy 4 apples for the same price and that the price of one apple is A, then 1W=4A\nIf we know we bought 12 watermelons and 12 apples for $60, then we know that $60 = 12W + 12A\nKnowing that 1W=4A, then we can convert the above to $60 = 12(4A) + 12A\n$60 = 48A + 12A\n$60 = 60A\nThen we know the price of one apple (A) is $60/60= $1\nThe answer is 1\n\nQuestion: Susy goes to a large school with 800 students, while Sarah goes to a smaller school with only 300 students. At the start of the school year, Susy had 100 social media followers. She gained 40 new followers in the first week of the school year, half that in the second week, and half of that in the third week. Sarah only had 50 social media followers at the start of the year, but she gained 90 new followers the first week, a third of that in the second week, and a third of that in the third week. 
After three weeks, how many social media followers did the girl with the most total followers have?\nLet's think step by step\nAfter one week, Susy has 100+40 = 140 followers.\nIn the second week, Susy gains 40/2 = 20 new followers.\nIn the third week, Susy gains 20/2 = 10 new followers.\nIn total, Susy finishes the three weeks with 140+20+10 = 170 total followers.\nAfter one week, Sarah has 50+90 = 140 followers.\nAfter the second week, Sarah gains 90/3 = 30 followers.\nAfter the third week, Sarah gains 30/3 = 10 followers.\nSo, Sarah finishes the three weeks with 140+30+10 = 180 total followers.\nThus, Sarah is the girl with the most total followers with a total of 180.\nThe answer is 180\n\nQuestion: Sam bought a dozen boxes, each with 30 highlighter pens inside, for $10 each box. He rearranged five of these boxes into packages of six highlighters each and sold them for $3 per package. He sold the rest of the highlighters separately at the rate of three pens for $2. How much profit did he make in total, in dollars?\nLet's think step by step\nSam bought 12 boxes x $10 = $120 worth of highlighters.\nHe bought 12 * 30 = 360 highlighters in total.\nSam then took 5 boxes × 6 highlighters/box = 30 highlighters.\nHe sold these boxes for 5 * $3 = $15\nAfter selling these 5 boxes there were 360 - 30 = 330 highlighters remaining.\nThese form 330 / 3 = 110 groups of three pens.\nHe sold each of these groups for $2 each, so made 110 * 2 = $220 from them.\nIn total, then, he earned $220 + $15 = $235.\nSince his original cost was $120, he earned $235 - $120 = $115 in profit.\nThe answer is 115\n\nQuestion: In a certain school, 2/3 of the male students like to play basketball, but only 1/5 of the female students like to play basketball. 
What percent of the population of the school do not like to play basketball if the ratio of the male to female students is 3:2 and there are 1000 students?\nLet's think step by step\nThe students are divided into 3 + 2 = 5 parts where 3 parts are for males and 2 parts are for females.\nEach part represents 1000/5 = 200 students.\nSo, there are 3 x 200 = 600 males.\nAnd there are 2 x 200 = 400 females.\nHence, 600 x 2/3 = 400 males play basketball.\nAnd 400 x 1/5 = 80 females play basketball.\nA total of 400 + 80 = 480 students play basketball.\nTherefore, 1000 - 480 = 520 do not like to play basketball.\nThe percentage of the school that do not like to play basketball is 520/1000 * 100 = 52\nThe answer is 52\n
@@ -0,0 +1,69 @@
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Flow.schema.json
inputs:
  question:
    type: string
    default: Janet\u2019s ducks lay 16 eggs per day. She eats three for breakfast
      every morning and bakes muffins for her friends every day with four. She
      sells the remainder at the farmers' market daily for $2 per fresh duck
      egg. How much in dollars does she make every day at the farmers' market?
  answer:
    type: string
    default: Janet sells 16 - 3 - 4 = <<16-3-4=9>>9 duck eggs a day.\nShe makes 9 *
      2 = $<<9*2=18>>18 every day at the farmer\u2019s market.\n#### 18
  few_shot_exp_path:
    type: string
    default: ./few_shot_examples.txt
  chat_history:
    type: list
    default: []
  compression_rate:
    type: double
    default: 0.6
outputs:
  Accuracy:
    type: string
    reference: ${answer_evaluate.output}
nodes:
- name: read_file
  type: python
  source:
    type: code
    path: read_file.py
  inputs:
    file_path: ${inputs.few_shot_exp_path}
- name: LLMLingua_Prompt_Compression_Tool
  type: python
  source:
    type: package
    tool: llmlingua_promptflow.tools.llmlingua.prompt_compress
  inputs:
    myconn: maas_connection
    prompt: ${read_file.output}
    rate: ${inputs.compression_rate}
- name: prompt_node
  type: prompt
  source:
    type: code
    path: prompt_node.jinja2
  inputs:
    question: ${inputs.question}
    few_shot_examples: ${LLMLingua_Prompt_Compression_Tool.output}
- name: llm_node
  type: llm
  source:
    type: code
    path: gpt.jinja2
  inputs:
    chat_history: ${inputs.chat_history}
    question: ${prompt_node.output}
    deployment_name: your_deployment_name
  connection: your_open_ai_connection
  api: chat
- name: answer_evaluate
  type: python
  source:
    type: code
    path: answer_eval.py
  inputs:
    llm_response: ${llm_node.output}
    answer: ${inputs.answer}
@@ -0,0 +1,13 @@

system:
You are a helpful assistant.

{% for item in chat_history %}
user:
{{item.inputs.question}}
assistant:
{{item.outputs.answer}}
{% endfor %}

user:
{{question}}
@@ -0,0 +1,5 @@
Please reference the following examples to answer the math question,
{{few_shot_examples}}

Questions: {{question}}
Let's think step by step.
@@ -0,0 +1,14 @@
from promptflow.core import tool


@tool
def read_file(file_path: str) -> str:
    """
    This tool opens a file and reads its contents into a string.
    :param file_path: the file path of the file to be read.
    """

    with open(file_path, 'r', encoding="utf8") as f:
        file = f.read()
    return file
@@ -0,0 +1,5 @@
transformers>=4.26.0
tiktoken
nltk
numpy
llmlingua-promptflow