Skip to content

Commit

Permalink
Merge pull request #389 from eric-catalysis/main
Browse files Browse the repository at this point in the history
Updated 2.4, draft version for 2.5
  • Loading branch information
eric-catalysis authored Oct 28, 2024
2 parents 479e191 + ecddf74 commit c6e4bbf
Show file tree
Hide file tree
Showing 14 changed files with 291 additions and 3 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -87,9 +87,9 @@ Let's take a quick look at the [**default quality metrics**](https://learn.micro
| Metric | What does it assess? | How does it work? | When should you use it? | Inputs Needed |
|:--|:--|:--|:--|:--|
| **Groundedness** <br/> 1=ungrounded <br/> 5=grounded | How well does model's generated answers align with information from source data ("context")? | Checks if response corresponds _verifiably_ to source context |When factual correctness and contextual accuracy are key - e.g., is it grounded in "my" product data? | Question, Context, Generated Response |
| **Relevance** <br/> 1=bad <br/> 5=good | Are the model's generated responses pertinent, and directly related, to the given queries? | Assesses ability of responses to capture the key points of context that relate to the query | When evaluating your application's ability to understand the inputs and generate _contextually-relevant_ responses | |
| **Groundedness**| Given support knowledge, does the ANSWER use the information provided by the CONTEXT? | | | |
| **Relevance**| How well does the ANSWER address the main aspects of the QUESTION, based on the CONTEXT? | | | |
| **Relevance** <br/> 1=bad <br/> 5=good | Are the model's generated responses pertinent, and directly related, to the given queries? | Assesses ability of responses to capture the key points of context that relate to the query | When evaluating your application's ability to understand the inputs and generate _contextually-relevant_ responses | Question, Answer|
| **Fluency** 1=bad <br/> 5=fluent| How grammatically and linguistically correct the model's predicted answer is. | Checks quality of individual sentences in the ANSWER? Are they well-written and grammatically correct? | When evaluating your application's ability to generate _readable_ responses| Question, Answer |
| **Coherence** <br/> 1=bad <br/> 5=good | Measures the quality of all sentences in a model's predicted answer and how they fit together naturally. | Checks how well do all sentences in the ANSWER fit together? Do they sound natural when taken as a whole? | When the _readability_ of the response is important | Question, Answer|

To create these custom evaluators, we need to do three things:

Expand Down
288 changes: 288 additions & 0 deletions website/blog-30-days-of-ia-2024/2024-10-18/deploy-with-aca.md

Large diffs are not rendered by default.

Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.

0 comments on commit c6e4bbf

Please sign in to comment.