[ENH] Weighted win rates #189

Merged 24 commits on Jan 3, 2024

Conversation

YannDubs
Collaborator

@YannDubs YannDubs commented Dec 26, 2023

Adds the ability to use log probs for computing the win rate, instead of a discrete win rate.

Todo:

  • implement weighted win rate
  • test backward compatibility
  • add tests for weighted win-rate
  • add a good prompt for weighted win rate
  • add generalized metrics that work for logprob preference
  • add ranking and correlation as a metric for the evaluator
  • upload new baseline to HF
  • compute minimal leaderboard with/without logprob and with/without CoT
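
As a rough illustration of the difference between a discrete and a log-prob-weighted win rate, here is a minimal sketch. It is not the code from this PR: the function names, the input layout (one pair of judge log probs per comparison, for "model output preferred" and "baseline output preferred"), and the tie handling are all assumptions made for the example.

```python
import math


def preference_probability(logprob_model: float, logprob_baseline: float) -> float:
    """Probability that the judge prefers the model output.

    Hypothetical helper: renormalizes the two log probs over the two choices,
    subtracting the max first for numerical stability.
    """
    m = max(logprob_model, logprob_baseline)
    p_model = math.exp(logprob_model - m)
    p_baseline = math.exp(logprob_baseline - m)
    return p_model / (p_model + p_baseline)


def discrete_win_rate(pairs: list[tuple[float, float]]) -> float:
    """Each comparison counts as 1 (win), 0 (loss), or 0.5 (exact tie)."""
    if not pairs:
        raise ValueError("need at least one comparison")
    total = 0.0
    for lp_model, lp_baseline in pairs:
        p = preference_probability(lp_model, lp_baseline)
        total += 1.0 if p > 0.5 else (0.5 if p == 0.5 else 0.0)
    return total / len(pairs)


def weighted_win_rate(pairs: list[tuple[float, float]]) -> float:
    """Each comparison contributes the judge's preference probability."""
    if not pairs:
        raise ValueError("need at least one comparison")
    return sum(preference_probability(a, b) for a, b in pairs) / len(pairs)


# Two comparisons: a confident win and a narrow loss (made-up numbers).
example_pairs = [
    (math.log(0.9), math.log(0.1)),
    (math.log(0.4), math.log(0.6)),
]
print(discrete_win_rate(example_pairs))
print(weighted_win_rate(example_pairs))
```

With these made-up inputs the discrete rate treats the narrow loss as a full loss, while the weighted rate still credits the model with the 0.4 probability the judge assigned to it.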

@YannDubs YannDubs changed the title from "[WIP] AlpacaEval 2.0" to "[WIP] Weighted win rate" on Dec 26, 2023
@YannDubs YannDubs changed the title from "[WIP] Weighted win rate" to "[ENH] add weighted win rates" on Dec 30, 2023
@YannDubs YannDubs force-pushed the yann/prompt_tuning_chat4 branch from 6f31def to f02b0e4 on January 2, 2024 16:39
@YannDubs YannDubs changed the title from "[ENH] add weighted win rates" to "[ENH] Alpaca 2.0" on Jan 3, 2024
@YannDubs YannDubs changed the title from "[ENH] Alpaca 2.0" to "[ENH] Weighted win rates" on Jan 3, 2024
@YannDubs YannDubs merged commit 15fd513 into main Jan 3, 2024
2 checks passed