Add Aligner-2B+Qwen1.5-72B-Chat & Aligner-2B+Claude3 Opus to AlpacaEval #259

AlignInc · 2024-03-18T14:27:39Z

We would like to add Aligner-2B+Qwen1.5-72B-Chat & Aligner-2B+Claude3 Opus to AlpacaEval 2.0. Thank you for such a valuable leaderboard!

These results reproduce the paper "Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction".

Arxiv url: https://arxiv.org/pdf/2402.02416.pdf

Core insight: It is easier to learn the correctional residual between bad and good answers than to directly learn to generate good answers.

YannDubs · 2024-03-20T03:06:50Z

Wow @AlignInc, those are amazing results, and really cool that you can run your method on any (even closed) model!

Unfortunately, I just merged the length-controlled (LC) PR to main and as a result there's a merge conflict. Can you please pull from main and run alpaca_eval --model_outputs … --is_recompute_metrics_only True? That will compute the LC win rate without requiring any new annotations. Sorry for that!

The good news is that your model should perform even better on LC AlpacaEval.

Lmk if you face any issues!

AlignInc · 2024-03-22T05:54:42Z

Hi! @YannDubs,
Can you check it again? We have already resolved the conflict and updated it. Thank you again for such a valuable leaderboard!

YannDubs · 2024-03-22T13:24:52Z

Why add cohere to the requirements?
The files look good to me besides that.

AlignInc · 2024-03-22T14:46:25Z

setup.py

@@ -35,7 +35,7 @@
 ]
 PACKAGES_ALL_API = [
    "anthropic>=0.18",
-    "cohere",
+    "cohere<5.0.0a0",


We are importing class cohere.CohereError here:

alpaca_eval/src/alpaca_eval/decoders/cohere.py

Line 10 in f5046ae

from cohere import CohereError

The CohereError was removed in cohere v5, which was released yesterday (release history on PyPI).

cohere.CohereError in cohere v4: https://github.com/cohere-ai/cohere-python/blob/v4/cohere/__init__.py#L5
cohere.*Error in cohere v5: https://github.com/cohere-ai/cohere-python/blob/67620c348329308186d0b7e771a06795ea718226/src/cohere/__init__.py#L122-L130
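
As an alternative to pinning, the import itself could be made tolerant of both major versions. The following is only a minimal sketch, not what this PR does: it assumes that falling back to the built-in Exception is acceptable when cohere.CohereError cannot be imported, which is broader than the v5 error classes linked above.

```python
# Sketch: import CohereError on cohere v4, degrade gracefully otherwise.
try:
    # cohere v4 exposes CohereError at the package top level.
    from cohere import CohereError
except ImportError:
    # Reached on cohere v5 (the name was removed) or when cohere is not
    # installed at all. Assumption: a broad fallback is acceptable here, so
    # existing `except CohereError:` clauses keep working, at the cost of
    # also catching unrelated exceptions.
    CohereError = Exception


def is_cohere_error(exc: BaseException) -> bool:
    """Hypothetical helper: report whether exc would be caught as CohereError."""
    return isinstance(exc, CohereError)
```

Pinning to cohere<5.0.0a0, as done in this PR, is the narrower fix because it keeps the exception handling specific to the v4 class.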


YannDubs · 2024-03-22T18:19:30Z

That makes sense, thanks!

YannDubs · 2024-03-22T18:21:29Z

requirements.txt

@@ -1,3 +1,4 @@
+cohere<5.0.0a0


But cohere should not be a main requirement, so please remove this line!
It can be a requirement in setup.py, installed only if you use [all].
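
The distinction between a main requirement and an [all] extra can be sketched as follows. This is a minimal illustration, not the repository's actual setup.py: only PACKAGES_ALL_API and the two entries shown in the diff come from this thread, the real list is longer than the truncated hunk, and PACKAGES_CORE and EXTRAS_REQUIRE are hypothetical names.

```python
# Sketch: keep an optional dependency in an extra, not in the core list.

# Hypothetical name: the core requirements, deliberately without cohere.
PACKAGES_CORE = []

# From the diff in this thread (the hunk is truncated, so the real list
# contains more entries than these two).
PACKAGES_ALL_API = [
    "anthropic>=0.18",
    "cohere<5.0.0a0",  # pinned below v5, which removed cohere.CohereError
]

# setuptools accepts a mapping like this through setup(extras_require=...).
# Hypothetical name; packages listed here are installed only on request.
EXTRAS_REQUIRE = {"all": PACKAGES_ALL_API}
```

With a layout like this, a plain install skips cohere, while an install that requests the extra (for example pip install "alpaca_eval[all]", assuming that distribution name) pulls in the pinned version.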

YannDubs · 2024-03-22T18:37:31Z

Congrats @AlignInc, those are really impressive results and I’m looking forward to seeing how the community picks it up 💯

Sorry for the additional work you had to do for this PR!

AlignInc · 2024-03-22T18:41:14Z

Thanks for your time~

AlignInc added 2 commits March 22, 2024 22:06

update benchmark results

ac25e87

pin cohere version

f2e43fe

AlignInc commented Mar 22, 2024

YannDubs requested changes Mar 22, 2024

remove line in requirements.txt

28e6411

AlignInc requested a review from YannDubs March 22, 2024 18:29

YannDubs merged commit d7ff7c9 into tatsu-lab:main Mar 22, 2024
1 check passed
