Actions: openai/evals

Showing runs from all workflows
389 workflow runs

MMP v2 eval
Run unit tests #1492: Pull request #1403 opened by ojaffe
November 10, 2023 14:02 1m 53s ojaffe:ollie/MMP_v2
Bluff eval
Run new evals #2145: Pull request #1402 synchronize by johny-b
November 10, 2023 12:43 1m 47s johny-b:bluff
Bluff eval
Run unit tests #1491: Pull request #1402 synchronize by johny-b
November 10, 2023 12:43 1m 49s johny-b:bluff
Bluff eval
Run unit tests #1490: Pull request #1402 synchronize by johny-b
November 10, 2023 12:40 2m 2s johny-b:bluff
Bluff eval
Run new evals #2144: Pull request #1402 synchronize by johny-b
November 10, 2023 12:40 1m 51s johny-b:bluff
Bluff eval
Run unit tests #1489: Pull request #1402 opened by johny-b
November 10, 2023 12:35 2m 3s johny-b:bluff
Bluff eval
Run new evals #2143: Pull request #1402 opened by johny-b
November 10, 2023 12:35 1m 55s johny-b:bluff
Self-Prompting eval
Run new evals #2142: Pull request #1401 opened by JunShern
November 10, 2023 12:19 1m 54s JunShern:jun/self-prompting-eval
Self-Prompting eval
Run unit tests #1488: Pull request #1401 opened by JunShern
November 10, 2023 12:19 2m 0s JunShern:jun/self-prompting-eval
icelandic gec eval
Run new evals #2141: Pull request #1400 opened by svanhvitlilja
November 7, 2023 14:08 2m 20s svanhvitlilja:gec-icelandic
icelandic gec eval
Run unit tests #1487: Pull request #1400 opened by svanhvitlilja
November 7, 2023 14:08 1m 56s svanhvitlilja:gec-icelandic
Add new Solvers framework
Run unit tests #1485: Pull request #1397 opened by JunShern
November 5, 2023 15:05 2m 15s JunShern:jun/solvers
Solve #1394
Run unit tests #1484: Pull request #1395 opened by LoryPack
November 3, 2023 15:01 1m 51s LoryPack:fix_few_shot
Schelling Point v2
Run unit tests #1483: Pull request #1391 synchronize by ojaffe
November 1, 2023 15:47 2m 33s james-aung:schelling-patch
Schelling Point v2
Run new evals #2140: Pull request #1391 synchronize by ojaffe
November 1, 2023 15:47 2m 13s james-aung:schelling-patch
Ballots v2
Run new evals #2139: Pull request #1390 synchronize by ojaffe
November 1, 2023 15:46 2m 59s james-aung:ballots-patch
Ballots v2
Run unit tests #1482: Pull request #1390 synchronize by ojaffe
November 1, 2023 15:46 2m 33s james-aung:ballots-patch
Add Eval: name well known security weaknesses
Run new evals #2138: Pull request #1392 opened by ourmony
October 28, 2023 05:43 1m 53s ourmony:main
Add Eval: name well known security weaknesses
Run unit tests #1481: Pull request #1392 opened by ourmony
October 28, 2023 05:43 2m 39s ourmony:main
Schelling Point v2
Run unit tests #1480: Pull request #1391 opened by james-aung
October 27, 2023 16:19 2m 13s james-aung:schelling-patch
Schelling Point v2
Run new evals #2137: Pull request #1391 opened by james-aung
October 27, 2023 16:19 2m 11s james-aung:schelling-patch
Ballots v2
Run new evals #2136: Pull request #1390 opened by james-aung
October 27, 2023 16:11 2m 12s james-aung:ballots-patch
Ballots v2
Run unit tests #1479: Pull request #1390 opened by james-aung
October 27, 2023 16:11 2m 46s james-aung:ballots-patch
Add a recorder for function calls
Run unit tests #1478: Pull request #1389 opened by danesherbs
October 24, 2023 08:21 3m 12s add-fn-call-recording
Add gpt-3.5-turbo-16k support to ctx len getter
Run unit tests #1477: Pull request #1388 opened by danesherbs
October 24, 2023 03:10 2m 20s main