Actions: openai/evals

Showing runs from all workflows
389 workflow runs

Sandbagging eval
Run new evals #2157: Pull request #1409 opened by ojaffe
November 14, 2023 10:59 1m 46s ojaffe:ollie/Sandbagging-v1
MMP v2 eval
Run unit tests #1505: Pull request #1403 synchronize by ojaffe
November 14, 2023 10:45 2m 6s ojaffe:ollie/MMP_v2
MMP v2 eval
Run new evals #2156: Pull request #1403 synchronize by ojaffe
November 14, 2023 10:45 1m 46s ojaffe:ollie/MMP_v2
Add theory of mind eval
Run unit tests #1504: Pull request #1405 synchronize by inwaves
November 14, 2023 10:05 1m 49s inwaves:feature/theory_of_mind
Add theory of mind eval
Run new evals #2155: Pull request #1405 synchronize by inwaves
November 14, 2023 10:05 2m 19s inwaves:feature/theory_of_mind
Add theory of mind eval
Run new evals #2154: Pull request #1405 synchronize by inwaves
November 14, 2023 10:03 1m 57s inwaves:feature/theory_of_mind
Add theory of mind eval
Run unit tests #1503: Pull request #1405 synchronize by inwaves
November 14, 2023 10:03 1m 54s inwaves:feature/theory_of_mind
Add theory of mind eval
Run new evals #2153: Pull request #1405 synchronize by inwaves
November 14, 2023 09:57 1m 42s inwaves:feature/theory_of_mind
Add theory of mind eval
Run unit tests #1502: Pull request #1405 synchronize by inwaves
November 14, 2023 09:57 1m 56s inwaves:feature/theory_of_mind
Migrate from openai==0.28.1 to openai==1.2.4
Run unit tests #1501: Pull request #1407 opened by johny-b
November 14, 2023 09:52 2m 33s johny-b:migrate
Migrate from openai==0.28.1 to openai==1.2.4
Run new evals #2152: Pull request #1407 opened by johny-b
November 14, 2023 09:52 1m 42s johny-b:migrate
Self-Prompting eval
Run new evals #2151: Pull request #1401 synchronize by JunShern
November 14, 2023 09:13 2m 27s JunShern:jun/self-prompting-eval
Self-Prompting eval
Run unit tests #1500: Pull request #1401 synchronize by JunShern
November 14, 2023 09:13 1m 48s JunShern:jun/self-prompting-eval
Bluff eval
Run new evals #2150: Pull request #1402 synchronize by johny-b
November 14, 2023 08:54 1m 57s johny-b:bluff
Bluff eval
Run unit tests #1499: Pull request #1402 synchronize by johny-b
November 14, 2023 08:54 2m 50s johny-b:bluff
Bluff eval
Run new evals #2149: Pull request #1402 synchronize by johny-b
November 14, 2023 08:22 2m 33s johny-b:bluff
Bluff eval
Run unit tests #1498: Pull request #1402 synchronize by johny-b
November 14, 2023 08:22 2m 29s johny-b:bluff
[Evals] Update the errors we except for retries
Run unit tests #1497: Pull request #1406 synchronize by andrew-openai
November 13, 2023 17:33 2m 43s andrew/update-errors
[Evals] Update the errors we except for retries
Run unit tests #1496: Pull request #1406 synchronize by andrew-openai
November 13, 2023 17:16 2m 25s andrew/update-errors
[Evals] Update the errors we except for retries
Run unit tests #1495: Pull request #1406 opened by andrew-openai
November 13, 2023 17:06 1m 53s andrew/update-errors
Add theory of mind eval
Run new evals #2148: Pull request #1405 opened by inwaves
November 10, 2023 14:24 2m 8s inwaves:feature/theory_of_mind
Add theory of mind eval
Run unit tests #1494: Pull request #1405 opened by inwaves
November 10, 2023 14:24 2m 10s inwaves:feature/theory_of_mind
Sandbagging eval
Run new evals #2147: Pull request #1404 opened by ojaffe
November 10, 2023 14:16 1m 46s ojaffe:ollie/Sandbagging
Sandbagging eval
Run unit tests #1493: Pull request #1404 opened by ojaffe
November 10, 2023 14:16 2m 16s ojaffe:ollie/Sandbagging
MMP v2 eval
Run new evals #2146: Pull request #1403 opened by ojaffe
November 10, 2023 14:02 1m 48s ojaffe:ollie/MMP_v2