Actions: openai/evals

Showing runs from all workflows
389 workflow runs

Already Said That Eval
Run unit tests #1650: Pull request #1490 synchronize by thesofakillers
March 15, 2024 14:22 2m 23s thesofakillers:ast
Already Said That Eval
Run new evals #2221: Pull request #1490 synchronize by thesofakillers
March 15, 2024 14:22 2m 29s thesofakillers:ast
Track the Stat Eval
Run unit tests #1649: Pull request #1489 opened by thesofakillers
March 15, 2024 14:06 2m 51s thesofakillers:tts
Track the Stat Eval
Run new evals #2220: Pull request #1489 opened by thesofakillers
March 15, 2024 14:06 3m 36s thesofakillers:tts
Identifying Variables Eval
Run new evals #2219: Pull request #1488 synchronize by thesofakillers
March 15, 2024 13:46 3m 33s thesofakillers:idvars
Identifying Variables Eval
Run unit tests #1648: Pull request #1488 synchronize by thesofakillers
March 15, 2024 13:46 2m 35s thesofakillers:idvars
Identifying Variables Eval
Run new evals #2218: Pull request #1488 synchronize by thesofakillers
March 15, 2024 13:45 4m 5s thesofakillers:idvars
Identifying Variables Eval
Run unit tests #1647: Pull request #1488 synchronize by thesofakillers
March 15, 2024 13:45 2m 26s thesofakillers:idvars
Identifying Variables Eval
Run unit tests #1646: Pull request #1488 opened by thesofakillers
March 15, 2024 13:38 2m 39s thesofakillers:idvars
Identifying Variables Eval
Run new evals #2217: Pull request #1488 opened by thesofakillers
March 15, 2024 13:38 2m 33s thesofakillers:idvars
Can't Do That Anymore Eval
Run unit tests #1645: Pull request #1487 opened by ojaffe
March 15, 2024 10:54 2m 9s ojaffe:ollie/cant_do_that_anymore
Can't Do That Anymore Eval
Run new evals #2216: Pull request #1487 opened by ojaffe
March 15, 2024 10:54 2m 7s ojaffe:ollie/cant_do_that_anymore
Bugged Tools Eval
Run new evals #2215: Pull request #1486 opened by ojaffe
March 15, 2024 10:37 2m 5s ojaffe:ollie/bugged_tools
Bugged Tools Eval
Run unit tests #1644: Pull request #1486 opened by ojaffe
March 15, 2024 10:37 2m 6s ojaffe:ollie/bugged_tools
Error Recovery Eval
Run new evals #2214: Pull request #1485 synchronize by ojaffe
March 15, 2024 10:32 2m 10s ojaffe:ollie/error_recovery
Error Recovery Eval
Run unit tests #1643: Pull request #1485 synchronize by ojaffe
March 15, 2024 10:32 2m 10s ojaffe:ollie/error_recovery
Error Recovery Eval
Run new evals #2213: Pull request #1485 opened by ojaffe
March 15, 2024 10:25 2m 48s ojaffe:ollie/error_recovery
Error Recovery Eval
Run unit tests #1642: Pull request #1485 opened by ojaffe
March 15, 2024 10:25 2m 47s ojaffe:ollie/error_recovery
Updates on existing evals; readmes; solvers (#1483)
Run unit tests #1641: Commit 11c30b2 pushed by JunShern
March 13, 2024 10:20 2m 28s main
Updates on existing evals; readmes; solvers
Run new evals #2212: Pull request #1483 opened by ojaffe
March 13, 2024 09:45 2m 16s ojaffe:ollie/updates-20240313
Updates on existing evals; readmes; solvers
Run unit tests #1640: Pull request #1483 opened by ojaffe
March 13, 2024 09:45 2m 25s ojaffe:ollie/updates-20240313
Log model and usage stats in record.sampling
Run unit tests #1638: Pull request #1449 synchronize by JunShern
March 13, 2024 07:48 2m 29s jun/log-token-counts
Drop two datasets from steganography (#1481)
Run unit tests #1637: Commit 7e958fe pushed by JunShern
March 12, 2024 09:23 2m 50s main
Drop two datasets from steganography
Run unit tests #1636: Pull request #1481 opened by thesofakillers
March 12, 2024 07:54 2m 3s thesofakillers:steg-data