main readme
thesofakillers committed Mar 15, 2024
1 parent 5bde851 · commit cf2525c
Showing 1 changed file with 7 additions and 0 deletions.

# OpenAI Evals

Evals provide a framework for evaluating large language models (LLMs) or systems built using LLMs. We offer an existing registry of evals to test different dimensions of OpenAI models, as well as the ability to write your own custom evals for the use cases you care about. You can also use your data to build private evals that represent the common LLM patterns in your workflow without exposing any of that data publicly.
If you are building with LLMs, creating high-quality evals is one of the most impactful things you can do.

<img width="596" alt="https://x.com/gdb/status/1733553161884127435?s=20" src="https://github.com/openai/evals/assets/35577566/ce7840ff-43a8-4d88-bb2f-6b207410333b">

| Eval | Summary of evaluation | Capability targeted |
| --- | --- | --- |
| [Identifying Variables](evals/elsuite/identifying_variables) | Identify the correct experimental variables for testing a hypothesis | AI R&D |

---

## Setup

To run evals, you will need to set up and specify your [OpenAI API key](https://platform.openai.com/account/api-keys). After you obtain an API key, specify it using the [`OPENAI_API_KEY` environment variable](https://platform.openai.com/docs/quickstart/step-2-setup-your-api-key). Please be aware of the [costs](https://openai.com/pricing) associated with using the API when running evals. You can also run and create evals using [Weights & Biases](https://wandb.ai/wandb_fc/openai-evals/reports/OpenAI-Evals-Demo-Using-W-B-Prompts-to-Run-Evaluations--Vmlldzo0MTI4ODA3).
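As a minimal sketch of the setup described above (the placeholder key value, and the commented-out `oaieval` invocation with the model and eval names shown, are illustrative assumptions rather than part of this README):

```shell
# Make the API key available to evals for the current shell session.
# Replace the placeholder with your real key from the OpenAI dashboard.
export OPENAI_API_KEY="sk-your-key-here"

# Hypothetical run: evaluate a model against a registered eval.
# Be aware this calls the OpenAI API and incurs the usual costs.
# oaieval gpt-3.5-turbo test-match
```

To persist the key across sessions, add the `export` line to your shell profile (e.g. `~/.bashrc` or `~/.zshrc`) instead of setting it each time.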
