Add JUnit 5 ScorerExtension for AI model evaluation and associated library #1173

Merged (1 commit) on Dec 18, 2024

cescoffier
Collaborator

  • Implement ScorerExtension to inject and manage Scorer instances in tests.
  • Support field and parameter injection for Scorer using @ScorerConfiguration.
  • Add support for parameter injection of samples via @SampleLocation annotation.
  • Provide built-in evaluation strategies:
    • SemanticSimilarityStrategy (cosine similarity-based evaluation).
    • AiJudgeStrategy (AI-powered evaluation with customizable prompts).
  • Add tests for ScorerExtension:
    • Validate field and parameter injection of Scorer.
    • Test sample injection from YAML files.
    • Verify evaluation strategies and reporting.
  • Document ScorerExtension:
    • Explain concepts: Scorer, Samples, Evaluation Strategies, Reports.
    • Usage examples for field/parameter injection and evaluation.
    • Guide for using built-in strategies and creating custom strategies.
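To make the "cosine similarity-based evaluation" idea concrete, here is a minimal, self-contained sketch of a similarity strategy. It is not the code from this PR: the `EvaluationSample` record and `EvaluationStrategy` interface are local stand-ins that only mirror the `evaluate(sample, output)` shape, and the word-count "embedding" is a toy assumption standing in for a real embedding model.

```java
import java.util.HashMap;
import java.util.Map;

class SimilaritySketch {

    // Stand-in for the library's sample type; only expectedOutput() is modeled (assumption).
    record EvaluationSample<T>(T expectedOutput) {}

    // Stand-in for a strategy contract with the evaluate(sample, output) shape (assumption).
    interface EvaluationStrategy<T> {
        boolean evaluate(EvaluationSample<T> sample, T output);
    }

    // Toy embedding: lower-cased word counts. A real strategy would call an embedding model.
    static Map<String, Integer> bagOfWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity of two sparse vectors; returns 0.0 when either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (var entry : a.entrySet()) {
            dot += entry.getValue() * b.getOrDefault(entry.getKey(), 0);
            normA += entry.getValue() * entry.getValue();
        }
        for (int value : b.values()) {
            normB += value * value;
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // A sample passes when the similarity to the expected output reaches the threshold.
    static EvaluationStrategy<String> similarity(double minSimilarity) {
        return (sample, output) ->
                cosine(bagOfWords(sample.expectedOutput()), bagOfWords(output)) >= minSimilarity;
    }

    public static void main(String[] args) {
        var sample = new EvaluationSample<>("Paris is the capital of France");
        var strategy = similarity(0.8);
        System.out.println(strategy.evaluate(sample, "The capital of France is Paris"));
        System.out.println(strategy.evaluate(sample, "I do not know"));
    }
}
```

The threshold is the only tuning knob in this sketch. A custom strategy would follow the same pattern: implement the single `evaluate` method and return whether the output is acceptable for the sample.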

cescoffier requested a review from a team as a code owner on December 18, 2024 09:42
Member

gsmet left a comment

I read the doc to better understand this thing and I noticed a few tiny things (but feel free to ignore, they are tiny tiny).

docs/modules/ROOT/pages/testing.adoc: 5 review threads, all resolved (4 outdated)
Collaborator

geoand left a comment

Very neat, this was sorely needed!

I added a few small comments.

@Override
public boolean evaluate(EvaluationSample<String> sample, String output) {
    String expectedOutput = sample.expectedOutput();
    String prompt = this.prompt
Collaborator

I could see this being an actual template in the future, but no reason to overcomplicate for now

Collaborator Author

Yes. This strategy needs a bit of tuning.
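For readers wondering what "an actual template" could look like here, the following is a rough sketch of filling named placeholders in a judge prompt. It is an illustration only, not the strategy's implementation: the `{expected}` and `{response}` placeholder names, the `fill` helper, and the prompt wording are all hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PromptTemplateSketch {

    // Matches {name} placeholders made of word characters.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Replaces each {name} with its value in a single pass.
    // Unknown placeholders are left untouched; values are inserted literally.
    static String fill(String template, Map<String, String> values) {
        return PLACEHOLDER.matcher(template).replaceAll(match ->
                Matcher.quoteReplacement(values.getOrDefault(match.group(1), match.group())));
    }

    public static void main(String[] args) {
        // Hypothetical judge prompt; the real prompt text is not shown in this thread.
        String template = "Expected answer: {expected}\nActual answer: {response}\nAre they equivalent?";
        System.out.println(fill(template, Map.of("expected", "4", "response", "four")));
    }
}
```

A single regex pass avoids re-substituting text that arrives inside a value, which chained `String.replace` calls can do when a model response happens to contain a placeholder.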

Collaborator

geoand left a comment

🎉


quarkus-bot bot commented Dec 18, 2024

Status for workflow Build (on pull request)

This is the status report for running Build (on pull request) on commit 7591433.

✅ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

geoand merged commit 32e3cd2 into quarkiverse:main on Dec 18, 2024
67 checks passed
3 participants