Releases: truera/trulens
TruLens-Eval-0.18.2
Changelog
- Unpin typing-extensions, typing-inspect (#590)
- Make CI Pipeline run daily (#599)
- Increase test coverage to all /quickstart* notebooks (#601)
Examples
- Add notebook to use for dev and debugging (#605)
- Add example for multimodal rag eval (#617)
- Add example for finetuning experiments in bedrock (#618)
Bug Fixes
- Fix helpfulness prompt (#594)
- Serialize OpenAI Client (#595)
- Removed extra reset cell in quickstart (#597)
- Fix langchain prompt template imports in examples (#602)
- Change model_id -> model_engine in Bedrock example (#612)
- Fix prompt swapping in model agreement feedback (#615)
- Fix > character in groundedness prompt (#623)
TruLens-Eval-0.18.0
Evaluate and Track LLM Applications
Changelog
- Migrate to OpenAI v1.
- Known issues with async.
TruLens Eval v0.17.0
Changelog
- Add criteria and improve chain of thought prompting for evals
- Allow feedback functions to be in different directions with appropriate coloring/emojis
- Filter leaderboard feedback function results to only those available for the given app id
- Add smoke testing/benchmarking for groundedness based on SummEval dataset
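The idea of feedback "direction" can be illustrated with a small sketch. It is not the TruLens implementation: the `FeedbackResult` class, the `status_emoji` helper, the 0.5 threshold, and the chosen emojis are all hypothetical names and values introduced here for illustration. It assumes scores are normalized to [0, 1] and shows how a lower-is-better score (such as toxicity) can be flipped before an indicator is chosen.

```python
from dataclasses import dataclass


@dataclass
class FeedbackResult:
    """Hypothetical container for one feedback score (not a TruLens class)."""
    name: str
    score: float            # assumed to be normalized to [0, 1]
    higher_is_better: bool  # the "direction" of the feedback function


def status_emoji(result: FeedbackResult, threshold: float = 0.5) -> str:
    """Pick a pass/fail indicator that respects the feedback direction."""
    # Flip lower-is-better scores so that "good" always means a high value.
    goodness = result.score if result.higher_is_better else 1.0 - result.score
    return "✅" if goodness >= threshold else "❌"


relevance = FeedbackResult("relevance", 0.9, higher_is_better=True)
toxicity = FeedbackResult("toxicity", 0.9, higher_is_better=False)
print(relevance.name, status_emoji(relevance))
print(toxicity.name, status_emoji(toxicity))
```

The same raw score of 0.9 reads as good for relevance and bad for toxicity, which is why the direction has to be carried alongside the score.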
Bug Fixes
- Fix issue with LiteLLM provider
- Allow Groundedness to run with any LLM provider
Examples
- Using Anthropic Claude to run feedback functions
TruLens Eval v0.16.0
Library containing evaluations of LLM Applications
Changelog
- [MLNN-1020] App runner UI updates by @walnutdust in #503
- Fix groundedness aggregation flakiness + incorrect 0 resolution. by @joshreini1 in #501
- [MLNN-1053] Add groundedness to Pinecone notebook by @ejisoo in #506
- [MLNN-1046] Example app with TruLens by @daniel-huang-1230 in #500
- [MLNN-1053] Update dependencies in pinecone notebook by @ejisoo in #507
- threading robustness and feedback retrieval by @piotrm0 in #480
- Merge generated docs and test files into main by @github-actions in #508
- updating JSONPath and features by @piotrm0 in #502
- handle no pii by @joshreini1 in #504
- Update Instrumentation Overview page: fix link, add trucustom by @joshreini1 in #505
- bugfixes by @piotrm0 in #510
- dashboard appui quickstart by @piotrm0 in #511
- Fix app UI showing the record twice by @walnutdust in #513
- Release branch 0.16.0 by @daniel-huang-1230 in #514
Bug Fixes
- Fix App UI, links, icons
New Contributors
- @ejisoo made their first contribution in #506
- @daniel-huang-1230 made their first contribution in #500
Full Changelog: trulens-eval-0.15.3...trulens-eval-0.16.0
TruLens Eval v0.15.3
Library containing evaluations of LLM Applications
Bug Fixes
- Fixed OpenAI provider issues for feedback functions
TruLens Eval v0.15.1
Library containing evaluations of LLM Applications
Changelog
- PII Detection Feedback Function
- Embedding Distance Feedback Function
- App UI Playground
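For readers unfamiliar with embedding distance as an evaluation signal, here is a minimal sketch. The release note does not say which metric is used, so cosine distance is an assumption, and `cosine_distance` is a hypothetical helper rather than a TruLens API. The vectors stand in for embeddings produced by any embedding model.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Toy "embeddings": identical direction, orthogonal, and opposite.
query = [1.0, 0.0]
print(cosine_distance(query, [2.0, 0.0]))
print(cosine_distance(query, [0.0, 3.0]))
print(cosine_distance(query, [-1.0, 0.0]))
```

A distance near 0 means the two texts point in the same direction in embedding space; larger values indicate less similar content.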
Examples
- All new User Guides Docs Section
- Language Verification
- PII Detection
- Hallucination Detection
- Retrieval Quality
Bug Fixes
- Unicode Issue on Windows
TruLens Eval v0.14.0
Library containing evaluations of LLM Applications
Changelog
- Added a stereotypes feedback function
- Added a summarization feedback function
- Added LiteLLM as a provider
- Support for LlamaIndex agent instrumentation
- Added an interactive UI for jupyter notebooks to explore the App structure
Bug Fixes
- Fixed an issue with langchain async not logging
TruLens Eval v0.13.0
Library containing evaluations of LLM Applications
Changelog
- Updated all documentation to show context recorder usage
- Smoke Tests are tested with trulens eval
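The general pattern behind context-based recording can be sketched with the standard library alone. This is an illustration of the technique, not the TruLens API: the `Recorder` class and its `recording` method are hypothetical, and `str.upper` stands in for an LLM application. Calls made inside the `with` block are captured; calls outside it are not.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List, Optional, Tuple

Record = Tuple[str, str]  # (input, output)


class Recorder:
    """Hypothetical wrapper that records calls only inside a `with` block."""

    def __init__(self, app: Callable[[str], str]) -> None:
        self._app = app
        self._sink: Optional[List[Record]] = None

    def __call__(self, prompt: str) -> str:
        output = self._app(prompt)
        if self._sink is not None:  # only record while a context is active
            self._sink.append((prompt, output))
        return output

    @contextmanager
    def recording(self) -> Iterator[List[Record]]:
        records: List[Record] = []
        previous, self._sink = self._sink, records
        try:
            yield records
        finally:
            self._sink = previous  # restore, even if the app raised


app = Recorder(str.upper)  # str.upper stands in for an LLM application
app("not recorded")
with app.recording() as records:
    app("hello")
app("also not recorded")
print(records)
```

Restoring the previous sink in `finally` keeps recording scoped to the block even when the wrapped application raises.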
Examples
- Examples are restructured for better discoverability
- Added a Milvus Vector DB Example
Bug Fixes
- Removed metadata_fn in examples
TruLens Eval v0.12.0
Library containing evaluations of LLM Applications
Changelog
- Added chain of thought and reason metadata to LLM based feedback functions
- Feedback function docs upgrade
- Feedback Function APIs now showing actual APIs with code
- App wrappers (TruChain/TruLlama/etc) docs with code
- More concise selector documentation with code
Examples
- Updated examples to use context recording
Bug Fixes
- Fix for basic app with multiple args
- Fix aggregation bug in multi context groundedness introduced in 0.11.0
- Now shows index of json path if available in timeline UI
- No longer overwrites user changes to streamlit .toml files
- Slow or hanging thread bug fix
TruLens Eval v0.11.0
Changelog
- Add ability to add metadata to records
- Add Feedback functions for bertscore, rouge, and bleu scores
- More instrumentation for Langchain Agents
- Added capability to instrument more than the default calls, such as Langchain Prompt Templates
- Added support for tracking via python context managers
- Added badges showing test results on documentation page
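As background on the n-gram overlap metrics named above, the following is a minimal sketch of ROUGE-1 (unigram overlap). It is for intuition only and is not how TruLens computes the score. The `rouge_1` helper is hypothetical, and lowercasing plus whitespace tokenization is a simplifying assumption; real implementations typically add stemming and more careful tokenization.

```python
from collections import Counter
from typing import Dict


def rouge_1(candidate: str, reference: str) -> Dict[str, float]:
    """Unigram-overlap precision, recall, and F1 between two texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Multiset intersection clips each word's count to the smaller side.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_1("the cat sat", "the cat sat on the mat")
print(scores)
```

In this example every candidate word appears in the reference (precision 1.0), but the candidate covers only three of the six reference words (recall 0.5).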
Examples
- Added Llama Index RAG application with a vector store using Milvus
Bug Fixes
- Fix for multi-result introduced in 0.10.0
- Allow FeedbackCall to have JSON args
- Fix error for OpenAI Chat LLM with ChatPromptTemplate