Add support for ensemble-based neural sparse search that combines results from multiple sparse models to improve search quality and robustness.
Research shows that an ensemble of sparse retrievers provides:
Key research:
PUT _neural/sparse_model/ensemble
{
  "name": "sparse_ensemble",
  "models": [
    { "model_id": "splade_v2", "weight": 0.6 },
    { "model_id": "unicoil", "weight": 0.4 }
  ],
  "combination_method": "weighted_sum", // or "max", "mean"
  "cache_policy": {
    "enabled": true,
    "ttl": "1h"
  }
}
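To make the three `combination_method` options concrete, here is a minimal Python sketch of how per-model document scores might be merged. It is not the plugin's implementation: the function name `combine_scores`, the score dictionaries, and the choice to treat a document missing from one model's results as scoring 0.0 are all assumptions for illustration. It also assumes the models' scores are on comparable scales, which may need normalization in practice.

```python
from typing import Dict


def combine_scores(
    per_model_scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
    method: str = "weighted_sum",
) -> Dict[str, float]:
    """Merge per-model document scores into one ensemble score per document.

    per_model_scores maps model_id -> {doc_id: score}.
    weights maps model_id -> weight (used only by "weighted_sum").
    """
    # Collect every document returned by at least one model.
    doc_ids = set()
    for scores in per_model_scores.values():
        doc_ids.update(scores)

    combined: Dict[str, float] = {}
    for doc_id in doc_ids:
        # Assumption: a document absent from a model's results scores 0.0.
        values = {
            model_id: scores.get(doc_id, 0.0)
            for model_id, scores in per_model_scores.items()
        }
        if method == "weighted_sum":
            combined[doc_id] = sum(weights[m] * v for m, v in values.items())
        elif method == "max":
            combined[doc_id] = max(values.values())
        elif method == "mean":
            combined[doc_id] = sum(values.values()) / len(values)
        else:
            raise ValueError(f"unknown combination method: {method}")
    return combined


if __name__ == "__main__":
    scores = {
        "splade_v2": {"d1": 1.0, "d2": 0.5},
        "unicoil": {"d1": 0.5, "d3": 1.0},
    }
    weights = {"splade_v2": 0.6, "unicoil": 0.4}
    ranked = sorted(
        combine_scores(scores, weights).items(),
        key=lambda item: item[1],
        reverse=True,
    )
    print(ranked)
```

With `weighted_sum`, a document found by both models is rewarded in proportion to the configured weights, while `max` favors whichever model is most confident and `mean` treats all models equally.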
GET my-index/_search
{
  "query": {
    "neural_sparse_ensemble": {
      "query_text": "search query",
      "ensemble_id": "sparse_ensemble",
      "k": 100
    }
  }
}
As shown in the ensemble configuration (`cache_policy`), caching can be used to reduce the latency of remote model calls.
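One way to read the `cache_policy` setting is as a time-to-live cache over query encodings, keyed by model and query text, so that a repeated query does not trigger another remote model call within the TTL. The Python sketch below illustrates that idea only; `TTLCache`, `get_or_compute`, and the injected `clock` are hypothetical names, and the proposal does not specify how caching would actually be implemented.

```python
import time
from typing import Callable, Dict, Tuple

SparseVector = Dict[str, float]


class TTLCache:
    """Cache sparse query encodings per (model_id, query_text) for a fixed TTL."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (time stored, cached encoding)
        self._entries: Dict[Tuple[str, str], Tuple[float, SparseVector]] = {}

    def get_or_compute(
        self,
        model_id: str,
        query_text: str,
        compute: Callable[[str], SparseVector],
    ) -> SparseVector:
        key = (model_id, query_text)
        now = self._clock()
        entry = self._entries.get(key)
        # Serve from cache only while the entry is younger than the TTL.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Cache miss or expired entry: call the (possibly remote) encoder.
        value = compute(query_text)
        self._entries[key] = (now, value)
        return value
```

A `"ttl": "1h"` policy would correspond to `TTLCache(ttl_seconds=3600)` here, with `compute` standing in for the remote inference call of one model in the ensemble.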
Does this imply that only the query embedding varies across models? Shouldn't the index embedding also differ for each model?
Good question. Yes, we need to include ingestion in the scope: embeddings for multiple models can be generated during ingestion.
[Catch All Triage - 1, 2, 3, 4]