Add the ability for `index` to include a DocumentTransformer
#26894
peterbraden started this conversation in Ideas
Replies: 0 comments
Feature request
Add an optional parameter to `index` that accepts a DocumentTransformer, which transforms the documents before they are inserted into the vector store.
Motivation
The current `index` function allows us to pass a `DocumentLoader` and adds the documents it yields to a vector store. But sometimes we want to apply transformations to the documents before insertion.

At the moment, you need to transform the documents before passing them to `index`. However, if the transformation is slow (for example, using an LLM to summarize a chunk), we don't benefit from the indexing: the slow transformation must run every time.

If `index` instead accepted the transformer, it could apply the expensive transformation only to documents that had changed, avoiding duplicate effort.
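To make the idea concrete, here is a minimal, library-free sketch of the behaviour being requested. Everything in it is a stand-in: `Doc`, `ToyIndex`, `content_hash`, and the `transformer` keyword are hypothetical names for illustration, not the existing API. The key assumption is that the hash is taken over the raw, untransformed document, so an unchanged input is skipped before the transformer is ever called.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Doc:
    """Stand-in for a document: where it came from and its text."""
    source: str
    text: str


def content_hash(doc: Doc) -> str:
    """Hash the RAW document, before any transformation is applied."""
    payload = f"{doc.source}\n{doc.text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class ToyIndex:
    """Toy replacement for a record manager plus vector store."""
    store: Dict[str, Doc] = field(default_factory=dict)  # raw hash -> stored doc

    def index(
        self,
        docs: Iterable[Doc],
        transformer: Optional[Callable[[Doc], Doc]] = None,  # hypothetical parameter
    ) -> Dict[str, int]:
        added = skipped = 0
        for doc in docs:
            key = content_hash(doc)
            if key in self.store:
                # Unchanged input: skip it without running the transformer.
                skipped += 1
                continue
            self.store[key] = transformer(doc) if transformer else doc
            added += 1
        return {"added": added, "skipped": skipped}


# Usage: count how often the "slow" transformer actually runs.
calls: List[str] = []


def slow_summarize(doc: Doc) -> Doc:
    calls.append(doc.source)  # pretend this is an expensive LLM call
    return Doc(doc.source, doc.text[:20])


idx = ToyIndex()
docs = [Doc("a.txt", "alpha"), Doc("b.txt", "beta")]
first = idx.index(docs, transformer=slow_summarize)

docs[1] = Doc("b.txt", "beta v2")  # only b.txt changes
second = idx.index(docs, transformer=slow_summarize)
```

On the second run only the changed document reaches `slow_summarize`. The sketch leaves out cleanup of the stale `b.txt` entry and any batching, which a real implementation would need to handle.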
Proposal (If applicable)
No response