feat(bedrock): implement new data source structure #668
The implementation is practically finished; it just needs extensive testing. I would love comments on the final structure and on the interfaces. Once the structure has been validated, I will proceed with:
Big update that would solve #666, #655, and #587.
Working example:
const kb = new KnowledgeBase(stack, 'MyKnowledgeBase', {
  name: 'MyKnowledgeBase',
  embeddingsModel: BedrockFoundationModel.COHERE_EMBED_MULTILINGUAL_V3,
});
const bucket = new Bucket(stack, 'Bucket', {});
const lambdaFunction = new Function(stack, 'MyFunction', {
  runtime: cdk.aws_lambda.Runtime.PYTHON_3_9,
  handler: 'index.handler',
  code: cdk.aws_lambda.Code.fromInline('print("Hello, World!")'),
});
kb.addWebCrawlerDataSource({
  sourceUrls: ['https://docs.aws.amazon.com/'],
  chunkingStrategy: ChunkingStrategy.HIERARCHICAL_COHERE,
  customTransformation: CustomTransformation.lambda({
    lambdaFunction: lambdaFunction,
    s3BucketUri: `s3://${bucket.bucketName}/chunk-processor/`,
  }),
});
kb.addS3DataSource({
  bucket,
  chunkingStrategy: ChunkingStrategy.SEMANTIC,
  parsingStrategy: ParsingStategy.foundationModel({
    parsingModel: BedrockFoundationModel.ANTHROPIC_CLAUDE_SONNET_V1_0.asIModel(stack),
  }),
});
Just FYI: CDK v2.155 has updates impacting the resources used in this PR; see aws/aws-cdk#31193.
Overall LGTM! Thanks for this! @aws-rafams, I am fine with the structure and interfaces. Do you still need to work on:
I have completed the tests and documentation updates. However, I would really appreciate a hand with the Python snippets in the docs, which I believe are the only remaining task. The primary breaking changes are in the S3DataSource resource and in ChunkingStrategy, which used to be an enum and is now a class. The previous structure and properties will not work with the new version, so we should highlight this in the release notes. The changes to KnowledgeBase are primarily the addition of helper methods, while the rest are new resources that did not exist previously.
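To make the enum-to-class change concrete for the release notes, here is a minimal, self-contained sketch of the general pattern: a class that keeps enum-style presets as static members while adding factory methods for custom values. `ChunkingStrategySketch`, `ChunkingConfigSketch`, `fixedSize`, and every numeric value are hypothetical names and numbers chosen for illustration; this is not the code in this PR.

```typescript
// Hypothetical sketch: names and numbers are illustrative, not the library's source.
interface ChunkingConfigSketch {
  readonly strategy: 'FIXED_SIZE' | 'NONE';
  readonly maxTokens?: number;
  readonly overlapPercentage?: number;
}

class ChunkingStrategySketch {
  // Preset instances keep enum-style call sites (ChunkingStrategySketch.NONE) readable.
  static readonly NONE = new ChunkingStrategySketch({ strategy: 'NONE' });
  static readonly SMALL = ChunkingStrategySketch.fixedSize({
    maxTokens: 300,
    overlapPercentage: 20,
  });

  // Factory method for custom values: the part a plain enum cannot express.
  static fixedSize(props: {
    maxTokens: number;
    overlapPercentage: number;
  }): ChunkingStrategySketch {
    if (!Number.isInteger(props.maxTokens) || props.maxTokens <= 0) {
      throw new Error('maxTokens must be a positive integer');
    }
    if (props.overlapPercentage < 0 || props.overlapPercentage >= 100) {
      throw new Error('overlapPercentage must be in [0, 100)');
    }
    return new ChunkingStrategySketch({ strategy: 'FIXED_SIZE', ...props });
  }

  readonly config: ChunkingConfigSketch;

  // Private constructor: callers go through presets or factory methods.
  private constructor(config: ChunkingConfigSketch) {
    this.config = config;
  }
}

// Enum-style usage still reads the same; custom values go through the factory.
const preset = ChunkingStrategySketch.SMALL;
const custom = ChunkingStrategySketch.fixedSize({ maxTokens: 512, overlapPercentage: 10 });
console.log(preset.config.strategy, custom.config.maxTokens);
```

Under this pattern, call sites that only reference a preset member may keep compiling, while code that treated the value as a string enum (comparisons, switch statements, serialization) would need to change, which is why the release notes should call it out.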
Thanks @aws-rafams, I'll run some tests with the package and update here!
Thanks @aws-rafams!
As discussed, the only issue detected is missing permissions for FM parsing and for the Lambda function used in custom transformation. The README looks good; we will update it soon, as we need to add support for the new languages. Could you please:
Note:
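For reference on the permissions point above, here is a rough sketch of the kind of IAM statements the knowledge base role might need for foundation model parsing and a Lambda-based custom transformation. It builds plain policy-statement objects so it runs without the CDK. The function name `sketchKnowledgeBaseRoleStatements`, the option names, and the ARNs are hypothetical, and the exact set of actions is an assumption to be checked against the Bedrock documentation, not what this PR implements.

```typescript
// Hypothetical sketch: the exact actions required are an assumption, not confirmed by this PR.
interface PolicyStatementSketch {
  Effect: 'Allow';
  Action: string[];
  Resource: string[];
}

function sketchKnowledgeBaseRoleStatements(opts: {
  parsingModelArn?: string;         // model used for foundation model parsing
  transformationLambdaArn?: string; // Lambda used for custom transformation
  intermediateBucketArn?: string;   // bucket holding intermediate chunk files
}): PolicyStatementSketch[] {
  const statements: PolicyStatementSketch[] = [];

  if (opts.parsingModelArn) {
    // Assumed: the role must be able to invoke the parsing model.
    statements.push({
      Effect: 'Allow',
      Action: ['bedrock:InvokeModel'],
      Resource: [opts.parsingModelArn],
    });
  }

  if (opts.transformationLambdaArn) {
    // Assumed: the role must be able to invoke the function, including versions and aliases.
    statements.push({
      Effect: 'Allow',
      Action: ['lambda:InvokeFunction'],
      Resource: [opts.transformationLambdaArn, `${opts.transformationLambdaArn}:*`],
    });
  }

  if (opts.intermediateBucketArn) {
    // Assumed: the role reads and writes intermediate objects in the bucket.
    statements.push({
      Effect: 'Allow',
      Action: ['s3:GetObject', 's3:PutObject'],
      Resource: [`${opts.intermediateBucketArn}/*`],
    });
  }

  return statements;
}

// Placeholder ARNs for illustration only.
const statements = sketchKnowledgeBaseRoleStatements({
  parsingModelArn: 'arn:aws:bedrock:us-east-1::foundation-model/example-model-id',
  transformationLambdaArn: 'arn:aws:lambda:us-east-1:111122223333:function:MyFunction',
  intermediateBucketArn: 'arn:aws:s3:::my-intermediate-bucket',
});
console.log(JSON.stringify(statements, null, 2));
```

In the constructs themselves, the equivalent would presumably be attached to the knowledge base role when `addS3DataSource` or `addWebCrawlerDataSource` receives a parsing strategy or custom transformation, so that users do not have to add these grants by hand.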
Fixes #655
Draft, work in progress.
Just linking the draft to get some comments on the planned structure / interface.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.