feat(bedrock): implement new data source structure #668

Merged
merged 26 commits into from
Sep 25, 2024
3b088bd
feat(bedrock): add data source implementation
aws-rafams Sep 6, 2024
6b5d583
chore(bedrock): add unit tests
aws-rafams Sep 6, 2024
1696a66
chore(bedrock): docs and tests
aws-rafams Sep 6, 2024
d503513
chore(bedrock): additional tests and small fixes
aws-rafams Sep 6, 2024
caa9cb8
docs(bedrock): added new feature descriptions
aws-rafams Sep 6, 2024
53037d6
Merge branch 'main' into s3-data-sources
aws-rafams Sep 6, 2024
388bb25
Merge branch 'main' into s3-data-sources
aws-rafams Sep 9, 2024
98b56a3
Merge branch 'main' into s3-data-sources
krokoko Sep 16, 2024
62fc779
fix(bedrock): add default prompt
aws-rafams Sep 17, 2024
0d00693
docs(bedrock): minor doc fix
aws-rafams Sep 17, 2024
eea5922
chore(bedrock): merge remote-tracking branch 'upstream/main' into s3-…
aws-rafams Sep 19, 2024
959d0b2
Merge branch 'main' into s3-data-sources
krokoko Sep 19, 2024
6c74dd9
fix(bedrock): fix test
aws-rafams Sep 19, 2024
5f3f36b
chore(bedrock): fix test
aws-rafams Sep 19, 2024
061c5a2
fix(bedrock): re-enable default crawling scope, previously disabled d…
aws-rafams Sep 19, 2024
b460b0a
Merge branch 'main' into s3-data-sources
krokoko Sep 20, 2024
c9cc578
Merge branch 'main' into s3-data-sources
aws-rafams Sep 23, 2024
44b38bc
Merge branch 'main' into s3-data-sources
krokoko Sep 23, 2024
8c84328
feat(bedrock): add imported KB support
aws-rafams Sep 24, 2024
9a2bf31
Merge branch 'main' into s3-data-sources
krokoko Sep 24, 2024
f4a5d79
chore(attributes): add readonly
krokoko Sep 24, 2024
7007901
fix(bedrock): fix IRole for imported KB
aws-rafams Sep 24, 2024
ad46a6a
fix(bedrock): update integ test with lambda custom transformation
aws-rafams Sep 24, 2024
a31e129
fix(bedrock): fix data source permission FM Parsing and Lambda Transf…
aws-rafams Sep 25, 2024
dbdcf1e
chore(bedrock): remove integ tests
aws-rafams Sep 25, 2024
8788a85
docs(bedrock): update docs
aws-rafams Sep 25, 2024
9 changes: 0 additions & 9 deletions .gitignore

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 0 additions & 2 deletions .npmignore

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

69 changes: 0 additions & 69 deletions .projen/tasks.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

41 changes: 40 additions & 1 deletion apidocs/namespaces/bedrock/README.md
@@ -11,8 +11,12 @@
### Enumerations

- [CanadaSpecific](enumerations/CanadaSpecific.md)
- [ChunkingStrategy](enumerations/ChunkingStrategy.md)
- [ConfluenceDataSourceAuthType](enumerations/ConfluenceDataSourceAuthType.md)
- [ConfluenceObjectType](enumerations/ConfluenceObjectType.md)
- [ContextualGroundingFilterConfigType](enumerations/ContextualGroundingFilterConfigType.md)
- [CrawlingScope](enumerations/CrawlingScope.md)
- [DataDeletionPolicy](enumerations/DataDeletionPolicy.md)
- [DataSourceType](enumerations/DataSourceType.md)
- [FiltersConfigStrength](enumerations/FiltersConfigStrength.md)
- [FiltersConfigType](enumerations/FiltersConfigType.md)
- [Finance](enumerations/Finance.md)
@@ -24,6 +28,11 @@
- [PromptState](enumerations/PromptState.md)
- [PromptTemplateType](enumerations/PromptTemplateType.md)
- [PromptType](enumerations/PromptType.md)
- [SalesforceDataSourceAuthType](enumerations/SalesforceDataSourceAuthType.md)
- [SalesforceObjectType](enumerations/SalesforceObjectType.md)
- [SharePointDataSourceAuthType](enumerations/SharePointDataSourceAuthType.md)
- [SharePointObjectType](enumerations/SharePointObjectType.md)
- [TransformationStep](enumerations/TransformationStep.md)
- [UKSpecific](enumerations/UKSpecific.md)
- [USASpecific](enumerations/USASpecific.md)

@@ -34,18 +43,28 @@
- [AgentAlias](classes/AgentAlias.md)
- [ApiSchema](classes/ApiSchema.md)
- [BedrockFoundationModel](classes/BedrockFoundationModel.md)
- [ChunkingStrategy](classes/ChunkingStrategy.md)
- [ConfluenceDataSource](classes/ConfluenceDataSource.md)
- [ContentPolicyConfig](classes/ContentPolicyConfig.md)
- [CustomTransformation](classes/CustomTransformation.md)
- [DataSource](classes/DataSource.md)
- [DataSourceBase](classes/DataSourceBase.md)
- [DataSourceNew](classes/DataSourceNew.md)
- [Guardrail](classes/Guardrail.md)
- [GuardrailVersion](classes/GuardrailVersion.md)
- [InlineApiSchema](classes/InlineApiSchema.md)
- [KnowledgeBase](classes/KnowledgeBase.md)
- [ParsingStategy](classes/ParsingStategy.md)
- [Prompt](classes/Prompt.md)
- [PromptVariant](classes/PromptVariant.md)
- [PromptVersion](classes/PromptVersion.md)
- [S3ApiSchema](classes/S3ApiSchema.md)
- [S3DataSource](classes/S3DataSource.md)
- [SalesforceDataSource](classes/SalesforceDataSource.md)
- [SensitiveInformationPolicyConfig](classes/SensitiveInformationPolicyConfig.md)
- [SharePointDataSource](classes/SharePointDataSource.md)
- [Topic](classes/Topic.md)
- [WebCrawlerDataSource](classes/WebCrawlerDataSource.md)

### Interfaces

@@ -57,23 +76,43 @@
- [ApiSchemaConfig](interfaces/ApiSchemaConfig.md)
- [BedrockFoundationModelProps](interfaces/BedrockFoundationModelProps.md)
- [CommonPromptVariantProps](interfaces/CommonPromptVariantProps.md)
- [ConfluenceCrawlingFilters](interfaces/ConfluenceCrawlingFilters.md)
- [ConfluenceDataSourceAssociationProps](interfaces/ConfluenceDataSourceAssociationProps.md)
- [ConfluenceDataSourceProps](interfaces/ConfluenceDataSourceProps.md)
- [ContentPolicyConfigProps](interfaces/ContentPolicyConfigProps.md)
- [ContextualGroundingPolicyConfigProps](interfaces/ContextualGroundingPolicyConfigProps.md)
- [CrawlingFilters](interfaces/CrawlingFilters.md)
- [DataSourceAssociationProps](interfaces/DataSourceAssociationProps.md)
- [FoundationModelParsingStategyProps](interfaces/FoundationModelParsingStategyProps.md)
- [GuardrailConfiguration](interfaces/GuardrailConfiguration.md)
- [GuardrailProps](interfaces/GuardrailProps.md)
- [HierarchicalChunkingProps](interfaces/HierarchicalChunkingProps.md)
- [IAgentAlias](interfaces/IAgentAlias.md)
- [IDataSource](interfaces/IDataSource.md)
- [IKnowledgeBase](interfaces/IKnowledgeBase.md)
- [InferenceConfiguration](interfaces/InferenceConfiguration.md)
- [IPrompt](interfaces/IPrompt.md)
- [KnowledgeBaseAttributes](interfaces/KnowledgeBaseAttributes.md)
- [KnowledgeBaseProps](interfaces/KnowledgeBaseProps.md)
- [LambdaCustomTransformationProps](interfaces/LambdaCustomTransformationProps.md)
- [PromptConfiguration](interfaces/PromptConfiguration.md)
- [PromptOverrideConfiguration](interfaces/PromptOverrideConfiguration.md)
- [PromptProps](interfaces/PromptProps.md)
- [PromptVersionProps](interfaces/PromptVersionProps.md)
- [S3DataSourceAssociationProps](interfaces/S3DataSourceAssociationProps.md)
- [S3DataSourceProps](interfaces/S3DataSourceProps.md)
- [S3Identifier](interfaces/S3Identifier.md)
- [SalesforceCrawlingFilters](interfaces/SalesforceCrawlingFilters.md)
- [SalesforceDataSourceAssociationProps](interfaces/SalesforceDataSourceAssociationProps.md)
- [SalesforceDataSourceProps](interfaces/SalesforceDataSourceProps.md)
- [SensitiveInformationPolicyConfigProps](interfaces/SensitiveInformationPolicyConfigProps.md)
- [SharePointCrawlingFilters](interfaces/SharePointCrawlingFilters.md)
- [SharePointDataSourceAssociationProps](interfaces/SharePointDataSourceAssociationProps.md)
- [SharePointDataSourceProps](interfaces/SharePointDataSourceProps.md)
- [TextPromptVariantProps](interfaces/TextPromptVariantProps.md)
- [TopicProps](interfaces/TopicProps.md)
- [WebCrawlerDataSourceAssociationProps](interfaces/WebCrawlerDataSourceAssociationProps.md)
- [WebCrawlerDataSourceProps](interfaces/WebCrawlerDataSourceProps.md)

### Functions

129 changes: 129 additions & 0 deletions apidocs/namespaces/bedrock/classes/ChunkingStrategy.md
@@ -0,0 +1,129 @@
[**@cdklabs/generative-ai-cdk-constructs**](../../../README.md) • **Docs**

***

[@cdklabs/generative-ai-cdk-constructs](../../../README.md) / [bedrock](../README.md) / ChunkingStrategy

# Class: `abstract` ChunkingStrategy

## Properties

### configuration

> `abstract` **configuration**: `ChunkingConfigurationProperty`

The CloudFormation property representation of this configuration

***

### DEFAULT

> `readonly` `static` **DEFAULT**: [`ChunkingStrategy`](ChunkingStrategy.md)

Fixed-size chunking with the default chunk size of 300 tokens and 20% overlap.

***

### FIXED\_SIZE

> `readonly` `static` **FIXED\_SIZE**: [`ChunkingStrategy`](ChunkingStrategy.md)

Fixed-size chunking with the default chunk size of 300 tokens and 20% overlap.
You can adjust these values to suit your requirements using the
`ChunkingStrategy.fixedSize(params)` method.
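
As a rough illustration of what a 300-token chunk with 20% overlap implies, the sketch below computes the overlap in tokens and the start offset of each chunk. `FixedSizeParams`, `overlapTokens`, and `chunkStarts` are hypothetical local names, not part of the library. The field names `maxTokens` and `overlapPercentage` are assumed to match `FixedSizeChunkingConfigurationProperty`. The source does not describe how Bedrock itself tokenizes or positions chunks, so treat the sliding-window arithmetic as an assumption.

```typescript
// Local stand-in for the fixed-size chunking parameters (assumed field names).
interface FixedSizeParams {
  maxTokens: number;
  overlapPercentage: number;
}

// Defaults stated in the docs above: 300 tokens, 20% overlap.
const DEFAULT_FIXED_SIZE: FixedSizeParams = { maxTokens: 300, overlapPercentage: 20 };

// Number of tokens shared between two consecutive chunks.
function overlapTokens(p: FixedSizeParams): number {
  return Math.floor((p.maxTokens * p.overlapPercentage) / 100);
}

// Start offsets of each chunk in a document of `totalTokens` tokens,
// assuming a simple sliding window (an illustration, not Bedrock's algorithm).
function chunkStarts(totalTokens: number, p: FixedSizeParams): number[] {
  const step = p.maxTokens - overlapTokens(p);
  if (step <= 0) {
    throw new Error("overlap must be smaller than the chunk size");
  }
  const starts: number[] = [];
  for (let start = 0; start < totalTokens; start += step) {
    starts.push(start);
    // Stop once the current chunk already reaches the end of the document.
    if (start + p.maxTokens >= totalTokens) break;
  }
  return starts;
}
```

Under these assumptions the defaults give a 60-token overlap, so each chunk starts 240 tokens after the previous one.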

***

### HIERARCHICAL\_COHERE

> `readonly` `static` **HIERARCHICAL\_COHERE**: [`ChunkingStrategy`](ChunkingStrategy.md)

Hierarchical chunking with the defaults for Cohere models:
- Overlap tokens: 30
- Max parent token size: 500
- Max child token size: 100

***

### HIERARCHICAL\_TITAN

> `readonly` `static` **HIERARCHICAL\_TITAN**: [`ChunkingStrategy`](ChunkingStrategy.md)

Hierarchical chunking with the defaults for Titan models:
- Overlap tokens: 60
- Max parent token size: 1500
- Max child token size: 300

***

### NONE

> `readonly` `static` **NONE**: [`ChunkingStrategy`](ChunkingStrategy.md)

Amazon Bedrock treats each file as one chunk. Suitable for documents that
have already been pre-processed or split into chunks.

***

### SEMANTIC

> `readonly` `static` **SEMANTIC**: [`ChunkingStrategy`](ChunkingStrategy.md)

Semantic chunking with the defaults `bufferSize: 0`,
`breakpointPercentileThreshold: 95`, and `maxTokens: 300`.
You can adjust these values based on your specific requirements using the
`ChunkingStrategy.semantic(params)` method.
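
The sketch below shows how custom values might be layered over these defaults, and one common reading of a percentile breakpoint: split wherever the distance between adjacent sentences exceeds the given percentile of all distances. `SemanticParams`, `withSemanticOverrides`, `percentile`, and `breakpoints` are hypothetical local names. The source does not say how Bedrock applies these values, so the nearest-rank percentile logic is an assumption for illustration only.

```typescript
// Local stand-in for the semantic chunking parameters named in the docs above.
interface SemanticParams {
  bufferSize: number;
  breakpointPercentileThreshold: number;
  maxTokens: number;
}

// Defaults stated in the docs above.
const DEFAULT_SEMANTIC: SemanticParams = {
  bufferSize: 0,
  breakpointPercentileThreshold: 95,
  maxTokens: 300,
};

// Apply caller overrides on top of the defaults.
function withSemanticOverrides(overrides: Partial<SemanticParams>): SemanticParams {
  return { ...DEFAULT_SEMANTIC, ...overrides };
}

// Nearest-rank percentile of a non-empty list of numbers.
function percentile(values: number[], pct: number): number {
  if (values.length === 0) throw new Error("no values");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((pct * sorted.length) / 100));
  return sorted[rank - 1];
}

// Indices whose distance is strictly above the percentile threshold
// (assumed interpretation of breakpointPercentileThreshold).
function breakpoints(distances: number[], p: SemanticParams): number[] {
  const threshold = percentile(distances, p.breakpointPercentileThreshold);
  const result: number[] = [];
  distances.forEach((d, i) => {
    if (d > threshold) result.push(i);
  });
  return result;
}
```

Lowering the threshold, for example with `withSemanticOverrides({ breakpointPercentileThreshold: 90 })`, makes more distances count as breakpoints and so yields more, smaller chunks under this reading.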

## Methods

### fixedSize()

> `static` **fixedSize**(`props`): [`ChunkingStrategy`](ChunkingStrategy.md)

Method for customizing a fixed-size chunking strategy.

#### Parameters

• **props**: `FixedSizeChunkingConfigurationProperty`

#### Returns

[`ChunkingStrategy`](ChunkingStrategy.md)

***

### hierarchical()

> `static` **hierarchical**(`props`): [`ChunkingStrategy`](ChunkingStrategy.md)

Method for customizing a hierarchical chunking strategy.
For custom chunking, the maximum token chunk size depends on the model.
- Amazon Titan Text Embeddings: 8192
- Cohere Embed models: 512
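
A minimal sketch of checking custom hierarchical values against these per-model limits is shown below. The field names `overlapTokens`, `maxParentTokenSize`, and `maxChildTokenSize` are assumed from the defaults listed for `HIERARCHICAL_TITAN` and `HIERARCHICAL_COHERE`; `EmbeddingFamily` and `validateHierarchical` are hypothetical helpers, not part of the library. Applying the limit to the parent size, and the child and overlap sanity checks, are assumptions rather than rules stated in the source.

```typescript
type EmbeddingFamily = "titan" | "cohere";

// Local stand-in for the hierarchical chunking parameters (assumed field names).
interface HierarchicalParams {
  overlapTokens: number;
  maxParentTokenSize: number;
  maxChildTokenSize: number;
}

// Maximum token chunk size per model family, as listed in the docs above.
const MAX_CHUNK_TOKENS: Record<EmbeddingFamily, number> = { titan: 8192, cohere: 512 };

// Default values documented for HIERARCHICAL_TITAN and HIERARCHICAL_COHERE.
const HIERARCHICAL_DEFAULTS: Record<EmbeddingFamily, HierarchicalParams> = {
  titan: { overlapTokens: 60, maxParentTokenSize: 1500, maxChildTokenSize: 300 },
  cohere: { overlapTokens: 30, maxParentTokenSize: 500, maxChildTokenSize: 100 },
};

// Returns a list of problems; an empty list means the values look consistent.
function validateHierarchical(family: EmbeddingFamily, p: HierarchicalParams): string[] {
  const errors: string[] = [];
  const limit = MAX_CHUNK_TOKENS[family];
  if (p.maxParentTokenSize > limit) {
    errors.push(`parent size ${p.maxParentTokenSize} exceeds the ${family} limit of ${limit}`);
  }
  // Assumed sanity checks, not taken from the source.
  if (p.maxChildTokenSize > p.maxParentTokenSize) {
    errors.push("child size exceeds parent size");
  }
  if (p.overlapTokens >= p.maxChildTokenSize) {
    errors.push("overlap must be smaller than child size");
  }
  return errors;
}
```

With these assumed rules, the Titan defaults pass for Titan but fail for Cohere, because a 1500-token parent exceeds the 512-token limit.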

#### Parameters

• **props**: [`HierarchicalChunkingProps`](../interfaces/HierarchicalChunkingProps.md)

#### Returns

[`ChunkingStrategy`](ChunkingStrategy.md)

***

### semantic()

> `static` **semantic**(`props`): [`ChunkingStrategy`](ChunkingStrategy.md)

Method for customizing a semantic chunking strategy.
For custom chunking, the maximum token chunk size depends on the model.
- Amazon Titan Text Embeddings: 8192
- Cohere Embed models: 512

#### Parameters

• **props**: `SemanticChunkingConfigurationProperty`

#### Returns

[`ChunkingStrategy`](ChunkingStrategy.md)