Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vectore store improvements #210

Merged
merged 18 commits into from
Dec 18, 2024
Merged

Vectore store improvements #210

merged 18 commits into from
Dec 18, 2024

Conversation

dustins
Copy link
Contributor

@dustins dustins commented Dec 13, 2024

Issue #, if available:

Description of changes:
LISA was initially designed to support up to two vector stores: PGVector and/or OpenSearch. Based on customer feedback, we've added functionality to support multiple instances of each vector store type, along with basic access control mechanisms.

Key enhancements introduced in this PR include:

  • Flexible Configuration and Deployment: LISA now supports configuring and deploying one or more instances of each vector store type.
  • Access Control: Vector stores can define an optional allowedGroups property, restricting access to users belonging to specified groups as determined by the identity provider's response.

Note

The list of available vector stores is still managed through an SSM parameter, which has a maximum size of 4 KB. This limitation effectively restricts the number of vector stores to approximately 20, depending on the length of store names and the contents of the allowedGroups property.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

- added config support for multiple vector stores
- updated config options for opensearch
- added authorization for vectore stores based on groups
def user_has_group(user_groups: List[str], allowed_groups: List[str]) -> bool:
"""Returns if user groups has at least one intersections with allowed groups.

If allowed groups is empty this will return True.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you elaborate on this? "If allowed_groups is empty, there are no restrictions on the given resource"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. The requirement was to not have access restrictions if no groups were defined.


user_groups = json.loads(event["requestContext"]["authorizer"]["groups"]) or []
if not user_has_group(user_groups, repository["allowedGroups"]):
raise HTTPException(status_code=403, message="User does not have permission to access this repository")
Copy link
Contributor

@bedanley bedanley Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you pull out this logic into a has_repository_access function so we can reuse it for the other API calls?

def has_repository_access(event, repo_id):

Bonus points if you make this an annotation we can throw onto an API definition.

@@ -34,12 +34,18 @@
secretsmanager_client = boto3.client("secretsmanager", region_name=os.environ["AWS_REGION"], config=retry_config)


def get_vector_store_client(store: str, index: str, embeddings: Embeddings) -> VectorStore:
def get_vector_store_client(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you get the repo_type from the repo_id? We use this get_vector_store_client a bunch, and the API endpoints all pass in a repo_type, which wouldn't be needed if we are fully using repo_id.

bedanley
bedanley previously approved these changes Dec 13, 2024
estohlmann
estohlmann previously approved these changes Dec 13, 2024
@dustins dustins dismissed stale reviews from estohlmann and bedanley via 8bd25d8 December 17, 2024 17:53
@estohlmann estohlmann marked this pull request as ready for review December 18, 2024 03:27
@estohlmann estohlmann merged commit 7a83065 into develop Dec 18, 2024
4 checks passed
@estohlmann estohlmann deleted the bug/multi-vs branch December 18, 2024 03:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants