-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vectore store improvements #210
Conversation
- added config support for multiple vector stores - updated config options for opensearch - added authorization for vectore stores based on groups
def user_has_group(user_groups: List[str], allowed_groups: List[str]) -> bool: | ||
"""Returns if user groups has at least one intersections with allowed groups. | ||
|
||
If allowed groups is empty this will return True. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you elaborate on this? "If allowed_groups is empty, there are no restrictions on the given resource"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. The requirement was to not have access restrictions if no groups were defined.
|
||
user_groups = json.loads(event["requestContext"]["authorizer"]["groups"]) or [] | ||
if not user_has_group(user_groups, repository["allowedGroups"]): | ||
raise HTTPException(status_code=403, message="User does not have permission to access this repository") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you pull out this logic into a has_repository_access
function so we can reuse it for the other API calls?
def has_repository_access(event, repo_id):
Bonus points if you make this an annotation we can throw onto an API definition.
lambda/utilities/vector_store.py
Outdated
@@ -34,12 +34,18 @@ | |||
secretsmanager_client = boto3.client("secretsmanager", region_name=os.environ["AWS_REGION"], config=retry_config) | |||
|
|||
|
|||
def get_vector_store_client(store: str, index: str, embeddings: Embeddings) -> VectorStore: | |||
def get_vector_store_client( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you get the repo_type from the repo_id? We use this get_vector_store_client a bunch, and the API endpoints all pass in a repo_type, which wouldn't be needed if we are fully using repo_id.
Issue #, if available:
Description of changes:
LISA was initially designed to support up to two vector stores: PGVector and/or OpenSearch. Based on customer feedback, we've added functionality to support multiple instances of each vector store type, along with basic access control mechanisms.
Key enhancements introduced in this PR include:
allowedGroups
property, restricting access to users belonging to specified groups as determined by the identity provider's response.Note
The list of available vector stores is still managed through an SSM parameter, which has a maximum size of 4 KB. This limitation effectively restricts the number of vector stores to approximately 20, depending on the length of store names and the contents of the
allowedGroups
property.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.