Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for cohere/titan embedding models #361

Merged
merged 18 commits into from
Mar 14, 2024
Merged

Add support for cohere/titan embedding models #361

merged 18 commits into from
Mar 14, 2024

Conversation

QuinnGT
Copy link
Contributor

@QuinnGT QuinnGT commented Feb 5, 2024

Issue #, if available:

Issues open that this resolves, #218

Description of changes:

Brought over changes from spugachev 3 month old branch on cohere. Added changes for new schema. Deployed and tested in fresh environment last night but I see massi-ang opened #359 this morning. However cohere embedding needs additional changes for how it handles inputs. Documented here.

search_document type and search_query type are used. Classification and clustering are not needed at this time.

embedding_types was not defined as float is appropriate for this implementation.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Copy link
Collaborator

@bigadsoleiman bigadsoleiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this submission!

Just added a couple of feedbacks

@bigadsoleiman bigadsoleiman changed the title Add support for cohere embedding models Add support for cohere/titan embedding models Feb 8, 2024
bin/config.ts Show resolved Hide resolved
Copy link
Collaborator

@massi-ang massi-ang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR.

lib/chatbot-api/functions/api-handler/routes/embeddings.py Outdated Show resolved Hide resolved
@QuinnGT
Copy link
Contributor Author

QuinnGT commented Feb 13, 2024

Quick update. I pushed updates for everything except for titan multimodal. Currently deploying those changes and testing before taking on titan multimodal.

In regards to Titan Multimodal Embeddings, it will need a bit more than just calling out the additional model. It will need support for it's API payload like max input text tokens and max input image size. However we are not capturing images in our use of Unstructured or rather lack of use of unstructured, like the partition_image function. Also I see unstructured is poorly used as we pull the full container yet only use the loader.

As soon as the cohere changes finish and tests pass, I'll circle back on titan multimodal and unstructured. Feel free to post thoughts or feedback.

@massi-ang
Copy link
Collaborator

massi-ang commented Feb 13, 2024

I uderstand your concerns, but I think they are out of scope to just adding this embedding model for text embeddings. maxInputTokens is optional and we currently only need to support text embeddings. This was already tested in #359 and working fine.

We should then work to better leverage the multimodal embeddings, but that is an unrelated feature to this one.

@QuinnGT
Copy link
Contributor Author

QuinnGT commented Feb 13, 2024

Alright, all requested changes have been made. I also fixed the embedding prompt to show when enableRag was true vs just being skipped no matter what the use selected, in the magic-config.

New environment has been deployed. Bedrock and Sagemaker embedding models tested as functional.

@QuinnGT QuinnGT marked this pull request as draft February 14, 2024 01:29
@QuinnGT
Copy link
Contributor Author

QuinnGT commented Feb 14, 2024

Found issue with creating workspace. Not sure if it's from me or changes merged into this PR. Need to test and see.

cli/magic-config.ts Outdated Show resolved Hide resolved
@QuinnGT QuinnGT marked this pull request as ready for review February 14, 2024 06:04
@QuinnGT
Copy link
Contributor Author

QuinnGT commented Feb 14, 2024

Resolved the latest issues. Feel free to test and let me know if anything else is needed.

@QuinnGT
Copy link
Contributor Author

QuinnGT commented Feb 15, 2024

@bigadsoleiman I noticed an error in deployment that's happening for log groups due to size of name. I tested it here and it works. Do you want me to bring that into this PR or wait till this is approved?

@massi-ang
Copy link
Collaborator

@bigadsoleiman I noticed an error in deployment that's happening for log groups due to size of name. I tested it here and it works. Do you want me to bring that into this PR or wait till this is approved?

Hi Quinn, I checked your change and my concern is that if you deploy two stacks, they will share the same log group, since the value of id will be the same in for both.

@QuinnGT
Copy link
Contributor Author

QuinnGT commented Feb 22, 2024

@bigadsoleiman I noticed an error in deployment that's happening for log groups due to size of name. I tested it here and it works. Do you want me to bring that into this PR or wait till this is approved?

Hi Quinn, I checked your change and my concern is that if you deploy two stacks, they will share the same log group, since the value of id will be the same in for both.

Great point @massi-ang . How about I use the config.prefix with the id so it looks something like logGroupName: /aws/vendedlogs/states/${config.prefix}-${id}.

Before: /aws/vendedlogs/states/CreateAuroraWorkspace-CreateAuroraWorkspaceSMLogGroup
After: /aws/vendedlogs/states/dev-CreateAuroraWorkspaceSMLogGroup

@massi-ang
Copy link
Collaborator

I would suggest using /aws/vendedlogs/states/${cdk.Stack.of(this).name}-${id}, but there could still be issues in case we apply the same solution to other constructs which have the same id (since id in only unique in the scope of the parent construct).

@QuinnGT
Copy link
Contributor Author

QuinnGT commented Feb 26, 2024

@bigadsoleiman is there anything else pending that I missed? If not, can you approve this PR? Thank you!

@Cerrix
Copy link

Cerrix commented Feb 27, 2024

Hi @QuinnGT,
I tested the pull request because I was looking for the Cohere Multilingual implementation. When I selected that model for embeddings, I received the following error response from AppSync:
{ "data": null, "errors": [ { "path": [ "calculateEmbeddings" ], "data": null, "errorType": "TypeError", "errorInfo": null, "locations": [ { "line": 2, "column": 3, "sourceName": null } ], "message": "Object of type Task is not JSON serializable" } ] }
both creating the workspace or testing the embedding.
With Titan everything is fine.

Thank you!

@QuinnGT
Copy link
Contributor Author

QuinnGT commented Feb 28, 2024

Hi @QuinnGT, I tested the pull request because I was looking for the Cohere Multilingual implementation. When I selected that model for embeddings, I received the following error response from AppSync: { "data": null, "errors": [ { "path": [ "calculateEmbeddings" ], "data": null, "errorType": "TypeError", "errorInfo": null, "locations": [ { "line": 2, "column": 3, "sourceName": null } ], "message": "Object of type Task is not JSON serializable" } ] } both creating the workspace or testing the embedding. With Titan everything is fine.

Thank you!

@Cerrix thank you for reporting this. Can you share what RAG you were testing with?

@Cerrix
Copy link

Cerrix commented Feb 28, 2024

Yep: I’m testing in with Open Search (and just that). I get the same error even if testing cohere through the “embeddings” page (not only creating a new workspace).

thank you

@QuinnGT
Copy link
Contributor Author

QuinnGT commented Feb 29, 2024

Yep: I’m testing in with Open Search (and just that). I get the same error even if testing cohere through the “embeddings” page (not only creating a new workspace).

thank you

Hey @Cerrix I enabled and deployed an opensearch instance and I'm not seeing any errors with workspace creation or testing embeddings. I'm not sure if it's a new or existing deployment you are testing in but try doing a fresh deployment from the latest, along with a new npm install and build. Feel free to report back with more information to troubleshoot, if needed.

@QuinnGT
Copy link
Contributor Author

QuinnGT commented Feb 29, 2024

Comparison of embedding models from demo data.
image

@Cerrix
Copy link

Cerrix commented Feb 29, 2024

It is a week old deploy. But let me try again from scratch! I'll keep you posted!

@Cerrix
Copy link

Cerrix commented Mar 3, 2024

It is so strange. I re-deployed from scratch the repository branching your PR in Oregon (us-west-2).

Here is my config file:

{
  "prefix": "llmchatbot-demo",
  "privateWebsite": false,
  "certificate": "",
  "domain": "",
  "bedrock": {
    "enabled": true,
    "region": "us-west-2"
  },
  "llms": {
    "sagemaker": []
  },
  "rag": {
    "enabled": true,
    "engines": {
      "aurora": {
        "enabled": false
      },
      "opensearch": {
        "enabled": true
      },
      "kendra": {
        "enabled": false,
        "createIndex": false,
        "external": [],
        "enterprise": false
      }
    },
    "embeddingsModels": [
      {
        "provider": "sagemaker",
        "name": "intfloat/multilingual-e5-large",
        "dimensions": 1024
      },
      {
        "provider": "sagemaker",
        "name": "sentence-transformers/all-MiniLM-L6-v2",
        "dimensions": 384
      },
      {
        "provider": "bedrock",
        "name": "amazon.titan-embed-text-v1",
        "dimensions": 1536
      },
      {
        "provider": "bedrock",
        "name": "amazon.titan-embed-image-v1",
        "dimensions": 1024
      },
      {
        "provider": "bedrock",
        "name": "cohere.embed-english-v3",
        "dimensions": 1024
      },
      {
        "provider": "bedrock",
        "name": "cohere.embed-multilingual-v3",
        "dimensions": 1024,
        "default": true
      },
      {
        "provider": "openai",
        "name": "text-embedding-ada-002",
        "dimensions": 1536
      }
    ],
    "crossEncoderModels": [
      {
        "provider": "sagemaker",
        "name": "cross-encoder/ms-marco-MiniLM-L-12-v2",
        "default": true
      }
    ]
  }
}

Still I got the same error:
Screenshot 2024-03-04 at 00 06 01

@QuinnGT
Copy link
Contributor Author

QuinnGT commented Mar 5, 2024

@Cerrix Interesting, I'm not sure. It looks like it's an error from the appsync function. Almost like the old version of app sync functions are trying to be used with the newer genai_core python functions. Did you purge your cdk.out when doing a fresh deployment? Also double check cdk version is updated on your local as there seems to be some rebuild bug introduced on each release.

@massi-ang I see Bigad approved the PR but we're still waiting on you or another repo owner for approval. What else do you need to from me?

@Cerrix
Copy link

Cerrix commented Mar 5, 2024

I'm deploying following the Cloud9 instructions of the sample documentation: https://aws-samples.github.io/aws-genai-llm-chatbot/guide/deploy.html

If we need to update some dependencies I think it must be documented before merging the PR in main.

I can try with a vanilla cloud9 instance but we need to double check the issue. I don't know if someone from the team would like to try the deploy!

Copy link
Collaborator

@massi-ang massi-ang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the Provider type with the right semantics

cli/magic-config.ts Show resolved Hide resolved
bin/config.ts Show resolved Hide resolved
@bigadsoleiman bigadsoleiman merged commit 8838856 into aws-samples:main Mar 14, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants