Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for vLLM containers #16

Merged
merged 1 commit into from
Jun 4, 2024
Merged

Conversation

petermuller
Copy link
Contributor

This commit adds support for vLLM containers with some example documentation to support it. The schema has been updated to allow for the vLLM container. These changes help toward the migration to the v2.x architecture.

Description of changes:

  1. Update README with examples of usage
  2. Update schema to guarantee no periods in names so that I can split on them later
  3. Add Dockerfile and entrypoint script for vLLM-served models
  4. Add a short-circuit for non-TGI/TEI containers so that we don't fail on "registering" the model, which will be deprecated as part of an upcoming v2 release.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

This commit adds support for vLLM containers with some example documentation to support it.
The schema has been updated to allow for the vLLM container. These changes help toward the
migration to the v2.x architecture.
@petermuller petermuller requested a review from estohlmann June 4, 2024 02:06
@petermuller petermuller self-assigned this Jun 4, 2024

declare -a vars=("S3_BUCKET_MODELS" "LOCAL_MODEL_PATH" "MODEL_NAME" "S3_MOUNT_POINT" "THREADS")

# Check the necessary environment variables
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these variables always apart of these base images? or is this something that we should make sure we add to the README?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are provided from the container build params within the typescript definition, so they should all be defined as part of this block:

/**
* Generates environment variables for Docker at runtime based on the configuration. The environment variables
* include the local model path, S3 bucket for models, model name, and other variables depending on the model type.
*
* @param {Config} config - The application configuration.
* @param {ModelConfig} modelConfig - Configuration for the specific model.
* @returns {Object} An object containing the environment variables. The object has string keys and values, which
* represent the environment variables for Docker at runtime.
*/
private getEnvironmentVariables(config: Config, modelConfig: ModelConfig): { [key: string]: string } {
const environment: { [key: string]: string } = {
LOCAL_MODEL_PATH: `${config.nvmeContainerMountPath}/model`,
S3_BUCKET_MODELS: config.s3BucketModels,
MODEL_NAME: modelConfig.modelName,
LOCAL_CODE_PATH: modelConfig.localModelCode, // Only needed when s5cmd is used, but just keep for now
AWS_REGION: config.region, // needed for s5cmd
};
if (modelConfig.modelType === 'embedding') {
environment.SAGEMAKER_BASE_DIR = config.nvmeContainerMountPath;
}
if (config.mountS3DebUrl) {
environment.S3_MOUNT_POINT = 's3-models-mount';
// More threads than files during S3 mount point copy to NVMe is fine; by default use half threads
environment.THREADS = Math.ceil(Ec2Metadata.get(modelConfig.instanceType).vCpus / 2).toString();
}
if (modelConfig.containerConfig.environment) {
for (const [key, value] of Object.entries(modelConfig.containerConfig.environment)) {
if (value !== null) {
environment[key] = String(value);
}
}
}
return environment;
}

@petermuller petermuller merged commit b3bf672 into main Jun 4, 2024
2 checks passed
@petermuller petermuller deleted the feature/vllm-support branch June 4, 2024 05:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants