Update document headings
bedanley authored Nov 20, 2024
1 parent 376d71d commit 8e759d6
Showing 15 changed files with 256 additions and 205 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -23,7 +23,7 @@ repos:
    hooks:
      - id: detect-secrets
        exclude: (?x)^(
          .*.ipynb|config.yaml
          .*.ipynb|config.yaml|.*.md
        )$

  - repo: https://github.com/pre-commit/pre-commit-hooks
28 changes: 20 additions & 8 deletions lib/docs/.vitepress/config.mts
@@ -21,22 +21,34 @@ const navLinks = [
    text: 'System Administrator Guide',
    items: [
      { text: 'What is LISA?', link: '/admin/overview' },
      { text: 'Architecture Overview', link: '/admin/architecture' },
      {
        text: 'Architecture Overview',
        items: [
          { text: 'LISA Components', link: '/admin/architecture#lisa-components' },
        ],
        link: '/admin/architecture',
      },
      { text: 'Getting Started', link: '/admin/getting-started' },
      { text: 'Configure IdP: Cognito & Keycloak Examples', link: '/admin/idp-config' },
      { text: 'Deployment', link: '/admin/deploy' },
      { text: 'Model Management API Usage', link: '/admin/model-management' },
      { text: 'Chat UI Configuration', link: '/admin/ui-configuration' },
      { text: 'API Request Error Handling', link: '/admin/error' },
      { text: 'Setting Model Management Admin Group', link: '/admin/model-management-admin' },
      { text: 'LiteLLM', link: '/admin/litellm' },
      { text: 'API Overview', link: '/admin/api-overview' },
      { text: 'API Request Error Handling', link: '/admin/api-error' },
      { text: 'Security', link: '/admin/security' },
    ],
  },
  {
    text: 'Advanced Configuration',
    items: [
      { text: 'Configuration Schema', link: '/config/configuration' },
      { text: 'Programmatic API Tokens', link: '/config/api-tokens' },
      { text: 'Model Compatibility', link: '/config/model-compatibility' },
      { text: 'Rag Vector Stores', link: '/config/vector-stores' },
      { text: 'Configure IdP: Cognito & Keycloak Examples', link: '/config/idp' },
      { text: 'LiteLLM', link: '/config/lite-llm' },
      { text: 'Model Management API', link: '/config/model-management-api' },
      { text: 'Model Management UI', link: '/config/model-management-ui' },
      { text: 'Usage & Features', link: '/config/usage' },
      { text: 'RAG Vector Stores', link: '/config/vector-stores' },
      { text: 'Branding', link: '/config/branding' },
      { text: 'Configuration Schema', link: '/config/configuration' },
    ],
  },
  {
File renamed without changes.
81 changes: 81 additions & 0 deletions lib/docs/admin/api-overview.md
@@ -0,0 +1,81 @@
# API Usage Overview

LISA provides robust API endpoints for managing models, both for users and administrators. These endpoints allow for
operations such as listing, creating, updating, and deleting models.

## API Gateway and ALB Endpoints

LISA uses two primary APIs for model management:

1. **[User-facing OpenAI-Compatible API](#litellm-routing-in-all-models)**: Available to all users for inference tasks
   and accessible through the LISA Serve ALB. This API provides an interface for querying and interacting with models
   deployed on Amazon ECS, Amazon Bedrock, or through LiteLLM.
2. **[Admin-level Model Management API](/config/model-management-api)**: Available only to administrators through the
   API Gateway (APIGW). This API allows for full control of model lifecycle management, including creating, updating,
   and deleting models.

### LiteLLM Routing in All Models

Every model request is routed through LiteLLM, regardless of whether infrastructure (like ECS) is created for it.
Whether a model is deployed on ECS, accessed externally via Bedrock, or managed directly through LiteLLM, it is added
to LiteLLM for traffic routing. The distinction is whether infrastructure is created (determined by the request
payload), but LiteLLM integration is consistent for all models. The model management APIs handle adding or removing
model configurations from LiteLLM, and the LISA Serve endpoint handles inference requests against the models available
in LiteLLM.
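To make the routing concrete, the sketch below shows what an inference call through the LISA Serve ALB could look
like. The `/v2/serve/chat/completions` path and the model name are assumptions for illustration; only the
`/v2/serve/models` route is documented explicitly below.

```bash
# Hypothetical OpenAI-style inference request routed through LiteLLM via the
# LISA Serve ALB. <your_token>, <alb_endpoint>, and the exact path are
# placeholders/assumptions, not confirmed values.
curl -s -H 'Authorization: Bearer <your_token>' \
  -H 'Content-Type: application/json' \
  -X POST "https://<alb_endpoint>/v2/serve/chat/completions" \
  -d '{
    "model": "titan-express-v1",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```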

## User-facing OpenAI-Compatible API

The OpenAI-compatible API is accessible through the LISA Serve ALB and allows users to list the models available for
inference tasks. Although not strictly part of the model management APIs, any model that is added to or removed from
LiteLLM via the Model Management API is reflected immediately in queries to LiteLLM through the LISA Serve ALB.

### Listing Models

The `/v2/serve/models` endpoint on the LISA Serve ALB allows users to list all models available for inference in the
LISA system.

#### Request Example:

```bash
curl -s -H 'Authorization: Bearer <your_token>' -X GET https://<alb_endpoint>/v2/serve/models
```

#### Response Example:

```json
{
  "data": [
    {
      "id": "bedrock-embed-text-v2",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai"
    },
    {
      "id": "titan-express-v1",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai"
    },
    {
      "id": "sagemaker-amazon-mistrallite",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai"
    }
  ],
  "object": "list"
}
```

#### Explanation of Response Fields:

These fields are all defined by the OpenAI API specification, which is
documented [here](https://platform.openai.com/docs/api-reference/models/list).

- `id`: A unique identifier for the model.
- `object`: The type of object, which is "model" in this case.
- `created`: A Unix timestamp representing when the model was created.
- `owned_by`: The entity responsible for the model, such as "openai."
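
For quick inspection, the model IDs can be pulled out of this response with `jq` (a sketch, assuming `jq` is installed
locally; the token and endpoint are placeholders):

```bash
# List only the model IDs returned by the /v2/serve/models endpoint.
curl -s -H 'Authorization: Bearer <your_token>' \
  -X GET "https://<alb_endpoint>/v2/serve/models" | jq -r '.data[].id'
```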
10 changes: 6 additions & 4 deletions lib/docs/admin/getting-started.md
@@ -120,7 +120,7 @@ This command verifies if the model's weights are already present in your S3 bucket

> **WARNING**
> As of LISA 3.0, the `ecsModels` parameter in `config-custom.yaml` is solely for staging model weights in your S3 bucket.
> Previously, before models could be managed through the [API](/admin/model-management) or via the Model Management
> Previously, before models could be managed through the [API](/config/model-management-api) or via the Model Management
> section of the [Chatbot](/user/chat), this parameter also dictated which models were deployed.
@@ -140,13 +140,14 @@ In the `config-custom.yaml` file, configure the `authConfig` block for authentication:
- `jwtGroupsProperty`: Path to the groups field in the JWT token
- `additionalScopes` (optional): Extra scopes for group membership information

IDP Configuration examples using AWS Cognito and Keycloak can be found: [IDP Configuration Examples](/config/idp)
IDP Configuration examples using AWS Cognito and Keycloak can be found: [IDP Configuration Examples](/admin/idp-config)


## Step 7: Configure LiteLLM
We utilize LiteLLM under the hood to allow LISA to conform to the [OpenAI specification](https://platform.openai.com/docs/api-reference).
For LiteLLM configuration, a key must be set up so that the system may communicate with a database for tracking all the models that are added or removed
using the [Model Management API](/admin/model-management). The key must start with `sk-` and then can be any arbitrary
using the [Model Management API](/config/model-management-api). The key must start with `sk-` and then can be any arbitrary
string. We recommend generating a new UUID and then using that as the key. A configuration example is below.
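
One way to generate a key of the recommended shape (a sketch; any unique string prefixed with `sk-` works):

```bash
# Generate an sk- prefixed key from a fresh UUID, as recommended above.
echo "sk-$(uuidgen)"
```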

@@ -229,5 +230,6 @@ services are in the same region as the LISA installation, LISA can use them alongside

**Important:** Endpoints or Models statically defined during LISA deployment cannot be removed or updated using the
LISA Model Management API, and they will not show in the Chat UI. These will only show as part of the OpenAI `/models` API.
Although there is support for it, we recommend using the [Model Management API](/admin/model-management) instead of the
Although there is support for it, we recommend using the [Model Management API](/config/model-management-api) instead of the
following static configuration.
File renamed without changes.
1 change: 1 addition & 0 deletions lib/docs/admin/litellm.md
@@ -0,0 +1 @@
# TODO
1 change: 1 addition & 0 deletions lib/docs/admin/model-management-admin.md
@@ -0,0 +1 @@
# TODO
1 change: 1 addition & 0 deletions lib/docs/admin/security.md
@@ -0,0 +1 @@
# TODO
File renamed without changes.
1 change: 1 addition & 0 deletions lib/docs/config/branding.md
@@ -0,0 +1 @@
# TODO
@@ -1,85 +1,18 @@

# Model Management API Usage

LISA provides robust API endpoints for managing models, both for users and administrators. These endpoints allow for operations such as listing, creating, updating, and deleting models.

## API Gateway and ALB Endpoints

LISA uses two primary APIs for model management:

1. **User-facing OpenAI-Compatible API**: Available to all users for inference tasks and accessible through the LISA Serve ALB. This API provides an interface for querying and interacting with models deployed on Amazon ECS, Amazon Bedrock, or through LiteLLM.
2. **Admin-level Model Management API**: Available only to administrators through the API Gateway (APIGW). This API allows for full control of model lifecycle management, including creating, updating, and deleting models.

### LiteLLM Routing in All Models

Every model request is routed through LiteLLM, regardless of whether infrastructure (like ECS) is created for it. Whether a model is deployed on ECS, accessed externally via Bedrock, or managed directly through LiteLLM, it is added to LiteLLM for traffic routing. The distinction is whether infrastructure is created (determined by the request payload), but LiteLLM integration is consistent for all models. The model management APIs handle adding or removing model configurations from LiteLLM, and the LISA Serve endpoint handles inference requests against the models available in LiteLLM.

## User-facing OpenAI-Compatible API

The OpenAI-compatible API is accessible through the LISA Serve ALB and allows users to list the models available for inference tasks. Although not strictly part of the model management APIs, any model that is added to or removed from LiteLLM via the Model Management API is reflected immediately in queries to LiteLLM through the LISA Serve ALB.

### Listing Models

The `/v2/serve/models` endpoint on the LISA Serve ALB allows users to list all models available for inference in the LISA system.

#### Request Example:

```bash
curl -s -H 'Authorization: Bearer <your_token>' -X GET https://<alb_endpoint>/v2/serve/models
```

#### Response Example:

```json
{
  "data": [
    {
      "id": "bedrock-embed-text-v2",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai"
    },
    {
      "id": "titan-express-v1",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai"
    },
    {
      "id": "sagemaker-amazon-mistrallite",
      "object": "model",
      "created": 1677610602,
      "owned_by": "openai"
    }
  ],
  "object": "list"
}
```

#### Explanation of Response Fields:

These fields are all defined by the OpenAI API specification, which is documented [here](https://platform.openai.com/docs/api-reference/models/list).

- `id`: A unique identifier for the model.
- `object`: The type of object, which is "model" in this case.
- `created`: A Unix timestamp representing when the model was created.
- `owned_by`: The entity responsible for the model, such as "openai."

## Admin-level Model Management API
# Admin-level Model Management API

This API is only accessible by administrators via the API Gateway and is used to create, update, and delete models. It supports full model lifecycle management.

### Listing Models (Admin API)
## Listing Models (Admin API)

The `/models` route allows admins to list all models managed by the system. This includes models that are creating, deleting, already active, or in a failed state. Models can be deployed via ECS or managed externally through a LiteLLM configuration.

#### Request Example:
### Request Example:

```bash
curl -s -H "Authorization: Bearer <admin_token>" -X GET https://<apigw_endpoint>/models
```

#### Response Example:
### Response Example:

```json
{
@@ -152,28 +85,28 @@ curl -s -H "Authorization: Bearer <admin_token>" -X GET https://<apigw_endpoint>
}
```

#### Explanation of Response Fields:
### Explanation of Response Fields:

- `modelId`: A unique identifier for the model.
- `modelName`: The name of the model, typically referencing the underlying service (Bedrock, SageMaker, etc.).
- `status`: The current state of the model, e.g., "Creating," "Active," or "Failed."
- `streaming`: Whether the model supports streaming inference.
- `instanceType` (optional): The instance type if the model is deployed via ECS.

### Creating a Model (Admin API)
## Creating a Model (Admin API)

LISA provides the `/models` endpoint for creating both ECS and LiteLLM-hosted models. Depending on the request payload, infrastructure will be created or bypassed (e.g., for LiteLLM-only models).

This API accepts the same model definition parameters that were accepted in the V2 model definitions within the config.yaml file, with one notable difference: the `containerConfig.image.path` field is
now omitted because it corresponded with the `inferenceContainer` selection. As a convenience, this path is no longer required.

#### Request Example:
### Request Example:

```
POST https://<apigw_endpoint>/models
```

#### Example Payload for ECS Model:
### Example Payload for ECS Model:

```json
{
@@ -226,7 +159,7 @@ POST https://<apigw_endpoint>/models
}
```

#### Creating a LiteLLM-Only Model:
### Creating a LiteLLM-Only Model:

```json
{
@@ -237,7 +170,7 @@ POST https://<apigw_endpoint>/models
}
```

#### Explanation of Key Fields for Creation Payload:
### Explanation of Key Fields for Creation Payload:

- `modelId`: The unique identifier for the model. This can be any name you would like.
- `modelName`: The name of the model as it appears in the system. For LISA-hosted models, this must be the S3 Key to your model artifacts, otherwise
@@ -254,17 +187,17 @@ POST https://<apigw_endpoint>/models
- `autoScalingConfig`: Configuration related to ECS autoscaling.
- `loadBalancerConfig`: Health check configuration for load balancers.

### Deleting a Model (Admin API)
## Deleting a Model (Admin API)

Admins can delete a model using the following endpoint. Deleting a model removes its infrastructure (ECS) or disconnects it from LiteLLM.

#### Request Example:
### Request Example:

```
DELETE https://<apigw_endpoint>/models/{modelId}
```

#### Response Example:
### Response Example:

```json
{
@@ -273,7 +206,7 @@ DELETE https://<apigw_endpoint>/models/{modelId}
}
```

### Updating a Model
## Updating a Model

LISA offers basic update functionality for both LISA-hosted and LiteLLM-only models. For both types, the model type and streaming support can be updated
in cases where the model was originally created with the wrong parameters. For example, if an embedding model was accidentally created as a `textgen`
@@ -287,15 +220,15 @@ as updating its AutoScaling configuration, as these would introduce ambiguous in
requires using the enable/disable functionality to allow models to fully scale down or turn back on. Metadata updates, such as changing the model type
or streaming compatibility, can happen in either type of update or on their own.

#### Request Example
### Request Example

```
PUT https://<apigw_endpoint>/models/{modelId}
```

#### Example Payloads
### Example Payloads

##### Update Model Metadata
#### Update Model Metadata

This payload will simply update the model metadata, which will complete within seconds of invocation. When setting a model as an `embedding` model, the
`streaming` option must be set to `false` or omitted, as LISA does not support streaming with embedding models. Both the `streaming` and `modelType` options
@@ -308,7 +241,7 @@ may be included in any other update request.
}
```
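
As an end-to-end sketch of this metadata update (the payload shape is inferred from the surrounding text and the
collapsed example above, not confirmed):

```bash
# Hypothetical call correcting a model that was accidentally created as
# `textgen` into an embedding model; streaming must be false for embeddings.
curl -s -H "Authorization: Bearer <admin_token>" \
  -H "Content-Type: application/json" \
  -X PUT "https://<apigw_endpoint>/models/<modelId>" \
  -d '{"modelType": "embedding", "streaming": false}'
```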

##### Update AutoScaling Configuration
#### Update AutoScaling Configuration

This payload will update the AutoScaling configuration for the minimum, maximum, and desired number of instances. The desired number must be between the
minimum and maximum numbers, inclusive, and all of the numbers must be strictly greater than 0. If the model currently has fewer than the minimum number, then
@@ -332,7 +265,7 @@ then that is the only option that you need to specify in the request object with
}
```
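
A corresponding call might look like the following sketch; the `autoScalingInstanceConfig` field name is an assumption
based on the creation payload's `autoScalingConfig`, not a confirmed schema:

```bash
# Hypothetical AutoScaling update: min <= desired <= max, all strictly > 0.
curl -s -H "Authorization: Bearer <admin_token>" \
  -H "Content-Type: application/json" \
  -X PUT "https://<apigw_endpoint>/models/<modelId>" \
  -d '{"autoScalingInstanceConfig": {"minCapacity": 1, "desiredCapacity": 2, "maxCapacity": 4}}'
```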

##### Stop Model - Scale Down to 0 Instances
#### Stop Model - Scale Down to 0 Instances

This payload will stop all model EC2 instances and remove the model reference from LiteLLM so that users are unable to make inference requests against a model
with no capacity. This option is useful for users who wish to manage costs and turn off instances when the model is not currently needed but will be used again
@@ -347,7 +280,7 @@ handled as separate operations.
}
```
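
A minimal stop request could look like this sketch; the `enabled` flag name is an assumption standing in for the
collapsed payload above:

```bash
# Hypothetical stop request: scale the model down to 0 instances and remove
# its LiteLLM reference so no inference requests can reach it.
curl -s -H "Authorization: Bearer <admin_token>" \
  -H "Content-Type: application/json" \
  -X PUT "https://<apigw_endpoint>/models/<modelId>" \
  -d '{"enabled": false}'
```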

##### Start Model - Restore Previous AutoScaling Configuration
#### Start Model - Restore Previous AutoScaling Configuration

After stopping a model, this payload will turn the model back on by spinning up instances, waiting for the expected spin-up time to allow models to initialize, and then
adding the reference back to LiteLLM so that users may query the model again. This is expected to be a much faster operation than creating the model through the CreateModel API.
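
And the counterpart start request, under the same assumption about the flag name:

```bash
# Hypothetical start request: spin instances back up and re-register the
# model with LiteLLM once it has initialized.
curl -s -H "Authorization: Bearer <admin_token>" \
  -H "Content-Type: application/json" \
  -X PUT "https://<apigw_endpoint>/models/<modelId>" \
  -d '{"enabled": true}'
```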
1 change: 1 addition & 0 deletions lib/docs/config/model-management-ui.md
@@ -0,0 +1 @@
# TODO