feat: add demoui for openai api (#777)
add chainlit frontend for openai api
zhuangqh authored Dec 12, 2024
1 parent 83f25cd commit 7da6586
Showing 6 changed files with 83 additions and 13 deletions.
4 changes: 2 additions & 2 deletions charts/DemoUI/inference/README.md
@@ -5,12 +5,12 @@ Before deploying the Demo front-end, you must set the `workspaceServiceURL` envi
To set this value, modify the `values.override.yaml` file or use the `--set` flag during Helm install/upgrade:

```bash
-helm install inference-frontend ./charts/DemoUI/inference/values.yaml --set env.workspaceServiceURL="http://<CLUSTER_IP>:80/chat"
+helm install inference-frontend ./charts/DemoUI/inference --set env.workspaceServiceURL="http://<CLUSTER_IP>:80"
```

Or through a custom `values` file (`values.override.yaml`):
```bash
-helm install inference-frontend ./charts/DemoUI/inference/values.yaml -f values.override.yaml
+helm install inference-frontend ./charts/DemoUI/inference -f values.override.yaml
```

## Values
21 changes: 17 additions & 4 deletions charts/DemoUI/inference/templates/deployment.yaml
@@ -37,13 +37,26 @@ spec:
      args:
        - -c
        - |
-          mkdir -p /app/frontend && \
-          pip install chainlit requests && \
-          wget -O /app/frontend/inference.py https://raw.githubusercontent.com/kaito-project/kaito/main/demo/inferenceUI/chainlit.py && \
-          chainlit run frontend/inference.py -w
+          mkdir -p /app/frontend
+          pip install chainlit pydantic==2.10.1 requests openai --quiet
+          case "$RUNTIME" in
+            vllm)
+              wget -O /app/frontend/inference.py https://raw.githubusercontent.com/kaito-project/kaito/refs/heads/main/demo/inferenceUI/chainlit_openai.py
+              ;;
+            transformers)
+              wget -O /app/frontend/inference.py https://raw.githubusercontent.com/kaito-project/kaito/refs/heads/main/demo/inferenceUI/chainlit_transformers.py
+              ;;
+            *)
+              echo "Error: Unsupported RUNTIME value" >&2
+              exit 1
+              ;;
+          esac
+          chainlit run --host 0.0.0.0 /app/frontend/inference.py -w
      env:
        - name: WORKSPACE_SERVICE_URL
          value: "{{ .Values.env.workspaceServiceURL }}"
+        - name: RUNTIME
+          value: "{{ .Values.env.runtime }}"
      workingDir: /app
      ports:
        - name: http
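Why two branches: the workspace service speaks a different HTTP API depending on the inference backend, so each `RUNTIME` value gets its own frontend script. A minimal sketch of the two request shapes, inferred from this diff and the two chainlit scripts (the service name, model id, and `/chat` payload fields are illustrative assumptions, not a full API reference):

```python
import requests

base = "http://workspace-falcon-7b.default.svc.cluster.local:80"

# RUNTIME=vllm: OpenAI-compatible API served under /v1.
resp = requests.post(f"{base}/v1/chat/completions", json={
    "model": "<MODEL_ID>",  # the frontend discovers this via GET /v1/models
    "messages": [{"role": "user", "content": "Hello!"}],
})

# RUNTIME=transformers: the workspace's own /chat endpoint (assumed payload shape).
resp = requests.post(f"{base}/chat", json={"prompt": "Hello!"})
print(resp.status_code)
```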
10 changes: 6 additions & 4 deletions charts/DemoUI/inference/values.yaml
@@ -4,7 +4,7 @@ replicaCount: 1
image:
  repository: python
  pullPolicy: IfNotPresent
-  tag: "3.8"
+  tag: "3.12"
imagePullSecrets: []
podAnnotations: {}
serviceAccount:
@@ -18,9 +18,9 @@ service:
# Specify the URL for the Workspace Service inference endpoint. Use the DNS name within the cluster for reliability.
#
# Examples:
-#   Cluster IP: "http://<CLUSTER_IP>:80/chat"
-#   DNS name: "http://<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local:80/chat"
-#   e.g., "http://workspace-falcon-7b.default.svc.cluster.local:80/chat"
+#   Cluster IP: "http://<CLUSTER_IP>:80"
+#   DNS name: "http://<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local:80"
+#   e.g., "http://workspace-falcon-7b.default.svc.cluster.local:80"
#
# workspaceServiceURL: "<YOUR_SERVICE_URL>"
resources:
@@ -44,6 +44,8 @@ readinessProbe:
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
+env:
+  runtime: "vllm" # "vllm" or "transformers"
nodeSelector: {}
tolerations: []
affinity: {}
4 changes: 2 additions & 2 deletions demo/inferenceUI/README.md
@@ -20,12 +20,12 @@ Workspace Service endpoint.
- Using the --set flag:

  ```
-  helm install inference-frontend ./charts/DemoUI/inference --set env.workspaceServiceURL="http://<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local:80/chat"
+  helm install inference-frontend ./charts/DemoUI/inference --set env.workspaceServiceURL="http://<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local:80"
  ```
- Using a custom `values.override.yaml` file:
  ```
  env:
-    workspaceServiceURL: "http://<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local:80/chat"
+    workspaceServiceURL: "http://<SERVICE_NAME>.<NAMESPACE>.svc.cluster.local:80"
  ```
  Then deploy with custom values file:
  ```
54 changes: 54 additions & 0 deletions demo/inferenceUI/chainlit_openai.py
@@ -0,0 +1,54 @@
import os
from urllib.parse import urljoin

from openai import AsyncOpenAI
import chainlit as cl

URL = os.environ.get('WORKSPACE_SERVICE_URL')

# The vLLM runtime serves an OpenAI-compatible API under <URL>/v1; the api_key
# is a placeholder required by the client but not checked by the workspace.
client = AsyncOpenAI(base_url=urljoin(URL, "v1"), api_key="YOUR_OPENAI_API_KEY")
cl.instrument_openai()

settings = {
    "temperature": 0.7,
    "max_tokens": 500,
    "top_p": 1,
    "frequency_penalty": 0,
    "presence_penalty": 0,
}

@cl.on_chat_start
async def start_chat():
    # Discover which model the workspace is serving and use the first one.
    models = await client.models.list()
    print(f"Available models: {models}")
    if len(models.data) == 0:
        raise ValueError("No models found")

    global model
    model = models.data[0].id
    print(f"Using model: {model}")

@cl.on_message
async def main(message: cl.Message):
    messages = [
        {
            "content": "You are a helpful assistant.",
            "role": "system"
        },
        {
            "content": message.content,
            "role": "user"
        }
    ]
    msg = cl.Message(content="")

    # Stream the completion token by token into the Chainlit message.
    stream = await client.chat.completions.create(
        messages=messages, model=model,
        stream=True,
        **settings
    )

    async for part in stream:
        if token := part.choices[0].delta.content or "":
            await msg.stream_token(token)
    await msg.update()
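Because the vLLM runtime exposes the standard OpenAI protocol, the same workspace can be exercised without Chainlit using the plain synchronous client. A minimal sketch, assuming `WORKSPACE_SERVICE_URL` points at a running vLLM workspace:

```python
import os
from urllib.parse import urljoin

from openai import OpenAI

url = os.environ["WORKSPACE_SERVICE_URL"]  # e.g. "http://<CLUSTER_IP>:80"
client = OpenAI(base_url=urljoin(url, "v1"), api_key="unused")  # placeholder key

# Same model discovery as the frontend above: take the first served model.
model = client.models.list().data[0].id

reply = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(reply.choices[0].message.content)
```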
3 changes: 2 additions & 1 deletion demo/inferenceUI/chainlit_transformers.py
@@ -1,4 +1,5 @@
import os
+from urllib.parse import urljoin

import chainlit as cl
import requests
@@ -25,7 +26,7 @@ def inference(prompt):
        }
    }

-    response = requests.post(URL, json=data)
+    response = requests.post(urljoin(URL, "chat"), json=data)

    if response.status_code == 200:
        response_data = response.json()
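The switch to `urljoin(URL, "chat")` is what lets `workspaceServiceURL` drop the `/chat` suffix across the charts: the frontend now appends the path itself. `urljoin` replaces the last path segment of the base URL, so even an old-style URL ending in `/chat` still resolves correctly:

```python
from urllib.parse import urljoin

print(urljoin("http://svc:80", "chat"))       # http://svc:80/chat
print(urljoin("http://svc:80/", "chat"))      # http://svc:80/chat
print(urljoin("http://svc:80/chat", "chat"))  # http://svc:80/chat (old-style URL still works)
```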
