-
Notifications
You must be signed in to change notification settings - Fork 201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable OpenTelemetry Tracing for ChatQnA TGI serving on Gaudi #1316
base: main
Are you sure you want to change the base?
Conversation
Dependency Review✅ No vulnerabilities or license issues found.Scanned Files |
Please fix the issue reported by CI. |
a4298ad
to
45aa2f0
Compare
addressed pre-commit and merge-conflict. should I take care missing dockerfile issue? it doesn't seem to be related to this PR. |
Please merge main branch into yours, and the issues will disappear. |
ac76846
to
28bc80f
Compare
28bc80f
to
f020688
Compare
1866502
to
bccd8b1
Compare
no_proxy: ${no_proxy} | ||
http_proxy: ${http_proxy} | ||
https_proxy: ${https_proxy} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where are those set / replaced?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they are set via environment variable. TEI session around line 38 also has same settings.
tgi-service: | ||
image: ghcr.io/huggingface/tgi-gaudi:2.0.6 | ||
privileged: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add a comment on why privileged is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
privileged is needed to access /dev/accel/accel0-7 device nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
privileged is needed to access /dev/accel/accel0-7 device nodes.
it's a little weird as other cmds should also need to access those dev nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. need to find out how habana docker runtime manages this access issue but we also have this privileged set as true for our XPU docker run.
@@ -344,3 +344,22 @@ OPEA microservice deployment can easily be monitored through Grafana dashboards | |||
|
|||
![chatqna dashboards](./assets/img/chatqna_dashboards.png) | |||
![tgi dashboard](./assets/img/tgi_dashboard.png) | |||
|
|||
## Tracing Services with OpenTelemetry Tracing and Jaeger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no doubt about this PR. but shall we enable this feature on all examples and devices but not ChatQnA TGI only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do if no concern for this PR.
bccd8b1
to
4f31aa0
Compare
4f31aa0
to
dfa90ff
Compare
Please check the CI issue. |
dfa90ff
to
d1717f2
Compare
@chensuyue I rebased to the latest codes couple time, but still have dockerfile test issue. any suggestion? |
@chensuyue also docker compose doesn't get the export OTLP env variable from set_env.sh CI doesn't use set_env.sh. I added those env variable into test script instead. |
551d966
to
551b9e3
Compare
551b9e3
to
9d1cc2f
Compare
Signed-off-by: louie-tsai <[email protected]>
Signed-off-by: Tsai, Louie <[email protected]>
9d1cc2f
to
9383f72
Compare
Description
Enable Jaeger UI and OpenTelemetry tracing for ChatQnA TGI serving on Gaudi.
user could see OPEA, TEI, and TGI open telemetry tracing on Jaeger UI.
Issues
n/a
.Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
Jaeger docker image
Tests
Manually testing on Gaudi