ai-k8s is experiencing large ingestion latency into AI component #375

Open
SophiaZHANG509 opened this issue Nov 7, 2024 · 1 comment
Labels
need more info Need more info for investigation troubleshoot

Comments

@SophiaZHANG509

Hi experts,

We noticed a large ingestion latency when the ai-k8s App Insights SDK sends telemetry to the Application Insights component. The delay is persistent rather than temporary, and it can be as long as 24 hours. The version in use is ai-k8s:2.0.3.0.

The delayed ingestion occurs only with the ai-k8s App Insights SDK; logs captured by other App Insights SDKs arrive without the latency.

  • ai-k8s App Insights SDK:
    Image

  • other types of App Insights SDK, including dotnet:2.22.0-997, dotnetc:2.20.0-103
    Image

According to the official document below, the time gap is between Timestamp/TimeGenerated and _TimeReceived, so the delay should originate on the client side (the server where the app is hosted) rather than in the Azure App Insights backend.
https://learn.microsoft.com/en-us/azure/azure-monitor/logs/data-ingestion-time#check-ingestion-time
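
To make the reasoning above concrete, here is a minimal Python sketch of how the end-to-end latency can be split into a client-side part and a backend part. It assumes the three timestamps mean what the linked document describes: TimeGenerated is when the record was created at the source, _TimeReceived is when the ingestion endpoint received it, and ingestion_time() is when it became available for query. The function name `split_latency` and the sample values are hypothetical and are not taken from our workspace.

```python
from datetime import datetime, timedelta, timezone


def split_latency(time_generated, time_received, ingestion_time):
    """Split end-to-end latency into a client-side part and a backend part."""
    return {
        # Time spent before the record reached the ingestion endpoint.
        "client": time_received - time_generated,
        # Time spent inside the ingestion pipeline after it was received.
        "backend": ingestion_time - time_received,
        # Total delay between creation and availability for query.
        "end_to_end": ingestion_time - time_generated,
    }


# Hypothetical record, invented for illustration only.
generated = datetime(2024, 11, 6, 8, 0, 0, tzinfo=timezone.utc)
received = generated + timedelta(hours=24)
ingested = received + timedelta(seconds=30)

parts = split_latency(generated, received, ingested)
for name, value in parts.items():
    print(f"{name}: {value}")
```

In this made-up record almost all of the delay falls in the "client" part, which is the pattern we see for ai-k8s telemetry.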

We have already checked the network connection between the client and the AI endpoints and found no network errors.

We are raising this ticket to seek advice and insights from you to help us look deeper into this latency issue and mitigate it.
Any assistance would be really appreciated.

xiaomi7732 self-assigned this on Dec 4, 2024
xiaomi7732 added the "need more info" label (Need more info for investigation) on Dec 5, 2024
@xiaomi7732
Member

Hey @SophiaZHANG509, thanks for reporting this.

Do you actually see the delay (the telemetry entry really arrives late), or could it be a bug in the timestamps (the entry arrives on time but looks as if it were from 24 hours ago)?

Is it possible to temporarily remove the Application Insights Kubernetes enricher and try again?

To do that, comment out this line at the start of your application:

services.AddApplicationInsightsKubernetesEnricher();

I suspect what you see is a correlation between the issue and this SDK, not causation. Trying without the enricher will help us confirm where the issue is and thus find the right expert to investigate it for you.
