This repository has been archived by the owner on Jul 9, 2024. It is now read-only.

Received server error (500) from primary and could not load the entire response body from endpoint #1331

Open
zinebtabet opened this issue Jul 10, 2023 · 4 comments
Labels
bug Something isn't working needs-triage Triage required

Comments

@zinebtabet

[ERROR] ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from primary and could not load the entire response body. See https://eu-west-3.console.aws.amazon.com/cloudwatch/home?region=eu-west-3#logEventViewer:group=/aws/sagemaker/Endpoints/pytorch-inference-2023-07-10-09-51-02-299 in account 086892845792 for more information.
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 442, in lambda_handler
    pred_prob = invoke_endpoint_with_idx(endpointname = ENDPOINT_NAME, target_id = transaction_id, subgraph_dict = subgraph_dict, n_feats = transaction_embed_value_dict)
  File "/var/task/lambda_function.py", line 314, in invoke_endpoint_with_idx
    response = runtime.invoke_endpoint(EndpointName=endpointname,
  File "/opt/python/lib/python3.8/site-packages/botocore/client.py", line 508, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/opt/python/lib/python3.8/site-packages/botocore/client.py", line 911, in _make_api_call
    raise error_class(parsed_response, operation_name)

I got this error while running the code at https://github.com/awslabs/realtime-fraud-detection-with-gnn-on-dgl/tree/main/src/sagemaker.
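
The error message above names the CloudWatch log group for the endpoint, and the container-side cause of a 500 is usually only visible there. As a rough sketch (not part of the repository's code), the log group can be pulled out of the message and its events listed. The helper names `log_group_from_error` and `endpoint_log_messages` are hypothetical, and `logs_client` is assumed to be a boto3 CloudWatch Logs client such as `boto3.client("logs", region_name="eu-west-3")`.

```python
import re

# Matches the "group=/aws/sagemaker/Endpoints/<endpoint-name>" part of the
# console URL embedded in the ModelError message.
LOG_GROUP_PATTERN = re.compile(r"group=(/aws/sagemaker/Endpoints/[A-Za-z0-9_\-]+)")


def log_group_from_error(message):
    """Return the CloudWatch log group named in the error message, or None."""
    match = LOG_GROUP_PATTERN.search(message)
    return match.group(1) if match else None


def endpoint_log_messages(logs_client, log_group, start_time_ms=None, limit=50):
    """Return up to `limit` log messages from the endpoint's log group.

    Assumption: `logs_client` is a boto3 CloudWatch Logs client.
    `start_time_ms` (epoch milliseconds) optionally skips older events.
    """
    kwargs = {"logGroupName": log_group, "limit": limit}
    if start_time_ms is not None:
        kwargs["startTime"] = start_time_ms
    response = logs_client.filter_log_events(**kwargs)
    return [event["message"] for event in response.get("events", [])]
```

Calling `endpoint_log_messages` with a `start_time_ms` shortly before the failed invocation should narrow the output to the Python exception raised inside the inference container, if one was logged.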

@zinebtabet zinebtabet added bug Something isn't working needs-triage Triage required labels Jul 10, 2023
@zxkane
Contributor

zxkane commented Jul 11, 2023

@zinebtabet Could you add detailed, reproducible steps showing how you are using the code?

@zinebtabet
Author

zinebtabet commented Jul 11, 2023

I used the same code as you have in the SageMaker repository. The only thing I modified was the Dockerfile, since I am in EU West 3. I set it up like this:

    ARG IMAGE_REPO=763104351884.dkr.ecr.eu-west-3.amazonaws.com
    FROM $IMAGE_REPO/pytorch-training:1.11.0-cpu-py38-ubuntu20.04-sagemaker
    ENV PATH="/opt/ml/model:${PATH}"
    ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/ml/model
    COPY * /opt/ml/code/
    ENV SAGEMAKER_PROGRAM fd_sl_train_entry_point.py
    RUN pip install dgl dglgo -f https://data.dgl.ai/wheels/repo.html

I used the same version specified in the Dockerfile for the deployment as well. Then I invoked my endpoint from the Lambda function. When I ran the test event, I received this error. I have been getting this error for over a month now. Here are screenshots of the error:

image

The error in the Lambda test event:

Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 442, in lambda_handler
    pred_prob = invoke_endpoint_with_idx(endpointname = ENDPOINT_NAME, target_id = transaction_id, subgraph_dict = subgraph_dict, n_feats = transaction_embed_value_dict)
  File "/var/task/lambda_function.py", line 314, in invoke_endpoint_with_idx
    response = runtime.invoke_endpoint(EndpointName=endpointname,
  File "/opt/python/lib/python3.8/site-packages/botocore/client.py", line 508, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/opt/python/lib/python3.8/site-packages/botocore/client.py", line 911, in _make_api_call
    raise error_class(parsed_response, operation_name)
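
The traceback only shows the client side of the failure. One possible way to get more detail in the Lambda logs is to wrap the `invoke_endpoint` call and print whatever the service attached to the exception. This is a sketch, not the repository's code: `summarize_model_error` and `invoke_with_diagnostics` are hypothetical names, `runtime` is assumed to be a boto3 `sagemaker-runtime` client, and the `OriginalStatusCode`, `OriginalMessage`, and `LogStreamArn` keys are assumed to be present on the error response as described for the SageMaker Runtime `ModelError`; missing keys simply come back as `None`.

```python
def summarize_model_error(err):
    """Collect diagnostic fields from a botocore-style client error.

    Assumption: `err.response` is the parsed error response dict that
    botocore attaches to client errors. Absent keys yield None.
    """
    response = getattr(err, "response", None) or {}
    error = response.get("Error", {})
    return {
        "code": error.get("Code"),
        "message": error.get("Message"),
        "original_status_code": response.get("OriginalStatusCode"),
        "original_message": response.get("OriginalMessage"),
        "log_stream_arn": response.get("LogStreamArn"),
    }


def invoke_with_diagnostics(runtime, **invoke_kwargs):
    """Call runtime.invoke_endpoint and log error details before re-raising."""
    try:
        return runtime.invoke_endpoint(**invoke_kwargs)
    except Exception as err:
        # Only errors carrying a parsed response have anything to summarize.
        if hasattr(err, "response"):
            print("InvokeEndpoint failed:", summarize_model_error(err))
        raise
```

The exception is re-raised unchanged, so the Lambda still fails as before; the only difference is the extra line in its log output.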

@aminHelkinz

Hello @zxkane,

Thank you for the nice project. I have learned a lot from it.

We used a SageMaker notebook and SageMaker Studio to reproduce the project. The models were created and repackaged successfully, and their endpoints worked well. Then, in the middle of a demo, the endpoint suddenly stopped responding.

Right now, we have only one model with a working endpoint, which was trained and repackaged with a SageMaker notebook.

Since then, none of our other endpoints (for models created with either the notebook or Studio) work anymore.

I appreciate any help or suggestion!

@zxkane
Contributor

zxkane commented Aug 2, 2023

@zhjwy9343 is the data scientist who authored those notebooks. James, could you have a look at these problems?
