Enable E2E Tests for Kubeflow Pipeline on ppc64le Architecture #11477

Open
wants to merge 1 commit into
base: master

valen-mascarenhas14

@valen-mascarenhas14 valen-mascarenhas14 commented Dec 20, 2024

Signed-off-by: Valen Mascarenhas [email protected]

Overview

This PR shows the work done so far to run the Kubeflow Pipelines (KFP) E2E tests on the ppc64le architecture using a self-hosted runner. It also describes the current challenges, the adjustments made, and the areas where community feedback is needed.


Changes Introduced

  1. Self-Hosted Runner Workflow:

    • Configured a self-hosted runner with ppc64le architecture running Ubuntu.
    • Added cleanup actions to delete the Kubernetes cluster after each job to maintain a clean environment for subsequent workflows.
  2. Image Handling:

    • The workflow uses pre-built images stored in JFrog Artifactory, so no images are built at runtime.
  3. Workflow Adjustments:

    • Adjusted the workflow to skip actions that are unnecessary on the ppc64le architecture.
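
The runner behaviour described in the list above can be illustrated with a small Python sketch. This is not the workflow code from this PR: the step names, the `SKIP_ON_PPC64LE` set, and the `steps_for` and `ephemeral_cluster` helpers are all hypothetical. The cluster create and delete commands are passed in as callables because this description does not name the tool used to manage the cluster.

```python
from contextlib import contextmanager
from typing import Callable, Iterable, Iterator, List

# Hypothetical step names: the PR description does not list which
# actions are actually skipped on ppc64le.
SKIP_ON_PPC64LE = {"build-images"}


def steps_for(arch: str, steps: Iterable[str]) -> List[str]:
    """Return the steps to run on `arch`, preserving their order."""
    if arch != "ppc64le":
        return list(steps)
    return [step for step in steps if step not in SKIP_ON_PPC64LE]


@contextmanager
def ephemeral_cluster(
    create: Callable[[], None], delete: Callable[[], None]
) -> Iterator[None]:
    """Create a cluster for one job and always delete it afterwards.

    `create` and `delete` are placeholders for whatever commands the
    runner uses; they are assumptions, not taken from the PR.
    """
    create()
    try:
        yield
    finally:
        # Runs even if the job body raises, so the next workflow
        # starts from a clean environment.
        delete()
```

Putting the delete call in a `finally` block is what keeps the self-hosted runner clean for the next workflow even when a test step fails.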

Current Challenges

  1. Test Failures:

    • Frontend-Integration Test:
      • The test is currently failing.
      • We propose skipping this test until the root cause is identified (see #11195).
    • Basic Sample Test:
      • This test is also failing; debugging is in progress.
  2. Cluster Management:

    • Ensured that clusters are deleted after job completion to avoid conflicts and resource leakage.
  3. ppc64le Build Requirements:

    • Architecture-specific changes are required to build KFP images for ppc64le.

References

  • Discussion on test failures: #11143
  • Proposal for skipping Frontend-Integration Tests: #11195

Request for Feedback

We invite the community to review this PR and provide feedback on the following:

  1. The feasibility and completeness of the approach.
  2. Suggestions to resolve current test failures effectively.
  3. Recommendations for handling ppc64le-specific challenges.
  4. Feedback on skipping or modifying workflows/tests based on our setup.

Next Steps

Based on community feedback, we plan to:

  1. Refine workflows to address identified gaps.
  2. Implement fixes for failing tests.
  3. Investigate long-term solutions for building KFP images natively on ppc64le.

Looking forward to collaborating with the community to enable robust support for ppc64le architecture!


Hi @valen-mascarenhas14. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rimolive
Member

@valen-mascarenhas14 Please sign off the commits.


[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign humairak for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot added size/XXL and removed size/L labels Dec 26, 2024
@valen-mascarenhas14
Author

@rimolive done

2 participants