Istio Ambient support #2676

Open
Tracked by #2763
peterj opened this issue Apr 11, 2024 · 23 comments · May be fixed by #2822

Comments

@peterj

peterj commented Apr 11, 2024

Istio Ambient mode is a different deployment model from the “traditional” (sidecar) mode of Istio. The ambient mode (sidecar-less) doesn’t require injecting sidecars into the deployments.

Here are the high-level differences between the two modes:

  • The concerns handled by the sidecar proxy in sidecar mode are split into two components in ambient mode:

    • Ztunnel (handles L4 concerns: mTLS and authorization policies that don’t use any HTTP attributes)

      • Ztunnel is installed automatically when profile=ambient
    • Waypoint proxy (handles L7 concerns, i.e. traffic splitting, matching, header manipulation, etc., more or less everything that gets defined in the VirtualService)

      • Waypoint is optional (not installed by default) and can be deployed per service account (handling all workloads that use the same service account) or per namespace (handling L7 proxying for all workloads in the namespace)
  • Ingress gateway isn’t installed by default anymore when using profile=ambient

    • It might be worth migrating over to Kubernetes Gateway APIs and deploying the Istio ingress gateway like that, as we’ll have to use the Gateway APIs to deploy waypoint proxies anyway
  • Kubernetes Gateway API is used for ingress (and waypoint proxy) deployments

  • Any L7 VirtualServices or AuthorizationPolicies must have a “targetRef” section that specifies which waypoint proxy handles the L7 configuration

Waypoint proxies

Any VirtualService or AuthorizationPolicy that uses HTTP concepts will require a waypoint proxy. Given that there are 3 namespaces (that I have identified so far), I’d suggest a per-namespace deployment of a waypoint proxy.

In addition to deploying the waypoint proxies, the resources for the following components will have to be updated to use them:

| Component | Namespace | Notes |
|---|---|---|
| dex | auth | |
| central-dashboard | kubeflow | |
| jupyter-web-app | kubeflow | |
| volumes-web-app | kubeflow | |
| katib-ui | kubeflow | |
| ml-pipeline-ui | kubeflow | |
| metadata-envoy-service | kubeflow | |
| kfp-tekton | kubeflow | |
| kubebench-dashboard | kubeflow | |
| profiles-kfam | kubeflow | |
| tensorboards-web-app-service | kubeflow | |
| kserve-models-web-app | kserve | |

Work items

  • Migrate to Kubernetes Gateway API

    • Update the YAML (/common/istio*) for deploying ingress and local ingress to use the Gateway API 
  • Move to the latest Istio (Ambient will be beta in the next release, 1.22)

    • We can still continue with the sidecar mode here
  • Move to Ambient mode

    • Identify components that need waypoint proxy

    • Deploy waypoint proxies 

    • Switch the Istio profile from default to ambient (or offer both as options). Since Ambient will still be beta, we shouldn’t make it the default option
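
As a rough sketch of the first work item, an Istio ingress gateway declared through the Kubernetes Gateway API could look like the following. The name kubeflow-gateway, the istio-system namespace and the single HTTP listener are assumptions for illustration, not what /common/istio* contains today. The class name istio is the GatewayClass that Istio registers for gateways it manages.

```yaml
# Sketch only: names, namespace and listener are hypothetical.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: kubeflow-gateway        # hypothetical name
  namespace: istio-system       # assumed namespace
spec:
  gatewayClassName: istio       # Istio provisions the gateway deployment and service
  listeners:
  - name: http
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: All               # let HTTPRoutes in other namespaces attach
```

Istio creates the gateway Deployment and Service when this resource is applied, so no separate ingress gateway manifest should be needed in this setup.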

@ca-scribner
Contributor

What are the components that need to change for this? Would the profile controller need to create waypoint proxies for each user's namespace?

@peterj
Author

peterj commented Jun 4, 2024

Waypoint proxies are created automatically when the Gateway resource gets created, so it is a bit simpler than crafting deployments/services. You can configure it so that there is 1 instance per namespace that handles all L7 traffic for that namespace.
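
A minimal sketch of what that could look like for one namespace, assuming Istio 1.22 or later (earlier releases used different labels and annotations). The namespace kubeflow and the waypoint name are illustrative. The Gateway mirrors what `istioctl waypoint generate` emits, and the namespace labels enroll the workloads in ambient and point them at the waypoint.

```yaml
# Sketch only, assuming Istio >= 1.22; namespace and waypoint name are illustrative.
apiVersion: v1
kind: Namespace
metadata:
  name: kubeflow
  labels:
    istio.io/dataplane-mode: ambient   # enroll the namespace in ambient (ztunnel)
    istio.io/use-waypoint: waypoint    # route L7 traffic through the waypoint below
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: waypoint
  namespace: kubeflow
  labels:
    istio.io/waypoint-for: service     # handle traffic addressed to services
spec:
  gatewayClassName: istio-waypoint     # Istio deploys the waypoint proxy for this class
  listeners:
  - name: mesh
    port: 15008
    protocol: HBONE
```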

@juliusvonkohout
Member

@peterj one waypoint proxy per dynamic, on-demand namespace would be a critical change.

@juliusvonkohout juliusvonkohout linked a pull request Jul 28, 2024 that will close this issue
@peterj
Author

peterj commented Jul 29, 2024

That's how ambient is designed to handle L7. So instead of running a sidecar next to every workload, you run 1 waypoint (L7) proxy per namespace.

@ca-scribner
Contributor

Yeah I think the profile controller would be instantiating waypoint proxies for each user namespace. That shouldn't be so hard though (the profile controller already creates other per-namespace resources)

This probably gets more complicated with respect to the Gateway API though. Do we need to migrate all VirtualServices and such to the Gateway API too? I'm not clear on what is interoperable

@peterj
Author

peterj commented Jul 30, 2024

Mixing VirtualService and Gateway API resources is not supported in ambient. So, it's either Gateway API + HTTPRoute/TLSRoute/TCPRoute or no Gateway API.

Is there an inventory of VirtualServices (and features within) currently used by kubeflow?

From the Istio docs, the following features are supported in HTTPRoutes (Gateway API):

  • matching on paths, headers
  • mirroring
  • weight-based routing
  • timeouts
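
To make the list concrete, here is a hedged sketch of a single HTTPRoute that touches each of those features. All resource names (kubeflow-gateway, the backend services, the x-canary header) are hypothetical and not taken from the Kubeflow manifests. Depending on the Gateway API version installed, the timeouts field may only be available in the experimental channel.

```yaml
# Sketch only: every name below is hypothetical.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example-route
  namespace: kubeflow
spec:
  parentRefs:
  - name: kubeflow-gateway          # hypothetical Gateway this route attaches to
    namespace: istio-system
  rules:
  - matches:
    - path:                         # path matching
        type: PathPrefix
        value: /example/
      headers:                      # header matching (exact match by default)
      - name: x-canary
        value: "true"
    filters:
    - type: RequestMirror           # mirroring
      requestMirror:
        backendRef:
          name: example-mirror
          port: 80
    backendRefs:                    # weight-based routing
    - name: example-v1
      port: 80
      weight: 90
    - name: example-v2
      port: 80
      weight: 10
    timeouts:                       # request timeout
      request: 30s
```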

@ca-scribner
Contributor

Someone else will have to speak to it, but from what I can tell the features from VirtualServices used now are covered by HTTPRoute. I'm less sure about AuthorizationPolicies and things like authentication on the gateway - do you know the state of those with ambient istio+gateway apis?

@peterj
Author

peterj commented Jul 30, 2024

The RequestAuthentication and AuthorizationPolicies can be used as-is. The only difference is that you have to use "targetRefs" instead of a selector label to point to the service. Here's an example from the docs:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: view-only
  namespace: default
spec:
  # This is different: you're targeting a Gateway instead of using labels.
  targetRefs:
  - kind: Gateway
    group: gateway.networking.k8s.io
    name: default
  action: ALLOW
  rules:
  - from:
    - source:
        namespaces: ["default", "istio-system"]
    to:
    - operation:
        methods: ["GET"]

And request auth:

apiVersion: security.istio.io/v1
kind: RequestAuthentication
metadata:
 name: "jwt-example"
 namespace: foo
spec:
 targetRef:
   kind: Gateway
   group: gateway.networking.k8s.io
   name: httpbin-gateway
 jwtRules:
 - issuer: "[email protected]"
   jwksUri: "https://raw.githubusercontent.com/istio/istio/release-1.22/security/tools/jwt/samples/jwks.json"

But, of course, it would be great to test these things out beforehand :) Is there a good walkthrough and a collection of scenarios one can run through that touch these features? (I am not so familiar with kubeflow, but if someone points me to the scenarios, I could probably test it out with Gateway API)

@juliusvonkohout
Member

juliusvonkohout commented Jul 31, 2024

In a PR you will just trigger many scenarios automatically, so that is the easiest way to test. Just check out all the GitHub workflows

@jbottum

jbottum commented Jul 31, 2024

this is great work, thanks for moving this forward.

@ca-scribner
Contributor

ty @peterj these are really helpful clarifications!

Do you know how the Gateway API works for setting up authentication at the gateway? I'm thinking of this EnvoyFilter - does it have an equivalent workflow for the new api?

@peterj
Author

peterj commented Jul 31, 2024

I think there might be an easier way to define the ext-authz that doesn't use an EnvoyFilter: https://istio.io/latest/docs/tasks/security/authorization/authz-custom/

I can see the oauth2-proxy is using the configuration above, but I am not sure why the oidc-service is using an EnvoyFilter... It should work with the same configuration I think
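
For context, the approach in that doc has two parts: an extension provider registered in the mesh config, and an AuthorizationPolicy with action CUSTOM that refers to it. The sketch below is an assumption-laden illustration, not the actual Kubeflow configuration: the provider name, service address, port, header list and gateway label are all placeholders. It uses the sidecar-style selector; in ambient the selector would presumably be replaced by targetRefs, which is untested here.

```yaml
# Sketch only: provider name, service, port, headers and labels are placeholders.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    extensionProviders:
    - name: oauth2-proxy                      # referenced by the policy below
      envoyExtAuthzHttp:
        service: oauth2-proxy.oauth2-proxy.svc.cluster.local
        port: 80
        includeRequestHeadersInCheck: ["authorization", "cookie"]
---
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: ext-authz
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway                   # assumed gateway pod label
  action: CUSTOM                              # delegate the decision to the provider
  provider:
    name: oauth2-proxy
  rules:
  - {}                                        # empty rule matches every request
```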

@peterj
Author

peterj commented Jul 31, 2024

Also, to answer the original question: based on the docs here, targetRefs could be used to configure the EnvoyFilter (but again, this should be tested :))

@juliusvonkohout
Member

oidc-authservice will be eliminated soon :-)

@kimwnasptd
Member

kimwnasptd commented Aug 2, 2024

This is an amazing effort @peterj! I'd also like to help on the changes required.

Another important component here though is going to be KServe and Knative, which create a lot of VirtualServices under the hood. Knative specifically sounds very scary since we have no influence over it from the Kubeflow side.

cc @yuzisun

EDIT: Maybe the only way to start with this would be with RawDeployment mode of KServe, which doesn't require Istio. But we need to try it out. My concern would be with the Ingress (instead of GW) that needs istio to implement the IngressClass https://kserve.github.io/website/latest/admin/kubernetes_deployment/#1-install-istio

@peterj
Author

peterj commented Aug 2, 2024

The Ingress resource still works with Istio (i.e. Istio implements the ingress class - ref), but it would make sense to move that over to Kubernetes Gateway API.

@ca-scribner
Contributor

I haven't tested anything in knative yet, but they have some discussion of ambient working

github-actions bot commented Oct 2, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@juliusvonkohout
Member

/lifecycle frozen

@ca-scribner
Contributor

Something that concerns me with ambient is that I can't figure out how to do deny-by-default. With sidecar, you can create an allow-nothing policy that forces all communication to need an Authorization Policy enabling it. But with ambient, I don't see a good way to do that.

Does anyone have a solution? It feels like a pretty big hole so I assume I'm missing something here
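
For reference, the sidecar-mode allow-nothing policy mentioned above is usually written as an AuthorizationPolicy with an empty spec, as sketched below. The kubeflow namespace is only an example. How an equivalent policy would be enforced in ambient is exactly the open question here, so this is not presented as a working ambient solution.

```yaml
# Sketch of the sidecar-mode pattern; namespace is an example.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-nothing
  namespace: kubeflow
spec: {}    # no selector and no rules: nothing is allowed unless another ALLOW policy matches
```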

@edwardzjl

I'm trying to deploy kubeflow with istio ambient mode but I find that it might be difficult to migrate the ext-auth: istio/istio#51214

@juliusvonkohout
Member

> I'm trying to deploy kubeflow with istio ambient mode but I find that it might be difficult to migrate the ext-auth: istio/istio#51214

We will try istio-cni first for rootless istio, so please help there as well #2907

@terrytangyuan
Member

KServe community is working on Gateway API migration. See kserve/kserve#3952
