# Submariner Enhancement for IPv6 Datapath

<!-- Add link to issue/epic if available -->

## Summary

IPv4, IPv6, and dual-stack networking are supported for Kubernetes clusters starting with version 1.21.
IPv6 networking allows the assignment of IPv6 addresses.
Dual-stack networking allows the simultaneous assignment of both IPv4 and IPv6 addresses.

IPv4/IPv6 dual-stack on your Kubernetes cluster provides the following features:

* Dual-stack Pod networking (a single IPv4 and IPv6 address assignment per Pod)
* IPv4- and IPv6-enabled Services
* Pod off-cluster egress routing (e.g., to the Internet) via both IPv4 and IPv6 interfaces

Currently, Submariner supports only an IPv4 datapath; this proposal explains the changes required to support IPv6-only and dual-stack clusters.

## Inter-cluster datapath requirements

The following table describes the required inter-cluster connectivity for the various combinations of cluster networking configurations:

| clusterA networking | clusterB networking | Supported connectivity type |
| :---- | :---- | :---- |
| V4 | V4 | V4 |
| V4 | V6 | N/A |
| V6 | dual-stack | V6 |
| V6 | V6 | V6 |
| V4 | dual-stack | V4 |
| dual-stack | dual-stack | V4, V6 |

Connectivity across the clusters should be supported for the following cases:

* Pods (including host-networked Pods) to Services.
* Pods (including host-networked Pods) to Pods.

## Proposal

Currently, Submariner fully supports IPv4 inter-cluster connectivity, including in-cluster egress routing to reach the GW node, GlobalNet, and inter-cluster tunnels.

The idea is to replicate the intra-cluster and inter-cluster connectivity components for IPv6 as well.

The active Gateway Engine communicates with the central Broker to advertise its Endpoint and Cluster resources to the other clusters connected to the Broker, also ensuring that it is the sole Endpoint for its cluster.
The Endpoint resource fields should include IP addresses according to the cluster's networking configuration.
For example, for a dual-stack cluster, HealthCheckIP, PrivateIP, PublicIP, and Subnets should contain both IPv4 and IPv6 addresses.
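As a minimal sketch, assuming the current single-value Endpoint fields are generalized to one value per IP family (the plural field names and all addresses below are hypothetical; the final API shape is TBD):

```go
package main

import "fmt"

// EndpointSpec is a hypothetical dual-stack variant of the Endpoint spec:
// each single-value IP field becomes a per-family list.
type EndpointSpec struct {
	ClusterID      string
	HealthCheckIPs []string // one address per IP family
	PrivateIPs     []string
	PublicIPs      []string
	Subnets        []string // both IPv4 and IPv6 CIDRs
}

func main() {
	// Example Endpoint advertised by a dual-stack cluster.
	ep := EndpointSpec{
		ClusterID:      "clusterA",
		HealthCheckIPs: []string{"10.1.0.5", "fd00:10:1::5"},
		PrivateIPs:     []string{"192.168.1.10", "fd00:192::10"},
		PublicIPs:      []string{"203.0.113.7", "2001:db8::7"},
		Subnets:        []string{"10.1.0.0/16", "fd00:10:1::/64"},
	}
	fmt.Printf("%+v\n", ep)
}
```
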
The Route Agent running in the cluster learns about the local Endpoint and remote Endpoints and sets up the necessary IPv4 and IPv6 infrastructure to route cross-cluster traffic from all nodes to the active Gateway Engine node.

The active Gateway Engine also establishes a watch on the Broker to learn about the active Endpoint and Cluster resources advertised by the other clusters. Once two clusters are aware of each other's Endpoints, they can establish one or more secure tunnels, based on the contents of the local and remote Endpoints, through which traffic can be routed. A tunnel should be created only if the local Endpoint networking type matches the remote Endpoint IP family.

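A small sketch of that matching rule, treating each Endpoint as advertising the set of IP families it supports; it reproduces the connectivity table above (the helper name is illustrative):

```go
package main

import "fmt"

// tunnelFamilies returns the IP families for which tunnels should be created:
// the intersection of the families supported by the local and remote
// Endpoints. This mirrors the connectivity table above
// (e.g. V4 + dual-stack => V4 only).
func tunnelFamilies(local, remote []string) []string {
	remoteSet := map[string]bool{}
	for _, f := range remote {
		remoteSet[f] = true
	}
	var out []string
	for _, f := range local {
		if remoteSet[f] {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	fmt.Println(tunnelFamilies([]string{"IPv4", "IPv6"}, []string{"IPv6"})) // [IPv6]
	fmt.Println(tunnelFamilies([]string{"IPv4"}, []string{"IPv6"}))        // [] => no tunnel
}
```
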
The next diagram illustrates Submariner's datapath architecture for kube-proxy based CNIs:
![non-ovnk-architecture](./images/dual-stack-arch-for-non-ovnk.png)

With the proposed architecture, Submariner needs to establish both IPv4 and IPv6 intra-cluster egress routing to the GW node in the dual-stack case.

Pod IPv4 egress packets for CNI != OVNK and cable-driver=libreswan will be:
![non-ovnk-ipv4-egress](./images/ipv4-non-ovnk-egress-packets.png)

And Pod IPv6 egress packets for the same configuration will be:
![non-ovnk-ipv6-egress](./images/ipv6-non-ovnk-egress-packets.png)

**Note**: In the future, we may optimize this architecture for the dual-stack case,
for example by using only the intra-cluster V4 VxLAN to route both V4 and V6 traffic to the GW.

## Datapath breakdown

### Gateway

To support IPv6, the gateway should (see the discovery sketch after this list):

* discover the publicIP, privateIP, healthcheckIP, and cluster subnets for each IP family.
  * Note: the gateway should address corner cases related to this change, for example a dual-stack environment in which only the V4 publicIP address is successfully resolved.
* run NAT Discovery per IP family in the remote Endpoint.
* advertise IP details in the local Endpoint based on the cluster networking type; for example, in a dual-stack cluster both the V4 and V6 Public IPs should be advertised in the Endpoint.
* continue advertising a **single** Endpoint; in the case of a dual-stack cluster, fields should contain both V4 and V6 addresses.
* create an inter-cluster tunnel only if the local Endpoint networking type matches the remote Endpoint IP family.
  * Continue using IPSec in tunnel mode.
* support HealthCheck for both V4 and V6 tunnels.
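A minimal sketch of per-family discovery using DNS, where the Go resolver restricts a lookup to A ("ip4") or AAAA ("ip6") records; the host name and helper are illustrative, and each of Submariner's public-IP resolution methods would need the same per-family treatment:

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// lookupPublicIP resolves a host to an address of the requested family:
// network is "ip4" for IPv4 (A records) or "ip6" for IPv6 (AAAA records).
// In a dual-stack cluster the two lookups can fail independently, which the
// gateway must tolerate (the corner case noted in the list above).
func lookupPublicIP(host, network string) (net.IP, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	ips, err := net.DefaultResolver.LookupIP(ctx, network, host)
	if err != nil || len(ips) == 0 {
		return nil, fmt.Errorf("no %s address for %s: %w", network, host, err)
	}
	return ips[0], nil
}

func main() {
	for _, family := range []string{"ip4", "ip6"} {
		ip, err := lookupPublicIP("example.com", family)
		fmt.Println(family, ip, err)
	}
}
```
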

The next diagram describes the high-level flow of inter-cluster tunnel creation in the GW:

![tunnel-creation-flow-diagram-gw](./images/tunnel-creation-flow-diagram-gw.png)

The components marked in pink should be updated to also support V6.

### RouteAgent

The Submariner RouteAgent is composed of several event-driven handlers, each responsible for specific functionality. The list below describes the required changes in each handler:

#### OVN_GwRoute handler

Creates a GatewayRoute resource for each remote endpoint; this CR defines the routing details on the active GW node needed for sending traffic destined to remote clusters. The OVN_GwRoute handler should be enhanced to create GatewayRoute resources based on the cluster's networking type. For example, two GatewayRoute resources should be created for a dual-stack cluster.

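Creating per-family resources implies splitting a remote Endpoint's mixed subnet list by IP family; a small helper sketch (names are illustrative), equally applicable to the NonGatewayRoute handler below:

```go
package main

import (
	"fmt"
	"net"
)

// splitSubnetsByFamily separates a mixed list of CIDRs into IPv4 and IPv6
// groups, so that one GatewayRoute (or NonGatewayRoute) can be created
// per family.
func splitSubnetsByFamily(subnets []string) (v4, v6 []string) {
	for _, s := range subnets {
		ip, _, err := net.ParseCIDR(s)
		if err != nil {
			continue // skip malformed entries
		}
		if ip.To4() != nil {
			v4 = append(v4, s)
		} else {
			v6 = append(v6, s)
		}
	}
	return v4, v6
}

func main() {
	v4, v6 := splitSubnetsByFamily([]string{"10.1.0.0/16", "fd00:10:1::/64"})
	fmt.Println(v4, v6) // [10.1.0.0/16] [fd00:10:1::/64]
}
```
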

#### OVN_NonGwRoute handler

Similar to the OVN_GwRoute handler, it creates a NonGatewayRoute resource for each remote endpoint; this CR defines the routing details needed for non-GW nodes to reach the active GW node. OVN_NonGwRoute should likewise be updated to create NonGatewayRoute resources based on the cluster's networking type.

#### OVN handler

The OVN handler is responsible for configuring routing and packet-filter rules for reaching remote endpoints, such as NoMasquerade packet-filter rules. The OVN handler should be updated to support IPv6.

#### KubeProxy handler

The KubeProxy handler is responsible for configuring the datapath required for kube-proxy based CNIs. It configures egress routing to the GW node via an intra-cluster VxLAN tunnel, including CNI interface discovery and setting the ReversePathFilter to Loose mode for the relevant network interfaces. The KubeProxy handler should be updated to configure egress routing to the GW node via the intra-cluster VxLAN for IPv6 as well.

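A minimal sketch of the IPv6 leg of that egress routing using the vishvananda/netlink package; the addresses are illustrative, and the vx-submariner device is assumed to already exist:

```go
package main

import (
	"net"

	"github.com/vishvananda/netlink"
)

// Sketch (requires CAP_NET_ADMIN): program an IPv6 route that sends traffic
// for a remote cluster's pod CIDR to the gateway's address on the
// intra-cluster VxLAN interface, mirroring what is done today for IPv4.
func main() {
	link, err := netlink.LinkByName("vx-submariner") // existing VxLAN device
	if err != nil {
		panic(err)
	}

	_, remoteCIDR, _ := net.ParseCIDR("fd00:10:2::/64") // remote pod subnet (illustrative)
	gwV6 := net.ParseIP("fd00:f30b::6")                 // GW's IPv6 on the VxLAN (illustrative)

	route := &netlink.Route{
		LinkIndex: link.Attrs().Index,
		Dst:       remoteCIDR,
		Gw:        gwV6,
	}
	if err := netlink.RouteAdd(route); err != nil {
		panic(err)
	}
}
```
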

#### MTU Handler

The MTU handler is responsible for configuring MSS clamping rules for inter-cluster traffic; it should be updated to also cover IPv6 inter-cluster traffic.

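A sketch of the IPv6 twin of the MSS-clamping rule using the coreos/go-iptables package (which drives ip6tables when created with ProtocolIPv6); the chain and match here are illustrative rather than Submariner's exact rule set:

```go
package main

import "github.com/coreos/go-iptables/iptables"

// Sketch: install an IPv6 MSS-clamping rule for forwarded inter-cluster
// traffic, the ip6tables counterpart of the existing IPv4 rule.
func main() {
	ipt, err := iptables.NewWithProtocol(iptables.ProtocolIPv6)
	if err != nil {
		panic(err)
	}

	// Clamp TCP MSS to the path MTU for traffic toward a remote cluster subnet.
	err = ipt.AppendUnique("mangle", "FORWARD",
		"-p", "tcp", "--tcp-flags", "SYN,RST", "SYN",
		"-d", "fd00:10:2::/64", // remote cluster subnet (illustrative)
		"-j", "TCPMSS", "--clamp-mss-to-pmtu")
	if err != nil {
		panic(err)
	}
}
```
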

#### Calico IPPool handler

This handler is relevant only for the Calico CNI; it is responsible for creating Calico IPPools to enable inter-cluster traffic. It should also be updated to create IPv6 Calico IPPools when needed.

#### XFRMCleanup Handler

This handler is responsible for cleaning up IPSec xfrm rules when a GW node transitions to a non-gateway node. It should also be updated to delete V6 IPSec xfrm rules if needed.
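A sketch of the IPv6 side of that cleanup via vishvananda/netlink; a real implementation would match only Submariner-owned policies rather than flushing everything:

```go
package main

import "github.com/vishvananda/netlink"

// Sketch (root required): list and delete the IPv6 xfrm policies installed
// for IPSec, analogous to the existing IPv4 cleanup performed when a node
// stops being the gateway.
func main() {
	policies, err := netlink.XfrmPolicyList(netlink.FAMILY_V6)
	if err != nil {
		panic(err)
	}
	for i := range policies {
		// The real handler would filter for Submariner-created policies here.
		if err := netlink.XfrmPolicyDel(&policies[i]); err != nil {
			panic(err)
		}
	}
}
```
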

#### VxLANCleanup Handler

VxLANCleanup is responsible for cleaning up the VxLAN cable driver's routes and network interfaces when a node transitions to a non-gateway node.
It should also be updated to delete V6 VxLAN cable driver routes if needed.

#### Healthchecker Handler

The HealthChecker handler verifies the datapath from each non-GW node to each remote cluster GW. It should be updated to support V6 datapath verification.
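A sketch of a per-family probe using an unprivileged ICMPv6 echo via golang.org/x/net/icmp; the target address is illustrative, and the real handler would probe each remote Endpoint's IPv6 HealthCheckIP:

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"

	"golang.org/x/net/icmp"
	"golang.org/x/net/ipv6"
)

func main() {
	target := &net.UDPAddr{IP: net.ParseIP("fd00:10:2::5")} // remote GW HealthCheckIP (illustrative)

	conn, err := icmp.ListenPacket("udp6", "::") // unprivileged ICMPv6 socket
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Build and send an ICMPv6 echo request.
	msg := icmp.Message{
		Type: ipv6.ICMPTypeEchoRequest, Code: 0,
		Body: &icmp.Echo{ID: os.Getpid() & 0xffff, Seq: 1, Data: []byte("submariner-hc")},
	}
	wb, _ := msg.Marshal(nil)
	if _, err := conn.WriteTo(wb, target); err != nil {
		panic(err)
	}

	// Wait for the echo reply; a timeout means the V6 datapath check failed.
	conn.SetReadDeadline(time.Now().Add(3 * time.Second))
	rb := make([]byte, 1500)
	n, peer, err := conn.ReadFrom(rb)
	if err != nil {
		panic(err)
	}
	reply, _ := icmp.ParseMessage(58, rb[:n]) // 58 = ICMPv6 protocol number
	fmt.Println("reply from", peer, "type", reply.Type)
}
```
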

### OVN-Kubernetes CNI

TBD: describe in detail the changes needed for OVN-K.