Cross-Cluster Service Connectivity Fails with "Host is Unreachable" Despite Successful DNS Resolution in Submariner GlobalNet Setup #3204

Open
aswinayyolath opened this issue Oct 31, 2024 · 21 comments
Assignees
Labels
bug (Something isn't working), datapath (Datapath related issues or enhancements), flannel (flannel CNI)

Comments

@aswinayyolath

aswinayyolath commented Oct 31, 2024

What happened:
I deployed Submariner with GlobalNet across two Kubernetes clusters. DNS resolution works as expected, but connectivity to services across clusters fails with a Host is unreachable error.

More information is available in the Slack thread linked below:

https://kubernetes.slack.com/archives/C010RJV694M/p1730390376380879

What you expected to happen:
curl requests from a pod in cluster2 to a service exposed via Submariner in cluster1 should succeed, indicating that cross-cluster communication is functioning.

How to reproduce it (as minimally and precisely as possible):

  • Set up two Kubernetes clusters and deploy Submariner with GlobalNet enabled.
  • cluster1 GlobalNet CIDR: 242.0.0.0/16
  • cluster2 GlobalNet CIDR: 243.0.0.0/16
  • Deploy an nginx pod in cluster1, expose it as a service, and export it using Submariner.
  • Deploy a test pod (tmp-shell) in cluster2.
  • Attempt to access the nginx service in cluster1 from tmp-shell in cluster2 using DNS (nginx-cluster1.default.svc.clusterset.local) or the resolved GlobalNet IP.
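Because the GlobalNet overlap check in subctl diagnose could not reach the Broker in the output below, the two CIDRs listed above can also be checked by hand. The following is a minimal Go sketch of such a check; cidrsOverlap is a hypothetical helper written for illustration and is not part of subctl or Submariner.

```go
package main

import (
	"fmt"
	"net/netip"
)

// cidrsOverlap reports whether two CIDR strings share at least one address.
// Hypothetical helper for illustration only; not part of subctl.
func cidrsOverlap(a, b string) (bool, error) {
	pa, err := netip.ParsePrefix(a)
	if err != nil {
		return false, fmt.Errorf("parsing %q: %w", a, err)
	}
	pb, err := netip.ParsePrefix(b)
	if err != nil {
		return false, fmt.Errorf("parsing %q: %w", b, err)
	}
	return pa.Overlaps(pb), nil
}

func main() {
	// GlobalNet CIDRs taken from the reproduction steps.
	overlap, err := cidrsOverlap("242.0.0.0/16", "243.0.0.0/16")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("GlobalNet CIDRs overlap:", overlap)
}
```

This only confirms that the address ranges are disjoint; it says nothing about whether the Broker actually recorded these CIDRs for each cluster.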

Anything else we need to know?:

Environment:

  • Diagnose information (use subctl diagnose all):
Cluster 1 info
Aswin 🔥🔥🔥 $ subctl diagnose all --kubeconfig /Users/aswina/Downloads/sub1
Cluster "sub1"
 ✓ Checking Submariner support for the Kubernetes version
 ✓ Kubernetes version "v1.30.6" is supported

 ✗ Globalnet deployment detected - checking that globalnet CIDRs do not overlap
 ✗ Error getting the Broker's REST config: error getting auth rest config: Get "https://9.66.245.122:6443/apis/submariner.io/v1/namespaces/submariner-k8s-broker/clusters/any": tls: failed to verify certificate: x509: "kube-apiserver" certificate is not trusted

 ⚠ Checking Submariner support for the CNI network plugin
 ⚠ Submariner could not detect the CNI network plugin and is using ("generic") plugin. It may or may not work.
 ✓ Checking gateway connections
 ✗ Checking route agent connections
 ✗ Connection to cluster "cluster2" is not established. Connection details:
{
  "status": "error",
  "statusMessage": "Failed to successfully ping the remote endpoint IP \"243.0.255.254\"",
  "spec": {
    "cluster_id": "cluster2",
    "cable_name": "submariner-cable-cluster2-10-21-82-227",
    "healthCheckIP": "243.0.255.254",
    "hostname": "sub2-worker-1.fyre.ibm.com",
    "subnets": [
      "243.0.0.0/16"
    ],
    "private_ip": "10.21.82.227",
    "public_ip": "129.41.87.3",
    "nat_enabled": true,
    "backend": "libreswan",
    "backend_config": {
      "natt-discovery-port": "4490",
      "preferred-server": "false",
      "udp-port": "4500"
    }
  }
}
 ✗ Connection to cluster "cluster2" is not established. Connection details:
{
  "status": "error",
  "statusMessage": "Failed to successfully ping the remote endpoint IP \"243.0.255.254\"",
  "spec": {
    "cluster_id": "cluster2",
    "cable_name": "submariner-cable-cluster2-10-21-82-227",
    "healthCheckIP": "243.0.255.254",
    "hostname": "sub2-worker-1.fyre.ibm.com",
    "subnets": [
      "243.0.0.0/16"
    ],
    "private_ip": "10.21.82.227",
    "public_ip": "129.41.87.3",
    "nat_enabled": true,
    "backend": "libreswan",
    "backend_config": {
      "natt-discovery-port": "4490",
      "preferred-server": "false",
      "udp-port": "4500"
    }
  }
}
 ✗ Connection to cluster "cluster2" is not established. Connection details:
{
  "status": "error",
  "statusMessage": "Failed to successfully ping the remote endpoint IP \"243.0.255.254\"",
  "spec": {
    "cluster_id": "cluster2",
    "cable_name": "submariner-cable-cluster2-10-21-82-227",
    "healthCheckIP": "243.0.255.254",
    "hostname": "sub2-worker-1.fyre.ibm.com",
    "subnets": [
      "243.0.0.0/16"
    ],
    "private_ip": "10.21.82.227",
    "public_ip": "129.41.87.3",
    "nat_enabled": true,
    "backend": "libreswan",
    "backend_config": {
      "natt-discovery-port": "4490",
      "preferred-server": "false",
      "udp-port": "4500"
    }
  }
}
 ✓ Checking Submariner support for the kube-proxy mode
 ✓ The kube-proxy mode is supported
 ✗ Checking that firewall configuration allows intra-cluster VXLAN traffic
 ✗ The tcpdump output from the sniffer pod does not contain the expected remote endpoint IP 243.0.0.0. Please check that your firewall configuration allows UDP/4800 traffic. Actual pod output:
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vx-submariner, link-type EN10MB (Ethernet), snapshot length 262144 bytes

0 packets captured
0 packets received by filter
0 packets dropped by kernel

 ✓ Checking that Globalnet is correctly configured and functioning

 ✓ Checking that services have been exported properly

Skipping inter-cluster firewall check as it requires two kubeconfigs. Please run "subctl diagnose firewall inter-cluster" command manually.

subctl version: v0.19.0

Aswin 🔥🔥🔥 $
Cluster 2 info
Aswin 🔥🔥🔥 $ subctl diagnose all --kubeconfig /Users/aswina/Downloads/sub2
Cluster "sub2"
 ✓ Checking Submariner support for the Kubernetes version
 ✓ Kubernetes version "v1.30.6" is supported

 ✗ Globalnet deployment detected - checking that globalnet CIDRs do not overlap
 ✗ Error getting the Broker's REST config: error getting auth rest config: Get "https://9.66.245.122:6443/apis/submariner.io/v1/namespaces/submariner-k8s-broker/clusters/any": tls: failed to verify certificate: x509: "kube-apiserver" certificate is not trusted

 ⚠ Checking Submariner support for the CNI network plugin
 ⚠ Submariner could not detect the CNI network plugin and is using ("generic") plugin. It may or may not work.
 ✓ Checking gateway connections
 ✗ Checking route agent connections
 ✗ Connection to cluster "cluster1" is not established. Connection details:
{
  "status": "error",
  "statusMessage": "Failed to successfully ping the remote endpoint IP \"242.0.255.254\"",
  "spec": {
    "cluster_id": "cluster1",
    "cable_name": "submariner-cable-cluster1-10-21-101-75",
    "healthCheckIP": "242.0.255.254",
    "hostname": "sub1-worker-1.fyre.ibm.com",
    "subnets": [
      "242.0.0.0/16"
    ],
    "private_ip": "10.21.101.75",
    "public_ip": "129.41.87.4",
    "nat_enabled": true,
    "backend": "libreswan",
    "backend_config": {
      "natt-discovery-port": "4490",
      "preferred-server": "false",
      "udp-port": "4500"
    }
  }
}
 ✗ Connection to cluster "cluster1" is not established. Connection details:
{
  "status": "error",
  "statusMessage": "Failed to successfully ping the remote endpoint IP \"242.0.255.254\"",
  "spec": {
    "cluster_id": "cluster1",
    "cable_name": "submariner-cable-cluster1-10-21-101-75",
    "healthCheckIP": "242.0.255.254",
    "hostname": "sub1-worker-1.fyre.ibm.com",
    "subnets": [
      "242.0.0.0/16"
    ],
    "private_ip": "10.21.101.75",
    "public_ip": "129.41.87.4",
    "nat_enabled": true,
    "backend": "libreswan",
    "backend_config": {
      "natt-discovery-port": "4490",
      "preferred-server": "false",
      "udp-port": "4500"
    }
  }
}
 ✗ Connection to cluster "cluster1" is not established. Connection details:
{
  "status": "error",
  "statusMessage": "Failed to successfully ping the remote endpoint IP \"242.0.255.254\"",
  "spec": {
    "cluster_id": "cluster1",
    "cable_name": "submariner-cable-cluster1-10-21-101-75",
    "healthCheckIP": "242.0.255.254",
    "hostname": "sub1-worker-1.fyre.ibm.com",
    "subnets": [
      "242.0.0.0/16"
    ],
    "private_ip": "10.21.101.75",
    "public_ip": "129.41.87.4",
    "nat_enabled": true,
    "backend": "libreswan",
    "backend_config": {
      "natt-discovery-port": "4490",
      "preferred-server": "false",
      "udp-port": "4500"
    }
  }
}
 ✓ Checking Submariner support for the kube-proxy mode
 ✓ The kube-proxy mode is supported
 ✗ Checking that firewall configuration allows intra-cluster VXLAN traffic
 ✗ The tcpdump output from the sniffer pod does not contain the expected remote endpoint IP 242.0.0.0. Please check that your firewall configuration allows UDP/4800 traffic. Actual pod output:
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vx-submariner, link-type EN10MB (Ethernet), snapshot length 262144 bytes

0 packets captured
0 packets received by filter
0 packets dropped by kernel

 ✓ Checking that Globalnet is correctly configured and functioning

 ✓ Checking that services have been exported properly

Skipping inter-cluster firewall check as it requires two kubeconfigs. Please run "subctl diagnose firewall inter-cluster" command manually.

subctl version: v0.19.0
  • Gather information (use subctl gather):

sub1.zip

sub2.zip

  • Cloud provider or hardware configuration:

Kubernetes is installed on Ubuntu VMs.

OS INFO

PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
root@sub2-master:~# kubectl version
Client Version: v1.30.6
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.6
root@sub1-master:~# kubectl get networkpolicies --all-namespaces
No resources found
root@sub1-master:~#
@aswinayyolath aswinayyolath added the bug Something isn't working label Oct 31, 2024
@yboaron
Contributor

yboaron commented Nov 3, 2024

Thanks for reaching out @aswinayyolath.

A. As mentioned in the Slack discussion, the inter-cluster Libreswan tunnel is up and communication between the gateway (GW) nodes works, while communication from a non-GW node to a GW node fails.

Further datapath investigation is needed here. I assume that for some reason (perhaps an infrastructure firewall or connection tracking) the ingress packet is being dropped on the gwnode@clusterX to nongwnode@clusterX segment.

Can you please run ping from a non-GW node@sub1 to gw-node@sub2 (for the gw-node@sub2 IP address, use the endpoint healthcheck IP, 243.0.255.254) and run tcpdump on both the GW node and the non-GW node in cluster sub1?

B. Also, unrelated to the datapath issue, I noticed that Submariner detected the CNI as generic instead of flannel. Submariner uses this code to discover network details for the flannel CNI.
Can you please share the list of DaemonSets in the kube-system namespace?
If one of those DaemonSets has a name containing the string 'flannel', please also share its volume list (and, if a volume/configmap with a name containing the substring 'flannel' exists, share its content).

@yboaron yboaron added flannel flannel CNI datapath Datapath related issues or enhancements labels Nov 3, 2024
@yboaron yboaron added this to Backlog Nov 3, 2024
@github-project-automation github-project-automation bot moved this to Backlog in Backlog Nov 3, 2024
@aswinayyolath
Author

Please note: I had to create a new cluster because I broke the old one while trying various things.

Endpoint health check IP for the gateway node in sub2 : 243.0.255.254

kubectl get endpoint cluster2-submariner-cable-cluster2-10-21-3-236 -n submariner-operator -o jsonpath='{.spec.healthCheckIP}'

243.0.255.254

GW node of cluster 1

root@st-1-master:~# kubectl get nodes -l submariner.io/gateway=true -o wide
NAME                         STATUS   ROLES    AGE   VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
st-1-worker-1.fyre.ibm.com   Ready    <none>   10h   v1.30.6   10.21.68.239   <none>        Ubuntu 22.04.4 LTS   5.15.0-118-generic   containerd://1.7.22
root@st-1-master:~#

Ping Test from non-GW node (sub1) to GW node (sub2)

root@st-1-worker-3:~# ping 243.0.255.254
PING 243.0.255.254 (243.0.255.254) 56(84) bytes of data.
From 10.244.3.0 icmp_seq=1 Destination Host Unreachable
From 10.244.3.0 icmp_seq=2 Destination Host Unreachable
From 10.244.3.0 icmp_seq=3 Destination Host Unreachable
From 10.244.3.0 icmp_seq=4 Destination Host Unreachable
From 10.244.3.0 icmp_seq=5 Destination Host Unreachable
From 10.244.3.0 icmp_seq=6 Destination Host Unreachable
From 10.244.3.0 icmp_seq=7 Destination Host Unreachable
From 10.244.3.0 icmp_seq=8 Destination Host Unreachable
From 10.244.3.0 icmp_seq=9 Destination Host Unreachable
^C
--- 243.0.255.254 ping statistics ---
10 packets transmitted, 0 received, +9 errors, 100% packet loss, time 9214ms
pipe 4
root@st-1-worker-3:~#

Ping Test from GW node (sub1) to GW node (sub2)

root@st-1-worker-1:~# ping 243.0.255.254
PING 243.0.255.254 (243.0.255.254) 56(84) bytes of data.
64 bytes from 243.0.255.254: icmp_seq=1 ttl=64 time=0.931 ms
64 bytes from 243.0.255.254: icmp_seq=2 ttl=64 time=0.918 ms
64 bytes from 243.0.255.254: icmp_seq=3 ttl=64 time=0.720 ms
64 bytes from 243.0.255.254: icmp_seq=4 ttl=64 time=0.899 ms
64 bytes from 243.0.255.254: icmp_seq=5 ttl=64 time=1.09 ms
^C
--- 243.0.255.254 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4008ms
rtt min/avg/max/mdev = 0.720/0.911/1.091/0.117 ms
root@st-1-worker-1:~#

Capture Traffic on GW Node and non-GW Node (sub1) with tcpdump

image

Results

Ping Test from Non-GW Node

  • The non-GW node in cluster1 tried to ping the health check IP 243.0.255.254 (GW node in cluster2).
  • The ping failed with Destination Host Unreachable, so the non-GW node could not reach 243.0.255.254 😔.

TCPDump on Non-GW Node

  • I started a tcpdump on the non-GW node to capture any traffic related to 243.0.255.254.
  • No ICMP requests or replies appear in the output, which indicates that either the packets are not being sent from this node or they are being dropped somewhere along the path.

TCPDump on GW Node in cluster1

  • On the GW node in cluster1, the tcpdump shows multiple ICMP echo replies from 243.0.255.254 to 242.0.0.1 (possibly another internal address within cluster1; I am not sure).
  • It looks like the GW node in cluster1 is receiving traffic from the gateway in cluster2, but it is not successfully reaching, or responding to, the non-gateway node.

@aswinayyolath
Author

DaemonSet List:

NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-proxy   4         4         4       4            4           kubernetes.io/os=linux   11h

Checked Pods

NAME                                               READY   STATUS    RESTARTS   AGE
coredns-55cb58b774-5xgj7                           1/1     Running   0          11h
coredns-55cb58b774-rs6ln                           1/1     Running   0          11h
etcd-st-1-master.fyre.ibm.com                      1/1     Running   0          11h
kube-apiserver-st-1-master.fyre.ibm.com            1/1     Running   0          11h
kube-controller-manager-st-1-master.fyre.ibm.com   1/1     Running   0          11h
kube-proxy-hrqjb                                   1/1     Running   0          11h
kube-proxy-htzg6                                   1/1     Running   0          11h
kube-proxy-rd267                                   1/1     Running   0          11h

CNI Configuration

root@st-1-master:~# cat /etc/cni/net.d/10-flannel.conflist
{
  "name": "cbr0",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}

@yboaron
Contributor

yboaron commented Nov 4, 2024

Is there flannel daemonset in another namespace?

@aswinayyolath
Author

Yes

root@st-1-master:~# kubectl get daemonset -A
NAMESPACE             NAME                       DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                AGE
kube-flannel          kube-flannel-ds            4         4         4       4            4           <none>                       14h
kube-system           kube-proxy                 4         4         4       4            4           kubernetes.io/os=linux       14h
submariner-operator   submariner-gateway         1         1         1       1            1           submariner.io/gateway=true   14h
submariner-operator   submariner-globalnet       1         1         1       1            1           submariner.io/gateway=true   14h
submariner-operator   submariner-metrics-proxy   1         1         1       1            1           submariner.io/gateway=true   14h
submariner-operator   submariner-routeagent      4         4         4       4            4           <none>                       14h
root@st-1-master:~#

@aswinayyolath
Author

The kube-flannel-ds DaemonSet has the following volumes

volumes:
- name: run
  hostPath:
    path: /run/flannel
- name: cni-plugin
  hostPath:
    path: /opt/cni/bin
- name: cni
  hostPath:
    path: /etc/cni/net.d
- name: flannel-cfg
  configMap:
    name: kube-flannel-cfg
- name: xtables-lock
  hostPath:
    path: /run/xtables.lock
    type: FileOrCreate

CM details

root@st-1-master:~# kubectl get configmap kube-flannel-cfg -n kube-flannel -o yaml
apiVersion: v1
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "EnableNFTables": false,
      "Backend": {
        "Type": "vxlan"
      }
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"cni-conf.json":"{\n  \"name\": \"cbr0\",\n  \"cniVersion\": \"0.3.1\",\n  \"plugins\": [\n    {\n      \"type\": \"flannel\",\n      \"delegate\": {\n        \"hairpinMode\": true,\n        \"isDefaultGateway\": true\n      }\n    },\n    {\n      \"type\": \"portmap\",\n      \"capabilities\": {\n        \"portMappings\": true\n      }\n    }\n  ]\n}\n","net-conf.json":"{\n  \"Network\": \"10.244.0.0/16\",\n  \"EnableNFTables\": false,\n  \"Backend\": {\n    \"Type\": \"vxlan\"\n  }\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"labels":{"app":"flannel","k8s-app":"flannel","tier":"node"},"name":"kube-flannel-cfg","namespace":"kube-flannel"}}
  creationTimestamp: "2024-11-03T16:24:55Z"
  labels:
    app: flannel
    k8s-app: flannel
    tier: node
  name: kube-flannel-cfg
  namespace: kube-flannel
  resourceVersion: "282"
  uid: f0058e4b-4ba9-49be-a759-fd0c9843a88d

@yboaron
Contributor

yboaron commented Nov 4, 2024

Thanks for the information,

Regarding flannel discovery, it looks like we need to update the flannel discovery code.
Maybe we should list DaemonSets in all namespaces and filter on the k8s-app=flannel label.

QQ: does **kubectl get ds -A -l k8s-app=flannel** return the flannel DaemonSet?

Could you please open a new issue for flannel CNI discovery and attach the relevant information? We welcome any code contribution here :-).
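The suggested change (look in every namespace and match on the k8s-app=flannel label rather than on the name) can be sketched in Go as below. This is only an illustration under stated assumptions: the daemonSet struct is a hypothetical stand-in for the few fields of a Kubernetes DaemonSet that matter here, findByLabel is a made-up helper, and the sample data is illustrative; none of it is Submariner's actual discovery code.

```go
package main

import "fmt"

// daemonSet is a hypothetical stand-in for the DaemonSet fields used here.
type daemonSet struct {
	Namespace string
	Name      string
	Labels    map[string]string
}

// findByLabel returns the index of the first DaemonSet, in any namespace,
// whose labels contain key=value, or -1 when none matches.
func findByLabel(items []daemonSet, key, value string) int {
	for i := range items {
		if items[i].Labels[key] == value {
			return i
		}
	}
	return -1
}

func main() {
	// Illustrative sample data; the kube-proxy label is an assumption.
	items := []daemonSet{
		{Namespace: "kube-system", Name: "kube-proxy", Labels: map[string]string{"k8s-app": "kube-proxy"}},
		{Namespace: "kube-flannel", Name: "kube-flannel-ds", Labels: map[string]string{"k8s-app": "flannel"}},
	}

	if i := findByLabel(items, "k8s-app", "flannel"); i >= 0 {
		fmt.Printf("flannel DaemonSet: %s/%s\n", items[i].Namespace, items[i].Name)
	} else {
		fmt.Println("no flannel DaemonSet found; falling back to the generic CNI")
	}
}
```

Matching on the label instead of the namespace is what lets a DaemonSet in kube-flannel be found even though the kube-system namespace holds only kube-proxy.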

As for the datapath issue: traffic initiated on a non-GW node@clusterA towards a remote cluster is encapsulated in VXLAN (port 4800, interface vx-submariner) and sent to the GW node@clusterA, which should forward it to the remote cluster's GW.

Can you double-check (maybe using tcpdump -pi ) that no packet is sent from the non-GW node? I can see that on the GW node the iptables (filter table) packet counter for input traffic on the vx-submariner interface is > 0; see [1].

[1]
Chain SUBMARINER-INPUT (1 references)
num   pkts bytes target     prot opt in     out     source               destination
1      952 74256 ACCEPT     17   --  *      *       0.0.0.0/0            0.0.0.0/0            udp dpt:4800

@aswinayyolath
Author

QQ: does kubectl get ds -A -l k8s-app=flannel return flannel ds ?
Ans: Yes

root@st-1-master:~# kubectl get ds -A -l k8s-app=flannel
NAMESPACE      NAME              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
kube-flannel   kube-flannel-ds   4         4         4       4            4           <none>          17h
root@st-1-master:~#

I will report a new issue and see if I can contribute (I guess changes should be relatively small) ...

Packet Transmission on the Non-GW Node in sub1 (ClusterA)

  • I pinged one of the nodes in Cluster B from a non-GW node of sub1 (Cluster A)
root@st-1-worker-3:~# ping 9.46.96.194
PING 9.46.96.194 (9.46.96.194) 56(84) bytes of data.
64 bytes from 9.46.96.194: icmp_seq=1 ttl=63 time=0.752 ms
64 bytes from 9.46.96.194: icmp_seq=2 ttl=63 time=0.725 ms
64 bytes from 9.46.96.194: icmp_seq=3 ttl=63 time=0.771 ms
^C
--- 9.46.96.194 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 0.725/0.749/0.771/0.018 ms
root@st-1-worker-3:~#

Run Packet Capture on the Non-Gateway Node in ClusterA

root@st-1-worker-3:~# sudo tcpdump -i vx-submariner port 4800
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vx-submariner, link-type EN10MB (Ethernet), snapshot length 262144 bytes

Verified Reception on the Gateway Node in ClusterA

root@st-1-master:~# sudo iptables -t filter -L SUBMARINER-INPUT -v -n
Chain SUBMARINER-INPUT (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0
root@st-1-master:~#

Is this what you wanted me to do? I am not 100% sure.

@aswinayyolath
Author

I have created a new issue: #3210. A draft change has been pushed here: #3268.

@yboaron, I haven't yet looked into linting, unit tests, or e2e testing; I'm just checking whether the changes should look something like this (draft linked above). I also modified the loop structure from

       for k := range daemonsets.Items {
               if strings.Contains(daemonsets.Items[k].Name, "flannel") {
                       volumes = daemonsets.Items[k].Spec.Template.Spec.Volumes

to

       for _, ds := range daemonsets.Items {
               if strings.Contains(ds.Name, "flannel") {
                       flannelDaemonSet = &ds
                       volumes = ds.Spec.Template.Spec.Volumes
                       break
               }
       }

to enhance code readability and clarity. I think this approach makes it clear that ds represents a DaemonSet object, eliminating the need for indexing. Additionally, by storing a pointer to the DaemonSet that was found and breaking out of the loop as soon as it is found, I believe the code avoids unnecessary iterations and reduces the risk of errors associated with accessing elements via an index.
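One point that may be worth checking when comparing the two loop styles: in Go, ranging by value copies each element, so a pointer taken from the loop variable refers to that copy rather than to the element stored in the slice. The sketch below illustrates the difference with a hypothetical daemonSet struct and two made-up helpers; it is not the Submariner code, and whether the copy matters depends on how the stored pointer is used afterwards.

```go
package main

import (
	"fmt"
	"strings"
)

// daemonSet is a hypothetical stand-in for a Kubernetes DaemonSet.
type daemonSet struct {
	Name string
}

// findByIndex returns a pointer to the matching element inside the slice.
func findByIndex(items []daemonSet, substr string) *daemonSet {
	for i := range items {
		if strings.Contains(items[i].Name, substr) {
			return &items[i]
		}
	}
	return nil
}

// findByValue returns a pointer to the loop variable, which holds a copy
// of the matching element, not the element stored in the slice.
func findByValue(items []daemonSet, substr string) *daemonSet {
	for _, ds := range items {
		if strings.Contains(ds.Name, substr) {
			return &ds
		}
	}
	return nil
}

func main() {
	items := []daemonSet{{Name: "kube-proxy"}, {Name: "kube-flannel-ds"}}

	// Writing through the value-range pointer leaves the slice untouched.
	findByValue(items, "flannel").Name = "changed-via-copy"
	fmt.Println("after value-range write:", items[1].Name)

	// Writing through the index-based pointer updates the slice element.
	findByIndex(items, "flannel").Name = "changed-in-place"
	fmt.Println("after index-based write:", items[1].Name)
}
```

For read-only use, such as reading the volume list, both styles return the same data; the index-based form simply avoids copying each element while iterating.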

@yboaron
Contributor

yboaron commented Nov 4, 2024

Submariner only handles egress routing, and only for packets destined for remote clusters (the destination IP belongs to the remote pod or service CIDRs; in your case, the remote cluster's GlobalNet CIDR). Please run tcpdump while pinging the remote endpoint healthcheck IP address.

@aswinayyolath
Copy link
Author

image

@yboaron
Contributor

yboaron commented Nov 5, 2024

Can you try running tcpdump -vv -penni any | grep -i icmp on nongw node and check if you get anything ?

@aswinayyolath
Author

I am seeing a lot of output from tcpdump -vv -penni any | grep -i icmp, but I don't really understand it.

group record(s) [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:fffc:d5dc to_in { }] [gaddr ff02::1:fff3:9969 to_ex { }]
01:50:24.775041 eth0  M   ifindex 2 00:00:0a:15:47:c4 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:fffc:d5dc to_in { }] [gaddr ff02::1:fff3:9969 to_ex { }]
01:50:24.775188 eth0  M   ifindex 2 00:00:0a:15:48:2a ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:fffc:d5dc to_in { }] [gaddr ff02::1:fff3:9969 to_ex { }]
01:50:24.776386 eth0  M   ifindex 2 00:00:0a:15:41:6d ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:fffc:d5dc to_in { }] [gaddr ff02::1:fff3:9969 to_ex { }]
01:50:24.804302 eth0  M   ifindex 2 00:00:0a:15:50:4b ethertype IPv6 (0x86dd), length 116: (hlim 1, next-header Options (0) payload length: 56) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 2 group record(s) [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:fff3:9969 to_ex { }]
01:50:24.823240 eth0  M   ifindex 2 00:00:0a:15:49:bb ethertype IPv6 (0x86dd), length 92: (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:fff3:9969: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::d80a:bb86:85f3:9969
01:50:24.824565 eth0  M   ifindex 2 00:00:0a:15:44:ac ethertype IPv6 (0x86dd), length 92: (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::d80a:bb86:85f3:9969 > ff02::1: [icmp6 sum ok] ICMP6, neighbor advertisement, length 32, tgt is fe80::d80a:bb86:85f3:9969, Flags [override]
01:50:24.826553 eth0  M   ifindex 2 00:00:0a:15:49:d2 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.826709 eth0  M   ifindex 2 00:00:0a:15:4e:24 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.826887 eth0  M   ifindex 2 00:00:0a:15:42:a0 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.827218 eth0  M   ifindex 2 00:00:0a:15:4d:17 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.827280 eth0  M   ifindex 2 00:00:0a:15:4f:48 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.827483 eth0  M   ifindex 2 00:00:0a:15:47:f3 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.827632 eth0  M   ifindex 2 00:00:0a:15:40:a5 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.827839 eth0  M   ifindex 2 00:00:0a:15:43:25 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.828020 eth0  M   ifindex 2 00:00:0a:15:49:c3 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.828150 eth0  M   ifindex 2 00:00:0a:15:49:bb ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.828307 eth0  M   ifindex 2 00:00:0a:15:51:c2 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.828473 eth0  M   ifindex 2 00:00:0a:15:47:bd ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.828628 eth0  M   ifindex 2 00:00:0a:15:41:d2 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.828787 eth0  M   ifindex 2 00:00:0a:15:4b:6a ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.828940 eth0  M   ifindex 2 00:00:0a:15:4c:be ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.829140 eth0  M   ifindex 2 00:00:0a:15:4d:f4 ethertype IPv6 (0x86dd), length 156: (hlim 1, next-header Options (0) payload length: 96) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 4 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::fb to_ex { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.829286 eth0  M   ifindex 2 00:00:0a:15:47:be ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.829430 eth0  M   ifindex 2 00:00:0a:15:50:a7 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.829591 eth0  M   ifindex 2 00:00:0a:15:46:90 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.829780 eth0  M   ifindex 2 00:00:0a:15:48:2a ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.829939 eth0  M   ifindex 2 00:00:0a:15:47:c4 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.830090 eth0  M   ifindex 2 00:00:0a:15:49:27 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.830245 eth0  M   ifindex 2 00:00:0a:15:43:34 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.830411 eth0  M   ifindex 2 00:00:0a:15:50:4b ethertype IPv6 (0x86dd), length 116: (hlim 1, next-header Options (0) payload length: 56) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 2 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.830600 eth0  M   ifindex 2 00:00:0a:15:41:6d ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffd6:f6f8 to_in { }] [gaddr ff02::1:ff8e:4640 to_ex { }]
01:50:24.831228 eth0  M   ifindex 2 00:00:0a:15:49:bb ethertype IPv6 (0x86dd), length 92: (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:ff8e:4640: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::3874:f6a7:af8e:4640
01:50:24.832144 eth0  M   ifindex 2 00:00:0a:15:45:7e ethertype IPv6 (0x86dd), length 92: (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::3874:f6a7:af8e:4640 > ff02::1: [icmp6 sum ok] ICMP6, neighbor advertisement, length 32, tgt is fe80::3874:f6a7:af8e:4640, Flags [override]
01:50:24.834225 eth0  M   ifindex 2 00:00:0a:15:49:c3 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.834496 eth0  M   ifindex 2 00:00:0a:15:47:bd ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.834685 eth0  M   ifindex 2 00:00:0a:15:4d:f4 ethertype IPv6 (0x86dd), length 156: (hlim 1, next-header Options (0) payload length: 96) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 4 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::fb to_ex { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.835124 eth0  M   ifindex 2 00:00:0a:15:4e:24 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.835278 eth0  M   ifindex 2 00:00:0a:15:49:d2 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.835421 eth0  M   ifindex 2 00:00:0a:15:43:34 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.835572 eth0  M   ifindex 2 00:00:0a:15:49:27 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.835723 eth0  M   ifindex 2 00:00:0a:15:42:a0 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.835916 eth0  M   ifindex 2 00:00:0a:15:4d:17 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.836052 eth0  M   ifindex 2 00:00:0a:15:4f:48 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.836226 eth0  M   ifindex 2 00:00:0a:15:47:f3 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.836429 eth0  M   ifindex 2 00:00:0a:15:40:a5 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.836550 eth0  M   ifindex 2 00:00:0a:15:43:25 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.836695 eth0  M   ifindex 2 00:00:0a:15:51:c2 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.836837 eth0  M   ifindex 2 00:00:0a:15:49:bb ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.836963 eth0  M   ifindex 2 00:00:0a:15:41:d2 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.837107 eth0  M   ifindex 2 00:00:0a:15:4b:6a ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.837246 eth0  M   ifindex 2 00:00:0a:15:4c:be ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.837370 eth0  M   ifindex 2 00:00:0a:15:47:be ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.837555 eth0  M   ifindex 2 00:00:0a:15:50:a7 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.837738 eth0  M   ifindex 2 00:00:0a:15:46:90 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.837859 eth0  M   ifindex 2 00:00:0a:15:48:2a ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.838279 eth0  M   ifindex 2 00:00:0a:15:50:4b ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.838393 eth0  M   ifindex 2 00:00:0a:15:47:c4 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.838520 eth0  M   ifindex 2 00:00:0a:15:41:6d ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:fff3:9969 to_in { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.840276 eth0  M   ifindex 2 00:00:0a:15:4d:f4 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::fb to_ex { }] [gaddr ff02::1:ffa3:b55e to_ex { }]
01:50:24.846137 eth0  M   ifindex 2 00:00:0a:15:47:f3 ethertype IPv6 (0x86dd), length 92: (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:ffa3:b55e: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::5dda:4834:24a3:b55e
01:50:24.847290 eth0  M   ifindex 2 00:00:0a:15:46:21 ethertype IPv6 (0x86dd), length 92: (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::5dda:4834:24a3:b55e > ff02::1: [icmp6 sum ok] ICMP6, neighbor advertisement, length 32, tgt is fe80::5dda:4834:24a3:b55e, Flags [override]
01:50:24.849025 eth0  M   ifindex 2 00:00:0a:15:49:c3 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ffa3:b55e to_in { }] [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:ff6f:6de1 to_ex { }]
01:50:24.849206 eth0  M   ifindex 2 00:00:0a:15:49:bb ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ffa3:b55e to_in { }] [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:ff6f:6de1 to_ex { }]
01:50:24.849409 eth0  M   ifindex 2 00:00:0a:15:47:bd ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ffa3:b55e to_in { }] [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:ff6f:6de1 to_ex { }]
01:50:24.849632 eth0  M   ifindex 2 00:00:0a:15:4d:f4 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ffa3:b55e to_in { }] [gaddr ff02::fb to_ex { }] [gaddr ff02::1:ff6f:6de1 to_ex { }]
01:50:24.849776 eth0  M   ifindex 2 00:00:0a:15:4e:24 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ffa3:b55e to_in { }] [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:ff6f:6de1 to_ex { }]
01:50:24.849911 eth0  M   ifindex 2 00:00:0a:15:42:a0 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ffa3:b55e to_in { }] [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:ff6f:6de1 to_ex { }]
01:50:24.850069 eth0  M   ifindex 2 00:00:0a:15:4d:17 ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ffa3:b55e to_in { }] [gaddr ff02::1:ff8e:4640 to_in { }] [gaddr ff02::1:ff6f:6de1 to_ex { }]
01:50:24.850233 eth0  M   ifindex 2 00:00:0a:15:47:be ethertype IPv6 (0x86dd), length 136: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3

@yboaron
Contributor

yboaron commented Nov 5, 2024

Hmm, it's ICMP/IPv6 traffic. Don't you get any ICMP/IPv4 traffic (tcpdump -penni any -vv | grep -i icmp | grep IPv4)?

@aswinayyolath
Author

root@st-1-worker-3:~# tcpdump -penni any -vv | grep -i icmp | grep IPv4
tcpdump: data link type LINUX_SLL2
tcpdump: listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
20:03:40.206272 lo    In  ifindex 1 00:00:00:00:00:00 ethertype IPv4 (0x0800), length 100: (tos 0xc0, ttl 64, id 12508, offset 0, flags [none], proto ICMP (1), length 80)
20:04:40.206323 lo    In  ifindex 1 00:00:00:00:00:00 ethertype IPv4 (0x0800), length 100: (tos 0xc0, ttl 64, id 18713, offset 0, flags [none], proto ICMP (1), length 80)
20:05:40.206530 lo    In  ifindex 1 00:00:00:00:00:00 ethertype IPv4 (0x0800), length 100: (tos 0xc0, ttl 64, id 24579, offset 0, flags [none], proto ICMP (1), length 80)
20:06:40.206287 lo    In  ifindex 1 00:00:00:00:00:00 ethertype IPv4 (0x0800), length 100: (tos 0xc0, ttl 64, id 27221, offset 0, flags [none], proto ICMP (1), length 80)
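
All four IPv4 ICMP packets above were captured on lo, roughly 60 seconds apart, which suggests none of them was headed to the remote cluster. For a long capture, a small script can make this kind of check quicker. The following is a minimal sketch, not part of Submariner or tcpdump: it tallies ICMP lines per interface and address family from saved `tcpdump -penni any -vv` text output. The function name `summarize_icmp` and the regular expression are assumptions based only on the line layout shown in this thread.

```python
import re
from collections import Counter

# Header of a `tcpdump -e -nn -i any` text line, as seen in this thread:
# <timestamp> <iface> <direction> ifindex <n> <mac> ethertype IPv4|IPv6 ...
LINE_RE = re.compile(
    r"^\S+\s+(?P<iface>\S+)\s+\S+\s+ifindex\s+\d+\s+\S+\s+ethertype\s+(?P<family>IPv[46])"
)


def summarize_icmp(lines):
    """Count ICMP packets per (interface, family) in tcpdump text output."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # banner lines, truncated lines, etc.
        if match["family"] == "IPv4" and "proto ICMP (1)" in line:
            counts[(match["iface"], "ICMPv4")] += 1
        elif match["family"] == "IPv6" and "ICMP6" in line:
            counts[(match["iface"], "ICMPv6")] += 1
    return counts
```

Calling `summarize_icmp(open("capture.txt"))` on a saved capture (the file name is a placeholder) returns a Counter keyed by interface and family. If ICMPv4 only ever shows up under lo, the pings are not reaching the tunnel or the physical interface.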

@yboaron
Contributor

yboaron commented Nov 12, 2024

Hmm, strange, I can't see any IPv4 ICMP being sent to the remote cluster.
If you still have the env available, could you please upload the subctl gather output?

@aswinayyolath
Author

I don't have the cluster with me 😔. But I will create one (in fact 2). @yboaron I would like to check with you whether the steps I am following are correct. Could you please review the steps here (https://kubernetes.slack.com/archives/C010RJV694M/p1730390398271589?thread_ts=1730390376.380879&cid=C010RJV694M) and let me know if I am missing anything?

@aswinayyolath
Author

aswinayyolath commented Nov 12, 2024

I would also like to test the same in AWS across 2 regions. I just want to know whether the steps I followed are correct. I will try them on the VMs I used before, and I will also create 2 EKS clusters in 2 different AWS regions and see if that works.

@yboaron
Contributor

yboaron commented Nov 12, 2024

I don't have the cluster with me 😔. But I will create one (in fact 2). @yboaron I would like to check with you if the steps I am following is correct or not. Could you please review the Steps here (https://kubernetes.slack.com/archives/C010RJV694M/p1730390398271589?thread_ts=1730390376.380879&cid=C010RJV694M) and let me know If I am missing anything please?

Yep, looks fine.

Can you try reinstalling without adding the --globalnet-cidr 242.0.0.0/16 flag to the subctl join command for both clusters?

@rohan-anilkumar

rohan-anilkumar commented Nov 25, 2024

Hello @yboaron. Since @aswinayyolath is busy with some other tasks, I'm looking at this issue. We're on the same team working on the same project.

Since our K8s clusters have the same CIDRs, we cannot run Submariner without Globalnet. To work around this, we created an AWS account and tried to run Submariner on EKS.
We followed this tutorial to set up the AWS EKS control plane: https://www.youtube.com/watch?v=0bUEKcjC_jM&t=261s
And we followed this tutorial to set up Submariner on AWS: https://www.youtube.com/watch?v=fMhZRNn0fxQ&t=5s

But this does not work, and running diagnostics gives the following output:

rohananilkumar@Rohans-MacBook-Pro .kube % subctl diagnose all --kubeconfig config-str-aws
Cluster "arn:aws:eks:eu-north-1:<SNIPPED>:cluster/stretch-1"
 ✓ Checking Submariner support for the Kubernetes version
 ✓ Kubernetes version "v1.31.2-eks-7f9249a" is supported

 ✗ Non-Globalnet deployment detected - checking that cluster CIDRs do not overlap
 ✗ Error getting the Broker's REST config: error getting auth rest config: Get "https://<SNIPPED>.eu-north-1.eks.amazonaws.com/apis/submariner.io/v1/namespaces/submariner-k8s-broker/clusters/any": tls: failed to verify certificate: x509: “kube-apiserver” certificate is not trusted

 ⚠ Checking Submariner support for the CNI network plugin
 ⚠ Submariner could not detect the CNI network plugin and is using ("generic") plugin. It may or may not work.
 ✗ Checking gateway connections
 ✗ There are no active connections on gateway "ip-<SNIPPED>.eu-north-1.compute.internal"
 ✓ Checking Submariner support for the kube-proxy mode
 ✓ The kube-proxy mode is supported
 ✗ Checking that firewall configuration allows intra-cluster VXLAN traffic
 ✗ Unable to obtain a remote endpoint: endpoints.submariner.io "remote Endpoint" not found

 ✓ Checking that services have been exported properly

Skipping inter-cluster firewall check as it requires two kubeconfigs. Please run "subctl diagnose firewall inter-cluster" command manually.

subctl version: v0.18.0
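
In the output above, the CIDR overlap check fails with a TLS trust error while fetching the Broker's REST config, so it says nothing about the CIDRs themselves. The comparison can be done by hand. Below is a rough sketch using Python's standard `ipaddress` module. The function name `overlapping_pairs` and the pod/service CIDR values in the usage note are hypothetical placeholders, not values taken from these clusters.

```python
import ipaddress
from itertools import combinations


def overlapping_pairs(cidrs_by_cluster):
    """Return (cluster_a, cidr_a, cluster_b, cidr_b) for every overlapping
    pair of CIDRs that belong to two different clusters."""
    nets = [
        (cluster, ipaddress.ip_network(cidr))
        for cluster, cidrs in cidrs_by_cluster.items()
        for cidr in cidrs
    ]
    return [
        (a_name, str(a_net), b_name, str(b_net))
        for (a_name, a_net), (b_name, b_net) in combinations(nets, 2)
        if a_name != b_name
        and a_net.version == b_net.version  # overlaps() rejects mixed v4/v6
        and a_net.overlaps(b_net)
    ]
```

For example, `overlapping_pairs({"cluster1": ["10.244.0.0/16"], "cluster2": ["10.244.0.0/16"]})` reports one overlapping pair, which is the situation that requires Globalnet. The Globalnet CIDRs from this issue, 242.0.0.0/16 and 243.0.0.0/16, produce an empty list.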

I suspect there is some issue with how the subnets are set up. What should I try next to get Submariner up and running on AWS?

@yboaron @Jaanki

@yboaron
Contributor

yboaron commented Dec 11, 2024

Maybe you can follow this link?

In case the deployment fails, please attach debug details from the clusters (subctl gather, subctl diagnose all).

Labels
bug Something isn't working datapath Datapath related issues or enhancements flannel flannel CNI
Projects
Status: In Progress
Development

No branches or pull requests

4 participants