Last week, my colleague Marc and I faced a timeout issue in an Istio service mesh. An idle PostgreSQL connection was shut down precisely one hour after it had been opened. During our investigation, I had to capture the network traffic entering and leaving the PostgreSQL client pod.
Ksniff
For this, I used Ksniff. This kubectl plugin tries to deploy a statically compiled tcpdump binary inside the pod whose traffic you want to capture, then streams the captured packets to a Wireshark instance running on your workstation. This plugin is just awesome, thanks a lot to its author Eldad Rudich!
Envoy traffic interception
Istio works by injecting an Envoy proxy sidecar container inside every pod, which intercepts inbound and outbound network traffic. So when a process inside your container communicates with an external server, there are in fact two TCP connections: one between the process and Envoy, and one between Envoy and the remote server. However, in Wireshark, I saw only the packets between Envoy and the remote server and the packets from Envoy to the process, but not the packets from the process to Envoy.
In the Wireshark screenshot below, 10.0.9.225 is the IP address of the process, 172.20.94.219 is the IP address of the virtual service the process communicates with, and 10.0.5.234 is the IP address of the remote (real) server backing the virtual service.
I was a bit surprised 🤔 So I looked into how Istio diverts network traffic through Envoy. It does so by adding iptables REDIRECT rules that send the traffic to port 15001 on localhost, which Envoy is listening on. Indeed, I could see it in Wireshark:
But then, how can Envoy know where to forward the traffic if all the traffic it receives is destined for 127.0.0.1:15001? Everything I thought I knew about IP networking was called into question! However, after some time spent looking for the answer, I finally found it!
When a connection matches an iptables REDIRECT rule, the kernel keeps track of its original destination, and the process that accepts the connection can read it through a socket option named SO_ORIGINAL_DST. Envoy just has to read this option to decide where to forward the traffic.
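To make this more concrete, here is a minimal Python sketch of how a proxy could read that option on an accepted connection. This is not Envoy's actual code (Envoy is written in C++), just an illustration under a few assumptions: it is Linux and IPv4 only, the constant value 80 comes from the kernel header linux/netfilter_ipv4.h because Python's socket module does not export it, and the function names parse_sockaddr_in and original_destination are mine.

```python
import socket
import struct
from typing import Tuple

# Value of SO_ORIGINAL_DST from <linux/netfilter_ipv4.h>. Python's socket
# module does not export this constant, so it is defined by hand here.
SO_ORIGINAL_DST = 80
SOCKADDR_IN_SIZE = 16  # sizeof(struct sockaddr_in)


def parse_sockaddr_in(raw: bytes) -> Tuple[str, int]:
    """Decode a raw struct sockaddr_in into an (ip, port) pair."""
    # Layout: 2-byte address family (skipped), 2-byte port and 4-byte
    # address in network byte order, then 8 bytes of padding.
    port, packed_ip = struct.unpack("!2xH4s8x", raw[:SOCKADDR_IN_SIZE])
    return socket.inet_ntoa(packed_ip), port


def original_destination(conn: socket.socket) -> Tuple[str, int]:
    """Return the destination the client originally dialed (Linux, IPv4).

    `conn` is expected to be a socket returned by accept() for a
    connection that went through an iptables REDIRECT rule.
    """
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, SOCKADDR_IN_SIZE)
    return parse_sockaddr_in(raw)
```

Called on a connection accepted by a listener bound to port 15001, original_destination(conn) should return the address the process was really trying to reach (the virtual service IP in my case) rather than 127.0.0.1:15001.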
Conclusion
We've seen how network traffic is redirected in an Istio service mesh using iptables REDIRECT rules and the SO_ORIGINAL_DST socket option.
Going back to our original issue, the investigation confirmed that the culprit was Envoy's idle timeout. From what I understand, it should be possible to configure it, but I haven't figured out how to do so yet.