kubespy trace: a real-time view into the heart of a Kubernetes Service
Why isn’t my Pod getting any traffic?
An experienced ops team running on GKE might assemble the following checklist to help answer this question:
- Does a Service exist? Does that Service have a .spec.selector that matches some number of Pods?
- Are the Pods alive, and have their readiness probes passed?
- Did the Service create an Endpoints object that specifies one or more Pods to direct traffic to?
- Is the Service reachable via DNS? When you kubectl exec into a Pod and use curl to poke the Service hostname, do you get a response? (If not, does any Service have a DNS entry?)
- Is the Service reachable via IP? When you SSH into a Node and use curl to poke the Service IP, do you get a response?
- Is kube-proxy up? Is it writing iptables rules? Is it proxying to the Service?
This question might have the highest complexity-to-sentence-length ratio of any question in the Kubernetes ecosystem. Unfortunately, it’s also a question that every user finds themselves asking at some point. And when they do, it usually means their app is down.
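The first two checks in the list above are mechanical enough to sketch in code. What follows is a minimal Python illustration, not kubespy's implementation. It assumes the Service and Pods have already been loaded as plain dicts (for example, parsed from kubectl get -o json) and live in the same namespace. The function names selector_matches, pod_is_ready, and triage are hypothetical.

```python
def selector_matches(service: dict, pod: dict) -> bool:
    """True if every key/value pair in the Service's .spec.selector is in the Pod's labels."""
    selector = service.get("spec", {}).get("selector") or {}
    if not selector:
        # A Service without a selector does not pick any Pods by label.
        return False
    labels = pod.get("metadata", {}).get("labels") or {}
    return all(labels.get(key) == value for key, value in selector.items())


def pod_is_ready(pod: dict) -> bool:
    """True if the Pod reports a condition of type Ready with status "True"."""
    conditions = pod.get("status", {}).get("conditions") or []
    return any(
        c.get("type") == "Ready" and c.get("status") == "True" for c in conditions
    )


def triage(service: dict, pods: list) -> dict:
    """Answer checks 1 and 2: which Pods match the selector, and which of those are ready."""
    matching = [p for p in pods if selector_matches(service, p)]
    return {
        "matching": [p["metadata"]["name"] for p in matching],
        "ready": [p["metadata"]["name"] for p in matching if pod_is_ready(p)],
    }


# Hand-written sample objects (assumed shapes, trimmed to the fields used above).
service = {"metadata": {"name": "nginx"}, "spec": {"selector": {"app": "nginx"}}}
pods = [
    {"metadata": {"name": "nginx-a", "labels": {"app": "nginx"}},
     "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
    {"metadata": {"name": "nginx-b", "labels": {"app": "nginx"}},
     "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
    {"metadata": {"name": "other", "labels": {"app": "redis"}},
     "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
]

print(triage(service, pods))
# {'matching': ['nginx-a', 'nginx-b'], 'ready': ['nginx-a']}
```

An empty "matching" list points at a selector or label mismatch; a non-empty "matching" list with an empty "ready" list points at failing readiness probes.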
To help answer questions like this, we’ve been developing a small diagnostic tool, kubespy. In this post we’ll look at the new kubespy trace command, which is broadly aimed at automating questions 1, 2, and 3, and providing “hints” about 4 and 5.
Below is a gif demonstrating the CLI experience. You can watch in
real time as the
Service comes online, finds Pods to target, and
finally is allocated a public IP address:
What is kubespy, again?
kubespy is a simple, standalone diagnostic tool, meant to make it easy
to introspect on Kubernetes resources in real time.
Before we begin, it’s worth noting that this tool
re-packages the machinery we developed for Kubernetes support in Pulumi.
One of our major goals in this work was to make deploying an application to Kubernetes as simple as possible, by presenting a concise summary of this information in the CLI experience. See my tweetstorm on the subject, or try it out for yourself!
A real-time view of a Service’s life
The kubespy repository contains the example app
we use in this demo. The repository also
contains detailed installation instructions, as well as an explanation of how
to run the app (using either kubectl or
pulumi, though of course we
hope you will try Pulumi).
Running kubespy trace service nginx will cause kubespy
to sit and wait for you to deploy a Service called
nginx. When you
run this example, it will do just that: create a Deployment that
replicates an nginx
Pod 3 times and expose it publicly to the
Internet with a
Service, also called nginx.
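To make the shape of such an app concrete, here is a hedged sketch of a Deployment and Service pair like the one described above, built as Python dicts and rendered to JSON. It is not the example from the kubespy repository: the app=nginx label, the nginx image tag, and port 80 are assumptions made for illustration, and the render helper is hypothetical.

```python
import json

# Assumed label; the Service selector must match the Pod template labels.
LABELS = {"app": "nginx"}

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "nginx"},
    "spec": {
        "replicas": 3,  # replicate the nginx Pod 3 times
        "selector": {"matchLabels": LABELS},
        "template": {
            "metadata": {"labels": LABELS},
            "spec": {
                "containers": [
                    # Image name and port are assumptions for this sketch.
                    {"name": "nginx", "image": "nginx", "ports": [{"containerPort": 80}]}
                ]
            },
        },
    },
}

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "nginx"},
    "spec": {
        "type": "LoadBalancer",  # ask the cloud platform for a public IP
        "selector": LABELS,
        "ports": [{"port": 80, "targetPort": 80}],
    },
}

# Bundle both objects in a single List so they can be applied together.
manifest = {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


def render(obj: dict) -> str:
    """Serialize the manifest as indented JSON."""
    return json.dumps(obj, indent=2)


if __name__ == "__main__":
    print(render(manifest))
```

kubectl accepts JSON as well as YAML, so the rendered output could be saved to a file and passed to kubectl apply -f while kubespy trace service nginx runs in another terminal.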
Let’s break down the
kubespy trace gif above to show that there are
actually several distinct steps in the process of booting up a Service.
The Service is created, and the
Service controller creates an
Endpoints object of the same name. The job of the
Endpoints object is to specify which
Pods get traffic: their IPs, which ports to direct
traffic to, and so on. In this case, there are no
Pods to target yet, as
kubespy trace tells us:
Pods that match the Service’s selector are
created; their readiness probes immediately pass. The Endpoints
object is updated to reflect this. As we will see below, if the Pods
failed the readiness probes,
kubespy trace would note this.
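The Endpoints object described above can be read directly to see which Pods are currently receiving traffic. The sketch below is an illustration rather than kubespy's own code. It assumes an Endpoints object loaded as a plain dict (for example, from kubectl get endpoints nginx -o json), and the function name summarize_endpoints is hypothetical. It relies on the subsets, addresses, notReadyAddresses, and ports fields of the core v1 Endpoints API.

```python
def summarize_endpoints(endpoints: dict) -> dict:
    """Split an Endpoints object into ready and not-ready "ip:port" targets."""
    ready, not_ready = [], []
    # An Endpoints object with no subsets has no Pods to target.
    for subset in endpoints.get("subsets") or []:
        ports = [p["port"] for p in subset.get("ports") or []]
        # addresses: Pods that passed their readiness probes.
        for addr in subset.get("addresses") or []:
            ready.extend(f'{addr["ip"]}:{port}' for port in ports)
        # notReadyAddresses: Pods that match the selector but are not ready.
        for addr in subset.get("notReadyAddresses") or []:
            not_ready.extend(f'{addr["ip"]}:{port}' for port in ports)
    return {"ready": ready, "not_ready": not_ready}


# Hand-written sample objects (assumed shapes, trimmed to the fields used above).
empty = {"metadata": {"name": "nginx"}}
populated = {
    "metadata": {"name": "nginx"},
    "subsets": [
        {
            "addresses": [{"ip": "10.0.0.1"}],
            "notReadyAddresses": [{"ip": "10.0.0.2"}],
            "ports": [{"port": 80}],
        }
    ],
}

print(summarize_endpoints(empty))
# {'ready': [], 'not_ready': []}
print(summarize_endpoints(populated))
# {'ready': ['10.0.0.1:80'], 'not_ready': ['10.0.0.2:80']}
```

The first result corresponds to the state right after the Service is created; the second shows one Pod receiving traffic and one still failing its readiness probe.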
The Service is allocated a public IP address. The Service has
.spec.type set to
LoadBalancer, which on most cloud platforms means
that a public IP address should be allocated for it.
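Whether the allocation has happened yet is visible in the Service's status. As a rough sketch (again not kubespy's implementation), the hypothetical function below reads .status.loadBalancer.ingress from a Service loaded as a plain dict. Some platforms report a hostname rather than an IP, so it accepts either.

```python
def public_addresses(service: dict) -> list:
    """Return the public IPs or hostnames allocated to a LoadBalancer Service."""
    if service.get("spec", {}).get("type") != "LoadBalancer":
        # Other Service types are not allocated a load balancer address.
        return []
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    addresses = []
    for entry in ingress:
        address = entry.get("ip") or entry.get("hostname")
        if address:
            addresses.append(address)
    return addresses


# Hand-written sample objects; 203.0.113.10 is a documentation-range address.
pending = {"spec": {"type": "LoadBalancer"}, "status": {"loadBalancer": {}}}
allocated = {
    "spec": {"type": "LoadBalancer"},
    "status": {"loadBalancer": {"ingress": [{"ip": "203.0.113.10"}]}},
}

print(public_addresses(pending))
# []
print(public_addresses(allocated))
# ['203.0.113.10']
```

An empty list for a LoadBalancer Service means the platform has not finished allocating an address.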
Exercise: Other Service types, watching rollouts, deleting Services!
kubespy trace supports the other
Service types as well, including
ClusterIP. Try them, and you’ll see
slightly different output. It’s also worth watching what
happens when a
Service is deleted.
You can also use
kubespy trace to watch an unhealthy deployment become
healthy. In the following gif, we see a bunch of
Pods that are failing
readiness checks become healthy as a new version is rolled out:
Confession time. Last time we told you we’d dig more into the lifecycle
of a Pod. And we will, at some point. But we ended up deciding that it
would be easier to explain with a cohesive trace command.
And, while this is a good start, it is only the beginning. kubespy trace
currently supports only
Service. In our next post, we’ll extend trace to support
Deployment (or perhaps
ReplicaSet), and from there, we will have
enough tools to really dig into what is happening during a rollout.
In the meantime, if you enjoyed this post, or are curious to see how this lifecycle is baked into the Pulumi CLI, give it a spin! We’d love to have your feedback.