Install Istio, Configure Your Proxy, & Preview Telemetry

Setting up Istio with open source telemetry, then installing a basic app and observing it.


Introduction

This is a full tutorial, complete with working examples, on installing Istio with open source telemetry (such as Jaeger, as opposed to Google Stackdriver), configuring the proxy to serve an application, and taking a peek at how to observe the telemetry using the bundled UIs.

I use GKE to manage my Kubernetes clusters, so there is some GCP-specific stuff in here. A lot of it, though, is generic and can be applied to any Kubernetes cluster, managed or upstream.

What This Tutorial Covers

  1. Installing Istio On A Kubernetes Cluster
  2. Configuring Istio's Envoy Proxy To Serve An App In A Different Namespace
  3. Viewing The Telemetry UIs And Seeing How They Automatically Observe The App

What You Need For This Tutorial

A Kubernetes Cluster


Install Istio

Istio is best installed using Helm, so first, install the Helm and Istio clients with the following commands (replace the Istio semver number if there is a more recent version available):
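Something along these lines (a minimal sketch; the exact version numbers here are assumptions, so substitute the latest 2.x Helm release and current Istio release):

# Helm 2 client
curl -LO https://get.helm.sh/helm-v2.14.3-linux-amd64.tar.gz
tar -zxvf helm-v2.14.3-linux-amd64.tar.gz
sudo mv linux-amd64/helm /usr/local/bin/helm

# Istio release (includes istioctl and the Helm charts used below)
curl -L https://git.io/getLatestIstio | ISTIO_VERSION=1.2.5 sh -
sudo cp istio-1.2.5/bin/istioctl /usr/local/bin/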

Btw, these instructions are for Helm version 2. Helm version 3 no longer uses Tiller.

Now we need to create certificates that will be used to establish mutual TLS between Helm and Tiller. This is a really important security step: you should never use Tiller without certs in production. Run the following commands (you will need my tls.sh file, which can be found at: https://gist.github.com/efossas/59d38ab9ba8f33f94c94e6fa879c15bb):
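Something like this, assuming the script drops a CA plus Tiller and Helm cert/key pairs in the working directory (the file names below are my guess; check the script's actual output):

chmod +x tls.sh
./tls.sh

# you should end up with something like:
#   ca.cert.pem  tiller.cert.pem  tiller.key.pem  helm.cert.pem  helm.key.pem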

Tiller needs a service account and a cluster role binding, so save the following into a file named helm-tiller.yml:
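This is the usual Tiller RBAC boilerplate (a tiller service account in kube-system bound to the cluster-admin role):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system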

Now run the following commands to create the service account and cluster role binding, and then initialize the Tiller deployment in your cluster.
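Roughly, assuming the cert file names from the previous step (swap in whatever tls.sh actually produced):

kubectl apply -f helm-tiller.yml

helm init \
  --service-account tiller \
  --tiller-tls \
  --tiller-tls-verify \
  --tiller-tls-cert tiller.cert.pem \
  --tiller-tls-key tiller.key.pem \
  --tls-ca-cert ca.cert.pem

# put the client certs where the helm CLI looks for them when you pass --tls
cp ca.cert.pem $(helm home)/ca.pem
cp helm.cert.pem $(helm home)/cert.pem
cp helm.key.pem $(helm home)/key.pem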

Before we can install Istio with Helm, we need to manually create some resources: first the istio-system namespace, and second the secret Kiali uses for its default username and password. Save the following (replacing the Kiali username and password with your own, base64 encoded) into a file called istio-namespace-kiali-secret.yml:
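A sketch of that file (Kiali expects a secret named kiali in istio-system with username and passphrase keys; the values below are base64 for admin/admin, which you should replace):

apiVersion: v1
kind: Namespace
metadata:
  name: istio-system
---
apiVersion: v1
kind: Secret
metadata:
  name: kiali
  namespace: istio-system
  labels:
    app: kiali
type: Opaque
data:
  username: YWRtaW4=
  passphrase: YWRtaW4=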

Now run the following commands to create the resources and to prepare the default namespace for automatic sidecar injection, which allows your pods to be automatically hooked up to Istio's proxy and telemetry.
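For example:

kubectl apply -f istio-namespace-kiali-secret.yml
kubectl label namespace default istio-injection=enabled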

We need to run one more command before setting up Istio. This command creates necessary Istio custom resource definitions:
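In these Helm 2 era releases, that's the istio-init chart that ships in the download (the path below assumes the istio-1.2.5 directory from earlier):

helm install istio-1.2.5/install/kubernetes/helm/istio-init \
  --name istio-init --namespace istio-system --tls

# wait for the CRDs to finish registering before moving on
kubectl get crds | grep 'istio.io' | wc -l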

One more step: create a file named istio.yml and enter the following (read it over and edit as needed). This is my personal values file for the Istio Helm chart. You can see all possible values here: values.yml. My version enables the open source telemetry services (Jaeger, Prometheus, Kiali, Grafana).
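I won't paste the whole values file here, but the heart of it is just switching on the telemetry add-ons. A minimal sketch (the key names follow the Istio Helm chart of this era; double-check them against the chart's values.yml):

global:
  mtls:
    enabled: false   # flip to true if you want strict mTLS inside the mesh

gateways:
  istio-ingressgateway:
    enabled: true
    type: LoadBalancer
    sds:
      enabled: true  # lets Gateways reference cert secrets by credentialName later

prometheus:
  enabled: true

grafana:
  enabled: true

tracing:
  enabled: true
  provider: jaeger

kiali:
  enabled: true
  createDemoSecret: false  # we created the kiali secret ourselves above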

Now we can finally install Istio!
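Assuming the download directory and TLS-enabled Tiller from earlier:

helm install istio-1.2.5/install/kubernetes/helm/istio \
  --name istio --namespace istio-system \
  --values istio.yml --tls

# everything should eventually reach Running or Completed
kubectl get pods -n istio-system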

DNS & Certs

I'm not going to go into specific detail about setting up DNS and certificates, but I'll just give the general idea of what you need to prepare before the next section.

In the following section we're going to set up proxies to all of the telemetry UIs. I don't recommend that you blindly do this in production; you should put them behind some authentication wall (I use Keycloak with JWTs for that, but I'll save that setup for another tutorial). We're going to do it here so you can play with the telemetry UIs and see how the proxy configuration works. If you'd rather not expose them, you can just port-forward to the services instead. Anyways, for the proxies to work, you'll need to create DNS A records that point domain names at the istio-ingressgateway.

You can find the IP address of the istio-ingressgateway with the following command:


kubectl describe service istio-ingressgateway -n istio-system | grep 'LoadBalancer Ingress'
      

Now use that IP address to set up DNS entries for each telemetry UI domain:
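On GCP you can do that with gcloud. The zone and host names below are purely illustrative; use one subdomain per telemetry UI under a domain you own:

INGRESS_IP=$(kubectl get service istio-ingressgateway -n istio-system \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

for HOST in kiali grafana prometheus jaeger; do
  gcloud dns record-sets transaction start --zone=my-zone
  gcloud dns record-sets transaction add "$INGRESS_IP" \
    --name="$HOST.example.com." --ttl=300 --type=A --zone=my-zone
  gcloud dns record-sets transaction execute --zone=my-zone
done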

We'll be securing the proxies with TLS (HTTPS). Currently, Istio doesn't have an easy way to auto-create certificates for gateways, so we'll be creating certificate resources ourselves. To do so, we'll need a certificate issuer. There are many different ways to go about this. I'm going to briefly explain the certificate issuer I use, but not go into specific detail, to keep this tutorial shorter.

I use a CertManager ClusterIssuer that uses the DNS01 authentication mechanism with GCP. To set it up, do the following:


kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.6/deploy/manifests/00-crds.yaml
      

Create a GCP service account with the DNS Administrator role. When given the option to create a key, create one, base64 encode it, and add it to the following file, called clusterIssuer.yml:
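Here's roughly what mine looks like (cert-manager 0.6 API group; the project ID, email, and secret names are placeholders for your own, and the secret has to live in cert-manager's own namespace for a ClusterIssuer to find it):

apiVersion: v1
kind: Secret
metadata:
  name: clouddns-service-account
  namespace: cert-manager   # the namespace cert-manager itself runs in
type: Opaque
data:
  key.json: BASE64-ENCODED-SERVICE-ACCOUNT-KEY
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: gcp-letsencrypt
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: YOUR-EMAIL
    privateKeySecretRef:
      name: gcp-letsencrypt-account-key
    dns01:
      providers:
        - name: gcp-dns
          clouddns:
            project: YOUR-GCP-PROJECT
            serviceAccountSecretRef:
              name: clouddns-service-account
              key: key.json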

Now create the resources by running the following:


kubectl apply -f clusterIssuer.yml
      

Now we're ready to create HTTPS-enabled proxy entries.

Proxy

There are a lot of articles that talk about the wonders of sidecar injection, the Envoy proxy, and a bunch of abstract details on how managing a microservice mesh is hard (it is, I get it). But once you figure out how proxying works with Istio, it's really not that bad. Essentially you have an Ingress Gateway (our Helm chart created one for us), which is an object like a load balancer with an external IP address. Then you create a Gateway (usually one per domain name) that attaches to the Ingress Gateway and matches a protocol (HTTP or HTTPS) and host name. Then you create a VirtualService and a DestinationRule, which add routing from that gateway to your service.

Before we create these resources though, let's create certificates to enable HTTPS for each route. Save the following into telemetry-certs.yml:
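A sketch of one entry, using the gcp-letsencrypt issuer from above. The host name is illustrative, the secret lands in istio-system so the ingress gateway can use it, and you repeat the block for grafana, prometheus, and jaeger:

apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: kiali-cert
  namespace: istio-system
spec:
  secretName: kiali-cert
  issuerRef:
    name: gcp-letsencrypt
    kind: ClusterIssuer
  commonName: kiali.example.com
  dnsNames:
    - kiali.example.com
  acme:
    config:
      - dns01:
          provider: gcp-dns
        domains:
          - kiali.example.com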

And run:


kubectl apply -f telemetry-certs.yml
      

Now here is what the proxy configuration for all 4 telemetry services looks like (if you enabled mTLS when installing Istio with Helm, you'll want to change the tls mode to ISTIO_MUTUAL):
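The full file repeats the same three objects for each UI, so here's just the Kiali third as a sketch (the host and secret names match the illustrative examples above, Kiali's service listens on 20001, and credentialName relies on the SDS option we enabled on the ingress gateway):

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: kiali-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 443
        name: https-kiali
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: kiali-cert   # the secret created by cert-manager above
      hosts:
        - kiali.example.com
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: kiali
  namespace: istio-system
spec:
  hosts:
    - kiali.example.com
  gateways:
    - kiali-gateway
  http:
    - route:
        - destination:
            host: kiali.istio-system.svc.cluster.local
            port:
              number: 20001
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: kiali
  namespace: istio-system
spec:
  host: kiali.istio-system.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE   # switch to ISTIO_MUTUAL if you enabled mesh-wide mTLS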

Save the config to a file called telemetry-proxy.yml and run the following:


kubectl apply -f telemetry-proxy.yml
      

Once everything is up and running, you should be able to visit the domains and play around with the metrics (Grafana/Prometheus), tracing (Jaeger), and network maps (Kiali).

Okay, but what does a custom application proxy look like? For that, you can use a file like the one sketched after the list below, which creates everything in a separate namespace along with private registry image pulls and horizontal pod autoscaling. Obviously, you'll need to put your own custom application image in the file. In addition, it makes some assumptions that you should replace:

  1. The application name is "app", change it as you want
  2. The project name is "project", change it as you want
  3. It's using my gcp-letsencrypt cluster issuer
  4. You need to replace "YOUR-DOMAIN" with the domain name for your service
  5. You should replace "git-tag" with a git reference. I use branches for staging and tags for production
  6. You need to replace "YOUR-IMAGE" with your Docker application image
  7. It assumes a private image registry. After you log in to your registry, you can generate the .dockerconfigjson value with the following command (the dry-run output already contains it base64 encoded):

kubectl create secret docker-registry --dry-run=true project-registry --docker-server=REGISTRY-URL --docker-username=REGISTRY-USERNAME --docker-password=REGISTRY-PASSWORD --docker-email=YOUR-EMAIL -o yaml
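I won't paste my whole file here, but structurally it's a namespace-scoped version of the telemetry proxy above plus the workload itself. A trimmed-down sketch of the workload half (the names, port, replica counts, and resource numbers are placeholders, not my actual values):

apiVersion: v1
kind: Namespace
metadata:
  name: project
  labels:
    istio-injection: enabled
---
apiVersion: v1
kind: Secret
metadata:
  name: project-registry
  namespace: project
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: BASE64-ENCODED-DOCKER-CONFIG
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: project
spec:
  replicas: 2
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      imagePullSecrets:
        - name: project-registry
      containers:
        - name: app
          image: YOUR-IMAGE:git-tag
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
---
apiVersion: v1
kind: Service
metadata:
  name: app
  namespace: project
spec:
  selector:
    app: app
  ports:
    - name: http
      port: 80
      targetPort: 8080
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: app
  namespace: project
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 80

The Certificate, Gateway, VirtualService, and DestinationRule for YOUR-DOMAIN mirror the telemetry entries above, just pointing at the app service in the project namespace.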
      

Telemetry

Great, so now you have everything running with Istio (hopefully). How do we use the telemetry?

You use Kiali to see which pods are communicating with one another and to check that they're attached to the mesh correctly. It's good for checking that the networking is what you expect.

You use Jaeger for debugging. You can see the complete path of a request through all the services it touches. If there's an error, you can search directly for any trace tagged with an error. Traces can also contain logs (this requires you to instrument your services with tracing logs). When done completely and correctly, tracing is your best friend for finding and solving code problems quickly.

Grafana receives metrics from Prometheus, so they go hand in hand. You use Grafana to get an idea of how much CPU and memory your services are using, live, with graphs. It's a good idea to send a load test to your service, observe its usage, and then define your resource requests, limits, and autoscaling configuration for production based on your observations.

Grafana also comes with single-metric alerts, and Prometheus can be configured with advanced multi-metric alerts (along with other nifty features like "inhibition", which stops one alert when another is going off, and "silences", which limit how many alerts you see so you don't get overloaded with unnecessary ones). Useful alerts include when your services are sending too many HTTP error responses, are crashing, or are continually spiking in resource usage.

Dunskies!

That was quite a bit to go through. I have templates for all these common resources and place them in a deploy directory in all of my repos. Then I pass Gitlab CI environment variables into the templates to generate the final Kubernetes files, which are then deployed with Helm. Why Helm? So that if an engineer decides to delete a resource, Helm will pick that up and delete it for them (it's idempotent). After the repo templates and pipelines are set up, managing your infrastructure becomes a lot easier, since so much has been automated for you.