How to validate your Kubernetes liveness probes using Gremlin (2024)

Introduction

In this tutorial, we'll show you how to create and validate a liveness probe for a Kubernetes application. We'll walk through deploying the Gremlin Kubernetes agent, deploying an application with a liveness probe, run a Latency attack, then validate that the probe triggers.

By reading this tutorial, you'll learn how to create a liveness probe and validate that it works as expected.

Want more? Check out our comprehensive eBook, "Kubernetes Reliability at Scale."

Overview

This tutorial will show you how to:

  • Step 1 - Deploy a multi-node Kubernetes cluster
  • Step 2 - Deploy the Gremlin Kubernetes agent
  • Step 3 - Deploy an application with a liveness probe configured
  • Step 4 - Run a Latency experiment to validate your liveness probe configuration

Background

In Kubernetes, a probe is an automated process that checks the state of a container at different stages in its lifecycle, or at regular intervals. Kubernetes includes three types of probes: liveness probes, startup probes, and readiness probes.

  • Liveness probes are essentially health checks. They periodically run a command (or send an HTTP request to an endpoint) on a container and wait for a response. If a response doesn't arrive, or the container returns a failure, or fails to respond within a certain time frame, the probe triggers a restart of the container.
  • Startup probes are used alongside liveness probes to protect slow-starting containers. If you have a container that takes a long time to start—longer than your liveness probe threshold—you can use a startup probe to delay the liveness probe until the container has fully started.
  • Readiness probes are also alongside liveness probes for containers that have finished starting, but aren't yet ready to serve traffic. This is for containers that do some kind of initialization task, like loading data from a remote data store or waiting for a downstream dependency to become available. Readiness probes can detect this and prevent liveness probes from restarting the container, while also preventing requests from reaching the container.

For more on the different types of probes, see the Kubernetes docs. The Google Cloud team also has a great blog post on Kubernetes probes and best practices.

Prerequisites

Before starting this tutorial, you’ll need the following:

  • A Gremlin account
  • A Kubernetes cluster
  • The kubectl command-line tool
  • The Helm command-line tool

Step 1 - Deploy a multi-node Kubernetes cluster

For this tutorial, you'll need a Kubernetes cluster with at least two worker nodes. If you already have a cluster, you can continue to step 2. If not, you can create your own cluster using a tool like K3s, minikube, or kubeadm, or use a cloud Kubernetes platform like Amazon EKS, Azure Kubernetes Service, or Google Kubernetes Engine.

Once your cluster is deployed, verify that you have at least two active worker nodes by opening a terminal and running the following command :

BASH

kubectl get nodes

In this example, we have one control plane node and two workers:

SHELL

NAME STATUS ROLES AGE VERSIONgremlin-lab-control-plane Ready control-plane,master 7d v1.21.1gremlin-lab-worker Ready  7d v1.21.1gremlin-lab-worker2 Ready  7d v1.21.1

Step 2 - Deploy the Gremlin Kubernetes agent

The Gremlin Kubernetes agent is the recommended method of deploying Gremlin to a Kubernetes cluster. It allows Gremlin to detect resources such as Pods, Deployments, and DaemonSets, and select them as targets for running experiments. If you've already installed and configured the agent, skip to step 3.

To install the agent, follow the installation instructions in the documentation. Make sure to replace <span class="code-class-custom">$GREMLIN_TEAM_ID</span> with your actual team ID, <span class="code-class-custom">$GREMLIN_TEAM_SECRET</span> with your team secret key, and <span class="code-class-custom">$GREMLIN_CLUSTER_ID</span> with a unique name for this cluster. You'll use the cluster ID to identify the cluster in the Gremlin web app. You can learn more about team IDs and secret keys in the Authentication documentation.

Once the agent is installed, you can confirm that it's configured correctly by logging into your Gremlin account, opening the Agents page, then clicking the Kubernetes tab. Your cluster will be listed with the name provided by <span class="code-class-custom">$GREMLIN_CLUSTER_ID</span> :

How to validate your Kubernetes liveness probes using Gremlin (1)

Step 3 - Deploy an application with a liveness probe configured

Now we'll deploy an application and create a liveness probe. For this tutorial, we'll use the open source Online Boutique application created by the Google Cloud Platform team. Online Boutique is a demo e-commerce website consisting of 12 services, each one running in a separate Deployment. Several of these Deployments already have liveness probes configured. The one we'll focus on for this tutorial is the frontend, which has both a liveness and readiness probe.

If we look at the application's Kubernetes manifest, we see that there's a liveness probe configured to run an HTTP request over port <span class="code-class-custom">8080</span> on the container's <span class="code-class-custom">/_healthz</span> endpoint. Kubernetes will wait 10 seconds before running the first check according to the line <span class="code-class-custom">initialDelaySeconds: 10</span> , and since <span class="code-class-custom">periodSeconds</span> isn't defined, Kubernetes uses the default value of 10 seconds. By default, the probe will also time out after 1 second (<span class="code-class-custom">timeoutSeconds: 1</span>), will retry three times before reporting a failure (<span class="code-class-custom">failureThreshold: 3</span>), and will report a success for each successful check (<span class="code-class-custom">successThreshold: 1</span>). According to line 238, it will also include a cookie in the HTTP headers identifying the request as coming from the liveness probe.

To summarize: the frontend Pod defines a liveness probe that will send a request to the Pod's <span class="code-class-custom">/_healthz</span> URL on port <span class="code-class-custom">8080</span> every 10 seconds. It will wait 10 seconds before running the first check, and will require three consecutive failed checks before reporting a failure.

The frontend also has a readiness probe that follows the same behavior. The only difference is that this probe's HTTP cookie is set to <span class="code-class-custom">shop_session-id=x-readiness-probe</span> instead of <span class="code-class-custom">shop_session-id=x-liveness-probe</span>.

To deploy the Online Boutique, either follow the quick start guide or simply run the following command:

BASH

kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/microservices-demo/release/v0.3.6/release/kubernetes-manifests.yaml

You can check on the status of the deployment by running:

BASH

kubectl get deployments

SHELL

NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGEdefault adservice 1/1 1 1 5d21hdefault cartservice 1/1 1 1 5d21hdefault checkoutservice 1/1 1 1 5d21hdefault currencyservice 1/1 1 1 5d21hdefault emailservice 1/1 1 1 5d21hdefault frontend 1/1 1 1 5d21hdefault loadgenerator 1/1 1 1 5d21hdefault paymentservice 1/1 1 1 5d21hdefault productcatalogservice 1/1 1 1 5d21hdefault recommendationservice 1/1 1 1 5d21hdefault redis-cart 1/1 1 1 5d21hdefault shippingservice 1/1 1 1 5d21h

We can check the status of our liveness probe by running <span class="code-class-custom">kubectl describe</span> on the frontend Deployment and looking for the <span class="code-class-custom">Liveness</span> field. Note that the <span class="code-class-custom">success</span> and <span class="code-class-custom">failure</span> parameters are commented out, meaning they're using their default values of 1 and 3 respectively:

BASH

kubectl describe deploy/frontend

SHELL

...Liveness: http-get http://:8080/_healthz delay=10s timeout=1s period=10s #success=1 #failure=3...

Step 4 - Run a Latency experiment to validate your liveness probe configuration

The last step is to ensure that our liveness probe works the way we expect it to. In step 3, we found that our liveness probe is configured to run every 10 seconds, will fail after 3 unsuccessful checks, and considers a check unsuccessful if it takes longer than 1 second for the endpoint to respond. To test it, we'll use a Latency attack to delay all network traffic from the Pod. We'll add enough latency to exceed the timeout period for at least 30 seconds to ensure at least three failed checks in a row.

Note

Since the liveness check originates from the kubelet process running on the host, we can't run a Latency attack on the host itself. Instead, we'll need to run it on the frontend Deployment.

First, log into the Gremlin web app and create a new attack by navigating to Attacks, then New Attack. Select the Infrastructure tab, then select Kubernetes. Expand Deployments, then select the frontend Deployment.

Next, click on Choose a Gremlin to expand the attack selection box, select the Network category, then select Latency. Increase MS to <span class="code-class-custom">1000</span> to add one full second of latency, then click Unleash Gremlin to start the attack.

While the attack is running, use <span class="code-class-custom">kubectl</span> (or a monitoring tool) to monitor the state of your frontend Deployment. You can check the current health of the frontend Deployment by running the following command:

BASH

kubectl get deploy/frontend

SHELL

NAME READY UP-TO-DATE AVAILABLE AGEfrontend 1/1 1 1 141m

Tip

An easy way to continuously monitor your Deployment is by using the watch command. For example, this command refreshes the output of kubectl every 2 seconds:watch -n 2 'kubectl get deploy/frontend'

Shortly after the attack starts, we see the following:

SHELL

NAME READY UP-TO-DATE AVAILABLE AGEfrontend 0/1 1 0 153m

We also see an interesting development in the Gremlin web app. Instead of showing the attack stage as "Successful", it instead shows "Agent Aborted".

How to validate your Kubernetes liveness probes using Gremlin (2)

If we scroll down to the Executions section and click on our host, we see this in the execution log:

SHELL

2022-04-19 17:15:30 - Gremlin did not exit normally: container failure: Container runtime killed Gremlin during attack:Reason: UnknownTargetStatus: ExitedDetails: The container runtime kills the Gremlin attack when the target of the attack is terminated, either by intentional removal by an operator, or due the target container exiting from an error or a failed health check. For more information about this behavior see our Knowledge Base: https://support-site.gremlin.com/support/solutions/articles/151000015041-exit-code-137

In other words, the liveness probe worked as intended and terminated the frontend Pod when it took too long to respond. Because the Gremlin container running the latency attack no longer had a valid attack target, it also terminated. This is why the web app reports "Agent Aborted." If we look back at our kubectl output a few moments later, we'll see that the Pod has restarted and the Deployment is healthy again:

SHELL

NAME READY UP-TO-DATE AVAILABLE AGEfrontend 1/1 1 1 154m

This shows that our liveness probe is working reliably and keeping our application available and responsive.

Conclusion

In this tutorial, we used Gremlin to validate that our liveness probe will keep our frontend Deployment running reliably. There are additional steps we can take to improve reliability even further, such as scaling up our Deployment to two or more Pods to avoid the small amount of downtime that happened when the probe terminated the old Pod.

For more ideas on experiments to run on your Kubernetes cluster, check out our white paper on the first 5 Chaos Experiments to run on Kubernetes.

How to validate your Kubernetes liveness probes using Gremlin (2024)
Top Articles
Where can I find historical stock prices for stock tickers no longer traded?
What Is the CIA Triad?
English Bulldog Puppies For Sale Under 1000 In Florida
Katie Pavlich Bikini Photos
Gamevault Agent
Pieology Nutrition Calculator Mobile
Hocus Pocus Showtimes Near Harkins Theatres Yuma Palms 14
Hendersonville (Tennessee) – Travel guide at Wikivoyage
Compare the Samsung Galaxy S24 - 256GB - Cobalt Violet vs Apple iPhone 16 Pro - 128GB - Desert Titanium | AT&T
Vardis Olive Garden (Georgioupolis, Kreta) ✈️ inkl. Flug buchen
Craigslist Dog Kennels For Sale
Things To Do In Atlanta Tomorrow Night
Non Sequitur
Crossword Nexus Solver
How To Cut Eelgrass Grounded
Pac Man Deviantart
Alexander Funeral Home Gallatin Obituaries
Energy Healing Conference Utah
Geometry Review Quiz 5 Answer Key
Hobby Stores Near Me Now
Icivics The Electoral Process Answer Key
Allybearloves
Bible Gateway passage: Revelation 3 - New Living Translation
Yisd Home Access Center
Home
Shadbase Get Out Of Jail
Gina Wilson Angle Addition Postulate
Celina Powell Lil Meech Video: A Controversial Encounter Shakes Social Media - Video Reddit Trend
Walmart Pharmacy Near Me Open
Marquette Gas Prices
A Christmas Horse - Alison Senxation
Ou Football Brainiacs
Access a Shared Resource | Computing for Arts + Sciences
Vera Bradley Factory Outlet Sunbury Products
Pixel Combat Unblocked
Movies - EPIC Theatres
Cvs Sport Physicals
Mercedes W204 Belt Diagram
Mia Malkova Bio, Net Worth, Age & More - Magzica
'Conan Exiles' 3.0 Guide: How To Unlock Spells And Sorcery
Teenbeautyfitness
Where Can I Cash A Huntington National Bank Check
Topos De Bolos Engraçados
Sand Castle Parents Guide
Gregory (Five Nights at Freddy's)
Grand Valley State University Library Hours
Holzer Athena Portal
Hello – Cornerstone Chapel
Stoughton Commuter Rail Schedule
Nfsd Web Portal
Selly Medaline
Latest Posts
Article information

Author: Aracelis Kilback

Last Updated:

Views: 5852

Rating: 4.3 / 5 (44 voted)

Reviews: 83% of readers found this page helpful

Author information

Name: Aracelis Kilback

Birthday: 1994-11-22

Address: Apt. 895 30151 Green Plain, Lake Mariela, RI 98141

Phone: +5992291857476

Job: Legal Officer

Hobby: LARPing, role-playing games, Slacklining, Reading, Inline skating, Brazilian jiu-jitsu, Dance

Introduction: My name is Aracelis Kilback, I am a nice, gentle, agreeable, joyous, attractive, combative, gifted person who loves writing and wants to share my knowledge and understanding with you.