Troubleshooting & Auditing Tanzu Kubernetes Grid Clusters Using vRealize Log Insight Cloud

Original blog posted on July 20th.

Simplify how you monitor your Tanzu Kubernetes Grid Clusters by sending logs to VMware vRealize Log Insight Cloud. Read on to learn how to use the new Kubernetes dashboards and widgets to run log queries easily and keep track of your environment.

 

If you are already running Tanzu Kubernetes Grid Clusters (TKG Clusters), you can configure your environment to send logs to vRealize Log Insight Cloud by following Munishpal Makhija’s article, “Configure Log forwarding from VMware Tanzu Kubernetes Cluster to vRealize Log Insight Cloud.” The instructions for setting up log forwarding are also available in the console, under Log Sources for each application. If you don’t have Tanzu Kubernetes Grid configured, you can download the Demo Appliance for Tanzu Kubernetes Grid, contributed by William Lam, to quickly deploy some pods for testing.

 

New Kubernetes Dashboards

The new content pack comes with three pre-built dashboards: Kube Controller Overview, Kube API Server Overview, and Kube Scheduler Overview. In addition to these dashboards, the content pack also uses several extracted fields to help you filter through logs and create additional queries and widgets based on what you want to monitor in your environment.

Below is a list of the Dashboards along with the Widgets contained in each dashboard:

Kube API Server Overview

Total Error Logs

Error Logs Occurred over a Period of Time

Total 200 Status Codes

Total Audit Logs Count

Message Source Trend

Severity Breakdown

 

Kube Controller Overview

Total Error Logs

Message Severity Trend

Message Source Trend

Severity Breakdown

Scale Up Operation Over Time

Scale Down Operation Over Time

Pods Successfully Created Over Time

Pods Successfully Deleted Over Time

 

Kube Scheduler Overview

Total Error Logs

Message Severity Trend

Message Source Trend

Severity Breakdown

Pods Failed Scheduling

 

First Step – Verify Log Flow

Once you have your environment deployed and logging configured, you can verify log flow in the UI:

 

Enable the Kubernetes Content Pack

Enable the Kubernetes content pack in the Cloud Services console for vRealize Log Insight Cloud: Content Packs > Cloud Services > Kubernetes Auditing. To enable content packs, you need the vRealize Log Insight Cloud Administrator role.

 

Review Content Pack Dashboards

As a TKG cluster administrator, you will want an easy way to view the status of the environment via log files. This is available through content pack dashboards:

I want to copy all the Kubernetes Auditing dashboards to my dashboards section for easy access to the log info I care about. When I clone a dashboard, it will save a copy under “My Dashboards”. From there I can modify my own content without impacting the default dashboards that are included with the content pack.

Now when I click on Dashboards, I see the three dashboards I cloned in the “My Dashboards” view.

 

Explore Log Data

In addition to the out-of-the-box dashboards, the Kubernetes content pack provides a number of extracted, Kubernetes-specific fields that make it easy to filter log data. Not only can you focus your queries on specific infrastructure components, you can also create custom dashboards based on these queries.

Perhaps I only want to monitor Kubernetes clusters running on TKG. I can query my Kubernetes logs by filtering on the predefined environment field and entering the value “Tanzu_k8s_grid”. In addition to the area chart, I can change the data visualization to show a table of my Kubernetes namespaces.
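To make the idea behind this query concrete, here is a minimal Python sketch of the same logic: keep only the records whose environment field equals “Tanzu_k8s_grid”, then count them per namespace, which is roughly what the table view summarizes. This is an illustration only and does not call vRealize Log Insight Cloud. The records are made-up samples, and the `namespace` and `text` field names are assumptions chosen for the example; only the environment field and its value come from the walkthrough above.

```python
from collections import Counter

# Hypothetical sample records. Only the "environment" field and the
# "Tanzu_k8s_grid" value come from the article; "namespace" and "text"
# are assumed names used purely for illustration.
SAMPLE_LOGS = [
    {"environment": "Tanzu_k8s_grid", "namespace": "kube-system", "text": "sample message 1"},
    {"environment": "Tanzu_k8s_grid", "namespace": "default", "text": "sample message 2"},
    {"environment": "Tanzu_k8s_grid", "namespace": "kube-system", "text": "sample message 3"},
    {"environment": "other_env", "namespace": "default", "text": "sample message 4"},
]


def count_by_namespace(logs, environment="Tanzu_k8s_grid"):
    """Count log records per namespace for a single environment value."""
    return Counter(
        record.get("namespace", "unknown")
        for record in logs
        if record.get("environment") == environment
    )


namespace_counts = count_by_namespace(SAMPLE_LOGS)

# Print the namespaces, busiest first, similar to a table visualization.
for namespace, count in namespace_counts.most_common():
    print(f"{namespace}: {count}")
```

In the console, the filter and the grouping are applied for you; the sketch only shows how the two steps relate.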

Once I am satisfied with my query and chart, I can add it to My Dashboards.

 

Configure an Alert Based on a Query

I may want to set up some alerts for my clusters. I can easily create an alert from one of the queries that were used to build the widgets. In the Kube Controller Overview dashboard, there is already a query for pods deleted over time.

Note: By default, the dashboard results show only the last 5 minutes of data. You can change the time frame in the upper right-hand corner to one of the predefined intervals or specify a time range to search. The longer the time period you search, the longer it takes to display the results.

After I increase the time range, I can see that someone has deleted a pod. I can click the dots in the upper right-hand corner of the widget to view the log messages related to this widget.

When I view the log query, I can see the filter is set to text contains “reason: ‘SuccessfulDelete’ Deleted pod” AND log_type contains “kube_controller”. I would like to get an alert when a pod is deleted. I can click on the alert icon in the upper right-hand corner to create an alert based on this query.
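For readers who want to see what this filter means, the following Python sketch applies the same two conditions to a few sample records: the message text must contain the pod-deletion phrase, and log_type must contain “kube_controller”. It is a rough illustration under stated assumptions, not the service’s implementation. The sample records are invented, the `text` key is an assumed name for the message body, and the sketch uses plain case-sensitive substring matching, which may differ from how the console evaluates “contains”.

```python
# The phrase and the log_type value are taken from the filter described above.
DELETE_PHRASE = "reason: 'SuccessfulDelete' Deleted pod"
LOG_TYPE_VALUE = "kube_controller"

# Hypothetical sample records; "text" is an assumed key for the message body.
SAMPLE_RECORDS = [
    {
        "log_type": "kube_controller",
        "text": "Event(...): type: 'Normal' reason: 'SuccessfulDelete' "
                "Deleted pod: example-pod-1",
    },
    {
        "log_type": "kube_controller",
        "text": "Event(...): type: 'Normal' reason: 'SuccessfulCreate' "
                "Created pod: example-pod-2",
    },
    {
        "log_type": "kube_scheduler",
        "text": "unrelated scheduler message",
    },
]


def is_pod_delete(record):
    """Return True when both filter conditions hold (they are ANDed)."""
    text_matches = DELETE_PHRASE in record.get("text", "")
    type_matches = LOG_TYPE_VALUE in record.get("log_type", "")
    return text_matches and type_matches


pod_deletes = [record for record in SAMPLE_RECORDS if is_pod_delete(record)]
print(f"Matching records: {len(pod_deletes)}")
```

An alert built on this query fires on the records that satisfy both conditions, so a pod creation event or a scheduler message would not trigger it.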

From there I enter the details for this alert:

After you click Save, you will be redirected to Alert Definitions. Select the notification method and click on the dots in the upper right-hand corner to enable the new alert.

 

Now that you have some basics on using the Kubernetes content pack, please let us know what you think in the comments below! You can find more information on using vRealize Log Insight Cloud in our documentation portal and in our User Guide. If you would like to search all your infrastructure and application logs across cloud environments, talk to your VMware team or sign up for a free 30-day trial.
