Kubernetes controller manager monitoring 

Monitor the heart of your Kubernetes control plane with Kubernetes controller manager monitoring. The controller manager runs the control loops that reconcile cluster state, including node lifecycle management, workload replication, and job automation.

Gain access to detailed metrics, including request handling, resource usage, webhook activity, and workqueue behavior. With these insights, you can:  

  • Pinpoint performance bottlenecks in the controller manager.  

  • Track workqueue depth, retries, queue and work durations, and unfinished work.  

  • Optimize resource allocation by analyzing workloads and storage efficiency.  

Ensure your cluster operates at peak performance while maintaining a secure and efficient containerized ecosystem. 

Supported versions 

This feature is supported in Linux server monitoring agent version 20.0.0 and later.

Control plane monitoring and the other latest features require you to upgrade your Kubernetes agent to the latest version.

Note

If you haven't added a Kubernetes monitor yet, follow these steps to add one.

Controller Manager monitor  

As soon as you upgrade your agent, the Site24x7 Kubernetes monitoring agent will fetch all the controller manager metrics.
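
How the Site24x7 agent collects these metrics is not described here. For orientation, the controller manager publishes its metrics in the Prometheus text exposition format, and the minimal sketch below shows how output in that format can be parsed and grouped by a label. The sample text and its values are made up, and the helper names `parse_metrics` and `group_total` are hypothetical; this is an illustration, not the agent's implementation.

```python
import re
from collections import defaultdict

# Made-up sample in the Prometheus text exposition format (values are illustrative).
SAMPLE = """\
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 1123
# TYPE workqueue_depth gauge
workqueue_depth{name="deployment"} 3
workqueue_depth{name="job"} 0
# TYPE rest_client_requests_total counter
rest_client_requests_total{code="200",host="10.0.0.1:6443",method="GET"} 150
rest_client_requests_total{code="404",host="10.0.0.1:6443",method="GET"} 2
"""

# Simplified patterns: sample lines with an optional label set and no timestamp.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return {metric_name: [(labels_dict, value), ...]} for every sample line."""
    samples = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples[name].append((labels, float(value)))
    return dict(samples)


def group_total(samples, label):
    """Sum sample values per distinct value of one label (for example, code)."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels.get(label, "")] += value
    return dict(totals)


if __name__ == "__main__":
    metrics = parse_metrics(SAMPLE)
    print(metrics["go_goroutines"])
    print(group_total(metrics["rest_client_requests_total"], "code"))
```

Grouping by `code`, `method`, or `host` in this way mirrors the "by Response Code", "by Verb", and "by Host" breakdowns listed under Supported metrics.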

To navigate to your Kubernetes Controller Manager monitor:

  1. Log in to your Site24x7 account.

  2. Navigate to K8s, select the cluster, and then click Controller Manager. This opens the list of Controller Manager monitors in that cluster. Click a monitor to view its detailed insights.  

Supported metrics 

Metric | Description
Go Threads | The number of OS threads created by the Go runtime of the controller manager process at the last poll time
Go Routines | The number of goroutines that currently exist in the controller manager process at the last poll time
Terminated Pods Tracking Finalizer (Add) | The number of terminated pods (phase is Failed or Succeeded) that have the finalizer batch, recorded for the add event during the last poll period
Terminated Pods Tracking Finalizer (Delete) | The number of terminated pods (phase is Failed or Succeeded) that have the finalizer batch, recorded for the delete event during the last poll period
Leader Election Status | Indicates whether the current Kubernetes controller manager instance is the master (1) or a backup (0)
Process Resident Memory | The resident memory, in bytes, used by the controller manager process at the last poll time
Process Virtual Memory | The virtual memory, in bytes, used by the controller manager process at the last poll time
Process CPU Time | The CPU time consumed by the controller manager process during the last poll period
Process Open File Descriptors | The number of file descriptors held open by the controller manager process at the last poll time
Maximum Open File Descriptors | The maximum number of file descriptors the controller manager process is allowed to open, as of the last poll time
Average Request Latency | The average latency per API request on the controller manager process during the last poll period
Requests Count | The total API request count on the controller manager process during the last poll period
Total Requests Duration | The total time taken to process all the API requests on the controller manager process during the last poll period

REST Client Requests by Response Code
Response Code | The response code of the request
Total Rest Client Requests | The total number of HTTP requests made by the controller manager's REST client, grouped by response code, during the last poll period

REST Client Requests by Verb
Verb | The verb (HTTP method) of the request
Total Rest Client Requests | The total number of HTTP requests made by the controller manager's REST client, grouped by verb, during the last poll period
Average Request Latency | The average latency per API request on the controller manager process, grouped by verb, during the last poll period
Total Requests | The total number of API requests on the controller manager process, grouped by verb, during the last poll period
Total Requests Duration | The total time taken to process all the API requests on the controller manager process, grouped by verb, during the last poll period

REST Client Requests by Host
Host | The hostname of the service
Total Rest Client Requests | The total number of HTTP requests made by the controller manager's REST client, grouped by hostname, during the last poll period
Average Request Latency | The average latency per API request on the controller manager process, grouped by hostname, during the last poll period
Total Requests | The total number of API requests on the controller manager process, grouped by hostname, during the last poll period
Total Requests Duration | The total time taken to process all the API requests on the controller manager process, grouped by hostname, during the last poll period

WorkQueue
Resource Name | The name of the workqueue (the action or task it handles)
Total Workqueue Adds | The total number of adds handled by the workqueue, grouped by workqueue name, during the last poll period
Workqueue Depth | The number of actions or tasks waiting to be processed in the workqueue, grouped by workqueue name, at the last poll time
Workqueue Retries | The total number of retries handled by the workqueue, grouped by workqueue name, during the last poll period
Workqueue Unfinished Work Duration | The duration of work that is in progress and has not yet been observed by work duration, at the last poll time. Large values indicate stuck threads. You can deduce the number of stuck threads by observing the rate at which this value increases
Average Workqueue Queue Duration | The average time an item stays in a workqueue before being requested during the last poll period
Total Workqueue Queue Count | The total number of items requested from the workqueue during the last poll period
Total Workqueue Queue Duration | The total time items stayed in a workqueue before being requested during the last poll period
Average Workqueue Work Duration | The average time taken to process an item from a workqueue during the last poll period
Total Workqueue Work Count | The total number of items processed from the workqueue during the last poll period
Total Workqueue Work Duration | The total time spent processing items from a workqueue during the last poll period
Workqueue Longest Running Processor Duration | How long the longest-running processor for a workqueue has been running, at the last poll time
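
The average and stuck-thread metrics in the table above can be reasoned about with simple arithmetic. The sketch below assumes that each "Average" value is the change in total duration divided by the change in count between two polls, and that the stuck-thread estimate is the rate at which unfinished work duration grows (each stuck thread adds roughly one second of unfinished work per second). The function names and sample numbers are hypothetical, and the agent's exact calculation may differ.

```python
def average_per_poll(prev_sum, prev_count, curr_sum, curr_count):
    """Average duration per item between two polls, from cumulative sum and count.

    Assumption: both inputs are cumulative totals that only grow between polls.
    """
    delta_count = curr_count - prev_count
    if delta_count <= 0:
        return 0.0  # nothing was processed in this poll period
    return (curr_sum - prev_sum) / delta_count


def estimated_stuck_threads(prev_unfinished, curr_unfinished, interval_seconds):
    """Estimate stuck threads as the growth rate of unfinished work.

    Unfinished work is measured in seconds, so seconds of growth per second of
    wall-clock time approximates the number of threads that made no progress.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    rate = (curr_unfinished - prev_unfinished) / interval_seconds
    return max(0.0, rate)  # a falling value means work finished, not negative threads


if __name__ == "__main__":
    # Hypothetical readings taken 60 seconds apart.
    print(average_per_poll(prev_sum=12.0, prev_count=100, curr_sum=15.0, curr_count=160))
    print(estimated_stuck_threads(prev_unfinished=40.0, curr_unfinished=160.0, interval_seconds=60))
```

A sustained estimate near a whole number (for example, about 2) suggests that many workers are stuck, while a value near 0 suggests work is completing normally.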
