Kubernetes w/ Prometheus

Pack assets

Templates

The Monitoring Connector Kubernetes w/ Prometheus brings a host template:

Cloud-Prometheus-Kubernetes-Api-custom

The connector brings the following service templates (sorted by the host template they are attached to):

Cloud-Prometheus-Kubernetes-Api-custom

Service Alias	Service Template	Service Description
Container-Status	Cloud-Prometheus-Kubernetes-Container-Status-Api-custom	Check the status of the containers
Daemonset-Status	Cloud-Prometheus-Kubernetes-Daemonset-Status-Api-custom	Check the status of the daemonsets
Deployment-Status	Cloud-Prometheus-Kubernetes-Deployment-Status-Api-custom	Check the status of the deployments
Namespace-Status	Cloud-Prometheus-Kubernetes-Namespace-Status-Api-custom	Check the status of the namespaces
Node-Status	Cloud-Prometheus-Kubernetes-Node-Status-Api-custom	Check the status of the nodes

The services listed above are created automatically when the Cloud-Prometheus-Kubernetes-Api-custom host template is used.

Collected metrics & status

Here is the list of services for this connector, detailing all metrics linked to each service.

Container-Status
Daemonset-Status
Deployment-Status
Namespace-Status
Node-Status

Container-Status

Metric name	Unit
containers#status	N/A
containers#containers.restarts.count	count

To obtain this new metric format, include --use-new-perfdata in the EXTRAOPTIONS service macro.

Daemonset-Status

Metric name	Unit
daemonsets#daemonset.nodes.misscheduled.count	count

Deployment-Status

Metric name	Unit
deployments#deployment.replicas.uptodate.count	count

Namespace-Status

Metric name	Unit
namespaces.active.count	count
namespaces.terminating.count	count
namespaces#status	N/A

Node-Status

Metric name	Unit
nodes#status	N/A
nodes#pods.allocated.count	count

Installing the monitoring connector

Pack

The installation procedures for monitoring connectors are slightly different depending on whether your license is offline or online.

If the platform uses an online license, you can skip the package installation instruction below as it is not required to have the connector displayed within the Configuration > Connectors > Monitoring Connectors menu. If the platform uses an offline license, install the package on the central server with the command corresponding to the operating system's package manager:

Alma / RHEL / Oracle Linux 8

dnf install centreon-pack-cloud-prometheus-kubernetes-api

Alma / RHEL / Oracle Linux 9

dnf install centreon-pack-cloud-prometheus-kubernetes-api

Debian 11 & 12

apt install centreon-pack-cloud-prometheus-kubernetes-api

CentOS 7

yum install centreon-pack-cloud-prometheus-kubernetes-api

Whatever the license type (online or offline), install the Kubernetes w/ Prometheus connector through the Configuration > Connectors > Monitoring Connectors menu.

Plugin

Since Centreon 22.04, you can benefit from the 'Automatic plugin installation' feature. When this feature is enabled, you can skip the installation part below.

You still have to manually install the plugin on the poller(s) when:

Automatic plugin installation is turned off
You want to run a discovery job from a poller that doesn't monitor any resource of this kind yet

More information in the Installing the plugin section.

Use the commands below according to your operating system's package manager:

Alma / RHEL / Oracle Linux 8

dnf install centreon-plugin-Cloud-Prometheus-Kubernetes-Api

Alma / RHEL / Oracle Linux 9

dnf install centreon-plugin-Cloud-Prometheus-Kubernetes-Api

Debian 11 & 12

apt install centreon-plugin-cloud-prometheus-kubernetes-api

CentOS 7

yum install centreon-plugin-Cloud-Prometheus-Kubernetes-Api
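The per-OS commands above differ only in the package manager and, on Debian, in the lowercase package name. As a minimal sketch, a small shell helper can print the right command for a given OS family. The family keywords used here (el8, el9, debian, centos7) are hypothetical labels chosen for this example, not Centreon identifiers.

```shell
#!/bin/sh
# Sketch: print the plugin installation command for a given OS family.
# The keywords el8, el9, debian and centos7 are hypothetical labels
# invented for this example.
plugin_install_cmd() {
    case "$1" in
        el8|el9)  echo "dnf install centreon-plugin-Cloud-Prometheus-Kubernetes-Api" ;;
        debian)   echo "apt install centreon-plugin-cloud-prometheus-kubernetes-api" ;;
        centos7)  echo "yum install centreon-plugin-Cloud-Prometheus-Kubernetes-Api" ;;
        *)        echo "unsupported OS family: $1" >&2; return 1 ;;
    esac
}

# Print the command for a Debian poller (note the lowercase package name).
plugin_install_cmd debian
```

The helper only prints the command, so it can be reviewed before being run with the appropriate privileges on the poller.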

Using the monitoring connector

Using a host template provided by the connector

Log into Centreon and add a new host through Configuration > Hosts.
Fill in the Name, Alias & IP Address/DNS fields according to your resource's settings.
Apply the Cloud-Prometheus-Kubernetes-Api-custom template to the host. A list of macros appears. Macros allow you to define how the connector will connect to the resource, and to customize the connector's behavior.
Fill in the macros you want. Some macros are mandatory.

Macro	Description	Default value	Mandatory
PROMETHEUSAPIHOSTNAME	Prometheus hostname		X
PROMETHEUSAPIPROTO	Specify https if needed (default: 'http')	http
PROMETHEUSAPIPORT	API port (default: 9090)	9090
PROMETHEUSAPIURL	API url path (default: '/api/v1')	/api/v1
EXTRAOPTIONS	Any extra option you may want to add to every command (a --verbose flag for example). All options are listed here.

Deploy the configuration. The host appears in the list of hosts, and on the Resources Status page. The command that is sent by the connector is displayed in the details panel of the host: it shows the values of the macros.
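The four PROMETHEUSAPI* macros together describe one HTTP endpoint. The sketch below shows how their default values would presumably combine into a base URL; the exact way the plugin assembles the URL is an assumption, and prometheus.example.com is a placeholder hostname.

```shell
#!/bin/sh
# Sketch: assemble the Prometheus API base URL from the host macro values.
# Arguments: proto, hostname, port, URL path.
# How the plugin builds its URL internally is an assumption of this example.
prometheus_base_url() {
    printf '%s://%s:%s%s\n' "$1" "$2" "$3" "$4"
}

# prometheus.example.com is a placeholder, not a real endpoint.
base_url=$(prometheus_base_url http prometheus.example.com 9090 /api/v1)
echo "$base_url"

# To probe the endpoint by hand you could then run, for instance:
#   curl -G "$base_url/query" --data-urlencode 'query=up'
```

Checking the resulting URL with curl before deploying can help separate network or authentication problems from macro mistakes.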

Using a service template provided by the connector

If you have used a host template and checked Create Services linked to the Template too, the services linked to the template have been created automatically, using the corresponding service templates. Otherwise, manually create the services you want and apply a service template to them.
Fill in the macros you want (e.g. to change the thresholds for the alerts). Some macros are mandatory (see the table below).

Container-Status
Daemonset-Status
Deployment-Status
Namespace-Status
Node-Status

Container-Status

Macro	Description	Default value
CONTAINER	Filter on a specific container (must be a PromQL filter, Default: 'container=~".*"')	container=~".*"
POD	Filter on a specific pod (must be a PromQL filter, Default: 'pod=~".*"')	pod=~".*"
WARNINGRESTARTSCOUNT	Warning threshold for container restarts count
CRITICALRESTARTSCOUNT	Critical threshold for container restarts count
CRITICALSTATUS	Define the conditions to match for the status to be CRITICAL (default: '%{status} !~ /running/ || %{state} !~ /ready/'). You can use the following variables: %{status}, %{state}, %{reason}	%{status} !~ /running/ || %{state} !~ /ready/
WARNINGSTATUS	Define the conditions to match for the status to be WARNING (default: ''). You can use the following variables: %{status}, %{state}, %{reason}
EXTRAOPTIONS	Any extra option you may want to add to the command (a --verbose flag for example). All options are listed here.	--verbose

Daemonset-Status

Macro	Description	Default value
DAEMONSET	Filter on a specific daemonset (must be a PromQL filter, Default: 'daemonset=~".*"')	daemonset=~".*"
WARNINGSTATUS	Define the conditions to match for the status to be WARNING (default: '%{up_to_date} < %{desired}') You can use the following variables: %{display}, %{desired}, %{current}, %{available}, %{unavailable}, %{up_to_date}, %{ready}, %{misscheduled}	%{up_to_date} < %{desired}
CRITICALSTATUS	Define the conditions to match for the status to be CRITICAL (default: '%{available} < %{desired}'). You can use the following variables: %{display}, %{desired}, %{current}, %{available}, %{unavailable}, %{up_to_date}, %{ready}, %{misscheduled}	%{available} < %{desired}
EXTRAOPTIONS	Any extra option you may want to add to the command (a --verbose flag for example). All options are listed here.	--verbose

Deployment-Status

Macro	Description	Default value
DEPLOYMENT	Filter on a specific deployment (must be a PromQL filter, Default: 'deployment=~".*"')	deployment=~".*"
WARNINGSTATUS	Define the conditions to match for the status to be WARNING (default: '%{up_to_date} < %{desired}') You can use the following variables: %{display}, %{desired}, %{current}, %{available}, %{unavailable}, %{up_to_date}	%{up_to_date} < %{desired}
CRITICALSTATUS	Define the conditions to match for the status to be CRITICAL (default: '%{available} < %{desired}'). You can use the following variables: %{display}, %{desired}, %{current}, %{available}, %{unavailable}, %{up_to_date}	%{available} < %{desired}
EXTRAOPTIONS	Any extra option you may want to add to the command (a --verbose flag for example). All options are listed here.	--verbose

Namespace-Status

Macro	Description	Default value
NAMESPACE	Filter on a specific namespace (must be a PromQL filter, Default: 'namespace=~".*"')	namespace=~".*"
PHASE	Filter on a specific phase (must be a PromQL filter, Default: 'phase=~".*"')	phase=~".*"
CRITICALSTATUS	Define the conditions to match for the status to be CRITICAL (default: '%{phase} !~ /Active/'). You can use the following variables: %{display}, %{phase}	%{phase} !~ /Active/
WARNINGSTATUS	Define the conditions to match for the status to be WARNING (default: '') You can use the following variables: %{display}, %{phase}
EXTRAOPTIONS	Any extra option you may want to add to the command (a --verbose flag for example). All options are listed here.	--verbose

Node-Status

Macro	Description	Default value
NODE	Filter on a specific node (must be a PromQL filter, Default: 'node=~".*"')	node=~".*"
UNITS	Units of thresholds (default: '') (can be '%')
WARNINGALLOCATEDPODS	Warning threshold for pods allocation
CRITICALALLOCATEDPODS	Critical threshold for pods allocation
CRITICALSTATUS	Define the conditions to match for the status to be CRITICAL (default: '%{status} !~ /Ready/ || %{schedulable} != /false/'). You can use the following variables: %{display}, %{status}, %{schedulable}	%{status} !~ /Ready/ || %{schedulable} =~ /false/
WARNINGSTATUS	Define the conditions to match for the status to be WARNING (default: '') You can use the following variables: %{display}, %{status}, %{schedulable}
EXTRAOPTIONS	Any extra option you may want to add to the command (a --verbose flag for example). All options are listed here.	--verbose

Deploy the configuration. The service appears in the list of services, and on the Resources Status page. The command that is sent by the connector is displayed in the details panel of the service: it shows the values of the macros.
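The default WARNINGSTATUS and CRITICALSTATUS conditions of the Daemonset-Status and Deployment-Status templates compare replica counters. The following sketch emulates those two defaults in plain shell to show how the resulting state can be read. It assumes that a matching CRITICAL condition takes precedence over a matching WARNING condition, and the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: emulate the default daemonset/deployment status conditions.
# Arguments: desired, available, up_to_date (integers).
# Assumption: CRITICAL wins when both conditions match.
replica_status() {
    desired=$1; available=$2; up_to_date=$3
    if [ "$available" -lt "$desired" ]; then
        echo CRITICAL      # %{available} < %{desired}
    elif [ "$up_to_date" -lt "$desired" ]; then
        echo WARNING       # %{up_to_date} < %{desired}
    else
        echo OK
    fi
}

replica_status 3 3 3   # prints OK
replica_status 3 3 2   # prints WARNING
replica_status 3 1 3   # prints CRITICAL
```

Overriding either macro replaces the corresponding condition, so a stricter or looser expression can be tested the same way before it is deployed.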

How to check in the CLI that the configuration is OK and what the main options are for

Once the plugin is installed, log into your Centreon poller's CLI using the centreon-engine user account (su - centreon-engine). Test that the connector is able to monitor a resource using a command like this one (replace the sample values with your own):

/usr/lib/centreon/plugins/centreon_prometheus_kubernetes_api.pl \
	--plugin=cloud::prometheus::direct::kubernetes::plugin \
	--mode=node-status \
	--hostname= \
	--url-path='/api/v1' \
	--port='9090' \
	--proto='http' \
	--node='node=~".*"' \
	--warning-status='' \
	--critical-status='%{status} !~ /Ready/ || %{schedulable} =~ /false/' \
	--warning-allocated-pods='' \
	--critical-allocated-pods='' \
	--units='' \
	--verbose

The expected command output is shown below:

OK: All nodes status are ok | '*nodes*#pods.allocated.count'=42;;;;
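A plugin output line has two parts separated by a pipe: the status text and the performance data. As a rough sketch, POSIX parameter expansion is enough to split such a line when checking results by hand. The instance name node-1 and the helper name perf_value are hypothetical.

```shell
#!/bin/sh
# Sketch: split a plugin output line into status text and perfdata value.
# node-1 is a hypothetical instance name used for illustration.
output="OK: All nodes status are ok | 'node-1#pods.allocated.count'=42;;;;"

# Everything before " | " is the human-readable status.
status_text=${output%% | *}
# Everything after " | " is the performance data.
perfdata=${output#* | }

# Keep what follows '=' and drop the warning/critical/min/max fields
# that come after the first ';'.
perf_value() {
    value=${1#*=}
    echo "${value%%;*}"
}

echo "$status_text"
perf_value "$perfdata"
```

This handles a single metric only; an output carrying several metrics would need to be split on spaces first.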

Troubleshooting

Please find the troubleshooting documentation for the API-based plugins in this chapter.

Available modes

In most cases, a mode corresponds to a service template. The mode appears in the execution command for the connector. In the Centreon interface, you don't need to specify a mode explicitly: its use is implied when you apply a service template. However, you will need to specify the correct mode for the template if you want to test the execution command for the connector in your terminal.

All available modes can be displayed by adding the --list-mode parameter to the command:

/usr/lib/centreon/plugins/centreon_prometheus_kubernetes_api.pl \
	--plugin=cloud::prometheus::direct::kubernetes::plugin \
	--list-mode

The plugin brings the following modes:

Mode	Linked service template
container-status	Cloud-Prometheus-Kubernetes-Container-Status-Api-custom
daemonset-status	Cloud-Prometheus-Kubernetes-Daemonset-Status-Api-custom
deployment-status	Cloud-Prometheus-Kubernetes-Deployment-Status-Api-custom
list-containers	Not used in this Monitoring Connector
list-daemonsets	Not used in this Monitoring Connector
list-deployments	Not used in this Monitoring Connector
list-namespaces	Not used in this Monitoring Connector
list-nodes	Not used in this Monitoring Connector
list-services	Not used in this Monitoring Connector
namespace-status	Cloud-Prometheus-Kubernetes-Namespace-Status-Api-custom
node-status	Cloud-Prometheus-Kubernetes-Node-Status-Api-custom
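In the table above, each status mode maps to a service template whose name follows a regular pattern. The sketch below derives the template name from the mode name, which can be handy when scripting CLI tests. The function name is hypothetical and the pattern is inferred from this table only.

```shell
#!/bin/sh
# Sketch: derive the linked service template name from a *-status mode.
# The naming pattern is inferred from the table, not from the plugin code.
template_for_mode() {
    case "$1" in
        container-status|daemonset-status|deployment-status|namespace-status|node-status)
            resource=${1%-status}
            # Capitalize the first letter (e.g. node -> Node).
            first=$(printf '%s' "$resource" | cut -c1 | tr '[:lower:]' '[:upper:]')
            rest=$(printf '%s' "$resource" | cut -c2-)
            echo "Cloud-Prometheus-Kubernetes-${first}${rest}-Status-Api-custom"
            ;;
        *)
            # list-* modes are not linked to a service template.
            echo "no service template for mode: $1" >&2
            return 1
            ;;
    esac
}

template_for_mode node-status
```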

Available options

Generic options

All generic options are listed here:

Option	Description
--mode	Define the mode in which you want the plugin to be executed (see --list-mode).
--dyn-mode	Specify a mode with the module's path (advanced).
--list-mode	List all available modes.
--mode-version	Check the minimal version of the mode. If the requirement is not met, an UNKNOWN error is returned.
--version	Return the version of the plugin.
--custommode	When a plugin offers several ways (CLI, library, etc.) to get information the desired one must be defined with this option.
--list-custommode	List all available custom modes.
--multiple	Multiple custom mode objects. This may be required by some specific modes (advanced).
--pass-manager	Define the password manager you want to use. Supported managers are: environment, file, keepass, hashicorpvault and teampass.
--verbose	Display extended status information (long output).
--debug	Display debug messages.
--filter-perfdata	Filter perfdata that match the regexp. Example: adding --filter-perfdata='avg' will remove all metrics that do not contain 'avg' from performance data.
--filter-perfdata-adv	Filter perfdata based on a "if" condition using the following variables: label, value, unit, warning, critical, min, max. Variables must be written either %{variable} or %(variable). Example: adding --filter-perfdata-adv='not (%(value) == 0 and %(max) eq "")' will remove all metrics whose value equals 0 and that don't have a maximum value.
--explode-perfdata-max	Create a new metric for each metric that comes with a maximum limit. The new metric will be named identically with a '_max' suffix. Example: it will split 'used_prct'=26.93%;0:80;0:90;0;100 into 'used_prct'=26.93%;0:80;0:90;0;100 'used_prct_max'=100%;;;;
--change-perfdata --extend-perfdata	Change or extend perfdata. Syntax: --extend-perfdata=searchlabel,newlabel,target[,[newuom],[min],[max]] Common examples: Convert storage free perfdata into used: --change-perfdata='free,used,invert()' Convert storage used perfdata into free: --change-perfdata='used,free,invert()' Scale traffic values automatically: --change-perfdata='traffic,,scale(auto)' Scale traffic values in Mbps: --change-perfdata='traffic_in,,scale(Mbps),mbps' Change traffic values in percent: --change-perfdata='traffic_in,,percent()'
--extend-perfdata-group	Add new aggregated metrics (min, max, average or sum) for groups of metrics defined by a regex match on the metrics' names. Syntax: --extend-perfdata-group=regex,namesofnewmetrics,calculation[,[newuom],[min],[max]] regex: regular expression namesofnewmetrics: how the new metrics' names are composed (can use $1, $2... for groups defined by () in regex). calculation: how the values of the new metrics should be calculated newuom (optional): unit of measure for the new metrics min (optional): lowest value the metrics can reach max (optional): highest value the metrics can reach Common examples: Sum wrong packets from all interfaces (with interface need --units-errors=absolute): --extend-perfdata-group=',packets_wrong,sum(packets_(discard|error)_(in|out))' Sum traffic by interface: --extend-perfdata-group='traffic_in_(.*),traffic_$1,sum(traffic_(in|out)_$1)'
--change-short-output --change-long-output	Modify the short/long output that is returned by the plugin. Syntax: --change-short-output=pattern~~replacement~~modifier Most commonly used modifiers are i (case insensitive) and g (replace all occurrences). Example: adding --change-short-output='OK~~Up~~gi' will replace all occurrences of 'OK', 'ok', 'Ok' or 'oK' with 'Up'
--change-exit	Replace an exit code with one of your choice. Example: adding --change-exit=unknown=critical will result in a CRITICAL state instead of an UNKNOWN state.
--range-perfdata	Rewrite the ranges displayed in the perfdata. Accepted values: 0: nothing is changed. 1: if the lower value of the range is equal to 0, it is removed. 2: remove the thresholds from the perfdata.
--filter-uom	Mask the units when they don't match the given regular expression.
--opt-exit	Replace the exit code in case of an execution error (i.e. wrong option provided, SSH connection refused, timeout, etc). Default: unknown.
--output-ignore-perfdata	Remove all the metrics from the service. The service will still have a status and an output.
--output-ignore-label	Remove the status label ("OK:", "WARNING:", "UNKNOWN:", "CRITICAL:") from the beginning of the output. Example: 'OK: Ram Total:...' will become 'Ram Total:...'
--output-xml	Return the output in XML format (to send to an XML API).
--output-json	Return the output in JSON format (to send to a JSON API).
--output-openmetrics	Return the output in OpenMetrics format (to send to a tool expecting this format).
--output-file	Write output in file (can be combined with json, xml and openmetrics options). E.g.: --output-file=/tmp/output.txt will write the output in /tmp/output.txt.
--disco-format	Applies only to modes beginning with 'list-'. Returns the list of available macros to configure a service discovery rule (formatted in XML).
--disco-show	Applies only to modes beginning with 'list-'. Returns the list of discovered objects (formatted in XML) for service discovery.
--float-precision	Define the float precision for thresholds (default: 8).
--source-encoding	Define the character encoding of the response sent by the monitored resource. Default: 'UTF-8'.

Prometheus Rest API

Option	Description
--timeframe	Set timeframe in seconds (i.e. 3600 to check last hour).
--step	Set the step of the metric query (examples: '30s', '1m', '15m','1h').
--hostname	Prometheus hostname.
--url-path	API url path (default: '/api/v1')
--port	API port (default: 9090)
--proto	Specify https if needed (default: 'http')
--credentials	Specify this option if you access the API with authentication
--username	Specify the username for authentication (mandatory if --credentials is specified)
--password	Specify the password for authentication (mandatory if --credentials is specified)
--basic	Specify this option if you access the API over basic authentication and don't want a '401 UNAUTHORIZED' error to be logged on your web server. Specify this option if you access the API over hidden basic authentication or you'll get a '404 NOT FOUND' error. (use with --credentials)
--timeout	Set HTTP timeout
--header	Set HTTP header (can be multiple, example: --header='Authorization:Bearer ABCD') Useful to access Prometheus API hosted in a specific environment.
--http-peer-addr	Set the address you want to connect to. Useful if hostname is only a vhost, to avoid IP resolution.
--proxyurl	Proxy URL. Example: http://my.proxy:3128
--proxypac	Proxy pac file (can be a URL or a local file).
--insecure	Accept insecure SSL connections.
--http-backend	Perl library to use for HTTP transactions. Possible values are: lwp (default) and curl.
--ssl-opt	Set SSL Options (--ssl-opt="SSL_version => TLSv1" --ssl-opt="SSL_verify_mode => SSL_VERIFY_NONE").
--curl-opt	Set CURL Options (--curl-opt="CURLOPT_SSL_VERIFYPEER => 0" --curl-opt="CURLOPT_SSLVERSION => CURL_SSLVERSION_TLSv1_1" ).

Modes options

All available options for each service template are listed below:

Container-Status
Daemonset-Status
Deployment-Status
Namespace-Status
Node-Status

Container-Status

Option	Description
--container	Filter on a specific container (must be a PromQL filter, Default: 'container=~".*"')
--pod	Filter on a specific pod (must be a PromQL filter, Default: 'pod=~".*"')
--warning-status	Define the conditions to match for the status to be WARNING (default: ''). You can use the following variables: %{status}, %{state}, %{reason}
--critical-status	Define the conditions to match for the status to be CRITICAL (default: '%{status} !~ /running/ || %{state} !~ /ready/'). You can use the following variables: %{status}, %{state}, %{reason}
--warning-restarts-count	Warning threshold for container restarts count.
--critical-restarts-count	Critical threshold for container restarts count.
--extra-filter	Add a PromQL filter (can be defined multiple times). Example: --extra-filter='name=~".*pretty.*"'
--metric-overload	Overload default metrics name (can be defined multiple times) Example : --metric-overload='metric,^my_metric_name$' Default : - ready: ^kube_pod_container_status_ready$ - running: ^kube_pod_container_status_running$ - terminated: ^kube_pod_container_status_terminated$ - terminated_reason: ^kube_pod_container_status_terminated_reason$ - waiting: ^kube_pod_container_status_waiting$ - waiting_reason: ^kube_pod_container_status_waiting_reason$ - restarts: ^kube_pod_container_status_restarts_total$
--filter-counters	Only display some counters (regexp can be used). Example: --filter-counters='status'

Daemonset-Status

Option	Description
--daemonset	Filter on a specific daemonset (must be a PromQL filter, Default: 'daemonset=~".*"')
--warning-status	Define the conditions to match for the status to be WARNING (default: '%{up_to_date} < %{desired}') You can use the following variables: %{display}, %{desired}, %{current}, %{available}, %{unavailable}, %{up_to_date}, %{ready}, %{misscheduled}
--critical-status	Define the conditions to match for the status to be CRITICAL (default: '%{available} < %{desired}'). You can use the following variables: %{display}, %{desired}, %{current}, %{available}, %{unavailable}, %{up_to_date}, %{ready}, %{misscheduled}
--extra-filter	Add a PromQL filter (can be defined multiple times). Example: --extra-filter='name=~".*pretty.*"'
--metric-overload	Overload default metrics name (can be defined multiple times) Example : --metric-overload='metric,^my_metric_name$' Default : - desired: ^kube_daemonset_status_desired_number_scheduled$ - current: ^kube_daemonset_status_current_number_scheduled$ - available: ^kube_daemonset_status_number_available$ - unavailable: ^kube_daemonset_status_number_unavailable$ - up_to_date: ^kube_daemonset_updated_number_scheduled$ - ready: ^kube_daemonset_status_number_ready$ - misscheduled: ^kube_daemonset_status_number_misscheduled$

Deployment-Status

Option	Description
--deployment	Filter on a specific deployment (must be a PromQL filter, Default: 'deployment=~".*"')
--warning-status	Define the conditions to match for the status to be WARNING (default: '%{up_to_date} < %{desired}') You can use the following variables: %{display}, %{desired}, %{current}, %{available}, %{unavailable}, %{up_to_date}
--critical-status	Define the conditions to match for the status to be CRITICAL (default: '%{available} < %{desired}'). You can use the following variables: %{display}, %{desired}, %{current}, %{available}, %{unavailable}, %{up_to_date}
--extra-filter	Add a PromQL filter (can be defined multiple times). Example: --extra-filter='name=~".*pretty.*"'
--metric-overload	Overload default metrics name (can be defined multiple times) Example : --metric-overload='metric,^my_metric_name$' Default : - desired: ^kube_deployment_spec_replicas$ - current: ^kube_deployment_status_replicas$ - available: ^kube_deployment_status_replicas_available$ - unavailable: ^kube_deployment_status_replicas_unavailable$ - up_to_date: ^kube_deployment_status_replicas_updated$

Namespace-Status

Option	Description
--namespace	Filter on a specific namespace (must be a PromQL filter, Default: 'namespace=~".*"')
--phase	Filter on a specific phase (must be a PromQL filter, Default: 'phase=~".*"')
--warning-status	Define the conditions to match for the status to be WARNING (default: '') You can use the following variables: %{display}, %{phase}.
--critical-status	Define the conditions to match for the status to be CRITICAL (default: '%{phase} !~ /Active/'). You can use the following variables: %{display}, %{phase}
--extra-filter	Add a PromQL filter (can be defined multiple times). Example: --extra-filter='name=~".*pretty.*"'
--metric-overload	Overload default metrics name (can be defined multiple times) Example : --metric-overload='metric,^my_metric_name$' Default : - status: ^kube_namespace_status_phase$
--filter-counters	Only display some counters (regexp can be used). Example: --filter-counters='status'

Node-Status

Option	Description
--node	Filter on a specific node (must be a PromQL filter, Default: 'node=~".*"')
--warning-status	Define the conditions to match for the status to be WARNING (default: '') You can use the following variables: %{display}, %{status}, %{schedulable}
--critical-status	Define the conditions to match for the status to be CRITICAL (default: '%{status} !~ /Ready/ || %{schedulable} != /false/'). You can use the following variables: %{display}, %{status}, %{schedulable}
--warning-allocated-pods	Warning threshold for pods allocation.
--critical-allocated-pods	Critical threshold for pods allocation.
--units	Units of thresholds (default: '') (can be '%').
--extra-filter	Add a PromQL filter (can be defined multiple times). Example: --extra-filter='name=~".*pretty.*"'
--metric-overload	Overload default metrics name (can be defined multiple times) Example : --metric-overload='metric,^my_metric_name$' Default : - status: ^kube_node_status_condition$ - unschedulable: ^kube_node_spec_unschedulable$ - capacity: ^kube_node_status_capacity_pods$ - allocatable: ^kube_node_status_allocatable_pods$ - allocated: ^kubelet_running_pod_count$
--filter-counters	Only display some counters (regexp can be used). Example: --filter-counters='status'

All available options for a given mode can be displayed by adding the --help parameter to the command:

/usr/lib/centreon/plugins/centreon_prometheus_kubernetes_api.pl \
	--plugin=cloud::prometheus::direct::kubernetes::plugin \
	--mode=node-status \
	--help
