Linux Telegraf Agent

Telegraf is an observability tool implementing the OpenTelemetry protocol.

This monitoring connector is a proof of concept; Centreon does not recommend using it in production. It has limitations, including:

  • the need to restart the Telegraf service whenever the configuration is changed.
  • the inability to display the informational output message of the host or service (due to limitations of the OpenTelemetry protocol).

You may refer to this page for more information about Centreon's integration with Telegraf.

Pack assets

Templates

The Monitoring Connector Linux Telegraf Agent brings a host template:

  • OS-Linux-Telegraf-Agent-custom

The connector brings the following service templates (sorted by the host template they are attached to):

| Service Alias | Service Template | Service Description |
| --- | --- | --- |
| Cpu | OS-Linux-Cpu-Telegraf-Agent-custom | Check the rate of utilization of CPUs for the machine. This check can give the average CPU utilization rate and the rate per CPU for multi-core CPUs |
| Load | OS-Linux-Load-Telegraf-Agent-custom | Check the server load average |
| Memory | OS-Linux-Memory-Telegraf-Agent-custom | Check the rate of the utilization of memory |
| Ntp | OS-Linux-Ntp-Telegraf-Agent-custom | Check system time synchronization with an NTP server |
| Swap | OS-Linux-Swap-Telegraf-Agent-custom | Check virtual memory usage |
| Uptime | OS-Linux-Uptime-Telegraf-Agent-custom | Time since the server has been working and available |

The services listed above are created automatically when the OS-Linux-Telegraf-Agent-custom host template is used.

Collected metrics & status

Here is the list of services for this connector, detailing all metrics linked to each service.

| Metric name | Unit |
| --- | --- |
| command.exit.code.count | count |

Prerequisites

The prerequisites below have to be applied to the Linux servers to be monitored.

Network flow

Two TCP flows must be open from the host to the poller.

| Source | Destination | Protocol | Port | Purpose |
| --- | --- | --- | --- | --- |
| Monitored host | Poller | TCP | 1443 | Access to the Telegraf agent's configuration. |
| Monitored host | Poller | TCP | 4317 | OpenTelemetry data flow. |
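
To confirm that both flows are open, a quick TCP probe can be run from the monitored host. The sketch below is only an illustration: the `port_open` helper is hypothetical (it is not shipped by Centreon or Telegraf), it assumes that bash with `/dev/tcp` support and the coreutils `timeout` command are available, and it only shows that the port accepts TCP connections, not that Engine answers correctly.

```shell
# Hypothetical helper, not part of Centreon or Telegraf.
# Returns 0 if a TCP connection to host ($1) and port ($2) succeeds within 3 seconds.
port_open() {
    # Inside the bash -c script, $0 is the host and $1 is the port.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example, to be run on the monitored host with your poller's FQDN:
#   for port in 1443 4317; do
#       if port_open mypoller.local "$port"; then echo "$port open"; else echo "$port closed"; fi
#   done
```

A "closed" result usually points to a firewall rule or to Engine not listening yet; it does not say which of the two.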

System prerequisites on the poller

To be able to use the Telegraf agent, you must use a poller with at least version 24.04.2 of centreon-engine. The Telegraf agent will configure itself via an HTTPS request sent to Centreon Engine.
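
The version requirement can be checked from the shell. The following is a minimal sketch, assuming GNU `sort` (for its `-V` version ordering); the `version_ge` helper name is hypothetical.

```shell
# Hypothetical helper: succeeds if version $1 is greater than or equal to version $2.
# Relies on GNU sort -V, which orders dotted version strings numerically.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example on an RPM-based poller (package name taken from the text above):
#   version_ge "$(rpm -q --qf '%{VERSION}' centreon-engine)" 24.04.2 \
#       && echo "centreon-engine is recent enough"
```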

  1. For this to work, you must first get a valid certificate or generate a self-signed one on the poller as detailed below.

In the command below, replace ${HOSTNAME} with the poller's FQDN if they don't match. If you set an IP address in the access parameters to the configuration server instead of an FQDN, the Telegraf agent will refuse the certificate.

openssl req -new -subj "/CN=${HOSTNAME}" -addext "subjectAltName = DNS:${HOSTNAME}" -newkey rsa:2048 -sha256 -days 365 -nodes -x509 -keyout /etc/centreon-engine/conf-server.key -out /etc/centreon-engine/conf-server.crt
chown centreon-engine: /etc/centreon-engine/conf-*

The -days 365 option limits the certificate validity to one year. You may choose a longer or shorter duration according to your security/maintenance preferences.
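
Since the certificate expires, it can be useful to check how much validity is left. The sketch below is one possible way to do it with `openssl x509 -checkend`; the `cert_days_ok` helper and its 30-day default are assumptions made for this example.

```shell
# Hypothetical helper: succeeds if the certificate in $1 is still valid
# in $2 days from now (default: 30 days).
cert_days_ok() {
    cert=$1
    days=${2:-30}
    # -checkend N exits with 0 if the certificate does not expire within N seconds.
    openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null
}

# Example on the poller:
#   cert_days_ok /etc/centreon-engine/conf-server.crt 30 || echo "renew the certificate soon"
#   openssl x509 -in /etc/centreon-engine/conf-server.crt -noout -subject -enddate
```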

  2. Then provide Engine with the connection information that it must pass to the Telegraf agent, so that the agent can send data to Engine.
cat > /etc/centreon-engine/otl_server.json <<EOF
{
  "otel_server": {
    "host": "0.0.0.0",
    "port": 4317,
    "encryption": true,
    "certificate_path": "/etc/centreon-engine/conf-server.crt",
    "key_path": "/etc/centreon-engine/conf-server.key"
  },
  "max_length_grpc_log": 0,
  "telegraf_conf_server": {
    "http_server": {
      "port": 1443,
      "encryption": true,
      "certificate_path": "/etc/centreon-engine/conf-server.crt",
      "key_path": "/etc/centreon-engine/conf-server.key"
    },
    "engine_otel_endpoint": "${HOSTNAME}:4317",
    "check_interval": 60
  }
}
EOF
chown centreon-engine: /etc/centreon-engine/otl_server.json
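
Because the heredoc delimiter (EOF) is not quoted, the shell replaces ${HOSTNAME} at the moment the file is written, so the file ends up with a literal host name. The sketch below reads back the endpoint that was actually written; the `show_otel_endpoint` helper is hypothetical and uses a simple `sed` extraction rather than a real JSON parser.

```shell
# Hypothetical helper: prints the value of "engine_otel_endpoint" found in the
# given file (default: the path used above).
show_otel_endpoint() {
    conf=${1:-/etc/centreon-engine/otl_server.json}
    # Keep only what sits between the quotes after the key name.
    sed -n 's/.*"engine_otel_endpoint"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$conf"
}

# Example on the poller:
#   show_otel_endpoint
```

The printed host name should be the FQDN used as the certificate's CN; if it is not, edit the file and restart Engine.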

Configure Engine

  1. In the Configuration > Pollers > Engine configuration menu, on the Data tab, add an entry to the Broker modules to load and enter the /usr/lib64/centreon-engine/libopentelemetry.so /etc/centreon-engine/otl_server.json directive. Save the form.

  2. Export the poller's configuration, selecting the Restart option.

System prerequisites on the monitored host

The prerequisites below must be met on the Linux servers to monitor with the Telegraf agent.

In the next steps, replace mypoller.local with the poller's FQDN. Make sure you use the same FQDN as the one used to create the certificate.

  1. Accept the poller's certificate (if it has been self-signed).
openssl s_client -connect mypoller.local:1443 2>/dev/null </dev/null |  sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' >  /etc/pki/ca-trust/source/anchors/mypoller.local.crt
update-ca-trust

Check the certificate's validity with this command:

curl https://mypoller.local:1443/engine

At this stage, the expected response is:

<html><body>No host service found from get parameters</body></html>
  2. Install the Telegraf agent and some dependencies.
dnf -y install epel-release
dnf -y config-manager --set-enabled 'powertools'

This part is an excerpt from Telegraf's official documentation.

cat > /etc/yum.repos.d/influxdb.repo <<'EOF'
[influxdb]
name = InfluxData Repository - Stable
baseurl = https://repos.influxdata.com/stable/$basearch/main
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdata-archive_compat.key
EOF

dnf install -y telegraf
  3. Set up the Telegraf agent so that it retrieves its configuration from the poller.
cat > /etc/default/telegraf <<EOF
TELEGRAF_OPTS='--config-url-watch-interval 120s --config=https://mypoller.local:1443/engine?host=$HOSTNAME'
EOF
systemctl restart telegraf
  • Make sure you replace mypoller.local with the poller's FQDN.
  • If the name of the host to monitor is different from the $HOSTNAME variable's content, you'll have to change it inside the /etc/default/telegraf file.
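
Note that the heredoc above uses an unquoted EOF delimiter, so $HOSTNAME is expanded when the file is written. When the name declared in Centreon differs from it, the line can be generated explicitly. The sketch below is an illustration only; the `telegraf_opts_line` helper is hypothetical and simply prints the same line with the two values filled in.

```shell
# Hypothetical helper: prints the TELEGRAF_OPTS line for a given poller FQDN ($1)
# and monitored host name ($2, defaults to the output of hostname).
telegraf_opts_line() {
    poller_fqdn=$1
    monitored_host=${2:-$(hostname)}
    printf "TELEGRAF_OPTS='--config-url-watch-interval 120s --config=https://%s:1443/engine?host=%s'\n" \
        "$poller_fqdn" "$monitored_host"
}

# Example (as root on the monitored host; "web01" is a sample host name):
#   telegraf_opts_line mypoller.local web01 > /etc/default/telegraf
#   systemctl restart telegraf
```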
  4. To allow the Telegraf agent to run checks on systemd-journal, run these commands.
usermod -a -G systemd-journal telegraf
systemctl restart telegraf
  5. Add the Centreon plugins repository and install the local Linux plugin.
cat >/etc/yum.repos.d/centreon-plugins.repo <<'EOF'
[centreon-plugins-stable]
name=Centreon plugins repository.
baseurl=https://packages.centreon.com/rpm-plugins/el8/stable/$basearch/
enabled=1
gpgcheck=1
gpgkey=https://yum-gpg.centreon.com/RPM-GPG-KEY-CES
module_hotfixes=1

[centreon-plugins-stable-noarch]
name=Centreon plugins repository.
baseurl=https://packages.centreon.com/rpm-plugins/el8/stable/noarch/
enabled=1
gpgcheck=1
gpgkey=https://yum-gpg.centreon.com/RPM-GPG-KEY-CES
module_hotfixes=1

[centreon-plugins-testing]
name=Centreon plugins repository. (UNSUPPORTED)
baseurl=https://packages.centreon.com/rpm-plugins/el8/testing/$basearch/
enabled=0
gpgcheck=1
gpgkey=https://yum-gpg.centreon.com/RPM-GPG-KEY-CES
module_hotfixes=1

[centreon-plugins-testing-noarch]
name=Centreon plugins repository. (UNSUPPORTED)
baseurl=https://packages.centreon.com/rpm-plugins/el8/testing/noarch/
enabled=0
gpgcheck=1
gpgkey=https://yum-gpg.centreon.com/RPM-GPG-KEY-CES
module_hotfixes=1

[centreon-plugins-unstable]
name=Centreon plugins repository. (UNSUPPORTED)
baseurl=https://packages.centreon.com/rpm-plugins/el8/unstable/$basearch/
enabled=0
gpgcheck=1
gpgkey=https://yum-gpg.centreon.com/RPM-GPG-KEY-CES
module_hotfixes=1

[centreon-plugins-unstable-noarch]
name=Centreon plugins repository. (UNSUPPORTED)
baseurl=https://packages.centreon.com/rpm-plugins/el8/unstable/noarch/
enabled=0
gpgcheck=1
gpgkey=https://yum-gpg.centreon.com/RPM-GPG-KEY-CES
module_hotfixes=1
EOF

dnf install -y centreon-plugin-Operatingsystems-Linux-Local.noarch
  6. Restart the telegraf service.
systemctl restart telegraf

Installing the monitoring connector

Pack

  1. If the platform uses an online license, you can skip the package installation instruction below as it is not required to have the connector displayed within the Configuration > Monitoring Connector Manager menu. If the platform uses an offline license, install the package on the central server with the command corresponding to the operating system's package manager:
dnf install centreon-pack-operatingsystems-linux-telegraf-agent
  2. Whatever the license type (online or offline), install the Linux Telegraf-Agent connector through the Configuration > Monitoring Connector Manager menu.

  3. Add the new connector.

In the Configuration > Commands > Connectors menu, click Add and fill the form as detailed below.

| Parameter | Value |
| --- | --- |
| Connector Name | Telegraf Agent |
| Connector Description | Telegraf Agent |
| Command Line | opentelemetry --processor=nagios_telegraf --extractor=attributes --host_path=resourceMetrics.scopeMetrics.metrics.dataPoints.attributes.host --service_path=resourceMetrics.scopeMetrics.metrics.dataPoints.attributes.service |
| Used by command | Select all the commands whose names match OS-Linux-Telegraf-Agent-* |
| Connector Status | Enabled |

Plugin

This monitoring connector relies on an integration supported by Centreon Engine and does not require any particular plugin on the pollers.

Using the monitoring connector

Using a host template provided by the connector

  1. Log into Centreon and add a new host through Configuration > Hosts.
  2. Fill in the Name, Alias & IP Address/DNS fields according to your resource's settings.
  3. Apply the OS-Linux-Telegraf-Agent-custom template to the host. A list of macros appears. Macros allow you to define how the connector will connect to the resource, and to customize the connector's behavior.
  4. Fill in the macros you want. Some macros are mandatory.
| Macro | Description | Default value | Mandatory |
| --- | --- | --- | --- |
| TELEGRAFPLUGINS | Path where the Centreon Plugins can be found. | /usr/lib/centreon/plugins | X |
| TELEGRAFSTATEFILEDIR | Define the cache directory. | /var/lib/telegraf | X |
| TELEGRAFEXTRAOPTIONS | Any extra option you may want to add to every command (a --verbose flag for example). All options are listed here. | | |

  5. Deploy the configuration. The host appears in the list of hosts, and on the Resources Status page. The command that is sent by the connector is displayed in the details panel of the host: it shows the values of the macros.

Using a service template provided by the connector

  1. If you have used a host template and checked Create Services linked to the Template too, the services linked to the template have been created automatically, using the corresponding service templates. Otherwise, manually create the services you want and apply a service template to them.
  2. Fill in the macros you want (e.g. to change the thresholds for the alerts). Some macros are mandatory (see the table below).
| Macro | Description | Default value | Mandatory |
| --- | --- | --- | --- |
| COMMAND | Command to test (default: none). You can use 'sh' to use '&&' or '||' | | X |
| COMMANDOPTIONS | Command options (default: none) | | |
| THRESHOLDS | Set action according to the command exit code. Example: %(code) == 0,OK,File xxx exist#%(code) == 1,CRITICAL,File xxx not exist#,UNKNOWN,Command problem | | X |
| EXTRAOPTIONS | Any extra option you may want to add to the command (a --verbose flag for example). All options are listed here. | | |

  3. Deploy the configuration. The service appears in the list of services, and on the Resources Status page. The command that is sent by the connector is displayed in the details panel of the service: it shows the values of the macros.

How to check in the CLI that the configuration is OK and what the main options are for

Once the plugin is installed, log into your Linux host CLI using the root user account. Test that the connector is able to monitor a resource using a command like this one (replace the sample values with yours):

sudo -u telegraf /usr/lib/centreon/plugins/centreon_linux_local.pl \
--plugin='os::linux::local::plugin' \
--mode='cpu' \
--warning-core='99' \
--critical-core='' \
--warning-average='40' \
--critical-average='' \
--statefile-dir='/var/lib/telegraf'

The expected command output is shown below:

OK: CPU(s) average usage is 2.95 % | 'cpu.utilization.percentage'=2.95%;0:40;;0;100 '0#core.cpu.utilization.percentage'=3.07%;0:99;;0;100 '1#core.cpu.utilization.percentage'=2.83%;0:99;;0;100
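
Everything after the " | " separator in this output is performance data, one 'label'=value;warning;critical;min;max item per metric. As a rough illustration of how to read it, the sketch below extracts the value of one metric from such a line. The `perfdata_value` helper is hypothetical, and it assumes that metric labels contain no spaces, which holds for the sample above.

```shell
# Hypothetical helper: prints the value (with its unit) of the metric labelled $2
# found in the plugin output $1. Returns 1 if the label is not present.
perfdata_value() {
    output=$1
    label=$2
    perf=${output#*" | "}          # keep what follows the first " | "
    for item in $perf; do           # items are separated by spaces
        name=${item%%=*}            # text before the first "="
        name=${name#\'}             # drop the surrounding single quotes
        name=${name%\'}
        if [ "$name" = "$label" ]; then
            value=${item#*=}        # value;warning;critical;min;max
            printf '%s\n' "${value%%;*}"
            return 0
        fi
    done
    return 1
}

# Example, where $out holds the plugin output shown above:
#   perfdata_value "$out" 'cpu.utilization.percentage'
```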

Troubleshooting

Refer to the troubleshooting documentation for typical Centreon Plugins issues.

Available modes

In most cases, a mode corresponds to a service template. The mode appears in the execution command for the connector. In the Centreon interface, you don't need to specify a mode explicitly: its use is implied when you apply a service template. However, you will need to specify the correct mode for the template if you want to test the execution command for the connector in your terminal.

All available modes can be displayed by adding the --list-mode parameter to the command:

sudo -u telegraf /usr/lib/centreon/plugins/centreon_linux_local.pl \
--plugin='os::linux::local::plugin' \
--list-mode

The plugin brings the following modes:

| Mode | Linked service template |
| --- | --- |
| check-plugin | Not used in this Monitoring Connector |
| cmd-return | OS-Linux-Cmd-Generic-Telegraf-Agent-custom, OS-Linux-Is-File-Generic-Telegraf-Agent-custom, OS-Linux-Is-Not-File-Generic-Telegraf-Agent-custom |
| connections | OS-Linux-Connections-Generic-Telegraf-Agent-custom |
| cpu | OS-Linux-Cpu-Telegraf-Agent-custom |
| cpu-detailed | OS-Linux-Cpu-Detailed-Telegraf-Agent-custom |
| discovery-snmp | Not used in this Monitoring Connector |
| discovery-snmpv3 | Not used in this Monitoring Connector |
| diskio | OS-Linux-Disk-IO-Telegraf-Agent-custom |
| files-date | OS-Linux-File-Date-Generic-Telegraf-Agent-custom |
| files-size | OS-Linux-File-Size-Generic-Telegraf-Agent-custom |
| inodes | OS-Linux-Inodes-Telegraf-Agent-custom |
| list-interfaces | Used for service discovery |
| list-partitions | Not used in this Monitoring Connector |
| list-storages | Used for service discovery |
| list-systemdservices | Not used in this Monitoring Connector |
| load | OS-Linux-Load-Telegraf-Agent-custom |
| lvm | Not used in this Monitoring Connector |
| memory | OS-Linux-Memory-Telegraf-Agent-custom |
| mountpoint | Not used in this Monitoring Connector |
| ntp | OS-Linux-Ntp-Telegraf-Agent-custom |
| open-files | OS-Linux-Open-Files-Telegraf-Agent-custom |
| packet-errors | OS-Linux-Packet-Errors-Telegraf-Agent-custom |
| paging | Not used in this Monitoring Connector |
| pending-updates | OS-Linux-Pending-Updates-Telegraf-Agent-custom |
| process | OS-Linux-Process-Generic-Telegraf-Agent-custom |
| quota | Not used in this Monitoring Connector |
| storage | OS-Linux-Disks-Telegraf-Agent-custom |
| swap | OS-Linux-Swap-Telegraf-Agent-custom |
| systemd-journal | OS-Linux-Systemd-Journal-Telegraf-Agent-custom |
| systemd-sc-status | OS-Linux-Systemd-Sc-Status-Telegraf-Agent-custom |
| traffic | OS-Linux-Traffic-Telegraf-Agent-custom |
| uptime | OS-Linux-Uptime-Telegraf-Agent-custom |

Available options

Generic options

All generic options are listed here:

| Option | Description |
| --- | --- |
| --mode | Define the mode in which you want the plugin to be executed (see --list-mode). |
| --dyn-mode | Specify a mode with the module's path (advanced). |
| --list-mode | List all available modes. |
| --mode-version | Check the minimal version of the mode. If it is not met, an unknown error is returned. |
| --version | Return the version of the plugin. |
| --custommode | When a plugin offers several ways (CLI, library, etc.) to get information, the desired one must be defined with this option. |
| --list-custommode | List all available custom modes. |
| --multiple | Multiple custom mode objects. This may be required by some specific modes (advanced). |
| --pass-manager | Define the password manager you want to use. Supported managers are: environment, file, keepass, hashicorpvault and teampass. |
| --verbose | Display extended status information (long output). |
| --debug | Display debug messages. |
| --filter-perfdata | Filter perfdata that match the regexp. Example: adding --filter-perfdata='avg' will remove all metrics that do not contain 'avg' from performance data. |
| --filter-perfdata-adv | Filter perfdata based on an "if" condition using the following variables: label, value, unit, warning, critical, min, max. Variables must be written either %{variable} or %(variable). Example: adding --filter-perfdata-adv='not (%(value) == 0 and %(max) eq "")' will remove all metrics whose value equals 0 and that don't have a maximum value. |
| --explode-perfdata-max | Create a new metric for each metric that comes with a maximum limit. The new metric will be named identically, with a '_max' suffix. Example: it will split 'used_prct'=26.93%;0:80;0:90;0;100 into 'used_prct'=26.93%;0:80;0:90;0;100 'used_prct_max'=100%;;;; |
| --change-perfdata --extend-perfdata | Change or extend perfdata. Syntax: --extend-perfdata=searchlabel,newlabel,target[,[newuom],[min],[max]] Common examples: Convert storage free perfdata into used: --change-perfdata='free,used,invert()' Convert storage used perfdata into free: --change-perfdata='used,free,invert()' Scale traffic values automatically: --change-perfdata='traffic,,scale(auto)' Scale traffic values in Mbps: --change-perfdata='traffic_in,,scale(Mbps),mbps' Change traffic values in percent: --change-perfdata='traffic_in,,percent()' |
| --extend-perfdata-group | Add new aggregated metrics (min, max, average or sum) for groups of metrics defined by a regex match on the metrics' names. Syntax: --extend-perfdata-group=regex,namesofnewmetrics,calculation[,[newuom],[min],[max]] regex: regular expression. namesofnewmetrics: how the new metrics' names are composed (can use $1, $2... for groups defined by () in regex). calculation: how the values of the new metrics should be calculated. newuom (optional): unit of measure for the new metrics. min (optional): lowest value the metrics can reach. max (optional): highest value the metrics can reach. Common examples: Sum wrong packets from all interfaces (with interface, --units-errors=absolute is needed): --extend-perfdata-group=',packets_wrong,sum(packets_(discard|error)_(in|out))' Sum traffic by interface: --extend-perfdata-group='traffic_in_(.*),traffic_$1,sum(traffic_(in|out)_$1)' |
| --change-short-output --change-long-output | Modify the short/long output that is returned by the plugin. Syntax: --change-short-output=pattern~replacement~modifier Most commonly used modifiers are i (case insensitive) and g (replace all occurrences). Example: adding --change-short-output='OK~Up~gi' will replace all occurrences of 'OK', 'ok', 'Ok' or 'oK' with 'Up' |
| --change-exit | Replace an exit code with one of your choice. Example: adding --change-exit=unknown=critical will result in a CRITICAL state instead of an UNKNOWN state. |
| --range-perfdata | Rewrite the ranges displayed in the perfdata. Accepted values: 0: nothing is changed. 1: if the lower value of the range is equal to 0, it is removed. 2: remove the thresholds from the perfdata. |
| --filter-uom | Mask the units when they don't match the given regular expression. |
| --opt-exit | Replace the exit code in case of an execution error (i.e. wrong option provided, SSH connection refused, timeout, etc). Default: unknown. |
| --output-ignore-perfdata | Remove all the metrics from the service. The service will still have a status and an output. |
| --output-ignore-label | Remove the status label ("OK:", "WARNING:", "UNKNOWN:", "CRITICAL:") from the beginning of the output. Example: 'OK: Ram Total:...' will become 'Ram Total:...' |
| --output-xml | Return the output in XML format (to send to an XML API). |
| --output-json | Return the output in JSON format (to send to a JSON API). |
| --output-openmetrics | Return the output in OpenMetrics format (to send to a tool expecting this format). |
| --output-file | Write output in file (can be combined with json, xml and openmetrics options). E.g.: --output-file=/tmp/output.txt will write the output in /tmp/output.txt. |
| --disco-format | Applies only to modes beginning with 'list-'. Returns the list of available macros to configure a service discovery rule (formatted in XML). |
| --disco-show | Applies only to modes beginning with 'list-'. Returns the list of discovered objects (formatted in XML) for service discovery. |
| --float-precision | Define the float precision for thresholds (default: 8). |
| --source-encoding | Define the character encoding of the response sent by the monitored resource. Default: 'UTF-8'. |
| --hostname | Hostname to query. |
| --timeout | Timeout in seconds for the command (default: 45). The default value can be overridden by the mode. |
| --command | Command to get information. Use it if you have output in a file. |
| --command-path | Command path. |
| --command-options | Command options. |
| --sudo | Use the sudo command. |
| --ssh-backend | Define the backend you want to use. It can be: sshcli (default), plink and libssh. |
| --ssh-username | Define the user name to log in to the host. |
| --ssh-password | Define the password associated with the user name. Cannot be used with the sshcli backend. Warning: using a password is not recommended. Use --ssh-priv-key instead. |
| --ssh-port | Define the TCP port on which SSH is listening. |
| --ssh-priv-key | Define the private key file to use for user authentication. |
| --sshcli-command | ssh command (default: 'ssh'). |
| --sshcli-path | ssh command path (default: none) |
| --sshcli-option | Specify ssh cli options (example: --sshcli-option='-o=StrictHostKeyChecking=no'). |
| --plink-command | plink command (default: 'plink'). |
| --plink-path | plink command path (default: none) |
| --plink-option | Specify plink options (example: --plink-option='-T'). |
| --libssh-strict-connect | Refuse the connection if there is a problem with the SSH server (server known changed or server found other). |
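
To make the --explode-perfdata-max example more concrete, the sketch below reproduces the same transformation on a single perfdata item. It is only an approximation written for this page, not the plugin's own code; the `explode_perfdata_max` helper is hypothetical.

```shell
# Hypothetical helper: given one perfdata item ('label'=value;warn;crit;min;max),
# prints the item followed by an extra 'label_max' metric when a maximum is set.
explode_perfdata_max() {
    item=$1
    label=${item%%=*}
    label=${label#\'}
    label=${label%\'}
    rest=${item#*=}
    # Split value;warning;critical;min;max on ";" (empty fields are kept).
    IFS=';' read -r value warn crit min max <<EOF
$rest
EOF
    if [ -z "$max" ]; then
        printf '%s\n' "$item"
        return 0
    fi
    # The unit is whatever follows the numeric part of the value.
    unit=$(printf '%s' "$value" | sed 's/^[-+0-9.]*//')
    printf "%s '%s_max'=%s%s;;;;\n" "$item" "$label" "$max" "$unit"
}

# Example:
#   explode_perfdata_max "'used_prct'=26.93%;0:80;0:90;0;100"
```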

Modes options

All available options for each service template are listed below:

| Option | Description |
| --- | --- |
| --manage-returns | Set action according to the command exit code. Example: %(code) == 0,OK,File xxx exist#%(code) == 1,CRITICAL,File xxx not exist#,UNKNOWN,Command problem |
| --separator | Set the separator used in --manage-returns (default: #) |
| --exec-command | Command to test (default: none). You can use 'sh' to use '&&' or '||'. |
| --exec-command-path | Command path (default: none). |
| --exec-command-options | Command options (default: none). |
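
The --manage-returns string is a list of rules separated by the --separator character, each rule being condition,STATUS,message; a rule with an empty condition acts as the fallback. The sketch below shows how such a string could be read for a given exit code. It is an approximation written for this page under these assumptions: it is not the plugin's implementation, it only supports simple integer comparisons, and the `manage_returns` helper and its "STATUS: message" output format are hypothetical.

```shell
# Hypothetical helper: prints "STATUS: message" for the first rule matching
# exit code $1 in rule string $2 (separator $3, default "#").
# The body runs in a subshell so that IFS and "set -f" do not leak to the caller.
manage_returns() (
    code=$1
    rules=$2
    sep=${3:-"#"}
    case $code in
        ''|*[!0-9]*) exit 2 ;;   # only plain non-negative integers are accepted
    esac
    set -f                       # no globbing while the rule string is split
    IFS=$sep
    set -- $rules                # one positional parameter per rule
    for rule in "$@"; do
        cond=${rule%%,*}
        rest=${rule#*,}
        status=${rest%%,*}
        message=${rest#*,}
        if [ -n "$cond" ]; then
            # Substitute the exit code for %(code), then evaluate the comparison.
            expr=$(printf '%s' "$cond" | sed "s/%(code)/$code/g")
            printf '%s' "$expr" | grep -Eq '^[0-9 ]+(==|!=|<=|>=|<|>)[0-9 ]+$' || continue
            [ "$(( $expr ))" -ne 0 ] || continue
        fi
        printf '%s: %s\n' "$status" "$message"
        exit 0
    done
    exit 1
)

# Example:
#   manage_returns 1 '%(code) == 0,OK,File xxx exist#%(code) == 1,CRITICAL,File xxx not exist#,UNKNOWN,Command problem'
```

Rules are tried in order, so the fallback rule with the empty condition should come last.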

All available options for a given mode can be displayed by adding the --help parameter to the command:

sudo -u telegraf /usr/lib/centreon/plugins/centreon_linux_local.pl \
--plugin='os::linux::local::plugin' \
--mode='cpu' \
--help