Operating guide
Unless otherwise stated, all commands in this page must be run as
root
.
Logging in to Centreon on the active node
You access the interface of the active node via the IP address of the central VIP. This means that you always use the same URL to access the interface, whether the interface is that of central node 1 or of central node 2.
How do I know the state of the cluster?
Using crm_mon and pcs status
You can know the state of the cluster at all times by using the crm_mon
command, or the pcs status
command, on any member of the cluster (central nodes, quorum device, database nodes).
-
pcs status
has a static output: it displays the state of the cluster as it is at the time you run the command. -
crm_mon
has a dynamic output: the state of the cluster is displayed in real time. You can watch the resources being stopped and transferred to the other node. Usecrm_mon -fr
to keep displaying stopped resources.
Example of output when the cluster is working properly:
Cluster Summary:
* Stack: corosync (Pacemaker is running)
* Current DC: @CENTRAL_NODE2_NAME@ (version 2.1.6-9.1.el8_9-6fdc9deea29) - MIXED-VERSION partition with quorum
* Last updated: Tue Jun 4 07:49:50 2024 on @CENTRAL_NODE1_NAME@
* Last change: Tue Jun 4 05:44:11 2024 by root via crm_resource on @CENTRAL_NODE2_NAME@
* 4 nodes configured
* 21 resource instances configured
Node List:
* Online: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
Full List of Resources:
* Clone Set: ms_mysql-clone [ms_mysql] (promotable):
* Masters: [ @DATABASE_NODE1_NAME@ ]
* Slaves: [ @DATABASE_NODE2_NAME@ ]
* Stopped: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* Clone Set: php-clone [php]:
* Started: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
* Clone Set: cbd_rrd-clone [cbd_rrd]:
* Started: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
* vip_mysql (ocf::heartbeat:IPaddr2): Started @DATABASE_NODE1_NAME@
* Resource Group: centreon:
* vip (ocf::heartbeat:IPaddr2): Started @CENTRAL_NODE1_NAME@
* http (systemd:httpd): Started @CENTRAL_NODE1_NAME@
* gorgone (systemd:gorgoned): Started @CENTRAL_NODE1_NAME@
* centreon_central_sync (systemd:centreon-central-sync): Started @CENTRAL_NODE1_NAME@
* cbd_central_broker (systemd:cbd-sql): Started @CENTRAL_NODE1_NAME@
* centengine (systemd:centengine): Started @CENTRAL_NODE1_NAME@
* centreontrapd (systemd:centreontrapd): Started @CENTRAL_NODE1_NAME@
* snmptrapd (systemd:snmptrapd): Started @CENTRAL_NODE1_NAME@
Migration Summary:
These commands should return no errors. If there are "Failed actions" on any resource, troubleshoot them using the troubleshooting guide.
Using the Centreon interface
The installation process includes the monitoring of the members of the cluster by a poller. This way, you can be notified if a member of the cluster goes down.
The Resource Status page gives you the following information:
- On both central nodes, the PCS-Status service gives you the detailed state of the cluster. The output of the service in the details panel is the output of the
pcs status
command. - You can know which central node is the active node by looking at which node is carrying the cluster resources in the output of the PCS-Status service on each central.
Remove an error displayed in the cluster status
Once the cause of the error has been identified and fixed (see the troubleshooting guide), you must delete the error message manually:
pcs resource cleanup
Or, if you want to remove only the errors linked to one resource:
pcs resource cleanup <resource_name>
Check the constraints
If a failover has occurred at some point, there may be some leftover location constraints. Run the following command to display the current constraints:
pcs constraint
The command should return this:
Location Constraints:
Resource: cbd_rrd-clone
Disabled on:
Node: @DATABASE_NODE1_NAME@ (score:-INFINITY)
Node: @DATABASE_NODE2_NAME@ (score:-INFINITY)
Resource: centreon
Disabled on:
Node: @DATABASE_NODE1_NAME@ (score:-INFINITY)
Node: @DATABASE_NODE2_NAME@ (score:-INFINITY)
Resource: ms_mysql-clone
Disabled on:
Node: @CENTRAL_NODE1_NAME@ (score:-INFINITY)
Node: @CENTRAL_NODE2_NAME@ (score:-INFINITY)
Resource: php-clone
Disabled on:
Node: @DATABASE_NODE1_NAME@ (score:-INFINITY)
Node: @DATABASE_NODE2_NAME@ (score:-INFINITY)
Ordering Constraints:
Colocation Constraints:
vip_mysql with ms_mysql-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)
ms_mysql-clone with vip_mysql (score:INFINITY) (rsc-role:Master) (with-rsc-role:Started)
Ticket Constraints:
The output shows the constraints you have defined during the installation procedure: the ms_mysql-clone resource only runs on the database nodes, the cbd_rrd-clone, centreon and php-clone resources only run on the central nodes.
To remove unwanted constraints, run the following command:
pcs resource clear centreon
Check the status of the database synchronization
To check that the database synchronization is working, run the following command:
/usr/share/centreon-ha/bin/mysql-check-status.sh
The command should return the following information:
Connection MASTER Status '@DATABASE_NODE1_NAME@' [OK]
Connection SLAVE Status '@DATABASE_NODE2_NAME@' [OK]
Slave Thread Status [OK]
Position Status [OK]
If the synchronization shows KO
, you must fix it. The procedure below explains how to manually re-enable MariaDB replication.
Restore MariaDB active-passive replication
This procedure should be applied in the event of a breakdown in the MariaDB databases' replication thread or a server crash if it cannot be recovered by running
pcs resource cleanup ms_mysql
orpcs resource restart ms_mysql
.
Prevent the cluster from managing the MariaDB resource during the operation (to be run from any node):
pcs resource unmanage ms_mysql
Connect to the MariaDB slave server and shut down the MariaDB service:
mysqladmin -p shutdown
Connect to the active database node and run the following command to overwrite the passive node's data with the active's:
/usr/share/centreon-ha/bin/mysql-sync-bigdb.sh
Re-enable the cluster to manage the ms_mysql resource:
pcs resource manage ms_mysql
Run the following command on one of the database servers to make sure that the replication has been successfully restored:
/usr/share/centreon-ha/bin/mysql-check-status.sh
Connection Status '@CENTRAL_MASTER_NAME@' [OK]
Connection Status '@CENTRAL_SLAVE_NAME@' [OK]
Slave Thread Status [OK]
Position Status [OK]
View the cluster's configuration
To display a very detailed description the cluster's configuration (e.g. to check the name of the resources for any typos, or to check network information), run this command:
pcs config show
Testing the cluster
This section provides you with examples of tests to validate that your cluster is working properly: perform a failover, simulate a network failure, and check that the cluster behaves as expected.
How to perform a manual failover
We're assuming that central node 1 is the active central node and central node 2 is the passive central node (check the state of the cluster if you need to).
When you move the centreon resource group from central node 1 to central node 2, central node 2 will become the active node and central node 1 will become the passive node.
- Run the following command to perform the failover:
pcs resource move centreon
In another terminal, you can also use the crm_mon -fr
command to watch the failover as it happens. It will be necessary to use Ctrl+c to exit the command.
Warning: The
pcs resource move centreon
command sets an-INFINITY
constraint on node 1. This means that the resource is no longer allowed to be running on that node. (You will clear this constraint at step 3.)
- The resources move to node 2. To check that the resources have indeed moved, run the following command:
pcs status
The expected output is:
Cluster name: centreon_cluster
WARNINGS:
Following resources have been moved and their move constraints are still in place: 'centreon'
Run 'pcs constraint location' or 'pcs resource clear <resource id>' to view or remove the constraints, respectively
Cluster Summary:
* Stack: corosync (Pacemaker is running)
* Current DC: @CENTRAL_NODE2_NAME@ (version 2.1.6-9.1.el8_9-6fdc9deea29) - MIXED-VERSION partition with quorum
* Last updated: Tue Jun 4 05:41:08 2024 on @CENTRAL_NODE2_NAME@
* Last change: Tue Jun 4 05:36:52 2024 by root via crm_resource on @CENTRAL_NODE1_NAME@
* 4 nodes configured
* 21 resource instances configured
Node List:
* Online: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
Full List of Resources:
* Clone Set: ms_mysql-clone [ms_mysql] (promotable):
* Masters: [ @DATABASE_NODE1_NAME@ ]
* Slaves: [ @DATABASE_NODE2_NAME@ ]
* Stopped: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* Clone Set: php-clone [php]:
* Started: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
* Clone Set: cbd_rrd-clone [cbd_rrd]:
* Started: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
* vip_mysql (ocf::heartbeat:IPaddr2): Started @DATABASE_NODE1_NAME@
* Resource Group: centreon:
* vip (ocf::heartbeat:IPaddr2): Started @CENTRAL_NODE2_NAME@
* http (systemd:httpd): Started @CENTRAL_NODE2_NAME@
* gorgone (systemd:gorgoned): Started @CENTRAL_NODE2_NAME@
* centreon_central_sync (systemd:centreon-central-sync): Started @CENTRAL_NODE2_NAME@
* cbd_central_broker (systemd:cbd-sql): Started @CENTRAL_NODE2_NAME@
* centengine (systemd:centengine): Started @CENTRAL_NODE2_NAME@
* centreontrapd (systemd:centreontrapd): Started @CENTRAL_NODE2_NAME@
* snmptrapd (systemd:snmptrapd): Started @CENTRAL_NODE2_NAME@
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
- Once the failover is completed, execute the following command to ensure the resources can be moved back to their original node in the future (EL8 or Debian).
pcs resource clear centreon
This will remove the constraints established during the failover.
If you move a single resource from the centreon resource group from one node to the other, all the other resources in the group will switch too.
If you want to return to the nominal situation (i.e. central node 1 is the active central node and central node 2 is the passive central node), you must perform a second resource failover (and clear the constraints afterwards).
How to simulate the loss of the passive central node
If the passive central node goes down, the cluster should carry on working as before, as the resources are managed by the active central node. You will see your passive central node as down in Resources Status.
To simulate a network failure that would isolate the passive central node, you can use iptables
to drop traffic from and to the passive central node. The passive central node will be completely excluded from the cluster. The active central node keeps the majority with the quorum device.
Perform the test
We're assuming that node 1 is the active node and node 2 is the passive node (check the state of the cluster if you need to).
To perform this test, run the iptables
commands on the passive central node. Thanks to these rules, all traffic coming from the active central node, the databases and the quorum device will be ignored by the passive central node:
iptables -A INPUT -s @CENTRAL_NODE1_IPADDR@ -j DROP
iptables -A OUTPUT -d @CENTRAL_NODE1_IPADDR@ -j DROP
iptables -A INPUT -s @DATABASE_NODE1_IPADDR@ -j DROP
iptables -A OUTPUT -d @DATABASE_NODE1_IPADDR@ -j DROP
iptables -A INPUT -s @DATABASE_NODE2_IPADDR@ -j DROP
iptables -A OUTPUT -d @DATABASE_NODE2_IPADDR@ -j DROP
iptables -A INPUT -s @QDEVICE_IPADDR@ -j DROP
iptables -A OUTPUT -d @QDEVICE_IPADDR@ -j DROP
The passive central node is now excluded from the cluster.
If you run pcs status
on the active central node:
- The resources and the cluster are still working (the output shows that the active node still sees the quorum device).
- The passive central node is seen
offline
on the active node:
Cluster name: centreon_cluster
Stack: corosync
Current DC: @CENTRAL_NODE1_NAME@ (version 1.1.23-1.el8_9.1-9acf116022) - partition with quorum
Last updated: Thu May 5 10:34:05 2022
Last change: Thu May 5 09:09:50 2022 by root via crm_resource on @CENTRAL_NODE1_NAME@
4 nodes configured
21 resource instances configured
Online: [ @DATABASE_NODE1_NAME@ @CENTRAL_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
OFFLINE: [ @CENTRAL_NODE2_NAME@ ]
Full list of resources:
Master/Slave Set: ms_mysql-clone [ms_mysql]
Masters: [ @DATABASE_NODE1_NAME@ ]
Slaves: [ @DATABASE_NODE2_NAME@ ]
Stopped: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
vip_mysql (ocf::heartbeat:IPaddr2): Started @DATABASE_NODE1_NAME@
Clone Set: php-clone [php]
Started: [ @CENTRAL_NODE1_NAME@ ]
Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ @CENTRAL_NODE2_NAME@ ]
Clone Set: cbd_rrd-clone [cbd_rrd]
Started: [ @CENTRAL_NODE1_NAME@ ]
Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ @CENTRAL_NODE2_NAME@ ]
Resource Group: centreon
vip (ocf::heartbeat:IPaddr2): Started @CENTRAL_NODE1_NAME@
http (systemd:httpd24-httpd): Started @CENTRAL_NODE1_NAME@
gorgone (systemd:gorgoned): Started @CENTRAL_NODE1_NAME@
centreon_central_sync (systemd:centreon-central-sync): Started @CENTRAL_NODE1_NAME@
cbd_central_broker (systemd:cbd-sql): Started @CENTRAL_NODE1_NAME@
centengine (systemd:centengine): Started @CENTRAL_NODE1_NAME@
centreontrapd (systemd:centreontrapd): Started @CENTRAL_NODE1_NAME@
snmptrapd (systemd:snmptrapd): Started @CENTRAL_NODE1_NAME@
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
If you run pcs status
on the passive node:
- All resources appear stopped on the passive node (this is because the passive node does not see the quorum device anymore, as "partition WITHOUT quorum" indicates below. The resources are stopped.)
- The active node is seen as
offline
(as the passive node is cut off from the rest of the cluster):
Cluster name: centreon_cluster
Stack: corosync
Current DC: @CENTRAL_NODE1_NAME@ (version 1.1.23-1.el8_9.1-9acf116022) - partition WITHOUT quorum
Last updated: Thu May 5 10:34:05 2022
Last change: Thu May 5 09:09:50 2022 by root via crm_resource on @CENTRAL_NODE1_NAME@
4 nodes configured
21 resource instances configured
ONLINE: [ @CENTRAL_NODE2_NAME@ ]
OFFLINE: [ @CENTRAL_NODE1_NAME@ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
Full list of resources:
* Master/Slave Set: ms_mysql-clone [ms_mysql]
* Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* vip_mysql (ocf::heartbeat:IPaddr2): Stopped
* Clone Set: php-clone [php]
* Stopped: [ @CENTRAL_NODE1_NAME@ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ @CENTRAL_NODE2_NAME@ ]
* Clone Set: cbd_rrd-clone [cbd_rrd]
* Stopped: [ @CENTRAL_NODE1_NAME@ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ @CENTRAL_NODE2_NAME@ ]
* Resource Group: centreon
* vip (ocf::heartbeat:IPaddr2): Stopped
* http (systemd:httpd24-httpd): Stopped
* gorgone (systemd:gorgoned): Stopped
* centreon_central_sync (systemd:centreon-central-sync): Stopped
* cbd_central_broker (systemd:cbd-sql): Stopped
* centengine (systemd:centengine): Stopped
* centreontrapd (systemd:centreontrapd): Stopped
* snmptrapd (systemd:snmptrapd): Stopped
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
Go back to the nominal situation
If you want to go back to the nominal situation, remove the iptables rules.
To view the various iptables rules configured on the passive node, run the following command:
iptables -L
The command should return the following information:
Chain INPUT (policy ACCEPT)
target prot opt source destination
DROP all -- @CENTRAL_NODE1_NAME@ anywhere
DROP all -- @DATABASE_NODE1_NAME@ anywhere
DROP all -- @DATABASE_NODE2_NAME@ anywhere
DROP all -- @QDEVICE_NAME@ anywhere
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
DROP all -- anywhere @CENTRAL_NODE1_NAME@
DROP all -- anywhere @DATABASE_NODE1_NAME@
DROP all -- anywhere @DATABASE_NODE2_NAME@
DROP all -- anywhere @QDEVICE_NAME@
If you do not have any other iptables rules configured, you can execute the following command to remove the rules related to the test:
iptables -F
Otherwise, you will have to list the rule numbers with the following command:
iptables -L --line-numbers
And delete them with the following command:
iptables -D INPUT @RULE_NUMBER@
iptables -D OUTPUT @RULE_NUMBER@
If you run pcs status
on the active node, the passive node is seen as online
again:
Cluster name: centreon_cluster
Cluster Summary:
* Stack: corosync (Pacemaker is running)
* Current DC: @CENTRAL_NODE1_NAME@ (version 2.1.8-3.el9-3980678f0) - partition with quorum
* Last updated: Fri Mar 21 16:40:32 2025 on @CENTRAL_NODE1_NAME@
* Last change: Thu Mar 13 11:30:16 2025 by hacluster via hacluster on @CENTRAL_NODE1_NAME@
* 4 nodes configured
* 21 resource instances configured
Node List:
* Online: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
Full List of Resources:
* Clone Set: ms_mysql-clone [ms_mysql] (promotable):
* Promoted: [ @DATABASE_NODE1_NAME@ ]
* Unpromoted: [ @DATABASE_NODE2_NAME@ ]
* Stopped: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* vip_mysql (ocf:heartbeat:IPaddr2): Started @DATABASE_NODE1_NAME@
* Clone Set: php-clone [php]:
* Started: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
* Clone Set: cbd_rrd-clone [cbd_rrd]:
* Started: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
* Resource Group: centreon:
* vip (ocf:heartbeat:IPaddr2): Started @CENTRAL_NODE1_NAME@
* http (systemd:httpd): Started @CENTRAL_NODE1_NAME@
* cbd_central_broker (systemd:cbd-sql): Started @CENTRAL_NODE1_NAME@
* gorgone (systemd:gorgoned): Started @CENTRAL_NODE1_NAME@
* centreon_central_sync (systemd:centreon-central-sync): Started @CENTRAL_NODE1_NAME@
* centengine (systemd:centengine): Started @CENTRAL_NODE1_NAME@
* centreontrapd (systemd:centreontrapd): Started @CENTRAL_NODE1_NAME@
* snmptrapd (systemd:snmptrapd): Started @CENTRAL_NODE1_NAME@
Also check that the database replication is still operational using the following command:
/usr/share/centreon-ha/bin/mysql-check-status.sh
The expected output is:
Connection MASTER Status '@DATABASE_NODE1_NAME@' [OK]
Connection SLAVE Status '@DATABASE_NODE2_NAME@' [OK]
Slave Thread Status [OK]
Position Status [OK]
How to simulate the loss of the active central node
This test checks that the resources are switched to the passive node if the active node is unavailable, allowing for continuity of service.
Perform the test
We're assuming that central node 1 is the active central node and central node 2 is the passive central node (check the state of the cluster if you need to).
To perform this test, run the following commands on the active central node. Thanks to these rules, all traffic coming from the passive central node, the databases and the quorum device will be ignored by the active central node:
iptables -A INPUT -s @CENTRAL_NODE2_IPADDR@ -j DROP
iptables -A OUTPUT -d @CENTRAL_NODE2_IPADDR@ -j DROP
iptables -A INPUT -s @DATABASE_NODE1_IPADDR@ -j DROP
iptables -A OUTPUT -d @DATABASE_NODE1_IPADDR@ -j DROP
iptables -A INPUT -s @DATABASE_NODE2_IPADDR@ -j DROP
iptables -A OUTPUT -d @DATABASE_NODE2_IPADDR@ -j DROP
iptables -A INPUT -s @QDEVICE_IPADDR@ -j DROP
iptables -A OUTPUT -d @QDEVICE_IPADDR@ -j DROP
Resources on the active central node (central node 1) should stop. Central node 2 becomes the active node and all the resources switch to it. You can use the crm_mon -fr
command on central node 2 to watch the startup of resources:
Stack: corosync
Current DC: @CENTRAL_NODE1_NAME@ (version 1.1.23-1.el8_9.1-9acf116022) - partition with quorum
Last updated: Thu May 5 11:06:38 2022
Last change: Thu May 5 09:09:50 2022 by root via crm_resource on @CENTRAL_NODE1_NAME@
4 nodes configured
21 resource instances configured
Online: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ @CENTRAL_NODE2_NAME@ ]
OFFLINE: [ @CENTRAL_NODE1_NAME@ ]
Full list of resources:
Master/Slave Set: ms_mysql-clone [ms_mysql]
Masters: [ @DATABASE_NODE1_NAME@ ]
Slaves: [ @DATABASE_NODE2_NAME@ ]
Stopped: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
vip_mysql (ocf::heartbeat:IPaddr2): Started @DATABASE_NODE1_NAME@
Clone Set: php-clone [php]
Started: [ @CENTRAL_NODE2_NAME@ ]
Stopped: [ @DATABASE_NODE1_NAME@ @CENTRAL_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
Clone Set: cbd_rrd-clone [cbd_rrd]
Started: [ @CENTRAL_NODE2_NAME@ ]
Stopped: [ @DATABASE_NODE1_NAME@ @CENTRAL_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
Resource Group: centreon
vip (ocf::heartbeat:IPaddr2): Started @CENTRAL_NODE2_NAME@
http (systemd:httpd24-httpd): Started @CENTRAL_NODE2_NAME@
gorgone (systemd:gorgoned): Started @CENTRAL_NODE2_NAME@
centreon_central_sync (systemd:centreon-central-sync): Started @CENTRAL_NODE2_NAME@
cbd_central_broker (systemd:cbd-sql): Started @CENTRAL_NODE2_NAME@
centengine (systemd:centengine): Started @CENTRAL_NODE2_NAME@
centreontrapd (systemd:centreontrapd): Started @CENTRAL_NODE2_NAME@
snmptrapd (systemd:snmptrapd): Started @CENTRAL_NODE2_NAME@
Migration Summary:
Go back to the nominal situation
To check the various iptables rules configured on the central node 1, run the following command:
iptables -L
The command should return the following information:
Chain INPUT (policy ACCEPT)
target prot opt source destination
DROP all -- @CENTRAL_NODE2_NAME@ anywhere
DROP all -- @DATABASE_NODE1_NAME@ anywhere
DROP all -- @DATABASE_NODE2_NAME@ anywhere
DROP all -- @QDEVICE_NAME@ anywhere
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
DROP all -- anywhere @CENTRAL_NODE2_NAME@
DROP all -- anywhere @DATABASE_NODE1_NAME@
DROP all -- anywhere @DATABASE_NODE2_NAME@
DROP all -- anywhere @QDEVICE_NAME@
If you do not have any other iptables rules configured, you can execute the following command to remove the rules related to the test:
iptables -F
Otherwise, it will be necessary to list the rule numbers with the specific command:
iptables -L --line-numbers
And delete them with the following command:
iptables -D INPUT @RULE_NUMBER@;
iptables -D OUTPUT @RULE_NUMBER@
If you run the crm_mon
command on central node 2, you can see that central node 1 is still the passive node:
Cluster Summary:
* Stack: corosync
* Current DC: @CENTRAL_NODE1_NAME@ (version 2.1.2-4.el8_6.3-ada5c3b36e2) - partition with quorum
* Last updated: Tue Nov 8 17:27:28 2022
* Last change: Tue Nov 8 17:23:19 2022 by root via crm_attribute on @CENTRAL_NODE2_NAME@
* 4 nodes configured
* 21 resource instances configured
Node List:
* Online: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
Full List of Resources:
* Master/Slave Set: ms_mysql-clone [ms_mysql]
* Masters: [ @DATABASE_NODE1_NAME@ ]
* Slaves: [ @DATABASE_NODE2_NAME@ ]
* Stopped: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* vip_mysql (ocf::heartbeat:IPaddr2): Started @DATABASE_NODE1_NAME@
* Clone Set: php-clone [php]
* Started: [ @CENTRAL_NODE2_NAME@ @CENTRAL_NODE1_NAME@ ]
* Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
* Clone Set: cbd_rrd-clone [cbd_rrd]
* Started: [ @CENTRAL_NODE2_NAME@ @CENTRAL_NODE1_NAME@ ]
* Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
* Resource Group: centreon
* vip (ocf::heartbeat:IPaddr2): Started @CENTRAL_NODE2_NAME@
* http (systemd:httpd24-httpd): Started @CENTRAL_NODE2_NAME@
* gorgone (systemd:gorgoned): Started @CENTRAL_NODE2_NAME@
* centreon_central_sync (systemd:centreon-central-sync): Started @CENTRAL_NODE2_NAME@
* cbd_central_broker (systemd:cbd-sql): Started @CENTRAL_NODE2_NAME@
* centengine (systemd:centengine): Started @CENTRAL_NODE2_NAME@
* centreontrapd (systemd:centreontrapd): Started @CENTRAL_NODE2_NAME@
* snmptrapd (systemd:snmptrapd): Started @CENTRAL_NODE2_NAME@
If you want central node 1 to be the active node again, you must do a failover. Before you do this, you must check the cluster.
First, check the constraints:
pcs constraint
The command should return this:
Location Constraints:
Resource: cbd_rrd-clone
Disabled on:
Node: @DATABASE_NODE1_NAME@ (score:-INFINITY)
Node: @DATABASE_NODE2_NAME@ (score:-INFINITY)
Resource: centreon
Disabled on:
Node: @DATABASE_NODE1_NAME@ (score:-INFINITY)
Node: @DATABASE_NODE2_NAME@ (score:-INFINITY)
Node: @CENTRAL_NODE2_NAME@ (score:-INFINITY) (role:Started)
Resource: ms_mysql-clone
Disabled on:
Node: @CENTRAL_NODE1_NAME@ (score:-INFINITY)
Node: @CENTRAL_NODE2_NAME@ (score:-INFINITY)
Resource: php-clone
Disabled on:
Node: @DATABASE_NODE1_NAME@ (score:-INFINITY)
Node: @DATABASE_NODE2_NAME@ (score:-INFINITY)
Ordering Constraints:
Colocation Constraints:
vip_mysql with ms_mysql-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)
ms_mysql-clone with vip_mysql (score:INFINITY) (rsc-role:Master) (with-rsc-role:Started)
Ticket Constraints:
Now, you can perform a failover to return to the initial situation.
pcs resource clear centreon
Do a cleanup to clear errors.
pcs resource cleanup
You can perform a failover by moving the centreon resource.
pcs resource move centreon
The centreon resource is now relocated and the cluster is OK. Check this with crm_mon -fr
on any node.
Cluster Summary:
Stack: corosync (Pacemaker is running)
Current DC: @CENTRAL_NODE1_NAME@ (version 2.1.8-3.el9-3980678f0) - partition with quorum
Last updated: Fri Mar 21 16:47:43 2025 on @CENTRAL_NODE1_NAME@
Last change: Thu Mar 13 11:30:16 2025 by hacluster via hacluster on @CENTRAL_NODE1_NAME@
4 nodes configured
21 resource instances configured
Node List:
Online: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
Full List of Resources:
* Clone Set: ms_mysql-clone [ms_mysql] (promotable):
* Promoted: [ @DATABASE_NODE1_NAME@ ]
* Unpromoted: [ @DATABASE_NODE2_NAME@ ]
* Stopped: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* vip_mysql (ocf:heartbeat:IPaddr2): Started @DATABASE_NODE1_NAME@
* Clone Set: php-clone [php]:
* Started: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
* Clone Set: cbd_rrd-clone [cbd_rrd]:
* Started: [ @CENTRAL_NODE1_NAME@ @CENTRAL_NODE2_NAME@ ]
* Stopped: [ @DATABASE_NODE1_NAME@ @DATABASE_NODE2_NAME@ ]
* Resource Group: centreon:
* vip (ocf:heartbeat:IPaddr2): Started @CENTRAL_NODE1_NAME@
* http (systemd:httpd): Started @CENTRAL_NODE1_NAME@
* cbd_central_broker (systemd:cbd-sql): Started @CENTRAL_NODE1_NAME@
* gorgone (systemd:gorgoned): Started @CENTRAL_NODE1_NAME@
* centreon_central_sync (systemd:centreon-central-sync): Started @CENTRAL_NODE1_NAME@
* centengine (systemd:centengine): Started @CENTRAL_NODE1_NAME@
* centreontrapd (systemd:centreontrapd): Started @CENTRAL_NODE1_NAME@
* snmptrapd (systemd:snmptrapd): Started @CENTRAL_NODE1_NAME@
View cluster logs
The cluster logs are located in /var/log/cluster/corosync.log
(or in /var/log/corosync/corosync.log
for Debian). To display them, use the following command:
tail -f /var/log/cluster/corosync.log
Useful logs can also be found in /var/log/pacemaker/pacemaker.log
.
Change the cluster log verbosity level
To change the verbosity level of the cluster logs, edit the following files:
/etc/sysconfig/pacemaker
/etc/rsyslog.d/centreon-cluster.conf
Advanced commands
Delete a Pacemaker resource group
Warning: These commands will destroy your Centreon cluster. Do this only if you know what you are doing.
Connect to a cluster node and run the following commands:
pcs resource delete centreon \
cbd_central_broker \
gorgone \
snmptrapd \
centreontrapd \
http \
centreon_central_sync \
vip
If that does not work, it is probably due to a resource in a failed state. Run the following commands to delete the resource:
crm_resource --resource [resource] -D -t primitive -C
pcs resource cleanup centreon