What is Failover?
Failover is an alternative monitoring instance that is used to ensure your network remains monitored even when your primary monitoring setup goes down. Network Configuration Manager helps in ensuring efficient network configuration management by allowing you to configure a secondary monitoring instance on a separate server.
How does Failover work?
The primary server updates a value called the heartbeat in the database. The heartbeat is a counter that the primary server increments at a fixed interval. The secondary server monitors the heartbeat to check that it is being updated within that interval. When the primary server goes down, it can no longer update the heartbeat in the database. If the heartbeat has not been updated for 60 seconds, the primary server is considered to be down and the secondary monitoring instance takes over. The secondary server continues monitoring the network for as long as it is up. If the primary server later recovers and restarts, it runs in standby mode and lets the secondary server continue monitoring.
The information on the primary and secondary instances is synced periodically, which ensures that you don't miss critical monitoring data (such as syslog messages) when your primary Network Configuration Manager instance goes down.
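The heartbeat check described above can be sketched as follows. This is only an illustration of the idea, not the product's actual implementation: the class name HeartbeatWatcher and its observe method are hypothetical, and the caller is assumed to read the counter from the database and pass in the current time.

```python
from dataclasses import dataclass
from typing import Optional

# From the text: no heartbeat update for 60 seconds means the primary is down.
HEARTBEAT_TIMEOUT_SECONDS = 60.0


@dataclass
class HeartbeatWatcher:
    """Hypothetical helper: remembers the last counter value and when it changed."""

    timeout: float = HEARTBEAT_TIMEOUT_SECONDS
    last_value: Optional[int] = None
    last_change: Optional[float] = None

    def observe(self, value: int, now: float) -> bool:
        """Record one heartbeat reading; return True if the secondary should take over."""
        if self.last_value is None or value != self.last_value:
            # First reading, or the counter moved: the primary is alive.
            self.last_value = value
            self.last_change = now
            return False
        # Counter unchanged: take over only once the timeout has elapsed.
        return (now - self.last_change) >= self.timeout


watcher = HeartbeatWatcher()
watcher.observe(41, now=0.0)               # first reading
watcher.observe(42, now=30.0)              # counter advanced, primary is alive
takeover = watcher.observe(42, now=95.0)   # unchanged for 65 seconds
```

Passing the timestamp in explicitly, rather than reading the clock inside the method, keeps the check easy to test without waiting for real time to pass.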
What are the prerequisites?
- Apply the failover add-on: Apply the Failover - Hot Standby Engine add-on in your primary instance. You can purchase the add-on for Professional Edition from here. (Note: Failover is supported in both MSSQL and remote PGSQL setups. To configure failover for a remote PGSQL setup, click here.)
- Have the database on a separate server: Ensure that the database for your Network Configuration Manager installation is set up on a separate server, and not on the server where the primary or secondary Network Configuration Manager instance is installed (an MSSQL setup is preferred).
- Create a shared folder on a separate server: Some Network Configuration Manager data is stored in files in a local directory. When failover is configured, these files are stored in a shared folder that is accessible to both the primary and secondary servers instead. This ensures that no data is lost when the secondary server takes over monitoring.
Create a folder in a separate server and share it with both the primary and secondary servers. Ensure that both primary and secondary servers have access to the shared folder with write permission.
(Note: The server on which the folder is created should be in the same domain as your primary and secondary servers. It should also not be the server on which the primary or secondary instance is configured.) Click here to learn how to share a folder with both primary and secondary instances.
- Have a virtual IP address: A virtual IP address is a common IP address shared by the primary and secondary servers on the same subnet. When one server goes down, the virtual IP points to the other server.
- Hardware and software requirements
- Both the primary and secondary instances should be installed in Windows systems.
- The same version of Network Configuration Manager should be installed in both servers.
- Both primary and secondary Network Configuration Manager services should have the same port and protocol (http/https).
- Both primary and secondary servers should have the same time and time zone.
- Both primary and secondary servers should have the same hardware configurations.
- Network requirements
- Both primary and secondary servers should have a static IP address.
- The virtual IP should be static and in IPv4 format.
- The primary server and secondary server should be able to resolve each other's host name and IP address.
- The IP and virtual IP of both the primary and secondary servers should belong to the same subnet.
- Both the servers should have high connectivity and bandwidth.
- The primary, secondary and the server in which the shared folder is created should all be in the same domain.
- Syslogs, SNMP traps and flows should be forwarded to the virtual IP address.
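The subnet requirement in the list above can be checked before configuring failover. The following is a minimal sketch using Python's standard ipaddress module; the function name same_subnet and the sample addresses are assumptions made for illustration only.

```python
import ipaddress


def same_subnet(addresses, netmask="255.255.255.0"):
    """Return True if every IPv4 address falls in the same network under `netmask`.

    Raises ValueError if an address is not valid IPv4, which also covers the
    requirement that the virtual IP be in IPv4 format.
    """
    networks = {
        ipaddress.IPv4Interface(f"{addr}/{netmask}").network
        for addr in addresses
    }
    return len(networks) == 1


# Hypothetical primary, secondary and virtual IP addresses.
ok = same_subnet(["192.168.1.10", "192.168.1.11", "192.168.1.50"])
```

Here the three sample addresses all belong to 192.168.1.0/24, so the check passes; changing any one of them to an address on another network would make it fail.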
In your primary instance, go to Settings -> General Settings -> Failover Details and enter the following details:
- Secondary Server IP: The IP address or host name of your secondary server.
- Shared folder path: The path to the empty shared folder created in a separate server. This is generally of the form \\<Server_Name_or_IP>\<Share_Name>.
Note: Ensure that the empty folder is shared with both primary and secondary servers. Click here to learn how to share the folder with primary and secondary servers.
- Virtual IP: The virtual IP address. Refer to the prerequisites to learn more about the virtual IP address.
- Subnet mask (optional): The subnet mask is used to bind the Virtual IP. By default, it is typically set to 255.255.255.0. If the failover configuration is completed before adding the subnet mask value, follow this link for instructions on how to change it.
- Email address (optional): The recipients who should receive failover self-monitoring alerts, data synchronization alerts and secondary server takeover alerts. To specify multiple recipients, separate the email addresses with commas.
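Before saving these details, it can help to confirm from each server that the shared folder really is writable. The sketch below is one way to do that in Python. The function name can_write_to is hypothetical, and the check simply creates and removes a small probe file, so it should be run against the shared folder before failover is configured, since the folder is expected to be empty.

```python
import os
import uuid


def can_write_to(folder: str) -> bool:
    """Return True if a probe file can be created in and removed from `folder`."""
    probe = os.path.join(folder, f".failover_probe_{uuid.uuid4().hex}")
    try:
        with open(probe, "w", encoding="utf-8") as handle:
            handle.write("probe")
        os.remove(probe)  # leave the folder as it was found
        return True
    except OSError:
        # Covers missing folders, unreachable shares and permission errors.
        return False
```

On a Windows server, the function would be called with the shared folder path in the \\<Server_Name_or_IP>\<Share_Name> form, written as a raw string so the backslashes are preserved.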
Save the details and perform the following steps in the primary and secondary servers:
In the primary server:
- Stop the Network Configuration Manager service.
- Share the <NCMHome> folder with the secondary server. Click here to learn how.
- Open a command prompt with administrator privileges, navigate to <NCMHome>\bin and execute the following command:
Clone_primary_server.bat
- Start the Network Configuration Manager service.
In the secondary server:
- Download the Configure_failover_server.bat file and move it to the folder where you wish to have your secondary instance configured.
- Run the Configure_failover_server.bat file.
- Share the <NCMHome> folder with the primary server. Click here to learn how.
- Start the secondary Network Configuration Manager instance.
Note:
- Network Configuration Manager does not provide database failover support. It provides application-level failover support only.
- Always start the secondary instance only after the primary instance has started completely.
- The secondary server takes approximately 3-4 minutes to completely take over from the primary. A few syslogs received during that period may be lost.
Upgrading the failover setup: When upgrading your Network Configuration Manager service, it is enough to apply the PPM to the primary setup. The secondary server will be updated automatically.
Encrypted File transfer
In virtual IP based failover, the configuration files in the primary and secondary setups are synced periodically. From version 127189, encrypted file transfer between the primary and secondary servers is supported. Please contact our support team to enable it.
Note: Encrypted file transfer is supported only on Windows Server 2012, Windows 8 and later versions. Make sure that the primary server, the secondary server and the server hosting the shared folder all support encrypted file transfer.
Change the subnet mask:
If you have already configured failover and want to change the subnet mask, follow the steps below:
- Stop both primary and secondary services.
- Open itom_fos.conf in the <NCMHome>\conf directory and modify the subnet mask value in the following key: publicIP.netmask (this has to be done on both the primary and secondary servers).
- Start the primary service and wait until it has started completely and the UI is accessible, then start the secondary service.
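The edit to itom_fos.conf can also be scripted. The sketch below assumes that the file stores settings as plain key=value lines, which the text above does not confirm, so check the actual file layout before relying on it. The function name set_netmask is hypothetical. It works on the file contents as a string, so reading and writing the file (with both services stopped) is left to the caller.

```python
import ipaddress


def set_netmask(conf_text: str, netmask: str, key: str = "publicIP.netmask") -> str:
    """Return `conf_text` with the value of `key` replaced by `netmask`.

    Assumes key=value lines. Raises ValueError for an invalid mask and
    KeyError if the key is not present.
    """
    # Rejects strings that are not a valid IPv4 mask (for example 255.0.255.0).
    ipaddress.IPv4Network(f"0.0.0.0/{netmask}")

    lines = conf_text.splitlines()
    found = False
    for index, line in enumerate(lines):
        name, separator, _ = line.partition("=")
        if separator and name.strip() == key:
            lines[index] = f"{key}={netmask}"
            found = True
    if not found:
        raise KeyError(key)

    trailing_newline = "\n" if conf_text.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline
```

Validating the mask before touching the text means a typo fails early instead of ending up in the configuration of only one of the two servers.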