Network-level management gives a high-level picture of the status and connectivity of the network. To take corrective action on overloaded or poorly performing systems, you first need to know the status of the machines in the network, how heavily loaded they are, and how efficiently they are utilized. Server-level management is more hands-on and typically involves considerable manual intervention, staff time, and administrative tasks. Applications Manager provides server-level monitoring to achieve these goals and to ease configuration management of hosts.
To create any of the above server monitors, follow the steps given below:
If a server monitor cannot be added because some input details were entered incorrectly, you can diagnose the issue. Click the Diagnose the Problem link to view information associated with the server, such as the Ping test result, host details, and monitoring modes, along with the list of tables that have stray entries for the same host. This is not applicable to the WMI monitoring mode.
Click on the individual monitors listed to view the following information:
Parameters | Description |
---|---|
System Load | Specifies the number of jobs handled by the system in 1/ 5/ 15 minutes with its peak and current value, and current status. |
Average System Load | Specifies the average amount of work that the system performs per core in 1/ 5/ 15 minutes with its peak value, current value, and current status, along with the total number of CPU Cores present. |
Disk Utilization | Specifies the hard disk space utilized by the system, with the peak value, current value, and current status of the Disk Partition parameter. (The parameter includes the C, D, E, F drives, etc. in Windows, and /home, etc. in Linux.) |
Memory Utilization |
|
Disk I/O Stats | Specifies read/writes per second, transfers per second, for each device. |
Network Interface | Specifies details about the network interface in the system, the rate of traffic transferred, and the status of physical and logical network ports. Note: Network Interface monitoring (for both IBM AIX and Linux) is possible in SSH and Telnet modes. |
Connection Stats | Specifies details about the connection status. Note: Connection Stats is supported in the Linux and AIX (SSH/Telnet) and Windows (WMI) modes of monitoring. |
Network Adapter | Specifies the name, SCSI link status, etc. of the Fibre Channel device. |
CPU Utilization | Specifies the total CPU used by the system with its peak and current value, and current status. |
LPAR CPU Stats |
Note: Utilization in CPU cores and CPU Entitlement Utilization attributes will be shown only when the value for Type attribute is shared. |
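As a rough illustration of the Average System Load row in the table above, a per-core figure can be derived by dividing each load average by the number of CPU cores. The sketch below assumes the Linux `uptime` format (`load average: a, b, c`). `per_core_load` is a hypothetical helper and is not necessarily how Applications Manager computes the value.

```shell
# Hypothetical helper: divide the 1/5/15-minute load averages by the core count.
# $1 = one line of `uptime` output (Linux format assumed), $2 = number of CPU cores
per_core_load() {
  printf '%s\n' "$1" | awk -v cores="$2" '{
    sub(/.*load average: */, "")          # keep only "0.40, 1.00, 2.00"
    n = split($0, avg, /, */)             # avg[1..3] = 1, 5 and 15 minute values
    for (i = 1; i <= n; i++)
      printf "%s%.2f", (i > 1 ? " " : ""), avg[i] / cores
    printf "\n"
  }'
}

# Sample uptime line on a 4-core machine; prints: 0.10 0.25 0.50
per_core_load ' 10:15:01 up 12 days,  3:04,  2 users,  load average: 0.40, 1.00, 2.00' 4
```

On a live Linux host the same helper could be called as `per_core_load "$(uptime)" "$(nproc)"`, assuming both commands are available.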
# The drives beginning with the characters given below will not be monitored in the server monitor.
am.disks.ignore=C:
Here, the C: drive will not be monitored. Likewise, you can add further disks as a comma-separated list (for example, C:,D:,/home).
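To make the "drives beginning with" rule concrete, the sketch below shows one way a comma-separated, prefix-based ignore list could be evaluated. `is_ignored` is a hypothetical helper written for illustration only and says nothing about how Applications Manager implements `am.disks.ignore` internally.

```shell
# Hypothetical helper: succeed if the disk name starts with any listed prefix.
# $1 = comma-separated prefixes (as in am.disks.ignore), $2 = disk or mount name
is_ignored() {
  old_ifs=$IFS
  IFS=,
  for prefix in $1; do
    case "$2" in
      "$prefix"*) IFS=$old_ifs; return 0 ;;   # name begins with this prefix
    esac
  done
  IFS=$old_ifs
  return 1
}

for disk in C: D: /home/user; do
  if is_ignored 'C:,/home' "$disk"; then
    echo "$disk: ignored"
  else
    echo "$disk: monitored"
  fi
done
```

With the sample list `C:,/home`, the loop reports `C:` and `/home/user` as ignored and `D:` as monitored.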
The following table lists the parameters monitored and the modes of monitoring that support them.
To enable it, permissions need to be given to the admin or operator to use this telnet client. The permissions can be given from Settings → User Management → Permissions link.
Operating System |
Telnet | SSH | SNMP | WMI |
---|---|---|---|---|
Windows | (only if Applications Manager is installed on a Windows machine) | |||
Linux | ||||
Sun Solaris | ||||
HP-UX / Tru64 Unix | ||||
FreeBSD | ||||
Mac OS | ||||
IBM AIX | ||||
Novell | ||||
Attributes | ||||
CPU Utilization (all types except Windows NT) | ||||
Disk Utilization (all types) | ||||
Physical Memory Utilization (IBM AIX -only for the root user, Windows - WMI mode, all other types) | ||||
Swap Memory Utilization (IBM AIX - only for the root user, FreeBSD, Linux, Sun Solaris, Windows, Novell) | ||||
Computational Memory Utilization (IBM AIX) | ||||
Network Interface (all types) | [status attribute data is not available] | |||
IP Address Status (Sun Solaris) | ||||
IPMP Group Status (Sun Solaris) | ||||
Multipathing Status (Sun Solaris) | ||||
HBA Port Status (Sun Solaris) | ||||
Network Adapter (IBM AIX) | ||||
Connection Stats (IBM AIX) | ||||
LPAR CPU Stats (IBM AIX) | ||||
Service Fault Status (Sun Solaris) | ||||
System Fault Status (Sun Solaris) | ||||
Chassis Status (Sun Solaris) | ||||
LEDs Status (Sun Solaris) | ||||
Temperature Sensor Status (Sun Solaris) | ||||
Fan Sensor Status (Sun Solaris) | ||||
Voltage Sensor Status (Sun Solaris) | ||||
Current Sensor Status (Sun Solaris) | ||||
Average System Load (IBM AIX and Linux) | ||||
NTP Stats (IBM AIX and Linux) | ||||
Zpool Utilization (Sun Solaris) | ||||
Process Monitoring (all types) | ||||
Process Monitoring - Memory Utilization (all types) | ||||
Process Monitoring - CPU Utilization (IBM AIX - FreeBSD, Linux, Mac OS, Sun Solaris, HP Unix / Tru64) | ||||
Process Monitoring - Zombie Process Count (IBM AIX and Linux) | ||||
Service Monitoring (only for Windows, Linux and AIX) | ||||
Event log (only for Windows ) | ||||
System Load ( IBM AIX, FreeBSD, Linux, Mac OS, HP-Unix, Sun Solaris, Novell ) | ||||
Disk I/O Stats (only for IBM AIX, Linux, Sun Solaris, Novell) | ||||
Disk Errors (Sun Solaris) | ||||
Hardware monitoring (Dell, HP) | (supported only for Sun Solaris) | (supported only for Sun Solaris) | ||
Server Uptime ( IBM AIX, FreeBSD, Linux, Mac OS, HP-Unix, Sun Solaris, Novell, Windows ) | ||||
Firewall monitoring ( Only for Windows ) |
Note: To know more about the configuration details required while discovering the host resource, click here.
Avg. Queue Length and % Busy Time in Disk I/O statistics for AIX are not supported for physical disks and Fibre Channel cards.
When it comes to choosing the mode of monitoring for servers, we recommend Telnet/SSH over SNMP.
To get in-depth details on Page Space in AIX servers, you can use the command "lsps -a".
The command "lsps -a" lists the location of the paging space logical volumes as they were, not as they are.
Normally, page spaces are used when the processes running in the system have used all of the allocated memory and the system has run out of memory space. The system then moves code or data that is not currently referenced by a running process into the page space area, so that it can be moved back to primary memory when a running process references it again.
While trying to monitor the AIX server, if you get "No data available" for Page Space, you can troubleshoot it by following the steps given below:
First, you need to establish a connection only through TELNET or SSH mode.
Second, check whether the command lsps -a exists in the system and then execute it.
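The second step can be sketched as a small shell check run in the Telnet or SSH session. `check_command` is a hypothetical helper name; it only reports whether a command can be found in the current `PATH`.

```shell
# Hypothetical helper: report whether a command is available in PATH.
check_command() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found in PATH"
  fi
}

check_command lsps
```

If the command is reported as found, run `lsps -a` as the monitoring user and confirm that it returns output.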
The "lsps" command displays the characteristics of paging spaces, such as the paging space name, physical volume name, volume group name, size, percentage of the paging space used, whether space is active or inactive, and whether the paging space is set to automatic. The paging space parameter specifies the paging space whose characteristics are to be shown.
The following examples show the use of lsps command with various flags to obtain the paging space information. The "-c" flag will display the information in colon format and paging space size in physical partitions.
# lsps -a
Page Space | Physical Volume | Volume Group | Size | %Used | Active | Auto | Type |
---|---|---|---|---|---|---|---|
paging00 | hdisk1 | rootvg | 80MB | 1 | yes | yes | lv |
hd6 | hdisk1 | rootvg | 256MB | 1 | yes | yes | lv |
To make a paging space available to the operating system, you must add the paging space and then make it available. The total space available to the system for paging is the sum of the sizes of all active paging-space logical volumes.
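As an illustration of the "sum of all active paging spaces" rule, the sketch below totals the Size column for rows whose Active column is `yes`. It assumes whitespace-separated `lsps -a` output with sizes reported in MB, as in the listing above. `total_active_paging_mb` is a hypothetical helper.

```shell
# Hypothetical helper: sum the Size column (MB) of active paging spaces.
# Reads `lsps -a` style output on stdin; line 1 is treated as the header.
total_active_paging_mb() {
  awk 'NR > 1 && $6 == "yes" { sub(/MB$/, "", $4); total += $4 }
       END { printf "%d\n", total }'
}

# Sample input copied from the listing above; prints: 336
total_active_paging_mb <<'EOF'
Page Space  Physical Volume  Volume Group  Size   %Used  Active  Auto  Type
paging00    hdisk1           rootvg        80MB   1      yes     yes   lv
hd6         hdisk1           rootvg        256MB  1      yes     yes   lv
EOF
```

On an AIX host the same function could be fed directly with `lsps -a | total_active_paging_mb`.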
You can get more details about the command here: https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/devicemanagement/pscpag_space_config.html
Apart from the above-mentioned parameters, you can also monitor the following:
To monitor processes in a server
After configuring the processes, they are listed under the Process Details section of the Server Monitor page. By clicking on the process, you can view its availability graph. You can also configure alarms for a particular process.
You can edit the Display Name, Process Name, Commands and Arguments of the particular process by clicking on the Edit Process icon.
To monitor Windows, Linux and IBM AIX services
After configuring the services, they are listed under the Service Details section of the Monitor page. By clicking on the service, you can view its availability graph and also configure alarms for the availability of that particular service.
Apart from monitoring the availability of the service, you can manage services by using the start, stop, and restart options. You can also configure the 'Restart the Service' action, along with other actions, to be executed when the service goes down.
In the Server Monitor page, under Network Interfaces, all the network interfaces will be listed. The various attributes that can be monitored are:
By associating a script or a URL to a Host resource, their attributes become one among the other attributes of the Host and their data is also shown under Host Details itself. The health of the Host resource is dependent on the Health of the Scripts and URLs as well.
For example, if you wish to monitor RequestExecutionTime, RequestsCurrent, and RequestsDisconnected of an ASP.NET application, WMI scripts can be used to get the statistics (this info is not available when Applications Manager is used). You can write your own script that fetches these details, configure the script in Applications Manager, and then associate it with the host monitor. The attributes of the script then behave like the other attributes of the Host monitor, so you can configure the health of the script to directly affect the health of the host.
Likewise, you may want to monitor a website hosted on a system in such a way that any change in the health of the website is reflected in the health of the server. In this case, configure a URL monitor and then associate that URL with the host. If the website is down, the health of the Host resource is affected.
We recommend Telnet or SSH mode of monitoring because the following attributes are not available through SNMP:
Please check this link for more details.
System administrators generally prefer to check system resources with commands and to compare the results with the SSH/Telnet mode output, rather than running an SNMP walk for comparison. Also, having a connection to the Linux boxes over SSH makes it easier to configure the same connection for script monitors or 'execute program' actions if required.
Here is a list of commands used by Applications Manager for Windows, Linux, and Unix servers:
Windows:
Parameter | Command |
---|---|
Disk Utilization | disk.vbs |
Win Physical Disk Stats | diskio.vbs |
Network Interface | NetworkInterface.vbs |
Network Adapter | NetworkAdapter.vbs |
Memory Utilization | memory.vbs |
CPU Utilization | cpu.vbs |
CPU Core Utilization | cpucore.vbs |
Services | services.vbs |
Process | PhyMemCpuImportProduct.vbs |
Server Uptime | uptime.vbs |
Linux:
Parameter | Command |
---|---|
Memory Utilization | free -b |
Memory Utilization in the Memory tab | LANG=C cat /proc/meminfo;echo '-----FREE_MEM_STATS-----';LANG=C free -m |
System reboot | date +%s;/bin/cat /proc/uptime | cut -d "." -f1 |
Current Date and Time | LANG=C date |
ThreadCount | ps -eo nlwp | awk '{ threadcount += $1 } END { print threadcount }' |
Disk Utilization | /bin/df -Pm |
Disk IO Stats | export S_COLORS='never';LANG=C iostat -d;echo '-----DISK_EXTENDED_STATS-----';iostat -d -x 1 3 |
Inode Usage | /bin/df -Pi |
System Load | uptime |
CPU Utilization | /usr/bin/vmstat 1 3 |
CPU Core Utilization | export S_COLORS='never';mpstat -P ALL 1 3 |
Kernel Statistics | export S_COLORS='never';LANG=C sar -B -w 1 3 | awk '{if(NR>2)print}' |
Server Uptime | uptime|cut -d ',' -f1,2|tr -s ' ' '^'|cut -d '^' -f 2- |
Network Interface | LANG=C ip -s -j link (if json format is supported) (or) LANG=C ip -s link |
Network State | LANG=C netstat -nat | awk '{if(NR>1)print}' | awk '{print $6}' | sort | uniq -c | sort -n |
NTP Monitoring | ntpstat (or) chronyc tracking |
NTP Status | yes N | LANG=C ntpstat (or) LANG=C chronyc tracking (if Chronyc installed) |
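The System reboot command in the table above prints two lines: the current time in epoch seconds and the uptime in whole seconds. A last-boot timestamp can be derived by subtracting the second from the first. The sketch below shows that arithmetic. `boot_epoch` is a hypothetical helper, and the subtraction is an assumption about how the two values are meant to be combined.

```shell
# Hypothetical helper: derive the boot time (epoch seconds) from two input lines.
# stdin line 1 = current epoch seconds, line 2 = uptime in whole seconds
boot_epoch() {
  awk 'NR == 1 { now = $1 }
       NR == 2 { up = $1 }
       END { printf "%d\n", now - up }'
}

# Sample values: current time 1700000000, up for one hour; prints: 1699996400
printf '1700000000\n3600\n' | boot_epoch
```

On a Linux host, the output of `date +%s;/bin/cat /proc/uptime | cut -d "." -f1` could be piped into the same function.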
Unix:
Parameter | Command |
---|---|
Memory Utilization | export UNIX95;top -d 1 -n 2 |
Disk Utilization | /bin/df -m |
System Load | uptime |
CPU Utilization | /usr/bin/vmstat 1 3 |
CPU Core Utilization | /usr/bin/vmstat -n 0 -P 1 3 |
Server Uptime | uptime|cut -d ',' -f1,2|tr -s ' ' '^'|cut -d '^' -f 2- |
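To see what the Server Uptime pipeline produces, the sketch below runs the same `cut`, `tr`, and `cut` stages on a sample `uptime` line instead of live output. The sample line and the wrapper name `format_uptime` are illustrative assumptions. The pipeline stages are taken from the table above.

```shell
# Same stages as the Server Uptime command, reading an uptime line from stdin.
format_uptime() {
  # 1. keep the first two comma-separated fields (time, "up ..." part)
  # 2. turn each run of spaces into a single '^'
  # 3. drop the empty first field created by the leading space
  cut -d ',' -f1,2 | tr -s ' ' '^' | cut -d '^' -f 2-
}

sample=' 10:15:01 up 12 days,  3:04,  2 users,  load average: 0.10, 0.20, 0.30'
printf '%s\n' "$sample" | format_uptime
# prints: 10:15:01^up^12^days,^3:04
```

The `^` separators make the result easy to split into fields afterwards, because they cannot be confused with the variable-width spacing in the raw `uptime` output.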