PostgreSQL Monitoring Tools


Overview

Applications Manager's PostgreSQL Monitoring capabilities play a vital role in monitoring your PostgreSQL database servers by providing end-to-end visibility into the performance of your database server in real-time. It aids you to keep check over the health and availability of your PostgreSQL servers by monitoring various key metrics such as database utilization, response time, query performance, session details and much more.

With Applications Manager, you can ensure that the performance of your PostgreSQL environments is kept on point. In this help document, you will learn how to get started with Postgres monitoring, along with the list of parameters that are monitored by Applications Manager.

For detailed information on the performance configurations of your PostgreSQL database server and recommendations for tuning them, visit our page on PostgreSQL tuning.

Creating a new PostgreSQL performance monitor

Supported Versions : 8.1 and above.

Prerequisites for monitoring PostgreSQL metrics : Click here

Rest API to add a new PostgreSQL performance monitor : Click here

To create a PostgreSQL database server monitor, follow the steps given below:

  1. Click on New Monitor link.
  2. Select PostgreSQL.
  3. Enter the Display Name of the database server.
  4. Enter the IP Address or the Host Name of the host.
  5. Enter the port number in which PostgreSQL is running.
  6. Provide the user name and password of user who has permission to access the PostgreSQL database.
  7. Specify the DBName.
  8. Enter the polling interval time in minutes.
  9. If you are adding a new monitor from an Admin Server, select a Managed Server.
  10. Choose the Monitor Group with which you want to associate the PostgreSQL database server to, from the combo box (optional). You can choose multiple groups to associate your monitor.
  11. Click Add Monitor(s). This discovers PostgreSQL database server from the network and starts monitoring them.

Monitored Parameters

  • Availability tab displays the availability history of the PostgreSQL database servers in your network for the past 24 hours or 30 days.
  • Performance tab displays the health status and events for the past 24 hours or 30 days.
  • List view tab displays all the PostgreSQL servers that you monitor along with their current availability and health status. You can also perform bulk admin configurations from this view.

Applications Manager's PostgreSQL monitoring tool provides complete visibility into your database environments based on the metrics listed in the following tabs:

Monitor Information

Parameter Description
Name Denotes the name of PostgreSQL database server monitor.
Type Denotes the type you are monitoring.
Health Denotes the health (Clear, Warning, Critical) status of the PostgreSQL server.
Last Polled at Specifies the time at which the last poll was performed.
Next Poll at Specifies the time at which the next poll is scheduled.
Availability Shows the current status of the server - available or not available.

Overview

Parameter Description
Connection Statistics:
Active Connections Number of currently active connections to the database.
Total Users The total number of users active at the time of data collection.
Lock Statistics:
Locks Held Number of locks held by the indicated session.
Locks Wait Number of locks waiting in the database.
Buffer Statistics:
Buffer Hits/min Total buffer hits (i.e., block read requests avoided by finding the block already in buffer cache) per minute.
Block Reads/Min Total disk blocks read per minute.
Cache Hit Ratio The current ratio of buffer cache hits to total requests.
Disk Usage Details:
Disk Usage Size of the on-disk representation of all tables in the database in MB.
Index usage Size of the on-disk representation of all indexes in the database in MB.
Index Scan Details:
Index scans/min Total number of index scans initiated per minute.
Index Reads/min Total number of index entries returned by index scans per minute.
Index Fetches/min Total number of live table rows fetched by simple index scans per minute.
Query Statistics:
Row inserts/min Total number of rows inserted per minute.
Row Updates/min Total number of rows updated per minute.
Row Deletes/min Total number of rows deleted per minute.
Transaction Details:
Total Commits Total number of transactions committed.
Total Rollbacks Total number of transactions rolled back.
Commits/Min Total number of transactions committed per minute.
Rollbacks/Min Total number of transactions rolled back per minute.
Table Level Scan Details:
Sequential Scans/min Total number of sequential scans per minute.
Table Index Scans/min Total number of index scans per minute.
Sequential Scan Rows Read/min Total number of rows returned by sequential scans per minute.
Table Index Scan Rows Read/min Total numbers of rows returned by index scans per minute.
Primary Database Object Statistics:
Total Tables Total number of tables in the database server.
Total Triggers Total number of triggers in the database server.
Total Procedures Total number of procedures in the database server.
Size of the Largest Table Size of the largest table in the database server.
Largest Table(s) Largest table in the database server.

Performance

Parameter Description
Top Queries by CPU:
Database Name The database name under which the query has been executed.
Total CPU Time The cumulative total amount of CPU time that has been spent in running the query (in millisecond).
Average CPU Time The average CPU time taken to execute the query (in millisecond).
Max CPU Time The maximum amount of time that was taken by the CPU to run the query (in millisecond). (Available from Postgres 9.4)
Query The SQL query which consumed the maximum CPU time.
Long Running Query Details:
Process ID The identifier for the process caused by the query.
Runtime Amount of time for which the query has been running (in minutes).
User Name Name of the user executing the query.
Query The SQL query that was running.
Database Name The database name under which the query has been running.
Top 50 Table Row Details:
Table Name Name of the table.
No. of Rows Number of rows available in the table.
Table Size Size of the table (in MB).
Used Size(MB) Size of the used space(in MB).
Free Size(MB) Size of the free space (in MB).
Used Size(%) Size of the used space in percentage.
Free Size(%) Size of the free space in percentage.
Top 10 Table By Dead Tuples Percentage
Schema Name The name of the schema where the table resides.
Table Name The name of the table.
Data Length(MB) The total size of the table's data (in MB).
Index Length(MB) The size of the table's indexes (in MB).
Live Tuples Number of live tuples (rows) in the table.
Dead Tuples Number of dead tuples (rows that have been marked for deletion but not yet vacuumed).
Dead Tuple Percentage(%) Percentage of Dead Tuples.
Days Since Last Vacuum Days passed since the last VACUUM operation performed on the table.
Last Vacuum Timestamp of the last VACUUM operation performed on the table.
Days Since Last Auto Vacuum Days passed since the last automatic VACUUM operation performed on the table.
Last Auto Vacuum Timestamp of the last automatic VACUUM operation performed on the table.

 

Note: If the threshold is breached for Top 10 queries by CPU and Long-Running Query Detail attributes, the details about the threshold breach will be attached in alert RCA message and as an HTML attachment in Email notification. To apply these feature, follow the given steps:
  • Go to Performance Polling -> Optimize Data Collection.
  • Choose PostgreSQL monitor type and Performance metric.
  • Set the time interval in Configure Long Running Query time interval in minutes tab.
  • Enable Attach query performance details in alert mail option if you wish to attach the query performance details as HTML file to the alert notification mails raised for metrics in 'Performance' tab.

Database

Parameter Description
Database Details:
Database Name Name of the database instance.
DB Size Size of the database instance (in MB).
Connections Number of connections for the database instance.
TableSpace Name of the tablespace.
TableSpace Details:
Name Name of the tablespace.
Owner Owner who created the tablespace.
Location Location where the tablespace is created.
Size Current size of the tablespace (in MB).

Session

Parameter Description
Session Details:
Process ID Displays the process ID of the session running in the Postgres server.
Client Address IP address of the client connected to the server.
Application Name The name of the application that has established the session. (Available from Postgres 9.0)
Database Name Displays the name of the database currently being used by the process.
User The login name of the user executing the process.
State The state of the process in the Postgres server. (Available from Postgres 9.2)
Blocked Indicates whether the process is blocked or is currently waiting on lock.
Wait Event Displays the Wait Event Name and Wait Event Type that is currently waiting. (Available from Postgres 9.6)
Query The SQL query that was last executed in the process.
CPU Time The cumulative CPU time taken for the process running in the Postgres server (in millisecond).
USER PROFILE DETAILS
User ID Unique OID of the user.
User Name Username of the user.
Role Name Privilege assigned to the user.
Super User User has Super user privilege or not. Possible values: True/ False.
CreateDB Allowed User has the privilege to create the DB. Possible values: True/ False.
Replication User User has is Replication User previlege or not. Possible values: True/ False.
Expired Days Number of days until the password expiry.
Expired Time Expire time of the Password

Lock

Parameter Description
Locked Table Details
Process ID The unique ID of the process.
Table Name Name of the table.
Lock Type The type of lock held. Possible Values: relation/ tuple/ page/ extend/ access share/ row share/ row exclusive etc..
Page Page number within the table's data file.
Virtual Transaction Represents the identifier for the virtual transaction holding the lock.
Mode Lock mode indicates the specific type and level of lock held.
Granted Access Boolean value (true or false) indicating whether the lock has been granted (true) or is waiting (false).
Blocked Session Details
Process ID Process ID (PID) of the session that is currently blocked.
User Name Username of the session that is blocked (blocked_pid).
Blocked Session by Process ID Array of PIDs of sessions that are blocking the session identified by PID.
Blocked Query Current query being executed by the session that is blocked.

Replication

Parameter Available Versions Description
Replication Details:
WAL files count >=9.2 Number of WAL files present in WAL directory.
Current WAL Location >=9.2 Gives the last inserted location.
WAL level >=9.2 wal_level determines how much information is written to the WAL.
  • Default value = minimal (version < 10) minimal → writes only the information needed to recover from a crash or immediate shutdown.
  • Default value = replica (version > 10) replica → writes enough data to support WAL archiving and replication, including running read-only queries on a standby server.
  • Other states: hot_standby → minimal state/replica state info + adds information required to run read-only queries on a standby server.
WAL sync method >=9.2 Method used for forcing WAL updates out to disk. If fsync is off, then this setting is irrelevant, since WAL file updates will not be forced out at all. Possible values are: open_datasync (write WAL files with open() option O_DSYNC), fdatasync (call fdatasync () at each commit), fsync (call fsync () at each commit), fsync_writethrough (call fsync () at each commit, forcing write-through of any disk write cache)
checkpoints_timed >=9.2 Number of scheduled checkpoints that have been performed.
checkpoints_req >=9.2 Number of requested checkpoints that have been performed.
buffers_checkpoint >=9.2 Number of buffers written during checkpoints.
buffers_clean >=9.2 Number of buffers written by the background writer.
buffers_backend >=9.2 Number of buffers written directly from backend.
Client Address >=9.2 Address of client/standby server address.
Client Port >=9.2 TCP port number that the client is using for communication with this WAL sender.
PID >=9.2 Process ID of a WAL sender process.
application name >=9.2 Name of the application that is connected to this WAL sender.
Sender State >=9.2 Current WAL sender state. Possible values are:
  • startup: This WAL sender is starting up.
  • catchup: This WAL sender's connected standby is catching up with the primary.
  • streaming: This WAL sender is streaming changes after its connected standby server has caught up with the primary.
  • backup: This WAL sender is sending a backup. stopping: This WAL sender is stopping.
Sync State >=9.2 Synchronous state of this standby server. Possible values are:
  • async: This standby server is asynchronous.
  • potential: This standby server is now asynchronous, but can potentially become synchronous if one of the current synchronous ones fails.
  • sync: This standby server is synchronous.
  • quorum: This standby server is considered as a candidate for quorum standbys.
Sending lag >=9.2 Lag in sending the transactions to standby
Last replay location >=9.2 Gives the last replayed location
Write lag >=9.2 Lag in writing transaction to disk in standby.
Flush lag >=9.2 Lag in flushing transactions to disk in standby.
Replay lag >=9.2 Lag in replaying the transactions on standby
Total lag >=9.2 Total lag sending from master to replaying transaction on standby.
Master Host >=9.6 Host details of master node.
Master Port >=9.6 Port of master node.
Master User >=9.6 User details of the master node.
Replication Delay >=9.1 Time lag between the last received transaction and the last replayed transaction.
Replication Slots:
slot_name >=9.6 A unique, cluster-wide identifier for the replication slot.
slot_type >=9.6 The slot type - physical or logical.
active >=9.6 True if this slot is currently actively being used.
restart_lsn >=9.6 The address (LSN) of oldest WAL which still might be required by the consumer of this slot and thus won't be automatically removed during checkpoints.
confirmed_flush_lsn >=9.6 The address (LSN) up to which the logical slot's consumer has confirmed receiving data. Data older than this is not available anymore. NULL for physical slots.

Configuration

Parameter Description
Version Version of the PosgreSQL Server.
Config File Location of Postgres server configuration file.
Data Directory Location of data directory used by the Postgres server.
External PID File Location of external PID file.
hba File Location of hba file.
ident File Location of ident file.