Apache Kafka is an open-source, fault-tolerant publish-subscribe messaging system originally developed at LinkedIn. A distributed log service, Kafka is often used in place of traditional message brokers because of its higher throughput, scalability, reliability, and built-in replication.
An attractive option for data integration, Apache Kafka is fast and highly scalable. Kafka nodes can be added and removed elastically, and a single node can handle hundreds of megabytes of reads and writes per second from thousands of clients. Data streams are split into partitions and spread across different brokers. Although simple at a high level, Kafka has considerable technical depth, so robust Kafka monitoring software is essential for troubleshooting issues and optimizing performance.
Applications Manager's Kafka monitoring helps administrators collect Kafka metrics, manage clusters, and receive automatic alerts on potential issues. This page covers what you need in order to monitor Kafka and the performance metrics you can gather with the Applications Manager Kafka monitor.
Supported versions: Versions 0.7.0 to 3.4.1
Prerequisites for monitoring Apache Kafka: JMX support must be enabled in order to monitor Apache Kafka in Applications Manager. To learn more about enabling JMX in Kafka, click here.
Using the REST API to create a new Kafka monitor: Click here
To create an Apache Kafka Monitor, follow the steps given below:
Note:
If you are unable to add the monitor even after enabling JMX, try adding the following JVM argument to the Kafka startup options, replacing [YOUR_IP] with the IP address of the Kafka host:
-Djava.rmi.server.hostname=[YOUR_IP]
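As a rough illustration of how this argument fits with the other JMX settings, the sketch below assembles the environment variables for a broker. It assumes the broker is launched with Kafka's stock shell scripts, which read `JMX_PORT` and `KAFKA_JMX_OPTS`. The helper name `build_kafka_jmx_env`, the port 9999, and the address 192.0.2.10 are placeholders introduced here for illustration, not values taken from Applications Manager.

```python
def build_kafka_jmx_env(rmi_hostname, jmx_port=9999):
    """Return environment variables that enable remote JMX on a Kafka broker.

    Assumption: the broker is started with Kafka's stock scripts, which
    honor JMX_PORT and KAFKA_JMX_OPTS. The default port is a placeholder.
    """
    jmx_opts = " ".join([
        "-Dcom.sun.management.jmxremote",
        # Unauthenticated, unencrypted JMX: acceptable only on a trusted network.
        "-Dcom.sun.management.jmxremote.authenticate=false",
        "-Dcom.sun.management.jmxremote.ssl=false",
        # The argument from the note: the address RMI advertises to clients.
        f"-Djava.rmi.server.hostname={rmi_hostname}",
    ])
    return {"JMX_PORT": str(jmx_port), "KAFKA_JMX_OPTS": jmx_opts}


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only placeholder address.
    for name, value in build_kafka_jmx_env("192.0.2.10").items():
        print(f'export {name}="{value}"')
```

The printed `export` lines can be set in the shell before starting the broker. On networks that are not fully trusted, JMX authentication and SSL should be enabled instead of switched off as in this sketch.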
Go to the Monitors Category View by clicking the Monitors tab. Click Apache Kafka under the Middleware/Portal table. The Apache Kafka bulk configuration view is displayed, organized into three tabs.
Applications Manager's Kafka performance monitoring provides complete visibility into your Kafka servers based on the metrics listed in the following tabs:
Parameter | Description |
---|---|
Memory Details | |
Total Physical Memory Size | The total amount of physical memory, in megabytes. |
Free Physical Memory Size | The amount of free physical memory, in megabytes. |
Committed Virtual Memory Size | The amount of virtual memory that is guaranteed to be available to the running process, in megabytes. |
Total Swap Space Size | The total amount of swap space on the host. |
Free Swap Space Size | The amount of free swap space on the host. |
Thread Details | |
Daemon Thread Count | The number of daemon threads currently running. |
Peak Thread Count | The peak live thread count since the Java virtual machine started or since the peak was last reset. |
Live Thread Count | The number of live threads currently running. |
Total Started Thread Count | The total number of threads created and also started since the Java virtual machine started. |
Heap and Non Heap Memory Details | |
NonHeapMemoryUsage | The non-heap memory currently in use. |
HeapMemoryUsage | The heap memory currently in use. |
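The JVM's platform MBeans report memory sizes in bytes, and heap usage arrives as a composite value with `init`, `used`, `committed`, and `max` fields. The sketch below shows one way raw readings could be turned into megabytes and a heap utilization percentage. The function names are hypothetical, and the product's own conversion is not documented here.

```python
BYTES_PER_MEGABYTE = 1024 * 1024


def bytes_to_megabytes(size_in_bytes):
    """Convert a raw JMX size in bytes to megabytes."""
    return size_in_bytes / BYTES_PER_MEGABYTE


def heap_used_percent(heap_usage):
    """Return used heap as a percentage of the maximum heap.

    `heap_usage` is a dict mirroring the HeapMemoryUsage composite value.
    The JVM reports max as -1 when no maximum is defined, in which case
    a percentage cannot be computed and None is returned.
    """
    max_bytes = heap_usage["max"]
    if max_bytes <= 0:
        return None
    return 100.0 * heap_usage["used"] / max_bytes
```

For example, a reading with 256 MB used out of a 1024 MB maximum gives a utilization of 25 percent.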
In a Kafka cluster, one of the brokers serves as the controller, which is responsible for managing the states of partitions and replicas and for performing administrative tasks like reassigning partitions.
Parameter | Description |
---|---|
Kafka Controller Details | |
Active Controller Count | Number of active controllers in the cluster. |
Offline Partitions Count | The number of unavailable partitions. |
Leader Election Rate | The rate of leader elections. (When a partition leader dies, an election for a new leader is triggered.) |
Unclean Leader Election Rate | The rate of unclean leader elections. (When a broker that is the leader for a partition goes offline, a new leader is normally elected from the partition's set of in-sync replicas (ISR). An unclean leader election is the special case in which no in-sync replica is available, so an out-of-sync replica becomes the leader, which can result in data loss.) |
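The controller metrics lend themselves to simple health rules: a healthy cluster has exactly one active controller across all brokers, no offline partitions, and no unclean leader elections. The following sketch expresses those rules. The function name and message wording are illustrative assumptions, and the rules reflect common operational guidance rather than Applications Manager's default thresholds.

```python
def controller_alerts(active_controller_counts, offline_partitions,
                      unclean_election_rate):
    """Return a list of alert messages for the controller metrics.

    active_controller_counts: one Active Controller Count value per broker.
    offline_partitions: the Offline Partitions Count value.
    unclean_election_rate: the Unclean Leader Election Rate value.
    """
    alerts = []

    # Exactly one broker in the cluster should report itself as controller.
    total_active = sum(active_controller_counts)
    if total_active != 1:
        alerts.append(
            f"expected exactly 1 active controller, found {total_active}")

    # Any offline partition is unavailable for reads and writes.
    if offline_partitions > 0:
        alerts.append(f"{offline_partitions} partition(s) offline")

    # Unclean elections risk data loss, so any non-zero rate is worth a look.
    if unclean_election_rate > 0:
        alerts.append("unclean leader elections are occurring")

    return alerts
```

Calling `controller_alerts([1, 0, 0], 0, 0.0)` on a healthy three-broker cluster returns an empty list.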
Parameter | Description |
---|---|
Log Details | |
Log Flush Rate | The asynchronous disk log flush rate. |
Broker Topic Metrics | |
Bytes In / Min | The aggregate incoming byte rate (amount of data written to topic on this broker) per minute. |
Bytes Out / Min | The aggregate outgoing byte rate per minute. |
Bytes Rejected / Min | The amount of data in messages rejected by broker per minute. |
Failed Fetch Requests / Min | The number of fetch (read) requests that this broker failed to process per minute. |
Failed Produce Requests / Min | The number of produce (write) requests that this broker failed to process per minute. |
Messages In / Min | The number of messages received by the Kafka broker per minute. |
Replication Manager | |
IsrExpands / Min | The number of in-sync replica (ISR) expansions per minute. (When a broker that went down comes back up, the ISR of its partitions expands once the replicas have fully caught up.) |
IsrShrinks / Min | The number of in-sync replica (ISR) shrinks per minute. (When a broker goes down, the ISR of some of the partitions shrinks.) |
Leader Count | The number of partitions for which a particular host is the leader. |
Partition Count | The number of partitions hosted on this broker. |
Under Replicated Partitions | The number of under-replicated partitions, that is, partitions with fewer in-sync replicas than their configured replica count. |
Request Handler Avg Idle Percent | The average fraction of time the request handler threads are idle. |
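Kafka exposes throughput metrics such as BytesInPerSec and MessagesInPerSec as meters with a cumulative `Count` attribute, so a per-minute figure can be derived from two successive samples. The sketch below shows one such derivation. How Applications Manager actually computes its "/ Min" values is not documented here, and the function name is hypothetical.

```python
def per_minute_rate(previous_count, current_count, elapsed_seconds):
    """Derive a per-minute rate from two samples of a cumulative counter.

    Returns None when the counter went backwards, which typically means
    the broker restarted between the two polls and the sample is unusable.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")

    delta = current_count - previous_count
    if delta < 0:
        return None

    # Scale the observed delta to a 60-second window.
    return delta * 60.0 / elapsed_seconds
```

With a counter that moves from 1000 to 4000 over a 120-second polling interval, the function yields 1500 per minute.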
Parameter | Description |
---|---|
Requests Process Rate | |
Request Produce / Min | The number of produce requests (requests from producers to write messages) handled by this broker per minute. |
Request Fetch Consumer / Min | The number of fetch requests from consumers handled by this broker per minute. |
Request Fetch Follower / Min | The number of fetch requests per minute from brokers that are followers of a partition and are requesting new data. |
Time Taken For Requests | |
Total Time Produce / Min | The total time taken to serve produce requests. |
Total Time Fetch Consumer / Min | The total time taken to serve fetch requests from consumers. |
Total Time Fetch Follower / Min | The total time taken to serve fetch requests from the followers of a partition. |
Network Processor Rate | |
Network Processor Avg Idle Percent / Min | The average fraction of time the network processor threads are idle. |
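When cross-checking these values with a generic JMX client, it helps to know which MBeans carry them. The sketch below builds the object names Kafka uses for the request metrics and flags a low idle fraction. The helper names are hypothetical. The 0.3 threshold is a commonly cited rule of thumb and is an assumption here, not a product default. Newer Kafka releases also add a `version` key to the RequestsPerSec object name, which this sketch does not cover.

```python
REQUEST_TYPES = ("Produce", "FetchConsumer", "FetchFollower")
METRIC_NAMES = ("RequestsPerSec", "TotalTimeMs")


def request_metric_object_name(metric_name, request_type):
    """Build the JMX object name for a Kafka request metric."""
    if metric_name not in METRIC_NAMES:
        raise ValueError(f"unsupported metric: {metric_name}")
    if request_type not in REQUEST_TYPES:
        raise ValueError(f"unsupported request type: {request_type}")
    return (f"kafka.network:type=RequestMetrics,"
            f"name={metric_name},request={request_type}")


def is_low_idle(idle_fraction, threshold=0.3):
    """Return True when an idle fraction (0.0 to 1.0) is below the threshold.

    A low idle fraction suggests the network processor or request handler
    threads are close to saturation.
    """
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0.0 and 1.0")
    return idle_fraction < threshold
```

For instance, `request_metric_object_name("TotalTimeMs", "FetchFollower")` produces the name of the MBean behind the Total Time Fetch Follower row.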
Parameter | Description |
---|---|
Topic Details | |
Topic Name | Specifies the name of the topic. |
Bytes In / Min | The aggregate incoming byte rate (amount of data written to topic on this broker) per minute. |
Bytes Out / Min | The aggregate outgoing byte rate per minute. |
Failed Fetch Requests / Min | The total number of failed Fetch Requests per minute. |
Failed Produce Requests / Min | The total number of failed produce requests per minute. |
Messages In / Min | The number of messages that come into the Kafka broker for this topic per minute. |
Parameter | Description |
---|---|
Storage Details | |
Boot Class Path | The boot class path that is used by the bootstrap class loader to search for class files. |
Class Path | The Java class path that is used by the system class loader to search for class files. |
Spec Vendor | The vendor of the Java virtual machine specification implemented by this JVM. |
Spec Version | The version of the Java virtual machine specification implemented by this JVM. |
VM Name | The Java virtual machine name. |
VM Vendor | The Java virtual machine implementation vendor. |