kafka-monitor

Monitor the availability of Kafka clusters with generated messages.

Xinfra Monitor

Xinfra Monitor (formerly Kafka Monitor) is a framework for implementing and executing long-running Kafka
system tests in a real cluster. It complements Kafka's existing system
tests by capturing potential bugs or regressions that are likely to occur only
after a prolonged period of time or with low probability. It also lets you monitor a Kafka
cluster using end-to-end pipelines to obtain derived vital stats
such as end-to-end latency, service availability, consumer offset commit availability,
and message loss rate. You can
deploy Xinfra Monitor to test and monitor your Kafka cluster without
any change to your application.

Xinfra Monitor can automatically create the monitor topic with the specified config
and increase the partition count of the monitor topic so that the number of partitions
is at least the number of brokers. It can also reassign partitions and trigger preferred leader election
to ensure that each broker acts as leader of at least one partition of the
monitor topic. This allows Xinfra Monitor to detect performance issues on every
broker without requiring users to manually manage the partition assignment of
the monitor topic.
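
The two conditions described above can be sketched as a small check. This is only an illustration of the rule, not Xinfra Monitor's implementation; the function name `plan_monitor_topic` and its input shapes are assumptions made for this example.

```python
def plan_monitor_topic(brokers, partition_leaders):
    """Return (partitions_to_add, brokers_without_leadership).

    brokers: iterable of broker ids in the cluster.
    partition_leaders: dict mapping partition id -> current leader broker id.
    Both shapes are hypothetical and chosen only for this sketch.
    """
    brokers = set(brokers)
    # Condition 1: the topic needs at least as many partitions as brokers.
    partitions_to_add = max(0, len(brokers) - len(partition_leaders))
    # Condition 2: every broker should lead at least one partition.
    leading = set(partition_leaders.values())
    idle = sorted(brokers - leading)
    return partitions_to_add, idle


# Three brokers, two partitions, both led by broker 1.
print(plan_monitor_topic([1, 2, 3], {0: 1, 1: 1}))
```

A non-zero first value means the topic needs more partitions. A non-empty list means some brokers lead no partition, which is the case reassignment and preferred leader election are meant to fix.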

Xinfra Monitor is used in conjunction with different middle-layer services, such as li-apache-kafka-clients, to monitor single clusters, pipeline destination clusters, and other types of clusters, as is done in LinkedIn engineering for real-time cluster health checks.

Getting Started

Prerequisites

Xinfra Monitor requires Gradle 2.0 or higher. Java 7 should be used for
building in order to support both Java 7 and Java 8 at runtime.

Xinfra Monitor supports Apache Kafka 0.8 to 2.0:

  • Use branch 0.8.2.2 to work with Apache Kafka 0.8
  • Use branch 0.9.0.1 to work with Apache Kafka 0.9
  • Use branch 0.10.2.1 to work with Apache Kafka 0.10
  • Use branch 0.11.x to work with Apache Kafka 0.11
  • Use branch 1.0.x to work with Apache Kafka 1.0
  • Use branch 1.1.x to work with Apache Kafka 1.1
  • Use master branch to work with Apache Kafka 2.0

Configuration Tips

Build Xinfra Monitor

$ git clone https://github.com/linkedin/kafka-monitor.git
$ cd kafka-monitor 
$ ./gradlew jar

Start KafkaMonitor to run tests/services specified in the config file

$ ./bin/kafka-monitor-start.sh config/kafka-monitor.properties

Run Xinfra Monitor with arbitrary producer/consumer configuration (e.g. SASL enabled client)

Edit config/kafka-monitor.properties to specify custom producer configurations
in the key/value map produce.producer.props. Specify configurations for the
consumer in the same way. Documentation for the producer and consumer settings that can go in these key/value maps can be found in the Apache Kafka wiki.

$ ./bin/kafka-monitor-start.sh config/kafka-monitor.properties
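
As a rough illustration of the edit described above, the sketch below merges extra client settings into the producer and consumer maps. It assumes the config file is a JSON document keyed by app name. The app name `single-cluster-monitor`, the consumer map name `consume.consumer.props`, the helper `with_client_props`, and the specific SASL values are all assumptions for this example. Check the config file in your checkout for the actual keys.

```python
import copy
import json


def with_client_props(config, app_name, producer_props=None, consumer_props=None):
    """Return a copy of config with extra Kafka client settings merged in."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    app = updated[app_name]
    # "produce.producer.props" is named in the text above; the consumer key
    # below is assumed by analogy.
    app.setdefault("produce.producer.props", {}).update(producer_props or {})
    app.setdefault("consume.consumer.props", {}).update(consumer_props or {})
    return updated


# Example values only: standard Kafka client settings for a SASL listener.
sasl = {"security.protocol": "SASL_PLAINTEXT", "sasl.mechanism": "PLAIN"}
base = {"single-cluster-monitor": {"produce.producer.props": {"client.id": "kmf-client-id"}}}
print(json.dumps(with_client_props(base, "single-cluster-monitor", sasl, sasl), indent=2))
```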

Run SingleClusterMonitor app to monitor a Kafka cluster

The metrics produce-availability-avg and consume-availability-avg indicate
whether messages can be properly produced to and consumed from this cluster.
See the Service Overview wiki for how these metrics are derived.

$ ./bin/single-cluster-monitor.sh --topic test --broker-list localhost:9092 --zookeeper localhost:2181
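
The exact derivation of the availability metrics is documented in the Service Overview wiki. As a simplified illustration of how to read such a metric, the sketch below treats availability as the fraction of attempts that succeeded. That definition, the function name `availability`, and the choice to return `None` when there were no attempts are assumptions for this example, not the project's implementation.

```python
def availability(successes, failures):
    """Fraction of attempts that succeeded, or None when nothing was attempted."""
    total = successes + failures
    if total == 0:
        return None  # no traffic in the window, so no ratio can be formed
    return successes / total


# Under this assumed definition, 3 successes and 1 failure give 0.75.
print(availability(3, 1))
```

Under this reading, a value close to 1.0 means nearly every monitored message was produced (or consumed) successfully.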

Run MultiClusterMonitor app to monitor a pipeline of Kafka clusters connected by MirrorMaker

Edit config/multi-cluster-monitor.properties to specify the right broker and
ZooKeeper URLs, as suggested by the comments in the properties file.

The metrics produce-availability-avg and consume-availability-avg indicate
whether messages can be properly produced to the source cluster and consumed
from the destination cluster. See config/multi-cluster-monitor.properties for
the full JMX path for these metrics.

$ ./bin/kafka-monitor-start.sh config/multi-cluster-monitor.properties

Get metric values (e.g. service availability, message loss rate) in real-time as time series graphs

Open localhost:8000/index.html in your web browser.

You can edit webapp/index.html to easily add new metrics to be displayed.

Query metric value (e.g. produce availability and consume availability) via HTTP request

curl localhost:8778/jolokia/read/kmf.services:type=produce-service,name=*/produce-availability-avg

curl localhost:8778/jolokia/read/kmf.services:type=consume-service,name=*/consume-availability-avg

You can query other JMX metric values as well by substituting the object-name and
attribute-name of the JMX metric in the queries above.
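
The same substitution can be scripted. The sketch below builds a read URL from an object name and attribute name and unpacks the JSON reply. It assumes the Jolokia endpoint at localhost:8778 used in the curl commands above and the usual Jolokia reply envelope with `status` and `value` fields. The helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def jolokia_read_url(object_name, attribute, host="localhost", port=8778):
    """Build a Jolokia read URL for one JMX attribute."""
    # Keep the characters JMX object names rely on (: = , *) unescaped.
    name = quote(object_name, safe=":=,*")
    return f"http://{host}:{port}/jolokia/read/{name}/{attribute}"


def extract_values(response):
    """Return the 'value' part of a Jolokia reply, raising on a non-200 status."""
    if response.get("status") != 200:
        raise RuntimeError(f"Jolokia error: {response.get('error', response)}")
    return response["value"]


def read_metric(object_name, attribute):
    """Fetch one attribute from a running monitor (performs an HTTP request)."""
    with urlopen(jolokia_read_url(object_name, attribute)) as reply:
        return extract_values(json.load(reply))


print(jolokia_read_url("kmf.services:type=produce-service,name=*",
                       "produce-availability-avg"))
```

With a monitor running, `read_metric("kmf.services:type=consume-service,name=*", "consume-availability-avg")` would issue the second curl query above.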

Run checkstyle on the Java code

./gradlew checkstyleMain checkstyleTest

Build IDE project

./gradlew idea
./gradlew eclipse

Wiki

Main metrics

Overview
Name With Owner: linkedin/kafka-monitor
Primary Language: Java
Program Language: JavaScript (Language Count: 7)
Platform
License: Apache License 2.0
Owner Activity
Created At: 2016-03-30 19:42:49
Pushed At: 2025-03-09 04:22:39
Last Commit At: 2023-03-21 22:18:28
Release Count: 54
Last Release Name: 2.5.16
First Release Name: 0.1.0 (Posted on 2017-01-04 15:27:04)
User Engagement
Stargazers Count: 2k
Watchers Count: 127
Fork Count: 445
Commits Count: 253
Has Issues Enabled
Issues Count: 130
Issue Open Count: 18
Pull Requests Count: 217
Pull Requests Open Count: 15
Pull Requests Close Count: 35
Project Settings
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private