Scaling Up Your Kafka Cluster: A Step-by-Step Guide


Apache Kafka is a powerful distributed streaming platform. But for high availability and increased throughput, running a single Kafka server might not be enough. This blog post will guide you through setting up a multi-node Kafka cluster using the KRaft protocol.

What You'll Need:

  • Multiple servers with Kafka installed (this walkthrough runs all three brokers on a single host, which is why each broker listens on its own ports and advertises localhost)
  • SSH access to each server

Step 1: Configure Server IDs

  1. Navigate to the config/kraft directory within your Kafka installation on each server.
  2. Grant write permissions for the current user:
Bash
sudo chmod -R u+w /opt/kafka/config/kraft

  3. Copy the existing server.properties file and rename it for each server:
Bash
sudo cp -f server.properties server1.properties
sudo cp -f server.properties server2.properties
sudo cp -f server.properties server3.properties

  4. Edit each server's configuration file and set the node.id property to a unique value:

  • server1.properties: node.id=1
  • server2.properties: node.id=2
  • server3.properties: node.id=3
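Steps 3 and 4 can be scripted. A hypothetical helper, assuming the /opt/kafka install path used above and that the three properties files have already been copied:

```shell
# Hypothetical helper: rewrite the node.id line in each copied file
# so every broker gets a unique ID. Assumes /opt/kafka from step 2.
cd /opt/kafka/config/kraft
for i in 1 2 3; do
  sudo sed -i "s/^node\.id=.*/node.id=${i}/" "server${i}.properties"
done
# Verify the result: each file should print its own node.id.
grep -H '^node.id' server1.properties server2.properties server3.properties
```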

Step 2: Define Listeners and Controller Quorum

  1. Update the listeners property in each configuration file, specifying the ports for communication:

    • server1.properties: listeners=PLAINTEXT://:9092,CONTROLLER://:9093
    • server2.properties: listeners=PLAINTEXT://:9094,CONTROLLER://:9095
    • server3.properties: listeners=PLAINTEXT://:9096,CONTROLLER://:9097
    • Explanation:
      • PLAINTEXT:// is the broker communication listener.
      • CONTROLLER:// is the internal controller communication listener.
      • Because all brokers run on the same host here, the port numbers must be unique across servers.

  2. Configure the controller.quorum.voters property in each file, listing the nodes eligible to vote in controller elections:

      controller.quorum.voters=1@localhost:9093,2@localhost:9095,3@localhost:9097

    • Keep this line identical across all three properties files.
    • The format is node.id@host:controller-port for each voter; replace the ports to match your CONTROLLER listeners.
    • This property defines which nodes can participate in the controller election.
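A quick sanity check (assuming the /opt/kafka path used earlier): every file should print the identical voters line.

```shell
# Each file should show the same controller.quorum.voters value.
cd /opt/kafka/config/kraft
grep -H '^controller.quorum.voters' server1.properties server2.properties server3.properties
```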

Step 3: Set Advertised Listeners and Log Directories

  1. Update the advertised.listeners property in each file to specify the externally advertised port:

    • server1.properties: advertised.listeners=PLAINTEXT://localhost:9092
    • server2.properties: advertised.listeners=PLAINTEXT://localhost:9094
    • server3.properties: advertised.listeners=PLAINTEXT://localhost:9096
  2. Configure individual log directories for each server:

    • Edit the log.dirs property in each file:
      • server1.properties: log.dirs=/tmp/server-1/kraft-combined-logs
      • server2.properties: log.dirs=/tmp/server-2/kraft-combined-logs
      • server3.properties: log.dirs=/tmp/server-3/kraft-combined-logs
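The format step in the next section will create these directories, but pre-creating them surfaces permission problems early. A small sketch matching the log.dirs values above:

```shell
# Pre-create the per-broker log directories named in log.dirs above.
for i in 1 2 3; do
  mkdir -p "/tmp/server-${i}/kraft-combined-logs"
done
# Confirm all three exist.
ls -d /tmp/server-1/kraft-combined-logs /tmp/server-2/kraft-combined-logs /tmp/server-3/kraft-combined-logs
```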

Step 4: Format Storage Directories

  1. Navigate to the main Kafka directory on one server:
Bash
cd /opt/kafka

  2. Generate a random cluster ID using the kafka-storage.sh script:
Bash
./bin/kafka-storage.sh random-uuid

     Note down the generated ID (e.g., zaAPYCd0R8K5Pxmz6RnFxA).

  3. Use the generated ID and each server's configuration file to format the storage directories:
Bash
./bin/kafka-storage.sh format -t zaAPYCd0R8K5Pxmz6RnFxA -c config/kraft/server1.properties
./bin/kafka-storage.sh format -t zaAPYCd0R8K5Pxmz6RnFxA -c config/kraft/server2.properties
./bin/kafka-storage.sh format -t zaAPYCd0R8K5Pxmz6RnFxA -c config/kraft/server3.properties
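The same three commands can be collapsed into a loop that captures the ID once, which avoids copy-paste mistakes. A sketch, assuming the /opt/kafka layout above:

```shell
# Capture the cluster ID once and format all three server configs with it.
cd /opt/kafka
CLUSTER_ID="$(./bin/kafka-storage.sh random-uuid)"
for i in 1 2 3; do
  ./bin/kafka-storage.sh format -t "$CLUSTER_ID" -c "config/kraft/server${i}.properties"
done
```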


Step 5: Start the Kafka Brokers

Open a new SSH session for each broker and start it with its own configuration file:

  • ./bin/kafka-server-start.sh config/kraft/server1.properties
  • ./bin/kafka-server-start.sh config/kraft/server2.properties
  • ./bin/kafka-server-start.sh config/kraft/server3.properties
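Once all three brokers are up, you can confirm the cluster actually formed. A hypothetical check using the standard CLI tools (the topic name cluster-test is just an example):

```shell
cd /opt/kafka
# Show the KRaft quorum status (leader, voters, replication lag).
./bin/kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status
# Create a topic replicated across all three brokers, then inspect it.
./bin/kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic cluster-test --partitions 3 --replication-factor 3
./bin/kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic cluster-test
```

If --replication-factor 3 succeeds, all three brokers registered with the controller quorum.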



Step 6: Stopping Kafka Brokers Gracefully


To stop the Kafka brokers gracefully, use the kafka-server-stop.sh script:

  1. Navigate to the main Kafka directory on any server.
  2. Execute the script:
    ./bin/kafka-server-stop.sh
This script will ensure that brokers:
  • Finish processing any in-flight messages.
  • Transfer leadership of partitions to other available brokers.
  • Cleanly shut down resources.
Important Note: Interrupting the running process with Ctrl+C forces an immediate shutdown, potentially leading to data loss. Always use the kafka-server-stop.sh script for a graceful shutdown. Setting controlled.shutdown.enabled=true in server.properties (it is true by default) enables this controlled shutdown behavior.

