Handling 1 Million Messages per Second with Kafka & Spring Boot

Mission: Real-Time Throughput at Scale

Bob, our favorite builder, now works at a fast-growing fintech company. His task? Build a real-time transaction processor that can handle 1 million messages per second.

This isn’t your typical “Hello World” microservice. This is the real deal — Kafka + Spring Boot + production-grade tuning.

What You’ll Learn

✔ Kafka architecture that supports million-scale throughput
✔ Spring Boot Kafka producer/consumer tuning
✔ Partitioning, batching, compression, and parallelism
✔ Real-world deployment and performance benchmarks

1 Understanding the Challenge

Bob’s system needs to:

  • Process 1 million events per second
  • Guarantee low latency (<10ms)
  • Be resilient to failures
  • Scale horizontally without falling over

He chooses Apache Kafka for its distributed log-based architecture and high-throughput streaming capability.

2 Kafka Setup: Scaling with Partitions & Brokers

Bob starts with Kafka cluster design.

Real-Life Analogy:
Think of each Kafka partition as a checkout counter at a supermarket. The more counters, the more customers (messages) you can serve in parallel.
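
To put a number on the analogy, a common rule of thumb sizes the partition count from the target throughput divided by what one partition can sustain, on the producer side and on the consumer side, and takes the larger result. The sketch below is only an illustration: the per-partition rates are assumed values, not measurements from Bob's cluster, and `PartitionSizing` is a hypothetical helper name.

```java
final class PartitionSizing {

    /** Smallest partition count that keeps every partition within its producer and consumer limits. */
    static int requiredPartitions(long targetPerSec, long producerPerPartition, long consumerPerPartition) {
        long forProducers = ceilDiv(targetPerSec, producerPerPartition);
        long forConsumers = ceilDiv(targetPerSec, consumerPerPartition);
        return (int) Math.max(forProducers, forConsumers);
    }

    private static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    public static void main(String[] args) {
        // Assumed rates: 50,000 msg/s written and 10,000 msg/s consumed per partition.
        int partitions = requiredPartitions(1_000_000, 50_000, 10_000);
        System.out.println("Partitions needed: " + partitions); // prints "Partitions needed: 100"
    }
}
```

With these assumed rates the consumer side is the bottleneck, so it sets the partition count. Measure your own per-partition rates before relying on the result.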

Kafka Tuning for 1M/s

3 Spring Boot Kafka Producer Setup

Bob configures his Kafka producers to optimize performance.

application.yml

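spring:
  kafka:
    producer:
      batch-size: 32768          # 32 KiB per-partition batch
      buffer-memory: 67108864    # 64 MiB of total send buffer
      compression-type: lz4
      acks: 1
      retries: 1
      properties:
        linger.ms: 10            # Spring Boot has no dedicated linger key, so pass it through to the Kafka client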
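
A rough way to see how the batch size and linger settings interact: a batch is sent when it fills up or when the linger time expires, whichever comes first. The sketch below estimates how many messages end up in one batch. It uses the 32 KiB batch size and 10 ms linger from the configuration above. The 200-byte message size and the 10,000 msg/s per-partition rate are assumptions, and the estimate ignores record overhead and compression. `BatchEstimate` is a hypothetical name.

```java
final class BatchEstimate {

    /** Messages per batch: limited by what fits in the batch or by what arrives within the linger time. */
    static long messagesPerBatch(long batchSizeBytes, long lingerMs, long msgBytes, long msgsPerSecPerPartition) {
        long fitInBatch = batchSizeBytes / msgBytes;
        long arriveDuringLinger = msgsPerSecPerPartition * lingerMs / 1000;
        return Math.max(1, Math.min(fitInBatch, arriveDuringLinger));
    }

    public static void main(String[] args) {
        // Assumed: 200-byte messages, 10,000 msg/s sent to one partition by one producer.
        long perBatch = messagesPerBatch(32_768, 10, 200, 10_000);
        System.out.println("Messages per batch: " + perBatch); // prints "Messages per batch: 100"
    }
}
```

Under these assumptions the linger timer fires before the batch is full, so raising the batch size alone would not produce larger batches.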

TransactionEventProducer.java

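import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class TransactionEventProducer {
    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;
    public void send(String topic, String message) {
        kafkaTemplate.send(topic, message); // asynchronous; records are batched before they are sent
    }
}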

Tip: Use ProducerRecord (or the KafkaTemplate.send overload that takes a partition number) if you want to choose the target partition yourself.

4 Spring Boot Kafka Consumer Setup

Bob now sets up high-speed consumers.

application.yml

spring:
  kafka:
    consumer:
      group-id: high-speed-consumers
      max-poll-records: 1000
      fetch-min-size: 50000
      fetch-max-wait: 500
      enable-auto-commit: false

TransactionEventListener.java

// Ten listener threads in this instance share the topic's partitions.
@KafkaListener(topics = "transactions", concurrency = "10")
public void consume(String message) {
    // High-speed processing logic
}

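✔ Size concurrency from both the partition count and the available cores: the total number of listener threads across all consumer instances should not exceed the number of partitions, because any extra threads stay idle.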
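
One way to turn that guideline into a number is sketched below, using the deployment figures from the benchmark section (100 partitions, 6 consumer pods, 8 vCPU per pod). Capping the thread count at the core count is an assumption that suits CPU-bound handlers. Handlers that mostly wait on I/O can reasonably run more threads than cores. `ListenerConcurrency` is a hypothetical helper name.

```java
final class ListenerConcurrency {

    /** Threads per instance: this instance's share of the partitions, capped at its core count. */
    static int perInstance(int partitions, int instances, int coresPerInstance) {
        int share = (partitions + instances - 1) / instances; // ceil(partitions / instances)
        return Math.max(1, Math.min(share, coresPerInstance));
    }

    public static void main(String[] args) {
        // 100 partitions spread over 6 consumer pods with 8 vCPU each.
        System.out.println(perInstance(100, 6, 8)); // prints 8
    }
}
```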

5 Batching, Compression, and Parallelism

To push the system toward 1M messages/sec, Bob applies:

  • Batch sending (linger.ms + batch.size)
  • Compression (lz4) reduces network IO
  • Consumer concurrency to leverage all CPU cores
  • Message keying for even partition distribution
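
To illustrate the keying point, here is a simplified sketch of how a message key maps to a partition. Kafka's default partitioner hashes the serialized key with murmur2. This stand-in uses `String.hashCode` instead, so the exact partition numbers will differ from a real cluster. The `account-N` keys and the `KeyPartitioning` class are hypothetical.

```java
import java.util.Arrays;

final class KeyPartitioning {

    /** Simplified stand-in for Kafka's key hashing; the same key always maps to the same partition. */
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Counts how many of the given keys land on each partition. */
    static int[] distribution(String[] keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (String key : keys) {
            counts[partitionFor(key, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] accountIds = new String[10_000];
        for (int i = 0; i < accountIds.length; i++) {
            accountIds[i] = "account-" + i; // hypothetical transaction keys
        }
        // One count per partition; many distinct keys should spread fairly evenly.
        System.out.println(Arrays.toString(distribution(accountIds, 10)));
    }
}
```

The spread is only even when there are many distinct keys. A handful of very busy keys will still overload the partitions they hash to.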

Use a compact binary format such as Avro or Protobuf instead of JSON; this can cut message size by roughly 40–70%, depending on the payload.

6 Benchmarks: Real-World Results

Bob deploys his system in Kubernetes using:

  • Kafka (3 brokers, 100 partitions)
  • 3 Producer pods, 6 Consumer pods
  • 8 vCPU, 16 GB RAM per pod

Results:

  • ~1.05 million messages/sec sustained
  • < 10 ms average latency
  • 99.99% delivery success with retries + acks=1
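
For a sense of what the sustained rate means for each component, the sketch below divides it across the partitions and pods listed above. It assumes the load is spread perfectly evenly, which real traffic rarely is. `ThroughputBreakdown` is a hypothetical name.

```java
final class ThroughputBreakdown {

    /** Even share of a total message rate across a number of units (partitions or pods). */
    static long perUnit(long totalPerSec, int units) {
        return totalPerSec / units;
    }

    public static void main(String[] args) {
        long total = 1_050_000; // sustained rate reported above
        System.out.println("Per partition:    " + perUnit(total, 100)); // 10500
        System.out.println("Per producer pod: " + perUnit(total, 3));   // 350000
        System.out.println("Per consumer pod: " + perUnit(total, 6));   // 175000
    }
}
```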

7 Real-World Tools & Monitoring

Bob integrates:

  • Prometheus + Grafana: For Kafka and Spring metrics
  • Kafka Manager / Kowl: To inspect topic health
  • Loki + Fluentd: For log aggregation

✔ Track:

  • Lag per partition
  • Consumer group offsets
  • Producer throughput
  • JVM memory and GC
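
Lag per partition is the gap between the newest offset in the partition and the offset the consumer group has committed. The sketch below computes it from two plain maps. The offsets are made-up values, `ConsumerLag` is a hypothetical name, and treating a partition with no commit as starting from offset 0 is a simplification. In practice, monitoring tools read these offsets from Kafka for you.

```java
import java.util.Map;
import java.util.TreeMap;

final class ConsumerLag {

    /** Lag per partition = log end offset minus the group's committed offset. */
    static Map<Integer, Long> lagPerPartition(Map<Integer, Long> endOffsets, Map<Integer, Long> committed) {
        Map<Integer, Long> lag = new TreeMap<>();
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            long done = committed.getOrDefault(e.getKey(), 0L); // simplification: no commit counts as 0
            lag.put(e.getKey(), Math.max(0L, e.getValue() - done));
        }
        return lag;
    }

    public static void main(String[] args) {
        // Hypothetical offsets for a three-partition topic.
        Map<Integer, Long> end = Map.of(0, 1_000L, 1, 2_500L, 2, 400L);
        Map<Integer, Long> committed = Map.of(0, 990L, 1, 2_500L);
        System.out.println(lagPerPartition(end, committed)); // prints {0=10, 1=0, 2=400}
    }
}
```

A lag that keeps growing on one partition while the others stay flat usually points to a hot key or a stuck consumer thread.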

Find us

linkedin Shant Khayalian