
Data Generation V3: Cranking the Dial with Kafka Tuning

After Version 2 got me to a respectable 200K posts/sec, I wasn't ready to call it a day. BrandPulse demanded more, closer to that 700K posts/sec target, so I dug deeper. Version 3 wasn't about small tweaks; it was about tearing into Kafka configs, rethinking bottlenecks, and pushing my setup to the edge. This is where I went from "it works" to "it flies," hitting around 600K posts/sec with a mix of hustle and hard-earned lessons.

The Challenge

Version 2 was stable but plateaued: 200K posts/sec felt good, but the math didn't add up to my theoretical 1.6M/sec ceiling with 8 workers. Something was holding me back: serialization was a slug, the network was capping out, and Kafka's defaults weren't cutting it. I needed to strip it down, tune it up, and make every millisecond count. The goal? Break past 200K and aim for 600K+, setting the stage for the final producer.

The Plan

I zeroed in on five big moves:

  1. Aggressive Kafka Configs: Maxed out batch size, slashed linger time, and ditched compression for raw speed.
  2. Tighter Throttling: Dropped the delay from 100ms to 5ms—faster pacing, more pressure.
  3. Concurrency Boost: Cranked maxInFlightRequests to 1000 for parallel firepower.
  4. Memory Muscle: Bumped buffer memory to 8GB, leaning hard into my machine’s limits.
  5. Fire-and-Forget: Set acks: 0 to skip broker confirmation—risky, but fast.
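Move 2 on its own raises the per-worker ceiling twentyfold. Ignoring the time send() itself takes, the sleep between batches bounds how many batches a worker can ship per second; a quick sketch with the numbers from this post:

```javascript
// Upper bound on batches/sec set by the inter-batch sleep alone.
const batchesPerSec = (sleepMs) => 1000 / sleepMs;

console.log(batchesPerSec(100)); // v2 pacing: 10 batches/sec per worker
console.log(batchesPerSec(5));   // v3 pacing: 200 batches/sec per worker
```

Real cycles also include serialization and the send itself, so these are ceilings, not throughput.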

The Code: V3 Unleashed

Here’s what I rolled out—lean, mean, and built to push boundaries:

```javascript
const { Kafka, Partitioners, CompressionTypes } = require('kafkajs');
const { Worker, isMainThread, parentPort, threadId } = require('worker_threads');
const dataSchema = require('../../schema/avroSchema');
const os = require('os');

// kafkajs splits its configuration: client options go to the Kafka
// constructor, producer options to .producer(), and acks/compression
// are set per send() call (see runProducer below).
const kafkaConfig = {
  clientId: `dataStorm-producer-${process.pid}-${threadId}`,
  brokers: ['localhost:9092'],
  retry: { retries: 0, initialRetryTime: 50, maxRetryTime: 1000 }, // no retries: drop and move on
  requestTimeout: 60000,
  connectionTimeout: 30000
};

// Note: batch.size, linger.ms, and buffer.memory are Java-client knobs with
// no direct kafkajs equivalent; kafkajs ships one batch per send() call, so
// BATCH_SIZE and BATCH_INTERVAL_MS below play those roles instead.
const producerConfig = {
  createPartitioner: Partitioners.LegacyPartitioner,
  transactionTimeout: 30000,
  maxInFlightRequests: 1000, // high concurrency
  idempotent: false,         // skip idempotence bookkeeping for speed
  allowAutoTopicCreation: false
};

const BATCH_SIZE = 10000;    // records per send()
const BATCH_INTERVAL_MS = 5; // minimal pause between batches

const recordCache = new Array(BATCH_SIZE).fill(null);

const generateBatch = () => {
  const now = new Date().toISOString();
  return recordCache.map(() => ({
    value: dataSchema.toBuffer({
      id: Math.floor(Math.random() * 100000),
      timestamp: now,
      value: Math.random() * 100
    })
  }));
};

if (!isMainThread) {
  const producer = new Kafka(kafkaConfig).producer(producerConfig);

  const runProducer = async () => {
    await producer.connect();
    parentPort.postMessage({ type: 'status', status: 'connected' });

    while (true) {
      try {
        const batch = generateBatch();
        const startTime = Date.now();
        await producer.send({
          topic: 'dataStorm-topic',
          messages: batch,
          acks: 0,                            // fire-and-forget
          compression: CompressionTypes.None  // no compression for raw speed
        });
        const duration = Date.now() - startTime;
        parentPort.postMessage(`${BATCH_SIZE} in ${duration}ms`);
        await new Promise(resolve => setTimeout(resolve, BATCH_INTERVAL_MS));
      } catch (err) {
        console.error(`Worker error: ${err.message}`);
      }
    }
  };

  runProducer().catch(err => {
    console.error(`Fatal worker error: ${err.message}`);
    process.exit(1);
  });
}

if (isMainThread) {
  const WORKER_COUNT = Math.max(4, os.cpus().length * 3);
  const workers = new Set();

  console.log(`Main process started. Spawning ${WORKER_COUNT} workers`);

  const spawnWorker = (id) => {
    const worker = new Worker(__filename);
    worker
      .on('message', (msg) => console.log(`[W${id}] ${msg}`))
      .on('error', (err) => console.error(`[Worker ${id}] Error: ${err.message}`))
      .on('exit', (code) => {
        console.log(`[Worker ${id}] Exited with code ${code}`);
        workers.delete(worker);
        if (code !== 0) spawnWorker(id); // respawn crashed workers
      });
    workers.add(worker);
  };

  for (let i = 0; i < WORKER_COUNT; i++) {
    spawnWorker(i + 1);
  }

  process.on('SIGINT', async () => {
    console.log('\nGracefully shutting down...');
    for (const worker of workers) {
      await worker.terminate();
    }
    process.exit();
  });
}
```

Results

  • Throughput: Landed around 600K posts/sec with 12 workers (on a decent machine). That works out to roughly 50K/sec per worker, thanks to the tuned configs.
  • Performance Boost: Dropping compression and linger time, plus jacking up maxInFlightRequests, tripled v2's 200K.
  • Stability Trade-off: acks: 0 made it blazing fast but risked data loss—fine for a prototype, not production.

Tuning Breakdown

Here’s how the Kafka configs shaped up:

| Parameter | Value | Impact | Trade-off |
| --- | --- | --- | --- |
| compression | None | +20% speed | Bigger payloads |
| batchSize | 32MB | Fewer network trips | Memory pressure |
| lingerMs | 2ms | Faster sends | Less batch efficiency |
| maxInFlightRequests | 1000 | +50% throughput | Potential disorder |
| bufferMemory | 8GB | No blocking | High RAM usage |
| acks | 0 | +30% speed | No delivery guarantee |
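To put the 32MB batch ceiling in perspective: with this post's schema (an id, an ISO timestamp string, a double), each Avro record is on the order of 40 bytes (an estimate, not a measurement), so a 10K-record batch sits far below the cap:

```javascript
// Rough batch sizing against the 32MB ceiling.
const RECORD_BYTES = 40;                 // estimated Avro record size (assumption)
const BATCH_RECORDS = 10000;             // records per send()
const BATCH_CEILING = 32 * 1024 * 1024;  // the 32MB limit from the table

const batchBytes = RECORD_BYTES * BATCH_RECORDS; // ~400 KB per batch
console.log(batchBytes < BATCH_CEILING); // true: the ceiling is never the bottleneck
```

In other words, the big batch size buys headroom, not speed; the per-send record count is what actually shapes each network trip here.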

The Reality Check

Theory said 12 workers pushing 10K posts every 5ms could hit 24M/sec (12 × 10K × 200 batches/sec). Reality? 600K/sec. Why the gap?

  • Serialization: Avro’s toBuffer() was still a hog; serializing 10K records before every send ate real milliseconds out of each batch cycle.
  • Network Ceiling: Localhost TCP capped out around 700K posts/sec; I was pushing its limits.
  • Worker Overlap: More workers (12 vs. 8) meant I/O thrashing—too many cooks in the kitchen.
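The gap in the numbers above falls out of the cycle time. The 5ms sleep is only the floor; once serialization and the send stretch each cycle, the per-worker batch rate collapses. A sketch (the ~200ms real cycle is inferred from the observed ~600K/sec, not measured):

```javascript
const WORKERS = 12;
const BATCH_SIZE = 10000; // records per send()

// On paper: the 5ms sleep is the whole cycle -> 200 batches/sec per worker.
const ideal = WORKERS * BATCH_SIZE * (1000 / 5);
console.log(ideal);       // 24000000 posts/sec, the theoretical ceiling

// In practice: serialization + network stretch each cycle to ~200ms.
const real = WORKERS * BATCH_SIZE * (1000 / 200);
console.log(real);        // 600000 posts/sec, matching what I observed
```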

Challenges Faced

  • Memory Spikes: 8GB buffer sounded cool until RAM usage hit 80%—had to keep an eye on that.
  • Error Spikes: acks: 0 and no retries meant more dropped messages when Kafka lagged.
  • CPU Thrashing: 12 workers on an 8-core machine caused contention—over-optimism bit me.

Lessons Learned

  • Tuning’s a Knife Edge: Push too hard (e.g., acks: 0), and you trade reliability for speed. Balance is key.
  • Know Your Limits: Serialization and network caps showed me hardware matters as much as code.
  • Test the Math: Theoretical 2.4M/sec sounded great—reality taught me to trust benchmarks over dreams.

Why It Matters

This wasn’t just about numbers—it was about proving I could optimize under pressure. In India, we’re used to making the most of what we’ve got, and v3 was that mindset in action: squeezing every drop of performance out of a single machine. It’s the kind of skill that stands out to engineers and recruiters alike—practical, relentless, and results-driven.

Next Up

600K posts/sec was a milestone, but 700K was the prize. Version 4 tackled serialization and network bottlenecks head-on; check it out for the next leap.
