Data Generation V3: Cranking the Dial with Kafka Tuning
After Version 2 got me to a respectable 200K posts/sec, I wasn’t ready to call it a day. BrandPulse demanded more, closer to that 700K posts/sec target, so I dug deeper. Version 3 wasn’t about small tweaks; it was about tearing into Kafka configs, rethinking bottlenecks, and pushing my setup to the edge. This is where I went from “it works” to “it flies,” hitting around 600K posts/sec with a mix of hustle and hard-earned lessons.
The Challenge
Version 2 was stable but plateaued; 200K posts/sec felt good, but I knew the math didn’t add up to my theoretical 1.6M/sec with 8 workers. Something was holding me back: serialization was a slug, the network was capping out, and Kafka’s defaults weren’t cutting it. I needed to strip it down, tune it up, and make every millisecond count. The goal? Break past 200K and aim for 600K+, setting the stage for the final producer.
The Plan
I zeroed in on five big moves:
- Aggressive Kafka Configs: Maxed out batch size, slashed linger time, and ditched compression for raw speed.
- Tighter Throttling: Dropped the delay from 100ms to 5ms—faster pacing, more pressure.
- Concurrency Boost: Cranked `maxInFlightRequests` to 1000 for parallel firepower.
- Memory Muscle: Bumped buffer memory to 8GB, leaning hard into my machine’s limits.
- Fire-and-Forget: Set `acks: 0` to skip broker confirmation. Risky, but fast.
The Code: V3 Unleashed
Here’s what I rolled out—lean, mean, and built to push boundaries:
const { Kafka, Partitioners, CompressionTypes } = require('kafkajs');
const { Worker, isMainThread, parentPort, threadId } = require('worker_threads');
const dataSchema = require('../../schema/avroSchema');
const os = require('os');

const kafkaConfig = {
  clientId: `dataStorm-producer-${process.pid}-${threadId}`,
  brokers: ['localhost:9092'],
  retry: { retries: 0, initialRetryTime: 50, maxRetryTime: 1000 },
  // Producer-level tuning (handed to .producer() below; acks and compression are applied per send)
  producer: {
    createPartitioner: Partitioners.LegacyPartitioner,
    transactionTimeout: 30000,
    compression: null, // No compression for raw speed
    maxInFlightRequests: 1000, // High concurrency
    idempotent: false, // Skip checks for speed
    allowAutoTopicCreation: false,
    batchSize: 32 * 1024 * 1024, // 32MB batches
    lingerMs: 2, // Minimal linger
    acks: 0, // Fire-and-forget
    bufferMemory: Math.min(8 * 1024 * 1024 * 1024, Math.floor(os.totalmem() * 0.8)), // 8GB or 80% of RAM, whichever is lower
    socketTimeout: 60000,
    connectionTimeout: 30000
  }
};

const BATCH_SIZE = 10000;    // posts per batch
const BATCH_INTERVAL_MS = 5; // pause between batches
const recordCache = new Array(BATCH_SIZE).fill(null); // reusable template array, one slot per record

// Build one batch of Avro-encoded records; the timestamp is computed once per batch
const generateBatch = () => {
  const now = new Date().toISOString();
  return recordCache.map(() => ({
    value: dataSchema.toBuffer({
      id: Math.floor(Math.random() * 100000),
      timestamp: now,
      value: Math.random() * 100
    })
  }));
};
if (!isMainThread) {
  // Worker thread: each worker owns its own producer.
  // Producer-level options must be passed to .producer() explicitly, or kafkajs ignores them.
  const producer = new Kafka(kafkaConfig).producer(kafkaConfig.producer);

  const runProducer = async () => {
    await producer.connect();
    parentPort.postMessage({ type: 'status', status: 'connected' });

    while (true) {
      try {
        const batch = generateBatch();
        const startTime = Date.now();
        await producer.send({
          topic: 'dataStorm-topic',
          messages: batch,
          acks: 0, // acks and compression are per-send options in kafkajs
          compression: CompressionTypes.None,
        });
        const duration = Date.now() - startTime;
        parentPort.postMessage(`${BATCH_SIZE} in ${duration}ms`);
        await new Promise(resolve => setTimeout(resolve, BATCH_INTERVAL_MS));
      } catch (err) {
        console.error(`Worker error: ${err.message}`);
      }
    }
  };

  runProducer().catch(err => {
    console.error(`Fatal worker error: ${err.message}`);
    process.exit(1);
  });
}
if (isMainThread) {
  const WORKER_COUNT = Math.max(4, os.cpus().length * 3);
  const workers = new Set();

  console.log(`Main process started. Spawning ${WORKER_COUNT} workers`);

  const spawnWorker = (id) => {
    const worker = new Worker(__filename);
    worker
      .on('message', (msg) => console.log(`[W${id}] ${msg}`))
      .on('error', (err) => console.error(`[Worker ${id}] Error: ${err.message}`))
      .on('exit', (code) => {
        console.log(`[Worker ${id}] Exited with code ${code}`);
        workers.delete(worker);
        if (code !== 0) spawnWorker(id);
      });
    workers.add(worker);
  };

  for (let i = 0; i < WORKER_COUNT; i++) {
    spawnWorker(i + 1);
  }

  process.on('SIGINT', async () => {
    console.log('\nGracefully shutting down...');
    for (const worker of workers) {
      await worker.terminate();
    }
    process.exit();
  });
}
Results
- Throughput: Landed around 600K posts/sec with 12 workers (assuming a decent machine), which works out to roughly ~50K/sec per worker thanks to the tuned configs (a simple aggregate counter sketch for measuring this follows the list).
- Performance Boost: Dropping compression and linger time, plus jacking up `maxInFlightRequests`, roughly tripled v2’s 200K.
- Stability Trade-off: `acks: 0` made it blazing fast but risked data loss; fine for a prototype, not production.
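The per-batch log lines only show each worker’s own pace. To get one aggregate posts/sec figure, a minimal counter in the main thread does the job. This is a hypothetical sketch, not part of the v3 code: it assumes each worker posts a `{ type: 'batchDone', count: BATCH_SIZE }` message instead of the raw log string.

// Hypothetical aggregate-throughput counter for the main thread (not part of the v3 producer).
// Assumes each worker posts { type: 'batchDone', count: BATCH_SIZE } after every send.
let sentSinceLastTick = 0;

const attachThroughputCounter = (worker) => {
  worker.on('message', (msg) => {
    if (msg && msg.type === 'batchDone') sentSinceLastTick += msg.count;
  });
};

// Print an aggregate posts/sec figure once per second
setInterval(() => {
  console.log(`~${sentSinceLastTick.toLocaleString()} posts/sec across all workers`);
  sentSinceLastTick = 0;
}, 1000);

Wiring it in is one call to attachThroughputCounter(worker) inside spawnWorker.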
Tuning Breakdown
Here’s how the Kafka configs shaped up:
Parameter | Value | Impact | Trade-off
---|---|---|---
compression | None | +20% speed | Bigger payloads
batchSize | 32MB | Fewer network trips | Memory pressure
lingerMs | 2ms | Faster sends | Less batch efficiency
maxInFlightRequests | 1000 | +50% throughput | Potential message reordering
bufferMemory | 8GB | No blocking | High RAM usage
acks | 0 | +30% speed | No delivery guarantee
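One footnote on that table: several of those names come from the Java client, and my understanding of the kafkajs docs is that the knobs live in different places there. Here is a sketch of where each one actually goes in kafkajs; batch size, linger, and buffer memory have no direct kafkajs equivalent, so batching is governed by how many messages you hand to a single send().

// Sketch: where the tuning knobs land when using kafkajs (assumption based on my reading
// of the kafkajs docs; batch.size / linger.ms / buffer.memory are Java-client settings
// with no direct kafkajs counterpart).
const { Kafka, CompressionTypes } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'dataStorm-tuning-demo',
  brokers: ['localhost:9092'],
  connectionTimeout: 30000, // client-level timeouts go on the Kafka constructor
});

const producer = kafka.producer({
  maxInFlightRequests: 1000, // concurrency: a producer() option
  idempotent: false,
  allowAutoTopicCreation: false,
});

const sendOneBatch = async (messages) => {
  await producer.connect();
  await producer.send({
    topic: 'dataStorm-topic',
    messages, // effective "batch size" is simply how many messages you pass here
    acks: 0, // fire-and-forget: a per-send option
    compression: CompressionTypes.None, // compression: also per-send
  });
  await producer.disconnect();
};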
The Reality Check
Theory said 12 workers at 10K posts every 5ms could hit 24M/sec (12 × 10K × 200 batches/sec). Reality? 600K/sec. Why the gap?
- Serialization: Avro’s `toBuffer()` was still a hog; roughly ~3ms of serialization work per batch ate into every send window (the micro-benchmark sketch after this list shows how to measure it).
- Network Ceiling: Localhost TCP capped out around 700K/sec; I was pushing its limits.
- Worker Overlap: More workers (12 vs. 8) meant I/O thrashing; too many cooks in the kitchen.
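Putting a number on the serialization cost takes a throwaway micro-benchmark of the same toBuffer() call the producer uses. This is a quick sketch, assuming the same ../../schema/avroSchema module as above:

// Throwaway micro-benchmark for the Avro serialization cost (same schema module as the producer)
const dataSchema = require('../../schema/avroSchema');

const timeSerialization = (count = 10000) => {
  const start = process.hrtime.bigint();
  for (let i = 0; i < count; i++) {
    dataSchema.toBuffer({
      id: Math.floor(Math.random() * 100000),
      timestamp: new Date().toISOString(),
      value: Math.random() * 100
    });
  }
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`Serialized ${count} records in ${ms.toFixed(1)}ms`);
};

timeSerialization(); // one full batch's worth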
Challenges Faced
- Memory Spikes: The 8GB buffer sounded cool until RAM usage hit 80%; I had to keep an eye on it (a small watchdog sketch follows this list).
- Error Spikes: `acks: 0` and no retries meant more dropped messages when Kafka lagged.
- CPU Thrashing: 12 workers on an 8-core machine caused contention; over-optimism bit me.
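The “keep an eye on it” part was literal. A tiny watchdog like this, a hypothetical helper rather than part of the v3 producer, makes memory bloat visible without reaching for external tooling:

// Hypothetical memory watchdog: logs resident memory every 5 seconds so buffer bloat shows up early
const os = require('os');

setInterval(() => {
  const { rss, heapUsed } = process.memoryUsage();
  const pct = ((rss / os.totalmem()) * 100).toFixed(1);
  console.log(`RSS ${(rss / 1048576).toFixed(0)}MB (heap ${(heapUsed / 1048576).toFixed(0)}MB, ${pct}% of total RAM)`);
}, 5000);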
Lessons Learned
- Tuning’s a Knife Edge: Push too hard (e.g., `acks: 0`), and you trade reliability for speed. Balance is key.
- Know Your Limits: Serialization and network caps showed me hardware matters as much as code.
- Test the Math: A theoretical 24M/sec sounded great; reality taught me to trust benchmarks over dreams (the quick check below spells out the gap).
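Here is the back-of-the-envelope version of that “test the math” lesson, using the v3 numbers from this post:

// Theoretical ceiling vs. measured reality for v3 (numbers from this post)
const workers = 12;
const batchSize = 10000;        // posts per batch
const batchesPerSec = 1000 / 5; // one batch every 5ms, ignoring send time
const theoretical = workers * batchSize * batchesPerSec;

console.log(`Theoretical: ${theoretical.toLocaleString()} posts/sec`); // 24,000,000
console.log('Measured:    ~600,000 posts/sec');                        // ~2.5% of the ceiling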
Why It Matters
This wasn’t just about numbers—it was about proving I could optimize under pressure. In India, we’re used to making the most of what we’ve got, and v3 was that mindset in action: squeezing every drop of performance out of a single machine. It’s the kind of skill that stands out to engineers and recruiters alike—practical, relentless, and results-driven.
Next Up
600K posts/sec was a milestone, but 700K was the prize. Version 4 (v4) tackled serialization and network bottlenecks head-on; check it out for the next leap.