
Real-time Sentiment Stream

A high-throughput sentiment analysis pipeline that processes streaming text data in real time. It uses a fine-tuned DistilBERT model for sub-100ms classification, with a Kafka-based ingestion pipeline handling 10k+ messages/second. A live dashboard shows sentiment distribution, trend lines, and anomaly alerts.

DistilBERT (fine-tuned) · RoBERTa
94% accuracy · <100ms latency · 10k+ msgs/sec throughput

Category

LLM

Status

Live

Tech Stack

DistilBERT · Kafka · FastAPI · Next.js · ClickHouse

Models

DistilBERT (fine-tuned) · RoBERTa
Overview

This experiment builds a high-throughput sentiment analysis pipeline that processes streaming text at 10,000+ messages per second with sub-100ms classification latency. Using a fine-tuned DistilBERT model served via ONNX Runtime, the system classifies sentiment (positive, negative, neutral) with 94% accuracy and visualizes trends on a live-updating dashboard with anomaly detection.

Methodology

I fine-tuned DistilBERT on a combined dataset of 120,000 labeled texts: 50,000 tweets (Sentiment140), 40,000 product reviews (Amazon), and 30,000 support tickets (internal dataset). The model was exported to ONNX for inference optimization. I benchmarked against RoBERTa-base, VADER, and GPT-3.5 zero-shot on a held-out test set of 5,000 examples. Throughput was tested using a Kafka-based pipeline with simulated message streams at 1k, 5k, and 10k messages/second.
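The benchmark comparison above reduces to scoring each model's predictions against the held-out labels. A minimal sketch of that evaluation harness, using a toy keyword classifier as a stand-in for the real models (names and data are illustrative, not the experiment's):

```python
from typing import Callable, Sequence

def accuracy(predict: Callable[[str], str],
             texts: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of held-out examples the classifier labels correctly."""
    correct = sum(predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)

# Toy stand-in classifier: a keyword heuristic in place of DistilBERT/RoBERTa.
def toy_model(text: str) -> str:
    return "positive" if "love" in text else "negative"

texts  = ["love this product", "terrible support", "love the update", "broken again"]
labels = ["positive", "negative", "positive", "negative"]
print(accuracy(toy_model, texts, labels))  # 1.0 on this toy set
```

The same `accuracy` call works for any `predict` function, so DistilBERT, RoBERTa, VADER, and a GPT-3.5 wrapper can share one harness.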

Tech Stack

Fine-tuned DistilBERT served via ONNX Runtime for high-throughput inference. Kafka handles message ingestion with consumer group scaling. ClickHouse stores time-series sentiment data for trend analysis. FastAPI serves the classification API. Next.js with WebSocket updates powers the live dashboard.

Key Findings

The most important insights from this experiment.

1. DistilBERT matches RoBERTa at 6x the throughput

Fine-tuned DistilBERT achieved 94% accuracy vs RoBERTa's 95.2%, but processed 10,200 msgs/sec vs RoBERTa's 1,700 msgs/sec. For streaming applications, the 1.2-point accuracy tradeoff is overwhelmingly worth the 6x throughput gain.

2. ONNX Runtime more than doubles inference speed

Converting the PyTorch model to ONNX and running it with ONNX Runtime reduced per-message latency from 18ms to 8ms. Batching 32 messages reduced amortized latency to 2ms per message.
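The batching arithmetic behind that finding can be sketched directly. The 64ms batch time below is an assumption derived from the reported 2ms amortized cost times a batch of 32; the aggregate 10k+ msgs/sec comes from running many such workers in parallel:

```python
def amortized_latency_ms(batch_latency_ms: float, batch_size: int) -> float:
    """Per-message cost when a whole batch shares one forward pass."""
    return batch_latency_ms / batch_size

def throughput_per_worker(batch_latency_ms: float, batch_size: int) -> float:
    """Messages/second a single worker sustains at this batch size."""
    return batch_size / (batch_latency_ms / 1000.0)

# Assumed: a 32-message batch finishing in 64 ms, consistent with the
# reported 2 ms amortized latency.
print(amortized_latency_ms(64.0, 32))   # 2.0 ms per message
print(throughput_per_worker(64.0, 32))  # 500.0 msgs/sec per worker
```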

3. Domain adaptation matters more than model size

DistilBERT fine-tuned on support tickets scored 91% on support ticket sentiment. GPT-3.5 zero-shot scored only 78% on the same data. Domain-specific fine-tuning on a small model beats a large general model.

4. Anomaly detection catches sentiment shifts in under 5 seconds

A sliding-window z-score detector on the 30-second sentiment moving average triggers alerts within 5 seconds of significant sentiment shifts, enabling real-time crisis response.
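A sliding-window z-score detector of this kind is compact to sketch. The window size, warm-up length, and threshold of 3.0 below are illustrative defaults, not the experiment's exact configuration:

```python
import statistics
from collections import deque

class SentimentAnomalyDetector:
    """Sliding-window z-score over a sentiment moving average.

    `window` holds recent 30-second moving-average values; an alert fires
    when the newest value deviates from the window mean by more than
    `threshold` standard deviations.
    """

    def __init__(self, window_size: int = 60, threshold: float = 3.0):
        self.window = deque(maxlen=window_size)
        self.threshold = threshold

    def update(self, moving_avg: float) -> bool:
        is_anomaly = False
        if len(self.window) >= 10:  # need enough history for a stable baseline
            mean = statistics.fmean(self.window)
            stdev = statistics.pstdev(self.window)
            if stdev > 0 and abs(moving_avg - mean) / stdev > self.threshold:
                is_anomaly = True
        self.window.append(moving_avg)
        return is_anomaly

detector = SentimentAnomalyDetector()
for i in range(30):
    detector.update(0.6 + 0.01 * (i % 2))  # steady positive sentiment: no alerts
print(detector.update(-0.4))               # sharp negative shift -> True
```

Because each update is O(window) over a short deque, the check adds negligible latency to the 100ms dashboard refresh loop.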

Architecture

Messages arrive via Kafka topics, consumed by a pool of classification workers running the ONNX model. Each classified message is written to ClickHouse with timestamp, source, sentiment label, and confidence score. The dashboard queries ClickHouse for real-time aggregations (sentiment distribution, trend lines, volume metrics) and receives push updates via WebSockets when anomaly thresholds are crossed.
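The ClickHouse side of that flow can be sketched as a schema plus the kind of rollup the dashboard polls. Table and column names here are assumptions; the experiment's actual schema is not published:

```python
# Hypothetical ClickHouse DDL for the sentiment store (names assumed).
DDL = """
CREATE TABLE IF NOT EXISTS sentiment_events (
    ts         DateTime64(3),
    source     LowCardinality(String),
    label      Enum8('negative' = -1, 'neutral' = 0, 'positive' = 1),
    confidence Float32
) ENGINE = MergeTree
ORDER BY (source, ts)
"""

# 30-second sentiment distribution per source -- the shape of query
# behind the trend lines.
TREND_QUERY = """
SELECT
    toStartOfInterval(ts, INTERVAL 30 SECOND) AS bucket,
    source,
    countIf(label = 'positive') AS positive,
    countIf(label = 'negative') AS negative,
    countIf(label = 'neutral')  AS neutral
FROM sentiment_events
GROUP BY bucket, source
ORDER BY bucket
"""
```

Ordering the MergeTree table by `(source, ts)` keeps each source's time range contiguous on disk, which is what makes the windowed aggregations cheap enough to poll every 100ms.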

Results

94% classification accuracy on held-out test set. 10,200 messages/second sustained throughput with 8 Kafka consumers. 8ms p50 latency, 23ms p99 latency per message. Anomaly detection latency: 4.7 seconds average. Dashboard renders 60fps with 100ms data refresh intervals.

Challenges

Key technical challenges encountered during this experiment.

Challenge 1: Sarcasm and context-dependent sentiment

Sarcastic messages like "Oh great, another outage" were classified as positive 40% of the time. Adding a sarcasm detection head, trained on an additional 10,000 sarcasm-labeled examples, reduced the misclassification rate to 15%.
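The sarcasm head's output has to be fused with the sentiment head's at decision time. A minimal sketch of one plausible fusion rule; the flip rule and threshold are illustrative, not the experiment's exact logic:

```python
def resolve_sentiment(sentiment: str, sarcasm_prob: float,
                      flip_threshold: float = 0.5) -> str:
    """Combine the sentiment head with the sarcasm head's probability.

    When the sarcasm head is confident, a surface-positive message
    ("Oh great, another outage") is flipped to negative.
    """
    if sarcasm_prob >= flip_threshold and sentiment == "positive":
        return "negative"
    return sentiment

print(resolve_sentiment("positive", sarcasm_prob=0.81))  # negative
print(resolve_sentiment("positive", sarcasm_prob=0.10))  # positive
```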

Challenge 2: Kafka consumer lag under burst traffic

Traffic spikes of 50k msgs/sec caused consumer lag to build up. Implemented auto-scaling consumer groups that spin up new workers when lag exceeds 1,000 messages, with graceful scale-down after the burst.
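The scaling decision itself is a small pure function over observed lag. Step sizes and bounds below are illustrative defaults, not the experiment's tuning:

```python
def target_workers(consumer_lag: int, current: int,
                   min_workers: int = 8, max_workers: int = 32,
                   lag_threshold: int = 1_000) -> int:
    """Decide the consumer-group size from Kafka consumer lag.

    Scale up one worker per full lag_threshold of backlog above the
    trigger; scale down one worker at a time once lag clears.
    """
    if consumer_lag > lag_threshold:
        step = consumer_lag // lag_threshold    # proportional response to backlog
        return min(max_workers, current + step)
    if consumer_lag < lag_threshold // 2:
        return max(min_workers, current - 1)    # graceful scale-down
    return current

print(target_workers(consumer_lag=5_000, current=8))  # 13
print(target_workers(consumer_lag=120, current=13))   # 12
```

Keeping a dead band between the scale-up trigger (1,000) and the scale-down trigger (500) prevents the worker pool from flapping as lag hovers near the threshold.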
