Real-time Sentiment Stream
A high-throughput sentiment analysis pipeline that processes streaming text data in real time. Uses a fine-tuned DistilBERT model for sub-100ms classification, with a Kafka-based ingestion pipeline handling 10k+ messages/second. A live dashboard shows sentiment distribution, trend lines, and anomaly alerts.
Category
LLM
Status
Live
Tech Stack
Models
This experiment builds a high-throughput sentiment analysis pipeline that processes streaming text at 10,000+ messages per second with sub-100ms classification latency. Using a fine-tuned DistilBERT model served via ONNX Runtime, the system classifies sentiment (positive, negative, neutral) with 94% accuracy and visualizes trends on a live-updating dashboard with anomaly detection.
I fine-tuned DistilBERT on a combined dataset of 120,000 labeled texts: 50,000 tweets (Sentiment140), 40,000 product reviews (Amazon), and 30,000 support tickets (internal dataset). The model was exported to ONNX for inference optimization. I benchmarked against RoBERTa-base, VADER, and GPT-3.5 zero-shot on a held-out test set of 5,000 examples. Throughput was tested using a Kafka-based pipeline with simulated message streams at 1k, 5k, and 10k messages/second.
Fine-tuned DistilBERT served via ONNX Runtime for high-throughput inference. Kafka handles message ingestion with consumer group scaling. ClickHouse stores time-series sentiment data for trend analysis. FastAPI serves the classification API. Next.js with WebSocket updates powers the live dashboard.
The most important insights from this experiment.
DistilBERT matches RoBERTa at 6x throughput
Fine-tuned DistilBERT achieved 94% accuracy vs RoBERTa's 95.2%, but processed 10,200 msgs/sec vs RoBERTa's 1,700 msgs/sec. For streaming applications, the 1.2-point accuracy tradeoff is overwhelmingly worth the 6x throughput gain.
ONNX Runtime doubles inference speed
Converting the PyTorch model to ONNX and running with ONNX Runtime reduced per-message latency from 18ms to 8ms. Batching 32 messages reduced amortized latency to 2ms per message.
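The batching behind that amortization can be sketched as a small micro-batcher that buffers messages until a batch fills (or a deadline passes) and then runs one inference call over the whole batch. The `classify_batch` callable here is a hypothetical stand-in for a call into ONNX Runtime's `session.run` on a padded batch; the class name and parameters are illustrative, not the project's actual code.

```python
import time


class MicroBatcher:
    """Buffer messages into fixed-size batches to amortize per-call
    inference overhead (the write-up reports ~8ms/msg unbatched vs
    ~2ms amortized at batch size 32)."""

    def __init__(self, classify_batch, batch_size=32, max_wait_s=0.01):
        self.classify_batch = classify_batch  # stand-in for an ONNX Runtime call
        self.batch_size = batch_size
        self.max_wait_s = max_wait_s
        self._buf = []
        self._first_ts = None

    def add(self, message):
        """Buffer one message; return a list of (message, label) pairs
        when a full or timed-out batch is flushed, else an empty list."""
        if not self._buf:
            self._first_ts = time.monotonic()
        self._buf.append(message)
        deadline_hit = (time.monotonic() - self._first_ts) >= self.max_wait_s
        if len(self._buf) >= self.batch_size or deadline_hit:
            return self.flush()
        return []

    def flush(self):
        """Classify and drain whatever is buffered."""
        if not self._buf:
            return []
        batch, self._buf = self._buf, []
        labels = self.classify_batch(batch)
        return list(zip(batch, labels))
```

In production the flush would also be driven by a timer so a lone message never waits longer than `max_wait_s`; this sketch checks the deadline only when a new message arrives.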
Domain adaptation matters more than model size
DistilBERT fine-tuned on support tickets scored 91% on support ticket sentiment. GPT-3.5 zero-shot scored only 78% on the same data. Domain-specific fine-tuning on a small model beats a large general model.
Anomaly detection catches sentiment shifts in <5 seconds
A sliding-window z-score detector on the 30-second sentiment moving average triggers alerts within 5 seconds of significant sentiment shifts, enabling real-time crisis response.
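A minimal sketch of that sliding-window z-score detector, fed one moving-average value per tick. The window size, threshold, and class name are illustrative assumptions, not the project's actual parameters.

```python
from collections import deque
from math import sqrt


class SentimentAnomalyDetector:
    """Sliding-window z-score detector over a stream of windowed
    sentiment averages (e.g. 30-second moving averages). A point is
    anomalous when it deviates from the trailing window's mean by
    more than `threshold` standard deviations."""

    def __init__(self, window=60, threshold=3.0, min_points=10):
        self.values = deque(maxlen=window)  # trailing window of averages
        self.threshold = threshold
        self.min_points = min_points  # don't alert until the window warms up

    def update(self, value):
        """Feed one sentiment average; return True if it is anomalous."""
        anomalous = False
        if len(self.values) >= self.min_points:
            mean = sum(self.values) / len(self.values)
            var = sum((v - mean) ** 2 for v in self.values) / len(self.values)
            std = sqrt(var)
            if std > 0 and abs(value - mean) / std > self.threshold:
                anomalous = True
        self.values.append(value)
        return anomalous
```

Because the detector only compares each new point against the trailing window, detection latency is bounded by the update interval plus the moving-average window, consistent with the sub-5-second alerts described above.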
Messages arrive via Kafka topics, consumed by a pool of classification workers running the ONNX model. Each classified message is written to ClickHouse with timestamp, source, sentiment label, and confidence score. The dashboard queries ClickHouse for real-time aggregations (sentiment distribution, trend lines, volume metrics) and receives push updates via WebSockets when anomaly thresholds are crossed.
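The per-worker flow above can be sketched as a small loop with the Kafka consumer, ONNX model, and ClickHouse client replaced by injected stand-ins. All names and the row schema here are illustrative, not the project's actual code.

```python
import time


def run_worker(consume, classify, store, limit=None):
    """One classification worker: pull raw messages, classify each,
    and write an enriched row for the time-series store.

    consume  -- iterable of raw message dicts (stands in for a Kafka consumer)
    classify -- fn(text) -> (label, confidence)  (stands in for the ONNX model)
    store    -- fn(row), called per classified row (stands in for a ClickHouse insert)
    limit    -- optional cap on processed messages, for testing
    """
    processed = 0
    for msg in consume:
        label, confidence = classify(msg["text"])
        store({
            "timestamp": msg.get("timestamp", time.time()),
            "source": msg.get("source", "unknown"),
            "sentiment": label,
            "confidence": confidence,
        })
        processed += 1
        if limit is not None and processed >= limit:
            break
    return processed
```

Scaling horizontally is then just running more copies of this loop in the same Kafka consumer group, letting partition assignment spread the topic across workers.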
94% classification accuracy on the held-out test set. 10,200 messages/second sustained throughput with 8 Kafka consumers. 8ms p50 latency, 23ms p99 latency per message. Anomaly detection latency: 4.7 seconds average. Dashboard renders at 60fps with 100ms data-refresh intervals.
Key technical challenges encountered during this experiment.
Sarcasm and context-dependent sentiment
Sarcastic messages like "Oh great, another outage" were classified as positive 40% of the time. Adding a sarcasm-detection head, trained on an additional 10,000 sarcasm-labeled examples, reduced the misclassification rate to 15%.
Kafka consumer lag under burst traffic
Traffic spikes of 50k msgs/sec caused consumer lag to build up. Implemented auto-scaling consumer groups that spin up new workers when lag exceeds 1,000 messages, with graceful scale-down after the burst.
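The scaling decision can be sketched as a pure function from observed consumer-group lag to desired pool size. The 1,000-message scale-up trigger mirrors the threshold above; the scale-down threshold, step size, and bounds are illustrative assumptions.

```python
def desired_workers(lag, current, min_workers=2, max_workers=16,
                    scale_up_lag=1_000, scale_down_lag=100):
    """Decide consumer-pool size from Kafka consumer-group lag.

    Add one worker at a time while lag exceeds scale_up_lag; remove
    one at a time once the burst drains below scale_down_lag; hold
    steady in between so the pool doesn't thrash.
    """
    if lag > scale_up_lag and current < max_workers:
        return current + 1
    if lag < scale_down_lag and current > min_workers:
        return current - 1
    return current
```

The gap between the two thresholds acts as hysteresis: a pool scaled up for a burst is only shrunk after lag has genuinely drained, which gives the graceful scale-down described above.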
Interested in working with Forward?
We build production AI systems and run experiments like this for teams who value rigorous engineering.