The Streaming Pipeline That Did Not Need to Stream
A B2B SaaS company invested $1.2M in 2024 building Kafka-based streaming infrastructure for "real-time analytics." Eighteen months in, the team audited what the streaming infrastructure actually served. 82 percent of consumers read data that was 6 hours to 24 hours stale because their workloads were periodic batch jobs. Only 18 percent of consumers actually used the sub-second freshness the streaming infrastructure provided.
The streaming infrastructure cost roughly 4x what equivalent batch infrastructure would have cost for the same throughput. The team had over-engineered for a real-time requirement that mostly did not exist.
This pattern is common enough that Confluent's own 2024 customer research found roughly 18 percent of streaming use cases genuinely required sub-second freshness (Confluent, "State of Data in Motion 2024," 2024). The other 82 percent could have run on simpler infrastructure.
If your team is debating real-time architecture for a workload that may not need it, the question worth answering first is whether the use case actually requires real-time at all.
The Five Use Profiles
Real-time data architecture serves five distinct use profiles. Each one has different requirements. The architectural decision should match the profile, not a generic "real-time" label.
Profile 1 - Operational interactivity. User-facing features that react to recent state. Fraud detection during checkout, personalization based on current session, AI conversational features grounded in recent user actions. Sub-second freshness is genuinely required. Streaming architecture is justified.
Profile 2 - Operational decisioning. Automated decisions that affect business operations. Inventory allocation, dynamic pricing, ad bidding. Freshness requirements are typically 30 seconds to 5 minutes. Streaming or near-real-time architecture works.
Profile 3 - Analytical monitoring. Dashboards that humans look at periodically. Sales monitoring, operational health, KPI tracking. Freshness requirements are typically 5 minutes to 1 hour. Micro-batch or near-real-time architecture works; pure streaming is over-engineered.
Profile 4 - Analytical exploration. Ad-hoc analysis, business intelligence, periodic reports. Freshness requirements are typically 1 hour to 24 hours. Batch architecture works.
Profile 5 - Compliance and audit. Regulatory reporting, audit trails, periodic compliance certification. Freshness requirements are typically daily to weekly. Batch architecture works.
Most enterprise data architectures end up with all five profiles in different workloads. Building all of them on streaming infrastructure is expensive. Building all of them on batch infrastructure misses real-time use cases. The honest profile mix determines the architecture mix.
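As a rough illustration, the profile boundaries above can be written down as a lookup keyed on how stale a consumer can tolerate its data being. This is a minimal sketch, not a definitive classifier: the names `Profile`, `PROFILES`, and `classify` are hypothetical, 1 second stands in for "sub-second," and the gap between 1 second and 30 seconds is assigned to Profile 2 as an assumption. Freshness alone cannot separate exploration from compliance workloads, so treat the result as a starting point for the conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    max_staleness_s: float  # upper bound of tolerated staleness, in seconds
    architecture: str


# Upper bounds follow the ranges in the profile descriptions.
# Assumption: 1 second stands in for "sub-second".
PROFILES = [
    Profile("Operational interactivity", 1, "streaming"),
    Profile("Operational decisioning", 5 * 60, "streaming or near-real-time"),
    Profile("Analytical monitoring", 60 * 60, "micro-batch or near-real-time"),
    Profile("Analytical exploration", 24 * 60 * 60, "batch"),
    Profile("Compliance and audit", 7 * 24 * 60 * 60, "batch"),
]


def classify(tolerated_staleness_s: float) -> Profile:
    """Return the first profile whose upper bound covers the tolerated staleness."""
    for profile in PROFILES:
        if tolerated_staleness_s <= profile.max_staleness_s:
            return profile
    # Anything that tolerates more than a week of staleness is still batch.
    return PROFILES[-1]


# Hypothetical workloads and the staleness each one can tolerate, in seconds.
for workload, staleness_s in [("checkout fraud check", 0.2),
                              ("sales dashboard", 30 * 60),
                              ("monthly audit export", 30 * 24 * 60 * 60)]:
    profile = classify(staleness_s)
    print(f"{workload}: {profile.name} -> {profile.architecture}")
```

Running this over an inventory of real workloads gives a first estimate of the profile mix before any infrastructure decision is made.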
The Real-Time Test
Three questions test whether a use case actually needs real-time.
Question 1 - Does anyone act on freshness below 5 minutes? Not theoretical action. Actual action. Is there a system or person whose behavior changes if data is 30 seconds old versus 5 minutes old? If the answer is no, real-time is not required.
Question 2 - Does the business value of action degrade significantly with delay? Fraud detection has high value if you act within 200ms and near-zero value if you act 5 minutes later. Sales reporting has roughly the same value whether you act on it at noon or at noon plus 5 minutes. The decay curve tells you the architecture.
Question 3 - Is the alternative latency actually acceptable? Sometimes the "real-time" requirement is a reaction to currently unacceptable latency (data is 6 hours stale and that is wrong). The right answer might be "make it 30 minutes" rather than "make it 30 seconds." 30 minutes is cheaper to build and operate.
A use case passes Questions 1 and 2 when the answer is yes, and passes Question 3 when the cheaper alternative latency is not acceptable. A use case that passes all three tests genuinely needs real-time. A use case that passes one or two of them probably needs near-real-time. A use case that passes none of them is batch.
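The scoring rule for the three questions can be sketched as a small function. The function name and parameter names are hypothetical, and the third parameter encodes one reading of Question 3: a use case "passes" when a cheaper, slower target (such as 30 minutes) would not be acceptable.

```python
def real_time_test(acts_below_5_min: bool,
                   value_decays_with_delay: bool,
                   cheaper_latency_unacceptable: bool) -> str:
    """Map the three test answers to an architecture recommendation."""
    # Booleans sum as 0 or 1, so this counts how many tests passed.
    passed = sum([acts_below_5_min,
                  value_decays_with_delay,
                  cheaper_latency_unacceptable])
    if passed == 3:
        return "real-time"
    if passed >= 1:
        return "near-real-time"
    return "batch"


# Fraud detection during checkout passes all three tests.
print(real_time_test(True, True, True))
# Sales reporting passes none of them.
print(real_time_test(False, False, False))
```

The answers still have to come from the people and systems that consume the data; the function only makes the decision rule explicit.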
What Real-Time Actually Costs
Streaming infrastructure costs roughly 3-5x what equivalent batch infrastructure costs for the same throughput. The cost differential breaks down into three pieces.
Always-on compute. Streaming systems run continuously, while batch systems run on a schedule. The continuous baseline is paid for around the clock, including periods when little data is flowing.
State management overhead. Streaming computations maintain state. Joins, aggregations, windowing all require infrastructure that batch jobs do not need.
Operational complexity. Streaming systems have more failure modes, more tunable parameters, and more observability requirements. The headcount cost over time is real.
For workloads that genuinely require real-time, the cost is justified. For workloads that do not, it is a structural overspend that compounds over time.
What Modern Architectures Actually Look Like
Most enterprise data architectures in 2026 are layered to handle the profile mix.
A streaming layer handles Profiles 1 and 2. Kafka, Kinesis, Pulsar, or equivalent. The streaming layer is narrow because the genuinely real-time use cases are narrow.
A near-real-time layer handles Profile 3. Materialized views, micro-batch ETL on 1-15 minute schedules, CDC pipelines that update at minute granularity. This layer carries more workload than streaming because it serves more use cases.
A batch layer handles Profiles 4 and 5. Daily and hourly ETL on standard data warehouse infrastructure. This layer carries the most workload by volume.
A serving layer above the three handles consumption: BI tools, AI applications, operational systems. The serving layer abstracts the underlying freshness from consumers.
The teams that have rebuilt their architecture around this layered model typically reduce infrastructure cost by 40-60 percent while maintaining or improving real-time use case quality. The cost reduction comes from moving most workloads off streaming, not from making streaming cheaper.
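A back-of-the-envelope model shows where that reduction comes from. The sketch below is an illustration under simplifying assumptions, not a pricing model: cost is assumed to scale linearly with workload share, and every workload moved off streaming is assumed to run at batch cost, which understates what the near-real-time layer costs. The function names are hypothetical.

```python
def layered_cost_ratio(streaming_share: float, streaming_multiplier: float) -> float:
    """Cost of a layered architecture relative to running everything on streaming.

    streaming_share: fraction of workloads that stay on streaming (0.0 to 1.0).
    streaming_multiplier: streaming cost per workload relative to batch (batch = 1.0).
    """
    if not 0.0 <= streaming_share <= 1.0:
        raise ValueError("streaming_share must be between 0 and 1")
    if streaming_multiplier <= 0:
        raise ValueError("streaming_multiplier must be positive")
    # Workloads kept on streaming pay the multiplier; the rest pay batch cost.
    layered = streaming_share * streaming_multiplier + (1.0 - streaming_share)
    all_streaming = streaming_multiplier
    return layered / all_streaming


def savings_percent(streaming_share: float, streaming_multiplier: float) -> float:
    """Percentage saved by the layered architecture versus all-streaming."""
    return (1.0 - layered_cost_ratio(streaming_share, streaming_multiplier)) * 100


# 18 percent of workloads stay on streaming; streaming costs 3x to 5x batch.
for multiplier in (3.0, 4.0, 5.0):
    print(f"{multiplier:.0f}x multiplier: {savings_percent(0.18, multiplier):.1f}% saved")
```

With an 18 percent streaming share and a 4x multiplier, this model gives a saving of about 61 percent. That is an optimistic bound, because the near-real-time layer costs more than pure batch in practice.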
The Migration Path
For organizations that built monolithic streaming infrastructure, the migration path to a layered architecture typically takes 9-18 months.
Phase 1 audits current workloads against the five profiles. It identifies what actually needs streaming versus what was built on streaming only because the team already had streaming infrastructure.
Phase 2 builds the near-real-time and batch layers in parallel to existing streaming. Workloads migrate by category, with streaming workloads remaining unchanged.
Phase 3 right-sizes the streaming infrastructure to serve only the workloads that genuinely need it. The streaming infrastructure becomes smaller, more specialized, and easier to operate.
The migration is rarely glamorous, but the cost savings and reductions in operational complexity are significant.
What This Costs
Building a layered architecture from scratch typically costs less than building monolithic streaming, because most of the workload runs on cheaper batch and near-real-time infrastructure.
Migrating from monolithic streaming to layered typically costs $400K-$1.5M of engineering investment for mid-market enterprises, recovered in 12-24 months through infrastructure savings.
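The payback claim is simple arithmetic and worth checking against your own numbers. The sketch below uses hypothetical helper names and assumes savings accrue at a constant monthly rate from the day the migration completes.

```python
def payback_months(migration_cost: float, monthly_savings: float) -> float:
    """Months needed for infrastructure savings to cover the migration cost."""
    if monthly_savings <= 0:
        raise ValueError("monthly_savings must be positive")
    return migration_cost / monthly_savings


def required_monthly_savings(migration_cost: float, target_months: float) -> float:
    """Monthly savings needed to recover the migration cost within target_months."""
    if target_months <= 0:
        raise ValueError("target_months must be positive")
    return migration_cost / target_months


# Ends of the ranges quoted above: $400K over 12 months, $1.5M over 24 months.
print(f"${required_monthly_savings(400_000, 12):,.0f} per month")
print(f"${required_monthly_savings(1_500_000, 24):,.0f} per month")
```

If the audited infrastructure savings fall well short of the required monthly figure, the migration takes longer to pay back than the ranges above suggest.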
What Logiciel Does Here
Logiciel works with data engineering teams whose streaming infrastructure has grown beyond what their actual workload requires, and with teams designing new architectures that need to fit the five-profile mix correctly from the start.
The AI Data Pipelines framework covers the pipeline architecture in more depth. The Data Pipeline Cost Optimization framework covers the cost-side analysis.
A 30-minute working session is enough to assess your current profile mix and identify whether the architecture matches the use cases.
Frequently Asked Questions
How do I assess whether my streaming infrastructure is over-built?
Audit consumer freshness requirements. For each downstream system reading from streaming, document the actual freshness requirement. If most are above 5 minutes, the streaming infrastructure is over-provisioned for the workload.
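The audit can start as a spreadsheet, but the check itself fits in a few lines. In this sketch the consumer names and their requirements are made up for illustration, and "most" is read as "more than half," which is an assumption.

```python
FIVE_MINUTES_S = 5 * 60


def share_above_threshold(requirements_s: dict[str, float],
                          threshold_s: float = FIVE_MINUTES_S) -> float:
    """Fraction of consumers whose freshness requirement is looser than the threshold."""
    if not requirements_s:
        raise ValueError("no consumers to audit")
    relaxed = [name for name, seconds in requirements_s.items() if seconds > threshold_s]
    return len(relaxed) / len(requirements_s)


def is_over_provisioned(requirements_s: dict[str, float]) -> bool:
    """True when more than half of the consumers tolerate more than 5 minutes."""
    return share_above_threshold(requirements_s) > 0.5


# Hypothetical consumers and their documented freshness requirements, in seconds.
consumers = {
    "fraud-scoring": 0.2,
    "pricing-engine": 60,
    "sales-dashboard": 15 * 60,
    "bi-extract": 6 * 60 * 60,
    "weekly-report": 7 * 24 * 60 * 60,
}

print(f"{share_above_threshold(consumers):.0%} of consumers tolerate more than 5 minutes")
print("over-provisioned:", is_over_provisioned(consumers))
```

The hard part is collecting honest requirements from each consumer's owner; record what the consumer actually does with the data, not what was requested when the pipeline was built.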
When should I use Kafka vs simpler streaming alternatives?
Kafka is justified when you have high-throughput streaming workloads with multiple consumers per stream and the operational capacity to support the platform. Below that bar, simpler alternatives (Kinesis, Pulsar, or even managed CDC tools) often serve better.
What is the right freshness target for AI applications?
Depends on whether the AI is operational or analytical. Conversational AI grounded in recent user state needs sub-second freshness. AI summarization of yesterday's sales data is fine at daily refresh. Most AI workloads sit in the middle and benefit from near-real-time architecture.
How do I handle the team transition from streaming-first to layered architecture?
Through skill expansion, not replacement. Streaming engineers add batch and near-real-time skills. Batch engineers add streaming skills. The team that operates a layered architecture knows all three modes. Specialized streaming-only teams typically have to retrain.
What is the right vendor stack for a layered architecture?
Less important than fit with your existing infrastructure. Most enterprises in 2026 combine a cloud data warehouse or lakehouse for batch (Snowflake, BigQuery, Databricks), a streaming layer (Confluent Kafka, Kinesis, Pulsar), and CDC tools (Fivetran, Airbyte, native CDC). The combination matters more than any single tool choice.
Sources:
- Confluent, "State of Data in Motion 2024," 2024
- Snowflake, "Cost Optimization Patterns," 2024