End-to-End Data Flow Scenarios in Datahub
Tracing real events through a distributed Datahub system

Series: Designing a Microservice-Friendly Datahub
PART III — CASE STUDY: MY CSL DATAHUB IMPLEMENTATION
Previous: RabbitMQ as the Inter-Module Backbone in Datahub
Next: Lessons Learned and Future Improvements for the CSL Datahub
Up to this point, we’ve examined each architectural component in isolation. Now it’s time to do the only thing that really proves an architecture works: watch it move.
This article walks through three real, end-to-end scenarios—triggered by users, schedules, and external systems—to show how data flows through the CSL Datahub. The goal is not to impress with complexity, but to demonstrate predictability: how events are born, buffered, translated, routed, and consumed without tight coupling or fragile coordination.
Disclaimer (Context & NDA)
The CSL Datahub implementation described here was designed and built in 2021. While the architectural principles remain valid, some implementation details could be modernized today. To comply with NDA requirements, business-specific logic, schemas, identifiers, and workflows are intentionally generalized.
The Event Lifecycle (A Quick Primer)
Before diving into scenarios, it helps to establish the shared lifecycle every event follows:
State changes in the CSL Web App
An event is emitted to Redis Streams
The event is buffered
The Processor consumes and translates it
The event is published to RabbitMQ
One or more modules consume and react
Nothing skips steps. Nothing shortcuts the pipeline.

That consistency is what makes the system understandable.
Scenario 1: User-Triggered Update
The Trigger
A user updates their profile through the CSL Web App UI.
Step 1: State Change in the Web App
$user->display_name = $input['display_name'];
$user->updated_at = time();
$user->save();
This is the only place where authoritative state is modified.
Step 2: Emit an Event
$redis->xAdd(
    'csl:events',
    '*',
    [
        'type'        => 'user.updated',
        'user_id'     => $user->id,
        'occurred_at' => time()
    ]
);
This event does not ask for anything. It simply states a fact.
Step 3: Redis Buffers the Event
If downstream systems are slow or unavailable, the event sits safely in Redis Streams. The Web App does not wait.
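The group setup itself is not shown in the original snippets. As a minimal sketch, assuming the Processor uses StackExchange.Redis (the stream and group names match the snippets; the connection details are illustrative), the group is created once, idempotently, and the backlog can be inspected at any time:
using System;
using StackExchange.Redis;

// Sketch: create the consumer group idempotently and inspect the backlog.
var db = ConnectionMultiplexer.Connect("localhost").GetDatabase();

try
{
    // "$" = deliver only entries added after the group is created
    db.StreamCreateConsumerGroup("csl:events", "csl-group", "$");
}
catch (RedisServerException ex) when (ex.Message.StartsWith("BUSYGROUP"))
{
    // Group already exists; nothing to do
}

// Pending = delivered to a consumer but not yet acknowledged
var pending = db.StreamPending("csl:events", "csl-group");
Console.WriteLine($"Backlog: {pending.PendingMessageCount} unacknowledged entries");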
Step 4: Processor Consumes and Translates
var entries = redis.StreamReadGroup(
    "csl:events",    // stream key
    "csl-group",     // consumer group
    "processor-1",   // consumer name
    ">"              // only entries never delivered to this group
);

foreach (var entry in entries)
{
    PublishToRabbitMQ(Map(entry));
    redis.StreamAcknowledge("csl:events", "csl-group", entry.Id);
}
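Map is where translation lives. The real field mappings are generalized for NDA reasons; a minimal sketch of the idea, reusing the field names from the xAdd call above (the eventId passthrough is an assumption, added so consumers can deduplicate later):
using System.Linq;
using System.Text.Json;
using StackExchange.Redis;

// Sketch: translate a Redis stream entry into a JSON payload for
// RabbitMQ. Field names mirror the Web App's xAdd call; the output
// shape is illustrative, not the original contract.
static byte[] Map(StreamEntry entry)
{
    var fields = entry.Values.ToDictionary(
        v => (string)v.Name,
        v => (string)v.Value
    );

    return JsonSerializer.SerializeToUtf8Bytes(new
    {
        type       = fields["type"],
        userId     = fields["user_id"],
        occurredAt = fields["occurred_at"],
        eventId    = (string)entry.Id   // stream ID doubles as a dedup key
    });
}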
Step 5: RabbitMQ Distributes
channel.BasicPublish(
    exchange: "csl.events",
    routingKey: "user.updated",
    body: payload
);
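For routing keys like user.updated to fan out selectively, csl.events needs to be a topic exchange. The declaration is not part of the original snippets; a sketch, continuing with the channel from above (durability is an assumption):
// Assumed setup: a durable topic exchange lets each module bind
// with patterns such as "user.*" or "process.#".
channel.ExchangeDeclare(
    exchange: "csl.events",
    type: ExchangeType.Topic,
    durable: true
);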
Step 6: Modules React Independently
Notification service sends an email
Analytics service updates counters
Search indexer refreshes a document
None of these modules know about each other. None block the user.
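That independence comes from each module owning its own queue and binding. A hypothetical sketch of the notification service's side, continuing with the channel from the snippets above (the queue name, binding pattern, and HandleUserEvent are all illustrative):
// Hypothetical module-side setup: a private durable queue, bound
// only to the routing keys this module cares about.
channel.QueueDeclare(
    queue: "notifications.user-events",
    durable: true,
    exclusive: false,
    autoDelete: false
);
channel.QueueBind(
    queue: "notifications.user-events",
    exchange: "csl.events",
    routingKey: "user.*"
);

var consumer = new EventingBasicConsumer(channel);
consumer.Received += (sender, ea) =>
{
    HandleUserEvent(ea.Body);                      // e.g., send an email
    channel.BasicAck(ea.DeliveryTag, multiple: false);
};
channel.BasicConsume(
    queue: "notifications.user-events",
    autoAck: false,
    consumer: consumer
);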
Scenario 2: Cron-Triggered Update
The Trigger
A scheduled job recalculates account status nightly.
Step 1: Cron Runs Inside the Web App
foreach ($users as $user) {
    $user->status = calculateStatus($user);
    $user->save();

    $redis->xAdd(
        'csl:events',
        '*',
        [
            'type'        => 'user.status_recalculated',
            'user_id'     => $user->id,
            'occurred_at' => time()
        ]
    );
}
Notice what doesn’t change:
Same database
Same event emission
Same pipeline
Cron jobs are just another source of facts.
Step 2: Burst Handling via Redis Streams
Hundreds or thousands of events may be emitted in a short time. Redis absorbs the burst. Consumers catch up at their own pace.
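Concretely, the count parameter turns the Step 4 loop into a throttle: the Processor drains the stream in bounded batches while Redis holds the rest. A sketch, under the same StackExchange.Redis assumption as before:
// Drain the backlog in bounded batches; Redis buffers whatever
// hasn't been read yet.
while (true)
{
    var batch = redis.StreamReadGroup(
        "csl:events", "csl-group", "processor-1", ">",
        count: 100   // at most 100 entries per iteration
    );
    if (batch.Length == 0) break;   // caught up

    foreach (var entry in batch)
    {
        PublishToRabbitMQ(Map(entry));
        redis.StreamAcknowledge("csl:events", "csl-group", entry.Id);
    }
}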
Step 3: Downstream Effects
Billing service recalculates charges
Compliance service flags anomalies
Reporting service aggregates metrics
Some consumers may lag. Others may fail and retry. None affect the cron job’s completion.
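"Fail and retry" has a concrete shape on the consumer side: acknowledge only on success, negatively acknowledge otherwise. A sketch (RecalculateCharges is a hypothetical handler, and a production version would cap retries or dead-letter the message):
consumer.Received += (sender, ea) =>
{
    try
    {
        RecalculateCharges(ea.Body);   // hypothetical billing handler
        channel.BasicAck(ea.DeliveryTag, multiple: false);
    }
    catch (Exception)
    {
        // Requeue for redelivery; idempotent handling (see below)
        // is what makes the retry safe.
        channel.BasicNack(ea.DeliveryTag, multiple: false, requeue: true);
    }
};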
Scenario 3: External Module–Triggered Update
The Trigger
An external module completes a long-running process and reports back.
Step 1: External Module Publishes an Event
channel.publish(
    'external.events',
    'process.completed',
    Buffer.from(JSON.stringify({
        processId: 'abc123',
        status: 'success'
    }))
);
Step 2: Processor Consumes from RabbitMQ and Calls the Web App REST API
consumer.Received += async (sender, ea) =>
{
    var message = Deserialize(ea.Body);

    var response = await httpClient.PostAsync(
        "/api/csl/process-complete",
        new StringContent(Json(message))
    );
    response.EnsureSuccessStatusCode();

    // Acknowledge only after the Web App has accepted the update;
    // on failure the message stays unacknowledged for redelivery.
    channel.BasicAck(ea.DeliveryTag, false);
};
Step 3: CSL Web App Updates State
// $processId is taken from the request the Processor just posted
$model = Process::findOne($processId);
$model->status = 'completed';
$model->save();
Step 4: CSL Emits a New Event
$redis->xAdd(
    'csl:events',
    '*',
    [
        'type'        => 'process.completed',
        'process_id'  => $processId,
        'occurred_at' => time()
    ]
);
The cycle continues. No shortcuts. No special cases.
Idempotency in Practice
In all three scenarios, duplication is possible—and expected.
Consumers defend themselves:
if (processedEventIds.has(event.id)) {
    return;
}
processedEventIds.add(event.id);
Or via database constraints:
INSERT INTO processed_events (event_id)
VALUES (:event_id)
ON DUPLICATE KEY UPDATE event_id = event_id;
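The constraint only pays off if the consumer checks the outcome. A sketch assuming MySQL via the MySqlConnector library (the table name matches the statement above; everything else is illustrative): with the no-op update, MySQL reports one affected row on a first-time insert and zero on a duplicate.
using MySqlConnector;

// Sketch: claim the event before doing any work. Returns false when
// the event was already processed (duplicate delivery).
static bool TryClaimEvent(MySqlConnection conn, string eventId)
{
    using var cmd = new MySqlCommand(
        @"INSERT INTO processed_events (event_id)
          VALUES (@event_id)
          ON DUPLICATE KEY UPDATE event_id = event_id;",
        conn
    );
    cmd.Parameters.AddWithValue("@event_id", eventId);
    return cmd.ExecuteNonQuery() == 1;   // 0 => duplicate, skip processing
}
A consumer calls TryClaimEvent first and simply returns when it gets false; the retry becomes a harmless no-op.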
Idempotency is what turns retries into reliability.
What These Scenarios Prove
Across all flows:
The Web App never waits for downstream systems
Events are facts, not instructions
Buffers absorb time and failure
Modules react independently
The system converges, even under stress

This is not accidental. It’s the result of respecting boundaries.
Seeing the System as Motion, Not Structure
Once you follow real events end-to-end, the architecture stops being a diagram and starts being a machine.
You can reason about:
Where delays occur
Where failures are isolated
Where retries are safe
Where ownership lives

That’s the difference between knowing an architecture and trusting it.
Where We Go Next
Now that we’ve seen the CSL Datahub in motion, the final step is reflection.
In the next and final article, Lessons Learned and Future Improvements, we’ll step back to examine what worked well, what was harder than expected, and how this architecture could evolve over time—technically and organizationally.
Good architectures don’t just make systems work.
They make systems understandable while they work.
