Devpath Traveler

Designing the BFF Contract: Request Aggregation & Client-Specific Shaping

Nguyễn Việt Tùng — Wed, 08 Apr 2026 13:11:25 GMT

In the previous article, we established when a BFF earns its overhead. This article assumes you have made that decision and are now facing the harder question: how do you actually design the thing well?

The BFF contract — the API surface your application depends on — is the most consequential design decision in this architecture. Get it right and you have a clean, stable interface that lets the frontend move independently of upstream service changes. Get it wrong and you have a new layer that amplifies the coupling problems you were trying to solve, with the added pleasure of owning the infrastructure.

This article covers the four design concerns that determine which outcome you get: what aggregation actually means in practice, how to shape responses for the client rather than the domain, how to version the contract without recreating a backend team's problems, and where the boundary between BFF logic and upstream logic must be drawn and defended.

The examples and architecture decisions throughout are drawn from a production implementation built for a Norwegian enterprise in the education sector. Where the original system cannot be described in full, concepts have been generalised to meet NDA obligations, but the engineering trade-offs, the failure modes, and the decisions that shaped the final design are real. Vue.js and .NET Core serve as the example frameworks because they are what that production project uses.

Aggregation is not just parallel fetching

The word "aggregation" in BFF descriptions usually conjures an image of parallel HTTP calls fanning out to multiple services and the results being merged before the response is returned. That is part of it. But treating aggregation as a mechanical fan-out pattern misses what makes it valuable — and what makes it dangerous.

Consider a dashboard screen in an education platform. The frontend needs: the authenticated user's profile and role, the list of courses they are enrolled in, the next three upcoming sessions, and a count of unread notifications. In a naïve aggregation implementation, the BFF fires four requests in parallel, waits for all four, and concatenates the results into a response object.

This works. It is also fragile in a way that becomes apparent under load.

A more considered design asks: what are the actual dependency relationships between these data sets? The course list depends on the user's organisation ID, which comes from the user profile. The session list is scoped to the course IDs returned by the course list. The notification count is independent. This is not a flat fan-out. It is a directed graph with three sequential phases, only the first of which runs calls in parallel:

Phase 1 (parallel):
  → GET /identity/users/{id}         → profile + orgId
  → GET /notifications/unread/count  → notificationCount

Phase 2 (depends on Phase 1):
  → GET /courses?orgId={orgId}       → courseIds
  
Phase 3 (depends on Phase 2):
  → GET /sessions?courseIds={...}&limit=3 → upcomingSessions

Treating this as a flat parallel fan-out would require either fetching all courses before knowing the org ID (not possible), or making the frontend responsible for the sequencing (which defeats the point). The BFF owns this orchestration — it understands the dependency graph and executes it efficiently, shielding the frontend from the fact that it exists.

This has a practical implication for implementation. In .NET Core, Task.WhenAll handles the parallel phase, and the dependent phases chain naturally through sequential awaits:

// Phase 1: independent fetches in parallel
var profileTask = _userService.GetProfileAsync(userId);
var notificationTask = _notificationService.GetUnreadCountAsync(userId);
await Task.WhenAll(profileTask, notificationTask);
var profile = await profileTask;
var notificationCount = await notificationTask;

// Phase 2: depends on profile
var courses = await _courseService.GetByOrgAsync(profile.OrgId);

// Phase 3: depends on courses
var courseIds = courses.Select(c => c.Id).ToArray();
var upcomingSessions = await _sessionService.GetUpcomingAsync(courseIds, limit: 3);

The aggregation logic lives in the BFF. The frontend makes one request and receives one coherent response. The dependency graph is invisible to the client.

When aggregation goes wrong

Two aggregation anti-patterns appear consistently in production BFFs.

The "God endpoint." A single endpoint that returns everything the application might ever need, used on multiple screens because it is convenient. The God endpoint inflates payload sizes, makes partial failure handling impossible, and couples unrelated features together. If a notification service outage should not take down the course list, they must not share an endpoint. Design endpoints around screen-level data contracts, not around service boundaries.

Cascading failure without isolation. If Phase 2 in the example above fails because the course service is down, a poorly designed BFF either returns a 500 (crashing the whole screen) or silently swallows the error (showing stale or empty data with no indication of the problem). The correct design is explicit partial failure handling: return what succeeded, mark what failed, and let the frontend decide how to render a degraded state.

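public record DashboardResponse(
    UserProfile Profile,
    IReadOnlyList<CourseCardDto> Courses,            // element type name is illustrative
    IReadOnlyList<SessionItemDto> UpcomingSessions,  // element type name is illustrative
    int NotificationCount,
    IReadOnlyList<string> PartialFailures            // e.g. ["courses", "sessions"]
);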

This is not error handling for its own sake. It is a deliberate contract: the BFF guarantees it will always return a structurally valid response, and the Vue component decides what to render when parts of it are empty.
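To make the partial-failure contract concrete, here is a minimal sketch of the aggregation with per-section isolation. It is written in TypeScript for brevity rather than the .NET used by the production BFF, and every name in it (Upstream, attempt, buildDashboard, the section labels) is an assumption for illustration, not the production code. Each upstream call is wrapped so that a failure records the section name and yields a fallback value, and a section whose prerequisite failed is marked as failed without being called.

```typescript
// Hypothetical shapes, simplified to IDs and titles for the sketch.
interface Profile { displayName: string; orgId: string }

interface DashboardResponse {
  profile: Profile | null;
  courseIds: string[];
  upcomingSessions: string[];
  notificationCount: number;
  partialFailures: string[]; // e.g. ["courses", "sessions"]
}

// Hypothetical upstream clients; the real services are not described in the article.
interface Upstream {
  getProfile(): Promise<Profile>;
  getUnreadCount(): Promise<number>;
  getCourseIds(orgId: string): Promise<string[]>;
  getUpcomingSessions(courseIds: string[]): Promise<string[]>;
}

// Runs one upstream call; on failure, records the section and returns the fallback.
async function attempt<T>(
  section: string,
  failures: string[],
  fallback: T,
  call: () => Promise<T>,
): Promise<T> {
  try {
    return await call();
  } catch {
    failures.push(section);
    return fallback;
  }
}

async function buildDashboard(up: Upstream): Promise<DashboardResponse> {
  const failures: string[] = [];

  // Phase 1: independent calls in parallel, each isolated from the other.
  const [profile, notificationCount] = await Promise.all([
    attempt<Profile | null>("profile", failures, null, () => up.getProfile()),
    attempt<number>("notifications", failures, 0, () => up.getUnreadCount()),
  ]);

  // Phase 2: needs orgId. Without a profile the course call cannot be made.
  let courseIds: string[] = [];
  if (profile !== null) {
    const orgId = profile.orgId;
    courseIds = await attempt<string[]>("courses", failures, [], () => up.getCourseIds(orgId));
  } else {
    failures.push("courses");
  }

  // Phase 3: needs course IDs. If courses failed, sessions are marked failed too.
  let upcomingSessions: string[] = [];
  if (failures.indexOf("courses") === -1) {
    upcomingSessions = await attempt<string[]>(
      "sessions", failures, [], () => up.getUpcomingSessions(courseIds));
  } else {
    failures.push("sessions");
  }

  return { profile, courseIds, upcomingSessions, notificationCount, partialFailures: failures };
}
```

The response is structurally valid in every case: failed sections carry empty fallbacks, and partialFailures tells the frontend which empty values mean "unavailable" rather than "none".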

Response shaping: the frontend owns the shape

The single most impactful thing a BFF does is decouple the response shape from the domain model. This is also where engineers most frequently make the wrong call.

The wrong call is to return the upstream service's response more or less as-is, perhaps with light field selection. This is understandable — it is the path of least resistance, it keeps the BFF thin, and it avoids the question of what the "right" shape is. But it is not response shaping. It is proxying, and a proxy does not justify the overhead of a dedicated service.

Response shaping means designing the response around the component tree that will consume it. The question is not "what did the upstream service return?" but "what does the Vue component need, and in what shape does it need it?"

Flatten, don't nest

Domain models are frequently deeply nested because they reflect real-world entity relationships. Frontend components rarely need that nesting — they need flat data they can bind to template properties without traversal logic in the component.

An upstream Course entity might look like this:

{
  "id": "c-1",
  "metadata": {
    "title": "Mathematics — Year 9",
    "code": "MATH-9",
    "curriculum": {
      "framework": "NOR-K20",
      "subject": "Mathematics",
      "level": { "grade": 9, "label": "Year 9" }
    }
  },
  "enrollment": {
    "capacity": 30,
    "enrolled": 24,
    "waitlist": 2
  },
  "status": { "code": "ACTIVE", "since": "2024-08-15T00:00:00Z" }
}

A course card component in Vue needs: a title, a code, the enrollment fraction, and a status label. The BFF shapes this into:

{
  "id": "c-1",
  "title": "Mathematics — Year 9",
  "code": "MATH-9",
  "enrollmentLabel": "24 / 30",
  "enrollmentPercent": 80,
  "status": "Active",
  "activeFrom": "2024-08-15"
}

The component receives exactly what it renders. No traversal logic, no null-guard chains, no formatting in the template. The formatting decisions — how to display the enrollment fraction, how to present the date — are made once, in the BFF, and are consistent across every component that uses this data.
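The mapping from the nested entity to the flat card can be sketched as a single pure function. This is an illustration in TypeScript rather than the production .NET code; the type names, the STATUS_LABELS lookup, the rounding rule, and the choice to take the date portion of the UTC timestamp are all assumptions.

```typescript
// Subset of the upstream Course entity shown above (only the fields the card uses).
interface UpstreamCourse {
  id: string;
  metadata: { title: string; code: string };
  enrollment: { capacity: number; enrolled: number; waitlist: number };
  status: { code: string; since: string };
}

// The flat shape the course card binds to.
interface CourseCardDto {
  id: string;
  title: string;
  code: string;
  enrollmentLabel: string;
  enrollmentPercent: number;
  status: string;
  activeFrom: string;
}

// Hypothetical lookup; only the ACTIVE -> "Active" pair appears in the article.
const STATUS_LABELS: { [code: string]: string } = { ACTIVE: "Active" };

function toCourseCard(course: UpstreamCourse): CourseCardDto {
  const { capacity, enrolled } = course.enrollment;
  return {
    id: course.id,
    title: course.metadata.title,
    code: course.metadata.code,
    enrollmentLabel: `${enrolled} / ${capacity}`,
    // Guard against a zero capacity rather than emitting NaN or Infinity.
    enrollmentPercent: capacity > 0 ? Math.round((enrolled / capacity) * 100) : 0,
    // Unknown codes fall back to the raw code instead of failing the response.
    status: STATUS_LABELS[course.status.code] ?? course.status.code,
    // Assumption: the date portion of the ISO timestamp is what the card displays.
    activeFrom: course.status.since.slice(0, 10),
  };
}
```

Keeping the shaping in one pure function makes it trivial to unit test, which is part of the argument for doing it in the BFF rather than in templates.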

Computed fields belong in the BFF

The enrollment label and enrollment percentage in the example above are computed fields — they do not exist in the upstream response and must be derived. They belong in the BFF, not in the component.

The underlying principle: any derivation that is deterministic, presentation-oriented, and would be repeated across multiple components is a BFF concern. This includes percentage calculations, label generation, date formatting, status code translation, and currency formatting with locale awareness.

What does not belong in the BFF: business logic that should live upstream, validation that changes application state, or calculations that depend on runtime user input. The BFF is a rendering layer for data that is already computed — it is not a domain service.

Naming conventions are a contract decision

The upstream service exposes enrollment.capacity and enrollment.enrolled. The BFF exposes enrollmentLabel and enrollmentPercent. The Vue component binds to enrollmentLabel and enrollmentPercent directly, with no mapping layer in between.

This means your BFF's property names are part of its contract with the frontend. Changing enrollmentLabel to enrollmentText is a breaking change, even if the value is identical. Name properties for their rendering purpose, not their origin, and treat them with the same stability you would expect from any API you consume.

In practice, this argues for establishing naming conventions before writing the first endpoint and enforcing them through code review. Consistency in the contract reduces cognitive load on both sides of it.

Versioning: the problem you are creating

A BFF contract is an API. APIs have versions, and versions accumulate. This is the most underspecified aspect of BFF design in most articles, because it is uncomfortable to discuss — you are creating a versioning problem in order to solve a coupling problem, and the question is whether the trade is favourable.

There are three approaches, each with a clear use case.

URL versioning for major breaking changes

GET /api/v1/dashboard
GET /api/v2/dashboard

URL versioning is the most visible, most explicit, and most operationally expensive approach. It is appropriate when a response shape change is so significant that the old and new shapes cannot coexist under one contract — for example, when a screen is redesigned and the data model changes completely.

The operational cost is that both versions must be maintained simultaneously during the migration period, and the migration period in practice extends far longer than anticipated. Budget for it.

Header versioning for incremental evolution

GET /api/dashboard
Accept: application/vnd.bff.dashboard+json; version=2

Header versioning keeps URLs stable and moves the version negotiation into headers. It is cleaner for incremental evolution — adding fields, changing response structure within a stable conceptual model — and it does not require duplicating route definitions. The cost is that it is less discoverable and requires slightly more discipline in the client to set the header correctly.
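As a rough illustration of the negotiation step, the version parameter can be read from the Accept header with a small helper. This is a sketch in TypeScript under stated assumptions: the function name and the default of version 1 are hypothetical, and it only looks at the first version parameter it finds rather than fully parsing multiple media ranges.

```typescript
// Returns the contract version requested in an Accept header value,
// or the default when the header is absent or carries no version parameter.
function parseContractVersion(accept: string | undefined, defaultVersion: number = 1): number {
  if (!accept) {
    return defaultVersion;
  }
  // Matches `; version=2` or `; version="2"`, case-insensitively.
  const match = /;\s*version\s*=\s*"?(\d+)"?/i.exec(accept);
  return match ? Number(match[1]) : defaultVersion;
}
```

Falling back to a default version means a client that forgets the header still gets a valid response, which softens the discipline cost mentioned above at the price of hiding the omission.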

Additive-only evolution: the best versioning strategy

The most effective versioning strategy is one you do not need. If the BFF contract evolves additively — new fields are added, existing fields are never removed, semantics never change — versioning becomes a maintenance concern rather than a migration project.

This is achievable in practice with two rules:

Never remove a field. If a field is no longer needed by the frontend, mark it as deprecated in internal documentation and, once no deployed client still reads it, stop populating it (return null), but leave it in the response schema. Removal is a v2 concern.

Never change the semantics of an existing field. If status currently returns "Active" and you need it to return a structured object, that is a new field — statusDetail — not a change to status. The original field continues as-is.

Additive-only evolution is not infinitely sustainable — response shapes accumulate cruft over time — but it defers versioning costs to the natural cadence of major releases rather than introducing them with every sprint.
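The two rules lend themselves to an automated check. The sketch below, in TypeScript, compares two flat maps of field name to type name and reports anything that is not purely additive. The FieldTypes representation and the function name are assumptions, and a type comparison is only a proxy: a field can keep its type and still change meaning, which no such check will catch.

```typescript
// Flat description of a response schema: field name -> type name.
type FieldTypes = { [field: string]: string };

// Lists the ways `next` breaks additive-only evolution relative to `previous`.
// An empty result means every existing field is still present with the same type.
function additiveViolations(previous: FieldTypes, next: FieldTypes): string[] {
  const violations: string[] = [];
  for (const name of Object.keys(previous)) {
    if (!Object.prototype.hasOwnProperty.call(next, name)) {
      violations.push(`removed: ${name}`);
    } else if (next[name] !== previous[name]) {
      violations.push(`changed: ${name}`);
    }
  }
  return violations;
}

// Adding statusDetail alongside status is additive; retyping status is not.
const v1: FieldTypes = { id: "string", status: "string" };
const additive: FieldTypes = { id: "string", status: "string", statusDetail: "object" };
const breaking: FieldTypes = { id: "string", status: "object" };
```

Here additiveViolations(v1, additive) yields an empty list, while additiveViolations(v1, breaking) reports the change to status.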

The boundary: what belongs in the BFF

This is the question that determines whether your BFF stays healthy or becomes the new monolith. The answer is a firm principle rather than a checklist: the BFF transforms and aggregates; it does not originate.

What this means in concrete terms:

BFF owns

Response aggregation: combining data from multiple upstream services into a single response shaped for the Vue component
Field selection and projection: choosing which upstream fields to include and which to omit
Presentation-layer computation: formatting, label generation, percentage derivation, date localisation
Authentication enforcement: validating the session, exchanging tokens, enforcing access before any upstream call is made
Caching of presentation-layer data: caching aggregated, shaped responses where staleness is acceptable
Error translation: converting upstream error codes into client-meaningful error shapes

Upstream services own

Business rules: what constitutes a valid enrollment, whether a course is at capacity, eligibility logic
Domain validation: ensuring data integrity constraints are enforced at the source of truth
State mutation: creating, updating, and deleting domain entities
Cross-entity consistency: ensuring that a session cannot reference a deleted course
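Of the responsibilities listed under what the BFF owns, error translation is the easiest to leave implicit. A minimal sketch, in TypeScript and with an entirely hypothetical error vocabulary, might map upstream HTTP status codes onto a small client-facing shape:

```typescript
// Hypothetical client-facing error shape; the codes are illustrative only.
interface ClientError {
  section: string;
  code: "not_found" | "forbidden" | "unavailable" | "unexpected";
  retryable: boolean;
}

// Collapses upstream HTTP status codes into the few cases the UI can act on.
function translateUpstreamError(section: string, upstreamStatus: number): ClientError {
  if (upstreamStatus === 404) {
    return { section, code: "not_found", retryable: false };
  }
  if (upstreamStatus === 401 || upstreamStatus === 403) {
    return { section, code: "forbidden", retryable: false };
  }
  if (upstreamStatus >= 500) {
    // Server-side failures are the only ones worth offering a retry for.
    return { section, code: "unavailable", retryable: true };
  }
  return { section, code: "unexpected", retryable: false };
}
```

The frontend then branches on a handful of stable codes instead of learning each upstream service's error conventions.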

The grey area: where teams disagree

The genuinely contested cases are usually one of two types:

Computed fields that require domain knowledge. Is isEnrollmentOpen — a boolean derived from enrollment capacity and a business rule about the waitlist threshold — a presentation concern or a domain concern? The answer: if the rule could change (and business rules do change), it belongs upstream. The BFF should receive a pre-computed enrollmentStatus from the course service, not derive it locally. A BFF that embeds business rules is a BFF that becomes inconsistent with the backend when those rules change.

Input validation on write endpoints. The BFF handles write operations too — course enrollments, session registrations. Validation that is purely structural (is this field present, is this ID a valid UUID format) is reasonable in the BFF as a fast-fail before the upstream call. Validation that is semantic (is this user eligible to enroll in this course) must happen upstream. Drawing the line here prevents a situation where the BFF and the upstream service have conflicting validation logic.

A practical heuristic: if a product manager could change the rule in a sprint, it belongs upstream. If it is structural and invariant (a user ID is always a UUID), it is acceptable in the BFF.
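Structural validation of this kind is small enough to sketch in full. The example below is TypeScript rather than the production .NET code, and the request shape, field names, and error messages are assumptions. It checks only presence and format; whether the user may actually enroll is deliberately left to the upstream service.

```typescript
// Accepts the canonical 8-4-4-4-12 hexadecimal UUID layout, any version.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Fast-fail structural checks for a hypothetical enrollment request body.
// Returns a list of problems; an empty list means the body may be forwarded upstream.
function structuralErrors(body: unknown): string[] {
  const errors: string[] = [];
  const fields = (typeof body === "object" && body !== null ? body : {}) as { [key: string]: unknown };

  const userId = fields["userId"];
  if (typeof userId !== "string" || !UUID_PATTERN.test(userId)) {
    errors.push("userId must be a UUID");
  }

  const courseId = fields["courseId"];
  if (typeof courseId !== "string" || courseId.length === 0) {
    errors.push("courseId is required");
  }

  return errors;
}
```

Nothing in this function encodes a rule a product manager could change, which is the test the heuristic above proposes.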

Designing for the Vue component tree

The most useful frame for BFF endpoint design is not "what data does this screen need?" but "what does the Vue component tree look like, and what does each component expect from its props?"

A screen is composed of components. Each component has a data contract — its props interface. The BFF response should map cleanly onto that props interface, ideally such that a single destructure in the composable produces the data each component needs without further transformation.

In practice this means walking the component tree before designing the endpoint:

DashboardView
├── UserProfileCard        → { displayName, role, avatarUrl }
├── CourseListPanel
│   └── CourseCard (×n)   → { id, title, code, enrollmentLabel, status }
├── UpcomingSessionsList
│   └── SessionItem (×n)  → { id, title, startsAt, courseTitle, locationLabel }
└── NotificationBadge      → { count }

The BFF endpoint for this screen returns an object that mirrors this structure:

{
  "user": {
    "displayName": "Ingrid Solberg",
    "role": "Teacher",
    "avatarUrl": "/avatars/i-solberg.jpg"
  },
  "courses": [
    { "id": "c-1", "title": "Mathematics — Year 9", "code": "MATH-9",
      "enrollmentLabel": "24 / 30", "status": "Active" }
  ],
  "upcomingSessions": [
    { "id": "s-1", "title": "Integration review", "startsAt": "2025-04-08T09:00:00",
      "courseTitle": "Mathematics — Year 9", "locationLabel": "Room 204" }
  ],
  "notifications": { "count": 3 }
}

The Vue composable for this screen receives this response and distributes it to components without further transformation:

import { computed } from 'vue'

// useDashboard is this screen's data composable; its import path depends on the project
const { data } = useDashboard()

// Each computed ref maps directly to a component's props
const user = computed(() => data.value?.user)
const courses = computed(() => data.value?.courses ?? [])
const upcomingSessions = computed(() => data.value?.upcomingSessions ?? [])
const notificationCount = computed(() => data.value?.notifications?.count ?? 0)

No adapter logic. No field mapping. The BFF contract and the component props interface are aligned by design.

A note on OpenAPI and type safety

A BFF contract without a schema is a verbal agreement. It will drift. The response shape the BFF returns today will not match what the Vue components expect in three months, and you will discover the mismatch at runtime.

The mitigation is simple and should be non-negotiable: define the BFF contract with OpenAPI, generate TypeScript types from it, and import those types into the Vue application. Changes to the BFF response shape become compile-time errors in the frontend before they reach the browser.

In .NET Core, Swashbuckle generates an OpenAPI spec from controller or Minimal API route definitions automatically. On the Vue side, openapi-typescript or @hey-api/openapi-ts generates typed interfaces from that spec. The generation step belongs in the build pipeline, not as a manual step.

This is not optional complexity. A BFF that lacks type-safe contracts between its two surfaces — the upstream services and the Vue client — is a BFF that will break silently in production.

What comes next

This article has covered the design principles that govern a well-structured BFF contract. Before moving to implementation, the next article steps back to a comparative question that should be settled before any code is written: how does BFF compare to API Gateway and GraphQL as architectural options, where does each pattern win, and where do they coexist?


What Is BFF — and When Is It Actually Worth It?

Nguyễn Việt Tùng — Sun, 22 Mar 2026 04:22:54 GMT

Your frontend has outgrown the API it was given.

At some point, most frontend teams hit the same wall. The backend exposes what it knows — resources, entities, service boundaries — and the frontend is left stitching four API calls into a single screen, massaging data shapes the UI never asked for, and writing adapter logic that has no good place to live. The Backend for Frontend pattern is the answer to that wall. But it comes with a cost: an additional service to build, deploy, and own. This series makes the case for that trade-off — and equally, for the cases where it is not worth making. The examples and architecture decisions throughout are drawn from a production implementation built for a Norwegian enterprise in the education sector. Where the original system cannot be described in full, concepts have been generalised to meet NDA obligations — but the engineering trade-offs, the failure modes, and the decisions that shaped the final design are real.

The problem, stated plainly

Before defining what a BFF is, it is worth being precise about what problem actually warrants one.

Imagine a dashboard screen in an education platform. It needs to render: the current user's profile and role, their organisation's enrolled courses, upcoming sessions for the current week, and unread notifications. In a system where services are organised around domain entities, that screen requires calls to at least four separate endpoints — likely across two or three different services. The responses come back in shapes optimised for storage and domain logic, not for what this particular screen needs.

The frontend handles it. It fires the requests, waits for them to resolve, merges the data, filters out the fields it does not need, transforms date formats, normalises inconsistent ID conventions between services, and then renders. This works. It is also a slow, fragile, and increasingly expensive pattern at scale.

Three specific failure modes appear consistently once systems grow:

Overfetching and underfetching. REST endpoints designed around domain entities return either too much or too little for any given screen. A GET /users/{id} response that includes billing history, audit logs, and security settings satisfies the account settings page — but it is wasteful when all the dashboard needs is a display name and an avatar URL. Conversely, a screen requiring data from three different resource types must make three round trips, each adding latency, each adding a potential failure point.

Adapter logic with no home. The gap between what upstream services return and what the frontend needs gets filled somewhere. In the absence of a dedicated layer, it lands in the frontend itself — in Vuex stores, in composables, in utility functions scattered across the codebase. This logic is hard to test in isolation, invisible to the backend teams who changed the API shape that broke it, and re-implemented separately for each client surface (web app, mobile app, third-party integration).

Security and session complexity pushed to the client. When the frontend talks directly to multiple APIs, it must manage tokens for each of them, handle token refresh across parallel requests, and decide what to expose in the browser. This is not where security boundaries should be drawn. The browser is not a trusted environment, and treating it as one creates problems that are difficult to retrofit away later.

None of these problems are fatal on their own, early in a product's life. The question is what you reach for when they compound — and BFF is one answer, not the only one.

What BFF actually is

The Backend for Frontend pattern, described by Sam Newman in the context of microservices architecture, is straightforward in principle: create a dedicated backend service for each distinct frontend client, owned by the frontend team, whose sole responsibility is serving that client's specific needs.

The key words are dedicated and owned by the frontend team. A BFF is not a general-purpose API gateway. It is not a shared middleware layer. It is a service that knows exactly one consumer — your frontend — and is optimised entirely for that consumer's needs. It aggregates calls to upstream services, shapes responses into exactly what the UI requires, handles authentication at the boundary, and shields the frontend from the complexity and instability of the services behind it.

The Vue application has one API contract to reason about: the BFF. It does not know or care how many upstream services exist, how they are versioned, or what their response shapes look like. That complexity lives in the BFF, where it can be tested, versioned, and changed independently.

The BFF can do several things the frontend cannot do cleanly on its own: it can fire multiple upstream requests in parallel and merge the results before responding; it can cache aggressively for data that does not change per-request; it can enforce authentication and authorisation before a single upstream call is made; and it can translate between authentication contexts — for example, exchanging a session cookie for a service-to-service token without ever exposing a bearer token to the browser.

What BFF actually costs

This is where most introductory articles skip ahead too quickly. The BFF pattern is genuinely useful — but the version of it that works in production looks different from the version described in architecture blog posts, and the gap is filled with operational cost.

You are taking on a new service. This sounds obvious but its implications are underestimated. A new service means a new deployment pipeline, a new container to monitor, a new set of logs to aggregate, a new failure mode to handle, a new component in your runbook. If your team does not already own infrastructure or has not previously maintained a backend service, the operational learning curve is real. The BFF will go down. It will have bugs. It will need updating when upstream services change their contracts. These are not hypothetical costs — they are the routine maintenance costs of any production service, and they do not disappear because the service is thin.

The BFF becomes a coupling point. When your Vue application and your upstream services are decoupled by a BFF, the BFF is not free of coupling — it absorbs it. Every upstream API change that affects the frontend now requires a BFF change too. In a fast-moving system, this can mean the BFF becomes a bottleneck: a place where changes must land before they can reach the frontend. The team that owns the BFF becomes the team that must be unblocked first.

Latency is not free. A BFF adds one network hop between the browser and its data. For most production deployments — where the BFF and its upstream services are colocated in the same cloud region — this hop is in the single-digit milliseconds. But it exists, and for systems already operating close to latency budgets, it matters. The mitigation is co-deployment discipline and caching, both of which require deliberate effort.

The BFF can become a dumping ground. This is the failure mode no one talks about in architecture talks. A BFF that starts as a clean aggregation layer accumulates business logic over time. A validation rule here, a conditional transform there, a calculation that "just needs to live somewhere." Left unchecked, a BFF becomes a monolith with the word "frontend" in its name. The discipline to keep it thin — a translator and aggregator, not a domain engine — is cultural as much as technical, and it requires active enforcement.

When BFF is worth it

With that context established, the cases where BFF earns its overhead are clearer.

Multiple upstream services, single UI surface. If your frontend needs to aggregate data from three or more independent services for routine screens, the aggregation cost is already being paid somewhere. Paying it in the BFF — where it can be tested, cached, and monitored — is better than paying it in the client or across a distributed chain of sequential API calls.

Multiple client surfaces with diverging needs. A web application, a mobile app, and a third-party integration consume fundamentally different API shapes. A response payload appropriate for a desktop dashboard is wasteful over a mobile connection. A BFF per client surface means each client gets exactly what it needs, without the upstream services needing to know or care about client-specific requirements. This is the original use case Sam Newman described, and it remains the strongest one.

Security boundary clarity. If your system involves tokens that must never reach the browser, or authentication flows that require server-side session management — as is the case with Feide, the Norwegian government identity provider used in this series — a BFF gives you a clean place to draw the security perimeter. The BFF holds the session, manages token exchange, and the browser only ever receives a cookie. This is the Token Handler pattern, and it is substantially harder to implement correctly without a dedicated server-side layer.

Unstable upstream contracts. In a microservices environment where teams are moving fast and breaking things at the API layer, a BFF acts as a translation buffer. When an upstream service changes its response shape, you update the BFF. The Vue application is insulated. Without the BFF, that upstream change propagates directly into frontend code — often discovered at runtime rather than compile time.

Team ownership alignment. Perhaps the least technical but most practically important factor: if the frontend team has the capacity to own a backend service, a BFF gives them the autonomy to move at their own pace without being blocked on backend teams for API shape changes. This is an organisational argument as much as an architectural one, and it should be evaluated as such.

When BFF is not worth it

This section is the one most articles omit. The BFF pattern has a real overhead floor that you pay regardless of system complexity. Below a certain threshold, that floor is higher than the problems it solves.

Small teams moving fast on a single surface. If you have one frontend, one backend, and a team of three engineers, a BFF introduces a coordination overhead between your own people that does not exist if the frontend talks directly to the API. The aggregation and shaping problems are real, but they are solvable with thoughtful API design, GraphQL, or simply accepting a thin amount of adapter logic in the frontend until the system is large enough to justify more structure.

A well-designed monolithic API. If your backend already returns response shapes close to what the frontend needs — because the backend team works closely with the frontend team, or because the API was designed frontend-first — a BFF adds a layer without adding meaningful value. The problem a BFF solves is the impedance mismatch between backend domain models and frontend presentation needs. If that mismatch is small, the solution is disproportionate.

Early-stage products with unstable requirements. A BFF contract between the frontend and its upstream services is another API surface to maintain. In the early life of a product, when screen designs change weekly and domain models are still being discovered, the BFF becomes a change multiplier: every significant UI change requires a frontend change, a BFF change, and potentially an upstream change. The stability that makes a BFF valuable is the same stability that is absent in early-stage development.

Teams without infrastructure ownership. If your team has never maintained a deployed service — never dealt with container health checks, never written a deployment pipeline, never handled a 3am incident for something they own — adopting a BFF in production is learning two hard things simultaneously: the architecture and the operations. This is not a reason to avoid BFF permanently, but it is a reason to be honest about timing and capacity.

The decision framework

Rather than a checklist, use three questions. If the answer to fewer than two is "yes," the BFF overhead is likely not justified at this stage.

Does your frontend aggregate data from three or more independent services for routine operations? If most screens require only one or two API calls that already return the right shape, the aggregation value proposition is weak.

Do you have a meaningful security or session management requirement that cannot be cleanly handled in the client? If your authentication flow is stateless, token-based, and entirely client-managed, the security argument for BFF does not apply. If you are dealing with server-side sessions, token exchange, or an identity provider like Feide that requires server-side handling, it does.

Does your team have the capacity to own and operate a backend service independently? This means a deployment pipeline, monitoring, alerting, runbooks, and the willingness to be on-call for it. BFF without operational ownership is technical debt in a server rack.

What this series covers from here

The rest of this series assumes the answer is "yes, BFF is the right call." If it is not — if you read this article and concluded that your system is not there yet — the single most useful thing you can do is bookmark the decision framework above and revisit it in six months. Architecture decisions should trail system complexity, not lead it.

For those continuing: the next article addresses how to design the BFF contract itself — what belongs inside it, what must stay in upstream services, and how to version the API surface you are creating without recreating the problems you were trying to solve.

The implementation articles that follow use .NET Core Minimal APIs for the BFF service, Vue 3 composables for the client-side API layer, Feide for authentication, and Azure Container Instances for deployment. Each article is self-contained, but the architecture decisions made in the early articles carry forward — so reading in order is the path of least resistance.

Introduction to The Frontend's Contract: Building Backends for Frontends with Vue.js, .NET Core & Azure

Nguyễn Việt Tùng — Sat, 21 Mar 2026 04:49:51 GMT

"Every frontend eventually outgrows the API it was given. The BFF pattern is not about adding a layer — it's about taking ownership of the interface between your product and its backend, on your terms. This series is for engineers who want to understand the trade-offs before they commit to the architecture."

Series Guideline

Part I: Foundations (Concepts and Architecture)

Article 1: What Is BFF — and When Is It Actually Worth It?
The problem it solves, the cost it introduces, and the honest answer on when not to use it.
Article 2: Designing the BFF Contract: Request Aggregation & Client-Specific Shaping
API design principles, response shaping, versioning strategy, and what belongs in the BFF vs upstream services.
Article 3: BFF vs API Gateway vs GraphQL: Picking the Right Abstraction
Comparative analysis with real trade-offs. Where each pattern wins, where it falls over, and how they can coexist.

Part II: Implementation (Code)

Article 4: Building the BFF in .NET Core: Minimal APIs, Routing & Aggregation
Standing up the BFF service, aggregating upstream calls, shaping responses, and handling errors with real code.
Article 5: The Vue 3 API Layer: Composables, Error Boundaries & Type Safety
Building a clean, typed client-BFF contract in Vue 3. useApi composables, error handling strategies, and OpenAPI codegen.
Article 6: Auth at the Boundary: Integrating Feide Identity via the BFF
Connecting the BFF to Feide — Norway's government-issued identity provider for educational organisations. OAuth 2.0 + OIDC flow, the Token Handler pattern, and why cookie-based sessions beat tokens in the browser.
Article 7: Shipping to Azure: Docker Images, Artifact Publishing & Azure Container Instances
Full IaaS deployment pipeline — building and tagging Docker images, publishing artifacts, and running the BFF on Azure Container Instances. Includes Azure Front Door routing and when API Management adds value vs noise.

Part III: Production & Operations (Ops)

Article 8: Testing the BFF: Unit, Integration & Contract Tests
A layered testing strategy for the BFF. WebApplicationFactory for integration tests, Pact for consumer-driven contract testing with Vue.
Article 9: Observability: Structured Logging, Distributed Tracing & Azure Application Insights
End-to-end traceability across Vue → BFF → upstream services using Azure Application Insights. Correlation IDs, structured logs with Serilog, custom telemetry, and Application Insights dashboards and alerts.

Supplementary articles

Resilience Patterns: Circuit Breakers, Retries & Timeouts with Polly
Making the BFF fault-tolerant using Polly. Handling partial upstream failures gracefully in aggregated responses.
Caching in the BFF: In-Memory, Redis & Response Caching
Where caching belongs in a BFF architecture, how to avoid stale-data bugs, and cache invalidation patterns.
Brownfield Migration: The Strangler Fig Approach to BFF Adoption
Incrementally introducing a BFF in front of an existing monolith or REST API without a big-bang rewrite.

How Browser UX Shapes Security More Than Cryptography

Nguyễn Việt Tùng — Thu, 19 Feb 2026 07:43:33 GMT

Cryptography is precise.

Browsers are not.

If you’ve implemented WebAuthn in a real PWA, you already know this:
The spec is clean. The user experience is not.

The uncomfortable truth is this:

Most authentication systems fail because of UX, not because of broken cryptography.

WebAuthn gives us origin binding, challenge–response, and public-key authentication. That’s beautiful. But what users actually interact with is:

A browser modal.
An OS biometric sheet.
A permission dialog.
A vague error message.
A “NotAllowedError”.

And those surfaces shape behavior more than any algorithm ever will.

Let’s examine how browser and OS UX decisions constrain authentication design — and why UX discipline is often more important than cryptographic strength.

1. Browser and OS UX Constrain Your Architecture

When you call:

await navigator.credentials.get({
  publicKey: options
});

You are not in control anymore.

The browser:

Decides how the prompt looks.
Decides when it appears.
Decides how cancellation behaves.
Decides what error is returned.
Delegates to the OS for biometric UI.

Your PWA is a spectator.

Example: Timing Assumptions

You might assume:

The WebAuthn prompt appears immediately.
The user understands what is happening.
Cancellation is intentional.

In reality:

On Chrome desktop, the modal may appear inline.
On Safari (macOS), Touch ID sheet drops from the top.
On iOS Safari, Face ID overlay obscures the entire screen.
On Android Chrome, the prompt may feel like a system dialog unrelated to your app.

Your architecture must not depend on:

Specific timing.
Specific modal appearance.
Immediate resolution.

This is not a cosmetic issue. It affects retry logic and fallback strategy.

2. The Same WebAuthn Flow Feels Different Everywhere

The WebAuthn API is standardized.

The UX is not.

Chrome (Desktop)

Inline modal.
Clear “Use another device” option.
Relatively consistent error messaging.

Safari (macOS)

OS-native Touch ID sheet.
Less explicit fallback controls.
Errors often appear as generic cancellation.

iOS Safari

Full-screen Face ID overlay.
Sometimes minimal explanation.
Cancellation feels like app failure.

Android Chrome

OS biometric dialog.
Slightly different copy.
Device PIN fallback flows vary by manufacturer.

Your code may be identical:

try {
  const assertion = await navigator.credentials.get({ publicKey: options });
} catch (err) {
  handleError(err);
}

But err.name and user interpretation differ.

Real Example: Cancellation Handling

Common error:

DOMException: NotAllowedError

This can mean:

User cancelled.
Timeout expired.
Platform authenticator unavailable.
Permission denied.

From your frontend perspective:

catch (err) {
  if (err.name === "NotAllowedError") {
    showRetry();
  }
}

But retry logic must consider:

Did the user intentionally cancel?
Did the biometric sensor fail?
Is WebAuthn unsupported?

If you misinterpret cancellation as attack — you create lockouts.

If you misinterpret failure as benign — you create confusion.

UX interpretation is part of your threat model.
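
One way to make that interpretation explicit is to classify the failure before deciding what the UI does. The sketch below is an assumption-heavy illustration, not the production code: the function name, the returned labels, and the elapsed-time heuristic are all invented for this example. Browsers deliberately do not reveal why a NotAllowedError occurred, so the result is a hint, never a fact.

```javascript
// Maps a failed navigator.credentials.get() call to a UI decision.
// The labels and the timeout heuristic are illustrative assumptions.
function classifyWebAuthnFailure(err, { elapsedMs, timeoutMs, webAuthnSupported }) {
  if (!webAuthnSupported || err.name === "NotSupportedError") {
    return "unsupported"; // offer the fallback login, do not offer retry
  }
  if (err.name === "AbortError") {
    return "aborted"; // the app itself cancelled the call via an AbortSignal
  }
  if (err.name === "NotAllowedError") {
    // Cancellation and timeout share this error name; elapsed time is only a hint.
    return elapsedMs >= timeoutMs ? "likely-timeout" : "likely-cancelled";
  }
  return "unexpected"; // e.g. SecurityError, InvalidStateError: log it and fall back
}
```

None of these outcomes should count towards a lockout on their own; they only decide which message and which next step the user sees.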

3. Permission Dialogs Shape Security Outcomes

Consider initial WebAuthn registration:

await navigator.credentials.create({
  publicKey: options
});

Browser may ask:

“Allow this site to use your security key?”
“Allow Touch ID for this site?”

If your UI does not clearly prepare the user:

The permission dialog feels suspicious.
The user cancels reflexively.
They choose fallback instead.

Repeated friction trains users to:

Prefer weaker flows.
Avoid passwordless enrollment.

Strong crypto loses to confusing UX.

4. Retry Flows Influence Security Behavior

Imagine this frontend flow:

async function startWebAuthn(options) {
  try {
    const assertion = await navigator.credentials.get({ publicKey: options });
    await verify(assertion);
  } catch (err) {
    showRetry();
  }
}

If “Retry” automatically triggers WebAuthn again without context, users may:

Rapidly cancel.
Assume something is broken.
Switch to fallback.

Instead, better UX:

if (err.name === "NotAllowedError") {
  showMessage("Authentication was cancelled. Try again or use Feide login.");
}

Explicit fallback messaging prevents:

Panic.
Repeated failure loops.
Insecure workaround requests (“Can you disable this for me?”).

Retries are not neutral. They shape behavior.
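
A small state tracker makes this behaviour deliberate rather than accidental. The following is a minimal sketch under stated assumptions: `createRetryTracker` and `maxPrompts` are hypothetical names, and the second message is invented for illustration. The key property is that the tracker never re-triggers WebAuthn by itself; it only tells the UI what to offer.

```javascript
// Tracks consecutive cancellations and decides what the UI should offer next.
// maxPrompts is a hypothetical tuning knob, not a WebAuthn setting.
function createRetryTracker(maxPrompts = 2) {
  let cancellations = 0;
  return {
    recordCancellation() {
      cancellations += 1;
      if (cancellations >= maxPrompts) {
        // Stop nudging towards another prompt; make the fallback the obvious path.
        return { action: "offer-fallback", message: "Having trouble? You can sign in with Feide instead." };
      }
      return { action: "offer-retry", message: "Authentication was cancelled. Try again or use Feide login." };
    },
    recordSuccess() {
      cancellations = 0; // a successful login clears the history
    },
  };
}
```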

5. Browser UX Affects Security Perception

Security systems rely on trust perception.

If the browser modal:

Looks native and familiar → user trusts it.
Looks alien or unexpected → user suspects phishing.

That’s why WebAuthn is powerful:

Origin binding ensures the browser only shows credentials for the correct site.

But the user doesn’t see origin binding.
They see a modal.

Your UI must:

Clearly explain what is about to happen.
Avoid surprising transitions.
Avoid triggering WebAuthn automatically without context.

Example:

Instead of immediately calling WebAuthn:

<button @click="authenticate">
  Sign in with device
</button>

Make the user initiate the action.

User agency increases trust.

6. Why Good UX Prevents Insecure Workarounds

Users do not attack your system.

They bypass it.

If passwordless is confusing, they will:

Ask support to disable it.
Request email-based fallback.
Demand “simpler login”.

If fallback is weak, security erodes.

Good UX reduces these pressures.

Example: Clear device management UI.

Instead of hiding credentials:

var devices = await _db.WebAuthnCredentials
    .Where(c => c.UserId == user.Id)
    .ToListAsync();

Expose:

Device name
Registration date
Revoke button

Transparency builds confidence.
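
On the client side, the device list only needs a handful of fields. Here is a minimal sketch of shaping credential records for display; the field names `deviceName` and `createdAt` are assumptions for this example, and the function is hypothetical. Key material and raw credential IDs deliberately stay on the server.

```javascript
// Converts stored credential records into what the device list shows.
// Field names (deviceName, createdAt) are assumptions for illustration.
function toDeviceListItems(credentials) {
  return credentials.map((c) => ({
    id: c.id, // opaque row id, used by the revoke button
    name: c.deviceName || "Unnamed device",
    registeredOn: c.createdAt.toISOString().slice(0, 10), // YYYY-MM-DD
  }));
}
```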

7. Browser Constraints Affect Architecture

You cannot:

Customize biometric prompt text.
Force specific fallback options.
Guarantee consistent timing.

Therefore, architecture must:

Avoid assuming prompt content.
Avoid assuming immediate response.
Support retry and fallback cleanly.
Log error patterns per browser.

Operationally, track:

WebAuthn failures by user agent.
Cancellation frequency.
Fallback usage rates.

UX metrics are security metrics.
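
To show what that tracking can look like, here is a small aggregation sketch. It assumes a hypothetical event shape of `{ browser, outcome }`, where `outcome` is one of `success`, `cancelled`, `failed`, or `fallback`; neither the shape nor the function name comes from the production system.

```javascript
// Aggregates authentication events per browser family.
// The event shape ({ browser, outcome }) is a hypothetical logging format.
function summarizeByBrowser(events) {
  const summary = {};
  for (const { browser, outcome } of events) {
    if (!summary[browser]) {
      summary[browser] = { success: 0, cancelled: 0, failed: 0, fallback: 0 };
    }
    const row = summary[browser];
    // Ignore outcomes this sketch does not know about.
    if (Object.prototype.hasOwnProperty.call(row, outcome)) {
      row[outcome] += 1;
    }
  }
  for (const row of Object.values(summary)) {
    const attempts = row.success + row.cancelled + row.failed;
    row.cancellationRate = attempts === 0 ? 0 : row.cancelled / attempts;
  }
  return summary;
}
```

A cancellation rate that is much higher in one browser than in the others usually points at a prompt that confuses people there, not at an attack.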

8. Cryptography vs Behavior

WebAuthn’s cryptography is solid:

Public key signatures.
Origin binding.
Replay protection.
Counter tracking.

But if:

Users disable it.
Enrollment fails.
Recovery is confusing.
Fallback is hidden.

Then strong algorithms lose to weak experience.

The most secure system is the one users willingly use.

Final Reflection

Security engineers love to debate:

Key lengths.
Counter semantics.
Attestation policies.

But in real deployments, the bigger questions are:

Did the user understand what just happened?
Did the retry flow make sense?
Did cancellation feel safe?
Did fallback feel legitimate?
Did the browser modal align with user expectations?

Browser UX is not decoration layered on top of cryptography.

It is the environment in which cryptography lives.

WebAuthn’s design is brilliant.

But the success of a passwordless-first PWA depends less on elliptic curves — and more on how gracefully your system handles human uncertainty.

Stronger algorithms improve theoretical security.

Clearer UX improves actual security.

And in production systems, actual security is the only kind that matters.

Why Passwordless Alone Is Not an Identity Strategy

Nguyễn Việt Tùng — Thu, 19 Feb 2026 04:19:02 GMT

When teams adopt WebAuthn or FIDO2, the excitement is understandable:

No passwords.
No phishing.
No credential stuffing.
Biometric UX.
Public-key cryptography.

It feels like the final answer.

But WebAuthn answers exactly one question:

Can this device prove control of a credential for this origin right now?

It does not answer:

Who is this user across systems?
What happens if the device is lost?
How do we bootstrap identity?
How do we link accounts?
How do we recover?
How do we federate across institutions?

Passwordless authentication solves proof of possession.

Identity strategy solves continuity over time.

Those are different problems.

The Illusion of “Pure Passwordless”

It’s tempting to imagine a system that:

Only uses WebAuthn
Has no identity provider
Has no fallback
Has no recovery flow

On paper, that sounds maximally secure.

In reality, it’s brittle.

Let’s walk through real scenarios.

Scenario 1 — Device Loss

User registers WebAuthn credential.

All good.

Then:

Phone is lost.
Laptop is replaced.
Browser storage is cleared.

Now what?

Without fallback:

The account is inaccessible.
Support must intervene manually.
Or recovery becomes weak (email-only reset).

If recovery is ad hoc, security erodes.

If recovery is absent, usability collapses.

This is why fallback is not compromise — it is necessity.

Fallback Is a Design Requirement

Fallback should not mean:

“Use a weaker method.”

It should mean:

“Use an alternate trust anchor.”

In this architecture, that trust anchor was Feide (OIDC).

WebAuthn provided:

Device-bound possession proof.

Feide provided:

Federated identity continuity.

That layering is deliberate.

Passwordless Without Federation Breaks at Scale

In a real system:

Users change devices.
Users move institutions.
Accounts are deactivated upstream.
Identity policies change.

Without federation:

You must manage identity lifecycle yourself.
You must build account verification logic.
You must build secure recovery flows.
You must handle identity merging.

That is significantly more complex than integrating an IdP.

Enrollment Is Identity Design

Enrollment is often treated as a one-time setup.

It is not.

Enrollment defines:

Who is allowed to create a credential?
How is that identity verified?
What trust anchor validates the user at registration?

Example (ASP.NET Core + OIDC bootstrap):

var externalUserId = claims.FindFirst("sub")?.Value;

var user = await FindOrCreateUser(externalUserId);

if (!user.WebAuthnCredentials.Any())
{
    return Redirect("/enable-passwordless");
}

Notice what happened:

OIDC verified identity.
Only then was a WebAuthn credential registered.

WebAuthn did not create identity.

It attached to it.

That ordering matters.

Recovery Is Where Identity Strategy Is Tested

The real test of maturity is not login success.

It’s failure recovery.

Lost device flow:

User authenticates via OIDC.
System validates sub claim.
Existing WebAuthn credentials are revoked.
New device registers fresh credential.

Example revocation logic:

_db.WebAuthnCredentials.RemoveRange(user.WebAuthnCredentials);
await _db.SaveChangesAsync();

Then redirect to registration.

This is structured recovery.

Without OIDC, you would need:

Email-only verification
Manual admin override
Or permanent account loss

None of those scale securely.
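
The lost-device steps above can be sketched end to end. This is an illustration only: it is written in JavaScript to match the frontend snippets, it uses an in-memory object where the real system uses a database, and the function name and the `oidcResult` shape are assumptions. The ordering is the point: nothing is revoked until the federated identity has been verified and mapped to a known user.

```javascript
// In-memory stand-in for the user and credential tables.
// oidcResult is a hypothetical object produced after ID token validation.
function recoverAfterDeviceLoss(store, oidcResult) {
  if (!oidcResult || !oidcResult.verified || !oidcResult.sub) {
    return { ok: false, reason: "oidc-not-verified" };
  }
  const user = store.users.find((u) => u.externalId === oidcResult.sub);
  if (!user) {
    return { ok: false, reason: "unknown-user" };
  }
  // Revoke every existing credential for this user, then force re-registration.
  const before = store.credentials.length;
  store.credentials = store.credentials.filter((c) => c.userId !== user.id);
  return { ok: true, revoked: before - store.credentials.length, next: "/enable-passwordless" };
}
```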

Device-Bound Authentication Is Not Portable Identity

WebAuthn credentials are bound to:

Origin
Device
RP ID

They are intentionally non-transferable.

That’s why they’re secure.

But identity is portable.

Identity must:

Survive device turnover
Integrate with external systems
Be recognized across services

That’s federation.

Federation Is Not the Enemy of Passwordless

There’s a misconception:

“If I use OIDC fallback, I weaken passwordless.”

That only happens when fallback bypasses verification.

In this architecture:

OIDC never created a session automatically.
Backend validated ID token.
Internal user mapping occurred.
HTTP-only cookie issued by the system itself.

OIDC proved identity.

WebAuthn proved possession.

The trust boundaries remained intact.

Architectural Maturity Means Layering

Let’s describe the trust model clearly.

Layer 1: Federation (Feide)

Asserts institutional identity
Manages upstream lifecycle
Provides recovery

Layer 2: Passwordless (WebAuthn)

Proves device possession
Phishing-resistant
Per-origin authentication

Layer 3: Session (HTTP-only cookie)

Server-controlled
Revocable
Protected from JS

Layer 4: Authorization

Application-level access control
Role management

Each layer solves a different problem.

No single layer replaces the others.

The Real Question

When designing authentication, the mature question is not:

“How do we eliminate passwords?”

It is:

“How do we design identity continuity over time?”

Passwordless improves authentication strength.

Federation ensures identity stability.

Together, they create resilience.

What Happens If You Ignore This

If passwordless stands alone:

Enrollment becomes fragile.
Recovery becomes weak.
Identity merging becomes manual.
Device loss becomes a support nightmare.
Organizational integration becomes impossible.

The system becomes secure in theory, brittle in reality.

The Strategic Insight

Passwordless is a mechanism.

Identity strategy is a lifecycle.

Mechanisms can be secure.

Lifecycles must be resilient.

This architecture works because:

It does not idolize passwordless.
It positions WebAuthn as primary.
It retains OIDC as structured fallback.
It treats recovery as planned, not emergency.
It separates identity from possession.

That separation is the mark of architectural maturity.

Final Reflection

Passwordless alone is not enough.

Not because it’s weak.

But because identity is larger than authentication.

A secure system must answer:

Who are you?
Can you prove it now?
What happens if you lose your device?
How do we recognize you tomorrow?
How do we integrate with your organization?

WebAuthn answers one of those questions exceptionally well.

Federation answers the rest.

Designing both — intentionally — is what turns passwordless from a feature into an identity strategy.

Passwordless: What Worked, What Didn’t, What I’d Change

Nguyễn Việt Tùng — Thu, 19 Feb 2026 03:14:51 GMT

When designing a passwordless-first PWA architecture, the diagram looks elegant.

In production, elegance collides with:

Browser inconsistencies
Institutional identity constraints
Support tickets
Device lifecycle chaos
Monitoring blind spots

Let’s break it down honestly.

What Worked

1️⃣ WebAuthn as Primary Authentication

This worked better than expected.

Users quickly adapted to:

Fingerprint
Face recognition
Device PIN

Support requests about “forgot password” dropped to zero — because passwords were gone.

The combination of:

var result = await _fido2.MakeAssertionAsync(...)

and:

await HttpContext.SignInAsync("Cookies", principal);

proved stable and predictable once encoding and session handling were correct.

The strongest success signal:

No phishing-related login issues after deployment.

That is not common.

2️⃣ HTTP-only Cookie Sessions

Avoiding JWT-in-localStorage was absolutely the right call.

options.Cookie.HttpOnly = true;
options.Cookie.SecurePolicy = CookieSecurePolicy.Always;

Benefits:

XSS impact minimized
Simpler revocation model
Clear session lifetime control

Operationally, this reduced attack surface significantly.

3️⃣ Clear Decision Tree

My initial flowchart saved me.

Because when things broke, I always knew which branch was responsible:

WebAuthn failure?
OIDC fallback?
Session misconfiguration?
Credential lifecycle issue?

That clarity matters more than people realize.

Trade-offs I Accepted Knowingly

1️⃣ No Attestation Verification

AttestationConveyancePreference.None

Trade-off:

No hardware manufacturer validation
No enforcement of hardware-backed keys

Why I accepted it:

Lower operational complexity
Better privacy posture
Reduced metadata dependency

In institutional context, identity assurance was already upstream via Feide.

2️⃣ Preferred Instead of Required User Verification

UserVerificationRequirement.Preferred

Trade-off:

Allows authenticators without biometric enforcement
Slightly lower strictness

Why:

Broader device compatibility
Fewer user lockouts
Reduced friction in older hardware environments

Security posture was balanced against accessibility.

3️⃣ No Offline Authentication

PWA expectation:
“It’s installed. It should work offline.”

Reality:
WebAuthn requires a fresh server-issued challenge.

I chose not to simulate offline authentication using cached tokens beyond session lifetime.

Trade-off:

Some UX friction
Stronger trust model

Security > illusion of offline login.

What Looked Good on Paper But Failed in Reality

1️⃣ “Users Will Immediately Register Passwordless”

They didn’t.

Even after OIDC login, many skipped enabling passwordless.

The elegant flow:

if (!user.WebAuthnCredentials.Any())
    return Redirect("/enable-passwordless");

In reality:
Users ignored prompts.

Lesson:
Make passwordless enrollment prominent and incentivized.

2️⃣ Counter Strictness

Initially:

if (result.Counter <= storedCounter)
    throw new SecurityException("Possible cloned authenticator");

This caused false positives.

Some authenticators:

Always returned 0
Didn’t increment reliably

Lesson:
Spec compliance is messier than the spec implies.

Relaxed logic to handle zero counters more intelligently.
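
The exact relaxed rule is not shown here, so the following is a sketch of one reasonable version rather than the production logic. It follows the WebAuthn specification's guidance: when both the stored and the received counter are zero, the authenticator does not implement a counter and the check is skipped; otherwise the received value must be strictly greater. The function name and return labels are made up, and the sketch is in JavaScript only to match the frontend snippets.

```javascript
// Returns "ok", "skip" (authenticator has no counter), or "suspect".
function checkSignatureCounter(storedCounter, receivedCounter) {
  if (storedCounter === 0 && receivedCounter === 0) {
    return "skip"; // counter not supported by this authenticator
  }
  // At least one side is non-zero: the counter must strictly increase.
  return receivedCounter > storedCounter ? "ok" : "suspect";
}
```

Treating "suspect" as a risk signal to log and review, rather than as an automatic exception, is a policy choice the specification leaves to the relying party.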

3️⃣ Browser Error Consistency

I assumed:

“All browsers implement WebAuthn uniformly.”

Reality:

Different error messages
Different cancellation behaviors
Slight timing differences

VueJS error handling needed refinement:

catch (err) {
  if (err.name === "NotAllowedError") {
    showRetry();
  } else {
    showFallbackOption();
  }
}

UX required careful branching.

Operational Lessons

1️⃣ Logging Matters More Than Crypto

You need logs for:

Challenge generation
Assertion verification result
Counter updates
OIDC callback mapping
Session creation

Example structured logging:

_logger.LogInformation("WebAuthn assertion verified for user {UserId}, counter updated to {Counter}",
    user.Id, result.Counter);

Without this, debugging failures becomes guesswork.

2️⃣ Monitoring Authentication Metrics

Track:

WebAuthn success rate
WebAuthn failure rate
OIDC fallback frequency
Credential registrations per day
Counter mismatch events

These reveal patterns:

Device compatibility issues
Misconfiguration
User confusion

Authentication is not “set and forget.”

3️⃣ Support Edge Cases

Real tickets included:

“My fingerprint stopped working after OS update.”
“I cleared my browser data and now can’t log in.”
“I logged in via Feide but it says no account.”

Each required:

Clear recovery path
Transparent error messaging
Internal documentation

Edge cases are not rare. They are normal.

4️⃣ Account Linking Confusion

Some users had:

Multiple institutional identities
Email changes
Duplicate accounts

Relying solely on email would have been disastrous.

Using the sub claim for linking was critical:

var externalUserId = claims.FindFirst("sub")?.Value;

Stable identifiers are everything.

What I Would Change

1️⃣ Stronger Enrollment Enforcement

Instead of optional passwordless enablement:

I would require it after first successful OIDC login.

Security adoption improves when it’s default, not optional.

2️⃣ Better Device Management UI

Users should see:

List of registered devices
Last used timestamp
Revoke option

Backend model already supports it:

SELECT * FROM WebAuthnCredentials WHERE UserId = @UserId

But UX should surface it more clearly.

3️⃣ Structured Monitoring Dashboard

Real-time visibility into:

Assertion failures
Counter mismatches
OIDC errors

Would reduce reactive debugging.

4️⃣ Automated Credential Health Checks

Periodic validation:

Detect stale counters
Detect inactive credentials
Flag suspicious behavior

WebAuthn gives strong primitives. Monitoring must match.
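
As an example of what such a check could look like, here is a sketch that flags inactive credentials. It assumes a `lastUsedAt` field that would have to be recorded on each successful assertion; that field, the function name, and the 180-day default are all assumptions for illustration.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Flags credentials that have not been used for longer than maxIdleDays.
// lastUsedAt is an assumed field; createdAt is used when it is missing.
function findInactiveCredentials(credentials, now, maxIdleDays = 180) {
  return credentials.filter((c) => {
    const lastSeen = c.lastUsedAt ?? c.createdAt;
    return now.getTime() - lastSeen.getTime() > maxIdleDays * DAY_MS;
  });
}
```

Flagged credentials are candidates for a reminder or a review, not for silent deletion.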

The Big Lesson

The hardest part of passwordless authentication is not cryptography.

It is lifecycle management.

WebAuthn works.

OIDC works.

HTTP-only cookies work.

But the real challenge is designing:

Failure handling
Device transitions
Recovery paths
Operational visibility

Security architecture is not proven at deployment.

It is proven over time.

Final Reflection

If I rebuilt this system:

I would keep passwordless-first.
I would keep Feide federation.
I would keep server-controlled sessions.
I would invest earlier in monitoring and enrollment enforcement.

What surprised me most?

How much calmer authentication became once passwords were gone.

No resets.
No reuse.
No phishing alerts.

Just possession proof + federated identity continuity.

That combination feels less like a feature and more like an upgrade to the trust model of the application itself.

And that, ultimately, was the goal of the entire series. This concludes the core series, but the optional extras articles are still there if you want to go further.

Integrating OIDC (Feide) as Fallback and Recovery

Nguyễn Việt Tùng — Wed, 18 Feb 2026 02:59:25 GMT

WebAuthn gave us phishing-resistant, device-bound authentication.
But devices get lost. Browsers reset. Users switch laptops. Institutions manage identities centrally.

That’s where OIDC (Feide) enters — not as a competitor to passwordless, but as structural support.

This article walks through my real implementation:

Frontend: VueJS PWA
Backend: ASP.NET Core
Database: SQL Server
Passwordless: fido2-net-lib
Federation: OpenID Connect (Feide)
Session: HTTP-only cookie

And we’ll focus on four things:

What Feide brings to the table
How OIDC fits without undermining WebAuthn
Security boundaries between IdP and my system
Account linking in practice

Disclaimer

This article describes architectural patterns and technical approaches based on a real-world implementation. All examples, code snippets, and flow descriptions have been generalized and simplified for educational purposes. No proprietary business logic, confidential configurations, credentials, or organization-specific details are disclosed. The focus is strictly on publicly documented standards (WebAuthn, OIDC) and implementation patterns within a standard VueJS + ASP.NET Core + SQL Server stack.

What Feide brings to the table

Feide is widely used in Norwegian education and research sectors. That matters for three reasons:

1️⃣ Institutional Identity Already Exists

Users already have:

A managed identity
Centralized credential lifecycle
Organizational trust

Recreating identity inside my PWA would be redundant and weaker.

2️⃣ Compliance & Governance

Institutional IdPs typically enforce:

MFA policies
Password strength
Account revocation
Auditing

By integrating Feide, my system inherits that upstream assurance without storing passwords.

3️⃣ Recovery and Bootstrap

WebAuthn is device-bound.

Feide provides:

Cross-device identity continuity
Secure account recovery
Bootstrap trust for new devices

How OIDC Fits Without Undermining Passwordless

The common fear:

“If I add OIDC fallback, doesn’t that weaken passwordless?”

Only if fallback is careless.

My architecture enforces this model:

WebAuthn = primary authentication
Feide OIDC = bootstrap + recovery
HTTP-only cookie = session integrity
SQL Server = credential persistence

Feide does not authenticate users inside my system directly.

Feide asserts identity.

WebAuthn proves device possession.

Those are different trust layers.

Real OIDC Integration (ASP.NET Core)

I implemented the Authorization Code flow with PKCE.

OIDC Configuration

services.AddAuthentication(options =>
{
    options.DefaultScheme = "Cookies";
    options.DefaultChallengeScheme = "oidc";
})
.AddCookie("Cookies", options =>
{
    options.Cookie.HttpOnly = true;
    options.Cookie.SecurePolicy = CookieSecurePolicy.Always;
    options.Cookie.SameSite = SameSiteMode.Lax;
})
.AddOpenIdConnect("oidc", options =>
{
    options.Authority = "https://auth.feide.no";
    options.ClientId = Configuration["Feide:ClientId"];
    options.ClientSecret = Configuration["Feide:ClientSecret"];
    options.ResponseType = "code";
    options.SaveTokens = false;
    options.GetClaimsFromUserInfoEndpoint = true;

    options.Scope.Add("openid");
    options.Scope.Add("profile");
    options.Scope.Add("email");

    options.TokenValidationParameters.NameClaimType = "name";
});

Important detail:

options.SaveTokens = false;

With SaveTokens disabled, the IdP tokens are not persisted into the authentication cookie, so they never reach the browser.

You convert identity into a server-controlled session.

OIDC Callback Flow

[HttpGet("callback")]
public async Task<IActionResult> Callback()
{
    var authenticateResult = await HttpContext.AuthenticateAsync("oidc");

    if (!authenticateResult.Succeeded)
        return Unauthorized();

    var externalUserId = authenticateResult.Principal.FindFirst("sub")?.Value;

    var user = await FindOrCreateUser(externalUserId);

    SignInUser(user.Id);

    if (!user.WebAuthnCredentials.Any())
        return Redirect("/enable-passwordless");

    return Redirect("/dashboard");
}

This is critical:

Feide proves identity.
The system maps that identity to an internal user record.
The system issues session cookie.

The IdP does not create sessions in my system.

Security Boundaries Between OIDC and My System

Understanding boundaries prevents architectural confusion.

What OIDC Is Responsible For

Authenticating the user upstream
Issuing ID tokens
Managing institutional identity lifecycle
Enforcing upstream MFA policies

What My System Is Responsible For

Mapping external identity (sub) to internal user
Managing WebAuthn credentials
Verifying FIDO2 assertions
Issuing and invalidating session cookies
Authorization within my application

OIDC is not trusted to:

Authorize application actions
Manage WebAuthn devices
Maintain the session integrity

Trust is layered, not delegated.

Account Linking Considerations

This is where real complexity lives.

OIDC provides:

{
  "sub": "abcd1234",
  "email": "user@example.edu"
}

But what if:

Email changes?
User logs in with different institutional account?
Duplicate local account exists?

You must choose a stable linking strategy.
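
A minimal sketch of such a strategy, assuming the link is keyed on the issuer plus the `sub` claim (OIDC only guarantees that `sub` is unique within one issuer) and that email is treated as a mutable attribute. The `Map`-based store, the `iss` value in the usage, and the record shape are stand-ins for illustration; the production version runs on the backend against the database.

```javascript
// Links on (issuer, sub); email is stored as a mutable attribute only.
// users is a Map used here as a stand-in for the user table.
function findOrCreateUser(users, claims) {
  const key = `${claims.iss}|${claims.sub}`;
  let user = users.get(key);
  if (!user) {
    user = { id: users.size + 1, externalKey: key, email: claims.email };
    users.set(key, user);
  } else if (user.email !== claims.email) {
    // Email changed upstream: update the attribute, do not create a duplicate.
    user.email = claims.email;
  }
  return user;
}
```

With this shape, an email change never produces a second account, and a login from a different institutional account yields a distinct key that can be merged deliberately rather than by accident.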

Recovery Flow Using Feide

Lost device scenario:

User clicks “Login with Feide”
OIDC completes
Identity verified
System invalidates old WebAuthn credentials
User registers new credential

Example revocation:

_db.WebAuthnCredentials.RemoveRange(user.WebAuthnCredentials);
await _db.SaveChangesAsync();

Then redirect to registration.

Recovery is structured. Not improvised.

Why This Does Not Undermine Passwordless

Weak fallback undermines security when:

It bypasses verification
It skips policy
It exists only as emergency shortcut

My implementation ensures:

OIDC must complete successfully
Session is server-issued
WebAuthn remains primary method
Registration after OIDC is explicit

This maintains assurance.

VueJS PWA Integration

From frontend:

function loginWithFeide() {
  window.location.href = "/api/auth/feide-login";
}

No tokens stored client-side.
No JWT in localStorage.
No client-managed identity state.

The PWA only reacts to session cookie.

This keeps attack surface small.
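
In practice, "reacting to the session cookie" means asking the backend who the current user is and letting the browser attach the cookie. The sketch below assumes a hypothetical `/api/auth/session` endpoint that returns 401 without a valid session and a JSON user object otherwise; the endpoint, the function name, and the injectable `fetchImpl` parameter are illustrative and not taken from the real system.

```javascript
// Asks the backend whether the session cookie is valid.
// "/api/auth/session" is a hypothetical endpoint; fetchImpl is injectable for testing.
async function getSessionState(fetchImpl = fetch) {
  const response = await fetchImpl("/api/auth/session", { credentials: "same-origin" });
  if (response.status === 401) {
    return { authenticated: false }; // no session: show the login screen
  }
  if (!response.ok) {
    throw new Error(`Session check failed with status ${response.status}`);
  }
  // The cookie itself is never readable from JavaScript; only the result is.
  return { authenticated: true, user: await response.json() };
}
```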

What This Architecture Achieves

By combining:

WebAuthn (device-bound proof)
Feide OIDC (identity continuity)
SQL Server (credential persistence)
HTTP-only cookies (session security)

You achieve:

Phishing resistance
Device lifecycle resilience
Institutional identity integration
Controlled fallback
Clear trust boundaries

Most importantly:

You avoid false dichotomy.

This is not:

“Passwordless vs Federation.”

It is:

“Passwordless for authentication. Federation for identity continuity.”

Final Reflection

Integrating OIDC did not weaken the system.

It completed it.

WebAuthn without federation is brittle.
Federation without WebAuthn is phishable.

Together, they form a layered trust architecture.

In the next article, we’ll examine operational lessons learned after deploying this combined system — including monitoring, auditing, and real-world behavioral patterns that only surface after production traffic begins.

Because authentication design doesn’t end at implementation.

It evolves under pressure.

Implementing WebAuthn in Practice

Nguyễn Việt Tùng — Tue, 17 Feb 2026 08:38:24 GMT

WebAuthn looks deceptively simple at a high level:

Generate challenge
Call browser API
Verify signature
Done

In practice, it is not that simple.

WebAuthn is cryptographically elegant but operationally unforgiving.
Small mistakes create subtle security gaps or inexplicable failures.

This article walks through:

The tooling used
The data model design
Real code from ASP.NET Core + VueJS
Common pitfalls
And what surprised me during implementation

Disclaimer

Tooling Used

Backend: `fido2-net-lib`

For .NET Core, fido2-net-lib is one of the most mature and spec-compliant WebAuthn libraries available.

It handles:

Challenge generation
Attestation verification
Assertion verification
Counter validation
Origin validation
Credential parsing

Initialization:

var fido2 = new Fido2(new Fido2Configuration
{
    ServerDomain = "yourdomain.com",
    ServerName = "Your App",
    Origin = "https://yourdomain.com"
});

The important realization:

The library handles cryptography —
You must handle state.

Frontend: Native WebAuthn API

In VueJS, no heavy library was required.
The browser already implements WebAuthn.

Registration:

const credential = await navigator.credentials.create({
  publicKey: options
});

Authentication:

const assertion = await navigator.credentials.get({
  publicKey: options
});

However:

You must convert Base64URL fields correctly between server and client.

This is one of the first places things break.
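
A minimal sketch of the server-to-client half of that conversion, assuming the backend serializes `challenge` and each `allowCredentials[].id` as Base64URL strings. The helper names `base64UrlToBuffer` and `decodeRequestOptions` are mine for illustration, not part of any library:

```javascript
// Decode a Base64URL string (no padding, "-" and "_" alphabet) into an ArrayBuffer.
function base64UrlToBuffer(value) {
  const base64 = value.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
  const binary = atob(padded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes.buffer;
}

// Turn the JSON options from the server into what navigator.credentials.get expects.
// Assumption: only `challenge` and `allowCredentials[].id` arrive as strings.
function decodeRequestOptions(options) {
  return {
    ...options,
    challenge: base64UrlToBuffer(options.challenge),
    allowCredentials: (options.allowCredentials || []).map((cred) => ({
      ...cred,
      id: base64UrlToBuffer(cred.id),
    })),
  };
}
```

Under those assumptions the call becomes `navigator.credentials.get({ publicKey: decodeRequestOptions(options) })`.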

Data Model Design (SQL Server)

This is where real decisions matter.

A WebAuthn credential is not just an ID.

Here’s the simplified SQL model:

CREATE TABLE WebAuthnCredentials (
    Id UNIQUEIDENTIFIER PRIMARY KEY,
    UserId UNIQUEIDENTIFIER NOT NULL,
    CredentialId VARBINARY(MAX) NOT NULL,
    PublicKey VARBINARY(MAX) NOT NULL,
    SignatureCounter BIGINT NOT NULL,
    CreatedAt DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);

Why VARBINARY?

Because:

Credential IDs are binary.
Public keys are binary (COSE format).
Storing them as strings introduces encoding risk.

Why store SignatureCounter?

The counter protects against cloned authenticators.

If an authenticator that uses counters reports a new counter ≤ the stored counter, something is wrong.

WebAuthn security is incomplete without counter tracking.

Registration Flow (Real Implementation)

Step 1: Generate Options

[HttpPost("register-options")]
public IActionResult RegisterOptions()
{
    var user = GetCurrentUser();

    var options = _fido2.RequestNewCredential(
        new Fido2User
        {
            Id = Encoding.UTF8.GetBytes(user.Id.ToString()),
            Name = user.Email,
            DisplayName = user.Email
        },
        new List<PublicKeyCredentialDescriptor>(),
        AuthenticatorSelection.Default,
        AttestationConveyancePreference.None
    );

    HttpContext.Session.SetString("fido2.attestationChallenge", options.Challenge);

    return Ok(options);
}

Notice:

Challenge is stored server-side.
Attestation preference set to None (privacy-friendly).
No credentials excluded in this example.

Step 2: Verify Attestation

[HttpPost("verify-registration")]
public async Task<IActionResult> VerifyRegistration([FromBody] AuthenticatorAttestationRawResponse attestation)
{
    var challenge = HttpContext.Session.GetString("fido2.attestationChallenge");

    var result = await _fido2.MakeNewCredentialAsync(
        attestation,
        new List(),
        (args) => args.Challenge == challenge
    );

    var credential = new WebAuthnCredential
    {
        UserId = GetCurrentUserId(),
        CredentialId = result.Result.CredentialId,
        PublicKey = result.Result.PublicKey,
        SignatureCounter = result.Result.Counter
    };

    _db.WebAuthnCredentials.Add(credential);
    await _db.SaveChangesAsync();

    return Ok();
}

Key insight:

The challenge validator delegate must explicitly check equality.

Do not assume the library does that for you.

Authentication Flow (Assertion)

Generate Assertion Options

var options = _fido2.GetAssertionOptions(
    storedCredentials,
    UserVerificationRequirement.Preferred
);

HttpContext.Session.SetString("fido2.challenge", options.Challenge);

Verify Assertion

var result = await _fido2.MakeAssertionAsync(
    clientResponse,
    storedCredential.PublicKey,
    storedCredential.SignatureCounter,
    args => args.Challenge == challenge
);

storedCredential.SignatureCounter = result.Counter;
await _db.SaveChangesAsync();

The counter update is not optional.

It is what makes clone detection work on the next login.

Common Implementation Pitfalls

1. Base64URL encoding mismatches

Browser returns ArrayBuffers.
ASP.NET expects byte arrays.

If encoding conversion is inconsistent, verification fails silently.

Solution: Use consistent Base64URL encoding utilities.

Example

const assertion = await navigator.credentials.get({ publicKey: options });

await fetch("/api/auth/verify-webauthn", {
  method: "POST",
  body: JSON.stringify(assertion)
});

Problem: assertion.rawId is an ArrayBuffer — not Base64URL.

Explicit conversion helpers:

function bufferToBase64Url(buffer) {
  return btoa(String.fromCharCode(...new Uint8Array(buffer)))
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=/g, '');
}
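
With a helper like that in place, the client can build a plain JSON object instead of passing the credential straight to `JSON.stringify`. This is a sketch, assuming the backend accepts the standard assertion fields as Base64URL strings. `serializeAssertion` is a hypothetical name, and the conversion helper is restated so the block runs on its own:

```javascript
// Encode an ArrayBuffer as Base64URL (loop form, safe for larger buffers).
function bufferToBase64Url(buffer) {
  const bytes = new Uint8Array(buffer);
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary).replace(/\+/g, "-").replace(/\//g, "_").replace(/=/g, "");
}

// Copy every binary field of the assertion into a JSON-friendly object.
function serializeAssertion(assertion) {
  const response = assertion.response;
  return {
    id: assertion.id,
    rawId: bufferToBase64Url(assertion.rawId),
    type: assertion.type,
    response: {
      authenticatorData: bufferToBase64Url(response.authenticatorData),
      clientDataJSON: bufferToBase64Url(response.clientDataJSON),
      signature: bufferToBase64Url(response.signature),
      // userHandle is optional and may be null.
      userHandle: response.userHandle ? bufferToBase64Url(response.userHandle) : null,
    },
  };
}
```

The request body then becomes `JSON.stringify(serializeAssertion(assertion))`, sent with a `Content-Type: application/json` header.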

2. Forgetting challenge persistence

If the challenge:

is not stored,
or stored per user incorrectly,
or overwritten in concurrent requests,

verification fails.

Challenge must be:

short-lived,
per session,
non-reusable.

Example

HttpContext.Session.SetString("challenge", options.Challenge);

Then later:

var challenge = HttpContext.Session.GetString("fido2.challenge");

Using two different keys introduces this bug:

Fido2VerificationException: Challenge mismatch

Or:

Fido2VerificationException: Invalid challenge.

3. Not validating origin

Origin mismatch is a common deployment issue.

If your production URL differs from development configuration, authentication breaks.

Example:

Your production:

https://app.yourdomain.com

But config says:

Origin = "https://yourdomain.com"

The subdomain mismatch leads to an error such as:

Fido2VerificationException: Invalid origin

Or:

Origin https://app.yourdomain.com does not match expected https://yourdomain.com
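
To make the comparison concrete, here is a sketch of what an origin check amounts to. The library does this for you against its configured origin. The allow-list and function names below are hypothetical and only illustrate why a subdomain difference is a hard failure:

```javascript
// Hypothetical allow-list; in practice this comes from per-environment configuration.
const EXPECTED_ORIGINS = new Set(["https://app.yourdomain.com"]);

// clientDataJSON is the UTF-8 JSON document produced by the browser.
function readClientData(clientDataJSON) {
  return JSON.parse(new TextDecoder().decode(clientDataJSON));
}

function isExpectedOrigin(clientDataJSON, expectedOrigins = EXPECTED_ORIGINS) {
  const { origin } = readClientData(clientDataJSON);
  // Exact string match: scheme, host, and port all have to agree.
  return expectedOrigins.has(origin);
}
```

There is no partial credit: `https://yourdomain.com` and `https://app.yourdomain.com` are different origins, so the configured value has to match each deployed environment exactly.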

4. Counter mishandling

Some authenticators:

return 0 initially.
do not increment as expected.

Your logic must handle legitimate zero counters.

Rejecting zero blindly causes user lockout.

Example

Authenticator returns:

counter = 0

The stored counter is also 0.

Your logic:

if (result.Counter <= storedCounter)
{
    throw new SecurityException("Possible cloned authenticator");
}

The result is an immediate lockout, with an error such as:

Fido2VerificationException: Signature counter did not increase.

Or your own thrown exception:

Possible cloned authenticator detected.

Correct logic: Enforce monotonicity only when the stored counter or the received counter is non-zero.
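
That rule fits in a few lines. This is a sketch, with `checkSignatureCounter` as a hypothetical helper name. It follows the WebAuthn rule that two zero counters mean the authenticator does not implement one:

```javascript
// Decide whether a received signature counter is acceptable.
function checkSignatureCounter(storedCounter, receivedCounter) {
  // Both zero: this authenticator does not use a counter, so there is nothing to compare.
  if (storedCounter === 0 && receivedCounter === 0) {
    return { ok: true, nextCounter: 0 };
  }
  // Otherwise the counter has to strictly increase.
  if (receivedCounter > storedCounter) {
    return { ok: true, nextCounter: receivedCounter };
  }
  // Equal or lower: possible cloned authenticator. Keep the stored value.
  return { ok: false, nextCounter: storedCounter };
}
```

What to do with `ok: false` is a policy decision. Treating it as a risk signal that triggers identity re-verification is gentler than a hard lockout.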

5. Misunderstanding attestation

Attestation lets the server verify the authenticator's make and model.

Most applications do not need this.

Setting AttestationConveyancePreference.None:

avoids privacy concerns,
reduces complexity,
avoids metadata verification headaches.

Example:

You enable:

AttestationConveyancePreference.Direct

Now the browser returns the full attestation.

But you don’t validate metadata, which results in:

Fido2VerificationException: Attestation format not supported

Or:

Fido2VerificationException: No metadata service configured

Bonus: Browser-Side Errors

User Cancels

DOMException: The operation was aborted.

Not Allowed

DOMException: The user aborted a request.

Unsupported Platform

NotSupportedError: The operation is not supported.

These are not backend problems — but your UX must handle them gracefully.
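
Message text differs between browsers, so it is safer to branch on the error's `name` than on its message. Here is a sketch of that branching. The returned labels (`"retry"`, `"fallback"`, `"already-registered"`) are hypothetical UX outcomes, not anything defined by the API:

```javascript
// Map a rejected navigator.credentials call to a UX decision.
function classifyWebAuthnError(err) {
  switch (err && err.name) {
    case "NotAllowedError": // prompt dismissed by the user, or it timed out
    case "AbortError": // the page cancelled the call through an AbortSignal
      return "retry";
    case "InvalidStateError": // typically: this authenticator is already registered
      return "already-registered";
    case "NotSupportedError": // requested options are not supported on this platform
    case "SecurityError": // e.g. the RP ID is not valid for the current origin
      return "fallback";
    default:
      return "fallback";
  }
}
```

Used inside a `catch (err)` block, this keeps the retry button for recoverable cases and sends everything else to the federated login.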

What Surprised Me During Implementation

1. How much state management matters

The cryptography is handled by the library.

The complexity lives in:

challenge storage,
session lifecycle,
device registration state,
error branching.

WebAuthn is less about math and more about disciplined state handling.

2. Browser inconsistencies

Different browsers:

format errors differently,
handle cancellation differently,
vary in UI timing.

Your retry UX must account for that.

3. The importance of fallback

The first time a device:

failed biometric recognition,
or returned unexpected counter values,

I realized:

Passwordless-only systems are fragile.

Fallback is not optional.

4. Offline expectations vs reality

Because this is a PWA, users assume:

“It’s installed. It should just work.”

But WebAuthn requires:

live challenge from server,
real-time verification.

Offline login is not true authentication.

Designing expectations around that was essential.

5. The psychological difference

Once implemented properly:

Users stopped typing passwords.

They trusted the system more.

That was not because of UI polish.

It was because:

no secrets were transmitted,
no reset emails were needed,
no password rules existed.

Security felt natural.

That is rare.

Final Reflection

Implementing WebAuthn is not:

copying code from documentation,
adding biometric login,
or flipping a feature flag.

It is:

modeling credentials correctly,
handling state carefully,
validating challenges strictly,
updating counters reliably,
integrating session management securely.

It is architecture expressed through code.

In the next article, we’ll examine the integration of Feide OIDC in more depth — including account linking, token validation, and how federated identity interacts with my passwordless credential lifecycle.

Because WebAuthn proves possession.

Federation proves identity continuity.

Both are required for resilient systems.


Passwordless PWA Flow Architecture Walkthrough

Nguyễn Việt Tùng — Mon, 16 Feb 2026 11:05:54 GMT

Modern authentication diagrams are clean.

Real systems are not.

My architecture intentionally combines:

WebAuthn (FIDO2) for phishing-resistant authentication
Feide (OIDC) for federated identity, recovery, and bootstrap
SQL Server for credential persistence
HTTP-only cookies for secure session handling
VueJS PWA as the user-facing layer

At the center of the system is one key decision:

Does this user already have passwordless enabled?

Everything branches from there.

The Real Flowchart: The System as a Decision Tree

My initial flowchart expresses the core logic clearly:

User requests authentication.
System checks: Has passwordless been enabled?
If yes → Attempt WebAuthn authentication.
If no → Redirect to Feide OIDC.
If WebAuthn fails → Allow retry.
If retries exhausted → End or fallback.
After successful OIDC → Offer passwordless registration.

This is not UX decoration.

This is an explicit trust state machine.
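
Written as a pure function, the tree looks roughly like this. It is a sketch: the retry budget of three and the step labels are assumptions of mine, and I resolve "End or fallback" as fallback to OIDC:

```javascript
const MAX_WEBAUTHN_ATTEMPTS = 3; // hypothetical retry budget

// Decide the next step from the current account and attempt state.
function nextAuthStep({ hasCredentials, failedAttempts = 0, justCompletedOidc = false }) {
  if (justCompletedOidc) {
    // After a successful OIDC login, offer the upgrade to passwordless.
    return hasCredentials ? "dashboard" : "offer-registration";
  }
  if (!hasCredentials) {
    return "oidc"; // passwordless not enabled
  }
  if (failedAttempts >= MAX_WEBAUTHN_ATTEMPTS) {
    return "oidc"; // retries exhausted
  }
  return "webauthn"; // first attempt or retry
}
```

Keeping the decision in one function makes every branch testable without a browser or an authenticator.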

Let’s walk through it step by step with real code.

Step 1 — VueJS PWA: Begin Authentication

The PWA does not guess the strategy.

It asks the backend.

// VueJS (Composition API)
async function beginLogin(email) {
  const response = await fetch("/api/auth/begin", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ email })
  });

  const result = await response.json();

  if (result.strategy === "webauthn") {
    await startWebAuthn(result.options);
  } else if (result.strategy === "oidc") {
    window.location.href = result.redirectUrl;
  }
}

The browser is a mediator.
It does not decide trust.

Step 2 — ASP.NET Core: Decide the Strategy

My backend controls the trust graph.

[HttpPost("begin")]
public async Task<IActionResult> Begin([FromBody] LoginRequest request)
{
    var user = await _db.Users
        .Include(u => u.Credentials)
        .FirstOrDefaultAsync(u => u.Email == request.Email);

    if (user == null || !user.Credentials.Any())
    {
        return Ok(new {
            strategy = "oidc",
            redirectUrl = BuildFeideRedirect()
        });
    }

    var fido2 = new Fido2(new Fido2Configuration
    {
        ServerDomain = "yourdomain.com",
        ServerName = "Your App",
        Origin = "https://yourdomain.com"
    });

    var options = fido2.GetAssertionOptions(
        user.Credentials.Select(c => new PublicKeyCredentialDescriptor(c.CredentialId)).ToList(),
        UserVerificationRequirement.Preferred
    );

    HttpContext.Session.SetString("fido2.challenge", options.Challenge);

    return Ok(new {
        strategy = "webauthn",
        options
    });
}

Why this branch exists:

WebAuthn only works if credentials exist.
Backend must know account state.
Trust decisions cannot be client-side.

Step 3 — WebAuthn Authentication (VueJS + Browser API)

async function startWebAuthn(options) {
  try {
    const assertion = await navigator.credentials.get({
      publicKey: options
    });

    const res = await fetch("/api/auth/verify-webauthn", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(assertion)
    });

    if (res.ok) {
      window.location.href = "/dashboard";
    } else {
      showRetryOption();
    }
  } catch (err) {
    showRetryOption();
  }
}

The browser enforces:

Origin binding
Authenticator interaction
User presence / verification

It does not verify the signature.

That’s backend responsibility.

Step 4 — Backend Verification with fido2-net-lib

[HttpPost("verify-webauthn")]
public async Task<IActionResult> Verify([FromBody] AuthenticatorAssertionRawResponse clientResponse)
{
    var challenge = HttpContext.Session.GetString("fido2.challenge");

    var storedCredential = await _db.Credentials
        .FirstOrDefaultAsync(c => c.CredentialId == clientResponse.Id);

    if (storedCredential == null)
        return Unauthorized();

    var fido2 = new Fido2(_config);

    var result = await fido2.MakeAssertionAsync(
        clientResponse,
        storedCredential.PublicKey,
        storedCredential.SignatureCounter,
        args => args.Challenge == challenge
    );

    storedCredential.SignatureCounter = result.Counter;
    await _db.SaveChangesAsync();

    SignInUser(storedCredential.UserId);

    return Ok();
}

This branch exists because:

Only the server verifies cryptographic proof.
Counters detect cloned authenticators.
Session issuance must be server-controlled.

private void SignInUser(Guid userId)
{
    var claims = new List<Claim>
    {
        new Claim(ClaimTypes.NameIdentifier, userId.ToString())
    };

    var identity = new ClaimsIdentity(claims, "Cookies");
    var principal = new ClaimsPrincipal(identity);

    HttpContext.SignInAsync("Cookies", principal, new AuthenticationProperties
    {
        IsPersistent = true,
        ExpiresUtc = DateTime.UtcNow.AddHours(8)
    });
}

Configured in Startup.cs:

services.AddAuthentication("Cookies")
    .AddCookie("Cookies", options =>
    {
        options.Cookie.HttpOnly = true;
        options.Cookie.SecurePolicy = CookieSecurePolicy.Always;
        options.Cookie.SameSite = SameSiteMode.Lax;
    });

Why HTTP-only cookie?

Protects against XSS token theft.
Avoids storing JWT in localStorage.
Keeps session server-controlled.

OIDC Fallback — Feide Integration

If passwordless is not enabled, redirect to Feide.

private string BuildFeideRedirect()
{
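    // Query values are URL-encoded: the redirect URI contains reserved characters
    // and the scope list contains spaces.
    return $"{_config["Feide:Authority"]}/authorize" +
           "?response_type=code" +
           $"&client_id={Uri.EscapeDataString(_config["Feide:ClientId"])}" +
           $"&redirect_uri={Uri.EscapeDataString(_config["Feide:RedirectUri"])}" +
           "&scope=openid%20profile%20email" +
           $"&code_challenge={GeneratePKCEChallenge()}" +
           "&code_challenge_method=S256";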
}
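
The PKCE helper is not shown above. For reference, here is a sketch of what the S256 method computes, written for Node.js rather than .NET. The function names are hypothetical, and the transformation itself is the one defined in RFC 7636:

```javascript
const { createHash, randomBytes } = require("node:crypto");

// S256: BASE64URL(SHA-256(ASCII(verifier))), without padding.
function pkceChallenge(verifier) {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}

// Create a fresh verifier/challenge pair for one authorization request.
function createPkcePair() {
  // 32 random bytes encode to a 43-character Base64URL verifier.
  const verifier = randomBytes(32).toString("base64url");
  return { verifier, challenge: pkceChallenge(verifier) };
}
```

Only the challenge goes into the redirect URL. The verifier has to be kept server-side until the callback, where it is sent as `code_verifier` in the token exchange.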

Callback endpoint:

[HttpGet("callback")]
public async Task<IActionResult> Callback(string code)
{
    var token = await ExchangeCodeForToken(code);

    var idToken = ValidateIdToken(token.IdToken);

    var user = await FindOrCreateUser(idToken.Sub);

    SignInUser(user.Id);

    if (!user.Credentials.Any())
        return Redirect("/enable-passwordless");

    return Redirect("/dashboard");
}

This branch exists because:

Devices are lost.
Users switch devices.
Federation provides lifecycle continuity.
OIDC provides bootstrap trust.

WebAuthn Registration After OIDC

When enabling passwordless:

[HttpPost("register-options")]
public IActionResult RegisterOptions()
{
    var user = GetCurrentUser();

    var options = _fido2.RequestNewCredential(
        new Fido2User
        {
            DisplayName = user.Email,
            Id = Encoding.UTF8.GetBytes(user.Id.ToString()),
            Name = user.Email
        },
        new List<PublicKeyCredentialDescriptor>(),
        AuthenticatorSelection.Default,
        AttestationConveyancePreference.None
    );

    HttpContext.Session.SetString("fido2.attestationChallenge", options.Challenge);

    return Ok(options);
}

Client registers via navigator.credentials.create.

Server verifies and stores:

_db.Credentials.Add(new Credential {
    UserId = user.Id,
    CredentialId = result.Result.CredentialId,
    PublicKey = result.Result.PublicKey,
    SignatureCounter = result.Result.Counter
});

Why this branch exists:

Passwordless-first upgrades users.
Registration is explicit.
Device lifecycle is managed.

Role Breakdown

Browser (VueJS PWA)

Initiates flows
Calls WebAuthn API
Handles redirects
Does not store session tokens

Authenticator

Stores private key
Verifies biometric locally
Signs challenges
Never exposes key

Backend (.NET Core)

Controls strategy
Generates challenges
Verifies assertions
Tracks counters
Integrates OIDC
Issues session cookie
Persists credentials in SQL Server

Trust is centralized.
Proof is decentralized.

Why Each Branch Exists

Branch	Real-World Reason
WebAuthn first	Phishing-resistant primary auth
OIDC fallback	Recovery + cross-device bootstrap
Retry WebAuthn	Biometric glitches happen
Registration after OIDC	Upgrade path to passwordless
HTTP-only session cookie	Protect against XSS token theft
Counter tracking	Detect cloned authenticators

None of these branches are decorative.

Each corresponds to a failure mode in reality.

Final Architectural Insight

This system is not:

“Biometric login.”

It is:

Identity bootstrap via federation.
Device-bound authentication via FIDO2.
Session integrity via secure cookies.
Lifecycle management via SQL persistence.
Explicit failure handling.
Clear decision tree.

Passwordless-first is not about removing complexity.

It is about relocating trust:

Away from shared secrets.
Toward cryptographic proof.
While preserving federated continuity.

And when drawn as a flowchart, the system looks clean.

When implemented in VueJS + ASP.NET Core + SQL Server + Feide + fido2-net-lib, it becomes real.

And real systems are where architecture proves itself.

In the next article, we’ll explore what broke, what surprised us, and what we learned when this passwordless-first architecture moved from diagram to production.

Core Series

Introduction
Article 1 — Authentication is Not Login
Article 2 — What “Passwordless” Actually Means
Article 3 — WebAuthn & FIDO2, Explained Without the Spec
Article 4 — OpenID Connect as the Glue
Article 5 — Designing a Passwordless-First PWA Architecture
Article 6 — UX and Failure Are Part of the Security Model
→ Article 7 — A Real Passwordless PWA Flow (Architecture Walkthrough)
Article 8 — Implementing WebAuthn in Practice
Article 9 — Integrating OIDC (Feide) as Fallback and Recovery
Article 10 — What Worked, What Didn’t, What I’d Change

UX and Failure Are Part of the Security Model

Nguyễn Việt Tùng — Sun, 15 Feb 2026 03:09:51 GMT

Security engineers love cryptography because it is clean.

Humans are not.

The strongest authentication protocol in the world can be undone by:

a confusing error message,
an unclear retry flow,
a missing recovery path,
or a user who simply wants to get their work done.

If your authentication design does not account for failure as a first-class scenario, it is not secure — it is brittle.

Passwordless systems amplify this truth.

When WebAuthn works, it feels effortless.
When it fails, it reveals whether your architecture was designed for real life or for a demo.

UX is not decoration layered on top of security.
UX is how security expresses itself.

Retry flows are part of the threat model

Consider a simple scenario:

A user attempts WebAuthn authentication.
They cancel the prompt.

What does that mean?

They changed their mind?
The biometric failed?
The authenticator was unavailable?
A malicious script attempted background authentication?
They clicked too quickly?

Your system must interpret failure deliberately.

Immediate retry?

Too many automatic retries can:

confuse users,
create loops,
mask real issues.

Manual retry?

Clear, explicit retry buttons give users control — and reduce panic.

Escalation to fallback?

At what point does the system say:
“Let’s use your identity provider instead”?

Retry logic is not UX polish.
It is part of the attack surface.

An attacker probing authentication flows will:

trigger errors,
observe timing differences,
test fallback conditions.

Your retry model must:

avoid leaking information,
avoid enabling brute force,
avoid trapping legitimate users.

Lockouts: protection or punishment?

Lockouts are traditionally used to prevent brute-force attacks.

But in passwordless systems:

there is no password to brute force,
biometric verification happens locally,
challenge–response is resistant to replay.

So what are we locking out?

If:

a signature counter mismatch occurs,
an authenticator appears cloned,
repeated failures happen,

a lockout might be justified.

But lockouts must be:

transparent,
recoverable,
tied to real risk signals.

Otherwise, they punish legitimate users for:

device glitches,
browser inconsistencies,
OS updates,
or simply aging hardware.

A mature system distinguishes between:

suspicious activity,
normal friction.

Graceful degradation is more secure than aggressive rejection.
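
One way to make that distinction explicit is to classify failures before counting them. This is a sketch only: the failure codes, the threshold of five, and the outcome labels are all hypothetical:

```javascript
// Hypothetical failure codes, grouped by what they say about risk.
const RISK_SIGNALS = new Set(["counter-mismatch", "suspected-clone"]);
const NORMAL_FRICTION = new Set(["user-cancelled", "timeout", "device-glitch"]);

// Decide how to respond to a user's recent failed attempts.
function lockoutDecision(recentFailures, frictionLimit = 5) {
  const risky = recentFailures.filter((code) => RISK_SIGNALS.has(code)).length;
  if (risky > 0) {
    return "require-reverification"; // a real risk signal, recoverable through the IdP
  }
  const friction = recentFailures.filter((code) => NORMAL_FRICTION.has(code)).length;
  // Friction alone never locks the account; it only changes what is offered.
  return friction >= frictionLimit ? "offer-fallback" : "allow-retry";
}
```

Nothing in this sketch ends in a dead end: every outcome leaves the user with a next step.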

Multi-device reality

Real users do not live on a single device.

They:

switch between phone and laptop,
replace hardware every few years,
clear browsers,
use shared or managed devices.

A passwordless-first system must assume:

multiple credentials per account,
multiple authenticators per user,
credentials that appear and disappear over time.

This changes UX expectations.

When a user logs in from a new device, the system should:

not imply something is wrong,
guide them through identity verification,
allow secure credential registration.

Multi-device support is not optional.
It is the default human condition.

Lost device scenarios are inevitable

The most dangerous authentication system is one that assumes users will never lose access to their authenticators.

Phones are lost.
Laptops are stolen.
Security keys are misplaced.

If your system has no structured recovery path, users will demand one — and you will implement it under pressure.

Good recovery design includes:

A trusted bootstrap identity method (e.g., OIDC).
Clear verification steps.
Revocation of lost credentials.
Controlled registration of new credentials.
Audit visibility for the user.

Recovery must be:

secure,
observable,
friction-aware.

Security questions are not recovery.
Email-only resets are not recovery.
Administrative override is not recovery.

Federated identity exists partly to solve this lifecycle problem.

Why fallback is not a weakness

There is a persistent misconception:

“If the system falls back to another method, it weakens security.”

This is only true if fallback is poorly designed.

Fallback becomes dangerous when it:

bypasses primary controls,
uses weaker authentication without policy,
exists only as an emergency hack.

Fallback becomes strong when it:

is part of the architecture,
requires equivalent assurance,
is auditable,
is rate-limited,
and does not undermine the trust model.

In passwordless-first systems:

WebAuthn provides:

phishing-resistant, device-bound authentication.

OIDC provides:

identity portability,
lifecycle continuity,
bootstrap trust.

They are not substitutes.
They are complementary trust anchors.

The presence of fallback does not weaken security.
Unplanned fallback does.

Graceful degradation is a security feature

Graceful degradation means:

If the optimal path fails,
the system degrades to a slightly less optimal but still secure path —
without chaos.

For example:

WebAuthn unavailable → redirect to OIDC.
OIDC temporarily down → delay login with clear messaging.
Authenticator counter mismatch → require identity re-verification.

The goal is not uninterrupted access at any cost.
The goal is continuity of trust.

Users interpret friction differently depending on clarity.

An unexplained failure feels insecure.
A clearly communicated alternative feels safe.

UX decisions shape security outcomes

A confusing biometric prompt can cause:

users to disable security features,
users to choose weaker alternatives,
users to distrust the system.

An unclear fallback path can cause:

support overload,
ad hoc account resets,
insecure manual overrides.

Every prompt, error message, and redirect is part of the security boundary.

When designing authentication UX, ask:

Does this flow reduce ambiguity?
Does this error explain next steps?
Does this retry loop prevent confusion?
Does this fallback preserve assurance?

Security is not just cryptographic strength.
It is user confidence combined with protocol integrity.

Designing for failure makes systems stronger

Authentication is not about proving success.
It is about handling failure safely.

Passwordless-first systems that ignore failure scenarios:

look elegant in diagrams,
collapse under edge cases,
generate emergency workarounds.

Passwordless-first systems that embrace failure:

define fallback clearly,
support multi-device reality,
structure recovery intentionally,
treat UX as part of the threat model.

That is the difference between a feature and an architecture.

In the next phase of this series, we move from theory to a real implementation — walking through a complete PWA authentication flow that combines WebAuthn and OpenID Connect in production.

Because architecture only proves itself when it survives the unpredictable behavior of actual users.


Designing a Passwordless-First PWA Architecture

Nguyễn Việt Tùng — Sat, 14 Feb 2026 10:35:02 GMT

By this point in the series, we’ve established three things:

Passwords are structurally fragile.
WebAuthn provides phishing-resistant, device-bound authentication.
OpenID Connect provides portable, federated identity.

Now comes the harder question:

How do you design a real Progressive Web App where all of this coexists cleanly?

Because authentication in a PWA is not just about verifying a user once.
It’s about handling devices, sessions, fallbacks, offline behavior, and long-lived installs — without quietly reintroducing the weaknesses you just eliminated.

A passwordless-first architecture is not “always use WebAuthn.”
It’s about deciding when to use it, when to fall back, and how to make those decisions explicit.

Decision points: when to attempt WebAuthn vs fallback

A mature system does not guess. It decides.

There are several common entry scenarios:

1. Known returning user with registered credentials

If the server knows:

this identity has WebAuthn credentials registered,
and the browser supports WebAuthn,

then the system should attempt WebAuthn immediately.

This is the fast path:

request challenge,
call navigator.credentials.get(),
verify assertion,
issue session.

This path should feel frictionless.

2. User has no registered credential

If the server sees:

no WebAuthn credential on file,

it must not attempt WebAuthn.

Instead:

redirect to federated login (OIDC),
or use whatever bootstrap identity method exists.

After successful identity verification:

offer credential registration.

Passwordless-first does not mean passwordless-only.

3. WebAuthn attempt fails

Failure can mean:

user cancels,
authenticator unavailable,
browser does not support feature,
device lost,
counter mismatch,
challenge expired.

Your architecture must define what failure means.

Some failures allow retry.
Some require fallback to OIDC.

The critical point:
Fallback is not an afterthought. It is a planned branch.

If your system has no defined fallback path, it is not production-ready.

Server responsibilities: the part nobody can skip

Passwordless pushes complexity into correctness.

Your server is responsible for:

Credential storage

For each credential, you must store:

credential ID,
public key,
signature counter,
user association,
metadata (optional).

This storage must be:

integrity-protected,
scoped per user,
revocable.

Storing only the public key is not enough.
You must track counters to detect cloned authenticators.

Challenge management

Every authentication attempt must include:

a cryptographically random challenge,
short expiration,
binding to user and session state.

Challenges must:

be unpredictable,
not reusable,
not long-lived.

If you reuse challenges, you reintroduce replay risk.
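
Those three properties can be sketched as a small store. This assumes Node.js and an in-memory map, which only works for a single process. The class name, the 60-second lifetime, and the session-keyed layout are illustrative choices, not a prescription:

```javascript
const { randomBytes } = require("node:crypto");

const CHALLENGE_TTL_MS = 60_000; // hypothetical lifetime

class ChallengeStore {
  // `now` is injectable so expiry can be tested without waiting.
  constructor(now = () => Date.now()) {
    this.now = now;
    this.bySession = new Map();
  }

  // Unpredictable: 32 bytes from the system CSPRNG, bound to one session.
  issue(sessionId) {
    const challenge = randomBytes(32).toString("base64url");
    this.bySession.set(sessionId, { challenge, expiresAt: this.now() + CHALLENGE_TTL_MS });
    return challenge;
  }

  // Not reusable: the entry is deleted on the first check, pass or fail.
  consume(sessionId, presented) {
    const entry = this.bySession.get(sessionId);
    this.bySession.delete(sessionId);
    if (!entry || this.now() > entry.expiresAt) {
      return false; // unknown session or expired challenge
    }
    return entry.challenge === presented;
  }
}
```

Deleting the entry before comparing means a failed attempt also burns the challenge, so the client has to request a fresh one.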

Assertion verification

Server must:

validate signature against stored public key,
confirm challenge matches,
check origin and RP ID,
verify counter monotonicity.

This is where many implementations quietly break.

WebAuthn security is only as strong as the verification logic.
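
A library normally does this parsing for you, but it helps to see where the RP ID check and the counter come from. This is a sketch for Node.js that reads the fixed 37-byte prefix of the authenticator data as laid out in the WebAuthn specification. The function names are hypothetical:

```javascript
const { createHash } = require("node:crypto");

// Layout: 32-byte RP ID hash, 1 flags byte, 4-byte big-endian signature counter.
function parseAuthenticatorData(authData) {
  const bytes = Buffer.from(authData);
  if (bytes.length < 37) {
    throw new Error("authenticator data too short");
  }
  const flags = bytes[32];
  return {
    rpIdHash: bytes.subarray(0, 32),
    userPresent: (flags & 0x01) !== 0, // UP flag
    userVerified: (flags & 0x04) !== 0, // UV flag
    signCount: bytes.readUInt32BE(33),
  };
}

// The RP ID check compares against SHA-256 of the expected RP ID.
function rpIdMatches(parsed, rpId) {
  const expected = createHash("sha256").update(rpId, "utf8").digest();
  return expected.equals(parsed.rpIdHash);
}
```

The signature itself covers these bytes plus a hash of the client data, so none of the fields can be trusted before the signature verifies.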

Credential lifecycle management

Real systems need:

credential revocation,
device labeling,
multi-device support,
audit logs.

Without lifecycle management, passwordless becomes brittle.

Session management in PWAs

Here is where things become interesting.

PWAs blur the line between:

website,
installed app,
long-running client.

Session management must balance:

security,
persistence,
user convenience.

After WebAuthn or OIDC authentication:

you still need a session mechanism.

Common options:

Server-side session (recommended)

Store session ID in HTTP-only cookie.
Session data lives on server.
Token not accessible to JavaScript.

This minimizes XSS risk.

JWT in an HTTP-only cookie

Signed JWT issued after authentication.
Stored in secure, HTTP-only cookie.
Verified per request.

Useful for distributed systems, but must:

be short-lived,
rotated carefully.

Local storage (generally unsafe)

Storing tokens in localStorage:

exposes them to XSS,
encourages long-lived tokens,
complicates revocation.

For PWAs especially, local storage can feel convenient.
It is usually the wrong tradeoff.

Secure token handling: cookies vs storage

The rule is simple:

If JavaScript can read it, malicious JavaScript can steal it.

HTTP-only cookies:

are not accessible via JS,
are automatically sent with requests,
support SameSite protections,
reduce XSS impact.

When properly configured:

Secure,
HTTP-only,
SameSite=Lax or Strict,

cookies are safer for session tokens than browser storage.
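
As a concrete illustration, this is roughly what the resulting `Set-Cookie` value looks like. It is a sketch: the cookie name `sid` and the helper are hypothetical, and most frameworks build this header for you from configuration:

```javascript
// Build a Set-Cookie header value for a session identifier.
function buildSessionCookie(sessionId, maxAgeSeconds) {
  return [
    `sid=${encodeURIComponent(sessionId)}`,
    "Path=/",
    `Max-Age=${maxAgeSeconds}`,
    "HttpOnly", // not readable from JavaScript
    "Secure", // only sent over HTTPS
    "SameSite=Lax", // withheld from most cross-site requests
  ].join("; ");
}
```

`SameSite=Lax` still allows the cookie on top-level navigations, which is what lets the redirect back from the identity provider land in an existing session. `Strict` would not.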

The complexity lies in:

CSRF protection,
cross-origin flows,
OIDC redirect handling.

These must be addressed deliberately — not avoided.

Why offline PWAs complicate authentication

PWAs can:

cache assets,
run offline,
queue background sync,
appear installed and persistent.

Authentication systems were not originally designed for this.

Here’s the tension:

WebAuthn requires:

server-issued challenge,
live verification,
session establishment.

Offline mode has:

no server access,
no challenge generation,
no verification endpoint.

Therefore:

You cannot perform real authentication offline.

What you can do is:

cache a previously authenticated session,
gate access behind local checks,
defer sensitive actions until online.

This creates design decisions:

How long can an offline session persist?

Too short:

poor UX.

Too long:

increased risk on stolen devices.

What actions are allowed offline?

Read-only?
Cached data only?
Queued writes?

Offline capability forces you to define trust boundaries explicitly.
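
Those boundaries can be written down as a policy function. This is a sketch: the 12-hour window, the action names, and the outcome labels are assumptions chosen for illustration:

```javascript
const MAX_OFFLINE_MS = 12 * 60 * 60 * 1000; // hypothetical 12-hour window
const OFFLINE_ALLOWED = new Set(["read-cached", "queue-write"]); // hypothetical action names

// Decide what an offline client may do, given the time of its last online authentication.
function offlineDecision(action, lastOnlineAuthAt, now = Date.now()) {
  if (now - lastOnlineAuthAt > MAX_OFFLINE_MS) {
    return "require-online-login"; // cached session is too old to trust
  }
  return OFFLINE_ALLOWED.has(action) ? "allow" : "defer-until-online";
}
```

This is a client-side gate only. Anything it allows offline still has to be re-checked by the server once queued writes are replayed.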

Designing the authentication state machine

A passwordless-first PWA architecture behaves like a state machine:

Unknown user → bootstrap via OIDC.
Known user with credential → attempt WebAuthn.
WebAuthn success → issue session.
WebAuthn failure → retry or fallback.
Credential lost → recover via OIDC.
Offline mode → limited local session access.

Every branch must be:

intentional,
tested,
observable.

If your authentication system is not drawn as a flowchart, you probably haven’t finished designing it.
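
The same flowchart can live in code as a transition table. This is a sketch: the state and event names are hypothetical labels for the branches listed above, and returning to `session` after reconnecting is a simplifying assumption:

```javascript
// Each state lists the events it accepts and the state they lead to.
const TRANSITIONS = {
  start: { unknown_user: "oidc", known_user_with_credential: "webauthn" },
  webauthn: {
    success: "session",
    retryable_failure: "webauthn",
    fallback: "oidc",
    credential_lost: "oidc",
  },
  oidc: { success: "session" },
  session: { went_offline: "offline" },
  offline: { back_online: "session" },
};

// Move to the next state, refusing any branch that was never designed.
function transition(state, event) {
  const next = TRANSITIONS[state] && TRANSITIONS[state][event];
  if (!next) {
    throw new Error(`No transition from "${state}" on "${event}"`);
  }
  return next;
}
```

The useful part is the thrown error: a branch that is missing from the table fails loudly in tests instead of quietly in production.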

Passwordless-first means opinionated design

It means:

WebAuthn is default.
Federation is structured fallback.
Sessions are server-controlled.
Tokens are protected from JavaScript.
Recovery is first-class.
Offline mode is constrained deliberately.

It does not mean:

removing identity providers,
removing server state,
trusting devices blindly,
or assuming biometrics solve lifecycle problems.

Architecture is about deciding where trust lives.

Passwordless-first architectures shift trust:

away from shared secrets,
toward device-bound credentials,
while preserving federated identity for continuity.

In the next article, we’ll explore how UX decisions — error messages, prompts, retries — shape security outcomes more than cryptography alone.

Because even the strongest architecture must survive human behavior.


OpenID Connect as the Glue

Nguyễn Việt Tùng — Thu, 12 Feb 2026 15:22:41 GMT

If WebAuthn answers the question:

“Can this device prove control of a credential right now?”

OpenID Connect answers a different question entirely:

“Who is this person across systems, time, and organizations?”

Modern authentication systems fail when they confuse those two layers.

WebAuthn is about proof of possession.
OpenID Connect (OIDC) is about portable identity assertions.

When building a Progressive Web App, especially one meant to survive device loss, multi-device usage, and organizational boundaries, OIDC becomes the connective tissue that passwordless authentication alone cannot provide.

This article explains what OIDC actually does, why PWAs still need federated identity, and how flows like Authorization Code + PKCE fit into a passwordless-first architecture.

What OpenID Connect actually provides

OpenID Connect is an identity layer built on OAuth 2.0.

OAuth by itself is about delegated authorization — letting an app access resources on your behalf.

OIDC extends OAuth to answer:

“Who is the authenticated user?”

It does this by issuing an ID token: a JSON Web Token (JWT), signed by the IdP, containing claims such as:

subject identifier,
issuer,
audience,
authentication time,
optional profile attributes.

In plain terms, OIDC allows one trusted system (an Identity Provider, or IdP) to tell another system:

“I have authenticated this user, and here is cryptographic proof.”

That’s it.
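To make the claims concrete, here is a rough sketch of reading them from an ID token and checking the ones listed above. It deliberately does NOT verify the signature, which a real relying party must do with the IdP's published keys and a maintained JWT library. The issuer URL, client ID, and token contents are made-up values.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def read_claims(id_token: str) -> dict:
    """Decode the payload of a JWT. Signature verification is out of scope here."""
    _header, payload_b64, _signature = id_token.split(".")
    return json.loads(b64url_decode(payload_b64))


def check_claims(claims: dict, expected_issuer: str, expected_audience: str, now: int) -> list:
    """Return a list of problems; an empty list means the basic claims look right."""
    problems = []
    if claims.get("iss") != expected_issuer:
        problems.append("unexpected issuer")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if expected_audience not in audiences:
        problems.append("token not issued for this client")
    if claims.get("exp", 0) <= now:
        problems.append("token expired")
    return problems


# Build a fake, unsigned token purely so the example runs end to end.
claims_in = {
    "iss": "https://idp.example",   # issuer (hypothetical)
    "sub": "user-123",              # subject identifier
    "aud": "my-pwa",                # audience: our client ID (hypothetical)
    "exp": 2_000_000_000,           # expiry, seconds since epoch
    "auth_time": 1_770_000_000,     # when the user authenticated at the IdP
}
sample_token = ".".join([
    b64url_encode(json.dumps({"alg": "RS256", "typ": "JWT"}).encode()),
    b64url_encode(json.dumps(claims_in).encode()),
    "signature-not-checked-here",
])

claims = read_claims(sample_token)
print(claims["sub"])
print(check_claims(claims, "https://idp.example", "my-pwa", now=1_770_000_100))
print(check_claims(claims, "https://idp.example", "another-app", now=1_770_000_100))
```

Note what the claims give you: a subject and the circumstances of the assertion. Nothing in them says what the user may do inside your application.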

OIDC does not:

define how users authenticate internally,
guarantee phishing resistance,
manage device credentials,
handle authorization inside your app.

It provides identity assertions, not session logic and not passwordless mechanics.

This distinction is critical.

What OIDC does not provide

Because OIDC is often treated as a complete solution, it’s worth being explicit about what it does not do.

It does not:

eliminate passwords (many IdPs still use them internally),
replace device-bound authentication,
prevent phishing unless combined with phishing-resistant mechanisms,
manage account lifecycle inside your application.

OIDC is a transport for identity claims.
It is not an authentication method in itself.

Think of it as a passport issued by another authority.
It tells you who someone is — it does not determine how they proved it.

Why PWAs still need federated identity

At first glance, a passwordless PWA using WebAuthn might seem self-contained.

User registers a credential.
User signs in with a biometric.
Done.

But real systems rarely live in isolation.

PWAs face several realities:

Device loss

If a user loses their only registered device, how do they recover?

WebAuthn is intentionally device-bound. That is a strength for security — but it requires a recovery path.

Federated identity allows:

bootstrap access,
credential re-registration,
account restoration without weak recovery questions.

Multi-device usage

Users expect to log in from:

phone,
laptop,
tablet,
shared workstation.

WebAuthn supports multiple credentials per account — but how does the user prove ownership of the account to register a new device?

Federated identity provides a portable proof of identity across devices.

Organizational trust

In enterprise or education environments, users already have identities managed by:

corporate directories,
institutional identity providers,
centralized account governance.

Federated identity allows your PWA to integrate into that trust fabric instead of inventing its own.

Lifecycle management

Identity providers often handle:

account provisioning,
deactivation,
role updates,
compliance requirements.

A passwordless-only system must reimplement these concerns or delegate them.

In most real-world architectures, delegation wins.

Authorization Code + PKCE (conceptual overview)

Modern browser-based applications — including PWAs — should use the Authorization Code flow with PKCE when integrating with OIDC.

You do not need to memorize the spec to understand the reasoning.

The flow exists to solve a simple problem:

How can a public client (like a browser app) securely obtain tokens without exposing secrets?

Here’s the high-level sequence:

The app initiates login with the IdP.
The browser redirects the user to the IdP’s authentication page.
The user authenticates there.
The IdP redirects back with an authorization code.
The app exchanges that code for tokens.

PKCE (Proof Key for Code Exchange) adds protection by:

generating a one-time verifier,
sending a hashed version during the initial request,
proving possession of the original verifier during token exchange.

An attacker who intercepts the authorization code cannot redeem it, because the token endpoint also demands the original verifier.

The key idea is this:
Authorization Code + PKCE keeps a stolen authorization code from being turned into tokens, even though a public client holds no secret.

It does not replace WebAuthn.
It complements it.
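The PKCE mechanics described above fit in a few lines. This is a sketch of the S256 method only, with hypothetical function names; the redirect and the token request themselves are left out.

```python
import base64
import hashlib
import hmac
import secrets


def make_verifier() -> str:
    # 32 random bytes become 43 URL-safe characters, inside the 43-128 range PKCE allows.
    return secrets.token_urlsafe(32)


def challenge_for(verifier: str) -> str:
    # S256 method: base64url(SHA-256(verifier)) without padding.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def idp_accepts(stored_challenge: str, presented_verifier: str) -> bool:
    """What the IdP does at token exchange: re-hash the verifier and compare."""
    return hmac.compare_digest(stored_challenge, challenge_for(presented_verifier))


# Client side, before redirecting to the IdP:
verifier = make_verifier()             # kept by the app, never sent in the redirect
challenge = challenge_for(verifier)    # sent with the authorization request

# Token exchange: the legitimate app presents the verifier it kept.
print(idp_accepts(challenge, verifier))

# An attacker who only intercepted the code has no verifier to present.
print(idp_accepts(challenge, make_verifier()))
```

The verifier never travels through the redirect, so seeing the authorization code in transit is not enough to complete the exchange.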

Using IdPs for bootstrap

One of the most elegant uses of OIDC in a passwordless-first architecture is during initial account creation.

A user may:

authenticate via an external IdP,
receive a verified identity,
then register a WebAuthn credential tied to that account.

After that:

WebAuthn becomes the default authentication method,
OIDC remains available as fallback.

In this model:

OIDC establishes identity,
WebAuthn establishes device-bound proof,
the two reinforce each other.

Using IdPs for recovery

Recovery is where pure passwordless systems often reveal their fragility.

If:

the only credential is lost,
and no alternative exists,
the account becomes inaccessible.

An IdP provides a recovery path that does not require:

weak security questions,
email-only resets,
or administrative overrides.

The system can:

Require OIDC login.
Validate identity via trusted external authority.
Allow new credential registration.
Invalidate old credentials.

This turns recovery into a structured process instead of an improvised exception.
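As a rough sketch, the four steps can be expressed as one function over an account record. The `Account` shape and the `recover` helper are assumptions for illustration; the subject passed in is presumed to come from an ID token that has already been fully validated.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    subject: str                                        # stable identifier asserted by the IdP
    credential_ids: set = field(default_factory=set)    # active WebAuthn credentials
    revoked_ids: set = field(default_factory=set)       # kept for audit, never accepted again


def recover(account: Account, verified_subject: str, new_credential_id: str) -> None:
    """Replace an account's credentials after a successful OIDC login."""
    # Steps 1 and 2: the caller completed OIDC login; we check it is the same identity.
    if verified_subject != account.subject:
        raise PermissionError("IdP identity does not match this account")
    # Step 3: register the new credential.
    old_ids = set(account.credential_ids)
    account.credential_ids = {new_credential_id}
    # Step 4: invalidate everything that existed before recovery.
    account.revoked_ids |= old_ids


account = Account(subject="user-123", credential_ids={"cred-lost-phone"})
recover(account, verified_subject="user-123", new_credential_id="cred-new-laptop")
print(sorted(account.credential_ids))
print(sorted(account.revoked_ids))
```

Whether recovery should revoke all old credentials or only the one reported lost is a policy choice; the sketch takes the stricter option.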

Using IdPs for portability

Federated identity also enables portability across systems.

If multiple applications trust the same IdP:

users authenticate once,
identity claims propagate,
account linking becomes consistent.

WebAuthn credentials are bound to origins.
OIDC identities are portable across origins.

The combination gives you:

local phishing-resistant authentication,
global identity continuity.

That pairing is powerful.

WebAuthn and OIDC are not competitors

There is a persistent misconception that passwordless authentication replaces federated identity.

It doesn’t.

They solve orthogonal problems:

WebAuthn:

secure, phishing-resistant proof of possession,
bound to origin,
device-specific.

OIDC:

portable identity assertion,
cross-system trust,
organizational integration.

When combined thoughtfully:

WebAuthn becomes the fast path,
OIDC becomes the resilient path,
the user experience remains smooth,
the architecture remains robust.

OIDC as architectural glue

If WebAuthn is the lock on the door, OIDC is the passport system.

WebAuthn ensures:

the right key opens the right lock.

OIDC ensures:

the person behind the key is recognized across contexts.

PWAs that attempt to eliminate federation entirely often rediscover its necessity the hard way — during device loss, enterprise integration, or compliance review.

The mature posture is not choosing between passwordless and federation.

It is designing a system where:

passwordless handles everyday authentication,
federation handles identity continuity,
and neither is forced to solve the other’s problems.

In the next article, we move from protocols to architecture:
how WebAuthn, OIDC, sessions, storage, and fallback flows combine inside a real PWA authentication system.

Because theory becomes meaningful only when it survives implementation.


WebAuthn & FIDO2, Explained Without the Spec

Nguyễn Việt Tùng — Tue, 10 Feb 2026 15:46:54 GMT

If you read the WebAuthn specification end to end, you’ll come away with two thoughts:

This is extremely well designed.
No human should be expected to learn it this way.

WebAuthn didn’t appear to make logins prettier. It exists because the web needed a way to authenticate users without shared secrets, without training users to detect phishing, and without centralizing catastrophic risk.

This article explains what WebAuthn and FIDO2 solve, how they work at a conceptual level, and why their security properties emerge naturally from the design — not from UI tricks or user behavior.

No spec quotes. No diagrams full of acronyms. Just the moving parts that matter.

The problems WebAuthn was designed to solve

WebAuthn doesn’t fix “bad passwords.”
It eliminates the need for passwords entirely.

The core problems it targets are structural:

Shared secrets don’t scale

Passwords are secrets shared between user and server. That single fact creates:

phishing,
credential stuffing,
password reuse,
database breach fallout.

As long as the same secret can be typed and replayed, attackers will find ways to collect it.

Even well-hashed passwords are still verifier secrets.
If an attacker steals the database, they gain:

offline attack capability,
reusable material,
leverage across systems.

WebAuthn removes this entire category of risk by design.

Authentication should not rely on user judgment

Security systems that depend on users spotting fake URLs are already lost.

WebAuthn pushes phishing resistance into the browser and protocol layer, where it belongs. Users don’t need to “be careful.” The system refuses to cooperate with the attacker.

The core idea: public-key credentials

WebAuthn is built on public-key cryptography, but you don’t need to think in equations.

Here’s the mental model:

Each user registers a credential for a specific website.
That credential is a key pair:
- a private key stored securely on the user’s device,
- a public key stored on the server.
The private key never leaves the device.
The public key is useless on its own.

This immediately changes the trust model:

stealing the server database doesn’t let attackers log in,
stealing one device exposes only the credentials on that device, and those typically still sit behind a biometric or PIN,
credentials can’t be replayed or reused elsewhere.

Challenges and assertions: proving freshness

If the server never sees a secret, how does authentication work?

Through challenge–response.

The challenge

When a user wants to authenticate:

the server generates a random, unpredictable challenge,
the challenge is tied to the session and expires quickly,
the server sends it to the client.

This ensures freshness.
An old response will never work again.
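A minimal sketch of the server side of this, assuming an in-memory store keyed by a session ID (the class name and the 60-second lifetime are illustrative choices, not requirements of WebAuthn):

```python
import secrets


class ChallengeStore:
    """In-memory only; a real deployment would use the server session or a shared cache."""

    def __init__(self, ttl_seconds: float = 60.0):
        self._ttl = ttl_seconds
        self._pending = {}  # session_id -> (challenge, expires_at)

    def issue(self, session_id: str, now: float) -> bytes:
        challenge = secrets.token_bytes(32)  # random and unpredictable
        self._pending[session_id] = (challenge, now + self._ttl)
        return challenge

    def consume(self, session_id: str, presented: bytes, now: float) -> bool:
        # pop() removes the entry whatever the outcome, so a challenge is single use.
        entry = self._pending.pop(session_id, None)
        if entry is None:
            return False
        challenge, expires_at = entry
        if now > expires_at:
            return False
        return secrets.compare_digest(challenge, presented)


store = ChallengeStore(ttl_seconds=60)

challenge = store.issue("session-a", now=1000.0)
print(store.consume("session-a", challenge, now=1010.0))   # first use, in time
print(store.consume("session-a", challenge, now=1011.0))   # replay of the same challenge

late = store.issue("session-b", now=2000.0)
print(store.consume("session-b", late, now=2100.0))        # answered after expiry
```

In a full verification the presented value would be the challenge extracted from the signed client data, so freshness and the signature are checked together.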

The assertion

The client asks the authenticator:

“Sign this challenge for this site.”

The authenticator:

verifies the user locally (if required),
signs the challenge, together with data identifying the site, with the private key,
returns a signed assertion.

The server:

verifies the signature using the stored public key,
confirms the challenge matches,
checks the signature counter to detect cloned authenticators.

No secrets compared.
No passwords transmitted.
No reusable proof created.
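The counter check is the smallest of these steps. A sketch, with a hypothetical function name, assuming the server stores the last counter value next to each credential:

```python
def check_sign_count(stored: int, received: int) -> int:
    """Return the counter value to store, or raise if the counter went backwards."""
    if stored == 0 and received == 0:
        # Some authenticators never increment the counter; nothing to compare.
        return 0
    if received <= stored:
        # Two copies of the same key would eventually produce a stale counter.
        raise ValueError("signature counter did not increase: possible cloned authenticator")
    return received


print(check_sign_count(stored=41, received=42))
print(check_sign_count(stored=0, received=0))
```

How to react to a failed check (reject, flag for review, force re-enrollment) is a policy decision for the relying party.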

Platform vs roaming authenticators

Authenticators come in different forms, and WebAuthn treats them explicitly.

Platform authenticators

These are built into devices:

phone biometrics,
laptop fingerprint readers,
OS-level secure enclaves.

Characteristics:

excellent UX,
tightly integrated with the device,
private keys stored in hardware-backed storage.

They are ideal for PWAs because:

they feel native,
they require no extra hardware,
they encourage passwordless adoption.

Roaming authenticators

These are external devices:

USB security keys,
NFC or Bluetooth tokens.

Characteristics:

portable across devices,
extremely strong isolation,
ideal for high-assurance environments.

They solve a different problem: portability without centralization.

Well-designed systems allow both, because users have different constraints.

User presence vs user verification

This distinction is subtle and often misunderstood.

User presence (UP)

User presence means:

the user performed a conscious action,
such as touching a button or tapping a key.

It prevents silent, background authentication.

User verification (UV)

User verification means:

the authenticator verified who is using it,
via biometrics or a PIN.

UV answers: is this the legitimate user of this device?
UP answers: did someone physically interact with the device?

WebAuthn supports both:

some flows require UV,
others accept UP only,
policy decides what’s acceptable.

This flexibility allows systems to balance:

security,
accessibility,
device capability.
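On the wire, UP and UV are two bits in the authenticator data the server receives. The sketch below reads them and applies a policy; the helper names and the sample bytes are made up, while the byte layout and bit positions follow the WebAuthn authenticator data format.

```python
FLAG_UP = 0x01  # bit 0: user present
FLAG_UV = 0x04  # bit 2: user verified


def read_flags(authenticator_data: bytes) -> dict:
    # Layout: 32-byte RP ID hash, 1 flags byte, 4-byte signature counter, then optional data.
    if len(authenticator_data) < 37:
        raise ValueError("authenticator data too short")
    flags = authenticator_data[32]
    return {"up": bool(flags & FLAG_UP), "uv": bool(flags & FLAG_UV)}


def satisfies_policy(flags: dict, require_uv: bool) -> bool:
    """Presence is always required; verification only when policy asks for it."""
    if not flags["up"]:
        return False
    return flags["uv"] or not require_uv


def sample(flags_byte: int) -> bytes:
    # Fake authenticator data: zeroed RP ID hash, the flags byte, counter = 7.
    return bytes(32) + bytes([flags_byte]) + (7).to_bytes(4, "big")


touch_only = read_flags(sample(FLAG_UP))
with_biometric = read_flags(sample(FLAG_UP | FLAG_UV))

print(satisfies_policy(touch_only, require_uv=False))
print(satisfies_policy(touch_only, require_uv=True))
print(satisfies_policy(with_biometric, require_uv=True))
```

A sensitive action can set `require_uv=True` while a low-risk one accepts a touch, which is the balance between security, accessibility, and device capability described above.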

Why WebAuthn is phishing-resistant by design

This is the most important property — and it’s not accidental.

WebAuthn credentials are bound to origin.

That means:

a credential created for example.com will not work for examp1e.com (note the digit 1 in place of the letter l), inside an iframe on another site, or on a cloned login page.

The browser enforces this binding.
The user never sees the decision.

A phishing site can:

perfectly mimic your UI,
use the same text,
even embed your real site visually.

But when it asks the browser to authenticate:

no matching credential exists,
the authenticator refuses,
the attack fails silently.

No warning dialogs.
No user training.
No race against social engineering.

This is what “security by design” looks like.
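The browser enforces the binding, and the server can confirm it when it verifies an assertion. The sketch below shows only that confirmation step, under assumed names; a complete verifier also checks the challenge and the signature, normally through a maintained WebAuthn library.

```python
import hashlib
import json


def verify_binding(client_data_json: bytes, authenticator_data: bytes,
                   expected_origin: str, rp_id: str) -> bool:
    """Check that an assertion was produced for our origin and our relying party ID."""
    client_data = json.loads(client_data_json)
    if client_data.get("type") != "webauthn.get":
        return False
    if client_data.get("origin") != expected_origin:
        return False  # the browser recorded a different site
    # The first 32 bytes of authenticator data are SHA-256 of the RP ID.
    expected_hash = hashlib.sha256(rp_id.encode("utf-8")).digest()
    return authenticator_data[:32] == expected_hash


def fake_assertion(origin: str, rp_id: str) -> tuple:
    # Stand-in for what a browser and authenticator would return for a given site.
    client_data = json.dumps({"type": "webauthn.get", "challenge": "abc", "origin": origin})
    auth_data = hashlib.sha256(rp_id.encode("utf-8")).digest() + b"\x05" + bytes(4)
    return client_data.encode("utf-8"), auth_data


genuine = fake_assertion("https://example.com", "example.com")
phished = fake_assertion("https://examp1e.com", "examp1e.com")

print(verify_binding(*genuine, expected_origin="https://example.com", rp_id="example.com"))
print(verify_binding(*phished, expected_origin="https://example.com", rp_id="example.com"))
```

In practice the phishing case rarely gets this far, because the browser finds no credential for the look-alike domain; the server check is the second line of defence.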

What WebAuthn does not do

Clarity here prevents bad architecture later.

WebAuthn does not:

manage identities across systems,
recover lost accounts,
replace authorization logic,
eliminate the need for backend validation,
solve UX by itself.

WebAuthn solves one problem extremely well:
How can a user prove control of a credential securely, without shared secrets, and without being phishable?

Everything else still requires system design.

WebAuthn as infrastructure, not magic

When WebAuthn works well, it feels invisible:

a prompt,
a touch,
and you’re in.

That simplicity hides real complexity:

cryptography,
device trust,
browser enforcement,
careful server validation.

But that complexity exists whether you manage it or not.
WebAuthn simply exposes it in a form that can be reasoned about and secured.

In the next article, we’ll step back from the protocol and look at how WebAuthn fits into a full PWA architecture — including sessions, fallback paths, and real-world failure modes.

Because passwordless authentication doesn’t live in a vacuum.
It lives inside systems built by humans, for humans, on devices that get lost.


What “Passwordless” Actually Means

Nguyễn Việt Tùng — Sun, 08 Feb 2026 14:56:45 GMT

“Passwordless” has become one of those terms that everyone uses and few people define.

Depending on who you ask, it can mean:

logging in with Face ID,
receiving a magic link by email,
approving a push notification,
using a security key,
or never seeing a login screen at all.

Some of these approaches are genuinely passwordless.
Some merely hide the password.
Some quietly depend on passwords more than ever.

If you don’t define what you mean by passwordless, you can’t design it — and you certainly can’t reason about its security.

This article draws clean boundaries between passwordless, MFA, and passwordless-first, explains the underlying authentication factors in plain language, and shows where WebAuthn actually fits in the picture.

Passwordless vs MFA vs passwordless-first

These terms are often used interchangeably. They are not the same.

MFA: strengthening a password-centric system

Multi-Factor Authentication (MFA) starts from the assumption that a password exists.

The system asks for:

something the user knows (password),
plus something they have (OTP, push approval),
or something they are (biometrics).

MFA reduces risk, but the password remains:

the primary identifier,
the primary target,
and the primary liability.

If the password is phished, reused, or leaked, MFA becomes a race condition instead of a guarantee.

MFA is a reinforcement strategy — not a redesign.

Passwordless: no shared secret with the server

A system is truly passwordless when:

the user does not know a reusable secret,
the server does not store a password equivalent,
and authentication relies on challenge–response, not comparison.

This does not mean there is no authentication factor.
It means the factor is non-reusable and non-transferable.

Email magic links, for example, are passwordless but fragile: the link is a bearer token, and anyone who can read the inbox can use it.
Security keys and WebAuthn credentials are passwordless — and strong.

The difference is not UX. It’s cryptography.

Passwordless-first: an architectural posture

Passwordless-first describes how the system is designed, not a single mechanism.

In a passwordless-first system:

passwordless is the default path,
fallback exists for recovery and portability,
and passwords (if they exist at all) are not the core identity proof.

This distinction matters because:

real users lose devices,
real systems need recovery,
real organizations need federation.

Passwordless-first systems assume failure and design around it.
Pure passwordless systems often pretend failure won’t happen.

The three authentication factors (without jargon)

Most authentication systems are built from three categories of evidence.
Understanding them clarifies almost every design decision.

Knowledge: something you know

Passwords, PINs, security questions.

Strengths:

portable,
easy to reset.

Weaknesses:

phishable,
guessable,
reusable,
leakable.

Knowledge factors scale badly because humans are terrible secret keepers.

Possession: something you have

Phones, hardware keys, authenticator apps, browsers with stored credentials.

Strengths:

not easily copied,
can be bound to a device,
works well with cryptography.

Weaknesses:

devices can be lost,
possession must be proven securely.

Possession factors are the backbone of modern passwordless systems.

Inherence: something you are

Biometrics like fingerprints, face recognition, or voice.

Strengths:

convenient,
fast,
user-friendly.

Weaknesses:

cannot be changed,
not secret,
not suitable for server-side verification.

Biometrics are excellent local gates.
They are terrible remote identifiers.

This is why modern systems never send biometrics to servers. They use biometrics to unlock something else.

Where WebAuthn fits — and what it actually does

WebAuthn does not authenticate users with biometrics.

WebAuthn authenticates devices and credentials using public-key cryptography.

Here’s the key idea:

the server issues a random challenge,
the client signs it using a private key,
the server verifies the signature with a stored public key.

That’s it.

Biometrics enter the picture only because:

the private key is protected by the authenticator,
and the authenticator requires user verification (biometric or PIN) before using it.

In other words:

the biometric unlocks the key,
the key proves possession,
the signature proves freshness,
and the origin binding prevents phishing.

WebAuthn combines:

possession (the device),
optional inherence (biometric),
and strong cryptography.

That combination is what makes it powerful — not the fingerprint itself.

Common myths about passwordless

“Passwordless means no backend”

False.

Passwordless systems require more backend discipline, not less.

The server must:

generate and track challenges,
store credential metadata securely,
verify signatures correctly,
manage counters and replay protection,
and handle fallback and recovery.

Passwordless removes one fragile secret.
It replaces it with protocol correctness.

“Passwordless locks users to one device”

Only if you design it that way.

WebAuthn credentials are device-bound by default, but systems can support:

multiple registered devices,
roaming authenticators,
cloud-synced credentials (with caveats),
federated recovery via identity providers.

Device loss is not a WebAuthn problem.
It’s an identity lifecycle problem.

“Biometrics identify the user”

They don’t.

Biometrics verify the user locally, on the device.
They do not establish identity on the network.

Any system that treats biometrics as a remote identifier is misunderstanding both security and privacy.

“Passwordless removes the need for identity providers”

It doesn’t.

Passwordless answers: How does the user prove control right now?
Identity providers answer: Who is this user across systems and time?

The strongest systems use both.

Passwordless is a shift in trust, not a feature

The real change passwordless introduces is where trust lives.

Passwords centralize trust on the server:

one database,
many secrets,
catastrophic failure modes.

Passwordless distributes trust:

keys on devices,
verification on servers,
failure isolated per credential.

This is why passwordless feels different when done properly.
It’s not just smoother — it’s structurally safer.

But only if it’s designed as a system.

In the next article, we’ll zoom in on WebAuthn and FIDO2 themselves, explaining how the protocol works without dragging you through the spec — and why it enables things passwords never could.

Because once you see the mechanics, the architectural choices become inevitable.


Authentication Is Not Login

Nguyễn Việt Tùng — Sat, 07 Feb 2026 14:25:55 GMT

Modern web applications are full of login screens — but surprisingly few of them have a well-designed authentication system.

This distinction matters more than it sounds.
If you treat authentication as a UI feature instead of a security system, every decision that follows will be reactive, fragile, and hard to evolve. Passwordless authentication, biometrics, passkeys, and federated identity all fail when they are bolted onto the wrong mental model.

Before we talk about FIDO, WebAuthn, or PWAs, we need to untangle three ideas that are constantly conflated: identity, authentication, and authorization.

Identity, authentication, and authorization are not the same thing

They often appear together, but they solve different problems.

Identity answers the question: Who is this user supposed to be?
An identity is a logical construct. It might be an email address, a student ID, an employee number, or a subject identifier from an identity provider. Identity exists even when no one is logged in.

Authentication answers the question: Can this user prove they control that identity right now?
Authentication is an event. It happens at a moment in time, using evidence: something the user knows, has, or is. When authentication succeeds, the system gains confidence that the user is who they claim to be.

Authorization answers the question: What is this authenticated identity allowed to do?
Authorization is policy. It decides access to resources after authentication has already happened.

A login screen collapses all three into a single gesture.
A well-designed system does not.

When people say “login,” they usually mean:

identify the user,
authenticate them,
create a session,
and authorize access — all at once.

This compression hides complexity, which is why authentication systems often break under real-world pressure.

Why passwords became a liability

Passwords weren’t always terrible. They were simple, portable, and easy to implement. But they were never designed for the environment they now inhabit.

The modern web has:

thousands of services per user,
phishing at industrial scale,
automated credential testing,
shared devices,
password managers,
and users trained to ignore security warnings just to get work done.

Passwords fail not because users are careless, but because the model is brittle.

A password:

must be remembered,
must be transmitted to the server (even if it is only stored as a hash),
must be reused or rotated,
and must remain secret — forever.

Every one of those constraints breaks under scale.

Once a password exists, it becomes:

a reusable secret,
a target for phishing,
a commodity for attackers,
and a liability for operators.

The security industry tried to patch this with:

complexity rules,
forced rotation,
MFA bolted on afterward,
CAPTCHA,
and endless UX friction.

The result was predictable: more prompts, more fatigue, more insecure workarounds.

Passwordless authentication didn’t emerge because passwords were inconvenient.
It emerged because passwords are structurally incompatible with modern threat models.

Threat models that actually matter for PWAs

Progressive Web Apps inherit all the threats of the web, plus a few of their own.

If you’re building a PWA, these are the threats that should shape your authentication design.

Phishing

Phishing works because passwords are portable secrets.
A fake site only needs to look convincing long enough for the user to type something.

No amount of password complexity fixes this.
If the user can type the secret, an attacker can ask for it.

This is the single strongest argument for WebAuthn-based authentication: credentials are bound to origin. The browser refuses to authenticate for the wrong site. Phishing stops working at the protocol level, not the UX level.

Credential stuffing

Attackers don’t guess passwords anymore. They replay them.

A breach in one system becomes an attack surface for thousands of others. PWAs are particularly exposed because they often serve global audiences with minimal friction to sign up.

Once a password database exists, credential stuffing is inevitable.

Replay attacks

If an authentication response can be reused, it will be.

Tokens, cookies, and session identifiers must be scoped, time-bound, and rotated correctly. PWAs complicate this because they blur the line between web app and installed app, often encouraging long-lived sessions.

Modern authentication systems rely on challenge–response instead of static secrets precisely to prevent replay.
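For the session side of this, "scoped and time-bound" mostly comes down to how the cookie is issued. A sketch using only the Python standard library; the cookie name and the 15-minute lifetime are assumptions:

```python
import secrets
from http.cookies import SimpleCookie


def session_cookie_header(max_age_seconds: int = 900) -> tuple:
    """Return (session_id, Set-Cookie header value) for a freshly issued session."""
    session_id = secrets.token_urlsafe(32)   # new random ID on every authentication
    cookie = SimpleCookie()
    cookie["session"] = session_id
    morsel = cookie["session"]
    morsel["httponly"] = True                # not readable from JavaScript
    morsel["secure"] = True                  # only sent over HTTPS
    morsel["samesite"] = "Strict"            # not attached to cross-site requests
    morsel["max-age"] = max_age_seconds      # time-bound
    morsel["path"] = "/"
    return session_id, morsel.OutputString()


first_id, first_header = session_cookie_header()
second_id, second_header = session_cookie_header()

print(first_header)
print(first_id != second_id)   # rotation: each login gets a different identifier
```

The identifier itself carries no meaning; the server keeps the session state and can expire or revoke it at any time.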

Client compromise and shared devices

PWAs run on devices you do not control:

shared computers,
stolen phones,
locked-down corporate environments.

Authentication must assume that devices get lost and that access has to be recoverable afterwards, rather than treating a device as trusted forever. This is where many “passwordless-only” designs quietly fail.

Why “just add biometrics” is a misunderstanding

Biometrics are not an authentication system.
They are a local user verification mechanism.

This distinction is subtle and critical.

When a user authenticates with biometrics in a WebAuthn flow:

the biometric never leaves the device,
it never identifies the user to the server,
and it is not the credential.

The real credential is a cryptographic key pair stored in the authenticator.
The biometric merely unlocks the private key.

This means:

biometrics do not replace identity,
biometrics do not replace account recovery,
biometrics do not replace authorization,
biometrics do not remove the need for backend logic.

“Adding biometrics” without redesigning the authentication flow usually results in:

biometric prompts guarding a password,
biometrics unlocking stored tokens,
or biometrics acting as cosmetic MFA.

These designs feel modern but inherit all the weaknesses of the underlying system.

True passwordless authentication requires:

server-issued challenges,
public-key verification,
device-bound credentials,
and a fallback strategy for when devices are lost or unavailable.

Biometrics are part of the experience, not the architecture.

Authentication is a system, not a screen

The core mistake teams make is treating authentication as a moment instead of a lifecycle.

A real authentication system must account for:

enrollment,
authentication,
failure,
retry,
recovery,
device loss,
account linking,
and evolution over time.

This is why modern systems combine approaches:

passwordless for speed and phishing resistance,
federated identity for portability and recovery,
policy for authorization,
and UX as an explicit security control.

The goal is not to eliminate login screens.
The goal is to design a system where authentication decisions are deliberate, layered, and resilient.

In the next article, we’ll narrow the scope and define what “passwordless” actually means — and what it does not mean — before diving into WebAuthn and FIDO2 themselves.

Because once the mental model is right, the APIs finally make sense.


Introduction to Passwordless: Modern Authentication Patterns for PWAs

Nguyễn Việt Tùng — Sun, 01 Feb 2026 07:52:28 GMT

Passwords were never designed to scale with the modern web — and Progressive Web Apps inherit all of their weaknesses while adding new constraints of their own.

This series explores how modern PWAs can move beyond passwords using FIDO-powered biometrics (WebAuthn), without falling into the trap of treating passwordless as a silver bullet. Starting from foundational concepts and best practices, it builds toward a real-world implementation that I created as a proof of concept and have since improved, combining passwordless-first authentication with OpenID Connect for fallback and recovery.

The goal is not to explain APIs in isolation, but to show how authentication systems are actually designed, operated, and evolved in practice.

Series structure

Part I — Foundations

Goal: Align readers on what authentication really is before touching APIs.

Article 1 — Authentication is Not Login

Identity vs authentication vs authorization
Why passwords became a liability
Threat models relevant to PWAs (phishing, replay, credential stuffing)
Why “just add biometrics” is a misunderstanding

Article 2 — What “Passwordless” Actually Means

Passwordless vs MFA vs passwordless-first
Knowledge, possession, inherence factors (plain-language explanation)
Where WebAuthn fits
Common myths (passwordless = no backend, passwordless = device lock-in)

Part II — Standards & building blocks

Goal: Introduce the standards without drowning readers in specs.

Article 3 — WebAuthn & FIDO2, Explained Without the Spec

What problems WebAuthn solves
Public-key credentials, challenges, assertions
Platform vs roaming authenticators
User verification vs user presence
Why WebAuthn is phishing-resistant by design

Article 4 — OpenID Connect as the Glue

What OIDC actually provides (and what it doesn’t)
Why PWAs still need federated identity
Authorization Code + PKCE (conceptual, not tutorial-heavy)
Using IdPs for bootstrap, recovery, and portability

Part III — Architecture & best practices

Goal: Show how real systems combine these pieces.

Article 5 — Designing a Passwordless-First PWA Architecture

Decision points: when to attempt WebAuthn vs fallback
Server responsibilities (credential storage, challenges, counters)
Session management in PWAs
Secure token handling (cookies vs storage)
Why offline PWAs complicate auth more than expected

Article 6 — UX and Failure Are Part of the Security Model

Retry flows, lockouts, and graceful degradation
Multi-device reality
Lost device scenarios
Why fallback is not a weakness but a requirement

Part IV — My real implementation

Goal: Ground theory in a real, working flow.

Article 7 — A Real Passwordless PWA Flow (Architecture Walkthrough)

Walk through the full authentication flow I designed
Explain the decision tree (passwordless enabled vs not)
Role of the backend vs browser vs authenticator
Why each branch exists

Article 8 — Implementing WebAuthn in Practice

Tooling used (e.g. WebAuthn server libs, client helpers)
Data models (credential ID, public key, counters)
Common implementation pitfalls
What surprised me during implementation

Article 9 — Integrating OIDC (Feide) as Fallback and Recovery

Why Feide was chosen
How OIDC fits without undermining passwordless
Security boundaries between IdP and my system
Account linking considerations

Part V — Reflection & lessons learned

Goal: Help readers avoid future mistakes.

Article 10 — What Worked, What Didn’t, What I’d Change

Trade-offs I accepted knowingly
Things that looked good on paper but failed in reality
Operational lessons (support, monitoring, edge cases)

Optional supplemental articles

Why Passwordless Alone Is Not an Identity Strategy

Explain why fallback (OIDC, IdPs, recovery flows) is a design requirement, not a compromise
Connect passwordless to enrollment, recovery, device loss, and federation
Show architectural maturity without diving into APIs

How Browser UX Shapes Security More Than Cryptography

How browser and OS UX decisions constrain authentication design
Why the same WebAuthn flow feels different in Chrome, Safari, and mobile OSes
How retries, error messages, and permission dialogs affect security outcomes
Why good UX prevents insecure workarounds more effectively than stronger algorithms

Cron vs Queue vs Event: Choosing the Right Trigger

Nguyễn Việt Tùng — Sun, 01 Feb 2026 04:39:35 GMT

Most systems don’t fail because they picked the wrong database or framework. They fail because they picked the wrong trigger.

Something runs too early. Or too late. Or too often. Or only when a user happens to click a button. The code is correct, the infrastructure is healthy—and the behavior is still wrong.

This article is about learning to choose how work starts. Not how it’s written, not how it’s optimized, but what causes it to run in the first place. The frameworks, tools, and code examples in this article come from my real production implementation.

Cron, queues, and events are not interchangeable. They encode different assumptions about time, causality, and responsibility. Understanding those assumptions is how you avoid architectural debt that only becomes visible years later.

Three Triggers, Three Worldviews

At a high level, these mechanisms answer different questions:

Cron answers: “Is it time?”
Queue answers: “Is there work waiting?”
Event answers: “Did something happen?”

They may all execute code, but they live in different mental models.

Time-Triggered Execution (Cron)

Cron is time-driven. It does not care whether anything changed. It cares only that the clock says “now.”

In a Yii/HumHub setup, this often looks like:

* * * * * php yii cron/run

Inside the application:

class CleanupJob extends \humhub\components\CronJob
{
    public function run()
    {
        Post::deleteAll(['<', 'created_at', strtotime('-30 days')]);
    }

    public function getSchedule()
    {
        return self::SCHEDULE_DAILY;
    }
}

Cron is ideal when:

Work must happen even if nothing else happens
You are reconciling or enforcing invariants
You are comfortable with best-effort timing
Skipped runs are acceptable or recoverable

Cron is indifferent. It runs whether the system is busy or idle, healthy or degraded.

That indifference is its power—and its danger.

Work-Triggered Execution (Queues)

Queues are state-driven. They execute because work exists, not because time passed.

In Yii:

Yii::$app->queue->push(new SendNotificationJob([
    'userId' => $userId,
]));

And later:

php yii queue/run

Queues are ideal when:

Work volume is unpredictable
You need retries, delays, or prioritization
Execution should scale independently of scheduling
Latency matters more than exact timing

Queues care deeply about backlog. If nothing is waiting, nothing runs. If a lot is waiting, they absorb pressure instead of collapsing.

Queues answer how much and how fast. They do not answer when something should be considered in the first place.

Event-Driven Execution

Events are causality-driven. They run because something specific happened.

In application code:

Event::trigger(User::class, User::EVENT_AFTER_INSERT, new Event([
    'sender' => $user,
]));

Or conceptually:

onUserRegistered($user) {
    // react immediately
}

Events are ideal when:

A specific state change matters
The reaction must be immediate or contextual
You want minimal latency
You can tolerate the rare missed event

Events encode meaning, not schedule. They answer why something should run.

But events are fragile when used for work that must happen regardless of user behavior.

A Simple Decision Matrix

When deciding how to trigger work, ask these questions:

1. Does this need to happen even if nobody does anything?

Yes → Cron
No → Queue or Event

2. Does this need to happen immediately when something changes?

Yes → Event
No → Cron or Queue

3. Is the amount of work unpredictable or bursty?

Yes → Queue
No → Cron or Event

4. Can this safely run twice?

No → Avoid cron unless idempotency or locking exists
Yes → Cron becomes viable

5. Is time the reason this work exists?

Yes → Cron
No → Queue or Event

Most bad designs come from answering these questions incorrectly—or not asking them at all.
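
One way to make the matrix concrete is to write the five questions down as data and let a small function read them back. This is a framework-neutral Python sketch, not Yii code; the `Work` fields and the `suggest_triggers` name are illustrative assumptions, and the function only mirrors the questions above rather than replacing judgment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Work:
    """Answers to the five questions above (field names are illustrative)."""
    must_run_without_activity: bool  # Question 1
    must_react_immediately: bool     # Question 2
    bursty_volume: bool              # Question 3
    safe_to_run_twice: bool          # Question 4
    time_is_the_reason: bool         # Question 5


def suggest_triggers(work: Work) -> List[str]:
    """Return the triggers the matrix points at, in cron/event/queue order."""
    suggestions = []
    if work.must_run_without_activity or work.time_is_the_reason:
        if work.safe_to_run_twice:
            suggestions.append("cron")
        else:
            # Question 4: cron is only viable once reruns are harmless.
            suggestions.append("cron (add idempotency or locking first)")
    if work.must_react_immediately:
        suggestions.append("event")
    if work.bursty_volume:
        suggestions.append("queue")
    return suggestions
```

A welcome email (react immediately, bursty volume, time is not the reason) comes back as event plus queue, which is the Event → Queue hybrid described in the next section.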

Hybrid Patterns (Where Real Systems Live)

Mature systems rarely choose just one trigger. They compose them.

Pattern 1: Cron → Queue (Time Decides, Queue Executes)

This is the HumHub pattern that I’ve implemented.

* * * * * php yii cron/run

public function run()
{
    Yii::$app->queue->push(new RecalculateStatsJob());
}

Cron decides when.
The queue decides how fast.

This is ideal for periodic but heavy work.

Pattern 2: Event → Queue (Meaning Decides, Queue Executes)

Event::on(User::class, User::EVENT_AFTER_INSERT, function ($event) {
    Yii::$app->queue->push(new SendWelcomeEmailJob([
        'userId' => $event->sender->id,
    ]));
});

The event provides context.
The queue provides resilience.

This keeps user-facing actions fast while preserving intent.

Pattern 3: Event + Cron (Immediate + Reconciliation)

Events do the fast path. Cron does the safety net.

// Event-driven fast path
function onOrderPaid($order)
{
    markAsProcessed($order);
}

// Cron-driven reconciliation (safety net for events that were missed)
class OrderReconciliationJob extends CronJob
{
    public function run()
    {
        $this->fixInconsistentOrders();
    }
}

This pattern accepts that events can be missed and uses time-based checks to restore correctness.

This is not redundancy. It is defensive architecture.
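
To see why the two halves do not step on each other, here is a minimal Python sketch of the same idea with an in-memory dictionary standing in for the orders table. The data model and the function names `on_order_paid` and `reconcile_orders` are assumptions for illustration, not the production code.

```python
# In-memory stand-in for an orders table (hypothetical data model).
orders = {
    1: {"paid": True, "processed": False},
    2: {"paid": True, "processed": False},
    3: {"paid": False, "processed": False},
}


def on_order_paid(order_id: int) -> None:
    """Fast path: runs when the 'paid' event is actually delivered."""
    orders[order_id]["processed"] = True


def reconcile_orders() -> list:
    """Safety net: run on a schedule and fix any paid order the event missed."""
    fixed = []
    for order_id, order in orders.items():
        if order["paid"] and not order["processed"]:
            order["processed"] = True
            fixed.append(order_id)
    return fixed
```

If the event fires for order 1 but is lost for order 2, the next reconciliation pass picks up order 2 and nothing else, and a second pass finds nothing to do. The reconciliation is itself idempotent, which is what makes it safe to schedule.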

Why Misuse Creates Architectural Debt

The most expensive mistakes are subtle.

Using Cron for Event-Driven Work

Example:

“Check every minute to see who signed up, then send them emails.”

Problems:

Unnecessary polling
Delayed reactions
Growing query cost
Confusing intent

You encoded meaning (user signed up) as time (every minute). That mismatch becomes technical debt.

Using Events for Time-Based Guarantees

Example:

“Clean up expired sessions when users log in.”

What if nobody logs in?

Now cleanup depends on behavior unrelated to the task’s purpose. That’s accidental coupling.

Using Queues as a Scheduler

Example:

“Push a delayed job and hope it fires at the right time.”

Queues are not clocks. Delays drift. Retries compound. Restarts blur guarantees.

Time-based intent belongs to a scheduler, not a backlog.

The Core Insight: Triggers Encode Assumptions

Every trigger bakes in assumptions:

Cron assumes time is the reason work exists
Queues assume work volume is the problem
Events assume causality is the signal

If those assumptions are wrong, the system still works—just increasingly badly.

That’s why misuse is so dangerous. It doesn’t break immediately. It ages poorly.

Designing Instead of Guessing

Instead of asking:

“How should I run this code?”

Ask:

“Why should this code run at all?”

If the answer is “because time passed” → Cron
If the answer is “because work exists” → Queue
If the answer is “because something happened” → Event

Only after that do frameworks, tools, and syntax matter.

The Takeaway

Cron, queues, and events are not competitors. They are orthogonal tools.

Good systems:

Use cron sparingly and deliberately
Let queues absorb pressure
Let events carry meaning
Combine them where guarantees matter

Bad systems pick one and force everything through it.

Choosing the right trigger is not an implementation detail.
It’s an architectural commitment.

And once you see that clearly, entire classes of problems stop appearing—not because you fixed them, but because you never created them in the first place.

Core Series

Introduction
Part 1: Cron: The Invisible Operating System
Part 2: Anatomy of a Cron Job
Part 3: Cron at Scale: Patterns and Anti-Patterns
Part 4: Cron in Frameworks: From Theory to Convention
Part 5: HumHub & Yii: Design Intent Behind the Cron Architecture
Part 6: A Real Production Setup: What I Actually Built
Part 7: Failure Modes, Tradeoffs, and Lessons Learned
Part 8: The Evolution Path: From Cron to Orchestration

Optional Extras

⏳ Cron Lies: When Scheduled Jobs Don’t Run
🔁 Idempotency: The Most Important Word in Cron
→ ⚖️ Cron vs Queue vs Event

Idempotency: The Most Important Word in Cron You’re Probably Ignoring

Nguyễn Việt Tùng — Sun, 01 Feb 2026 04:18:35 GMT

If cron has a single moral lesson, it’s this: time does not guarantee uniqueness.

Jobs run twice. Or zero times. Or half a time. Or later than expected. Cron does not promise exactly-once execution, and every system that assumes it does eventually pays for that assumption.

Idempotency is how you survive that reality.

This article unpacks what idempotency really means, why cron jobs must be idempotent, what goes wrong when they aren’t, and how to design idempotent jobs in Yii and HumHub, based on patterns from my real production implementation rather than theory.

What Idempotency Actually Means (Without Hand-Waving)

A piece of logic is idempotent if running it multiple times produces the same final state as running it once.

Not “similar.”
Not “probably fine.”
The same.

Classic examples:

Setting a value: status = 'active'
Rebuilding an index from source-of-truth data
Deleting records older than a cutoff date

Non-idempotent actions:

Incrementing counters
Sending emails
Charging money
Appending rows blindly
“Process everything since last run” without state

Idempotency is not about how often something runs.
It’s about what happens when it runs again.

Cron forces this distinction because repetition is normal, not exceptional.

Why Cron Jobs Must Be Idempotent

Cron has no memory. In more technical terms, it is stateless.

It does not know:

Whether a job ran before
Whether it finished
Whether it partially failed
Whether it overlapped with itself
Whether the system was down for hours

From cron’s point of view, every run is a fresh attempt.

That means every cron job must assume:

It may run twice
It may run late
It may run concurrently
It may be retried manually
It may be restarted mid-execution

If a job cannot tolerate these conditions, it is fragile by definition.

Idempotency is not a “best practice” here.
It is a precondition for correctness.

Real Non-Idempotent Disasters (All Painfully Common)

1. Duplicate Emails

public function run()
{
    foreach ($this->getUsersToNotify() as $user) {
        $this->mailer->send($user);
    }
}

What happens when:

The job overlaps?
Someone reruns it manually?
The server restarts mid-run?

Users get two emails. Or three. Or five.

Nothing crashes. Logs look normal. Trust erodes quietly.

2. Double Charges / Double Credits

public function run()
{
    $orders = Order::find()->where(['status' => 'pending'])->all();

    foreach ($orders as $order) {
        $this->charge($order);
        $order->status = 'paid';
        $order->save();
    }
}

If the process crashes after charge() but before save():

Next run charges again
Status still says pending
You now have financial damage, not a bug

This is how companies end up writing apology emails.

3. “Since Last Run” Logic Without State

$lastRun = strtotime('-1 hour');
$items = Item::find()->where(['>', 'created_at', $lastRun])->all();

This assumes:

The job ran exactly one hour ago
Time moved forward smoothly
No downtime occurred

None of these assumptions holds reliably in production.
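
One way to drop those assumptions is to persist a checkpoint and advance it only after each item is handled. The Python sketch below is an illustration under stated assumptions: `process_new_items`, the `state` dictionary (a database row in practice), and the `(created_at, item_id)` checkpoint are all hypothetical names, and it assumes rows never arrive with a timestamp older than the checkpoint.

```python
def process_new_items(items, state, handle):
    """Process items newer than the stored checkpoint, then advance it.

    items:  list of (item_id, created_at) tuples
    state:  dict holding the persisted 'checkpoint' (a DB row in practice)
    handle: callable applied to each new item id
    """
    # The checkpoint pairs timestamp and id so equal timestamps are not skipped.
    checkpoint = state.get("checkpoint", (0, 0))
    new_items = sorted(
        (created_at, item_id)
        for item_id, created_at in items
        if (created_at, item_id) > checkpoint
    )
    for created_at, item_id in new_items:
        handle(item_id)
        # Advance only after the item is handled, so a crash resumes from here.
        state["checkpoint"] = (created_at, item_id)
    return len(new_items)
```

The job no longer cares whether it last ran an hour ago or a week ago. It asks the stored state where it stopped, which is the "state, not time" idea the rest of this article builds on.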

The Mental Shift: Cron Jobs Are Reconciliation Jobs

The safest cron jobs don’t process events.
They reconcile state.

Instead of:

“Do X for everything that happened since last time”

Think:

“Given the current truth, what should the system look like now?”

That shift is the heart of idempotency.

Yii & HumHub: Practical Idempotency Patterns

Let’s move from theory to code. Yii and HumHub power my real production implementation.

1. Use State as a Guard, Not Time

Instead of tracking when something last ran, track what has already been done.

$users = User::find()
    ->where(['notification_sent' => false])
    ->all();

foreach ($users as $user) {
    $this->sendNotification($user);
    $user->notification_sent = true;
    $user->save();
}

Now:

Reruns do nothing
Partial runs resume safely
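Overlaps converge on the same final state (two concurrent runs can still each send before either saves the flag, which is where locks or an atomic update come in)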

This is idempotency via explicit state.
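
When overlapping runs are a real possibility, the guard can be made atomic by letting the database decide which run owns a row. The sketch below uses Python's built-in `sqlite3` module purely for illustration; the `users` table layout and the `claim_and_notify` helper are assumptions, not the production schema.

```python
import sqlite3


def claim_and_notify(conn, user_id, send):
    """Flip the flag first; only the run whose UPDATE changed a row sends."""
    cur = conn.execute(
        "UPDATE users SET notification_sent = 1 "
        "WHERE id = ? AND notification_sent = 0",
        (user_id,),
    )
    conn.commit()
    if cur.rowcount == 1:
        # This run won the claim, so it is the only one allowed to send.
        send(user_id)
        return True
    # Another run (or an earlier one) already claimed this user.
    return False
```

The trade-off is deliberate: claiming before sending means a crash between the two steps loses one notification instead of duplicating it. Which side of that trade you want depends on whether a missing message or a repeated one hurts more.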

2. Make Jobs Self-Skipping

In HumHub-style cron jobs, it’s completely valid for a job to decide it has nothing to do.

public function run()
{
    if (!$this->hasWorkToDo()) {
        Yii::info('CleanupJob: nothing to clean');
        return;
    }

    $this->cleanup();
}

A job that runs and does nothing is successful, not broken.

Idempotent jobs are comfortable with no-ops.

3. Use Unique Constraints as a Safety Net

Databases are excellent idempotency enforcers.

UNIQUE (user_id, notification_type)

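$notification = new Notification([
    'user_id' => $userId,
    'notification_type' => 'weekly_summary',
]);

try {
    // Skip validation so the UNIQUE constraint is the single arbiter
    $notification->save(false);
} catch (\yii\db\IntegrityException $e) {
    // Duplicate key: the row already exists, so this is safe to ignore
}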

Now:

Double execution collapses into one record
The database enforces correctness
Your job logic stays simple

This is one of the strongest patterns available.

4. Lock Only When You Must

Idempotency reduces the need for locks—but doesn’t eliminate it.

For jobs that must not overlap:

$lock = Yii::$app->mutex;

if (!$lock->acquire('daily_cleanup', 0)) {
    return;
}

try {
    $this->cleanup();
} finally {
    $lock->release('daily_cleanup');
}

Locks prevent concurrency.
Idempotency prevents damage.

They solve different problems. Use both intentionally.

5. Rebuild, Don’t Mutate, When Possible

The most robust cron jobs rebuild derived data from scratch.

SearchIndex::deleteAll();

// each() iterates in batches instead of loading every post into memory
foreach (Post::find()->each() as $post) {
    SearchIndex::index($post);
}

This feels inefficient—but it’s correct.

If the source of truth is reliable, rebuilding is naturally idempotent.

Optimization can come later. Correctness cannot.

Why Idempotency Is Rarely Explained Well

Because it’s uncomfortable.

It forces you to confront:

Partial failure
Repetition
Uncertainty
The illusion of control

Most tutorials assume:

Jobs run once
Systems are up
Time is reliable
Humans don’t rerun things

Production assumes none of that.

Idempotency is not glamorous.
It doesn’t show up in benchmarks.
But it quietly prevents disasters.

Why This Is Immediately Practical

Once you start asking:

“What happens if this runs twice?”

Your design changes immediately.

You stop using counters blindly
You stop trusting timestamps
You stop assuming order
You start encoding intent in state

Cron becomes safer overnight—not because cron changed, but because your thinking did.

The Real Takeaway

Cron doesn’t punish non-idempotent code instantly.
It waits.

It lets the system run.
It lets data accumulate.
It lets assumptions fossilize.

Then one day:

A server restarts
A job overlaps
Someone reruns a command

And the damage appears all at once.

Idempotency is how you make cron boring again.

And boring, in production systems, is the highest compliment there is.

Cron Lies: When Scheduled Jobs Don’t Run

Nguyễn Việt Tùng — Sun, 01 Feb 2026 03:53:32 GMT

Cron has a reputation for honesty. You tell it when, it runs then. If something didn’t happen, the instinctive conclusion is: “the code must be broken.”

In production, that assumption is often wrong.

This article is about the more uncomfortable truth: cron can fail without lying to you explicitly. Jobs don’t run, or don’t run when you think they do, and the system emits no signal strong enough to attract attention. Everything looks calm. Until a human notices a missing effect.

This is where defensive thinking begins.

The Most Dangerous Failure Mode: Silence

Cron almost never crashes loudly. It fails quietly.

If a scheduled job:

Never executes
Executes under the wrong conditions
Executes later than expected
Executes and does nothing

…the system often produces the same observable result: nothing happens.

That silence is what makes cron failures expensive. They age quietly.

“Every Minute” Is a Filter, Not a Promise

Let’s start with the most common misunderstanding:

* * * * * php yii cron/run

This does not mean:

“This command will run every 60 seconds, no matter what.”

What it actually means:

“At each minute boundary where the system clock matches this expression, attempt execution.”

That distinction matters more than most people realize.

Cron does not:

Track last execution time
Compensate for downtime
Retry missed runs
Guarantee spacing between runs

It evaluates the current time, not elapsed time.

If the system clock jumps forward, backward, or disappears entirely for a while, cron does not reconcile history. It lives in the present tense only.

Downtime Windows: The Lie of Continuity

In my real production setup, this constraint existed:

Development and UAT servers were shut down daily from 00:00 to 08:00 SGT

During that window:

No cron daemon
No cron/run
No queue/run

From cron’s perspective, those minutes never existed.

Consider an interval job inside HumHub (which powered my real production web app) that conceptually runs “hourly”:

class CleanupJob extends \humhub\components\CronJob
{
    public function run()
    {
        // cleanup logic
    }

    public function getSchedule()
    {
        return self::SCHEDULE_HOURLY;
    }
}

If this job was due at:

01:00
02:00
03:00

…and the server was down?

Those executions are not “late.”
They are lost.

When the server boots at 08:00:

cron/run evaluates now
The job decides whether now matches its schedule
Missed intent is not replayed unless explicitly coded

This is not a bug. It’s a property.

The Defensive Lesson

If a job must run for every interval, cron alone is insufficient.
You need state.

If a job can tolerate skips, cron is perfectly adequate—but only if you consciously design for that tolerance.
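
If every interval matters, the missing piece is a stored "last successful run" that the job consults on startup. The Python sketch below shows the arithmetic only; `missed_runs` and its parameters are hypothetical names, and persisting `last_run` (database row, file, cache) is left to the caller.

```python
from datetime import datetime, timedelta


def missed_runs(last_run: datetime, now: datetime, interval: timedelta):
    """Return every scheduled instant after last_run, up to and including now."""
    due = []
    next_due = last_run + interval
    while next_due <= now:
        due.append(next_due)
        next_due += interval
    return due
```

With an hourly job whose last run was at 00:00 and a server that boots at 08:00, this returns the eight instants from 01:00 through 08:00. Whether you replay each one or collapse them into a single catch-up run is a design decision, but at least it is now a decision rather than an accident.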

Silent Failure #1: Disabled Cron Services

One of the most insidious cron failures is also the simplest:

The cron daemon is not running.

This happens more often than people admit:

Servers reboot
Cron services fail to start
Containers don’t include cron at all
systemd timers are misconfigured

From the application’s point of view:

Nothing throws an error
No exception is logged
No stack trace exists

My Yii or HumHub code is innocent. It was never invoked.

Defensive Pattern

Treat cron as external infrastructure, not application logic.

At minimum:

Periodically log “cron heartbeat” execution
Alert if no cron activity is observed for a threshold window

Cron itself will not tell you it is dead.
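
A heartbeat can be as small as one timestamp and one comparison. The Python sketch below is an assumption-laden illustration: `record_heartbeat`, `cron_looks_dead`, and the `store` dictionary (standing in for a database row or a file) are hypothetical names, not part of Yii or HumHub.

```python
import time
from typing import Optional


def record_heartbeat(store: dict, now: Optional[float] = None) -> None:
    """Called at the end of every cron run; 'store' stands in for a DB row or file."""
    store["last_heartbeat"] = time.time() if now is None else now


def cron_looks_dead(
    store: dict, max_silence_seconds: float, now: Optional[float] = None
) -> bool:
    """True when no heartbeat has been recorded within the allowed window."""
    now = time.time() if now is None else now
    last = store.get("last_heartbeat")
    return last is None or (now - last) > max_silence_seconds
```

The check has to run somewhere that does not depend on cron itself, such as a monitoring probe or a health endpoint. A dead scheduler cannot report its own death.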

Silent Failure #2: Output Suppression Without Compensation

My production crontab did this:

>/dev/null 2>&1

That choice was intentional and reasonable. But it came with a requirement:

Every meaningful failure must be logged inside the application.

If a job does this:

public function run()
{
    $result = $this->callExternalApi();

    if (!$result) {
        return;
    }

    // continue
}

Then a failure looks identical to:

Job never ran
Job ran but exited early
Job ran and succeeded with no effect

From the outside, all three collapse into silence.

Defensive Pattern

Every cron job should answer at least one of these questions explicitly:

“I ran”
“I skipped myself”
“I failed”

Not necessarily loudly—but traceably.

Silence must be meaningful, not ambiguous.
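
A thin wrapper is usually enough to guarantee that every run leaves one of those three traces. This is a Python sketch of the idea, not the HumHub job runner; `run_job`, `has_work`, and `do_work` are hypothetical names.

```python
import logging

logger = logging.getLogger("cron")


def run_job(name, has_work, do_work):
    """Run a job and always record exactly one of: ran, skipped, failed."""
    try:
        if not has_work():
            logger.info("%s: skipped (nothing to do)", name)
            return "skipped"
        do_work()
        logger.info("%s: ran", name)
        return "ran"
    except Exception:
        # logger.exception records the traceback alongside the message.
        logger.exception("%s: failed", name)
        return "failed"
```

With this in place, an early return is no longer indistinguishable from a job that never started. The absence of any log line now means exactly one thing: the job was not invoked.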

Silent Failure #3: Clock Drift

Cron trusts the system clock. Completely.

If the clock is wrong:

Jobs run at the wrong time
Jobs cluster unexpectedly
Jobs appear to “randomly” skip

Clock drift is especially dangerous because:

It accumulates slowly
It rarely breaks tests
It rarely shows up in logs

You don’t need extreme drift for damage. A few minutes is enough to:

Break SLAs
Miss external deadlines
Violate assumptions baked into job logic

Defensive Pattern

Time-based systems should:

Assume time can be wrong
Avoid exact-time equality checks
Prefer ranges over instants

Cron is punctual only relative to the clock it’s given.
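
The difference between an instant and a range is easiest to see side by side. In this Python sketch both function names and their parameters are assumptions made for illustration.

```python
from datetime import datetime


def is_due_exact(now: datetime, hour: int, minute: int) -> bool:
    """Fragile: misses the run if the check lands even one minute late."""
    return now.hour == hour and now.minute == minute


def is_due_range(now: datetime, scheduled: datetime, already_ran_at) -> bool:
    """Tolerant: due any time at or after the scheduled instant, but only once."""
    if now < scheduled:
        return False
    return already_ran_at is None or already_ran_at < scheduled
```

A check that arrives at 09:03 for a 09:00 job fails the exact comparison and passes the range comparison. The `already_ran_at` guard is what stops the tolerant version from firing on every later check.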

Silent Failure #4: Overlap That Cancels Itself Out

A job that overlaps itself can fail without error.

Example:

* * * * * php yii queue/run

If queue/run takes longer than one minute:

A second instance starts
Both compete for resources
One may exit early
Or both may do partial work

If jobs are idempotent, this is survivable.
If they are not, this is corruption without alarms.

Nothing crashes. No exception bubbles up.
The system simply does the wrong thing quietly.
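
When overlap is not acceptable, a non-blocking file lock lets the second instance notice the first and step aside. The Python sketch below uses the standard `fcntl` module, so it applies to Unix-like systems only; `LOCK_PATH`, `try_lock`, and `release` are hypothetical names.

```python
import fcntl
import os
import tempfile

# Hypothetical lock location; any path writable by the job will do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "queue-run.lock")


def try_lock(path: str = LOCK_PATH):
    """Return an open file holding an exclusive lock, or None if already held."""
    handle = open(path, "a")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def release(handle) -> None:
    """Drop the lock and close the file."""
    fcntl.flock(handle, fcntl.LOCK_UN)
    handle.close()
```

A run that gets `None` back should log that it skipped itself and exit. The operating system releases the lock if the holder crashes, so a dead process cannot block the next minute's run.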

The Big Lie: “If It Didn’t Happen, Something Would Have Alerted”

Cron does not alert.
Cron does not retry.
Cron does not remember.

Unless you build signals around it, cron failures degrade into:

Missing emails
Stale data
Incomplete cleanup
User-visible oddities discovered late

This is why cron failures are so often discovered by humans first.

Why This Is So Valuable to Learn Early

The dev/UAT downtime case is worth studying early because it reveals something deeper:

Cron teaches you what your system actually assumes about time.

Nightly downtime forces the issue.
Clock drift exposes it.
Silent failures make it undeniable.

Once you internalize this, your design changes:

Jobs become idempotent
Skipped runs are expected, not feared
“Every minute” is treated as best-effort
Observability moves into the application layer

You stop trusting time blindly.

The Defensive Mindset Cron Forces

Cron is not malicious.
It’s not broken.
It’s just honest in a way that software engineers aren’t used to.

It will:

Try to run things when time matches
Say nothing if it can’t
Move on without guilt

If you design with that truth in mind, cron becomes predictable—even comforting.

If you don’t, cron will lie to you politely for months.

And the worst part is not that scheduled jobs don’t run.

It’s that nobody notices when they don’t.

The Evolution Path: From Cron to Orchestration

Nguyễn Việt Tùng — Sun, 25 Jan 2026 04:25:51 GMT

Cron has a strange reputation arc. Early in a system’s life, it feels empowering. Later, it’s blamed for things it never promised to do. Somewhere in the middle, teams either replace it wholesale—or quietly keep it while pretending they didn’t.

The truth is less dramatic. Cron doesn’t become obsolete; systems outgrow what they ask cron to do.

This final article is about placing cron correctly in the modern ecosystem: knowing when it’s enough, when it needs help, and how to evolve without ripping out the foundation you’ve already built.

When Cron Is Enough

Cron is enough when time-based intent is simple and execution is bounded.

In the system described earlier—built with Yii and HumHub—cron remained effective because it stayed within its competence:

Schedules were coarse (minute-level, not second-level)
Jobs were idempotent
Heavy work was delegated
The number of scheduling entry points was small
Operational expectations were clear

A setup like this is not fragile. It’s boring. And boring infrastructure ages well.

If your system:

Runs on a small number of hosts
Has predictable workloads
Can tolerate minute-level latency
Already separates scheduling from execution

Then cron is not your bottleneck. Complexity elsewhere will hurt you first.

When Cron Starts to Feel Tight

Cron starts to feel constraining when execution outpaces scheduling.

Common signals include:

Queues growing faster than they drain
Jobs overlapping more often than expected
Pressure to reduce latency below one minute
Multiple machines needing coordination
Manual reruns becoming operationally risky

None of these mean cron is “bad.” They mean cron is being asked to coordinate behavior across time, load, and topology—which is beyond its remit.

This is the point where you add layers, not replacements.

Adding Persistent Workers

The first evolution is usually not replacing cron, but removing its loop overhead.

In earlier examples, async jobs were processed like this:

* * * * * php yii queue/run

This works well, but it has a cost:

PHP boots every minute
Configuration reloads every minute
Cold starts dominate short jobs

With higher volume, the natural step is persistent workers.

Conceptually, the code doesn’t change:

Yii::$app->queue->push(new SendNotificationJob([
    'userId' => $userId,
]));

What changes is how workers run:

Long-lived processes
Managed by a supervisor
Restarted on failure
Scaled horizontally

Cron still matters here. It often remains responsible for:

Kicking off workers if they die
Scheduling low-frequency maintenance tasks
Acting as a safety net

Cron steps back from execution, not from scheduling.
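
The shape of a persistent worker is a loop that starts once and keeps pulling jobs instead of booting per minute. This Python sketch uses an in-memory `queue.Queue` as a stand-in for a real queue backend; `worker_loop` and its parameters are assumptions for illustration, not the Yii queue worker.

```python
import queue
import threading


def worker_loop(jobs: queue.Queue, stop: threading.Event, results: list) -> None:
    """Long-lived loop: start once, then keep pulling jobs until told to stop."""
    while not stop.is_set():
        try:
            job = jobs.get(timeout=0.1)  # wait briefly instead of exiting
        except queue.Empty:
            continue  # idle: nothing waiting, keep the process alive
        try:
            results.append(job())
        except Exception as exc:
            # One failed job must not kill the worker.
            results.append(exc)
        finally:
            jobs.task_done()
```

Startup cost is paid once, not once per minute. The price is that the process now needs a supervisor, because a loop that can live forever can also die quietly.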

Introducing Distributed Queues

The next pressure point is distribution.

As soon as multiple machines process jobs, new questions appear:

Who owns which jobs?
How are retries coordinated?
How do you prevent double execution?

Distributed queues answer these questions by centralizing state.

From the application’s point of view, nothing dramatic changes:

Yii::$app->queue->push(new RebuildIndexJob([
    'resourceId' => $id,
]));

But operationally:

Workers can live anywhere
Failures are isolated
Throughput scales independently of scheduling

Cron’s role narrows again:

Trigger periodic enqueues
Schedule reconciliation or cleanup
Act as a time-based initiator

The system becomes event-heavy, but cron still provides temporal structure.

Cloud Schedulers: Cron, With a Different Accent

Cloud schedulers often get framed as “replacing cron.” They don’t. They externalize it.

A cloud scheduler:

Triggers execution based on time
Runs independently of your hosts
Integrates with managed services

What changes is where the clock lives, not the concept.

Instead of:

* * * * * php yii cron/run

You get:

A managed time trigger
Calling an endpoint
Or invoking a job runner

The same design questions remain:

What happens if execution is delayed?
How do you ensure idempotency?
Where does state live?

Cloud schedulers reduce operational burden, not architectural responsibility.

Migration Strategies That Don’t Hurt

The biggest mistake teams make is trying to “modernize” cron in one leap.

The safer pattern is progressive delegation.

1. Start by isolating scheduling intent. If schedules live in crontab entries scattered across servers, centralize them in application code first.
2. Delegate execution gradually. Move heavy logic behind queues or workers without changing cron triggers.
3. Introduce persistence where it helps. Replace minute-based polling with long-lived workers only when startup cost dominates.
4. Externalize time last. Move the clock out of the OS only when infrastructure maturity supports it.

At no point do you need to declare cron “deprecated.” You just ask it to do less.

Hybrid Models: Cron + Event-Driven Systems

In mature systems, cron rarely disappears. It becomes one trigger among many.

A common hybrid looks like this:

Events trigger most work
Queues handle execution
Workers process continuously
Cron handles:
- Reconciliation
- Cleanup
- Periodic audits
- Safety checks

Cron becomes the system’s conscience—periodically asking, “Is reality still consistent with our assumptions?”

That role doesn’t go away, no matter how event-driven you become.

Knowing Cron’s Proper Place

Cron’s proper place is not at the center of execution.
It’s at the boundary between time and intent.

It answers:

When should something be considered?

It does not answer:

How should it scale?
How should it recover?
How should it coordinate across machines?

The moment you expect cron to answer those questions, you’re setting yourself up for disappointment.

But when you let cron do exactly what it’s good at—and no more—it remains one of the most stable pieces of infrastructure you’ll ever rely on.

The Quiet Ending

There’s a reason cron survives every architectural fashion cycle. It encodes a truth that doesn’t age: time keeps passing, whether your system reacts or not.

Modern orchestration doesn’t replace that truth. It builds around it.

If readers walk away from this series with one instinct sharpened, let it be this:

Don’t ask cron to be clever. Ask it to be punctual—and design everything else to handle the consequences.

That’s not nostalgia.
That’s systems thinking.
