Introduction

Getting Started

Welcome to the Authentication Service! This tutorial will guide you through setting up your local development environment and running the service for the first time.

Prerequisites

Before you begin, ensure you have the following installed:

  • Docker and Docker Compose
  • Just (a command runner: brew install just or cargo install just)
  • Node.js 18+ (optional, as the service runs in Docker)

1. Clone and Prepare

  1. Clone the repository and navigate to the project root:

    git clone <repository-url>
    cd auth-service
    
  2. Create your local environment file:

    cp docker/prod/.env.example docker/dev/.env
    

    Note: For development, the default credentials in .env.example are usually sufficient, but ensure they don’t conflict with other services.

2. Start the Service

The project uses Just to simplify Docker commands. To start the development environment (including PostgreSQL and Redis):

just up dev

This command will:

  • Build the Docker image.
  • Start PostgreSQL and Redis containers.
  • Run npm install inside the application container.
  • Start the application with SWC (hot-reload enabled).

3. Verify the Installation

Once the containers are healthy, you can verify the service is running:

API Documentation (Swagger)

Open your browser at: http://localhost:3001/auth/swagger

Health Check

You can check the health status of the service and its dependencies:

curl http://localhost:3001/auth/v1/health/ready

4. Useful Commands

Here are the most common commands you’ll use during development:

  • View Logs: just logs dev
  • Stop Service: just down dev
  • Run Tests: just test
  • Container Shell: just shell

Next Step: Check out the Development Guide for more details on daily tasks.

Development Guide

This guide covers common tasks and best practices when developing for the Authentication Service.

Local Workflow

The entire service is containerized to ensure consistency across environments. The project uses a Justfile to wrap complex Docker commands.

Running the Service

  • Start: just up dev (starts DB, Redis, and API)
  • Stop: just down dev (removes containers and volumes, so any data stored in those volumes is lost)
  • Logs: just logs dev (follows logs from all containers)

Executing Commands

To run commands inside the running auth-service container:

just shell
# inside the shell, you can run:
npm run lint
npm run format

Testing

Tests are executed inside the Docker environment to ensure they run against the correct versions of dependencies.

  • All Tests: just test
  • Unit Tests: docker compose -f docker/dev/compose.yml exec auth-service npm run test
  • E2E Tests: docker compose -f docker/dev/compose.yml exec auth-service npm run test:e2e

Debugging

The development container exposes port 9229 for the Node.js debugger. You can attach your IDE (VS Code, WebStorm) to this port to debug the running application.

Coding Standards

  • Linting: We use ESLint. Run npm run lint to check for issues.
  • Formatting: We use Prettier. Run npm run format to automatically fix formatting.
  • SWC: We use the SWC builder for faster compilation during development.

See also: Database Migrations

Database Migrations

This guide explains how to manage schema changes using TypeORM migrations.

How it Works

In the development environment, migrations run automatically on startup. Setting the DB_MIGRATIONS_RUN environment variable to true enables this behavior.

Creating a New Migration

When you modify an entity in src/modules/authentication/entities/, you need to generate a migration:

  1. Ensure the service is running: just up dev.
  2. Generate the migration file:
    just shell
    npm run migration:generate -- src/modules/app/migrations/NameOfYourMigration
    
  3. A new file will be created in src/modules/app/migrations/. Review it to ensure it does exactly what you expect.
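
For orientation when reviewing, a generated migration usually has roughly the shape sketched below. This is an illustration only: the two interfaces are local stand-ins for TypeORM's MigrationInterface and QueryRunner (a real migration imports them from typeorm), and the class name, timestamp, and pushToken column are hypothetical.

```typescript
// Local stand-ins so the sketch runs on its own; a real migration
// imports MigrationInterface and QueryRunner from "typeorm".
interface QueryRunner {
  query(sql: string): Promise<unknown>;
}
interface MigrationInterface {
  up(queryRunner: QueryRunner): Promise<void>;
  down(queryRunner: QueryRunner): Promise<void>;
}

// Hypothetical example: add a nullable column to the devices table.
class AddDevicePushToken1700000000000 implements MigrationInterface {
  name = "AddDevicePushToken1700000000000";

  // Executed when pending migrations are run.
  async up(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(
      `ALTER TABLE "devices" ADD "pushToken" character varying`
    );
  }

  // Executed on revert; it should undo exactly what up() did.
  async down(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(`ALTER TABLE "devices" DROP COLUMN "pushToken"`);
  }
}
```

When reviewing a real file, check that down() reverses every statement in up(), and look closely at any DROP or type change that could discard data.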

Running Migrations Manually

If you need to run or revert migrations manually, run these commands from the container shell (just shell):

# Run pending migrations
npm run migration:run

# Revert the last applied migration
npm run migration:revert

Best Practices

  • Never modify an existing migration file once it has been committed. Create a new one instead.
  • Check for data loss: Always review the generated SQL, especially if you are dropping columns or changing types.
  • Production: In production, DB_SYNCHRONIZE is always set to false, and migrations are the only way to update the schema.

Production Deployment Guide

This guide describes how to deploy the Authentication Service in a production-like environment using Docker Compose.

🐳 Docker Deployment

The production setup uses an optimized Dockerfile and a hardened compose.yml.

1. Build and Start the Containers

just up prod

This command uses docker/prod/compose.yml to build and launch the containers.

2. Environment Configuration

Ensure your .env file in docker/prod/ is properly configured. Key variables:

  • NODE_ENV=production
  • DB_SYNCHRONIZE=false
  • DB_MIGRATIONS_RUN=true
  • JWT_PRIVATE_KEY / JWT_PUBLIC_KEY (ECDSA P-256)
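
A startup check along the following lines can catch a misconfigured .env before the service accepts traffic. It is a minimal sketch: checkProductionEnv is a hypothetical helper, not part of the service, and it only verifies that the key variables are present, not that they hold valid P-256 keys.

```typescript
type Env = Record<string, string | undefined>;

// Returns a list of human-readable problems; an empty list means the
// environment matches the production settings listed above.
function checkProductionEnv(env: Env): string[] {
  const problems: string[] = [];
  const expected: Array<[string, string]> = [
    ["NODE_ENV", "production"],
    ["DB_SYNCHRONIZE", "false"],
    ["DB_MIGRATIONS_RUN", "true"],
  ];
  for (const [variable, value] of expected) {
    if (env[variable] !== value) {
      problems.push(`${variable} must be "${value}"`);
    }
  }
  for (const variable of ["JWT_PRIVATE_KEY", "JWT_PUBLIC_KEY"]) {
    const value = env[variable];
    if (value === undefined || value.trim() === "") {
      problems.push(`${variable} is missing`);
    }
  }
  return problems;
}
```

In a Node.js process this could be called as checkProductionEnv(process.env), aborting startup when the returned list is not empty.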

3. JWT Key Generation (Mandatory)

For production, you must generate your own keys. Do not use the defaults.

# Generate private key
openssl ecparam -genkey -name prime256v1 -noout -out private-key.pem

# Generate public key
openssl ec -in private-key.pem -pubout -out public-key.pem
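
To see how such a key pair is used, the sketch below generates an equivalent P-256 pair with Node's built-in crypto module and signs and verifies a compact ES256 token by hand. It is an illustration under assumptions: signEs256 and verifyEs256 are hypothetical helpers, the claims are invented, and the service itself may rely on a JWT library instead.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Equivalent of the two openssl commands: a P-256 (prime256v1) key pair.
const { privateKey, publicKey } = generateKeyPairSync("ec", {
  namedCurve: "prime256v1",
});

// Same PEM flavor that `openssl ecparam -genkey` writes (SEC1).
const privatePem = privateKey.export({ type: "sec1", format: "pem" }).toString();

const encodeJson = (value: object): string =>
  Buffer.from(JSON.stringify(value)).toString("base64url");

// Signs a payload as a compact JWS with alg ES256.
function signEs256(payload: object, key: KeyObject): string {
  const signingInput =
    encodeJson({ alg: "ES256", typ: "JWT" }) + "." + encodeJson(payload);
  // JWS uses the raw r||s signature form, not OpenSSL's default DER.
  const signature = sign("sha256", Buffer.from(signingInput), {
    key,
    dsaEncoding: "ieee-p1363",
  });
  return signingInput + "." + signature.toString("base64url");
}

// Checks the signature only; expiry and header checks are left out.
function verifyEs256(token: string, key: KeyObject): boolean {
  const parts = token.split(".");
  if (parts.length !== 3) return false;
  const [header, payload, signature] = parts;
  return verify(
    "sha256",
    Buffer.from(header + "." + payload),
    { key, dsaEncoding: "ieee-p1363" },
    Buffer.from(signature, "base64url")
  );
}

const token = signEs256(
  { sub: "hypothetical-user-id", exp: Math.floor(Date.now() / 1000) + 900 },
  privateKey
);
```

Only the holder of the private key can produce a token that verifyEs256(token, publicKey) accepts, which is why the default keys must never reach production.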

🔒 Security Best Practices

Database

  • Migrations: Always use npm run migration:run. Never use synchronize: true.
  • Backups: Implement a regular pg_dump schedule.

Redis

  • Use Redis Sentinel for high availability.
  • Ensure REDIS_PASSWORD is strong and unique.

Infrastructure (Kubernetes/Istio)

While Docker Compose is used for standalone deployments, the primary production environment is Kubernetes with Istio:

  • mTLS: Ensure Istio PeerAuthentication is set to STRICT.
  • Resources: Assign at least 1 vCPU and 2 GB RAM per replica.

📊 Monitoring

  • Health Checks: Monitor /auth/v1/health/ready.
  • Metrics: Prometheus metrics are available at :3001/metrics (if enabled).
  • Logs: Use a log aggregator (Loki, ELK) to collect JSON logs from the container.

System Features

The Authentication Service (auth-service) provides a robust set of features for user identity management, multi-device synchronization, and end-to-end security.

Core Authentication

  • Phone Number Verification: Registration and login using SMS-based verification (integration with Twilio/Vonage).
  • Stateless JWT: Issuance and validation of Access and Refresh tokens using ECDSA (P-256) signing.
  • Session Management: Full control over active sessions with automatic token rotation and revocation.

Security & 2FA

  • Two-Factor Authentication (2FA): Support for TOTP (Google Authenticator, etc.) with QR code setup.
  • Backup Codes: Generation and management of emergency recovery codes for 2FA.
  • Brute Force Protection: Rate limiting and account lockout mechanisms.
  • Audit Logs: Comprehensive history of login attempts and security events.
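
For readers unfamiliar with TOTP, the sketch below implements the standard algorithm (RFC 6238 on top of RFC 4226 HOTP) with Node's crypto module. It illustrates the technique and is not the service's implementation: the function names are hypothetical, the secret is passed as raw bytes (authenticator apps usually exchange it in base32), and a production check should also compare codes in constant time.

```typescript
import { createHmac } from "node:crypto";

// HOTP (RFC 4226): HMAC-SHA1 over an 8-byte big-endian counter,
// followed by dynamic truncation to a short decimal code.
function hotp(secret: Buffer, counter: number, digits = 6): string {
  const message = Buffer.alloc(8);
  message.writeUInt32BE(Math.floor(counter / 0x100000000), 0);
  message.writeUInt32BE(counter >>> 0, 4);
  const digest = createHmac("sha1", secret).update(message).digest();
  const offset = digest[digest.length - 1] & 0x0f;
  const binary = digest.readUInt32BE(offset) & 0x7fffffff;
  return (binary % 10 ** digits).toString().padStart(digits, "0");
}

// TOTP (RFC 6238): HOTP where the counter is the number of elapsed steps.
function totp(secret: Buffer, unixSeconds: number, stepSeconds = 30, digits = 6): string {
  return hotp(secret, Math.floor(unixSeconds / stepSeconds), digits);
}

// Accepts the current step plus `window` steps on either side
// to tolerate clock drift between server and authenticator app.
function verifyTotp(
  secret: Buffer,
  code: string,
  unixSeconds: number,
  window = 1,
  stepSeconds = 30
): boolean {
  for (let i = -window; i <= window; i++) {
    if (totp(secret, unixSeconds + i * stepSeconds, stepSeconds) === code) {
      return true;
    }
  }
  return false;
}
```

With the RFC test secret "12345678901234567890" and a timestamp of 59 seconds, totp returns "287082", matching the published test vectors.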

Multi-Device Management

  • Device Registration: Automatic fingerprinting and registration of new devices (iOS, Android, Web).
  • Remote Logout: Ability for users to view and revoke access for any of their active devices.
  • QR Code Login: Quick authentication on a new device by scanning a challenge from an already authenticated device.

End-to-End Encryption (Signal Protocol)

  • Identity Keys: Management of long-term identity keys for cryptographic verification.
  • PreKeys System: Storage and distribution of PreKeys and Signed PreKeys to enable asynchronous session establishment.
  • Multi-Device Sync: Cryptographic synchronization of conversation states across all user devices.

Database Schema

The service uses PostgreSQL for persistent data and Redis for temporary/high-performance data.

PostgreSQL Schema Overview

erDiagram
    USERS_AUTH ||--o{ DEVICES : "possesses"
    USERS_AUTH ||--o{ PREKEYS : "possesses"
    USERS_AUTH ||--o{ SIGNED_PREKEYS : "possesses"
    USERS_AUTH ||--o{ IDENTITY_KEYS : "possesses"
    USERS_AUTH ||--o{ BACKUP_CODES : "possesses"
    USERS_AUTH ||--o{ LOGIN_HISTORY : "possesses"
    DEVICES ||--o{ LOGIN_HISTORY : "used during"
    
    USERS_AUTH {
        uuid id PK
        string phoneNumber UK
        string twoFactorSecret
        boolean twoFactorEnabled
        timestamp lastAuthenticatedAt
        timestamp createdAt
        timestamp updatedAt
    }
    
    DEVICES {
        uuid id PK
        uuid userId FK
        string deviceName
        string deviceType
        string deviceFingerprint UK
        string publicKey
        timestamp lastActive
        boolean isVerified
        boolean isActive
    }
    
    PREKEYS {
        uuid id PK
        uuid userId FK
        int keyId
        string publicKey
        boolean isOneTime
        boolean isUsed
    }
    
    SIGNED_PREKEYS {
        uuid id PK
        uuid userId FK
        int keyId
        string publicKey
        string signature
        timestamp expiresAt
    }
    
    IDENTITY_KEYS {
        uuid id PK
        uuid userId FK
        string publicKey
        string privateKeyEncrypted
    }
    
    BACKUP_CODES {
        uuid id PK
        uuid userId FK
        string codeHash
        boolean used
    }
    
    LOGIN_HISTORY {
        uuid id PK
        uuid userId FK
        uuid deviceId FK
        string ipAddress
        timestamp createdAt
        string status
    }

Table Descriptions

USERS_AUTH

The core table for user identity.

  • phoneNumber: Unique E.164 identifier.
  • twoFactorSecret: Encrypted TOTP secret.
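
As an illustration of the E.164 constraint on phoneNumber, the sketch below validates and normalizes user input. Both helper names are hypothetical, and the real service may well use a dedicated phone-number library with country-specific rules instead.

```typescript
// E.164: a leading "+", a non-zero first digit, and at most 15 digits in total.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

function isE164(phoneNumber: string): boolean {
  return E164_PATTERN.test(phoneNumber);
}

// Strips common visual separators, then validates.
// Returns null when the result is not a valid E.164 number.
function normalizePhoneNumber(raw: string): string | null {
  const compact = raw.replace(/[\s().-]/g, "");
  return isE164(compact) ? compact : null;
}
```

Normalizing before the uniqueness check avoids storing the same number twice under different spellings.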

DEVICES

Tracks all hardware/browsers associated with a user.

  • deviceFingerprint: Unique identifier generated by the client.
  • publicKey: The Signal Protocol public key for this specific device.

Cryptographic Keys (Signal Protocol)

  • PREKEYS: One-time use keys for asynchronous messaging.
  • SIGNED_PREKEYS: Semi-persistent keys signed by the Identity Key.
  • IDENTITY_KEYS: Long-term keys identifying the user.

Redis Structures

Redis stores temporary data, most of it with short TTLs:

  • Verification Codes: verification:{id} - SMS codes (TTL: 15m).
  • Active Sessions: session:{id} - JWT metadata for revocation (TTL: variable).
  • QR Challenges: qr_challenge:{id} - Temporary challenges for QR login (TTL: 5m).
  • Rate Limits: rate_limit:{type}:{id} - Brute-force protection counters.
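
The key patterns above can be centralized in small builder functions so that no caller assembles key strings by hand. This is a sketch with hypothetical names (redisKeys, TTL_SECONDS, storeVerificationCommand); only the key formats and the two fixed TTLs come from the list above.

```typescript
// Fixed TTLs from the list above, in seconds.
const TTL_SECONDS = {
  verification: 15 * 60,
  qrChallenge: 5 * 60,
};

// One builder per key pattern.
const redisKeys = {
  verification: (id: string): string => `verification:${id}`,
  session: (id: string): string => `session:${id}`,
  qrChallenge: (id: string): string => `qr_challenge:${id}`,
  rateLimit: (type: string, id: string): string => `rate_limit:${type}:${id}`,
};

// Redis command (as an argument list) that stores an SMS code with its TTL.
// SET key value EX seconds writes the value and the expiry in one step.
function storeVerificationCommand(id: string, code: string): string[] {
  return [
    "SET",
    redisKeys.verification(id),
    code,
    "EX",
    String(TTL_SECONDS.verification),
  ];
}
```

Writing the value and its expiry in a single SET avoids leaving a code behind without a TTL if the process dies between two commands.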

Bug Reports Archive

This section tracks significant bugs and their resolutions for the Authentication Service.

[2026-02-19] No Redis Sentinel Support and Silent Health Check Failure

Issue

The auth-service lacked support for Redis Sentinel in its configuration, preventing high-availability deployments. Additionally, the health check mechanism was “silent,” meaning it wouldn’t properly report a degraded state if Redis was unreachable but the application was still running.

Impact

  • Availability: Redis failover was not supported.
  • Observability: Kubernetes probes could not detect Redis connection issues, leading to traffic being routed to “zombie” instances.

Resolution

  • Integrated ioredis with Sentinel support.
  • Updated HealthModule to explicitly check Redis connectivity.
  • Added REDIS_SENTINEL_NODES and REDIS_SENTINEL_NAME environment variables.
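
The sketch below shows one way the two variables could be turned into connection options. It rests on assumptions: the value format of REDIS_SENTINEL_NODES is not documented here, so a comma-separated host:port list is assumed, and parseSentinelEnv is a hypothetical helper. The returned object is shaped like the sentinels and name options that ioredis accepts.

```typescript
interface SentinelAddress {
  host: string;
  port: number;
}

interface SentinelOptions {
  sentinels: SentinelAddress[];
  name: string; // name of the monitored master group
}

// Assumed format: "host1:26379,host2:26379". IPv6 literals are not handled.
function parseSentinelEnv(nodes: string, masterName: string): SentinelOptions {
  const sentinels = nodes
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0)
    .map((entry) => {
      // 26379 is the conventional Sentinel port, used when none is given.
      const [host, port = "26379"] = entry.split(":");
      const portNumber = Number(port);
      if (!host || !(portNumber >= 1 && portNumber <= 65535) || portNumber % 1 !== 0) {
        throw new Error(`Invalid sentinel node: ${entry}`);
      }
      return { host, port: portNumber };
    });
  if (sentinels.length === 0) {
    throw new Error("REDIS_SENTINEL_NODES is empty");
  }
  return { sentinels, name: masterName };
}
```

Failing fast on a malformed entry surfaces configuration mistakes at startup rather than at the first failover.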

For more recent issues, please refer to the project’s GitHub Issues.

System Architecture

The Authentication Service is a stateless microservice built with NestJS, designed to operate within an Istio Service Mesh.

Global Architecture

The service interacts with other microservices (User, Notification) via gRPC over mTLS, managed automatically by Istio.

graph TD
    A[API Gateway + Istio Ingress] --> B[Auth Service Pod]
    
    subgraph "Kubernetes Cluster"
        subgraph "auth-service Pod"
            B1[Auth Container] 
            B2[Envoy Sidecar]
        end
        
        subgraph "user-service Pod"
            C1[User Container]
            C2[Envoy Sidecar]
        end
        
        B2 -.->|mTLS gRPC| C2
    end
    
    B --> E[(PostgreSQL)]
    B --> F[(Redis)]
    B --> G[External SMS Service]

Architectural Principles

  • Stateless: All instances are interchangeable. State is kept in PostgreSQL and Redis.
  • Zero Trust: All inter-service communications are encrypted and authenticated via mTLS (Istio).
  • Device-Centric: Security is managed at the device level, not just the user level.
  • API First: Full OpenAPI/Swagger documentation for all REST endpoints.

Technical Stack

| Component | Technology |
|---|---|
| Framework | NestJS (TypeScript) |
| Database | PostgreSQL 15 |
| Cache | Redis 7 |
| Communication | REST (External), gRPC (Internal) |
| Security | JWT (ECDSA P-256), bcrypt, Signal Protocol |
| Mesh | Istio / Envoy |

Service Structure

The code is organized by functional domains:

  • authentication: Core JWT and login logic.
  • devices: Multi-device management and QR login.
  • two-factor-authentication: TOTP and backup codes.
  • phone-verification: SMS provider integration.
  • tokens: Refresh token rotation and lifecycle.

End-to-End Encryption

Whispr uses the Signal Protocol to ensure that only the participants in a conversation can read the messages.

Core Concepts

X3DH (Extended Triple Diffie-Hellman)

X3DH is used to establish a shared secret key between two users, even if one of them is offline. It uses:

  • Identity Keys: Long-term stable keys.
  • Signed PreKeys: Semi-stable keys.
  • One-Time PreKeys: Single-use keys retrieved from the auth-service.
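
To make the key agreement concrete, the sketch below combines the four Diffie-Hellman results of X3DH using X25519 from Node's crypto module. It is a simplified illustration, not the client implementation: all names are hypothetical, the info string is invented, and the signature check on the Signed PreKey (which a real client must perform before using the bundle) is omitted.

```typescript
import { diffieHellman, generateKeyPairSync, hkdfSync, KeyObject } from "node:crypto";

interface KeyPair {
  privateKey: KeyObject;
  publicKey: KeyObject;
}

// Public keys a sender fetches for one recipient device.
interface PublicBundle {
  identityKey: KeyObject;
  signedPreKey: KeyObject;
  oneTimePreKey: KeyObject;
}

const newKeyPair = (): KeyPair => generateKeyPairSync("x25519");

const dh = (privateKey: KeyObject, publicKey: KeyObject): Buffer =>
  diffieHellman({ privateKey, publicKey });

// KDF as described for X25519 in the X3DH specification: 32 bytes of 0xFF
// prepended to the DH outputs, a zero-filled salt, and an info string.
function kdf(dhOutputs: Buffer[]): Buffer {
  const ikm = Buffer.concat([Buffer.alloc(32, 0xff), ...dhOutputs]);
  return Buffer.from(hkdfSync("sha256", ikm, Buffer.alloc(32), "x3dh-sketch", 32));
}

// Sender side: uses its identity key, a fresh ephemeral key, and the bundle.
function initiatorSecret(identity: KeyPair, ephemeral: KeyPair, bundle: PublicBundle): Buffer {
  return kdf([
    dh(identity.privateKey, bundle.signedPreKey),
    dh(ephemeral.privateKey, bundle.identityKey),
    dh(ephemeral.privateKey, bundle.signedPreKey),
    dh(ephemeral.privateKey, bundle.oneTimePreKey),
  ]);
}

// Recipient side: mirrors each DH with its own private keys, possibly later.
function responderSecret(
  identity: KeyPair,
  signedPreKey: KeyPair,
  oneTimePreKey: KeyPair,
  senderIdentity: KeyObject,
  senderEphemeral: KeyObject
): Buffer {
  return kdf([
    dh(signedPreKey.privateKey, senderIdentity),
    dh(identity.privateKey, senderEphemeral),
    dh(signedPreKey.privateKey, senderEphemeral),
    dh(oneTimePreKey.privateKey, senderEphemeral),
  ]);
}
```

Because the recipient only needs the sender's two public keys, it can derive the same secret whenever it next comes online, which is what makes the exchange asynchronous.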

Double Ratchet Algorithm

Once a session is established, the Double Ratchet algorithm provides Forward Secrecy and Future Secrecy (also known as post-compromise security):

  • DH Ratchet: New Diffie-Hellman exchanges are performed regularly.
  • Symmetric Ratchet: New message keys are derived for every single message.
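
The symmetric ratchet can be sketched in a few lines. The version below follows the construction suggested in the Double Ratchet specification (HMAC over the chain key with distinct constant inputs); the function names are hypothetical, and the DH ratchet and message encryption are left out.

```typescript
import { createHmac } from "node:crypto";

interface ChainStep {
  messageKey: Buffer; // used for exactly one message
  nextChainKey: Buffer; // replaces the current chain key
}

// One ratchet step: derive a message key and the next chain key from the
// current chain key, using HMAC-SHA256 with two different constants.
function ratchetStep(chainKey: Buffer): ChainStep {
  const messageKey = createHmac("sha256", chainKey)
    .update(Buffer.from([0x01]))
    .digest();
  const nextChainKey = createHmac("sha256", chainKey)
    .update(Buffer.from([0x02]))
    .digest();
  return { messageKey, nextChainKey };
}

// Derives the message keys for the next `count` messages in a chain.
function deriveMessageKeys(initialChainKey: Buffer, count: number): Buffer[] {
  const keys: Buffer[] = [];
  let chainKey = initialChainKey;
  for (let i = 0; i < count; i++) {
    const step = ratchetStep(chainKey);
    keys.push(step.messageKey);
    chainKey = step.nextChainKey;
  }
  return keys;
}
```

Since HMAC cannot be run backwards, someone who obtains the current chain key cannot recompute earlier message keys, which is the forward secrecy property at the message level.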

The Role of Auth-Service

The auth-service acts as a Key Directory. It does not participate in encryption but stores public keys so other users can find them:

  1. Key Storage: Users upload their Public Identity Keys and batches of Public PreKeys.
  2. Key Distribution: When Alice wants to message Bob, she asks auth-service for Bob’s “PreKey Bundle”.
  3. Multi-Device Support: auth-service tracks keys for every device a user owns. A message for “Bob” is actually encrypted multiple times: once for each of Bob’s registered devices.
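
The directory role described above can be sketched as an in-memory lookup that hands out one PreKey Bundle per device and marks each one-time PreKey as used. All type and function names are hypothetical, and a real implementation would perform the mark-as-used step atomically in PostgreSQL.

```typescript
interface StoredPreKey {
  keyId: number;
  publicKey: string;
  isUsed: boolean;
}

interface DeviceKeys {
  deviceId: string;
  identityKey: string;
  signedPreKey: { keyId: number; publicKey: string; signature: string };
  oneTimePreKeys: StoredPreKey[];
}

interface PreKeyBundle {
  deviceId: string;
  identityKey: string;
  signedPreKey: { keyId: number; publicKey: string; signature: string };
  // Absent when the device has run out of one-time PreKeys.
  oneTimePreKey?: { keyId: number; publicKey: string };
}

// Builds the bundle for one device and consumes one unused one-time PreKey.
function takeBundle(device: DeviceKeys): PreKeyBundle {
  const unused = device.oneTimePreKeys.filter((key) => !key.isUsed)[0];
  if (unused) {
    unused.isUsed = true; // each one-time PreKey is handed out only once
  }
  return {
    deviceId: device.deviceId,
    identityKey: device.identityKey,
    signedPreKey: device.signedPreKey,
    oneTimePreKey: unused
      ? { keyId: unused.keyId, publicKey: unused.publicKey }
      : undefined,
  };
}

// A sender needs one bundle for every registered device of the recipient.
function bundlesForUser(devices: DeviceKeys[]): PreKeyBundle[] {
  return devices.map(takeBundle);
}
```

Only public material ever appears in a bundle, which is consistent with the service storing keys without taking part in encryption.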

Multi-Device Synchronization

Since private keys never leave the device where they were generated, each device has its own cryptographic identity.

  • When you send a message from your Phone, it is also encrypted for your Tablet.
  • The auth-service ensures that all devices receive these “sync messages”.
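
The fan-out implied by per-device identities can be sketched as follows. The names are hypothetical, and encryptFor stands in for a real per-device Signal session; the placeholder used in the usage note performs no encryption at all.

```typescript
interface DeviceRef {
  deviceId: string;
  userId: string;
}

interface Envelope {
  deviceId: string;
  ciphertext: string;
}

// Stand-in for "encrypt with the session established with this device".
type EncryptFor = (device: DeviceRef, plaintext: string) => string;

// Produces one envelope per target device: every device of the recipient,
// plus the sender's own other devices so they stay in sync.
function fanOut(
  plaintext: string,
  senderDeviceId: string,
  recipientDevices: DeviceRef[],
  senderDevices: DeviceRef[],
  encryptFor: EncryptFor
): Envelope[] {
  const ownOtherDevices = senderDevices.filter(
    (device) => device.deviceId !== senderDeviceId
  );
  return recipientDevices.concat(ownOtherDevices).map((device) => ({
    deviceId: device.deviceId,
    ciphertext: encryptFor(device, plaintext),
  }));
}
```

For example, a message sent from Alice's phone to Bob, who has two devices, yields three envelopes when Alice also owns a tablet: two for Bob and one sync copy for the tablet.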

Reference: Database Keys Mapping

ADR 0001: Separation of Responsibilities Between auth-service and user-service

Status

Accepted

Date

2025-04-11

Context

In our microservices architecture for the Whispr application, we need to precisely define the separation of responsibilities between the authentication service (auth-service) and the user service (user-service). This decision is particularly important as it impacts:

  1. User data management.
  2. Multi-device authentication.
  3. Service autonomy and resilience.
  4. Performance of frequent authentication operations.
  5. Implementation complexity of E2E encryption.

Our current architecture uses gRPC for inter-service communication and maintains distinct PostgreSQL databases for each service. We also use Redis for temporary authentication data.

Decision

We have decided to implement controlled denormalization of user data between auth-service and user-service, with the following distribution:

In auth-service (PostgreSQL)

  • users_auth table containing:

    • id (same UUID as in user-service).
    • phoneNumber (unique identifier for authentication).
    • twoFactorSecret (authentication-related data).
    • twoFactorEnabled (flag).
    • lastAuthenticatedAt (timestamp).
    • Temporal information (createdAt, updatedAt).
  • Tables related to devices and cryptographic keys:

    • devices, prekeys, signed_prekeys, identity_keys, backup_codes, login_history.

In user-service (PostgreSQL)

  • users table containing the full profile (firstName, lastName, username, etc.) and preferences.

Consequences

Advantages

  1. Service Autonomy: The auth-service can operate independently for critical operations.
  2. Performance: No need for synchronous gRPC calls for every authentication check.
  3. Enhanced Security: Separation of sensitive authentication data from user profile data.

Disadvantages

  1. Partial Data Duplication: The phone number and user identifier are duplicated.
  2. Synchronization Required: Mechanisms must be in place to maintain consistency.

Success Metrics

  • Authentication operation response time < 200ms.
  • auth-service availability > 99.9%.
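
To put the availability target in concrete terms, the small helper below converts an availability ratio into a downtime budget. The function name is hypothetical and the arithmetic is the only content: 99.9% over a 30-day period leaves about 43.2 minutes of allowed downtime.

```typescript
// Minutes of downtime permitted over a period at a given availability ratio.
function downtimeBudgetMinutes(availability: number, periodDays: number): number {
  const minutesInPeriod = periodDays * 24 * 60;
  return (1 - availability) * minutesInPeriod;
}

// 30 days contain 43,200 minutes; 0.1% of that is about 43.2 minutes.
const monthlyBudget = downtimeBudgetMinutes(0.999, 30);
```

Over a 365-day year the same target allows roughly 8.76 hours in total.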