Building Scalable Architecture: Technical Framework for Marketplaces
Learn how to architect marketplace platforms that scale to millions of users. Includes microservices patterns, database sharding, caching strategies, and deployment checklists.
Who Is This For?
This guide is specifically designed for:
Startup Stage:
Expanding operations, optimizing infrastructure, and systematically scaling revenue.
Best For Role:
Technical implementation guides and code examples for developers.
Expected Impact:
Foundational work that pays dividends over months and years.
What You'll Learn
- Design microservices architecture for marketplace platforms
- Implement database sharding and replication strategies
- Build multi-level caching systems
- Deploy scalable infrastructure with monitoring
- Optimize for 1M+ users
Prerequisites
- Experience with Next.js and TypeScript
- Understanding of database concepts (SQL, indexes)
- Familiarity with cloud deployment (AWS, Vercel)
- Basic knowledge of caching and CDNs
What This Guide Covers
Scalable architecture enables marketplaces to handle millions of users while maintaining performance and reliability. This guide provides technical frameworks for building platforms that scale.
You will learn:
- Microservices architecture patterns for marketplaces
- Database design and sharding strategies
- Caching systems for performance optimization
- Infrastructure deployment and monitoring
- Load balancing and fault tolerance
Framework source: Architecture patterns from platforms serving millions of users. For the strategic perspective on architecture decisions, read "7 architecture decisions marketplace founders regret". For database-specific patterns, see "Database Architecture Patterns for Marketplaces".
Understanding Marketplace Architecture
Core Components
Every scalable marketplace requires these foundational components:
User Management System:
- Multiple user types (buyers, sellers, admins)
- Role-based access control (RBAC)
- Authentication and session management
Product/Service Catalog:
- Flexible schema for various offerings
- Search indexing
- Category management
Search & Discovery Engine:
- Full-text search
- Faceted filtering
- Relevance ranking
Transaction Engine:
- Payment processing
- Escrow management
- Refund handling
Communication System:
- Real-time messaging
- Notification delivery
- Email templates
Review & Rating System:
- Multi-criteria ratings
- Verification mechanisms
- Aggregation calculations
Scalability Requirements
Performance Targets:
- Page load time: <2 seconds
- API response time: <200ms
- Search query time: <100ms
- Concurrent users: 100,000+
- Transactions per second: 1,000+
Reliability Targets:
- Uptime: 99.9% (about 8.76 hours of downtime per year)
- Data durability: 99.999999999% (11 nines)
- Zero data loss during failures
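The downtime figure next to the uptime target follows from simple arithmetic. A minimal sketch of that calculation, assuming a 365-day year; the function name `downtimeHoursPerYear` is illustrative and not part of any library:

```typescript
// Converts an uptime target (in percent) into a yearly downtime budget.
// Assumption: a 365-day year (8,760 hours); leap years are ignored.
const HOURS_PER_YEAR = 365 * 24;

function downtimeHoursPerYear(uptimePercent: number): number {
  if (uptimePercent < 0 || uptimePercent > 100) {
    throw new RangeError("uptimePercent must be between 0 and 100");
  }
  return ((100 - uptimePercent) / 100) * HOURS_PER_YEAR;
}

// Compare a few common targets side by side.
for (const target of [99, 99.9, 99.99]) {
  const hours = downtimeHoursPerYear(target).toFixed(2);
  console.log(`${target}% uptime allows about ${hours} hours of downtime per year`);
}
```

Each additional "nine" cuts the allowed downtime by a factor of ten, which is why the uptime target should be chosen before sizing redundancy and on-call coverage.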
Tech Stack Selection
Recommended Stack for 2025
Frontend:
- Framework: Next.js 15 with App Router
- State Management: Zustand or React Query
- UI Components: Tailwind CSS + Shadcn/ui
- Type Safety: TypeScript
Backend:
- API Framework: Node.js with NestJS
- Database: PostgreSQL (relational) + JSONB (flexibility)
- Cache Layer: Redis for sessions and queries
- Search Engine: Elasticsearch or Typesense
- Queue System: Bull or BullMQ for background jobs
Infrastructure:
- Frontend Hosting: Vercel (global edge)
- Backend Hosting: AWS/Railway (containers)
- CDN: Cloudflare (global content delivery)
- Monitoring: Sentry (errors) + Datadog (performance)
Key Rationale:
- Next.js: SEO-friendly, fast, great developer experience
- PostgreSQL: Proven scalability, JSONB for flexible schemas
- Redis: Sub-millisecond latency for frequently accessed data
- Elasticsearch: Powerful search with <100ms queries
- Vercel: Global edge network with <50ms TTFB
Database Design Patterns
Multi-Tenant Architecture
For B2B marketplaces, isolate data while maintaining efficiency:
CREATE TABLE organizations (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name VARCHAR(255) NOT NULL,
  slug VARCHAR(255) UNIQUE NOT NULL,
  tier VARCHAR(50) DEFAULT 'starter',
  settings JSONB DEFAULT '{}',
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  organization_id UUID REFERENCES organizations(id) ON DELETE CASCADE,
  email VARCHAR(255) UNIQUE NOT NULL,
  role VARCHAR(50) NOT NULL CHECK (role IN ('admin', 'buyer', 'seller')),
  profile JSONB DEFAULT '{}',
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_users_org ON users(organization_id);
-- No separate index on email: the UNIQUE constraint already creates one.
CREATE INDEX idx_users_role ON users(organization_id, role);
Key principles:
- Use UUIDs for distributed systems (IDs can be generated on any node with negligible collision risk)
- JSONB for flexible user profiles (avoids a schema migration for every new profile field)
- Cascade deletes for data consistency
- Composite indexes for common query patterns
Database Sharding Strategy
When a single database reaches its limits (see the thresholds under "When to shard" below):
Sharding Approach:
- Shard by organization_id (B2B) or user_id (B2C)
- Route queries to correct shard based on key
- Use Vitess or Citus for transparent sharding
Example Sharding Logic:
import { createHash } from "crypto";

const SHARD_COUNT = 8;

function getShardForOrganization(organizationId: string): number {
  // Hash the organization ID and map it to a shard (0-7 for 8 shards).
  // MD5 is used here only for even distribution, not for security.
  const hash = createHash("md5").update(organizationId).digest("hex");
  return parseInt(hash.substring(0, 2), 16) % SHARD_COUNT;
}

async function queryUsersByOrg(organizationId: string) {
  // databaseConnections is assumed to hold one connection pool per shard.
  const shard = getShardForOrganization(organizationId);
  const db = databaseConnections[shard];
  return db.query("SELECT * FROM users WHERE organization_id = $1", [
    organizationId,
  ]);
}
When to shard:
- Database size >5TB
- Write throughput >50k writes/second
- Read throughput that caching cannot handle
Read Replicas
Before sharding, use read replicas for scaling reads:
Configuration:
- 1 Primary (handles all writes)
- 3-5 Read Replicas (handle reads that can tolerate slight replication lag)
- Route reads to replicas, writes to primary
Implementation:
import { Pool } from "pg";
const primaryPool = new Pool({
host: process.env.DB_PRIMARY_HOST,
// ... config
});
const replicaPools = [
new Pool({ host: process.env.DB_REPLICA_1_HOST }),
new Pool({ host: process.env.DB_REPLICA_2_HOST }),
new Pool({ host: process.env.DB_REPLICA_3_HOST }),
];
let replicaIndex = 0;
function getReplicaPool(): Pool {
// Round-robin distribution
const pool = replicaPools[replicaIndex];
replicaIndex = (replicaIndex + 1) % replicaPools.length;
return pool;
}
async function readQuery(query: string, params: any[]) {
return getReplicaPool().query(query, params);
}
async function writeQuery(query: string, params: any[]) {
return primaryPool.query(query, params);
}
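Replicas usually trail the primary by a short interval, so a user who has just written data may not see it on a replica read. One common mitigation is to send that user's reads to the primary for a brief window after a write. The sketch below illustrates the routing decision only; `ReadRouter`, `stickyMs`, and the injectable clock are hypothetical names, and the window length is an assumption to tune against the lag you observe.

```typescript
// Decides whether a read should go to the primary or to a replica.
// Assumption: replicas catch up within `stickyMs` of a write.
type Target = "primary" | "replica";

class ReadRouter {
  private readonly lastWriteAt = new Map<string, number>();
  private readonly stickyMs: number;
  private readonly now: () => number;

  constructor(stickyMs: number, now: () => number = Date.now) {
    this.stickyMs = stickyMs;
    this.now = now;
  }

  // Call after every successful write made on behalf of `userId`.
  recordWrite(userId: string): void {
    this.lastWriteAt.set(userId, this.now());
  }

  // Reads shortly after a user's own write go to the primary so the user
  // sees the change; all other reads can use a replica.
  targetForRead(userId: string): Target {
    const last = this.lastWriteAt.get(userId);
    if (last !== undefined && this.now() - last < this.stickyMs) {
      return "primary";
    }
    return "replica";
  }
}

// Usage with a fake clock so the behavior is easy to follow.
let clock = 0;
const router = new ReadRouter(2000, () => clock);
router.recordWrite("user-1");
console.log(router.targetForRead("user-1")); // "primary"
clock = 5000;
console.log(router.targetForRead("user-1")); // "replica"
```

The map here lives in one process. With several application instances, the last-write timestamp would need to travel with the session or live in a shared store.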
Caching Strategy
Multi-Level Cache Architecture
Level 1: Browser Cache (CDN-cached static assets)
- Cache duration: 1 year for versioned assets
- Invalidation: Version-based (app.v123.js)
Level 2: Redis Cache (API responses, session data)
- Cache duration: 5 minutes to 1 hour
- Invalidation: TTL or event-based
Level 3: Database Buffer Cache (PostgreSQL shared buffers, which cache table and index pages rather than query results)
- Cache duration: Managed by database
- Invalidation: Automatic on data changes
Redis Caching Implementation
Product Caching Example:
import { Redis } from "ioredis";

const redis = new Redis(process.env.REDIS_URL);

async function getCachedProduct(productId: string) {
  const cacheKey = `product:${productId}`;
  // Step 1: Check the Redis cache
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }
  // Step 2: On a cache miss, query the database
  const product = await db.products.findUnique({
    where: { id: productId },
    include: {
      seller: true,
      categories: true,
      images: true,
    },
  });
  if (product) {
    // Cache for 1 hour (TTL is in seconds)
    await redis.setex(cacheKey, 3600, JSON.stringify(product));
  }
  return product;
}
Cache Invalidation Pattern:
async function updateProduct(productId: string, data: UpdateData) {
// Update database
const updated = await db.products.update({
where: { id: productId },
data,
});
// Invalidate cache
await redis.del(`product:${productId}`);
// Also invalidate listing caches that include this product
await redis.del(`seller:${updated.sellerId}:products`);
return updated;
}
Cache-Aside Pattern
async function getCachedData<T>(
  key: string,
  fetchFn: () => Promise<T>,
  ttl: number = 3600, // seconds
): Promise<T> {
  // Try cache first
  const cached = await redis.get(key);
  if (cached !== null) return JSON.parse(cached) as T;
  // Cache miss: fetch data
  const data = await fetchFn();
  // Store in cache; skip null/undefined so missing records are not cached
  if (data !== null && data !== undefined) {
    await redis.setex(key, ttl, JSON.stringify(data));
  }
  return data;
}
// Usage
const user = await getCachedData(
`user:${userId}`,
() => db.users.findUnique({ where: { id: userId } }),
1800, // 30 minutes
);
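When a popular key expires, many concurrent requests can miss the cache at once and all hit the database. A possible complement to cache-aside is to coalesce concurrent misses for the same key into a single fetch. The sketch below assumes a single Node.js process and uses an in-memory map; `singleFlight` is a hypothetical helper name, not part of ioredis or any other library.

```typescript
// Coalesces concurrent fetches for the same key into one underlying call.
const inFlight = new Map<string, Promise<unknown>>();

function singleFlight<T>(key: string, fetchFn: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>;

  const promise = fetchFn().finally(() => {
    // Remove the entry once settled so later misses trigger a fresh fetch.
    inFlight.delete(key);
  });
  inFlight.set(key, promise);
  return promise;
}

// Usage: three concurrent misses share one simulated database call.
async function demo(): Promise<void> {
  let dbCalls = 0;
  const loadProduct = async () => {
    dbCalls += 1;
    return { id: "p1", name: "Example" };
  };
  const results = await Promise.all([
    singleFlight("product:p1", loadProduct),
    singleFlight("product:p1", loadProduct),
    singleFlight("product:p1", loadProduct),
  ]);
  console.log(results.length, dbCalls); // 3 1
}
demo();
```

In the cache-aside helper, the fetch function could be wrapped as `() => singleFlight(key, fetchFn)`. This only deduplicates within one process; across instances a short-lived lock in Redis would be needed.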
Microservices Architecture
Service Decomposition
Break monolith into focused services:
Core Services:
- User Service: Authentication, profiles, permissions
- Catalog Service: Products/services, categories, attributes
- Search Service: Indexing, search queries, filters
- Transaction Service: Payments, escrow, refunds
- Messaging Service: Real-time chat, notifications
- Review Service: Ratings, reviews, aggregations
Supporting Services:
- Email Service: Transactional emails, templates
- Analytics Service: Event tracking, reporting
- Media Service: Image upload, resizing, CDN
- Notification Service: Push, SMS, in-app
Service Communication
Synchronous (REST/GraphQL):
- Use for: Read operations, real-time requirements
- Pros: Simple, immediate response
- Cons: Tight coupling, cascading failures
Asynchronous (Message Queue):
- Use for: Write operations, background processing
- Pros: Decoupling, fault tolerance
- Cons: Eventual consistency, complexity
Example with BullMQ:
import { Queue, Worker } from "bullmq";

// `redis` is an ioredis connection; BullMQ workers expect it to be created
// with maxRetriesPerRequest: null.

// Producer (Transaction Service)
const emailQueue = new Queue("emails", {
  connection: redis,
});
await emailQueue.add("purchase-confirmation", {
  userId: buyer.id,
  orderId: order.id,
  amount: order.amount,
});

// Consumer (Email Service)
new Worker(
  "emails",
  async (job) => {
    if (job.name === "purchase-confirmation") {
      // sendEmail is assumed to resolve the recipient address from the user ID
      await sendEmail({
        to: job.data.userId,
        template: "purchase-confirmation",
        data: job.data,
      });
    }
  },
  { connection: redis },
);
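For the synchronous path, the cascading-failure risk listed above is often contained with a circuit breaker: after repeated failures, callers stop contacting the unhealthy service for a cooldown period and fail fast instead. The following is a minimal sketch of that state machine, not a production implementation; `CircuitBreaker`, its thresholds, and the injectable clock are illustrative assumptions.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly failureThreshold: number;
  private readonly resetAfterMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, resetAfterMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
  }

  // Returns true if a call to the downstream service may be attempted.
  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.resetAfterMs) {
      this.state = "half-open"; // cooldown elapsed: allow trial requests
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.failures += 1;
    // A failure during a trial, or too many failures overall, opens the circuit.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }
}

// Usage with a fake clock: three failures open the circuit for 10 seconds.
let clock = 0;
const breaker = new CircuitBreaker(3, 10_000, () => clock);
breaker.recordFailure();
breaker.recordFailure();
breaker.recordFailure();
console.log(breaker.canRequest()); // false
clock = 10_000;
console.log(breaker.canRequest()); // true
```

A caller would check `canRequest()` before each outbound HTTP call and report the outcome with `recordSuccess()` or `recordFailure()`, returning a fallback response while the circuit is open.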
Performance Optimization
N+1 Query Prevention
Problem: Loading users and their organizations separately:
// BAD: N+1 queries (1 for users + N for organizations)
const users = await db.users.findMany();
for (const user of users) {
user.organization = await db.organizations.findUnique({
where: { id: user.organizationId },
});
}
Solution: Use database joins or batch loading:
// GOOD: Single query with join
const users = await db.users.findMany({
include: {
organization: true,
},
});
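When a join is not available (for example, the related data lives in another service), batch loading achieves the same effect: collect the foreign keys, fetch them in one call, and stitch the results together in memory. A minimal sketch, assuming a hypothetical `fetchOrganizationsByIds` function that performs a single query such as `WHERE id = ANY($1)`:

```typescript
interface User {
  id: string;
  organizationId: string;
}

interface Organization {
  id: string;
  name: string;
}

type UserWithOrg = User & { organization: Organization | undefined };

async function attachOrganizations(
  users: User[],
  fetchOrganizationsByIds: (ids: string[]) => Promise<Organization[]>,
): Promise<UserWithOrg[]> {
  // Deduplicate IDs so each organization is requested only once.
  const ids = Array.from(new Set(users.map((u) => u.organizationId)));
  // One batched lookup instead of one query per user.
  const organizations = await fetchOrganizationsByIds(ids);
  const byId = new Map(organizations.map((o) => [o.id, o] as const));
  return users.map((u) => ({ ...u, organization: byId.get(u.organizationId) }));
}

// Usage with an in-memory stand-in for the database.
const orgTable: Organization[] = [
  { id: "o1", name: "Acme" },
  { id: "o2", name: "Globex" },
];
const sampleUsers: User[] = [
  { id: "u1", organizationId: "o1" },
  { id: "u2", organizationId: "o1" },
  { id: "u3", organizationId: "o2" },
];

attachOrganizations(sampleUsers, async (ids) =>
  orgTable.filter((o) => ids.includes(o.id)),
).then((result) => console.log(result.map((u) => u.organization?.name)));
```

Three users are resolved with one lookup for two distinct organization IDs, regardless of how many users share an organization.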
Database Indexing
Common Index Patterns:
-- Search by email (user login)
CREATE INDEX idx_users_email ON users(email);
-- Search products by seller
CREATE INDEX idx_products_seller ON products(seller_id);
-- Search products by category
CREATE INDEX idx_products_category ON products(category_id);
-- Composite index for filtered searches
CREATE INDEX idx_products_category_active ON products(category_id, is_active);
-- Partial index for active listings only
CREATE INDEX idx_active_listings ON listings(seller_id, created_at)
  WHERE status = 'active';
-- Full-text search index (coalesce keeps a NULL description from nulling the whole vector)
CREATE INDEX idx_products_search ON products
  USING GIN (to_tsvector('english', coalesce(name, '') || ' ' || coalesce(description, '')));
Connection Pooling
PostgreSQL Connection Pool:
import { Pool } from "pg";
const pool = new Pool({
host: process.env.DB_HOST,
database: process.env.DB_NAME,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
max: 20, // Maximum pool size
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 2000,
});
// Graceful shutdown
process.on("SIGTERM", async () => {
await pool.end();
});
Load Balancing
Application Load Balancer
Round-Robin Distribution:
# nginx.conf
upstream backend_servers {
server backend1.example.com:3000;
server backend2.example.com:3000;
server backend3.example.com:3000;
}
server {
location /api {
proxy_pass http://backend_servers;
proxy_set_header X-Real-IP $remote_addr;
}
}
Session Stickiness (when needed):
upstream backend_servers {
ip_hash; # Route same IP to same server
server backend1.example.com:3000;
server backend2.example.com:3000;
}
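To make the difference between the two strategies concrete, here is a small sketch of the selection logic in TypeScript. It illustrates the idea only and is not nginx's actual algorithm (nginx's `ip_hash` keys on the first three octets of an IPv4 address); the function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

const servers = ["backend1.example.com:3000", "backend2.example.com:3000"];

// Round-robin: each request goes to the next server in turn.
let nextIndex = 0;
function pickRoundRobin(): string {
  const server = servers[nextIndex];
  nextIndex = (nextIndex + 1) % servers.length;
  return server;
}

// Sticky: the same client IP always hashes to the same server,
// as long as the server list does not change.
function pickSticky(clientIp: string): string {
  const digest = createHash("sha256").update(clientIp).digest();
  return servers[digest.readUInt32BE(0) % servers.length];
}

console.log(pickRoundRobin(), pickRoundRobin()); // alternates between servers
console.log(pickSticky("203.0.113.7") === pickSticky("203.0.113.7")); // true
```

Stickiness trades even distribution for session affinity. Keeping session state in Redis, as the stack above suggests, generally removes the need for it.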
CDN Configuration
Cache-Control Headers (set in Next.js and honored by Cloudflare):
// next.config.js
module.exports = {
async headers() {
return [
{
source: "/images/:path*",
headers: [
{
key: "Cache-Control",
value: "public, max-age=31536000, immutable",
},
],
},
{
source: "/api/:path*",
headers: [
{
key: "Cache-Control",
value: "private, no-cache, no-store, must-revalidate",
},
],
},
];
},
};
Monitoring and Observability
Application Performance Monitoring
Sentry for Error Tracking:
import * as Sentry from "@sentry/nextjs";
Sentry.init({
dsn: process.env.SENTRY_DSN,
environment: process.env.NODE_ENV,
tracesSampleRate: 1.0, // 100% of transactions; use a lower rate under high production traffic
});
// Custom context
Sentry.setUser({
id: user.id,
email: user.email,
});
// Track custom metrics
Sentry.metrics.increment("checkout.completed", 1, {
tags: { payment_method: "stripe" },
});
Datadog for Performance:
import { StatsD } from "hot-shots";
const statsd = new StatsD({
host: process.env.DATADOG_HOST,
prefix: "marketplace.",
});
// Track API response times
app.use((req, res, next) => {
const start = Date.now();
res.on("finish", () => {
const duration = Date.now() - start;
statsd.timing("api.response_time", duration, {
route: req.route?.path,
method: req.method,
});
});
next();
});
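Response-time targets such as "<200ms" are usually evaluated against a percentile rather than an average, because a few slow requests can hide behind a healthy mean. Datadog computes percentiles for you; the sketch below only shows what a p95 means, using the nearest-rank method on made-up sample durations.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Hypothetical response times in milliseconds.
const durationsMs = [120, 95, 180, 210, 140, 160, 90, 450, 130, 110];
const p50 = percentile(durationsMs, 50);
const p95 = percentile(durationsMs, 95);
console.log(`p50 = ${p50}ms, p95 = ${p95}ms, p95 within 200ms target: ${p95 < 200}`);
```

In this sample the median looks fine while the p95 is dominated by a single slow request, which is the kind of outlier an alert on averages would miss.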
Health Check Endpoints
// /api/health
export async function GET() {
const checks = {
database: await checkDatabase(),
redis: await checkRedis(),
elasticsearch: await checkElasticsearch(),
};
const healthy = Object.values(checks).every((check) => check.healthy);
return Response.json(
{
status: healthy ? "healthy" : "unhealthy",
checks,
timestamp: new Date().toISOString(),
},
{
status: healthy ? 200 : 503,
},
);
}
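async function checkDatabase() {
  try {
    await db.$queryRaw`SELECT 1`;
    return { healthy: true };
  } catch (error) {
    // `error` is typed as unknown in TypeScript, so narrow it before reading .message
    const message = error instanceof Error ? error.message : String(error);
    return { healthy: false, error: message };
  }
}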
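A dependency that hangs rather than fails can make the health endpoint itself hang, which defeats its purpose. One way to guard against that is to wrap each check in a timeout. This is a sketch under the assumption that every check returns an object with a `healthy` flag, as above; `withTimeout` is a hypothetical helper.

```typescript
type CheckResult = { healthy: boolean; error?: string };

async function withTimeout(
  check: () => Promise<CheckResult>,
  timeoutMs: number,
): Promise<CheckResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  // Resolves with an unhealthy result if the check takes too long.
  const timeout = new Promise<CheckResult>((resolve) => {
    timer = setTimeout(
      () => resolve({ healthy: false, error: `timed out after ${timeoutMs}ms` }),
      timeoutMs,
    );
  });
  try {
    return await Promise.race([check(), timeout]);
  } catch (error) {
    // A check that throws is reported as unhealthy instead of crashing the endpoint.
    return { healthy: false, error: error instanceof Error ? error.message : String(error) };
  } finally {
    clearTimeout(timer);
  }
}

// Usage: a check that never settles is reported as unhealthy after 50ms.
withTimeout(() => new Promise<CheckResult>(() => {}), 50).then((result) =>
  console.log(result.healthy, result.error),
);
```

In the handler above, each entry could then become, for example, `database: await withTimeout(checkDatabase, 2000)`, with the timeout chosen below the probe's own timeout.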
Deployment Strategy
Container Orchestration
Docker Compose (development):
version: "3.8"
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/marketplace
      - REDIS_URL=redis://redis:6379
    depends_on:
      - db
      - redis
  db:
    image: postgres:15
    environment:
      POSTGRES_DB: marketplace
      POSTGRES_PASSWORD: password
    volumes:
      - pgdata:/var/lib/postgresql/data
  redis:
    image: redis:7
    ports:
      - "6379:6379"
volumes:
  pgdata:
Kubernetes (production):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: marketplace-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: marketplace-api
  template:
    metadata:
      labels:
        app: marketplace-api
    spec:
      containers:
        - name: api
          image: marketplace-api:latest
          ports:
            - containerPort: 3000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: url
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
Zero-Downtime Deployment
Blue-Green Deployment Strategy:
- Deploy new version (green) alongside current (blue)
- Test green environment
- Switch load balancer to green
- Monitor for issues
- Keep blue running for rollback
- After validation, decommission blue
Rolling Update Strategy:
- Update 1 instance at a time
- Health check before proceeding to next
- Gradual traffic shift
- Automatic rollback on failure
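The control flow of a rolling update can be sketched in a few lines. In practice an orchestrator such as Kubernetes performs these steps; the simulation below only illustrates the sequence of update, health check, and rollback, and every name in it (`Instance`, `rollingUpdate`, `isHealthy`) is hypothetical.

```typescript
interface Instance {
  name: string;
  version: string;
}

function rollingUpdate(
  instances: Instance[],
  newVersion: string,
  isHealthy: (instance: Instance) => boolean,
): { success: boolean; instances: Instance[] } {
  // Work on copies so the caller's array is left untouched.
  const result = instances.map((i) => ({ ...i }));
  const previous = result.map((i) => i.version);

  for (let i = 0; i < result.length; i++) {
    // Update one instance at a time.
    result[i].version = newVersion;
    // Health check before proceeding to the next instance.
    if (!isHealthy(result[i])) {
      // Roll back every instance touched so far, including the failed one.
      for (let j = 0; j <= i; j++) result[j].version = previous[j];
      return { success: false, instances: result };
    }
  }
  return { success: true, instances: result };
}

// Usage: the second instance fails its health check, so everything reverts.
const fleet: Instance[] = [
  { name: "api-1", version: "1.0" },
  { name: "api-2", version: "1.0" },
  { name: "api-3", version: "1.0" },
];
const outcome = rollingUpdate(fleet, "1.1", (i) => i.name !== "api-2");
console.log(outcome.success, outcome.instances.map((i) => i.version));
```

Because only one instance is out of rotation at a time, the remaining instances keep serving traffic throughout, which is what makes the strategy zero-downtime.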
Disaster Recovery
Backup Strategy
Database Backups:
- Continuous: Write-Ahead Log (WAL) archiving
- Daily: Full database backup
- Hourly: Incremental backup
- Retention: 30 days full, 90 days incremental
Implementation:
#!/bin/bash
# Automated backup script
set -euo pipefail
BACKUP_DIR=/backups
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="${BACKUP_DIR}/marketplace_${TIMESTAMP}.sql.gz"
pg_dump -h "$DB_HOST" -U "$DB_USER" "$DB_NAME" | gzip > "$BACKUP_FILE"
aws s3 cp "$BACKUP_FILE" s3://backups/database/
# Clean up old local backups (keep 30 days)
find "$BACKUP_DIR" -name "marketplace_*.sql.gz" -mtime +30 -delete
Disaster Recovery Plan
Recovery Time Objective (RTO): 4 hours
Recovery Point Objective (RPO): 1 hour
Runbook:
- Detect failure (monitoring alerts)
- Assess impact (which services affected)
- Restore from backup (latest valid backup)
- Verify data integrity (run validation queries)
- Resume operations (switch traffic to recovered instance)
- Post-mortem (document incident and preventions)
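During a post-mortem it helps to check the incident timeline against the stated objectives. A minimal sketch of that check, assuming the RTO of 4 hours and RPO of 1 hour given above; the function names and the example timestamps are illustrative.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// RPO: how much data (measured in time) was lost between the last
// restorable backup and the failure.
function meetsRpo(lastBackupAt: Date, failureAt: Date, rpoMs: number): boolean {
  return failureAt.getTime() - lastBackupAt.getTime() <= rpoMs;
}

// RTO: how long the service was unavailable before it was restored.
function meetsRto(failureAt: Date, restoredAt: Date, rtoMs: number): boolean {
  return restoredAt.getTime() - failureAt.getTime() <= rtoMs;
}

// Hypothetical incident timeline.
const lastBackup = new Date("2025-01-10T09:00:00Z");
const failure = new Date("2025-01-10T09:45:00Z");
const restored = new Date("2025-01-10T14:15:00Z");

console.log(meetsRpo(lastBackup, failure, 1 * HOUR_MS)); // true: 45 minutes of data at risk
console.log(meetsRto(failure, restored, 4 * HOUR_MS)); // false: 4.5 hours to recover
```

With hourly incremental backups the worst-case data loss is just under one hour, which is what makes a one-hour RPO achievable; continuous WAL archiving can reduce it further.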
Security Considerations
Essential Security Measures
Authentication & Authorization:
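import { sign, verify, type JwtPayload } from "jsonwebtoken";

// Fail at startup if a secret is missing instead of signing with undefined
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`Missing environment variable ${name}`);
  return value;
}
const JWT_SECRET = requireEnv("JWT_SECRET");
const JWT_REFRESH_SECRET = requireEnv("JWT_REFRESH_SECRET");

// Generate a short-lived access token and a longer-lived refresh token
function generateTokens(userId: string) {
  const accessToken = sign({ userId, type: "access" }, JWT_SECRET, {
    expiresIn: "15m",
  });
  const refreshToken = sign({ userId, type: "refresh" }, JWT_REFRESH_SECRET, {
    expiresIn: "7d",
  });
  return { accessToken, refreshToken };
}

// Verify the refresh token and issue a new token pair
async function refreshAccessToken(refreshToken: string) {
  // verify() throws if the token is expired or the signature is invalid
  const payload = verify(refreshToken, JWT_REFRESH_SECRET) as JwtPayload;
  if (payload.type !== "refresh" || typeof payload.userId !== "string") {
    throw new Error("Invalid refresh token");
  }
  return generateTokens(payload.userId);
}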
Data Encryption:
import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

const ALGORITHM = "aes-256-gcm";
// 32-byte key, hex-encoded (64 hex characters)
const KEY = Buffer.from(process.env.ENCRYPTION_KEY ?? "", "hex");
if (KEY.length !== 32) throw new Error("ENCRYPTION_KEY must be 32 bytes (hex)");

function encrypt(text: string): string {
  // 12 bytes (96 bits) is the recommended IV length for GCM; never reuse an IV with the same key
  const iv = randomBytes(12);
  const cipher = createCipheriv(ALGORITHM, KEY, iv);
  let encrypted = cipher.update(text, "utf8", "hex");
  encrypted += cipher.final("hex");
  const authTag = cipher.getAuthTag();
  // Store IV and auth tag alongside the ciphertext; neither is secret
  return `${iv.toString("hex")}:${authTag.toString("hex")}:${encrypted}`;
}

function decrypt(encryptedData: string): string {
  const [ivHex, authTagHex, encrypted] = encryptedData.split(":");
  const decipher = createDecipheriv(ALGORITHM, KEY, Buffer.from(ivHex, "hex"));
  decipher.setAuthTag(Buffer.from(authTagHex, "hex"));
  let decrypted = decipher.update(encrypted, "hex", "utf8");
  // final() throws if the auth tag does not match (tampered or corrupted data)
  decrypted += decipher.final("utf8");
  return decrypted;
}
Rate Limiting:
import rateLimit from "express-rate-limit";

// Note: the default store is in-memory and per process. With multiple
// instances behind a load balancer, configure a shared store (e.g. Redis).
const apiLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // 100 requests per window per client
  message: "Too many requests, please try again later",
});
app.use("/api/", apiLimiter);

// Stricter limits for auth endpoints
const authLimiter = rateLimit({
  windowMs: 60 * 60 * 1000, // 1 hour
  max: 5, // 5 attempts per hour
  message: "Too many login attempts",
});
app.use("/api/auth/login", authLimiter);
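To show what `windowMs` and `max` mean in practice, here is a small fixed-window counter. It is a sketch of the general technique, not express-rate-limit's implementation; `FixedWindowLimiter` and the injectable clock are hypothetical.

```typescript
class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly windowMs: number;
  private readonly max: number;
  private readonly now: () => number;

  constructor(windowMs: number, max: number, now: () => number = Date.now) {
    this.windowMs = windowMs;
    this.max = max;
    this.now = now;
  }

  // Returns true if the request identified by `key` (e.g. client IP) is allowed.
  allow(key: string): boolean {
    const t = this.now();
    const current = this.windows.get(key);
    // No window yet, or the previous window has expired: start a new one.
    if (!current || t - current.start >= this.windowMs) {
      this.windows.set(key, { start: t, count: 1 });
      return true;
    }
    if (current.count >= this.max) return false;
    current.count += 1;
    return true;
  }
}

// Usage with a fake clock: 5 login attempts per hour.
let clock = 0;
const loginLimiter = new FixedWindowLimiter(60 * 60 * 1000, 5, () => clock);
const attempts = Array.from({ length: 6 }, () => loginLimiter.allow("203.0.113.7"));
console.log(attempts); // [true, true, true, true, true, false]
clock = 60 * 60 * 1000;
console.log(loginLimiter.allow("203.0.113.7")); // true
```

A fixed window allows short bursts around window boundaries; sliding-window or token-bucket variants smooth this out at the cost of more bookkeeping.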
Key Takeaways
Architecture Patterns:
- Start with a monolith, extract microservices as needed
- Use PostgreSQL with JSONB for flexibility
- Implement multi-level caching (browser, Redis, database)
- Deploy read replicas before sharding
- Use message queues for asynchronous operations
Performance Optimization:
- Index common query patterns
- Prevent N+1 queries with joins
- Implement connection pooling
- Use CDN for static assets
- Cache aggressively with TTL-based invalidation
Scalability Milestones:
- 0-10K users: Monolith + Redis cache
- 10K-100K users: Read replicas + CDN
- 100K-1M users: Microservices + load balancing
- 1M+ users: Database sharding + global CDN
Monitoring & Reliability:
- Implement health checks on all services
- Track error rates, response times, throughput
- Set up alerting for anomalies
- Maintain 99.9% uptime target
- Test disaster recovery procedures quarterly
Deployment:
- Use Docker for consistent environments
- Implement CI/CD for automated deployments
- Deploy with zero-downtime strategies
- Maintain rollback capability
- Monitor post-deployment metrics
Next Steps
- Download the Architecture Diagram Template to map your system components
- Implement caching layer for most-accessed data
- Set up monitoring with Sentry and Datadog
- Create health check endpoints for all services
- Document disaster recovery procedures and test quarterly
Scalable architecture is built incrementally. Start with solid foundations, measure performance, and optimize based on actual usage patterns.
About the Author

Chris Mask
Founder & CEO
Serial entrepreneur, marketplace architect, and AI-assisted development pioneer with 7+ years building two-sided platforms. Founded Directorism after launching and exiting two successful marketplace businesses. Has personally architected and consulted on 200+ marketplace and directory projects. Recognized authority on cold-start problems, platform economics, marketplace SEO, and leveraging AI tools for rapid development. Early adopter of AI-powered coding workflows, integrating Claude, Cursor, and agentic development patterns into production systems.
Related Resources
The Definitive Marketplace Tech Stack Guide for 2025
Choose the right tech stack for your marketplace. Learn proven architectures from Next.js to PostgreSQL, when to use each technology, and how to scale from MVP to 1M+ users with real cost projections.
Database Architecture Patterns for Marketplaces
Learn how to design scalable marketplace database schemas. Includes battle-tested patterns for users, listings, transactions, reviews, and messaging systems.
API Design Best Practices for Marketplaces
Learn REST API design patterns for marketplaces. Includes authentication, rate limiting, pagination, webhooks, and complete implementation examples.