You've learned streams, concurrency, and clean code principles individually — now it's time to forge them together into something real. In this capstone lesson, you'll design and build a high-performance Java API for a product catalog service that handles concurrent requests, processes data with streams, and follows professional coding standards. By the end, you'll have a mental blueprint for building production-grade Java APIs that don't just work, but work fast and cleanly.
Every high-performance API starts not with performance tricks, but with a clean, well-structured domain. If your foundation is messy, no amount of optimization will save you. Let's design a product catalog API.
The core principle here is Separation of Concerns: each class should have one reason to change. We'll structure our project into layers:

- **Domain** — immutable entities and value objects (`Product`)
- **Repository** — thread-safe data access
- **Service** — business logic, stream queries, and caching
- **Controller (API)** — HTTP handling and input validation
Start with an immutable domain model. Immutability is a performance superpower in concurrent environments because immutable objects are inherently thread-safe — no synchronization needed.
```java
public record Product(
    String id,
    String name,
    String category,
    BigDecimal price,
    int stockQuantity,
    Instant createdAt
) {
    public Product {
        Objects.requireNonNull(id, "id must not be null");
        Objects.requireNonNull(name, "name must not be null");
        Objects.requireNonNull(price, "price must not be null");
        if (price.compareTo(BigDecimal.ZERO) < 0) {
            throw new IllegalArgumentException("price must be non-negative");
        }
    }
}
```
Java records (introduced in Java 16) give us immutability, equals(), hashCode(), and toString() for free. The compact constructor validates invariants at creation time, following the "fail fast" principle — if bad data enters your system, you want to know immediately, not three layers deep in a stack trace.
Important: Validate at the boundary. Your domain objects should reject invalid state, and your API layer should validate user input before it ever reaches the domain.
A common mistake is creating "anemic" domain models — objects that are just bags of getters and setters with all logic living elsewhere. Instead, consider putting behavior that naturally belongs to the entity on the entity itself.
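As a sketch of what "behavior on the entity" can look like, here is a simplified `Product` (trimmed to the fields needed for the example; `isInStock` and `withStock` are hypothetical helpers, not part of the lesson's record) that keeps stock logic next to the data it governs:

```java
import java.math.BigDecimal;

// Sketch: behavior that belongs to the entity lives on the entity.
// This is a simplified Product with only the fields needed here.
public record Product(String id, String name, BigDecimal price, int stockQuantity) {

    // Domain logic on the domain object, not in a far-away "utils" class.
    public boolean isInStock() {
        return stockQuantity > 0;
    }

    // Immutable update: returns a new instance instead of mutating this one.
    public Product withStock(int newQuantity) {
        if (newQuantity < 0) {
            throw new IllegalArgumentException("stock must be >= 0");
        }
        return new Product(id, name, price, newQuantity);
    }

    public static void main(String[] args) {
        Product p = new Product("p1", "Keyboard", new BigDecimal("49.99"), 0);
        Product restocked = p.withStock(5);
        System.out.println(p.isInStock());          // false
        System.out.println(restocked.isInStock());  // true
    }
}
```

Because the record is immutable, "updating" stock produces a new instance, which is exactly what makes these objects safe to share across threads.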
For our high-performance API, we need a data store that supports concurrent reads and writes without becoming a bottleneck. This is where java.util.concurrent shines.
A naive approach would use a HashMap wrapped in synchronized blocks — but that creates a contention bottleneck where every thread waits for a single lock. Instead, we use ConcurrentHashMap, which uses fine-grained locking (per-bin locks and CAS operations in modern JVMs) so that multiple threads can read and write to different parts of the map simultaneously.
```java
public class ProductRepository {

    private final ConcurrentMap<String, Product> store = new ConcurrentHashMap<>();

    public Optional<Product> findById(String id) {
        return Optional.ofNullable(store.get(id));
    }

    public Product save(Product product) {
        store.put(product.id(), product);
        return product;
    }

    public boolean delete(String id) {
        return store.remove(id) != null;
    }

    public Collection<Product> findAll() {
        return Collections.unmodifiableCollection(store.values());
    }

    public Product update(String id, UnaryOperator<Product> updater) {
        return store.computeIfPresent(id, (key, existing) -> updater.apply(existing));
    }
}
```
Notice the update method uses computeIfPresent — this is an atomic operation on ConcurrentHashMap. The lambda executes while the entry is locked, preventing lost updates when two threads modify the same product simultaneously. This is far superior to a "read, modify, write" pattern which creates a race condition.
The findAll() method returns an unmodifiable view, preventing callers from accidentally mutating the internal state. This is a defensive programming practice that becomes critical in concurrent systems.
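To see why the atomic update matters, here is a minimal sketch (a bare `ConcurrentHashMap<String, Integer>` stands in for the repository's internal store) in which many threads decrement the same counter through `computeIfPresent`:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch: why computeIfPresent beats read-modify-write under contention.
// A plain stock-count map stands in for the repository's internal store.
public class AtomicUpdateDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentMap<String, Integer> stock = new ConcurrentHashMap<>();
        stock.put("p1", 10_000);

        // 100 threads each decrement the same key 100 times.
        Thread[] threads = new Thread[100];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 100; j++) {
                    // Atomic: the remapping function runs while the entry is held.
                    stock.computeIfPresent("p1", (k, v) -> v - 1);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();

        // With a naive get-then-put, some decrements would be lost.
        System.out.println(stock.get("p1")); // 0
    }
}
```

Replace `computeIfPresent` with `stock.put("p1", stock.get("p1") - 1)` and the final count will almost never reach zero — that gap is the lost-update race the lesson warns about.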
Now let's build the service layer — this is where Java Streams transform our raw data into the rich query results an API consumer expects. The goal: support filtering, sorting, pagination, and aggregation, all expressed as a fluent stream pipeline.
First, define a query object that encapsulates search parameters:
```java
public record ProductQuery(
    String category,
    BigDecimal minPrice,
    BigDecimal maxPrice,
    String sortBy,
    boolean ascending,
    int page,
    int pageSize
) {
    public ProductQuery {
        if (page < 0) throw new IllegalArgumentException("page must be >= 0");
        if (pageSize < 1 || pageSize > 100)
            throw new IllegalArgumentException("pageSize must be between 1 and 100");
    }
}
```
Now the service method that processes queries:
```java
public class ProductService {

    private final ProductRepository repository;

    public ProductService(ProductRepository repository) {
        this.repository = repository;
    }

    public List<Product> search(ProductQuery query) {
        Stream<Product> stream = repository.findAll().stream();

        // Apply filters conditionally
        if (query.category() != null) {
            stream = stream.filter(p -> p.category().equalsIgnoreCase(query.category()));
        }
        if (query.minPrice() != null) {
            stream = stream.filter(p -> p.price().compareTo(query.minPrice()) >= 0);
        }
        if (query.maxPrice() != null) {
            stream = stream.filter(p -> p.price().compareTo(query.maxPrice()) <= 0);
        }

        // Sort
        Comparator<Product> comparator = resolveComparator(query.sortBy());
        if (!query.ascending()) {
            comparator = comparator.reversed();
        }
        stream = stream.sorted(comparator);

        // Paginate
        return stream
            .skip((long) query.page() * query.pageSize())
            .limit(query.pageSize())
            .toList();
    }

    private Comparator<Product> resolveComparator(String sortBy) {
        return switch (sortBy) {
            case "price" -> Comparator.comparing(Product::price);
            case "name" -> Comparator.comparing(Product::name, String.CASE_INSENSITIVE_ORDER);
            case "date" -> Comparator.comparing(Product::createdAt);
            // case null guards against queries that omit sortBy (Java 21 switch)
            case null, default -> Comparator.comparing(Product::id);
        };
    }
}
```
This is a clean, composable design. Each operation (filter, sort, paginate) is a discrete step in the pipeline. Streams are lazy — the filtering, sorting, and skipping don't happen until .toList() is called, and Java can optimize the pipeline internally.
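A minimal sketch (a counter inside a `filter` predicate — purely illustrative, not part of the service) makes the laziness concrete: intermediate operations do no work until a terminal operation pulls elements through the pipeline.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Sketch: stream stages run lazily, only when a terminal operation executes.
public class LazinessDemo {
    public static void main(String[] args) {
        AtomicInteger filterCalls = new AtomicInteger();

        Stream<Integer> pipeline = List.of(1, 2, 3, 4, 5).stream()
                .filter(n -> {
                    filterCalls.incrementAndGet(); // count each predicate invocation
                    return n % 2 == 0;
                });

        // No terminal operation yet: the filter has not run a single time.
        System.out.println(filterCalls.get()); // 0

        List<Integer> evens = pipeline.toList(); // terminal op triggers the work
        System.out.println(evens);               // [2, 4]
        System.out.println(filterCalls.get());   // 5
    }
}
```

One caveat worth knowing: `sorted()` is a stateful operation, so it must buffer every element that reaches it before emitting any — laziness helps the filters, but sorting still pays for the full filtered set.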
For aggregation endpoints (e.g., "average price by category"), Collectors are your best friend:
```java
public Map<String, DoubleSummaryStatistics> getPriceStatsByCategory() {
    return repository.findAll().stream()
        .collect(Collectors.groupingBy(
            Product::category,
            Collectors.summarizingDouble(p -> p.price().doubleValue())
        ));
}
```
This single stream expression gives you count, sum, min, max, and average price per category — all in one pass through the data.
Real-world APIs handle many requests simultaneously. Java 21's virtual threads (Project Loom) revolutionize how we write concurrent code — they let you write simple, blocking-style code that scales to millions of concurrent tasks.
Here's how to set up a lightweight HTTP server using virtual threads:
```java
public class ApiServer {

    private final ProductService productService;
    private final ExecutorService executor;
    private HttpServer server;

    public ApiServer(ProductService productService) {
        this.productService = productService;
        // Each request gets its own virtual thread — lightweight, no thread pool sizing needed
        this.executor = Executors.newVirtualThreadPerTaskExecutor();
    }

    public void start(int port) throws IOException {
        server = HttpServer.create(new InetSocketAddress(port), 0);
        server.setExecutor(executor);
        server.createContext("/api/products", this::handleProducts);
        server.createContext("/api/products/stats", this::handleStats);
        server.start();
        System.out.println("API server running on port " + port);
    }

    public void stop() {
        server.stop(1);    // allow up to 1 second for in-flight exchanges
        executor.close();  // wait for remaining virtual threads to finish
    }

    // Handlers parse the request, call productService, and write the response
    private void handleProducts(HttpExchange exchange) throws IOException { /* ... */ }
    private void handleStats(HttpExchange exchange) throws IOException { /* ... */ }
}
```
With platform threads (traditional threads), you'd need to carefully tune a thread pool — too few threads and requests queue up; too many and you exhaust memory (each platform thread costs ~1MB of stack space). Virtual threads eliminate this problem: they're managed by the JVM, cost only a few hundred bytes, and the JVM schedules them onto a small pool of carrier threads automatically.
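To make that scaling claim tangible, here is a small, self-contained sketch that launches ten thousand blocking tasks on virtual threads (requires Java 21; the 10 ms sleep stands in for a blocking I/O call):

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: 10,000 concurrent blocking tasks on virtual threads.
// The same count with 1 MB-stack platform threads would be prohibitive.
public class VirtualThreadDemo {
    public static void main(String[] args) {
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(Duration.ofMillis(10)); // simulated blocking I/O
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    completed.incrementAndGet();
                });
            }
        } // try-with-resources: close() waits for all submitted tasks

        System.out.println(completed.get()); // 10000
    }
}
```

While a virtual thread sleeps (or waits on a socket), it is unmounted from its carrier thread, so a handful of carriers can service all ten thousand tasks.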
But concurrency isn't just about handling requests — it's also about performing work in parallel within a request. Suppose you need to enrich product data from multiple sources:
```java
// Returns EnrichedProduct: Product is a record, so it cannot be subclassed;
// EnrichedProduct is its own type that wraps the base product.
public EnrichedProduct enrichProduct(Product product)
        throws InterruptedException, ExecutionException {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        Supplier<String> descriptionTask = scope.fork(() ->
            fetchDescription(product.id()));
        Supplier<List<String>> reviewsTask = scope.fork(() ->
            fetchReviews(product.id()));
        Supplier<Double> ratingTask = scope.fork(() ->
            fetchAverageRating(product.id()));

        scope.join();           // Wait for all tasks
        scope.throwIfFailed();  // Propagate any exceptions

        return new EnrichedProduct(
            product,
            descriptionTask.get(),
            reviewsTask.get(),
            ratingTask.get()
        );
    }
}
```
Structured concurrency (preview in Java 21) ensures that all forked tasks are treated as a unit: if one fails, the others are cancelled. If the parent thread is interrupted, child tasks are cleaned up. This prevents the common problems with raw CompletableFuture chains — leaked threads, orphaned tasks, and tangled error handling.
Key insight: Virtual threads excel at I/O-bound work (network calls, database queries). For CPU-bound work (heavy computation), parallel streams or a fixed-size `ForkJoinPool` are still more appropriate, because there you want to match thread count to CPU cores.
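A minimal sketch of the CPU-bound case: a parallel stream splits a pure computation across the common `ForkJoinPool`, which is sized to the number of available cores by default.

```java
import java.util.stream.LongStream;

// Sketch: CPU-bound work parallelized with a parallel stream.
// The range is split across ForkJoinPool workers; no blocking I/O involved.
public class ParallelSumDemo {
    public static void main(String[] args) {
        long n = 10_000_000L;

        long sequential = LongStream.rangeClosed(1, n).sum();
        long parallel = LongStream.rangeClosed(1, n).parallel().sum();

        // Same result either way; only the execution strategy differs.
        System.out.println(sequential == parallel); // true
        System.out.println(parallel);               // 50000005000000
    }
}
```

Spawning ten million virtual threads for this would gain nothing: the cores are the bottleneck, not waiting.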
Caching is the single biggest performance lever in most APIs. When your stream-based query engine sorts 10,000 products for every request, a cache that stores recent results can reduce response times by orders of magnitude.
Let's build a time-based cache using ConcurrentHashMap:
```java
public class QueryCache<K, V> {

    private final ConcurrentMap<K, CacheEntry<V>> cache = new ConcurrentHashMap<>();
    private final Duration ttl;

    public QueryCache(Duration ttl) {
        this.ttl = ttl;
    }

    private record CacheEntry<V>(V value, Instant expiresAt) {
        boolean isExpired() {
            return Instant.now().isAfter(expiresAt);
        }
    }

    public Optional<V> get(K key) {
        CacheEntry<V> entry = cache.get(key);
        if (entry == null) {
            return Optional.empty();
        }
        if (entry.isExpired()) {
            // Lazy eviction: remove only this exact entry, so a fresh value
            // written concurrently by another thread is never evicted
            cache.remove(key, entry);
            return Optional.empty();
        }
        return Optional.of(entry.value());
    }

    public void put(K key, V value) {
        cache.put(key, new CacheEntry<>(value, Instant.now().plus(ttl)));
    }

    public V computeIfAbsent(K key, Function<K, V> loader) {
        return get(key).orElseGet(() -> {
            V value = loader.apply(key);
            put(key, value);
            return value;
        });
    }
}
```
Now integrate it into the service:
```java
public class ProductService {

    private final ProductRepository repository;
    private final QueryCache<ProductQuery, List<Product>> queryCache;

    public ProductService(ProductRepository repository) {
        this.repository = repository;
        this.queryCache = new QueryCache<>(Duration.ofSeconds(30));
    }

    public List<Product> search(ProductQuery query) {
        return queryCache.computeIfAbsent(query, this::executeSearch);
    }

    private List<Product> executeSearch(ProductQuery query) {
        // ... the stream pipeline from the previous section
    }
}
```
Because ProductQuery is a record, it gets correct equals() and hashCode() for free — identical queries will hit the cache. This is another reason records are so powerful for API design.
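A quick sketch demonstrates the property the cache relies on (using a trimmed-down `ProductQuery` with only three fields, for brevity): two records constructed separately but with identical field values are equal and hash identically.

```java
import java.math.BigDecimal;

// Sketch: record equality is value-based, so identical queries
// constructed by different requests resolve to the same cache entry.
public class RecordKeyDemo {
    // Simplified stand-in for the lesson's ProductQuery record.
    record ProductQuery(String category, BigDecimal minPrice, int page) {}

    public static void main(String[] args) {
        ProductQuery a = new ProductQuery("books", new BigDecimal("10.00"), 0);
        ProductQuery b = new ProductQuery("books", new BigDecimal("10.00"), 0);

        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```

A hand-written class that forgot to override `equals()` and `hashCode()` would make every lookup a cache miss, silently disabling the cache.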
There's a subtle issue with the computeIfAbsent method above: under high concurrency, two threads with the same cache-miss key could both execute the loader function. For most read-heavy APIs, this thundering herd duplication is acceptable — the result is the same and the cost is just one extra computation. But if you need strict single-computation guarantees, use ConcurrentHashMap.computeIfAbsent directly (which locks per-key) but be aware it blocks other threads waiting for the same key.
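The single-computation guarantee of `ConcurrentHashMap.computeIfAbsent` can be verified directly. In this sketch, fifty threads miss on the same key at the same moment, yet the loader runs exactly once:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: computeIfAbsent locks per key, so a loader for a hot key
// executes once even under a simultaneous stampede of cache misses.
public class SingleComputationDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
        AtomicInteger loaderRuns = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);

        Thread[] threads = new Thread[50];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                try { start.await(); } catch (InterruptedException e) { return; }
                cache.computeIfAbsent("hot-key", k -> {
                    loaderRuns.incrementAndGet(); // count loader invocations
                    return "expensive result for " + k;
                });
            });
            threads[i].start();
        }
        start.countDown(); // release all threads at once
        for (Thread t : threads) t.join();

        System.out.println(loaderRuns.get()); // 1
    }
}
```

The trade-off is visible in the same picture: the other forty-nine threads block until the winner finishes, which is exactly the behavior noted above.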
Cache invalidation is one of the two hard problems in computer science (the other being naming things). For our API, TTL-based expiration is the simplest sound strategy. For write-heavy systems, consider invalidating on writes: call `cache.remove()` in your `save()` and `delete()` methods.
The final piece is composing all layers together. Clean code demands that we use constructor injection — each component receives its dependencies explicitly rather than creating them internally. This makes the code testable, configurable, and transparent.
```java
public class Application {

    public static void main(String[] args) throws IOException {
        // Wire dependencies — composition root
        ProductRepository repository = new ProductRepository();
        ProductService service = new ProductService(repository);
        ApiServer server = new ApiServer(service);

        // Seed sample data
        seedProducts(repository, 10_000);

        // Start server
        server.start(8080);

        // Register shutdown hook for graceful cleanup
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            System.out.println("Shutting down gracefully...");
            server.stop();
        }));
    }

    private static void seedProducts(ProductRepository repo, int count) {
        var random = new Random(42); // Fixed seed for reproducibility
        var categories = List.of("electronics", "books", "clothing", "home", "sports");
        IntStream.range(0, count)
            .mapToObj(i -> new Product(
                UUID.randomUUID().toString(),
                "Product-" + i,
                categories.get(random.nextInt(categories.size())),
                BigDecimal.valueOf(random.nextDouble(5.0, 500.0)).setScale(2, RoundingMode.HALF_UP),
                random.nextInt(0, 1000),
                Instant.now().minus(Duration.ofDays(random.nextInt(365)))
            ))
            .forEach(repo::save);
    }
}
```
Notice how seedProducts uses a stream pipeline (IntStream.range → mapToObj → forEach) instead of an imperative loop. This is idiomatic modern Java — declarative, readable, and expressive.
The shutdown hook is a production necessity. Without it, in-flight requests get abruptly killed when the JVM terminates. A graceful shutdown stops accepting new requests, waits for current ones to complete, then releases resources.
Here's a checklist for production readiness that you should mentally walk through for any API you build:
- A `/api/health` endpoint returning server status for load balancers

This architecture gives you a clean separation where each layer can be tested independently: the repository with concurrent writes, the service with a mock repository, and the server with integration tests against the real stack.
Key takeaways:

- `ConcurrentHashMap` with atomic operations (`computeIfPresent`, `computeIfAbsent`) prevents race conditions without coarse-grained locking
- `sorted()` forces full materialization of the stream, making caching essential for performance

In a high-performance Java API, the lesson emphasizes starting with a clean, immutable domain model using Java records before applying any performance optimizations. It also stresses the importance of Separation of Concerns through distinct layers (Domain, Repository, Service, Controller).

**Exercise:** Explain why immutability in the domain layer is described as a "performance superpower" in the context of a concurrent API. In your answer, describe what specific concurrency problem immutable objects avoid, how this relates to the layered architecture presented in the lesson (particularly the Service layer where business logic and caching occur), and why the lesson argues that starting with clean design principles is more important than jumping straight to performance tricks. Use the `Product` record example to illustrate your reasoning.