You've learned streams, concurrency, and clean code principles individually — now it's time to forge them together into something real. In this capstone lesson, you'll design and build a high-performance Java API for a product catalog service that handles concurrent requests, processes data with streams, and follows professional coding standards. By the end, you'll have a mental blueprint for building production-grade Java APIs that don't just work, but work fast and cleanly.
Every high-performance API starts not with performance tricks, but with a clean, well-structured domain. If your foundation is messy, no amount of optimization will save you. Let's design a product catalog API.
The core principle here is Separation of Concerns: each class should have one reason to change. We'll structure our project into layers:

- **Domain** — immutable entities and value objects (`Product`)
- **Repository** — thread-safe data access
- **Service** — business logic, stream queries, and caching
- **Controller (API)** — HTTP handling and input validation
Start with an immutable domain model. Immutability is a performance superpower in concurrent environments because immutable objects are inherently thread-safe — no synchronization needed.
```java
public record Product(
    String id,
    String name,
    String category,
    BigDecimal price,
    int stockQuantity,
    Instant createdAt
) {
    public Product {
        Objects.requireNonNull(id, "id must not be null");
        Objects.requireNonNull(name, "name must not be null");
        Objects.requireNonNull(price, "price must not be null");
        if (price.compareTo(BigDecimal.ZERO) < 0) {
            throw new IllegalArgumentException("price must be non-negative");
        }
    }
}
```
Java records (introduced in Java 16) give us immutability, equals(), hashCode(), and toString() for free. The compact constructor validates invariants at creation time, following the "fail fast" principle — if bad data enters your system, you want to know immediately, not three layers deep in a stack trace.
Important: Validate at the boundary. Your domain objects should reject invalid state, and your API layer should validate user input before it ever reaches the domain.
A common mistake is creating "anemic" domain models — objects that are just bags of getters and setters with all logic living elsewhere. Instead, consider putting behavior that naturally belongs to the entity on the entity itself.
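As a sketch of what "behavior on the entity" can look like, here is a simplified `Product` (trimmed to the fields needed for the example; `isInStock` and `withStock` are hypothetical helpers, not part of the lesson's record) that keeps stock logic next to the data it governs:

```java
import java.math.BigDecimal;

// Sketch: behavior that belongs to the entity lives on the entity.
// This is a simplified Product with only the fields needed here.
public record Product(String id, String name, BigDecimal price, int stockQuantity) {

    // Domain logic on the domain object, not in a far-away "utils" class.
    public boolean isInStock() {
        return stockQuantity > 0;
    }

    // Immutable update: returns a new instance instead of mutating this one.
    public Product withStock(int newQuantity) {
        if (newQuantity < 0) {
            throw new IllegalArgumentException("stock must be >= 0");
        }
        return new Product(id, name, price, newQuantity);
    }

    public static void main(String[] args) {
        Product p = new Product("p1", "Keyboard", new BigDecimal("49.99"), 0);
        Product restocked = p.withStock(5);
        System.out.println(p.isInStock());          // false
        System.out.println(restocked.isInStock());  // true
    }
}
```

Because the record is immutable, "updating" stock produces a new instance, which is exactly what makes these objects safe to share across threads.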
For our high-performance API, we need a data store that supports concurrent reads and writes without becoming a bottleneck. This is where java.util.concurrent shines.
A naive approach would use a HashMap wrapped in synchronized blocks — but that creates a contention bottleneck where every thread waits for a single lock. Instead, we use ConcurrentHashMap, which uses fine-grained locking (per-bin locks and CAS operations in modern JVMs) so that multiple threads can read and write to different parts of the map simultaneously.
```java
public class ProductRepository {

    private final ConcurrentMap<String, Product> store = new ConcurrentHashMap<>();

    public Optional<Product> findById(String id) {
        return Optional.ofNullable(store.get(id));
    }

    public Product save(Product product) {
        store.put(product.id(), product);
        return product;
    }

    public boolean delete(String id) {
        return store.remove(id) != null;
    }

    public Collection<Product> findAll() {
        return Collections.unmodifiableCollection(store.values());
    }

    public Product update(String id, UnaryOperator<Product> updater) {
        return store.computeIfPresent(id, (key, existing) -> updater.apply(existing));
    }
}
```
Notice the update method uses computeIfPresent — this is an atomic operation on ConcurrentHashMap. The lambda executes while the entry is locked, preventing lost updates when two threads modify the same product simultaneously. This is far superior to a "read, modify, write" pattern which creates a race condition.
The findAll() method returns an unmodifiable view, preventing callers from accidentally mutating the internal state. This is a defensive programming practice that becomes critical in concurrent systems.
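To see why the atomic update matters, here is a minimal sketch (a bare `ConcurrentHashMap<String, Integer>` stands in for the repository's internal store) in which many threads decrement the same counter through `computeIfPresent`:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch: why computeIfPresent beats read-modify-write under contention.
// A plain stock-count map stands in for the repository's internal store.
public class AtomicUpdateDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentMap<String, Integer> stock = new ConcurrentHashMap<>();
        stock.put("p1", 10_000);

        // 100 threads each decrement the same key 100 times.
        Thread[] threads = new Thread[100];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 100; j++) {
                    // Atomic: the remapping function runs while the entry is held.
                    stock.computeIfPresent("p1", (k, v) -> v - 1);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();

        // With a naive get-then-put, some decrements would be lost.
        System.out.println(stock.get("p1")); // 0
    }
}
```

Replace `computeIfPresent` with `stock.put("p1", stock.get("p1") - 1)` and the final count will almost never reach zero — that gap is the lost-update race the lesson warns about.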
Now let's build the service layer — this is where Java Streams transform our raw data into the rich query results an API consumer expects. The goal: support filtering, sorting, pagination, and aggregation, all expressed as a fluent stream pipeline.
First, define a query object that encapsulates search parameters:
```java
public record ProductQuery(
    String category,
    BigDecimal minPrice,
    BigDecimal maxPrice,
    String sortBy,
    boolean ascending,
    int page,
    int pageSize
) {
    public ProductQuery {
        if (page < 0) throw new IllegalArgumentException("page must be >= 0");
        if (pageSize < 1 || pageSize > 100)
            throw new IllegalArgumentException("pageSize must be between 1 and 100");
    }
}
```
Now the service method that processes queries:
```java
public class ProductService {

    private final ProductRepository repository;

    public ProductService(ProductRepository repository) {
        this.repository = repository;
    }

    public List<Product> search(ProductQuery query) {
        Stream<Product> stream = repository.findAll().stream();

        // Apply filters conditionally
        if (query.category() != null) {
            stream = stream.filter(p -> p.category().equalsIgnoreCase(query.category()));
        }
        if (query.minPrice() != null) {
            stream = stream.filter(p -> p.price().compareTo(query.minPrice()) >= 0);
        }
        if (query.maxPrice() != null) {
            stream = stream.filter(p -> p.price().compareTo(query.maxPrice()) <= 0);
        }

        // Sort
        Comparator<Product> comparator = resolveComparator(query.sortBy());
        if (!query.ascending()) {
            comparator = comparator.reversed();
        }
        stream = stream.sorted(comparator);

        // Paginate
        return stream
            .skip((long) query.page() * query.pageSize())
            .limit(query.pageSize())
            .toList();
    }

    private Comparator<Product> resolveComparator(String sortBy) {
        return switch (sortBy) {
            case "price" -> Comparator.comparing(Product::price);
            case "name" -> Comparator.comparing(Product::name, String.CASE_INSENSITIVE_ORDER);
            case "date" -> Comparator.comparing(Product::createdAt);
            // case null guards against queries that omit sortBy (Java 21 switch)
            case null, default -> Comparator.comparing(Product::id);
        };
    }
}
```
This is a clean, composable design. Each operation (filter, sort, paginate) is a discrete step in the pipeline. Streams are lazy — the filtering, sorting, and skipping don't happen until .toList() is called, and Java can optimize the pipeline internally.
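A minimal sketch (a counter inside a `filter` predicate — purely illustrative, not part of the service) makes the laziness concrete: intermediate operations do no work until a terminal operation pulls elements through the pipeline.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Sketch: stream stages run lazily, only when a terminal operation executes.
public class LazinessDemo {
    public static void main(String[] args) {
        AtomicInteger filterCalls = new AtomicInteger();

        Stream<Integer> pipeline = List.of(1, 2, 3, 4, 5).stream()
                .filter(n -> {
                    filterCalls.incrementAndGet(); // count each predicate invocation
                    return n % 2 == 0;
                });

        // No terminal operation yet: the filter has not run a single time.
        System.out.println(filterCalls.get()); // 0

        List<Integer> evens = pipeline.toList(); // terminal op triggers the work
        System.out.println(evens);               // [2, 4]
        System.out.println(filterCalls.get());   // 5
    }
}
```

One caveat worth knowing: `sorted()` is a stateful operation, so it must buffer every element that reaches it before emitting any — laziness helps the filters, but sorting still pays for the full filtered set.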
For aggregation endpoints (e.g., "average price by category"), Collectors are your best friend:
```java
public Map<String, DoubleSummaryStatistics> getPriceStatsByCategory() {
    return repository.findAll().stream()
        .collect(Collectors.groupingBy(
            Product::category,
            Collectors.summarizingDouble(p -> p.price().doubleValue())
        ));
}
```
This single stream expression gives you count, sum, min, max, and average price per category — all in one pass through the data.
Real-world APIs handle many requests simultaneously. Java 21's virtual threads (Project Loom) revolutionize how we write concurrent code — they let you write simple, blocking-style code that scales to millions of concurrent tasks.
Here's how to set up a lightweight HTTP server using virtual threads:
```java
public class ApiServer {

    private final ProductService productService;
    private final ExecutorService executor;
    private HttpServer server;

    public ApiServer(ProductService productService) {
        this.productService = productService;
        // Each request gets its own virtual thread — lightweight, no thread pool sizing needed
        this.executor = Executors.newVirtualThreadPerTaskExecutor();
    }

    public void start(int port) throws IOException {
        server = HttpServer.create(new InetSocketAddress(port), 0);
        server.setExecutor(executor);
        server.createContext("/api/products", this::handleProducts);
        server.createContext("/api/products/stats", this::handleStats);
        server.start();
        System.out.println("API server running on port " + port);
    }

    public void stop() {
        server.stop(1);    // allow up to 1 second for in-flight exchanges
        executor.close();  // wait for remaining virtual threads to finish
    }

    // Handlers parse the request, call productService, and write the response
    private void handleProducts(HttpExchange exchange) throws IOException { /* ... */ }
    private void handleStats(HttpExchange exchange) throws IOException { /* ... */ }
}
```
With platform threads (traditional threads), you'd need to carefully tune a thread pool — too few threads and requests queue up; too many and you exhaust memory (each platform thread costs ~1MB of stack space). Virtual threads eliminate this problem: they're managed by the JVM, cost only a few hundred bytes, and the JVM schedules them onto a small pool of carrier threads automatically.
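To make that scaling claim tangible, here is a small, self-contained sketch that launches ten thousand blocking tasks on virtual threads (requires Java 21; the 10 ms sleep stands in for a blocking I/O call):

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: 10,000 concurrent blocking tasks on virtual threads.
// The same count with 1 MB-stack platform threads would be prohibitive.
public class VirtualThreadDemo {
    public static void main(String[] args) {
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(Duration.ofMillis(10)); // simulated blocking I/O
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    completed.incrementAndGet();
                });
            }
        } // try-with-resources: close() waits for all submitted tasks

        System.out.println(completed.get()); // 10000
    }
}
```

While a virtual thread sleeps (or waits on a socket), it is unmounted from its carrier thread, so a handful of carriers can service all ten thousand tasks.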
But concurrency isn't just about handling requests — it's also about performing work in parallel within a request. Suppose you need to enrich product data from multiple sources:
```java
// Returns EnrichedProduct: Product is a record, so it cannot be subclassed;
// EnrichedProduct is its own type that wraps the base product.
public EnrichedProduct enrichProduct(Product product)
        throws InterruptedException, ExecutionException {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        Supplier<String> descriptionTask = scope.fork(() ->
            fetchDescription(product.id()));
        Supplier<List<String>> reviewsTask = scope.fork(() ->
            fetchReviews(product.id()));
        Supplier<Double> ratingTask = scope.fork(() ->
            fetchAverageRating(product.id()));

        scope.join();           // Wait for all tasks
        scope.throwIfFailed();  // Propagate any exceptions

        return new EnrichedProduct(
            product,
            descriptionTask.get(),
            reviewsTask.get(),
            ratingTask.get()
        );
    }
}
```
Structured concurrency (preview in Java 21) ensures that all forked tasks are treated as a unit: if one fails, the others are cancelled. If the parent thread is interrupted, child tasks are cleaned up. This prevents the common problems with raw CompletableFuture chains — leaked threads, orphaned tasks, and tangled error handling.
Key insight: Virtual threads excel at I/O-bound work (network calls, database queries). For CPU-bound work (heavy computation), parallel streams or a fixed-size `ForkJoinPool` are still more appropriate, because there you want to match thread count to CPU cores.
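A minimal sketch of the CPU-bound case: a parallel stream splits a pure computation across the common `ForkJoinPool`, which is sized to the number of available cores by default.

```java
import java.util.stream.LongStream;

// Sketch: CPU-bound work parallelized with a parallel stream.
// The range is split across ForkJoinPool workers; no blocking I/O involved.
public class ParallelSumDemo {
    public static void main(String[] args) {
        long n = 10_000_000L;

        long sequential = LongStream.rangeClosed(1, n).sum();
        long parallel = LongStream.rangeClosed(1, n).parallel().sum();

        // Same result either way; only the execution strategy differs.
        System.out.println(sequential == parallel); // true
        System.out.println(parallel);               // 50000005000000
    }
}
```

Spawning ten million virtual threads for this would gain nothing: the cores are the bottleneck, not waiting.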
Caching is the single biggest performance lever in most APIs. When your stream-based query engine sorts 10,000 products for every request, a cache that stores recent results can reduce response times by orders of magnitude.
Let's build a time-based cache using ConcurrentHashMap:
```java
public class QueryCache<K, V> {

    private final ConcurrentMap<K, CacheEntry<V>> cache = new ConcurrentHashMap<>();
    private final Duration ttl;

    public QueryCache(Duration ttl) {
        this.ttl = ttl;
    }

    private record CacheEntry<V>(V value, Instant expiresAt) {
        boolean isExpired() {
            return Instant.now().isAfter(expiresAt);
        }
    }

    public Optional<V> get(K key) {
        CacheEntry<V> entry = cache.get(key);
        if (entry == null) {
            return Optional.empty();
        }
        if (entry.isExpired()) {
            // Lazy eviction: remove only this exact entry, so a fresh value
            // written concurrently by another thread is never evicted
            cache.remove(key, entry);
            return Optional.empty();
        }
        return Optional.of(entry.value());
    }

    public void put(K key, V value) {
        cache.put(key, new CacheEntry<>(value, Instant.now().plus(ttl)));
    }

    public V computeIfAbsent(K key, Function<K, V> loader) {
        return get(key).orElseGet(() -> {
            V value = loader.apply(key);
            put(key, value);
            return value;
        });
    }
}
```
Now integrate it into the service:
```java
public class ProductService {

    private final ProductRepository repository;
    private final QueryCache<ProductQuery, List<Product>> queryCache;

    public ProductService(ProductRepository repository) {
        this.repository = repository;
        this.queryCache = new QueryCache<>(Duration.ofSeconds(30));
    }

    public List<Product> search(ProductQuery query) {
        return queryCache.computeIfAbsent(query, this::executeSearch);
    }

    private List<Product> executeSearch(ProductQuery query) {
        // ... the stream pipeline from the previous section
    }
}
```
Because ProductQuery is a record, it gets correct equals() and hashCode() for free — identical queries will hit the cache. This is another reason records are so powerful for API design.
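A quick sketch demonstrates the property the cache relies on (using a trimmed-down `ProductQuery` with only three fields, for brevity): two records constructed separately but with identical field values are equal and hash identically.

```java
import java.math.BigDecimal;

// Sketch: record equality is value-based, so identical queries
// constructed by different requests resolve to the same cache entry.
public class RecordKeyDemo {
    // Simplified stand-in for the lesson's ProductQuery record.
    record ProductQuery(String category, BigDecimal minPrice, int page) {}

    public static void main(String[] args) {
        ProductQuery a = new ProductQuery("books", new BigDecimal("10.00"), 0);
        ProductQuery b = new ProductQuery("books", new BigDecimal("10.00"), 0);

        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```

A hand-written class that forgot to override `equals()` and `hashCode()` would make every lookup a cache miss, silently disabling the cache.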
There's a subtle issue with the computeIfAbsent method above: under high concurrency, two threads with the same cache-miss key could both execute the loader function. For most read-heavy APIs, this thundering herd duplication is acceptable — the result is the same and the cost is just one extra computation. But if you need strict single-computation guarantees, use ConcurrentHashMap.computeIfAbsent directly (which locks per-key) but be aware it blocks other threads waiting for the same key.
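The single-computation guarantee of `ConcurrentHashMap.computeIfAbsent` can be verified directly. In this sketch, fifty threads miss on the same key at the same moment, yet the loader runs exactly once:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: computeIfAbsent locks per key, so a loader for a hot key
// executes once even under a simultaneous stampede of cache misses.
public class SingleComputationDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
        AtomicInteger loaderRuns = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);

        Thread[] threads = new Thread[50];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                try { start.await(); } catch (InterruptedException e) { return; }
                cache.computeIfAbsent("hot-key", k -> {
                    loaderRuns.incrementAndGet(); // count loader invocations
                    return "expensive result for " + k;
                });
            });
            threads[i].start();
        }
        start.countDown(); // release all threads at once
        for (Thread t : threads) t.join();

        System.out.println(loaderRuns.get()); // 1
    }
}
```

The trade-off is visible in the same picture: the other forty-nine threads block until the winner finishes, which is exactly the behavior noted above.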
Cache invalidation is one of the two hard problems in computer science (the other being naming things). For our API, TTL-based expiration is the simplest sound strategy. For write-heavy systems, consider invalidating on writes: call `cache.remove()` in your `save()` and `delete()` methods.
The final piece is composing all layers together. Clean code demands that we use constructor injection — each component receives its dependencies explicitly rather than creating them internally. This makes the code testable, configurable, and transparent.
```java
public class Application {

    public static void main(String[] args) throws IOException {
        // Wire dependencies — composition root
        ProductRepository repository = new ProductRepository();
        ProductService service = new ProductService(repository);
        ApiServer server = new ApiServer(service);

        // Seed sample data
        seedProducts(repository, 10_000);

        // Start server
        server.start(8080);

        // Register shutdown hook for graceful cleanup
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            System.out.println("Shutting down gracefully...");
            server.stop();
        }));
    }

    private static void seedProducts(ProductRepository repo, int count) {
        var random = new Random(42); // Fixed seed for reproducibility
        var categories = List.of("electronics", "books", "clothing", "home", "sports");
        IntStream.range(0, count)
            .mapToObj(i -> new Product(
                UUID.randomUUID().toString(),
                "Product-" + i,
                categories.get(random.nextInt(categories.size())),
                BigDecimal.valueOf(random.nextDouble(5.0, 500.0)).setScale(2, RoundingMode.HALF_UP),
                random.nextInt(0, 1000),
                Instant.now().minus(Duration.ofDays(random.nextInt(365)))
            ))
            .forEach(repo::save);
    }
}
```
Notice how seedProducts uses a stream pipeline (IntStream.range → mapToObj → forEach) instead of an imperative loop. This is idiomatic modern Java — declarative, readable, and expressive.
The shutdown hook is a production necessity. Without it, in-flight requests get abruptly killed when the JVM terminates. A graceful shutdown stops accepting new requests, waits for current ones to complete, then releases resources.
Here's a checklist for production readiness that you should mentally walk through for any API you build:
- A `/api/health` endpoint returning server status for load balancers

This architecture gives you a clean separation where each layer can be tested independently: the repository with concurrent writes, the service with a mock repository, and the server with integration tests against the real stack.
Key takeaways:

- `ConcurrentHashMap` with atomic operations (`computeIfPresent`, `computeIfAbsent`) prevents race conditions without coarse-grained locking
- `sorted()` forces full materialization of the stream, making caching essential for performance

In a high-performance Java API, the lesson emphasizes starting with a clean, immutable domain model using Java records before applying any performance optimizations. It also stresses the importance of Separation of Concerns through distinct layers (Domain, Repository, Service, Controller).

**Exercise:** Explain why immutability in the domain layer is described as a "performance superpower" in the context of a concurrent API. In your answer, describe what specific concurrency problem immutable objects avoid, how this relates to the layered architecture presented in the lesson (particularly the Service layer where business logic and caching occur), and why the lesson argues that starting with clean design principles is more important than jumping straight to performance tricks. Use the `Product` record example to illustrate your reasoning.