"Taming the Virtual Thread Torrent: Build a Flood Gate for Java Rate Limiting" with A N M Bazlur Rahman - JVM Weekly vol. 149
Today, we have another guest post!
I’m still on the Tech Week tour - this week checking in from Los Angeles. Highly recommend the event: “agents” want to jump out of your fridge, and everyone’s calendars are bursting… mine included. That’s why today’s post is a guest feature. Java Champion A. N. M. Bazlur Rahman takes over to walk you through virtual threads: practical, no smoke and mirrors, just a hands-on implementation of rate limiting.
Taming the Virtual Thread Torrent: Build a Flood Gate for Java Rate Limiting
In the era of Project Loom, virtual threads have revolutionized Java concurrency, promising lightweight, scalable parallelism without the overhead of traditional platform threads. But with great power comes great responsibility. Imagine unleashing thousands of virtual threads to handle incoming requests, only to flood a downstream API with requests, triggering rate limits and cascading failures.
We can solve this using a concept called a “flood gate”: a simple yet powerful semaphore-based limiter that tames the torrent.
This article walks through implementing a VirtualThreadLimiter, a bounded executor that enforces backpressure on virtual threads. We’ll build it step-by-step using the provided code, explore its mechanics, and integrate it into a real-world example. Whether you’re refactoring a microservice or experimenting with high-throughput systems, this flood gate will keep things flowing smoothly.
Why Rate Limiting Matters in the Virtual Thread World
Virtual threads, introduced in JDK 21 as a stable feature, are cheap to create and schedule, allowing millions to run without exhausting OS resources. They’re ideal for I/O-bound tasks like HTTP calls or database queries. However, without bounds, they can overwhelm external services. For example, a database might only accept 100 connections at a time, or another microservice might enforce its own rate limits, leading to 429 errors, throttled responses, or even bans.
Traditional solutions like ThreadPoolExecutor bound concurrency out of the box: we configure the pool with a limited number of threads, so it naturally caps concurrent processing without any extra work on our part. But it doesn’t play nice with Loom’s model, because pooling makes no sense with virtual threads. They are typically used for short-lived tasks, and we can create as many as we want.
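To see the bounding the text describes, here is a minimal, standalone sketch (class and method names are illustrative, not from the article’s repo): a fixed pool of three platform threads never runs more than three tasks at once, no matter how many we submit.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedPoolBound {

    // Tracks current and peak number of tasks running at the same time
    static final AtomicInteger running = new AtomicInteger();
    static final AtomicInteger maxObserved = new AtomicInteger();

    public static int run(int poolSize, int tasks) throws InterruptedException {
        running.set(0);
        maxObserved.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                int now = running.incrementAndGet();
                maxObserved.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(50); // simulated work
                } catch (InterruptedException ignored) {
                }
                running.decrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return maxObserved.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 20 tasks through a pool of 3: observed concurrency never exceeds 3
        System.out.println("peak concurrency: " + run(3, 20));
    }
}
```

The pool itself is the rate limiter here, which is exactly the property we lose when every task gets its own fresh virtual thread.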
And that’s where we can resort to semaphores, a classic concurrency primitive that acts as a permit-based gate. Paired with Executors.newVirtualThreadPerTaskExecutor(), we get unbounded creation with controlled execution: pure elegance.
Our VirtualThreadLimiter uses a semaphore to cap “in-flight” tasks (e.g., max 100 concurrent API calls). It supports three backpressure policies: BLOCK (wait it out), TIMEOUT (fail after a deadline), and SHED (reject immediately for fast failure). This flexibility makes it adaptable to batch jobs, interactive APIs, or real-time services.
Let’s build it.
The Code: Dissecting the VirtualThreadLimiter
Here’s the heart of our flood gate. I’ll annotate key sections as we go.
The Enum: BackpressurePolicy
First, the policies that dictate our gate’s behavior:
/**
 * Defines how the system should react when rate limits are reached.
 */
public enum BackpressurePolicy {
    /**
     * Block the caller until capacity becomes available.
     * Best for batch ingestion where latency is less critical.
     */
    BLOCK,

    /**
     * Attempt with a timeout, then fail fast.
     * Balanced approach for most APIs.
     */
    TIMEOUT,

    /**
     * Immediately reject if no capacity is available.
     * Best for low-latency APIs that need fast failure.
     */
    SHED
}
Think of these as traffic lights:
BLOCK: Red light—full stop until green.
TIMEOUT: Yellow—hurry up or bail.
SHED: No entry—turn around now.
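The three lights map directly onto the java.util.concurrent.Semaphore API that the limiter below builds on. Here is a small standalone sketch (not the article’s class; names are illustrative) of what each policy does when it meets a full gate:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PolicySketch {

    public static String tryUnderPolicy(String policy, Semaphore gate) throws InterruptedException {
        switch (policy) {
            case "BLOCK":
                gate.acquire();          // parks until a permit is released
                return "acquired";
            case "TIMEOUT":
                // waits up to 100 ms, then gives up
                return gate.tryAcquire(100, TimeUnit.MILLISECONDS) ? "acquired" : "timed out";
            case "SHED":
                // returns immediately, no waiting at all
                return gate.tryAcquire() ? "acquired" : "rejected";
            default:
                throw new IllegalArgumentException(policy);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Semaphore full = new Semaphore(0); // a gate with no free permits
        System.out.println(tryUnderPolicy("SHED", full));
        System.out.println(tryUnderPolicy("TIMEOUT", full));
        // BLOCK on a full gate would park forever here, so demo it with a free permit:
        System.out.println(tryUnderPolicy("BLOCK", new Semaphore(1)));
    }
}
```

Same primitive, three different reactions to contention; the VirtualThreadLimiter below just packages this choice behind the enum.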
The Limiter Class: Your Flood Gate in Action
Now, the main class. It wraps a virtual thread executor with a semaphore for concurrency control.
package ca.bazlur.throttlecraft.core;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.concurrent.*;

/**
 * Bounded executor that uses virtual threads with a semaphore to control concurrency.
 * This is the key component for enforcing backpressure on outbound calls.
 */
public final class VirtualThreadLimiter implements AutoCloseable {

    private static final Logger log = LoggerFactory.getLogger(VirtualThreadLimiter.class);
    private static final int DEFAULT_TIME_LIMIT_MS = 60_000;

    private final ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor();
    private final Semaphore permits;
    private final int maxInFlight;
    private final BackpressurePolicy policy;
    private final long timeoutMs;

    public VirtualThreadLimiter(int maxInFlight, BackpressurePolicy policy, long timeoutMs) {
        this.maxInFlight = maxInFlight;
        // Use fair=true if you want FIFO under contention
        this.permits = new Semaphore(maxInFlight /*, true */);
        this.policy = policy;
        this.timeoutMs = timeoutMs;
    }

    public VirtualThreadLimiter(int maxInFlight, BackpressurePolicy policy) {
        this(maxInFlight, policy, policy == BackpressurePolicy.SHED ? 0 : DEFAULT_TIME_LIMIT_MS);
    }

    public VirtualThreadLimiter(int maxInFlight) {
        this(maxInFlight, BackpressurePolicy.BLOCK, DEFAULT_TIME_LIMIT_MS);
    }

    public <T> CompletableFuture<T> submit(Callable<T> task) {
        // Fast path: shed without creating a virtual thread
        if (policy == BackpressurePolicy.SHED && !permits.tryAcquire()) {
            throw new RejectedExecutionException(
                "Backpressured: no permits (policy=SHED, max=" + maxInFlight + ")"
            );
        }
        return CompletableFuture.supplyAsync(() -> {
            boolean acquired = false;
            try {
                if (policy == BackpressurePolicy.BLOCK) {
                    permits.acquire();
                    acquired = true;
                } else if (policy == BackpressurePolicy.TIMEOUT) {
                    acquired = permits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
                } else {
                    // SHED: the permit was already acquired in the fast path above
                    acquired = true;
                }
                if (!acquired) {
                    throw new RejectedExecutionException(
                        "Backpressured: no permits (policy=" + policy + ", max=" + maxInFlight + ")"
                    );
                }
                log.trace("Executing task (available permits: {})", permits.availablePermits());
                return task.call();
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new CompletionException("Interrupted while acquiring permit", ie);
            } catch (Exception e) {
                // If task.call() threw InterruptedException, the flag was already preserved above.
                throw new CompletionException(e);
            } finally {
                if (acquired) permits.release();
            }
        }, exec);
    }

    public CompletableFuture<Void> submit(Runnable task) {
        return submit(() -> {
            task.run();
            return null;
        });
    }

    public int availablePermits() {
        return permits.availablePermits();
    }

    public int activeCount() {
        return maxInFlight - permits.availablePermits();
    }

    @Override
    public void close() {
        log.info("Shutting down VirtualThreadLimiter (active: {})", activeCount());
        exec.shutdown();
        try {
            if (!exec.awaitTermination(5, TimeUnit.SECONDS)) {
                exec.shutdownNow();
            }
        } catch (InterruptedException e) {
            exec.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}
Key Breakdown
Initialization:
newVirtualThreadPerTaskExecutor(): Creates a new virtual thread per task, giving unbounded scalability.
Semaphore(maxInFlight): The gate holds maxInFlight permits (e.g., 50). Default unfair (faster, but no FIFO); flip to true for fairness under contention.
submit(Callable<T>): The magic method.
Fast-Path for SHED: Before spinning up a virtual thread, tryAcquire() checks availability. If none, reject instantly—no thread waste.
Async Wrapper: CompletableFuture.supplyAsync(..., exec) offloads to a virtual thread, keeping the caller non-blocking.
Acquire Logic:
BLOCK: acquire()—blocks the virtual thread until a permit frees up.
TIMEOUT: tryAcquire(timeoutMs)—waits or fails.
SHED: Already handled; just execute.
Error Handling: Wraps exceptions in CompletionException; preserves interrupts for clean shutdowns.
Finally Release: Ensures permits return, preventing deadlocks.
Helpers:
activeCount(): Tracks in-flight tasks for monitoring.
close(): Graceful shutdown with timeout; essential for Spring Boot @PreDestroy or similar.
This design is Loom-native: Virtual threads manage blocking acquisitions without pinning carriers, allowing the JVM scheduler to yield efficiently.
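To watch the gate hold on actual virtual threads, here is a condensed, self-contained version of the same idea (assumes JDK 21+; names are illustrative, not the article’s class): a semaphore wrapped around newVirtualThreadPerTaskExecutor keeps observed concurrency at or below the permit count, even with far more tasks submitted.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class FloodGateDemo {

    public static int run(int maxInFlight, int tasks) throws Exception {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                exec.submit(() -> {
                    permits.acquire();               // BLOCK policy: wait for a permit
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(20);            // simulated I/O
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                    return null;
                });
            }
        } // try-with-resources waits for all tasks (ExecutorService is AutoCloseable since JDK 19)
        return peak.get();
    }

    public static void main(String[] args) throws Exception {
        // 40 tasks, 5 permits: peak stays at or below 5
        System.out.println("peak concurrency: " + run(5, 40));
    }
}
```

All 40 virtual threads are created up front; the semaphore, not the scheduler, is what throttles how many actually do work at once.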
Hands-On: Integrating into a Microservice
Let’s put it to work. Suppose you’re building a user service that fetches profiles from an external API (e.g., a mock JSONPlaceholder). Without limits, 1,000 concurrent requests could spam the endpoint.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class UserService {

    private final VirtualThreadLimiter limiter = new VirtualThreadLimiter(10, BackpressurePolicy.TIMEOUT, 5_000);
    private final HttpClient httpClient = HttpClient.newHttpClient();

    public CompletableFuture<List<User>> fetchBatchUsers(int[] userIds) {
        // Submit every fetch through the limiter, then collect the results once all complete
        List<CompletableFuture<User>> futures = Arrays.stream(userIds)
            .mapToObj(id -> limiter.submit(() -> fetchSingleUser(id)))
            .collect(Collectors.toList());
        return CompletableFuture.allOf(futures.toArray(CompletableFuture[]::new))
            .thenApply(v -> futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList()));
    }

    private User fetchSingleUser(int id) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://jsonplaceholder.typicode.com/users/" + id))
            .build();
        return httpClient.send(request, HttpResponse.BodyHandlers.ofString())
            .body()
            .lines()
            .map(line -> new User(id, line)) // Placeholder parsing; use a real JSON mapper here
            .findFirst()
            .orElseThrow();
    }

    // Cleanup
    public void shutdown() {
        limiter.close();
    }
}
Conclusion
With VirtualThreadLimiter, virtual threads keep their throughput benefits but gain discipline. This semaphore-based flood gate turns potential overload into graceful backpressure, letting your apps scale without the drama.
Code licensed under Apache 2.0. Full repo: https://github.com/rokon12/throttle-craft
BTW: I had early access to Bazlur’s new book Modern Concurrency in Java: Virtual Threads, Structured Concurrency, and Beyond - and it’s excellent. It’s the first truly cohesive, in-depth treatment of Virtual Threads I’ve seen in book form. Instead of stopping at API snippets or hand-picked benchmarks, the book builds a clear narrative: Loom’s operating principles, managing virtual threads in practice, but also structured concurrency and scoped values. It reads like a complete “Project Loom in a nutshell.”
What stood out most is the pragmatic take on virtual threads, with early real-world use already peeking through. Structured concurrency and scoped values are still evolving, but this is exactly the guidance developers need to form the right mental model now. Also, the chapter on reactive programming is a nice bridge for today’s codebases (a Kotlin coroutines comparison would also be a nice addition, but the Java-first focus is fair).
Tiny nit: the title “Modern Concurrency” is broad in Java lore; this book is really about the modern-modern era - Loom and virtual threads. Still, readers will get it. If you’re shipping on Java 21+, this belongs on your desk.
BTW, early next week I’ll be at JDD in Kraków with not one but two sessions. First, I’ll talk about Java performance for everyday developers. Then I’ll join a panel with Jarek Pałka and Andrzej Grzesik to discuss the “AIpocalypse” in Java.
I’m honored and can’t wait 🤩