Futures and Promises | Scala Programming Guide

What is a Future?
In Scala, a Future represents a value that may not be available yet but will be available at some point in the future—or might fail trying. Unlike blocking operations that freeze your thread until the result appears, Futures let you write non-blocking, asynchronous code that keeps your program responsive.
Think of a Future[T] as a container that will eventually hold one of:
- A value of type T (success)
- An exception (failure)
- Nothing yet (still computing)
This is fundamentally different from imperative programming where you block and wait. With Futures, you describe what should happen when the result arrives, and your thread moves on to other work.
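These three states are directly observable through a Future's value member, which is an Option[Try[T]]: None while still computing, Some(Success(v)) or Some(Failure(ex)) once done. A minimal sketch (the object name and timings are illustrative):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.Success

object FutureStates {
  def main(args: Array[String]): Unit = {
    // Still computing: value is None until the computation finishes
    val slow: Future[Int] = Future { Thread.sleep(200); 42 }
    println(slow.value) // usually None this early

    Await.ready(slow, 2.seconds)
    // Completed successfully: value is Some(Success(42))
    assert(slow.value == Some(Success(42)))

    // Completed with an exception: value is Some(Failure(ex))
    val failed: Future[Int] = Future { throw new RuntimeException("boom") }
    Await.ready(failed, 2.seconds)
    assert(failed.value.exists(_.isFailure))
  }
}
```

In practice you rarely poll value directly; the composition operators described below react to completion for you.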
Creating Futures
The simplest way to create a Future is using the Future constructor with an execution context. When you write Future { ... }, you're handing off a block of code to a thread pool for asynchronous execution. The block doesn't run on your current thread—it gets queued up and runs later on a worker thread. This is the fundamental mechanism that makes Futures non-blocking: your thread never waits. Instead, it submits the work and continues immediately. The work happens "in the background" on a different thread, and when it completes (or fails), you can attach code to react to that completion. This separation of concerns—execution from composition—is what makes asynchronous programming in Scala elegant.
import scala.concurrent._
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global // required to run Future { ... }; see the ExecutionContext section below
import scala.util.{Success, Failure}
// Example: A stock ticker aggregator that fetches prices from multiple services
// (This is a real-world scenario rather than textbook toy code)
case class StockPrice(symbol: String, price: Double, timestamp: Long)
// Basic Future creation
val priceFromService1: Future[StockPrice] = Future {
  // Simulating a network call to Service 1
  Thread.sleep(500) // In reality, this would be a real HTTP request
  StockPrice("AAPL", 150.25, System.currentTimeMillis())
}
// Another Future for a different service
val priceFromService2: Future[StockPrice] = Future {
  Thread.sleep(700)
  StockPrice("AAPL", 150.30, System.currentTimeMillis())
}
// The Future {} block executes asynchronously
// The thread that created these Futures doesn't wait for them
The key insight: when you write Future { ... }, you're not executing the block immediately on the current thread. Instead, that block gets submitted to an ExecutionContext, which is a thread pool that will run it later.
ExecutionContext — The Thread Pool Behind Futures
An ExecutionContext is the bridge between Futures and actual execution: it's a thread pool (or, more generally, a dispatcher) that assigns work to available threads and manages their lifecycle. When you write Future { ... }, that block doesn't execute on your current thread; it's submitted to an ExecutionContext, which queues it for execution on one of its managed threads. This matters because ExecutionContexts are finite resources: a thread pool has a maximum number of threads. If you create millions of Futures without care, you'll exhaust your thread pool, starve other work, or cause OutOfMemoryErrors. Furthermore, ExecutionContexts are not all equal: you use different pools for CPU-intensive work (bounded to the processor count) versus I/O-bound work (much larger, since threads blocked on I/O consume little CPU). Using the wrong pool for a workload can cause thread starvation, for example CPU-intensive work monopolizing an I/O pool's threads and preventing I/O work from progressing. Scala provides a default global ExecutionContext (convenient but risky for production), and you should create custom contexts for different workload types. ExecutionContext is how you control concurrency granularity: the threading strategy that determines throughput, latency, and resource usage.
Think of an ExecutionContext like the dispatcher at a taxi company. Taxis (threads) are a finite resource. When a call comes in (Future), the dispatcher queues it up and assigns it to the next available taxi. If you send thousands of calls simultaneously, the dispatcher can run out of taxis, and callers must wait. Similarly, if you send CPU-intensive work to a general-purpose pool designed for I/O, you'll starve other operations. Different operations need different pools: short bursts of computation fit on a CPU-bound pool (threads equal to processor count), while I/O operations can use many more threads (they block waiting for network/disk, so multiple threads keep the CPU busy). Choosing the right pool for the right workload is critical for scalability.
The ExecutionContext accomplishes several things simultaneously: it prevents thread explosion (creating a thread per Future would exhaust system resources), it allows proper resource isolation (CPU work doesn't block I/O work), it enables testing (inject a single-threaded context in tests), and it provides performance control (proper pool sizing directly affects throughput and latency). Without ExecutionContext, each Future would either create a new thread (wasteful) or fight for a tiny shared pool (starved). With it, you have fine-grained control over concurrency strategy.
Let's see how to use them:
// Implicit global ExecutionContext (not recommended for production)
import scala.concurrent.ExecutionContext.Implicits.global
// Better: create your own ExecutionContext for fine-grained control
import java.util.concurrent.{Executors, ForkJoinPool}
// For CPU-intensive work (stock price calculations)
val cpuPool = ExecutionContext.fromExecutor(
  new ForkJoinPool(Runtime.getRuntime.availableProcessors())
)
// For I/O-bound work (network calls)
val ioPool = ExecutionContext.fromExecutor(
  Executors.newFixedThreadPool(20)
)
// Now you can explicitly choose which pool to use
val priceFromService3: Future[StockPrice] = Future {
  Thread.sleep(600)
  StockPrice("GOOGL", 2800.50, System.currentTimeMillis())
}(ioPool) // Use the I/O thread pool
// This matters! Using the wrong pool can cause thread starvation
// - CPU-intensive work on an I/O pool hogs threads meant for I/O
// - I/O work on a tiny CPU pool causes blocking
Why ExecutionContext matters:
- Finite thread pool prevents thread explosion (creating thousands of threads crashes your app)
- Different pools for different workload types prevents interference
- Testability: you can inject a single-threaded context in tests
- Performance: proper pool sizing is crucial for throughput
Common pitfall: Blocking an ExecutionContext thread. If you call Thread.sleep() inside a Future running on a limited pool, you're wasting a thread while it sleeps. The pool can run out of threads, and other Futures starve. In production, avoid blocking operations entirely—if you must sleep, use scheduled operations instead. If you must wait for something, use async libraries that don't block threads.
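When a blocking call truly can't be avoided, the standard library offers scala.concurrent.blocking as a damage-limitation measure: it marks the enclosed section so that a cooperating ExecutionContext (such as the global ForkJoinPool-backed one) can compensate, typically by spinning up a temporary replacement thread. A small sketch (the helper name and sleep are illustrative stand-ins):

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object BlockingHint {
  // `blocking { ... }` tells a cooperating pool (like the global one) that
  // this thread is about to block, so it may spawn a temporary replacement
  // thread instead of letting the pool starve.
  def slowLookup(): Int = blocking {
    Thread.sleep(100) // stand-in for an unavoidable blocking call
    42
  }

  def main(args: Array[String]): Unit = {
    // Eight blocking tasks complete even on a small machine because the
    // pool is allowed to compensate for the marked blocking sections.
    val results = Future.sequence(List.fill(8)(Future(slowLookup())))
    assert(Await.result(results, 5.seconds).sum == 8 * 42)
  }
}
```

Note that blocking is a hint, not a cure: a fixed-size pool created with Executors.newFixedThreadPool ignores it, so prefer genuinely async libraries where possible.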
Callbacks: onComplete, foreach, and failed (and Why They're Limited)
Once you have a Future, you can attach callbacks to react when it completes. Callbacks seem like the obvious way to work with async results: just say "when it completes, run this code." For simple cases with a single callback, this works fine. But the convenience becomes a limitation when you need to chain operations together. Callbacks compose poorly: each new operation requires another nested callback, creating deeply indented, hard-to-follow code. Worse, error handling becomes fragmented: you must add error callbacks at each nesting level, and control flow becomes hard to trace. This is why functional composition (map, flatMap) is superior for most async work. (Note: the onSuccess and onFailure callbacks were deprecated in Scala 2.12 and removed in 2.13; use foreach and failed.foreach instead.)
import scala.util.{Success, Failure}
// onComplete: handles both success and failure
priceFromService1.onComplete {
  case Success(price) => println(s"Got price: ${price.price}")
  case Failure(ex) => println(s"Error fetching price: ${ex.getMessage}")
}
// foreach: only runs on success (ignores failures)
priceFromService2.foreach { price =>
  println(s"Service 2 price: ${price.price}")
}
// failed.foreach: only runs on failure
priceFromService3.failed.foreach { ex =>
  println(s"Service 3 failed: ${ex.getMessage}")
}
The Problem with Callbacks:
Callbacks seem convenient, but they lead to the infamous "callback hell":
// AVOID THIS PATTERN:
def aggregatePricesWithCallbacks(): Unit = {
  priceFromService1.onComplete {
    case Success(p1) =>
      priceFromService2.onComplete {
        case Success(p2) =>
          priceFromService3.onComplete {
            case Success(p3) =>
              // Now we have all three prices...
              val avgPrice = (p1.price + p2.price + p3.price) / 3
              println(s"Average: $avgPrice")
            case Failure(ex3) => println(s"Service 3 failed: ${ex3.getMessage}")
          }(ioPool)
        case Failure(ex2) => println(s"Service 2 failed: ${ex2.getMessage}")
      }(ioPool)
    case Failure(ex1) => println(s"Service 1 failed: ${ex1.getMessage}")
  }(ioPool)
}
// This is ugly, hard to read, and error handling is scattered everywhere
This is why Scala provides better composition mechanisms: map, flatMap, and for-comprehensions.
Composing Futures: map, flatMap, and for-comprehensions
Instead of callbacks, compose Futures using functional operations. This is where Futures shine: instead of nesting callbacks, you chain operations together like you would with collections. map transforms a successful result. flatMap chains Futures where one depends on the previous result. For-comprehensions provide syntactic sugar for complex chains. The key insight is deferred transformation with automatic error propagation: when you compose Futures with map/flatMap, the transformations don't run until the source Future completes. If any Future fails, the entire chain fails automatically, with no need to handle errors at each level. This makes async code read like sync code, but with non-blocking execution underneath.
Think of Future composition like an assembly line in a factory. Each stage (map, flatMap) processes the output of the previous stage. If any stage fails (throws an exception), the rest of the line is skipped: you don't need to check for errors at each stage, because the failure automatically propagates to the end. This is in stark contrast to callbacks, where you must manually check for errors and propagate them yourself. One caveat: Scala Futures are eager, not lazy. The source Future starts executing as soon as it's created, whether or not you ever block on it or attach an onComplete. What is deferred is only the transformation pipeline, which runs once the value it depends on becomes available.
Here's the deeper benefit: composition preserves non-blocking semantics. When you write future.map(f).flatMap(g).map(h), none of the transformations f, g, h run on the thread that composed them. They run on ExecutionContext threads when the source Future completes. Your main thread never blocks. It simply describes the transformation pipeline and moves on. This is what enables responsive systems: you queue up work and let background threads handle it.
// map: transform the value inside a successful Future
val price1Doubled: Future[Double] = priceFromService1.map { price =>
  // This block only runs if priceFromService1 succeeds
  price.price * 2
}
// flatMap: chain Futures together (when the next Future depends on the previous)
val enrichedPrice: Future[String] = priceFromService1.flatMap { price1 =>
  // Use the result from price1 to create another Future
  Future {
    Thread.sleep(200)
    s"${price1.symbol}: $${price1.price}"
  }(ioPool)
}
// for-comprehension: syntactic sugar for multiple flatMap/map operations
val aggregatedData: Future[Map[String, Double]] = for {
  p1 <- priceFromService1
  p2 <- priceFromService2
  p3 <- priceFromService3
} yield {
  // This block runs only if all three Futures succeed
  Map(
    "service1" -> p1.price,
    "service2" -> p2.price,
    "service3" -> p3.price
  )
}
// More readable than the callback version!
// Error handling is automatic: if ANY Future fails, the whole for-comprehension fails
aggregatedData.onComplete {
  case Success(data) => println(s"Aggregated prices: $data")
  case Failure(ex) => println(s"Aggregation failed: ${ex.getMessage}")
}
Key insight: Composition with map/flatMap is deferred and non-blocking:
- The transformations don't run until the source Future completes
- Your thread isn't blocked waiting
- Error handling cascades naturally
Why this matters: Consider a for-comprehension waiting on five network calls. Without composition, you'd write five nested callbacks with error handling at each level, and the code would be nearly unreadable. With composition, you write five lines in a for-comprehension, and errors automatically propagate. The async machinery is completely hidden. Additionally, the ExecutionContext threads running your Future blocks never block themselves (assuming your code doesn't call sleep or I/O). They stay active, processing other work while your transformation waits. This is how a small thread pool handles thousands of concurrent operations: threads never block; they always have work to do.
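One subtlety deserves emphasis: a for-comprehension desugars into flatMap calls, so Futures created inside it run sequentially, each starting only after the previous one completes. The example above starts all three Futures before the for-comprehension, which is what makes them parallel. A sketch of the difference (fetch and the timings are illustrative):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object ParallelVsSequential {
  def fetch(ms: Int): Future[Int] = Future { Thread.sleep(ms); ms }

  def main(args: Array[String]): Unit = {
    // Sequential: the second fetch is only created (and started) after the
    // first completes, so total time is roughly 200 + 200 ms.
    val sequential = for {
      a <- fetch(200)
      b <- fetch(200)
    } yield a + b

    // Parallel: both Futures are started before the for-comprehension,
    // which then merely combines their results (roughly 200 ms total).
    val fa = fetch(200)
    val fb = fetch(200)
    val parallel = for { a <- fa; b <- fb } yield a + b

    assert(Await.result(sequential, 2.seconds) == 400)
    assert(Await.result(parallel, 2.seconds) == 400)
  }
}
```

Both versions produce the same value; the difference is purely in wall-clock time, which is easy to miss in code review.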
Combining Futures: sequence, traverse, and firstCompletedOf
When you have multiple Futures and need to combine them, Scala provides factory methods for common patterns. These avoid manual loops that are easy to get wrong. sequence waits for all Futures to complete, failing if any fails. traverse is like sequence but also applies a transformation to each element—more efficient than map then sequence because it doesn't create intermediate collections. firstCompletedOf races multiple Futures and returns the first winner. These are essential for coordinating multiple async operations without manual callback nesting.
// Scenario: fetch prices from a list of services
val services = List(
  Future { Thread.sleep(300); StockPrice("AAPL", 150.0, System.currentTimeMillis()) }(ioPool),
  Future { Thread.sleep(400); StockPrice("AAPL", 150.1, System.currentTimeMillis()) }(ioPool),
  Future { Thread.sleep(200); StockPrice("AAPL", 150.2, System.currentTimeMillis()) }(ioPool)
)
// Future.sequence: convert List[Future[T]] into Future[List[T]]
// Waits for ALL Futures to complete
val allPrices: Future[List[StockPrice]] = Future.sequence(services)
allPrices.onComplete {
  case Success(prices) =>
    println(s"All prices fetched: ${prices.map(_.price)}")
  case Failure(ex) =>
    println(s"At least one service failed: ${ex.getMessage}")
}
// Future.traverse: like sequence, but also applies a function
// Convert List[T] into Future[List[U]] by applying T => Future[U] to each element
def fetchPriceForSymbol(symbol: String): Future[StockPrice] = Future {
  Thread.sleep(scala.util.Random.nextInt(500))
  StockPrice(symbol, scala.util.Random.nextDouble() * 200, System.currentTimeMillis())
}(ioPool)
val symbols = List("AAPL", "GOOGL", "MSFT", "AMZN")
val allSymbolPrices: Future[List[StockPrice]] = Future.traverse(symbols)(fetchPriceForSymbol)
allSymbolPrices.onComplete {
  case Success(prices) =>
    val avg = prices.map(_.price).sum / prices.length
    println(s"Average price across symbols: $$${avg}")
  case Failure(ex) => println(s"Failed to fetch symbols: ${ex.getMessage}")
}
// Future.firstCompletedOf: whichever Future finishes first wins
val fastestService: Future[StockPrice] = Future.firstCompletedOf(services)
fastestService.onComplete {
  case Success(price) =>
    println(s"Fastest service returned: ${price.price}")
  case Failure(ex) =>
    println(s"All services failed: ${ex.getMessage}")
}
When to use each:
- sequence: all results needed; any failure fails the whole thing
- traverse: transform while combining (more efficient than map then sequence)
- firstCompletedOf: racing multiple sources (first-one-wins semantics)
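One consequence of sequence's fail-fast behavior is worth knowing: a single failure discards every other result. If you'd rather collect successes and failures side by side, you can lift each Future into a Try before sequencing. A sketch of this common pattern (lenientSequence is a hypothetical helper name, not a standard library method):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Success, Try}

object LenientSequence {
  // Lift each Future into a Try so failures become ordinary values
  // instead of sinking the whole aggregation.
  def lenientSequence[T](fs: List[Future[T]]): Future[List[Try[T]]] =
    Future.sequence(fs.map(_.transform(Success(_))))

  def main(args: Array[String]): Unit = {
    val mixed = List(
      Future.successful(1),
      Future.failed[Int](new Exception("service down")),
      Future.successful(3)
    )
    val results = Await.result(lenientSequence(mixed), 2.seconds)
    // Two services succeeded, one failed, and we kept all three outcomes
    assert(results.count(_.isSuccess) == 2)
    assert(results.count(_.isFailure) == 1)
  }
}
```

This is a good fit for the stock ticker scenario: one flaky price service shouldn't prevent you from averaging the prices that did arrive.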
Error Recovery: recover, recoverWith, fallbackTo
Futures can fail, and you need graceful error handling. Unlike thrown exceptions (which are synchronous), Futures fail asynchronously. The three main recovery strategies are: recover, which returns a plain fallback value when a Future fails (good for default data); recoverWith, which runs another Future when the first fails (good for service fallback chains); and fallbackTo, which switches to an alternative, already-created Future and, if both fail, reports the original failure. All three preserve non-blocking semantics: they don't block or throw exceptions. Instead, they transform the failed Future into a successful one carrying the fallback data.
Think of recovery like the fallback systems in an airplane. If the primary system fails, a backup engages automatically; if the backup fails, another backup engages. This is what recovery chains do: each failure triggers the next option until one succeeds or all options are exhausted.
def fetchPriceWithFallback(symbol: String): Future[StockPrice] = {
  val primary = Future {
    Thread.sleep(100)
    if (scala.util.Random.nextBoolean()) {
      throw new Exception("Service 1 temporarily unavailable")
    }
    StockPrice(symbol, 150.0, System.currentTimeMillis())
  }(ioPool)
  // recover: return a default value when the Future fails
  primary.recover {
    case ex: Exception =>
      println(s"Service 1 failed, using cached price")
      StockPrice(symbol, 145.0, System.currentTimeMillis() - 60000)
  }
}

def fetchPriceWithRecoveryChain(symbol: String): Future[StockPrice] = {
  val service1 = Future {
    throw new Exception("Service 1 down")
  }(ioPool)
  // recoverWith: try another Future when the first fails
  service1.recoverWith {
    case _: Exception =>
      println("Service 1 failed, trying Service 2")
      Future {
        Thread.sleep(200)
        StockPrice(symbol, 150.5, System.currentTimeMillis())
      }(ioPool)
  }
}

def fetchPriceWithFallback2(symbol: String): Future[StockPrice] = {
  val service1 = Future {
    throw new Exception("Service 1 down")
  }(ioPool)
  val service2 = Future {
    Thread.sleep(100)
    StockPrice(symbol, 149.9, System.currentTimeMillis())
  }(ioPool)
  // fallbackTo: use another Future if the first fails
  service1.fallbackTo(service2)
}

// Combining recovery patterns
val robustPrice: Future[StockPrice] = fetchPriceWithRecoveryChain("AAPL")
  .map { price =>
    // Validate the price
    if (price.price <= 0) {
      throw new IllegalArgumentException("Invalid price")
    }
    price
  }
  .recover {
    case ex: IllegalArgumentException =>
      println(s"Price validation failed: ${ex.getMessage}")
      StockPrice("AAPL", 150.0, System.currentTimeMillis())
  }

robustPrice.onComplete {
  case Success(price) => println(s"Final price: ${price.price}")
  case Failure(ex) => println(s"All recovery attempts failed: ${ex.getMessage}")
}
Promises — Manually Completing a Future
A Promise lets you manually control when a Future completes. It's useful when you need to complete a Future from a callback-based API, from a thread, or from complex control flow. A Promise is the "writable" side of a Future: you create a Promise, hand its Future to consumers, and later complete it with a value or error. This decouples the creator of the Future from the producer of the value, which is crucial for bridging async APIs. Without Promises, you'd have no way to complete a Future from a callback-based library—you could only create Futures with the Future { ... } constructor, which requires the work to happen immediately.
Here's the deep reason Promises exist: Futures are immutable values that can be passed around, chained, and shared. But you need a way to set the Future's result from arbitrary code (callbacks, threads, wherever). A Promise is that mechanism. You can create a Promise, return its Future to the world, then complete it asynchronously from a callback. This allows async APIs with callbacks to be lifted into the Future/Promise ecosystem seamlessly. Without Promises, callback-based code would never integrate with Future composition.
Promises are one-time operations: you can only complete them once. The first call to success() or failure() completes the Promise. Subsequent calls throw IllegalStateException. This is intentional: Promises must be deterministic. If you could complete a Promise multiple times, the Future's value would be ambiguous—which completion matters? By restricting to one completion, Promises are clear and predictable.
import scala.concurrent.Promise
import scala.util.{Try, Success, Failure}

// Scenario: wrapping a callback-based HTTP library
// (callbackBasedHttpGet stands in for whatever callback API you're wrapping)
def fetchPriceFromCallbackAPI(symbol: String): Future[StockPrice] = {
  val promise = Promise[StockPrice]()
  // Simulating a callback-based API
  callbackBasedHttpGet(
    url = s"https://api.example.com/price/$symbol",
    onSuccess = { response =>
      val price = StockPrice(symbol, response.toDouble, System.currentTimeMillis())
      promise.success(price) // Complete the Future with a value
    },
    onError = { error =>
      promise.failure(new Exception(s"HTTP error: $error")) // Complete with failure
    }
  )
  promise.future // Return the Future immediately
}

// Another example: completing a Promise from a thread
def computePriceAsync(symbol: String): Future[StockPrice] = {
  val promise = Promise[StockPrice]()
  val thread = new Thread {
    override def run(): Unit = {
      try {
        Thread.sleep(500)
        val price = StockPrice(symbol, 150.25, System.currentTimeMillis())
        promise.success(price)
      } catch {
        case ex: Exception => promise.failure(ex)
      }
    }
  }
  thread.start()
  promise.future
}

// Using promise.complete with a Try: no manual pattern matching needed
def processAPIResponse(response: String): Future[StockPrice] = {
  val promise = Promise[StockPrice]()
  promise.complete(Try {
    val parts = response.split(":")
    StockPrice(parts(0), parts(1).toDouble, System.currentTimeMillis())
  })
  promise.future
}

// Promises are one-time operations: you can only complete them once
val p = Promise[Int]()
p.success(42)
p.success(43) // Throws IllegalStateException: Promise already completed
When to use Promises:
- Bridging callback-based APIs to Futures
- Completing a Future from a different thread
- Complex control flow where you need manual completion
Common pitfall: Creating Promises when composition would work. If you can use Future { ... } or map/flatMap, do that instead. Promises are for the rare cases where you must manually complete the Future. Overusing Promises defeats the purpose of Futures (elegantly composable async operations).
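A related detail: when several producers might race to complete the same Promise, use trySuccess, tryFailure, or tryComplete, which return false on an already-completed Promise instead of throwing IllegalStateException. This is essentially how firstCompletedOf can be built. A sketch (firstOf is a hypothetical helper, not a standard library method):

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object FirstWins {
  // Every completion races to the same Promise; tryComplete returns false
  // for all but the first, instead of throwing IllegalStateException.
  def firstOf[T](fs: List[Future[T]]): Future[T] = {
    val p = Promise[T]()
    fs.foreach(_.onComplete(p.tryComplete)) // only the first completion sticks
    p.future
  }

  def main(args: Array[String]): Unit = {
    val fast = Future { Thread.sleep(50); "fast" }
    val slow = Future { Thread.sleep(500); "slow" }
    assert(Await.result(firstOf(List(fast, slow)), 2.seconds) == "fast")
  }
}
```

For real code, prefer the built-in Future.firstCompletedOf; the sketch just shows what the try* methods are for.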
Blocking — The Escape Hatch and Its Dangers
Sometimes you must block and wait for a result. Await.result lets you do this, but it's dangerous. Blocking is a last resort, not a design strategy. The dangers are real: blocking exhausts ExecutionContext threads (threads sit idle instead of doing work), can cause deadlocks (circular dependencies where operations wait for each other), slows tests dramatically, and violates the async principle that made Futures attractive in the first place. Use Await only at the application entry point (where the main thread must wait to start), in integration tests (where you're testing across async boundaries), or in final aggregation steps of batch jobs. Never use it in request handlers, event loops, or hot paths.
import scala.concurrent.{Await, TimeoutException}
import scala.concurrent.duration._

val priceFuture: Future[StockPrice] = priceFromService1
// Block and wait for at most 5 seconds
try {
  val price = Await.result(priceFuture, 5.seconds)
  println(s"Got price: ${price.price}")
} catch {
  case ex: TimeoutException => println("Timeout waiting for price")
  case ex: Exception => println(s"Future failed: ${ex.getMessage}")
}
// Await.ready: block until completion, returns the same Future
val completedFuture = Await.ready(priceFuture, 5.seconds)
completedFuture.value.foreach {
  case Success(price) => println(s"Price: ${price.price}")
  case Failure(ex) => println(s"Error: ${ex.getMessage}")
}
Dangers of Blocking:
Thread starvation: If you block all threads in your pool, nothing else runs. Imagine a thread pool with 10 threads, all blocked waiting for results. New work arrives but has nowhere to run—it backs up indefinitely. This is especially bad in ExecutionContexts: if you block threads meant for async work, the entire system grinds to a halt.
Deadlocks: Complex interdependencies can cause circular waits. Future A waits for result of Future B on the same ExecutionContext thread. Future B needs the same thread to complete, but the thread is blocked waiting. They deadlock forever. This happens most often with deep composition chains on limited thread pools.
Performance: Blocking ties up threads instead of doing useful work. A blocked thread makes no progress, yet it still holds its stack memory and occupies a pool slot that could be running real work. That's wasteful in multithreaded systems, where the whole point is keeping threads busy.
Testing problems: Makes tests fragile and slow. A test that blocks for 5 seconds to wait for a result is 5 seconds longer than it needs to be. Tests that block are also flaky—timing issues cause intermittent failures.
When blocking is acceptable:
- Integration tests (where you're testing synchronously)
- Main entry point of an application (where you need to wait for startup)
- Final result aggregation in a batch job
- Never in high-throughput request handlers
Practical Example: Parallel API Aggregation
Let's build a real-world example: a stock ticker aggregator that fetches data from three services, handles failures gracefully, and combines results:
object StockTickerAggregator {
  import java.util.concurrent.Executors
  import scala.concurrent._
  import scala.concurrent.duration._
  import scala.util.{Success, Failure}

  // fromExecutorService (rather than fromExecutor) returns an
  // ExecutionContextExecutorService, so we can call ec.shutdown() at the end
  implicit val ec: ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(10))

  case class AggregatedPrice(
    symbol: String,
    prices: Map[String, Double],
    averagePrice: Double,
    timestamp: Long
  )

  // Service 1: expensive but usually accurate
  def fetchFromPremiumService(symbol: String): Future[Double] = Future {
    Thread.sleep(800)
    if (scala.util.Random.nextDouble() > 0.1) {
      150.25 + scala.util.Random.nextDouble() * 2
    } else {
      throw new Exception("Premium service timeout")
    }
  }

  // Service 2: fast but occasionally wrong
  def fetchFromFastService(symbol: String): Future[Double] = Future {
    Thread.sleep(200)
    if (scala.util.Random.nextDouble() > 0.05) {
      150.0 + scala.util.Random.nextDouble() * 3
    } else {
      throw new Exception("Fast service unavailable")
    }
  }

  // Service 3: unreliable but cheap
  def fetchFromCheapService(symbol: String): Future[Double] = Future {
    Thread.sleep(500)
    if (scala.util.Random.nextDouble() > 0.3) {
      150.5 + scala.util.Random.nextDouble() * 2.5
    } else {
      throw new Exception("Cheap service error")
    }
  }

  def aggregatePrice(symbol: String): Future[AggregatedPrice] = {
    // Fetch from all three services with fallback strategies
    val premium = fetchFromPremiumService(symbol)
      .recover { case _ => 150.0 } // Default fallback
    val fast = fetchFromFastService(symbol)
      .recover { case _ => 150.2 }
    val cheap = fetchFromCheapService(symbol)
      .recover { case _ => 149.8 }

    // Combine results once all three complete
    for {
      p <- premium
      f <- fast
      c <- cheap
    } yield {
      val prices = Map(
        "premium" -> p,
        "fast" -> f,
        "cheap" -> c
      )
      val avg = prices.values.sum / 3
      AggregatedPrice(symbol, prices, avg, System.currentTimeMillis())
    }
  }

  def main(args: Array[String]): Unit = {
    // Fetch for multiple symbols concurrently
    val symbols = List("AAPL", "GOOGL", "MSFT")
    val allAggregations: Future[List[AggregatedPrice]] =
      Future.traverse(symbols)(aggregatePrice)

    // Block only at the very end (main thread must wait)
    try {
      val results = Await.result(allAggregations, 5.seconds)
      results.foreach { agg =>
        println(
          s"${agg.symbol}: avg=$${agg.averagePrice}, " +
          s"premium=$${agg.prices("premium")}, " +
          s"fast=$${agg.prices("fast")}, " +
          s"cheap=$${agg.prices("cheap")}"
        )
      }
    } catch {
      case ex: TimeoutException =>
        println("Aggregation timed out after 5 seconds")
      case ex: Exception =>
        println(s"Aggregation failed: ${ex.getMessage}")
    }
    ec.shutdown()
  }
}
This example demonstrates:
- Multiple Futures running in parallel (no artificial sequencing)
- Graceful fallbacks for unreliable services
- Composition with for-comprehensions
- Combining results with Future.traverse
- Blocking only at the entry point