Performance Tuning and Optimization | Scala Programming Guide


JVM Performance Fundamentals
JIT Compilation and Warm-up
The JVM's Just-In-Time compiler is what makes Java (and Scala) fast in production, but it requires warm-up time to achieve peak performance. Initially, the JVM interprets bytecode—which is slow. As code is executed repeatedly (typically after 10,000 calls), frequently-executed methods become "hot" and the JIT compiler compiles them to native machine code, applying sophisticated optimizations. This has huge implications: benchmark results from cold starts are meaningless, performance improves over time as the application runs, and production workloads that reach steady state are far faster than microbenchmarks might suggest. For latency-sensitive applications, warm-up can be critical: requests that arrive after the application has been running for hours will be faster than those that arrive right after startup. Understanding JIT is essential for writing performant code and interpreting performance measurements correctly.
// JIT needs warm-up time to optimize
// First calls use interpreter (slow)
// After ~10,000 calls, methods become "hot" and JIT compiles them
def fibonacci(n: Int): Long =
  if (n <= 1) n else fibonacci(n - 1) + fibonacci(n - 2)
// Warm-up phase
(1 to 10000).foreach(_ => fibonacci(30)) // Slow initially
// Now JIT has optimized this
val optimized = fibonacci(30) // Much faster
// Implication: Always warm up before benchmarking!
Garbage Collection
Garbage collection is both a blessing and a challenge: it frees you from manual memory management but introduces pauses that can destroy latency-sensitive applications. Understanding JVM garbage collection helps you minimize these pauses and design applications that play well with the GC. The JVM heap is divided into generations: young generation (where most objects are created and die quickly) and old generation (where long-lived objects reside). Young GC is fast; full GC is slow. By minimizing object allocation and reducing the number of objects that survive the young generation, you can dramatically reduce GC pause times. For high-frequency trading, real-time systems, or low-latency services, understanding and tuning GC becomes critical. This section shows how to measure GC behavior, understand what's happening, and design code that cooperates with the garbage collector rather than fighting it.
// Monitor GC with JVM flags:
// -XX:+PrintGCDetails -XX:+PrintGCDateStamps
// -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC
// Young generation (fast): Most objects die here
// Old generation (slow): Long-lived objects
// Creating too many objects triggers Young GC
def allocateHeavily(): Unit = {
  val lists = (1 to 1000000).map { i =>
    List.range(1, 100) // Each iteration allocates a new 99-element list
  }
  // Young GC pauses here as temporary lists fill Eden
}
// Better: hoist the allocation out of the loop
def allocateLightly(): Unit = {
  val list = List.range(1, 100)
  (1 to 1000000).foreach { _ =>
    // The source list is shared; map still allocates its result,
    // but the repeated List.range construction is gone
    list.map(_ * 2)
  }
}
// Minimize allocation in latency-critical code paths
import java.io.InputStream

class HighThroughputProcessor {
  private val buffer = Array.ofDim[Byte](8192) // Reused buffer

  def process(stream: InputStream): Unit = {
    var bytesRead = 0
    while ({
      bytesRead = stream.read(buffer)
      bytesRead != -1 // read returns -1 at end of stream
    }) {
      // Process buffer without allocating new arrays
      processBuffer(buffer, bytesRead)
    }
  }

  private def processBuffer(data: Array[Byte], length: Int): Unit = {
    // ... process without allocating
  }
}
Benchmarking with JMH
Java Microbenchmark Harness provides accurate performance measurement. Understanding JMH is essential because naive benchmarking leads to completely misleading results: HotSpot optimizations, garbage collection, and JVM warmup mean that code timed once at the start of a program runs vastly differently than code timed after millions of executions.
JMH handles the hard problems: ensuring adequate warmup so HotSpot optimizations kick in, running multiple forks to reduce interference between benchmarks, controlling garbage collection, preventing dead code elimination, and providing statistical significance testing. The annotations are straightforward—@Benchmark marks a method as a benchmark, @State controls object lifecycle, @Param lets you test multiple configurations—but understanding what they control is crucial for valid measurements.
The profilers integrated with JMH let you see why a benchmark behaves the way it does: CPU time vs. wall-clock time tells you about GC pressure, allocation flamegraphs show unexpected object creation, lock contention profiles reveal synchronization bottlenecks. Running jmh:run in SBT is convenient, but for detailed analysis you'll want to export results and analyze them with tools like JMH Visualizer. Remember that benchmarks measure specific scenarios—always verify that your benchmark matches your real-world workload.
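As a quick reference, JMH's built-in profilers and its result-export flags can be passed straight through sbt-jmh; the benchmark name pattern below is illustrative:

```shell
# Built-in JMH profilers, passed through sbt-jmh
sbt "jmh:run -prof gc .*StringProcessing.*"     # GC and allocation-rate statistics
sbt "jmh:run -prof stack .*StringProcessing.*"  # simple stack-sampling profile

# Export results as JSON for analysis in JMH Visualizer
sbt "jmh:run -rf json -rff results.json .*StringProcessing.*"
```

`-rf` selects the result format and `-rff` the output file; the JSON output loads directly into JMH Visualizer.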
// project/plugins.sbt
addSbtPlugin("pl.project13.scala" % "sbt-jmh" % "0.4.7")
// build.sbt
enablePlugins(JmhPlugin)
// src/jmh/scala/benchmarks/StringProcessingBenchmark.scala
package benchmarks
import org.openjdk.jmh.annotations._
import org.openjdk.jmh.infra.Blackhole
import java.util.concurrent.TimeUnit
// Benchmark class
@State(Scope.Benchmark) // State instance shared across benchmark threads
// Warm-up: 5 iterations of 10 seconds each
@Warmup(iterations = 5, time = 10, timeUnit = TimeUnit.SECONDS)
// Measurement: 5 iterations of 10 seconds each
@Measurement(iterations = 5, time = 10, timeUnit = TimeUnit.SECONDS)
// Run 4 separate JVM forks to reduce cross-benchmark interference
@Fork(value = 4)
class StringProcessingBenchmark {

  @Benchmark
  @BenchmarkMode(Array(Mode.AverageTime))
  @OutputTimeUnit(TimeUnit.MICROSECONDS)
  def stringConcatenation(bh: Blackhole): Unit = {
    var result = ""
    (1 to 100).foreach { i =>
      result += i.toString
    }
    bh.consume(result) // Blackhole prevents dead code elimination
  }

  @Benchmark
  @BenchmarkMode(Array(Mode.AverageTime))
  @OutputTimeUnit(TimeUnit.MICROSECONDS)
  def stringBuilderConcatenation(bh: Blackhole): Unit = {
    val sb = new StringBuilder()
    (1 to 100).foreach { i =>
      sb.append(i.toString)
    }
    bh.consume(sb.toString())
  }

  @Benchmark
  @BenchmarkMode(Array(Mode.AverageTime))
  @OutputTimeUnit(TimeUnit.MICROSECONDS)
  def listAlloc(bh: Blackhole): Unit = {
    val list = (1 to 1000).toList
    bh.consume(list)
  }

  @Benchmark
  @BenchmarkMode(Array(Mode.AverageTime))
  @OutputTimeUnit(TimeUnit.MICROSECONDS)
  def vectorAlloc(bh: Blackhole): Unit = {
    val vector = (1 to 1000).toVector
    bh.consume(vector)
  }

  // Parameter-based benchmarks
  @Param(Array("100", "1000", "10000"))
  var size: Int = _

  @Benchmark
  def mapOperation(bh: Blackhole): Unit = {
    val result = (1 to size).map(_ * 2).sum
    bh.consume(result)
  }
}
// Run with: sbt "jmh:run"
// Result: stringConcatenation is ~100x slower than StringBuilder!
// Average time for stringConcatenation: ~2.5 microseconds
// Average time for stringBuilderConcatenation: ~0.025 microseconds
Value Classes to Avoid Boxing
Value classes provide zero-cost wrappers:
// Problem: wrapping domain values in ordinary classes costs an allocation
// (and generic containers like List[Int] box primitives as java.lang.Integer)
def processNumbers(nums: List[Int]): Int = {
  nums.map(_ * 2).sum // Each Int is boxed inside the List
}
// Solution for wrapper types: a value class
class UserId(val value: String) extends AnyVal {
  // The wrapper is compiled away - no object allocation at runtime!
  def formatted: String = s"USR-${value.toUpperCase}"
}
val id = new UserId("abc123") // No wrapper allocation!
id.formatted // Direct method call, no indirection
// Value class benefits and limits:
// - No memory overhead (compiles down to the underlying String/primitive)
// - No boxing/unboxing in most call sites
// - Boxes again when used as a generic type argument or stored in collections
// - Only a single val parameter; cannot be subclassed
// Example: High-performance domain types
class OrderId(val value: String) extends AnyVal
class CustomerId(val value: String) extends AnyVal
class Money(val cents: Long) extends AnyVal {
  def +(other: Money): Money = new Money(cents + other.cents)
  def toDecimal: BigDecimal = BigDecimal(cents) / 100
}
// These are nearly free abstractions - no wrapper object at runtime
val order1 = new OrderId("ORD-001")
val order2 = new OrderId("ORD-002")
val amount = new Money(9999) // $99.99 in cents
// Cannot accidentally mix types - compile-time safety
// amount + order1 // Compile error!
Specialization for Generics
The @specialized annotation generates specialized versions for primitive types:
// Without specialization: boxing overhead
class Container[T](value: T) {
  def getValue: T = value
}
val intContainer = new Container(42)
val x = intContainer.getValue // Boxing: Int -> Integer -> Int
// With specialization: separate bytecode is generated per primitive.
// Note: @specialized annotates the type parameter, not the class itself.
class FastContainer[@specialized T](value: T) {
  def getValue: T = value
}
val intContainer2 = new FastContainer(42)
val y = intContainer2.getValue // No boxing!
// @specialized works on method type parameters too
class Processor {
  def process[@specialized T](items: Array[T]): Unit = {
    var i = 0
    while (i < items.length) {
      println(items(i))
      i += 1
    }
  }
}
// Note: @specialized generates separate bytecode for:
// - Int, Long, Double, Float, Boolean, Byte, Char, Short, Unit
// This increases compiled code size, so use judiciously
Tail Recursion for Stack Safety
@tailrec verifies that recursion is optimized:
import scala.annotation.tailrec

// NOT tail recursive - each element adds a stack frame
def sum(nums: List[Int]): Long = nums match {
  case Nil => 0
  case head :: tail => head + sum(tail) // Addition happens AFTER the recursive call
}
sum((1 to 1000000).toList) // StackOverflowError on a large enough list!
// Tail recursive - the compiler rewrites it into a loop
@tailrec
def sumTailRecursive(nums: List[Int], accumulator: Long = 0): Long = {
  nums match {
    case Nil => accumulator
    case head :: tail => sumTailRecursive(tail, accumulator + head)
  }
}
sumTailRecursive((1 to 1000000).toList) // No stack overflow!
// @tailrec catches accidental non-tail-recursion at compile time
@tailrec
def badRecursion(n: Int): Int = {
  if (n <= 0) 0
  else badRecursion(n - 1) + 1 // ERROR: could not optimize @tailrec annotated method
}
// Practical example: deep tree traversal
sealed trait JsonValue
case class JsonObject(pairs: List[(String, JsonValue)]) extends JsonValue
case class JsonArray(items: List[JsonValue]) extends JsonValue
case class JsonString(value: String) extends JsonValue
case class JsonNumber(value: Double) extends JsonValue
case object JsonNull extends JsonValue
// Find all string values in a JSON structure.
// Note: this traversal is NOT tail recursive (the recursive calls happen
// inside foldLeft), so the compiler would reject @tailrec here.
// For very deep structures, use an explicit stack instead.
def findStrings(
  node: JsonValue,
  accumulator: List[String] = List()
): List[String] = {
  node match {
    case JsonString(s) =>
      s :: accumulator
    case JsonObject(pairs) =>
      pairs.map(_._2).foldLeft(accumulator) { (acc, value) =>
        findStrings(value, acc)
      }
    case JsonArray(items) =>
      items.foldLeft(accumulator) { (acc, value) =>
        findStrings(value, acc)
      }
    case _ => accumulator
  }
}
Collection Performance
Different collections have different performance characteristics:
// Scenario 1: Frequent random access
val array = Array(1, 2, 3, 4, 5) // O(1) access - BEST
val list = List(1, 2, 3, 4, 5) // O(n) access - BAD
val vector = Vector(1, 2, 3, 4, 5) // O(log n) access - OK
// If you need random access, use Array or Vector
def sum(numbers: Vector[Int]): Int = {
  var total = 0
  var i = 0
  while (i < numbers.length) {
    total += numbers(i) // Fast random access
    i += 1
  }
  total
}
// Scenario 2: Frequent prepending
val prepended = 0 :: list // O(1) for List
val prepended2 = 0 +: vector // O(n) for Vector - SLOW
// Use List for prepending
// Scenario 3: Iteration only
val iterated = array.map(_ * 2) // O(n) - fine
val iterated2 = list.map(_ * 2) // O(n) - fine
val iterated3 = vector.map(_ * 2) // O(n) - fine
// All are equivalent for iteration
// Scenario 4: Building collection
val builder1 = scala.collection.mutable.ListBuffer[Int]() // Efficient for building a List
builder1 += 1
builder1 += 2
val result1 = builder1.toList
val builder2 = scala.collection.mutable.ArrayBuffer[Int]() // Efficient for building an Array
builder2 += 1
builder2 += 2
val result2 = builder2.toArray
// Scenario 5: Large immutable collection creation
// Use Vector for structural sharing with reasonable performance
// Use List only if you primarily access head
// Use Array only if you need mutable access or zero-copy
// Rough orders of magnitude (vary by JVM, collection size, and hardware):
// Access: Array ~10ns, Vector ~200ns, List (middle element) ~5000ns
// Append: Vector ~20ns, List O(n) copy, Array n/a (fixed size)
// Prepend: List ~10ns, Vector ~500ns
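The access-cost differences above can be seen with a rough, unscientific timing sketch; for real numbers use JMH as described earlier, since this ignores warm-up and GC effects:

```scala
// Rough timing sketch comparing indexed access (illustrative only - use JMH for real numbers)
object AccessTiming {
  def time[A](label: String)(body: => A): A = {
    val start = System.nanoTime()
    val result = body
    println(f"$label%-8s ${(System.nanoTime() - start) / 1e6}%.2f ms")
    result
  }

  def main(args: Array[String]): Unit = {
    val n = 10000
    val arr = (0 until n).toArray
    val vec = (0 until n).toVector
    val lst = (0 until n).toList

    // Sum by index: O(1) per access for Array, O(log n) for Vector, O(i) for List
    val a = time("Array")  { var s = 0L; var i = 0; while (i < n) { s += arr(i); i += 1 }; s }
    val v = time("Vector") { var s = 0L; var i = 0; while (i < n) { s += vec(i); i += 1 }; s }
    val l = time("List")   { var s = 0L; var i = 0; while (i < n) { s += lst(i); i += 1 }; s }
    assert(a == v && v == l) // All compute the same sum; only the cost differs
  }
}
```

Expect the List run to be dramatically slower: indexed access on a linked list walks from the head every time.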
Profiling Tools
Java Flight Recorder (JFR)
# Start JVM with JFR enabled
# (-XX:+UnlockCommercialFeatures is only needed on Oracle JDK 8;
#  JFR is free and built in from JDK 11 onward)
java -XX:StartFlightRecording=duration=60s,filename=recording.jfr \
  -jar myapp.jar
# Analyze recording
jmc # Open Java Mission Control GUI with recording
# Command-line analysis
jcmd <pid> JFR.dump filename=recording.jfr
jfr print recording.jfr
async-profiler for CPU Profiling
# Install: https://github.com/jvm-profiling-tools/async-profiler
# Compile Scala with -g flag for debug info
# Profile CPU usage (60 seconds)
./profiler.sh -d 60 -e cpu -f flamegraph.html <pid>
# Profile allocation
./profiler.sh -d 60 -e alloc -f flamegraph.html <pid>
# Profile lock contention
./profiler.sh -d 60 -e lock -f flamegraph.html <pid>
# Generate a JFR-format recording for further analysis
./profiler.sh -d 60 -e cpu -o jfr -f profile.jfr <pid>
jfr print profile.jfr
Common Performance Pitfalls
1. String Concatenation in Loops
// SLOW: O(n²) due to new String creation per iteration
def slowConcat(words: List[String]): String = {
  var result = ""
  words.foreach { word =>
    result += word + ", "
  }
  result
}
// FAST: O(n) with a single growable buffer
def fastConcat(words: List[String]): String = {
  val sb = new StringBuilder
  words.foreach { word =>
    sb.append(word).append(", ")
  }
  sb.toString()
}
// Performance: slowConcat on 10000 items takes ~500ms
// fastConcat on 10000 items takes ~1ms
2. Excessive List Operations
// SLOW: Multiple traversals
val result = list
  .filter(_ > 0)
  .map(_ * 2)
  .filter(_ < 100)
  .map(_ + 1)
// Each operation traverses the list and allocates an intermediate list

// FAST: Single traversal
val result2 = list.collect {
  case x if x > 0 && x * 2 < 100 => x * 2 + 1
}

// Or use an iterator (or .view) to fuse the steps lazily
val result3 = list.iterator
  .filter(_ > 0)
  .map(_ * 2)
  .filter(_ < 100)
  .map(_ + 1)
  .toList
3. Boxing/Unboxing Overhead
// SLOW: List[Int] boxes each integer
val boxed: List[Int] = (1 to 1000000).toList
val boxedSum = boxed.sum // Unboxes each element

// FAST: Array[Int] is a primitive int[] on the JVM
val numbers: Array[Int] = (1 to 1000000).toArray
var sum = 0
var i = 0
while (i < numbers.length) {
  sum += numbers(i)
  i += 1
}
// Rough performance: List.sum ~50ms vs Array loop ~2ms (varies by JVM)
4. Repeated Collection Conversions
// SLOW: Multiple conversions
def process(list: List[Int]): Set[Int] = {
  val array = list.toArray // Unnecessary intermediate array
  array.toSet
}
// FAST: Direct conversion
def process(list: List[Int]): Set[Int] = {
  list.toSet
}
// SLOW: Converting back and forth
val list = mySet.toList
val filtered = list.filter(_ > 10).toSet
// FAST: Stay in one type
val filtered = mySet.filter(_ > 10)
5. Unnecessary Option/Either Wrapping
// SLOW-ish: each hit allocates a Some (None is a shared singleton)
def find(items: List[Int]): Option[Int] = {
  items.find(_ > 0)
}
// FASTER: Direct return in hot loops
def find(items: List[Int]): Int = {
  for (item <- items) {
    if (item > 0) return item
  }
  -1 // sentinel value
}
// Note: Modern Scala with primitive specialization can make this moot,
// but the general principle of minimizing allocations remains.
6. Lock Contention
// SLOW: Holding lock during I/O
class SlowService {
  private val lock = new Object()
  private var cache = Map[String, String]()

  def lookup(key: String): String = lock.synchronized {
    if (cache.contains(key)) {
      cache(key)
    } else {
      val result = expensiveIoOperation() // Long I/O while holding the lock!
      cache = cache + (key -> result)
      result
    }
  }
}
// FAST: Release lock before I/O
class FastService {
  private val lock = new Object()
  private var cache = Map[String, String]()

  def lookup(key: String): String = {
    // Check cache while holding the lock only briefly
    val cached = lock.synchronized { cache.get(key) }
    cached match {
      case Some(value) => value
      case None =>
        val result = expensiveIoOperation() // No lock held during I/O
        lock.synchronized {
          cache = cache + (key -> result) // Trade-off: concurrent misses may compute twice
        }
        result
    }
  }
}
7. Creating Objects in Hot Paths
// SLOW: New parser object per call
def processData(data: String): Result = {
  val parser = new JsonParser() // Allocates on every call
  parser.parse(data)
}
// FAST: Reuse the object
class Processor {
  private val parser = new JsonParser()
  def processData(data: String): Result = {
    parser.parse(data) // Reuses the same parser
  }
}
// SLOWER: Re-evaluating loop-invariant work inside the closure
// (computeMultiplier/expensiveComputation are placeholder functions)
(1 to 1000000).map { i =>
  val multiplier = computeMultiplier() // Re-evaluated on every iteration
  expensiveComputation(i) * multiplier
}
// FAST: Hoist loop-invariant work out of the closure
val multiplier = computeMultiplier() // Evaluated once
(1 to 1000000).map { i =>
  expensiveComputation(i) * multiplier // Closure captures the precomputed value
}
APPENDIX A: Scala 2 vs Scala 3 Migration Cheatsheet
Scala 3 (Dotty) modernized the language with significant improvements. This appendix provides a side-by-side migration guide.
Syntax Changes
| Feature | Scala 2 | Scala 3 | Notes |
|---|---|---|---|
| Implicit parameters | def foo(implicit x: Int) | def foo(using x: Int) | Clearer intent, no ambiguity |
| Implicit conversions | implicit def strToInt(s: String): Int = s.toInt | given Conversion[String, Int] = _.toInt | More explicit, controlled |
| Extension methods | implicit class StringOps(s: String) | extension (s: String) def foo: String = ... | Built-in syntax, cleaner |
| Context bounds | def foo[T: Ordering] | def foo[T: Ordering] (same, but powered by given) | Same syntax, new semantics |
| Union types | Manual sealed trait hierarchy | Int \| String \| Boolean | Direct type syntax |
| Intersection types | Manual trait mixing | Printable & Serializable | Direct type syntax |
| Tuple syntax | (1, "a", true) | (1, "a", true) (same, better inference) | More consistent |
| Named arguments | foo(x = 1, y = 2) | foo(x = 1, y = 2) (same) | Works with more cases |
| Case classes | case class User(name: String, age: Int) | case class User(name: String, age: Int) (same) | Can now also be enums |
| Pattern matching | x match { case _ => ... } | Same syntax | Alternatives with \| unchanged |
| Match expressions | val x = y match { ... } | Same, more composable with GADTs | Type-level benefits |
| For comprehensions | for (x <- xs) yield f(x) | Same | Works with more types |
| Indentation-based syntax | Braces required | Optional braces | Pythonic alternative |
| Trailing commas | Only in edge cases | Generally allowed | More flexible formatting |
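Several of the rows above can be seen together in one short Scala 3 sketch; the names `describe`, `audit`, and `demo` are illustrative:

```scala
// Scala 3 sketch: union types, intersection types, optional braces

trait Printable:
  def print(): Unit

trait Loggable:
  def log(): Unit

// Union type: the argument is an Int OR a String
def describe(x: Int | String): String = x match
  case i: Int    => s"number $i"
  case s: String => s"text $s"

// Intersection type: the argument must be BOTH Printable and Loggable
def audit(x: Printable & Loggable): Unit =
  x.print()
  x.log()

@main def demo(): Unit =
  println(describe(42))      // number 42
  println(describe("hello")) // text hello
```

Note the brace-free bodies: in Scala 3 indentation delimits blocks, so `match` cases and method bodies need no `{}`.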
Implicit to Given Migration
The most significant change - implicits are now "givens":
// Scala 2: Old implicit
implicit val intOrdering: Ordering[Int] = Ordering.Int
implicit def listOrdering[T: Ordering]: Ordering[List[T]] =
  Ordering.by(_.headOption) // simplistic: orders lists by their first element
def sort[T](xs: List[T])(implicit ord: Ordering[T]): List[T] = {
xs.sorted(ord)
}
val sorted = sort(List(3, 1, 2)) // Implicit passed automatically
// Scala 3: New given syntax
given Ordering[Int] = Ordering.Int
given [T: Ordering]: Ordering[List[T]] with {
  def compare(x: List[T], y: List[T]): Int =
    summon[Ordering[Option[T]]].compare(x.headOption, y.headOption) // simplistic: heads only
}
def sort[T](xs: List[T])(using ord: Ordering[T]): List[T] = {
xs.sorted(ord)
}
val sorted = sort(List(3, 1, 2)) // Using passed automatically
Migration strategy:
- Replace `implicit` with `given` at definition sites
- Replace `implicit x: T` with `using x: T` in parameter lists
- Replace `implicitly[T]` with `summon[T]` for explicit summoning
- Replace `implicit class` with `extension`
// Scala 2
implicit class RichString(s: String) {
def shout: String = s.toUpperCase + "!"
}
"hello".shout // "HELLO!"
// Scala 3
extension (s: String)
  def shout: String = s.toUpperCase + "!"
"hello".shout // "HELLO!"
New Features in Scala 3
Enums (Algebraic Data Types)
Scala 3 introduces built-in enum syntax that makes defining algebraic data types far more concise than the sealed trait pattern. Enums capture the intent directly: you're defining a type with a fixed set of possible values. The compiler understands enum structure and can provide better error messages and exhaustiveness checking. Enums are particularly powerful for domain modeling where you want to ensure all cases are explicitly handled. This improves code readability and reduces boilerplate compared to Scala 2's sealed trait approach.
// Scala 2: Verbose sealed trait pattern
sealed trait Color
object Color {
case object Red extends Color
case object Green extends Color
case object Blue extends Color
case class Custom(hex: String) extends Color
}
// Scala 3: Enum syntax
enum Color:
  case Red
  case Green
  case Blue
  case Custom(hex: String)
// Enums with methods
enum Bool:
  case True
  case False
  def negate: Bool = this match
    case True => False
    case False => True
// Matching
val color: Color = Color.Custom("#ff8800")
color match
  case Color.Red => "red"
  case Color.Green => "green"
  case Color.Blue => "blue"
  case Color.Custom(hex) => hex
Opaque Types
Opaque types let you create distinct types that compile away to their underlying type at runtime, giving you type safety without performance overhead. This is perfect for creating newtype-like distinctions: you want UserId to be distinct from Int at compile time, but at runtime it's just an Int with no wrapper object. Unlike case class wrappers, opaque types have zero runtime cost. They're ideal for strongly-typed domain modeling, preventing errors like passing a UserId where an OrderId is expected, while maintaining the performance characteristics of the underlying type.
// Scala 3
opaque type UserId = String

object UserId:
  def apply(value: String): UserId = value

def getUserName(id: UserId): String = ???
// At compile time: UserId is a distinct type
val id1: UserId = UserId("user-123")
val id2: UserId = UserId("user-456")
// id1 + id2 // ERROR: + is not a member of UserId
// At runtime: UserId is erased to String (zero-cost abstraction!)
Structural Types
Structural types let you define interfaces based on structure (duck typing) rather than explicit inheritance, enabling ad-hoc type compatibility. If a type has the required methods, it's compatible—without needing explicit declaration. This is useful for integrating external libraries or writing generic code that doesn't assume a specific type hierarchy. However, use structural types sparingly: they sacrifice some type safety and clarity compared to explicit trait-based interfaces. They're best reserved for situations where you truly need duck typing or must work with unrelated types that have similar interfaces.
// Scala 3
import scala.reflect.Selectable.reflectiveSelectable // required for structural calls

def quack(x: { def quack(): String }): String =
  x.quack() // Works on any type with a quack() method (invoked reflectively)

class Duck:
  def quack(): String = "Quack!"

class FakeDuck:
  def quack(): String = "Fake quack!"

quack(Duck())     // Works
quack(FakeDuck()) // Works
Inlined Code
The inline modifier tells the compiler to expand a function's body at every call site, eliminating the function call overhead and enabling compile-time metaprogramming. This is useful for performance-critical code where function call overhead matters, and for generating code at compile time based on type information. Inlining also allows better optimization because the compiler can see the surrounding context. However, it increases code size (binary bloat), so use it judiciously. Scala 3 makes inlining safer and more predictable than C++'s inline hints, and it's the foundation for compile-time code generation and macro-like functionality.
// Scala 3
inline def powerOf2(inline n: Int): Int =
  inline n match
    case 1 => 2
    case 2 => 4
    case 3 => 8
// Compiler inlines and reduces the match at compile time
val x = powerOf2(1 + 1) // Becomes: val x = 4
// (a non-constant argument would be a compile-time error here)

// Conditional compilation: with an inline condition the compiler
// removes the branch entirely when the flag is false
inline val debugEnabled = true

inline def debug(msg: String): Unit =
  inline if debugEnabled then println(s"DEBUG: $msg")
Match Types
Match types enable type-level pattern matching, allowing you to compute types based on other types at compile time. This is advanced territory: you define rules that transform input types into output types. Match types are useful for building type-safe abstractions that adapt their behavior based on input types, such as libraries that provide different APIs for different container types. They're particularly valuable in functional programming libraries where the API must adapt based on the concrete monad or functor being used. This is an advanced feature that most developers don't need, but it enables incredibly expressive type-safe abstractions.
// Scala 3
type Elem[T] = T match
  case String => Char
  case Array[t] => t
  case List[t] => t
  case AnyVal => T // fallback for primitive types

val s: Elem[String] = 'a' // Char
val a: Elem[Array[Int]] = 0 // Int
val l: Elem[List[String]] = "hi" // String
val other: Elem[Int] = 42 // Int
Scala 2 Features Removed or Changed in Scala 3
DelayedInit Removed
DelayedInit is removed, so Scala 2's delayed constructor side effects no longer work:
// Scala 2
class App extends DelayedInit {
  def delayedInit(body: => Unit): Unit = {
    println("Starting")
    body
  }
  println("Main code") // Executes via delayedInit
}
// Scala 3: Use @main annotation
@main def hello(name: String): Unit =
println(s"Hello, $name!")
Procedure Syntax Removed
Scala 2 allowed omitting Unit return type:
// Scala 2
def foo { println("bar") } // Returns Unit
// Scala 3: Must be explicit
def foo: Unit = println("bar")
Automatic Tupling of Arguments Restricted
Tuple patterns such as case (a, b) work unchanged in Scala 3. What changed is automatic tupling of method arguments:
// Scala 2 and Scala 3: tuple patterns behave identically
val pair = (1, "a")
pair match {
  case (a, b) => println(a) // Unpacks the tuple in both versions
}
// Scala 2 only: auto-tupling of arguments
def show(t: (Int, String)): Unit = println(t)
// show(1, "a") // Scala 2 adapted this to show((1, "a")); Scala 3 rejects it
show((1, "a")) // Works in both
View Bounds Deprecated
// Scala 2 (deprecated)
def foo[T <% Int](x: T): Int = x // Uses implicit conversion
// Scala 3: Use implicit conversion instead
def foo[T](x: T)(using Conversion[T, Int]): Int = ???
Migration Tools
Scalafix
Automated code migrations:
# Scalafix applies rule-based rewrites (requires the sbt-scalafix plugin)
sbt "scalafix ProcedureSyntax"   # rewrites deprecated procedure syntax
sbt "scalafix RemoveUnused"      # requires -Wunused compiler flags
# Formatting is a separate concern: configure scalafmt with a Scala 3 dialect
# (.scalafmt.conf: runner.dialect = scala3)
Manual Migration Checklist
- Update `scalaVersion` in `build.sbt` to 3.x
- Replace `implicit` definitions with `given` or `extension`
- Replace `implicit` parameters with `using`
- Convert sealed trait + case objects to `enum`
- Replace `implicit class` with `extension`
- Replace `implicitly[T]` with `summon[T]`
- Update dependent libraries to Scala 3 versions
- Test with the `-source:3.0-migration` flag for warnings
- Check for Java interop issues
Common Migration Issues
// Issue 1: Implicit resolution order changed
// Scala 2 priority: local scope, imports, companion objects
// Scala 3: More systematic (lexical scope first)
// Issue 2: Anonymous function syntax
// Scala 2: (x: Int) => x + 1
// Scala 3: (x: Int) => x + 1 (same)
// But: x => x + 1 (only works in limited contexts now)
// Issue 3: Wildcard imports
// Scala 2: import foo._
// Scala 3: import foo.* (both work, * preferred)
// Issue 4: Function type syntax
// Scala 2: Function2[Int, Int, Int]
// Scala 3: (Int, Int) => Int (preferred, clearer)
APPENDIX B: Common Pitfalls and How to Avoid Them
1. Null Pointer Exceptions from Uninitialized Variables
The Problem:
// DANGEROUS: var without initialization
var config: AppConfig = _ // _ means null
val result = config.getDatabaseUrl() // NullPointerException!
Why It Happens: Scala allows var with uninitialized values (defaults to null for reference types), but doesn't track initialization, leading to silent nulls.
The Fix:
// Option 1: Always initialize
var config: AppConfig = loadConfigOrDefault()
// Option 2: Use Option (better)
var config: Option[AppConfig] = None
config = Some(loadConfig())
// Option 3: Use lazy val (best for single initialization)
lazy val config: AppConfig = loadConfig()
// Option 4: Use Try for error handling
val config: Try[AppConfig] = Try(loadConfig())
2. Incorrect Pattern Matching with Variables
The Problem:
// CONFUSING: case x binds a NEW x to the matched value; it shadows the outer x
val x = 10
val y = 5 match {
  case x => x + 1 // This x is 5, not the outer 10; y is 6
}
println(x) // Still 10, but easy to misread!
// WRONG: using a lowercase variable in a pattern thinking it compares
val threshold = 10
List(5, 15, 8, 12) match {
  case List(a, threshold, c, d) => // threshold binds to 15, it is NOT compared!
    println(s"$a $threshold $c $d") // Matches unconditionally
}
Why It Happens: Variables in patterns always bind to values, not compare. To compare values, use backticks.
The Fix:
// Use backticks in a PATTERN to compare against an existing value
val threshold = 10
List(5, 15, 8, 12).foreach {
  case `threshold` => println("exactly the threshold") // compares with outer threshold
  case x if x > threshold => println(s"$x exceeds threshold") // guards use the variable directly
  case x => println(s"$x is below threshold")
}
// For pattern matching with a guard
val x = 10
val result = 5 match {
  case n if n == x => n + 1 // Guard compares against the outer x
  case n => n // n binds the value
}
3. Modifying Collections While Iterating
The Problem:
// WRONG: Undefined behavior - may throw or silently misbehave
val list = scala.collection.mutable.ListBuffer(1, 2, 3, 4, 5)
list.foreach { item =>
  if (item % 2 == 0) {
    list -= item // Mutating while iterating: results are unspecified
  }
}
// WRONG: Silent incorrect behavior
val set = scala.collection.mutable.Set(1, 2, 3, 4, 5)
for (item <- set) {
if (item % 2 == 0) {
set -= item // Unpredictable iteration
}
}
Why It Happens: Modifying underlying collection during iteration violates iterator contracts.
The Fix:
// Option 1: Filter into new collection
val list = scala.collection.mutable.ListBuffer(1, 2, 3, 4, 5)
val odds = list.filter(_ % 2 != 0)
// Option 2: Collect mutations, apply after
val list = scala.collection.mutable.ListBuffer(1, 2, 3, 4, 5)
val toRemove = list.filter(_ % 2 == 0)
toRemove.foreach(list -= _)
// Option 3: Use immutable, assign back
var list = List(1, 2, 3, 4, 5)
list = list.filter(_ % 2 != 0)
// Option 4: Mutate in place with filterInPlace (no iterator involved)
val set = scala.collection.mutable.Set(1, 2, 3, 4, 5)
set.filterInPlace(_ % 2 != 0) // Safe bulk removal
4. Off-by-One Errors in Range Operations
The Problem:
// EASY TO CONFUSE: `to` is inclusive, `until` is exclusive
for (i <- 0 to 10) println(i) // Prints 0-10 (inclusive)
for (i <- 0 until 10) println(i) // Prints 0-9 (exclusive)
// WRONG: Creating wrong size array
val array = Array.ofDim[Int](5 to 10) // Error: requires Int
Why It Happens: Two similar operators with different semantics; easy to confuse.
The Fix:
// Use descriptive naming
val count = 5
val indices = 0 until count // 0 to count-1
val inclusive = 0 to (count - 1) // Same, more explicit
// When in doubt, print
println((0 until 5).toList) // List(0, 1, 2, 3, 4)
println((0 to 5).toList) // List(0, 1, 2, 3, 4, 5)
// Array sizing is usually better done explicitly
val n = 10
val array = new Array[Int](n) // Size n, indices 0 to n-1
5. Shared Mutable State in Closures
The Problem:
// WRONG: All closures reference the same mutable var
def makeCounters(n: Int): List[() => Int] = {
  var count = 0
  (1 to n).map { _ =>
    () => { count += 1; count } // All share the same 'count'!
  }.toList
}
val counters = makeCounters(3)
println(counters(0)()) // 1
println(counters(1)()) // 2 (expected 1!)
println(counters(2)()) // 3 (expected 1!)
Why It Happens: Closures capture variables by reference, not value. All closures share the same mutable variable.
The Fix:
// Option 1: Give each closure its own state
def makeCounters(n: Int): List[() => Int] = {
  (1 to n).map { _ =>
    var count = 0 // Fresh 'count' per closure
    () => { count += 1; count }
  }.toList
}
// Option 2: Seed each closure's private state from the index
def makeCounters2(n: Int): List[() => Int] = {
  (1 to n).map { initialValue =>
    var currentValue = initialValue
    () => { currentValue += 1; currentValue }
  }.toList
}
// Option 3: If no mutation is needed, capture an immutable value
def makeConstants(n: Int): List[() => Int] = {
  (1 to n).map { index =>
    () => index // Captures the immutable value of index
  }.toList
}
6. Type Erasure with Generics
The Problem:
// WRONG: Cannot distinguish List[Int] from List[String] at runtime
def process(list: List[Int]): Unit = ???
def process(list: List[String]): Unit = ??? // ERROR: Duplicate signature!
// WRONG: Type information lost at runtime
def isIntList(obj: Any): Boolean = {
obj match {
case list: List[Int] => true // Unchecked: erasure makes this match ANY List
case _ => false
}
}
// WRONG: Unsafe casting assumptions
val mixed: List[Any] = List(1, "two", 3)
val ints = mixed.asInstanceOf[List[Int]] // Unsafe!
println(ints(1) + 1) // ClassCastException at runtime: the element is really a String
Why It Happens: JVM erases generic type parameters at runtime for efficiency. List[Int] becomes just List in bytecode.
The Fix:
// Option 1: Use type tags/manifests (Scala 2)
import scala.reflect.ClassTag
def process[T: ClassTag](list: List[T]): Unit = {
println(implicitly[ClassTag[T]]) // Can access type at runtime
}
// Option 2: Use different method names
def processInts(list: List[Int]): Unit = ???
def processStrings(list: List[String]): Unit = ???
// Option 3: Pattern match on container type, not element type
def isListOfAny(obj: Any): Boolean = {
obj match {
case _: List[_] => true // Matches any List
case _ => false
}
}
// Option 4: Use wrapper type to preserve information
case class IntList(values: List[Int])
case class StringList(values: List[String])
def process(list: IntList): Unit = ???
def process(list: StringList): Unit = ??? // OK: Different types
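If a runtime element check is genuinely needed, inspecting the elements themselves is the reliable route, since only the erased container type survives. A small sketch (note the vacuous `true` for an empty list):

```scala
// Check the elements, not the erased type parameter
def isIntList(obj: Any): Boolean = obj match {
  case list: List[_] => list.forall(_.isInstanceOf[Int]) // empty list yields true
  case _ => false
}

println(isIntList(List(1, 2, 3)))  // true
println(isIntList(List("a", "b"))) // false -- works where the erased match cannot
```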
7. Forgetting List is Persistent (Immutable)
The Problem:
// WRONG: Expecting mutation
var list = List(1, 2, 3)
list.map(_ * 2) // Returns new List, doesn't modify original
println(list) // Still [1, 2, 3], not [2, 4, 6]!
// WRONG: Inefficient rebuilding
var list = List(1, 2, 3)
list = list :+ 4 // Creates entire new list, O(n)
list = list :+ 5 // Again, O(n)
list = list :+ 6 // Again, O(n)
// Total: O(n²) for n appends
Why It Happens: List is immutable; operations return new lists. Append is inefficient for immutable List.
The Fix:
// Option 1: Use ListBuffer for building, then convert
val builder = scala.collection.mutable.ListBuffer(1, 2, 3)
builder += 4
builder += 5
val list = builder.toList // O(n) single conversion
// Option 2: Build in one expression
val list = List(1, 2, 3).flatMap(x => List(x, x * 2))
// Option 3: Use Vector for append-friendly immutable collection
var list = Vector(1, 2, 3)
list = list :+ 4 // Effectively constant-time append
list = list :+ 5
list = list :+ 6
// Total: effectively O(n) for n appends
// Option 4: Use ArrayBuffer if mutation is acceptable
val array = scala.collection.mutable.ArrayBuffer(1, 2, 3)
array += 4
array += 5
8. Comparing Objects with == Instead of Structure
The Problem:
// Strings are actually fine: Scala's == delegates to equals (null-safely)
val a = new String("hello")
val b = new String("hello")
println(a == b) // true: String implements structural equals (use 'eq' for identity)
// WRONG: Custom class without overriding ==
class User(val name: String) {
// No hashCode/equals override
}
val user1 = new User("Alice")
val user2 = new User("Alice")
println(user1 == user2) // false! Reference equality
val set = Set(user1)
set.contains(user2) // false, different reference
Why It Happens: Default == compares references (identity) for custom classes. Case classes auto-generate structural equality, but regular classes don't.
The Fix:
// Option 1: Use case class (auto-generates equals, hashCode)
case class User(name: String, email: String)
val user1 = User("Alice", "alice@example.com")
val user2 = User("Alice", "alice@example.com")
println(user1 == user2) // true!
// Option 2: Manually override equals and hashCode
class User(val name: String, val email: String) {
override def equals(obj: Any): Boolean = obj match {
case other: User =>
this.name == other.name && this.email == other.email
case _ => false
}
override def hashCode: Int = {
java.util.Objects.hash(name, email)
}
}
// Option 3: Use tuples, which compare structurally
val a = ("Alice", "alice@example.com")
val b = ("Alice", "alice@example.com")
println(a == b) // true! Tuples have structural equality
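The case-class fix matters doubly for hash-based collections: the generated `equals` and `hashCode` are consistent, so lookups in `Set` and `Map` behave as expected:

```scala
case class User(name: String, email: String)

val users = Set(User("Alice", "alice@example.com"))
// equals AND hashCode are both generated and consistent,
// so a structurally equal key is found in hash-based collections
println(users.contains(User("Alice", "alice@example.com"))) // true

val roles = Map(User("Alice", "alice@example.com") -> "admin")
println(roles.get(User("Alice", "alice@example.com"))) // Some(admin)
```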
9. Shadowing Variables Accidentally
The Problem:
// WRONG: Inner scope shadows outer variable
val x = 10
def process(): Unit = {
val x = 20 // Shadows outer x
println(x) // 20, but intended outer value?
}
// WRONG: In pattern matching
val value = 5
val result = List(1, 2, 3) match {
case List(value, _, _) => // Shadows outer 'value'
value + 10 // Uses pattern-matched value (1)
}
println(result) // 11, but expected 15?
Why It Happens: A lowercase identifier in a pattern always binds a fresh variable rather than referencing the outer one, and nested scopes are free to shadow outer names; both are easy to miss.
The Fix:
// Enable compiler warnings for shadowing
// In build.sbt (Scala 3): scalacOptions += "-Wshadow:all"
// Rename to avoid shadowing
val outerX = 10
def process(): Unit = {
val localX = 20
println(localX)
}
// Use different names in patterns
val threshold = 5
val result = List(1, 2, 3) match {
case List(first, _, _) => // Clear different name
first + 10
}
// Use backticks to reference outer variable in pattern
val value = 5
val result = List(1, 2, 3) match {
case List(`value`, _, _) => // Compares to outer 'value'
value + 10
case List(other, _, _) => // Binds to other
other + 10
}
10. Ignoring Future/IO Exceptions
The Problem:
// WRONG: Exception silently swallowed
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
val future = Future {
throw new Exception("Oops!")
}
// The Future failed, but nothing observes or logs it!
// WRONG: onComplete without proper error handling
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
val f = Future(riskyOperation())
f.onComplete {
case scala.util.Success(value) => println(value)
case scala.util.Failure(_) => println("Failed") // Exception details discarded
}
// Program might exit before handling result
Why It Happens: Futures are asynchronous; exceptions don't propagate normally. onComplete callback might not execute before program exits.
The Fix:
// Option 1: Use Await for synchronous code
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import java.util.concurrent.TimeoutException
val future = Future(riskyOperation())
try {
val result = Await.result(future, 5.seconds)
println(result)
} catch {
case e: TimeoutException => println("Operation timed out")
case e: Exception => println(s"Operation failed: ${e.getMessage}")
}
// Option 2: Use map/flatMap for composition
val result = Future(riskyOperation())
.map(processSuccess)
.recover { case e => processError(e) }
// Option 3: Use IO monad (cats-effect)
import cats.effect._
val io = IO(riskyOperation())
.handleError(e => s"Error: ${e.getMessage}")
io.unsafeRunSync()
// Option 4: The scala-async library (a separate module, not built-in syntax)
import scala.concurrent.{Future, Await}
import scala.async.Async.{async, await}
val result = async {
val r1 = await(futureA)
val r2 = await(futureB)
r1 + r2 // Natural syntax
}
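One more option worth knowing: the `failed` projection and `andThen` let failures be observed (logged, counted) without altering the pipeline. A sketch, where the exception message is purely illustrative:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

val f: Future[Int] = Future { throw new RuntimeException("boom") }

// The .failed projection succeeds WITH the exception, so it can be inspected
f.failed.foreach(e => println(s"Future failed: ${e.getMessage}"))

// andThen observes the outcome without changing the Future's result
val observed = f.andThen { case scala.util.Failure(e) => println(s"logged: ${e.getMessage}") }
Await.ready(observed, 2.seconds) // keep the program alive long enough to see the logs
```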
11. Accumulating Large Lists in Recursion
The Problem:
// WRONG: O(n²) list concatenation
def sumLists(lists: List[List[Int]]): List[Int] = {
var acc: List[Int] = Nil
lists.foreach { list =>
acc = acc ++ list // Copies the ever-growing 'acc' on each step: O(n²) total
}
acc
}
// WRONG: Building string via concatenation
def buildReport(items: List[String]): String = {
var result = ""
items.foreach { item =>
result = result + "\n" + item // O(n²)
}
result
}
Why It Happens: List concatenation copies the entire left list each time. Accumulating with concatenation is inefficient.
The Fix:
// Option 1: Use accumulator with prepend (O(n))
def sumLists(lists: List[List[Int]]): List[Int] = {
@scala.annotation.tailrec
def go(lists: List[List[Int]], acc: List[Int]): List[Int] = {
lists match {
case Nil => acc.reverse // Reverse at end
case head :: tail =>
go(tail, head.reverse ++ acc) // Prepend reversed
}
}
go(lists, Nil)
}
// Option 2: Use flatten (the idiomatic one-liner)
def sumLists(lists: List[List[Int]]): List[Int] = {
lists.flatten
}
// Option 3: Use ListBuffer for building
def sumLists(lists: List[List[Int]]): List[Int] = {
val buffer = scala.collection.mutable.ListBuffer[Int]()
lists.foreach { list =>
buffer ++= list
}
buffer.toList
}
// Option 4: Use StringBuilder for strings
def buildReport(items: List[String]): String = {
val sb = new StringBuilder()
items.foreach { item =>
sb.append(item).append("\n")
}
sb.toString()
}
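For the string case specifically, the standard library already does the buffered build for you: `mkString` joins with a separator in a single pass.

```scala
// mkString builds the result with an internal StringBuilder: O(n)
def buildReport(items: List[String]): String = items.mkString("\n")

println(buildReport(List("alpha", "beta", "gamma")))
```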
12. Not Handling Partial Functions Safely
The Problem:
// WRONG: MatchError if not all cases covered
val f: PartialFunction[Int, String] = {
case 1 => "one"
case 2 => "two"
}
f(3) // MatchError: no case match for 3
// WRONG: Assuming isDefinedAt without checking
val evens: PartialFunction[Int, String] = {
case n if n % 2 == 0 => "even"
}
List(1, 2, 3, 4, 5).map(evens) // MatchError for odd numbers!
Why It Happens: Partial functions are only defined for some inputs. Calling them with undefined inputs throws MatchError.
The Fix:
// Option 1: Check isDefinedAt
val f: PartialFunction[Int, String] = {
case 1 => "one"
case 2 => "two"
}
if (f.isDefinedAt(3)) {
println(f(3))
} else {
println("Not defined")
}
// Option 2: Use applyOrElse
val result = f.applyOrElse(3, (_: Int) => "unknown")
// Option 3: Use collect (filters to defined cases)
val results = List(1, 2, 3, 4, 5).collect {
case n if n % 2 == 0 => s"$n is even"
}
// Only processes 2, 4 (even numbers)
// Option 4: Use lift to convert to Option
val lifted = f.lift
lifted(1) // Some("one")
lifted(3) // None
// Option 5: Use match expressions (not PartialFunctions)
def describe(n: Int): String = n match {
case 1 => "one"
case 2 => "two"
case _ => "other"
}
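Partial functions also compose: `orElse` chains one after another, which is a tidy way to bolt on a catch-all case without rewriting the original:

```scala
val known: PartialFunction[Int, String] = {
  case 1 => "one"
  case 2 => "two"
}
val fallback: PartialFunction[Int, String] = { case _ => "other" }

// orElse tries 'known' first, then 'fallback'; the result is total in practice
val describe = known.orElse(fallback)
println(describe(1)) // one
println(describe(9)) // other
```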
13. Lazy Evaluation Gotchas
The Problem:
// SURPRISE: A failed lazy val initializer runs again on the next access
lazy val config: Config = {
println("Loading config...")
loadConfigFromFile() // Throws exception
}
try {
val c = config // First access: initializer runs and throws
} catch { case _: Exception => }
try {
val c = config // Second access: initializer runs AGAIN (side effects repeat)
} catch { case _: Exception => }
// WRONG: Circular lazy dependencies recurse until the stack blows
lazy val a: Int = b + 1
lazy val b: Int = a + 1
val x = a // StackOverflowError (or a deadlock if the cycle crosses threads)
Why It Happens: A lazy val whose initializer throws stays uninitialized, so the initializer re-runs (and re-throws, repeating any side effects) on every subsequent access. Circular lazy dependencies don't loop forever silently; each access recurses until a StackOverflowError.
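The retry behavior is easy to verify with a minimal sketch (the attempt counter is purely illustrative):

```scala
var attempts = 0
lazy val config: String = {
  attempts += 1 // Runs again on every access until it succeeds
  if (attempts < 3) throw new RuntimeException("load failed")
  "loaded"
}

try config catch { case _: RuntimeException => () } // attempt 1 fails
try config catch { case _: RuntimeException => () } // attempt 2 fails
println(config)   // loaded -- third access succeeds and is now cached
println(attempts) // 3 -- the initializer ran once per access until success
```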
The Fix:
// Option 1: Use Try for error-safe lazy loading
import scala.util.Try
lazy val config: Try[Config] = Try(loadConfigFromFile())
val result = config match {
case scala.util.Success(cfg) => cfg
case scala.util.Failure(e) => Config.default()
}
// Option 2: Use Option for optional lazy values
lazy val config: Option[Config] = {
try {
Some(loadConfigFromFile())
} catch {
case scala.util.control.NonFatal(_) => None // Don't catch fatal errors
}
}
// Option 3: Avoid circular lazy dependencies
// Break cycles with immediate values or methods
val a: Int = 10
def b: Int = a + 1 // Method, not lazy val
def c: Int = b + 1
// Option 4: Use IO monad for delayed computation
import cats.effect.IO
val config: IO[Config] = IO(loadConfigFromFile())
// Deferred - only executes when explicitly run
14. Implicit Type Class Inference Failures
The Problem:
// WRONG: Implicit not found when needed
def format[T: Formatter](value: T): String = {
implicitly[Formatter[T]].format(value)
}
implicit val intFormatter: Formatter[Int] = new Formatter[Int] {
def format(x: Int): String = x.toString
}
format(42) // Works
format("hello") // Compile error: could not find Formatter[String]
// WRONG: Ambiguous implicits
implicit val formatter1: Formatter[Int] = ???
implicit val formatter2: Formatter[Int] = ???
format(42) // Compile error: ambiguous implicit
Why It Happens: Implicit resolution is sensitive to scope and specificity. Compiler can't disambiguate when multiple candidates exist.
The Fix:
// Option 1: Provide all necessary instances
implicit val intFormatter: Formatter[Int] = ???
implicit val stringFormatter: Formatter[String] = ???
// Option 2: Use trait with default instances
trait Formatter[T] {
def format(value: T): String
}
object Formatter {
implicit val intFormatter: Formatter[Int] = ???
implicit val stringFormatter: Formatter[String] = ???
}
// Option 3: Be specific about which implicit to use
implicit val preferredFormatter: Formatter[Int] = ???
// Option 4: Use extension methods to make the lookup explicit
implicit class FormatterOps[T](value: T) {
def fmt(implicit f: Formatter[T]): String = f.format(value)
}
42.fmt // Explicit call site ('formatted' would clash with Predef's method)
// Option 5: Move imports to control scope
// Import only needed implicits in scope
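Instances placed in the type class's companion object (Option 2 above) are found without any import, because the companion is part of the implicit search scope. A self-contained check:

```scala
trait Formatter[T] { def format(value: T): String }

object Formatter {
  // Instances in the companion object are found without any import
  implicit val intFormatter: Formatter[Int] = (x: Int) => s"Int($x)"
}

def format[T](value: T)(implicit f: Formatter[T]): String = f.format(value)

println(format(42)) // Int(42) -- resolved via Formatter's companion object
```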
15. Memory Leaks from Retained References
The Problem:
// WRONG: Cache never evicts, grows indefinitely
class DataCache {
private val cache = scala.collection.mutable.Map[String, Data]()
def get(key: String): Data = {
cache.getOrElseUpdate(key, loadData(key))
}
}
// WRONG: Static reference prevents GC
object EventTracker {
private val allEvents = scala.collection.mutable.ListBuffer[Event]()
def trackEvent(event: Event): Unit = {
allEvents += event // Never cleared
}
}
// WRONG: Listener not unregistered
class Subject {
private val listeners = scala.collection.mutable.ListBuffer[Observer]()
def addListener(l: Observer): Unit = listeners += l
def notifyListeners(): Unit = {
listeners.foreach(_.update())
}
}
val subject = new Subject()
subject.addListener(largeObject)
// largeObject held indefinitely even if no longer needed
Why It Happens: References in mutable collections are never cleared. GC can only collect objects with no live references.
The Fix:
// Option 1: Use weak references for caches
import scala.ref.WeakReference
class DataCache {
private val cache = scala.collection.mutable.Map[String, WeakReference[Data]]()
def get(key: String): Data = {
cache.get(key).flatMap(_.get).getOrElse {
val data = loadData(key)
cache(key) = WeakReference(data) // Entry can be reclaimed under memory pressure
data
}
}
}
// Option 2: Use caffeine or Guava caches with eviction
// (Requires external library)
// Option 3: Explicitly unregister listeners
class Subject {
private val listeners = scala.collection.mutable.ListBuffer[Observer]()
def addListener(l: Observer): Unit = listeners += l
def removeListener(l: Observer): Unit = listeners -= l
def notifyListeners(): Unit = {
listeners.foreach(_.update())
}
}
val subject = new Subject()
val observer = new MyObserver()
subject.addListener(observer)
// ... use observer ...
subject.removeListener(observer) // Explicit cleanup
// Option 4: Use a WeakHashMap-backed set for listener storage
import scala.jdk.CollectionConverters._
class Subject {
// Keys are weakly held: listeners with no other references get collected
private val listeners = java.util.Collections
.newSetFromMap(new java.util.WeakHashMap[Observer, java.lang.Boolean]())
.asScala
def addListener(l: Observer): Unit = listeners += l
def notifyListeners(): Unit = listeners.foreach(_.update())
}
// Option 5: Use scala.util.Using (Scala 2.13+) for automatic resource cleanup
import scala.util.Using
def processLargeDataset(filename: String): Unit = {
Using.resource(scala.io.Source.fromFile(filename)) { source =>
source.getLines().foreach(process)
} // Resource automatically closed here
}