Functions — First-Class Citizens | Scala Programming Guide

Defining Functions
In Scala, functions are first-class values—you can pass them around like integers or strings. This is fundamental to functional programming. Let's start with the basics:
// Simple function: takes two integers, returns their sum
def add(a: Int, b: Int): Int = {
  a + b // Last expression is the return value
}
// Function can be one-liner (no braces needed)
def multiply(a: Int, b: Int): Int = a * b
// Function returning Unit (like void)
def logMessage(msg: String): Unit = {
  println(msg)
}
// Call functions normally
val sum = add(10, 20) // 30
val product = multiply(5, 3) // 15
logMessage("Hello Scala!")
Syntax note: def defines a function. The signature is def name(params): ReturnType = body. The compiler can infer the return type of a non-recursive function, but an explicit return type is required for recursive functions and recommended for public APIs. Scala does have a return keyword, but idiomatic code rarely uses it: the last expression in the function body is the return value. This aligns with Scala's expression-oriented nature, where almost everything is an expression.
When a function has only one expression, you can omit the braces: def multiply(a: Int, b: Int): Int = a * b. When you have multiple statements, use braces and let the last expression be the return value.
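To make the expression-oriented style concrete, here is a minimal sketch. The names classifyTempo, square, and area are made up for illustration. It shows an if/else chain producing the result directly, an inferred return type, and a block whose value is its last expression.

```scala
// if/else is an expression, so its value becomes the function's result
def classifyTempo(bpm: Int): String =
  if (bpm < 90) "slow"
  else if (bpm < 140) "medium"
  else "fast"

// The return type is left out here; the compiler infers Int
def square(x: Int) = x * x

// A block is also an expression: its value is its last expression
val area = {
  val side = 4
  square(side)
}
```

Here area ends up as 16 and classifyTempo(120) yields "medium", with no return keyword anywhere.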
Parameters, Return Types, Default Values, Named Arguments
Functions can have default values for parameters, making them more flexible and reducing boilerplate:
// Default values for parameters
def greetArtist(name: String, genre: String = "unknown"): String = {
  s"$name creates $genre music"
}
greetArtist("M83") // "M83 creates unknown music"
greetArtist("M83", "synthwave") // "M83 creates synthwave music"
// Named arguments (call in any order)
greetArtist(genre = "synthpop", name = "Chvrches")
// "Chvrches creates synthpop music"
// Multiple parameter lists (currying; we'll explore this more later)
def buildQuery(host: String)(port: Int)(database: String): String = {
  s"postgresql://$host:$port/$database"
}
buildQuery("localhost")(5432)("musicdb")
// "postgresql://localhost:5432/musicdb"
// But you can also call it partially—we'll return to this
Named arguments make code more readable. Instead of calculate(42, 100, true) where you must remember the order, you can write calculate(value = 42, maxSize = 100, debug = true). The order doesn't matter, and readers immediately understand what each argument means.
Default values reduce the number of overloads you need to write. Instead of two versions of greetArtist (one taking just name, one taking name and genre), you write one with a default for genre.
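The calculate call mentioned above has no definition in this guide, so the sketch below assumes one: a hypothetical function that clamps value to maxSize and optionally prints debug output. It is only meant to show how named arguments and defaults combine at the call site.

```scala
// Hypothetical signature for the calculate call mentioned above
def calculate(value: Int, maxSize: Int = 100, debug: Boolean = false): Int = {
  val clamped = math.min(value, maxSize) // Never exceed maxSize
  if (debug) println(s"value=$value, maxSize=$maxSize, result=$clamped")
  clamped
}

// Positional: the reader has to remember what each argument means
val positional = calculate(42, 100, false) // 42

// Named: any order, self-documenting
val named = calculate(debug = false, maxSize = 10, value = 42) // 10

// Positional first, then a named argument; maxSize falls back to its default
val skipMiddle = calculate(250, debug = false) // 100
```

Positional arguments must come before named ones, which is why skipMiddle passes value first and names only debug.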
Varargs: Variable Arguments
Functions can accept a variable number of arguments:
// A repeated parameter (Int*) collects the arguments into a Seq[Int]
def addMany(numbers: Int*): Int = {
  numbers.sum // Seq has a sum method
}
addMany(1, 2, 3, 4, 5) // 15
addMany(10, 20) // 30
addMany() // 0 (empty sequence)
// Combining varargs with other parameters
def reportScores(title: String, scores: Int*): String = {
  s"$title: average is ${scores.sum.toDouble / scores.length}"
}
reportScores("Monthly Stats", 85, 92, 78, 88)
// "Monthly Stats: average is 85.75"
// If you have a collection and want to pass it as varargs, use : _*
// (Scala 3 also accepts the shorter addMany(numbers*))
val numbers = List(1, 2, 3)
addMany(numbers: _*) // Unpacks the list into individual arguments
The Int* syntax means "zero or more Ints." Inside the function, numbers is a Seq[Int]. This is useful for functions like max, min, sum, etc., where you might have varying numbers of arguments.
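Because Int* accepts zero arguments, a max-style function would have nothing to return for an empty call. One common way around this, sketched here with a made-up function named largest, is to take the first value as an ordinary parameter and the rest as varargs.

```scala
// Requiring a first argument guarantees at least one value at compile time
def largest(first: Int, rest: Int*): Int =
  rest.foldLeft(first)((best, n) => math.max(best, n))

val one = largest(7)        // 7
val many = largest(3, 9, 4) // 9

// Splat an existing collection into the varargs position
val scores = List(85, 92, 78)
val top = largest(scores.head, scores.tail: _*) // 92
```

With this signature, largest() does not compile, so the empty case is ruled out before the program runs.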
Nested Functions
Functions can be defined inside other functions. This is useful for encapsulation and keeping helper logic local:
def processPlaylist(name: String): String = {
  // Helper function defined inside
  def formatDuration(seconds: Int): String = {
    val minutes = seconds / 60
    val secs = seconds % 60
    f"$minutes:$secs%02d" // Format with leading zero
  }
  // Use the helper
  val duration = formatDuration(245)
  s"Playlist: $name, Duration: $duration"
}
println(processPlaylist("Night Driving"))
// "Playlist: Night Driving, Duration: 4:05"
// The helper function is scoped—can't be called from outside processPlaylist
// formatDuration(100) <- ERROR: not in scope
By nesting formatDuration, you keep your namespace clean and show readers: "This helper is only used here." If the helper were defined at the top level, readers might wonder if it's used elsewhere. Nesting makes the intent clear: this is internal bookkeeping for this specific function.
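A nested helper can also read the parameters of the enclosing function, so you don't need to pass them along. The sketch below uses made-up names (describeTracks, label) to show a helper that picks up playlist from the outer scope.

```scala
def describeTracks(playlist: String, durations: List[Int]): List[String] = {
  // The helper reads `playlist` from the enclosing scope; it is not a parameter
  def label(index: Int, seconds: Int): String = {
    val minutes = seconds / 60
    val secs = seconds % 60
    f"$playlist #${index + 1}: $minutes:$secs%02d"
  }
  durations.zipWithIndex.map { case (seconds, index) => label(index, seconds) }
}

val lines = describeTracks("Night Driving", List(245, 61))
// List("Night Driving #1: 4:05", "Night Driving #2: 1:01")
```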
Anonymous Functions (Lambdas)
You can define functions without names, useful for short callbacks:
// Anonymous function syntax: (params) => body
val double = (x: Int) => x * 2
double(5) // 10
// Multi-line anonymous functions use braces
val greet = (name: String) => {
  val greeting = s"Hello, $name!"
  greeting
}
greet("Alice") // "Hello, Alice!"
// No parameters
val randomNumber = () => scala.util.Random.nextInt(100)
randomNumber() // Random value each time
// Scala often infers the parameter type
val numbers = List(1, 2, 3, 4, 5)
val doubledExplicit = numbers.map((x: Int) => x * 2) // Type specified
val doubledInferred = numbers.map(x => x * 2) // Type inferred from context
Anonymous functions are functions without names. They're especially useful as arguments to higher-order functions. For example, map expects a function—you can pass a named function, or inline an anonymous function. Inlining often reads better because the logic is right where it's used.
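Here is a short side-by-side sketch of the two options, using illustrative names (toMinutes, isLong). Both map calls produce the same list; the last part stores an anonymous function in a val so it can be reused.

```scala
val durations = List(245, 61, 180)

// Option 1: define the logic under a name and pass it to map
def toMinutes(seconds: Int): Int = seconds / 60
val viaNamed = durations.map(toMinutes) // List(4, 1, 3)

// Option 2: inline the same logic as an anonymous function
val viaLambda = durations.map(seconds => seconds / 60) // List(4, 1, 3)

// Anonymous functions are values, so they can be stored and reused
val isLong: Int => Boolean = seconds => seconds >= 180
val longTracks = durations.filter(isLong) // List(245, 180)
```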
The Underscore Placeholder
When a function's parameter is used exactly once, Scala lets you use _ as a placeholder:
// These are equivalent:
val doubled1 = numbers.map((x) => x * 2)
val doubled2 = numbers.map(_ * 2)
val filtered1 = numbers.filter((x) => x > 3)
val filtered2 = numbers.filter(_ > 3)
// With multiple parameters, each _ stands for a different parameter
val total = numbers.reduce(_ + _) // Same as numbers.reduce((a, b) => a + b)
// A tuple is a single parameter, so _ + _ is not the way to destructure it.
// Use a pattern match instead:
val pairs = List((1, 2), (3, 4), (5, 6))
val sums = pairs.map { case (a, b) => a + b } // List(3, 7, 11)
// Scala 3 also accepts pairs.map((a, b) => a + b); Scala 2 rejects it
The underscore is convenient for very simple expressions. Using _ * 2 is clear and concise. Using _ + _ for two parameters is also clear. But for more complex logic, be explicit: x => x.someMethod() is clearer than chaining underscores.
The underscore is a placeholder for the next parameter. Multiple underscores represent multiple parameters in order: the first _ is the first param, the second _ is the second param, etc. This works for simple expressions; for destructuring complex types, explicit parameters are clearer.
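The sketch below collects the placeholder rules in one place, using a made-up list named plays. The last line shows the limit: a parameter that appears twice needs a name, because a second underscore would mean a second parameter.

```scala
val plays = List(12, 7, 30, 4)

// One placeholder: the single parameter, used once
val doubledPlays = plays.map(_ * 2) // List(24, 14, 60, 8)

// Two placeholders: first and second parameter, in order
val totalPlays = plays.reduce(_ + _)   // 53
val descending = plays.sortWith(_ > _) // List(30, 12, 7, 4)

// A parameter used twice needs a name; `_ * _` would expect two parameters
val squared = plays.map(n => n * n) // List(144, 49, 900, 16)
```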
Currying: Functions Returning Functions
Currying is when a function that takes multiple arguments is rewritten as a chain of single-argument functions. In Scala, this is a practical tool, not just theory. It allows you to partially apply functions, fixing some arguments and leaving others free for later. This is powerful for creating specialized functions and for enabling implicit parameters.
// Non-curried version
def add(a: Int, b: Int): Int = a + b
// Curried version (returns a function)
def addCurried(a: Int): Int => Int = {
  (b: Int) => a + b
}
// Use it
val addFive = addCurried(5) // Returns a function
val result = addFive(10) // Call the returned function → 15
// Or call immediately
addCurried(5)(10) // 15
// Scala has syntactic sugar for this:
def addCurriedSugar(a: Int)(b: Int): Int = a + b
addCurriedSugar(5)(10) // Same as above
// Why? Currying allows you to "fix" some arguments:
val add10 = addCurriedSugar(10)(_: Int)
add10(5) // 15
add10(20) // 30
When you curry a function, you're creating a chain of functions, each taking one argument. addCurried(5) returns a function that takes an Int and adds 5 to it. This function can be stored, passed around, and called later. This is useful for creating specialized functions from general ones.
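If you already have a two-argument function value, you don't have to rewrite it by hand: the standard library's Function2 has a curried method. A minimal sketch, with variable names chosen only for illustration:

```scala
// A two-argument function value
val add: (Int, Int) => Int = (a, b) => a + b

// Function2#curried turns it into Int => Int => Int
val addCurried: Int => Int => Int = add.curried

val addFive: Int => Int = addCurried(5)
val fifteen = addFive(10) // 15

// A partially applied function fits anywhere a one-argument function is expected
val shifted = List(1, 2, 3).map(addCurried(100)) // List(101, 102, 103)
```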
Currying shines with implicit parameters (Chapter 6) and higher-order functions. Consider:
// A function that applies a transformation to each element
def mapWithPrefixAndSuffix(prefix: String)(suffix: String)(items: List[String]): List[String] = {
  items.map(item => s"$prefix$item$suffix")
}
// Fix the prefix and suffix, vary the list
val withBrackets = mapWithPrefixAndSuffix("[")("]")(_: List[String])
withBrackets(List("A", "B", "C")) // List("[A]", "[B]", "[C]")
val withArrows = mapWithPrefixAndSuffix("-> ")(":end")(_: List[String])
withArrows(List("play", "pause")) // List("-> play:end", "-> pause:end")
Here, by currying and partially applying, we create specialized functions for common prefixes and suffixes. Each call to mapWithPrefixAndSuffix with a specific prefix and suffix returns a function that can be reused many times. This avoids repetition and makes code more composable.
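Multiple parameter lists also pair nicely with function arguments. In this sketch, transformAll is a hypothetical helper (not from the standard library) that takes the data in its first list and the behaviour in its second, so the call site can pass the function as a block.

```scala
// Hypothetical helper: data in the first list, behaviour in the second
def transformAll(items: List[String])(f: String => String): List[String] =
  items.map(f)

// The last parameter list can be written as a block
val shouted = transformAll(List("play", "pause")) { item =>
  item.toUpperCase + "!"
}
// List("PLAY!", "PAUSE!")

// Fix the data once, then reuse it with different functions
val onControls: (String => String) => List[String] =
  transformAll(List("play", "pause"))
val bracketed = onControls(item => s"[$item]") // List("[play]", "[pause]")
```

This is the same shape that standard collection methods such as foldLeft use: a value first, a function last.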
By-Name Parameters: Lazy Evaluation
Normally, function arguments are evaluated once, immediately before the function is called. By-name parameters flip this: the argument is evaluated each time you use it inside the function, and not at all if you never use it. This enables lazy evaluation and is the foundation of control flow abstractions in Scala.
def printTwice(message: String): Unit = {
  println(message)
  println(message)
}
def expensiveComputation(): String = {
  Thread.sleep(1000) // Slow operation
  "Done!"
}
// This runs the expensive computation once, prints the result twice
printTwice(expensiveComputation())
With by-name parameters, the argument is evaluated each time you use it:
// Note the => before the type
def printTwiceByName(message: => String): Unit = {
  println(message) // Computes here
  println(message) // Computes again
}
// The expensive computation runs twice!
printTwiceByName(expensiveComputation())
// This is powerful for control flow:
def ifDebug(condition: Boolean)(block: => Unit): Unit = {
  if (condition) block // Only evaluates if condition is true
}
val debugEnabled = false // Example flag; typically read from configuration
// The expensive computation only runs if debugEnabled is true
ifDebug(debugEnabled) {
  println("Expensive debug info: " + expensiveComputation())
}
By-name parameters enable lazy evaluation. Instead of always computing the argument and passing the value, Scala creates a function that computes the argument and passes that. Inside the function, each use of the by-name parameter triggers the computation.
This is powerful for control flow. Consider ifDebug: with an ordinary by-value parameter, the expensive computation would run regardless of whether debug was enabled, because by-value arguments are evaluated before the function is called (as in Java). With a by-name parameter, the computation is wrapped in a function that only runs if the condition is true. This is how Scala code can implement conditional logging without paying for the message when logging is disabled.
By-name parameters are the foundation for DSLs and control flow abstractions in Scala. We'll see them used extensively in later chapters.
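Since every use of a by-name parameter re-runs the argument, a common pattern is to capture it in a lazy val when you want at most one evaluation. The sketch below uses a counter and made-up names (nextMessage, twiceByName, twiceCached) to make the difference observable.

```scala
var evaluations = 0 // Counts how often the argument expression runs

def nextMessage(): String = {
  evaluations += 1
  s"message #$evaluations"
}

// Each use of a by-name parameter re-runs the argument expression
def twiceByName(message: => String): List[String] =
  List(message, message)

// Capturing it in a lazy val evaluates it at most once, on first use
def twiceCached(message: => String): List[String] = {
  lazy val cached = message
  List(cached, cached)
}

val fresh = twiceByName(nextMessage())  // List("message #1", "message #2")
val reused = twiceCached(nextMessage()) // List("message #3", "message #3")
```

After both calls, evaluations is 3: two runs for twiceByName and one for twiceCached.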
Methods vs Functions: The Subtle Difference
In Scala, methods and functions are similar but not identical. Understanding this distinction is important when working with higher-order functions:
// A method is defined with def, belongs to a class/object
class Calculator {
  def add(a: Int, b: Int): Int = a + b
}
// A function is a value of type Function2[Int, Int, Int]
val addFunction: (Int, Int) => Int = (a, b) => a + b
// Syntactically similar, semantically different:
val calc = new Calculator
calc.add(5, 3) // Method call
addFunction(5, 3) // Function call
// A function value can be passed as is; a method is converted to one first
def applyTwice(f: (Int, Int) => Int, x: Int, y: Int): Int = {
  f(x, y) + f(x, y)
}
applyTwice(addFunction, 5, 3) // 16: pass the function value
applyTwice(calc.add, 5, 3) // 16: the compiler eta-expands calc.add automatically
// The conversion can also be written explicitly with placeholders
applyTwice(calc.add(_, _), 5, 3) // 16
Methods are members defined with def in a class or object, and they operate on the receiver object (implicit this). Functions are values: they can be stored in variables, passed as arguments, and returned from other functions. A method is not itself a value, but wherever a function type is expected the compiler converts it automatically through eta-expansion, which wraps the method in a function value. You can also request the conversion explicitly with placeholder syntax such as calc.add(_, _).
In practice: This distinction rarely matters in everyday code. Just know that methods belong to objects and functions are values. When you pass a method where a function is expected, the compiler usually converts it for you; use the _ syntax when you want the conversion to be explicit.
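One practical consequence of functions being values is that they carry their own methods, such as andThen. In this sketch, Mixer and boost are made-up names; the method is converted to a function value with the placeholder syntax and then composed with itself.

```scala
class Mixer {
  def boost(level: Int): Int = level + 10 // A method: belongs to Mixer
}

val mixer = new Mixer

// Convert the method into a function value with the placeholder syntax
val boostFn: Int => Int = mixer.boost(_)
val boosted = List(1, 2, 3).map(mixer.boost(_)) // List(11, 12, 13)

// Function values have methods of their own, such as andThen and compose
val boostTwice: Int => Int = boostFn.andThen(boostFn)
val result = boostTwice(5) // 25
```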
Tail Recursion and @tailrec
Recursion is beautiful but can overflow the call stack for large inputs. Scala optimizes direct tail recursion, where the last operation a function performs is a call to itself. The compiler rewrites such functions into loops, eliminating stack growth.
// Non-tail-recursive (bad for large n)
def factorial(n: Int): Int = {
  if (n <= 1) 1
  else n * factorial(n - 1) // Recursive call, then multiply
}
// Tail-recursive (optimized by compiler)
def factorialTail(n: Int, accumulator: Int = 1): Int = {
  if (n <= 1) accumulator
  else factorialTail(n - 1, n * accumulator) // Recursive call is last operation
}
// Annotate with @tailrec to assert and verify tail recursion:
import scala.annotation.tailrec
@tailrec
def factorialTailAnnotated(n: Int, accumulator: Int = 1): Int = {
  if (n <= 1) accumulator
  else factorialTailAnnotated(n - 1, n * accumulator)
}
// If the recursive call is not in tail position, compilation fails:
@tailrec
def badTail(n: Int): Int = {
  if (n <= 1) 1
  else n * badTail(n - 1) // ERROR: not in tail position
}
// Use tail recursion for algorithms that process sequences:
@tailrec
def sum(numbers: List[Int], acc: Int = 0): Int = {
  if (numbers.isEmpty) acc
  else sum(numbers.tail, acc + numbers.head)
}
sum(List(1, 2, 3, 4, 5)) // 15
In the non-tail-recursive factorial, the recursive call happens first and the result is multiplied afterwards. The call stack must keep every frame alive so it can perform the multiplication after the recursion returns: factorial(1000) needs 1000 nested frames, and a deep enough recursion (the limit depends on the JVM's thread stack size) throws a StackOverflowError. Note also that an Int result overflows beyond 12!, so real code would use Long or BigInt.
In tail-recursive factorialTail, the recursive call is the last operation. The compiler turns it into a loop, producing essentially the same bytecode as a hand-written while loop. The @tailrec annotation tells the compiler: "I intend this to be tail-recursive. If it's not, fail at compile time." This catches mistakes where you accidentally break tail recursion.
In practice: Tail recursion matters for algorithms processing large datasets. For most code, use loops or collection methods (.map, .fold, etc.). But when recursion is natural and the data is large, make sure it's tail-recursive and annotate it with @tailrec.
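A common way to package this is to hide the accumulator in a nested helper, so callers can't pass a wrong starting value. The sketch below uses made-up names (factorialBig, loop, sumTo) and switches to BigInt and Long so the results don't overflow.

```scala
import scala.annotation.tailrec

// BigInt avoids the Int overflow that begins at 13!
def factorialBig(n: Int): BigInt = {
  // The accumulator lives in a nested helper, hidden from callers
  @tailrec
  def loop(remaining: Int, acc: BigInt): BigInt =
    if (remaining <= 1) acc
    else loop(remaining - 1, acc * remaining)

  loop(n, BigInt(1))
}

// A million iterations run in constant stack space
@tailrec
def sumTo(n: Int, acc: Long = 0L): Long =
  if (n <= 0) acc
  else sumTo(n - 1, acc + n)

val small = factorialBig(5)  // 120
val large = factorialBig(20) // 2432902008176640000
val million = sumTo(1000000) // 500000500000
```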