The Foldable type class captures the concept of data structures that we can iterate over. Lists are foldable, as are Vectors and Streams.
Foldables give us a way to process values embedded in a structure as if they existed in a sequential order, as we’ve seen with list folding.
Foldable gives us great use cases for monoids and the Eval monad.
In general, a fold function allows users to transform one algebraic data type to another.
The typical use case is to accumulate a value as we traverse. We supply an accumulator value and a binary function to combine it with an item in the sequence.
The function produces another accumulator, allowing us to recurse down the sequence. When we reach the end, the final accumulator is our result.
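As a concrete illustration of the accumulator pattern just described, here is a minimal sketch that uses only the standard library's foldLeft and foldRight on a List. The object name FoldBasics and the bracketed strings are illustrative assumptions, chosen to make the order of combination visible.

```scala
object FoldBasics {
  val items = List("a", "b", "c")

  // foldLeft starts from the accumulator and combines items left to right.
  val left: String =
    items.foldLeft("z")((acc, item) => s"($acc+$item)")

  // foldRight nests the combination from the right-hand end of the list.
  val right: String =
    items.foldRight("z")((item, acc) => s"($item+$acc)")

  // The same pattern with a numeric accumulator.
  val total: Int = List(1, 2, 3).foldLeft(0)(_ + _)
}
```

Here left is "(((z+a)+b)+c)" and right is "(a+(b+(c+z)))", which shows how the two folds associate differently over the same sequence.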
Cats’ Foldable abstracts the two operations foldLeft and foldRight into a type class:
trait Foldable[F[_]] { self =>
def foldLeft[A, B](fa: F[A], b: B)
(f: (B, A) => B): B
def foldRight[A, B](fa: F[A], lb: Eval[B])
(f: (A, Eval[B]) => Eval[B]): Eval[B]
//...
}
Cats provides out-of-the-box instances of Foldable for a handful of Scala data types: List, Vector, Stream, Option, and Map.
import cats.Foldable
val ints = List(1, 2, 3)
import cats.std.list._
Foldable[List].foldLeft(ints, 0)(_ + _)
//res0: Int = 6
The Foldable instance for Map allows us to fold over its values (as opposed to its keys).
Because Map has two type parameters, we have to fix one of them to create the single-parameter type constructor we need to summon the Foldable:
import cats.std.map._
type StringMap[A] = Map[String, A]
val stringMap = Map("a" -> "b", "c" -> "d")
Foldable[StringMap].foldLeft(stringMap, "nil")(_ + "," + _)
//res1: String = nil,b,d
Foldable defines foldRight differently from foldLeft, in terms of the Eval monad:
def foldRight[A, B](fa: F[A], lb: Eval[B])
(f: (A, Eval[B]) => Eval[B]): Eval[B]
Using Eval means folding with Foldable is always stack safe, even when the collection’s default definition of foldRight is not.
For example, the default implementation for Stream is not stack safe. We can see the stack depth changing as we iterate across the stream.
import cats.Eval
import cats.Foldable
def stackDepth: Int = Thread.currentThread.getStackTrace.length
The longer the stream, the larger the stack requirements for the fold. A sufficiently large stream will trigger a StackOverflowError:
(1 to 5).toStream.foldRight(0) {
(item, accum) => println(stackDepth)
item + accum
}
//60
//58
//56
//54
//52
//res2: Int = 15
However, since Eval's map and flatMap are trampolined, Foldable's foldRight maintains the same stack depth throughout:
import cats.std.stream._
val s = (1 to 5).toStream
val accum: Eval[Int] = Eval.now(0)
val result: Eval[Int] =
Foldable[Stream].foldRight(s, accum) {
(item: Int, accum: Eval[Int]) =>
println(stackDepth)
accum.map(_ + item)
}
result.value
//55
//55
//55
//55
//55
//res3: Int = 15
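To see the trampolining idea without Cats on the classpath, the following sketch uses the standard library's scala.util.control.TailCalls as a stand-in for Eval. This is an assumption made for illustration and not how Cats implements Eval; the names SafeFoldRight and sum are hypothetical.

```scala
import scala.util.control.TailCalls.{TailRec, done, tailcall}

object SafeFoldRight {
  // A right fold whose accumulator is a deferred TailRec value,
  // mirroring the shape of Foldable's Eval-based foldRight.
  def foldRight[A, B](as: List[A], z: TailRec[B])(
      f: (A, TailRec[B]) => TailRec[B]): TailRec[B] =
    as match {
      case Nil => z
      case head :: tail =>
        // tailcall suspends each recursive step on the heap
        // instead of pushing a new stack frame.
        tailcall(f(head, tailcall(foldRight(tail, z)(f))))
    }

  // Sums 1 to n with a Long accumulator; the deferred steps are
  // only run when result is called.
  def sum(n: Int): Long =
    foldRight((1 to n).toList, done(0L))((item, acc) => acc.map(_ + item)).result
}
```

Calling SafeFoldRight.sum(100000) completes with a flat stack, because every step is suspended and later run in a loop by result, much as Eval's value does for Foldable's foldRight.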
Cats’ Foldable provides us with a host of useful methods defined on top of foldLeft.
Many of these are facsimiles of familiar methods from the standard library, including find, exists, forall, toList, isEmpty, and nonEmpty:
Foldable[Option].nonEmpty(Option(42))
//res4: Boolean = true
Foldable[List].find(List(1, 2, 3))(_ % 2 == 0)
//res5: Option[Int] = Some(2)
In addition to these familiar methods, Cats provides two methods that make use of monoids:
fold (and its alias combineAll) combines all elements in the sequence using their Monoid
foldMap maps a user-supplied function over the sequence and combines the results using a Monoid
For example, we can use fold (or, equivalently, combineAll) to sum over a List[Int]:
import cats.std.int._ // import Monoid[Int]
Foldable[List].fold(List(1, 2, 3))
//res6: Int = 6
Alternatively, we can use foldMap to convert each Int to a String and concatenate them:
import cats.std.string._ // import Monoid[String]
Foldable[List].foldMap(List(1, 2, 3))(_.toString)
//???
The interesting thing about fold and foldMap is that they use a Monoid instead of a function to give us the final result.
It is important to note that the Monoid requirement belongs to the fold and foldMap methods, which need a Monoid instance for the type being combined; Foldable itself places no such restriction on its elements:
trait Foldable[F[_]] { self =>
//foldLeft, foldRight etc
def fold[A](fa: F[A])
(implicit A: Monoid[A]): A =
foldLeft(fa, A.empty) { (acc, a) =>
A.combine(acc, a)
}
def foldMap[A, B](fa: F[A])
(f: A => B)
(implicit B: Monoid[B]): B =
foldLeft(fa, B.empty) { (b, a) =>
B.combine(b, f(a))
}
}
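To make the role of the monoid visible, here is a minimal sketch in plain Scala. The Monoid trait below is a hypothetical stand-in for cats.Monoid, and the monoid is passed explicitly rather than implicitly (an assumption made so the choice of instance shows up at the call site).

```scala
object MonoidFold {
  // Hypothetical stand-in for cats.Monoid, reduced to the two operations used here.
  trait Monoid[A] {
    def empty: A
    def combine(x: A, y: A): A
  }

  val addition: Monoid[Int] = new Monoid[Int] {
    def empty: Int = 0
    def combine(x: Int, y: Int): Int = x + y
  }

  val multiplication: Monoid[Int] = new Monoid[Int] {
    def empty: Int = 1
    def combine(x: Int, y: Int): Int = x * y
  }

  // Same shape as the foldMap above, specialised to List,
  // with the monoid passed explicitly.
  def foldMap[A, B](as: List[A])(f: A => B)(m: Monoid[B]): B =
    as.foldLeft(m.empty)((acc, a) => m.combine(acc, f(a)))

  val sum: Int = foldMap(List(1, 2, 3, 4))(x => x)(addition)
  val product: Int = foldMap(List(1, 2, 3, 4))(x => x)(multiplication)
  val totalLength: Int = foldMap(List("a", "bb", "ccc"))(_.length)(addition)
}
```

The fold itself never changes; swapping addition for multiplication turns a sum (10) into a product (24), which is the sense in which the Monoid replaces the user-supplied combining function.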
The same holds for foldM: the Monad constraint sits on the method, not on Foldable. It implements left-associative monadic folding using an implicit Monad:
trait Foldable[F[_]] { self =>
//foldLeft, foldRight, fold, foldMap etc
def foldM[G[_], A, B](fa: F[A], z: B)
(f: (B, A) => G[B])
(implicit G: Monad[G]): G[B] =
foldLeft(fa, G.pure(z)) { (gb, a) =>
G.flatMap(gb)(b => f(b, a))
}
}
import cats.implicits._
def binSmalls(acc: Int, x: Int): Option[Int] =
if (x > 9) none[Int] else (acc + x).some
(Foldable[List].foldM(List(2, 8, 3, 1), 0) {binSmalls})
//???
(Foldable[List].foldM(List(2, 10, 3, 1), 0) {binSmalls})
//???
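The following sketch specialises the monadic fold above to List and Option using only the standard library. The names foldOption and spend are hypothetical, and this is an illustration of the idea rather than Cats' foldM.

```scala
object MonadicFold {
  // Left fold in which every step may fail; once a step yields None,
  // flatMap skips the remaining steps and None is the final result.
  def foldOption[A, B](as: List[A], z: B)(f: (B, A) => Option[B]): Option[B] =
    as.foldLeft(Option(z))((acc, a) => acc.flatMap(b => f(b, a)))

  // Hypothetical step function: pay a cost unless it exceeds the balance.
  def spend(balance: Int, cost: Int): Option[Int] =
    if (cost > balance) None else Some(balance - cost)

  val affordable: Option[Int] = foldOption(List(30, 20, 10), 100)(spend)
  val overdrawn: Option[Int] = foldOption(List(30, 80, 10), 100)(spend)
}
```

Here affordable is Some(40), while overdrawn is None because the second cost (80) exceeds the remaining balance (70).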
Finally, we can compose Foldables to support deep traversal of nested sequences:
import cats.std.vector._
val deepFold = Foldable[List].compose(Foldable[Vector])
val ints = List(Vector(1, 2, 3), Vector(4, 5, 6))
deepFold fold ints
//res7: Int = 21
In functional programming it is very common to encode "effects" as data types - common effects include Option for possibly missing values, Xor and Validated for possible errors, and Future for asynchronous computations.
These effects tend to show up in functions working on a single piece of data - for instance parsing a single String into an Int, validating a login, or asynchronously fetching website information for a user.
def parseInt(s: String): Option[Int] = ???
import cats.data.Xor
import scala.concurrent.Future
trait SecError
trait Token
def validateLogin(cred: Token): Xor[SecError, Unit] = ???
trait Profile
trait User
def userInfo(user: User): Future[Profile] = ???
Each function asks only for the data it actually needs; in the case of userInfo, a single User.
We could write a function that takes a list of users and fetches profiles for all of them, but that would be a bit strange.
If we wanted to fetch the profile of just one user, we would either have to wrap it in a List or write a separate function that takes in a single user anyway.
More fundamentally, functional programming is about building lots of small, independent pieces and composing them to make larger and larger pieces - does this hold true in this case?
Given just User => Future[Profile], what should we do if we want to fetch profiles for a List[User]? We could try familiar combinators like map:
def profilesFor(users: List[User]) = users.map(userInfo)
//profilesFor: (users: List[User])List[Future[Profile]]
Note the return type List[Future[Profile]]. This makes sense given the type signatures, but seems unwieldy.
We now have a list of asynchronous values, and to work with those values we must then use the combinators on Future for every single one.
It would be nicer instead if we could get the aggregate result in a single Future, say a Future[List[Profile]].
As it turns out, the Future companion object has a traverse method on it.
However, that method is specialized to standard library collections and Futures - there exists a much more generalized form that would allow us to do things like parse a List[String] or validate credentials for a List[User].
Enter Traverse.
Traverse depends on Applicative (and thus Functor) as well as Foldable:
trait Traverse[F[_]] extends Functor[F] with Foldable[F] { self =>
def traverse[G[_]: Applicative, A, B]
(fa: F[A])(f: A => G[B]): G[F[B]]
def sequence[G[_]: Applicative, A]
(fga: F[G[A]]): G[F[A]] = traverse(fga)(ga => ga)
//...
}
sequence threads all the G effects through the F structure to invert the structure from F[G[_]] to G[F[_]].
traverse allows you to transform elements inside the structure like a Functor, producing applicative effects along the way, and lift those instances of applicative structure outside of the Traversable structure.
Given a function which returns a G effect, traverse threads this effect through the running of this function on all the values in F, returning an F[B] in a G context.
In our above examples, F is List, and G is Option, Xor, or Future.
For the profile example, traverse says given a List[User] and a function User => Future[Profile], it can give you a Future[List[Profile]].
More generally, F[_] is some sort of context which may contain a value (or several). While List tends to be among the most general cases, there also exist Traverse instances for Option, Xor, and Validated (among others).
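For the case where F is List and G is Option, the behaviour described above can be sketched by hand in plain Scala. The implementation of parseInt via Try is an assumption (the slides leave it unimplemented), and the object name TraverseOption is hypothetical.

```scala
import scala.util.Try

object TraverseOption {
  // One possible implementation of the parseInt function from earlier.
  def parseInt(s: String): Option[Int] = Try(s.toInt).toOption

  // Runs f on every element; the result is Some only if every call succeeds.
  def traverse[A, B](as: List[A])(f: A => Option[B]): Option[List[B]] =
    as.foldRight(Option(List.empty[B])) { (a, acc) =>
      for { b <- f(a); bs <- acc } yield b :: bs
    }

  // sequence is traverse with the identity function.
  def sequence[A](as: List[Option[A]]): Option[List[A]] =
    traverse(as)(oa => oa)

  val allGood: Option[List[Int]] = traverse(List("1", "2", "3"))(parseInt)
  val oneBad: Option[List[Int]] = traverse(List("1", "x", "3"))(parseInt)
}
```

A single unparseable element turns the whole result into None, which is the Option effect being threaded through the list.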
One way to think of traverse is as a generalization of sequence:
List(1,2,3).traverse(_.some)
//res8 = ???
List(1,2,3).map(_.some).traverse(identity)
//res9 = ???
List(1,2,3).map(_.some).sequence
//res10 = ???
Another is as a generalization of map:
import cats.syntax.traverse._
List(1,2,3).traverse[Id, Int]((x: Int) => x + 1)
//res11 = ???
Type signature of traverse in relation to map and flatMap:
def map(f: A => B) : F[A] => F[B]
def traverse(f: A => G[B]): F[A] => G[F[B]]
def flatMap(f: A => F[B]): F[A] => F[B]
We’re still mapping a function over some embedded value(s), like map, but similar to flatMap, the function is itself generating more structure.
However, unlike flatMap, the generated structure is of a different type than the embedded structure.
Finally, note the implied relationship between Foldable and Traverse.
Foldable does not extend Monoid, but has methods that rely on monoidal values (e.g. def fold[A: Monoid](fa: F[A]))
Traverse does not extend Applicative, but has methods that rely on applicative structures (e.g. def traverse[G[_]: Applicative, A, B](fa: F[A])(f: A => G[B]))
Traverse can also express foldMap and by extension foldLeft and foldRight.
Suppose that our G is a type constructor built from Const that takes any type to Int, so that Const[Int, A] throws away its type argument A and just gives us an Int:
import cats.data.Const
Const(1)
//res12: Const[Int,Nothing] = Const(1)
Const(1) map { (_: String) + "!" }
//res13: Const[Int,String] = Const(1)
When A forms a Semigroup, an Apply is derived, and when A forms a Monoid, an Applicative is derived automatically.
import cats.syntax.apply._
Const(2).retag[String => String] ap Const(1).retag[String]
//res14: Const[Int,String] = Const(3)
With a Const functor we can turn any Monoid into an Applicative, so with a Monoid[M] we should be able to get a foldMap from our traverse.
If we instantiate G to be Const[M, Nothing], traverse begins to look a lot like foldMap from Foldable:
def traverse[A,B]
(fa: F[A])
(f: A => Const[M, Nothing]): Const[M, F[Nothing]]
def foldMap [A,B]
(fa: F[A])
(f: A => M): M
def foldMap[A, M: Monoid, F[_]:Traverse]
(fa: F[A])
(f: A => M): Const[M,F[Nothing]] =
fa traverseU { (a: A) => Const((f(a))) }
foldMap(List('a', 'b', 'c')) { c: Char => c.toInt }
//res15: Const[Int,List[Nothing]] = Const(294)
foldMap(Nil) { c: Char => c.toInt }
//res16: Const[Int,List[Nothing]] = Const(0)
Note that we are using traverseU, which is the Unapply variant of traverse.
If we let our aggregator exit the Const then we get foldMap exactly:
def foldMap[A, M: Monoid, F[_]:Traverse]
(fa: F[A])
(f: A => M): M =
{
val x = fa traverseU { (a: A) => Const((f(a))) }
x.getConst
}
foldMap(List('a', 'b', 'c')) { c: Char => c.toInt }
//res17: Int = 294
Here is the cats implementation of traverse for List:
def traverse[G[_], A, B]
(fa: List[A])
(f: A => G[B])
(implicit G: Applicative[G]): G[List[B]] =
foldRight[A, G[List[B]]](fa, Always(G.pure(List.empty))) {
(a, glb) => G.map2Eval(f(a), glb)(_ :: _)
}.value
Note that map2Eval is similar to map2 but uses Eval to allow for laziness in the second argument.
Here is a standalone implementation using ap:
def myTraverse[G[_], A, B]
(fa: List[A])
(f: A => G[B])
(implicit G: Applicative[G]): G[List[B]] =
fa.foldRight[G[List[B]]](G.pure(List.empty)) {
val cons = (h: B, t: List[B]) => h :: t
(a, glb) => G.ap(f(a).map(cons.curried))(glb)
}
Recall the ap type signature: ap(f: F[A => B]): F[A] => F[B]. What is being applied here?
Let's do a specific example where A and B are Int, and G is Option:
val l = List(1,2,3)
val cons = (h: Int, t: List[Int]) => h :: t
Apply[Option].ap(None.map(cons.curried))(l.some)
//???
def f(i: Int): Option[Int] = if(i % 2 == 0) Some(i) else None
val baz = (a: Int, glb: Option[List[Int]]) =>
Apply[Option].ap(f(a).map(cons.curried))(glb)
baz(1, Some(l))
//???
baz(4, Some(l))
//???
Quick State monad refresher:
type IntState[A] = State[Int, A]
def pure[A](a: A): IntState[A] = State(s => (s, a ))
def set(s: Int): IntState[Unit] = State(_ => (s, ()))
val l = List(1,2,3)
val a = myTraverse(l)(pure)
a.run(0).value
//???
val b = myTraverse(l)(set)
b.run(0).value
//???
We'll do a more involved example of traversing with State in the tutorial.
Traverse Laws
Traverse[T[_]] has two laws.
There are many ways to state the laws; we present one way here, but you may see them in different configurations elsewhere.
The identity law for Traverse can be stated as:
sequence[Id,A](xs) == xs
The identity law says that traversing in the identity applicative (type Id[X] = X) has no effect.
The composition law for Traverse can be stated as:
sequence[Lambda[a => F[G[a]]],A](xs) ==
map(sequence[F,G[A]](xs))(sequence[G,A])
The composition law says that traversal in F[_] followed by traversal in G[_] can be fused into one traversal in the composite applicative F[G[_]].
Last week we tried to implement flatMap with a monad composition:
def flatMap[A, B](fa: M1[M2[A]])(f: A => M1[M2[B]]): M1[M2[B]]
We saw that we'd end up with something of type M1[M2[M1[M2[B]]]], which is normally irreducible.
However, if M2 is traversable then we can use its sequence method to swap the inner layers and get something of type M1[M1[M2[M2[B]]]], then use the joins from M1 and M2 to get a result of type M1[M2[B]].
The tutorial contains an exercise on combining two monads into one composite monad when the inner monad has a Traverse instance.
While the types always allow this, the result fails to meet the monad laws unless the traversable monad is also a commutative monad.
Two examples of commutative monads are Reader and Option.
Two examples of monads that are not commutative are List and State.
See the Wikipedia article on the "distributive law between monads" and a more in-depth article on nLab.
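The layer-swapping recipe described above can be sketched for one concrete pair, M1 = List and M2 = Option, using only the standard library. The names sequenceOption and flatMapLO are hypothetical, and this is a specialised illustration rather than the general construction.

```scala
object ComposeViaSequence {
  // sequence for Option in the List applicative:
  // swaps Option[List[X]] into List[Option[X]].
  def sequenceOption[X](ox: Option[List[X]]): List[Option[X]] =
    ox match {
      case None     => List(None)
      case Some(xs) => xs.map(x => Option(x))
    }

  // flatMap for the composite List[Option[_]].
  def flatMapLO[A, B](fa: List[Option[A]])(f: A => List[Option[B]]): List[Option[B]] =
    fa.map(oa => oa.map(f))          // List[Option[List[Option[B]]]]
      .map(o => sequenceOption(o))   // List[List[Option[Option[B]]]]
      .flatten                       // join for List
      .map(oob => oob.flatten)       // join for Option

  val result: List[Option[Int]] =
    flatMapLO(List(Option(1), None, Option(2)))(a => List(Option(a), Option(a * 10)))
}
```

Here result is List(Some(1), Some(10), None, Some(2), Some(20)): the missing value in the input stays missing, while each present value fans out through the List layer.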
Going back to our earlier Future example, we can write an Applicative instance for Future that runs each Future concurrently.
Then when we traverse a List[A] with an A => Future[B], we perform a scatter-gather operation.
Each A creates a concurrent computation that will produce a B (the scatter), and as the futures complete they will be gathered back into a list.
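A minimal sketch of scatter-gather, using the standard library's Future.traverse mentioned earlier rather than a Cats Applicative instance (an assumption made to keep the example dependency-free). The function fetchLength is a hypothetical stand-in for a remote call such as userInfo.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ScatterGather {
  // Hypothetical stand-in for a slow remote call.
  def fetchLength(name: String): Future[Int] = Future(name.length)

  // Scatter: one Future per element. Gather: a single Future holding
  // all results, in the same order as the input list.
  def lengths(names: List[String]): Future[List[Int]] =
    Future.traverse(names)(fetchLength)

  // Blocking with Await is for demonstration only.
  def run(): List[Int] =
    Await.result(lengths(List("ann", "bob", "carol")), 5.seconds)
}
```

The result order follows the input list rather than the order in which the individual futures happen to complete.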
import cats.Semigroup
import cats.data.{NonEmptyList, OneAnd, Xor}
import cats.data.{Validated, ValidatedNel}
import cats.std.list._
import cats.syntax.traverse._
def parseIntXor(s: String): Xor[NumberFormatException, Int] =
Xor.catchOnly[NumberFormatException](s.toInt)
val x1 = List("1", "2", "3").traverseU(parseIntXor)
//Right(List(1, 2, 3))
val x2 = List("1", "abc", "def").traverseU(parseIntXor)
//Left(NumberFormatException: For input string: "abc")
Traversal behavior is closely tied with the behavior of the underlying Applicative.
def parseIntValidated(s: String):
ValidatedNel[NumberFormatException, Int] =
Validated.catchOnly[NumberFormatException](s.toInt)
.toValidatedNel
List("1", "2", "3").traverseU(parseIntValidated)
//Valid(List(1, 2, 3))
val v3 = List("1", "abc", "def").traverseU(parseIntValidated)
//Invalid(OneAnd(NumberFormatException: For input string:
//"abc",List(NumberFormatException: For input string: "def")))
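The contrast between the two applicatives can be sketched with the standard library's Either standing in for both Xor and Validated. The helper names below are hypothetical, and the accumulating version is hand-written rather than derived from a Semigroup as in Cats.

```scala
import scala.util.Try

object FailFastVsAccumulate {
  def parse(s: String): Either[String, Int] =
    Try(s.toInt).toOption.toRight(s"not a number: $s")

  // Xor-like behaviour: the leftmost error wins, later errors are dropped.
  def failFast[A, E, B](as: List[A])(f: A => Either[E, B]): Either[E, List[B]] =
    as.foldRight[Either[E, List[B]]](Right(Nil)) { (a, acc) =>
      for { b <- f(a); bs <- acc } yield b :: bs
    }

  // Validated-like behaviour: every error is kept, in input order.
  def accumulate[A, E, B](as: List[A])(f: A => Either[E, B]): Either[List[E], List[B]] =
    as.foldRight[Either[List[E], List[B]]](Right(Nil)) { (a, acc) =>
      (f(a), acc) match {
        case (Right(b), Right(bs)) => Right(b :: bs)
        case (Left(e), Right(_))   => Left(List(e))
        case (Right(_), Left(es))  => Left(es)
        case (Left(e), Left(es))   => Left(e :: es)
      }
    }

  val inputs = List("1", "abc", "def")
  val first: Either[String, List[Int]] = failFast(inputs)(parse)
  val all: Either[List[String], List[Int]] = accumulate(inputs)(parse)
}
```

The traversal code has the same shape in both cases; only the way two failures are combined differs, which is exactly the part that the Applicative instance decides.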
Another interesting effect we can use is Reader. Recall that a Reader[E, A] is a type alias for Kleisli[Id, E, A] which is a wrapper around E => A.
If we fix E to be some sort of environment or configuration, we can use the Reader applicative in our traverse.
Imagine we have a data pipeline that processes a bunch of data, each piece of data being categorized by a topic.
Given a specific topic, we produce a Job that processes that topic:
import cats.data.Reader
trait Context
trait Topic
trait Result
type Job[A] = Reader[Context, A]
def processTopic(topic: Topic): Job[Result] = ???
We'd like to aggregate many topics (a List[Topic]) and compose the resulting Jobs together into one Job[List[Result]].
Since Reader has an Applicative instance, we can do this by traversing over the list with processTopic:
def processTopics(topics: List[Topic]) =
topics.traverse(processTopic)
//returns Job[List[Result]]
When our new job's run method is called, it will go through each topic and run its topic-specific job, collecting results as it goes.
Note that our job's run method has type signature Context => List[Result], so it requires a Context before producing the values we want.
For example, in Spark the information needed to run a Spark job (where the master node is, memory allocated, etc.) resides in a SparkContext.
Going back to the above example, we can see how one may define topic-specific Spark jobs (type Job[A] = Reader[SparkContext, A]) and then run several Spark jobs on a collection of topics via traverse.
We then get back a Job[List[Result]], which is equivalent to SparkContext => List[Result].
When finally passed a SparkContext, we can run the job and get our results back.
Moreover, the fact that our aggregate job is not tied to any specific SparkContext allows us to pass in a SparkContext pointing to a production cluster, or (using the exact same job) pass in a test SparkContext that just runs locally across threads. This makes testing large jobs easy.
Finally, this encoding ensures that all the jobs for each topic run on the exact same cluster. At no point do we manually pass in or thread a SparkContext through - that is taken care of for us by the (applicative) effect of Reader and therefore by traverse.
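The way traverse threads one shared context through every job can be sketched with plain functions standing in for Reader. Everything here is a hypothetical simplification: Context is a small case class, topics are Strings, and Job[A] is just Context => A rather than the Kleisli-based Reader.

```scala
object ReaderTraverse {
  // Hypothetical context; a real one might be a SparkContext.
  final case class Context(multiplier: Int)

  // Reader reduced to its essence: a function awaiting a Context.
  type Job[A] = Context => A

  // Hypothetical topic-specific job.
  def processTopic(topic: String): Job[Int] =
    ctx => topic.length * ctx.multiplier

  // traverse for List in the Reader applicative: build one job that,
  // when given a context, hands that same context to every element's job.
  def traverseJob[A, B](as: List[A])(f: A => Job[B]): Job[List[B]] =
    ctx => as.map(a => f(a)(ctx))

  val job: Job[List[Int]] = traverseJob(List("ab", "cde"))(processTopic)

  // Nothing runs until a context is supplied.
  val testRun: List[Int] = job(Context(1))
  val prodRun: List[Int] = job(Context(2))
}
```

The same job value is run twice with different contexts, mirroring the test-cluster versus production-cluster scenario described above.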
The Essence of the Iterator Pattern, by Jeremy Gibbons and Bruno Oliveira, published in Mathematically-Structured Functional Programming, 2006.
An Investigation of the Laws of Traversals, by Mauro Jaskelioff and Ondrej Rypacek, published in Mathematically-Structured Functional Programming, 2012.
Typeclassopedia is a great resource on Monad, Applicative, Traverse and other type classes.