函数式编程与并行计算

并行计算的方式

  • 进程
  • 线程
  • 协程(coroutine)/纤程(fiber)
  • Future/Task
  • Async/Await
  • Actor
  • … …

Oh, no! 我该怎么选?

请注意:这里不是在讨论“具体的技术实现”,显然任何形式的并行处理都一定是基于进程和线程实现的。这里要讨论的问题是,什么“思维方式”才可以在多种场景下始终如一的应用?显然在分布式/流式场景下,底层的实现不需要特别讨论(尤其在使用已有框架时),而更需要思考的是“数据的处理方式”,这里的处理方式即“思维方式”的映射。

From Akka:

We believe that writing correct concurrent & distributed, resilient and elastic applications is too hard. Most of the time it’s because we are using the wrong tools and the wrong level of abstraction.

理想世界中,所有的数据类型都应该可以被map,而现实中不是,所以我们需要把本来不存在map操作的数据结构变成可以应用map操作的数据结构。

Functor可以简单视作对于map的抽象

Formally, a functor is a type F[A] with an operation map with type (A => B) => F[B].

trait Functor[F[_]]:
  extension [A, B](x: F[A])
    def map(f: A => B): F[B]

Functor Laws:

Functors guarantee the same semantics whether we sequence many small operations one by one, or combine them into a larger function before mapping. To ensure this is the case the following laws must hold:

  • Identity: calling map with the identity function is the same as doing nothing:

fa.map(a => a) == fa

  • Composition: mapping with two functions f and g is the same as mapping with f and then mapping with g:

fa.map(g(f(_))) == fa.map(f).map(g)

map之外,当然还有flatMap

一个单子(Monad)说白了不过就是自函子范畴上的一个幺半群而已(🐶

Monadic behaviour is formally captured in two operations: pure and flatMap:

trait Monad[F[_]] extends Functor[F]:

  def pure[A](x: A): F[A]

  extension [A, B](x: F[A])
    def flatMap(f: A => F[B]): F[B]

    def map(f: A => B): F[B] = x.flatMap(f.andThen(pure))

Monad Laws:

pure and flatMap must obey a set of laws that allow us to sequence operations freely without unintended glitches and side-effects:

  • Left identity: calling pure and transforming the result with func is the same as calling func:

pure(a).flatMap(func) == func(a)

  • Right identity: passing pure to flatMap is the same as doing nothing:

m.flatMap(pure) == m

  • Associativity: flatMapping over two functions f and g is the same as flatMapping over f and then flatMapping over g:

m.flatMap(f).flatMap(g) == m.flatMap(x => f(x).flatMap(g))

说好的并行计算呢?

???

这里只讲了“语义”上的“并行”,绝大多数情况下,我们都不需要手动实现一些并行机制,而更多的是使用已有的工具。

如果你需要为新的数据结构实现新的并行机制,可以参考:

Thanks