F# 编程/缓存

F#：缓存

缓存通常用于重用已计算的数据。F# 提供了许多内置技术来缓存数据以供将来使用。

偏函数

F# 会自动缓存任何不带参数的函数的值。当 F# 遇到一个不带参数的函数时，F# 只会评估该函数一次，并在每次访问该函数时重复使用它的值。比较以下内容

let isNebraskaCity_bad city =
    let cities =
        printfn "Creating cities Set"
        ["Bellevue"; "Omaha"; "Lincoln"; "Papillion"]
        |> Set.ofList
        
    cities.Contains(city)
    
let isNebraskaCity_good =
    let cities =
        printfn "Creating cities Set"
        ["Bellevue"; "Omaha"; "Lincoln"; "Papillion"]
        |> Set.ofList
        
    fun city -> cities.Contains(city)

这两个函数接受和返回相同的值，但它们的行为却大不相同。以下是 fsi 中输出的比较

isNebraskaCity_bad

isNebraskaCity_good

> let isNebraskaCity_bad city =
    let cities =
        printfn "Creating cities Set"
        ["Bellevue"; "Omaha"; "Lincoln"; "Papillion"]
        |> Set.of_list
        
    cities.Contains(city);;

val isNebraskaCity_bad : string -> bool

> isNebraskaCity_bad "Lincoln";;
Creating cities Set
val it : bool = true

> isNebraskaCity_bad "Washington";;
Creating cities Set
val it : bool = false

> isNebraskaCity_bad "Bellevue";;
Creating cities Set
val it : bool = true

> isNebraskaCity_bad "St. Paul";;
Creating cities Set
val it : bool = false

> let isNebraskaCity_good =
    let cities =
        printfn "Creating cities Set"
        ["Bellevue"; "Omaha"; "Lincoln"; "Papillion"]
        |> Set.of_list
        
    fun city -> cities.Contains(city);;
Creating cities Set

val isNebraskaCity_good : (string -> bool)

> isNebraskaCity_good "Lincoln";;
val it : bool = true

> isNebraskaCity_good "Washington";;
val it : bool = false

> isNebraskaCity_good "Bellevue";;
val it : bool = true

> isNebraskaCity_good "St. Paul";;
val it : bool = false

isNebraskaCity_bad 的实现强制 F# 在每次调用时重新创建内部集合。另一方面，isNebraskaCity_good 是一个初始化为函数 fun city -> cities.Contains(city) 的值，因此它只创建一次内部集合，并在所有后续调用中重复使用它。

注意： 在内部，isNebraskaCity_bad 编译为一个静态函数，该函数在每次调用时都会构造一个集合。isNebraskaCity_good 编译为一个静态只读属性，其值在静态构造函数中初始化。

这种区别通常很微妙，但它可能会对应用程序的性能产生巨大影响。

记忆化

"记忆化" 是一个花哨的词，意思是计算出的值存储在查找表中，而不是在每次后续调用时重新计算。运行时间长的纯函数（即没有副作用的函数）是记忆化的良好候选者。考虑计算斐波那契数的递归定义

> #time;;

--> Timing now on

> let rec fib n =
    if n = 0I then 0I
    elif n = 1I then 1I
    else fib (n - 1I) + fib(n - 2I);;
Real: 00:00:00.000, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0

val fib : Math.bigint -> Math.bigint

> fib 35I;;
Real: 00:00:23.557, CPU: 00:00:23.515, GC gen0: 2877, gen1: 3, gen2: 0
val it : Math.bigint = 9227465I

超过 fib 35I，函数的运行时间变得难以忍受。对 fib 函数的每次递归调用都会丢弃所有对 fib(n - 1I) 和 fib(n - 2I) 的中间计算，使其运行时间复杂度约为 O(2^n)。如果我们将所有这些中间计算保存在查找表中会怎样？以下是 fib 函数的记忆化版本

> #time;;

--> Timing now on

> let rec fib =
    let dict = new System.Collections.Generic.Dictionary<_,_>()
    fun n ->
        match dict.TryGetValue(n) with
        | true, v -> v
        | false, _ -> 
            let temp =
                if n = 0I then 0I
                elif n = 1I then 1I
                else fib (n - 1I) + fib(n - 2I)
            dict.Add(n, temp)
            temp;;

val fib : (Math.bigint -> Math.bigint)

> fib 35I;;
Real: 00:00:00.000, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
val it : Math.bigint = 9227465I

好多了！这个版本的 fib 函数几乎可以立即运行。事实上，由于我们只计算 fib(n) 的值一次，字典查找是一个 O(1) 操作，因此这个 fib 函数的运行时间为 O(n)。

注意所有记忆化逻辑都包含在 fib 函数中。我们可以编写一个更通用的函数来记忆化任何函数

let memoize f =
    let dict = new System.Collections.Generic.Dictionary<_,_>()
    fun n ->
        match dict.TryGetValue(n) with
        | (true, v) -> v
        | _ ->
            let temp = f(n)
            dict.Add(n, temp)
            temp
            
let rec fib = memoize(fun n ->
    if n = 0I then 0I
    elif n = 1I then 1I
    else fib (n - 1I) + fib (n - 2I) )

注意： 非常重要的是要记住上面的实现不是线程安全的 - 如果字典将被多个线程访问，则应在添加/检索项目之前锁定字典。

此外，虽然字典查找在恒定时间内发生，但字典使用的哈希函数可能需要任意长的时间才能执行（这在字符串中尤其如此，其中哈希字符串所需的时间与其长度成正比）。因此，记忆化的函数完全有可能比非记忆化的函数性能更差。始终分析代码以确定是否需要优化，以及记忆化是否真正提高了性能。

延迟值

F# 的 lazy 数据类型是一个有趣的原语，它会延迟对值的评估，直到实际需要该值为止。一旦计算出来，延迟值就会被缓存以供以后重复使用

> let x = lazy(printfn "I'm lazy"; 5 + 5);;

val x : Lazy<int> = <unevaluated>

> x.Force();; (* Should print "I'm lazy" *)
I'm lazy
val it : int = 10

> x.Force();; (* Value already computed, should not print "I'm lazy" again *)
val it : int = 10

F# 使用一些编译器魔法来避免在声明时评估表达式 (printfn "I'm lazy"; 5 + 5)。延迟值可能是最简单的缓存形式，但它们可以用来创建一些有趣而复杂的数据结构。例如，两个 F# 数据结构是在延迟值之上实现的，即 F# 的延迟列表和 Seq.cache 方法。

延迟列表和缓存的序列表示可能无限数量元素的任意序列。这些元素在第一次访问时被计算并缓存，但在再次枚举序列时不会被重新计算。以下是 fsi 中的演示

> let x = seq { for a in 1 .. 10 -> printfn "Got %i" a; a } |> Seq.cache;;

val x : seq<int>

> let y = Seq.take 5 x;;

val y : seq<int>

> Seq.reduce (+) y;;
Got 1
Got 2
Got 3
Got 4
Got 5
val it : int = 15

> Seq.reduce (+) y;; (* Should not recompute values *)
val it : int = 15

> Seq.reduce (+) x;; (* Values 1 through 5 already computed, should only compute 6 through 10 *)
Got 6
Got 7
Got 8
Got 9
Got 10
val it : int = 55

前一页：度量单位

索引

下一页：主动模式