Haskell/性能示例

性能示例 (解决方案)

Haskell 性能

简介
循序渐进的示例
图归约
惰性
时间和空间分析
严格性
算法复杂度
数据结构
 并行性

编辑本章节

目标：用实际发生的示例逐步解释优化。

紧凑循环

dons：像 C 一样快速编写 Haskell：利用严格性、惰性和递归。

CSV 解析

haskell-cafe：另一个新手性能问题我希望他不介意我在这里发布他的代码，我还是得问他。 -- apfeλmus 08:46, 18 May 2008 (UTC)

type CSV = [[String]]

main = do
                  args <- getArgs
                  file <- readFile (head args)
                  writeFile (head args ++ "2") (processFile (args !! 1) file)

processFile s     = writeCSV . doInteraction s . readCSV
doInteraction line csv = insertLine (show line) (length csv - 1) csv
writeCSV          = (\x -> x ++ "\n") . concat . intersperse "\n" . (map (concat . intersperse "," . (map show)))
insertLine line pos csv = (take pos csv) ++ [readCSVLine line] ++ drop pos csv
readCSVLine       = read . (\x -> "["++x++"]")
readCSV           = map readCSVLine . lines

我认为在邮件列表中还有另一个 cvs 解析线程我认为很合适，但我记不起来了。

空间泄漏

jkff 在 #haskell 中询问了一些分析日志文件的代码。基本上，它是在构建直方图

foldl' (\m (x,y) -> insertWith' x (\[y] ys -> y:ys) [y] m) M.empty
  [(ByteString.copy foo, ByteString.copy bar) | (foo,bar) <- map (match regex) lines]

输入是一个 1GB 的日志文件，程序占用了可用内存，主要是因为 ByteString.copy 没有被强制，整个文件都停留在内存中。

分支预测失败

此代码演示了您的 CPU 可能在预测代码分支（即 if 子句）时成功或失败。该代码生成了一个很大的随机 Int 的 Vector。由此，我们计算所有大于 999 的数字之和。如果数据已排序，CPU 可以轻松推断出下一个分支的可能结果。如果数据未排序，这将更加困难且成功率更低。不幸的是，为一个目的而排序很少是一个好主意。该代码使用 criterion 包进行时间测量。例如，使用 -Wall -keep-llvm-files -O2 -optlo-O3 -fllvm -rtsopts -threaded -eventlog -feager-blackholing -fmax-simplifier-iterations=20 -fsimplifier-phases=3 -fno-liberate-case -fspecialise-aggressively 编译此代码。这个想法来自 github.com/Kobzol/hardware-effects。（tkx68hamburg）

{-# LANGUAGE FlexibleInstances, RankNTypes, BangPatterns, CPP, PartialTypeSignatures, UnicodeSyntax, OverloadedStrings #-}
{-# OPTIONS_GHC -fno-warn-orphans #-}
module Main (main) where

import           Control.DeepSeq
import           Control.Exception (evaluate)
import           Control.Monad.ST
import           Criterion.Main
import qualified Data.Vector.Algorithms.Radix as Radix
import qualified Data.Vector.Unboxed as UV
import           System.Random

#ifdef DEFINE_NFDATA_VECTOR
instance (NFData a) => NFData (V.Vector a)              where
  rnf v = V.foldr (\a b -> rnf a `seq` b) () v
#endif

setupUnsorted :: IO [UV.Vector Int]
setupUnsorted = do
  small <- randomSampleUVVector 1024
  med <- randomSampleUVVector $ 1024*1024
  large <- randomSampleUVVector $ 20*1024*1024
  return [small, med, large]

setupSorted :: IO [UV.Vector Int]
setupSorted = fmap (fmap sortUVec) setupUnsorted

main :: IO ()
main = do
  defaultMainWith
    defaultConfig [
      env setupUnsorted
          (\ ~[small, med, large] -> bgroup "Branch without sort" [
             bench "small" $ whnf core1 small,
             bench "medium" $ whnf core1 med,
             bench "large" $ whnf core1 large
             ]
          ),
      env setupUnsorted
          (\ ~[small, med, large] -> bgroup "Branch without sort using filter" [
             bench "small" $ whnf core1a small,
             bench "medium" $ whnf core1a med,
             bench "large" $ whnf core1a large
             ]
          ),
      env setupUnsorted
          (\ ~[small, med, large] -> bgroup "Branch including sort" [
             bench "small" $ whnf core2 small,
             bench "medium" $ whnf core2 med,
             bench "large" $ whnf core2 large
             ]
          ),
      env setupSorted
          (\ ~[small, med, large] -> bgroup "Branch after sort" [
             bench "small" $ whnf core1 small,
             bench "medium" $ whnf core1 med,
             bench "large" $ whnf core1 large
             ]
          )
    ]

{- INLINE -}
core1 :: UV.Vector Int -> Int
core1 = UV.foldl' condSum 0

{- INLINE -}
core2 :: UV.Vector Int -> Int
core2 = UV.foldl' condSum 0 . sortUVec

{- INLINE -}
core1a :: UV.Vector Int -> Int
core1a = UV.sum . (UV.filter (>999))

{- INLINE -}
condSum :: Int -> Int -> Int
condSum !x !y = if y>999 then x+y else x

randomSampleUVVector :: Int -> IO (UV.Vector Int)
randomSampleUVVector i = evaluate $ force $ UV.fromList (take i (randoms (mkStdGen 0) :: [Int]))

sortUVec :: forall a . (Ord a, Radix.Radix a, UV.Unbox a) => UV.Vector a -> UV.Vector a
sortUVec vec =
  runST
    (do mv <- UV.thaw vec
        Radix.sort mv
        UV.unsafeFreeze mv)

性能示例

练习解答

Haskell 性能

简介 >> 循序渐进的示例 >> 图归约 >> 惰性 >> 时间和空间分析 >> 严格性 >> 算法复杂度 >> 数据结构 >> 并行性

编辑本章节

Haskell

Haskell 基础 >> 基础 Haskell >> 中级 Haskell >> 单子
 高级 Haskell >> 类型玩乐 >> 更广泛的理论 >> Haskell 性能

库参考 >> 通用实践 >> 专业任务

编辑书籍结构