
Rebol Programming/invalid-utf?

From Wikibooks, open books for an open world
INVALID-UTF? data /utf num 

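Checks whether the data is properly UTF-encoded. Returns NONE if the encoding is valid; otherwise returns the position where the error occurred.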

INVALID-UTF? is a function value.

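  • data -- (Type: binary)
  • /utf -- Check encodings other than UTF-8
    • num -- Bit size (8, 16, -16, 32 or -32): positive for big-endian (BE), negative for little-endian (LE) (Type: integer)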
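The calls below are a minimal usage sketch, assuming a Rebol 2 console in which INVALID-UTF? is defined as in the source listing on this page. The sample binaries are illustrative values, not taken from the original documentation, and the results noted in the comments are what that listing should produce.

```rebol
; Well-formed UTF-8 ("é" is encoded as C3 A9): the result should be NONE.
print mold invalid-utf? #{48C3A96C6C6F}

; FF can never start a UTF-8 sequence: the result should be the
; binary positioned at the offending byte, i.e. #{FF6C}.
print mold invalid-utf? #{48FF6C}

; UTF-16 "A", big-endian (num = 16) and little-endian (num = -16):
; both should give NONE.
print mold invalid-utf?/utf #{0041} 16
print mold invalid-utf?/utf #{4100} -16

; A high surrogate (D800) with no low surrogate after it:
; the result should be the binary positioned at that code unit.
print mold invalid-utf?/utf #{D800} 16
```

Because a valid input yields NONE, the result can be used directly in a condition, for example `if pos: invalid-utf? data [...]` to handle the error position.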

(SPECIAL ATTRIBUTES)

  • catch

Source code

invalid-utf?: func [
    {Checks for proper UTF encoding and returns NONE if correct or position where the error occurred.} 
    [catch] 
    data [binary!] 
    /utf "Check encodings other than UTF-8" 
    num [integer!] "Bit size - positive for BE negative for LE" /local 
    ascii 
    utf8+1 
    utf8+2 
    utf8+3 
    utf8rest pos 
    hi lo w c
][
    ascii: make bitset! #{
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF00000000000000000000000000000000
} 
    utf8+1: make bitset! #{
000000000000000000000000000000000000000000000000FCFFFFFF00000000
} 
    utf8+2: make bitset! #{
00000000000000000000000000000000000000000000000000000000FFFF0000
} 
    utf8+3: make bitset! #{
0000000000000000000000000000000000000000000000000000000000001F00
} 
    utf8rest: make bitset! #{
00000000000000000000000000000000FFFFFFFFFFFFFFFF0000000000000000
} 
    switch/default any [num 8] [
        8 [
            unless parse/all/case data [(pos: none) any [
                    pos: ascii | utf8+1 utf8rest | 
                    utf8+2 2 utf8rest | utf8+3 3 utf8rest
                ]] [as-binary pos]
        ] 
        16 [
            pos: data 
            while [not tail? pos] [
                hi: first pos 
                case [
                    none? lo: pick pos 2 [break/return pos] 
                    55296 > w: hi * 256 + lo [pos: skip pos 2] 
                    57343 < w [pos: skip pos 2] 
                    56319 < w [break/return pos] 
                    none? hi: pick pos 3 [break/return pos] 
                    none? lo: pick pos 4 [break/return pos] 
                    56320 > w: hi * 256 + lo [break/return pos] 
                    57343 >= w [pos: skip pos 4]
                ] 
                none
            ]
        ] 
        -16 [
            pos: data 
            while [not tail? pos] [
                lo: first pos 
                case [
                    none? hi: pick pos 2 [break/return pos] 
                    55296 > w: hi * 256 + lo [pos: skip pos 2] 
                    57343 < w [pos: skip pos 2] 
                    56319 < w [break/return pos] 
                    none? lo: pick pos 3 [break/return pos] 
                    none? hi: pick pos 4 [break/return pos] 
                    56320 > w: hi * 256 + lo [break/return pos] 
                    57343 >= w [pos: skip pos 4]
                ] 
                none
            ]
        ] 
        32 [
            pos: data 
            while [not tail? pos] [
                if any [
                    4 > length? pos 
                    negative? c: to-integer pos 
                    1114111 < c
                ] [break/return pos]
            ]
        ] 
        -32 [
            pos: data 
            while [not tail? pos] [
                if any [
                    4 > length? pos 
                    negative? c: also to-integer reverse/part pos 4 reverse/part pos 4 
                    1114111 < c
                ] [break/return pos]
            ]
        ]
    ] [
        throw-error 'script 'invalid-arg num
    ]
]
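The five bitsets in the listing are easier to follow once decoded into byte ranges. The sketch below reads them under the assumption that Rebol 2 stores bitsets least-significant-bit first. The two sample binaries are illustrative and suggest that the UTF-8 branch checks sequence structure only (a lead byte followed by the right number of continuation bytes).

```rebol
; Byte classes read off the bitsets (assumption: Rebol 2 bit order):
;   ascii     00-7F   single-byte sequence
;   utf8+1    C2-DF   lead byte followed by 1 continuation byte
;   utf8+2    E0-EF   lead byte followed by 2 continuation bytes
;   utf8+3    F0-F4   lead byte followed by 3 continuation bytes
;   utf8rest  80-BF   continuation byte

; C0 belongs to none of the lead-byte classes, so this overlong
; encoding should be rejected at its first byte.
print mold invalid-utf? #{C080}

; ED A0 80 (the surrogate U+D800 encoded as three bytes) matches the
; "utf8+2 2 utf8rest" rule, so the listing should accept it and return NONE.
print mold invalid-utf? #{EDA080}
```

Callers that need strict validation (rejecting encoded surrogates, or overlong three- and four-byte forms) would have to add those range checks themselves; this follows from the rules shown in the listing rather than from any separate documentation.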