Introducing Julia/Strings and characters

Strings and characters

Strings

A string is a sequence of one or more characters, usually enclosed in double quotes.

"this is a string"

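There are two important things to know about strings.

First, they are immutable. Once created, a string cannot be changed. It is easy, though, to create new strings from parts of existing ones.

Second, take care with two particular characters: the double quote (") and the dollar sign ($). To include a double quote character in a string, precede it with a backslash; otherwise the rest of the string is interpreted as Julia code, with potentially interesting results. To include a dollar sign ($) in a string, precede it with a backslash too, because it is used for string interpolation.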

julia> demand = "You owe me \$50!"
"You owe me \$50!"

julia> println(demand)
You owe me $50!

julia> demandquote = "He said, \"You owe me \$50!\""
"He said, \"You owe me \$50!\""

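Strings can also be enclosed in triple double quotes. The advantage is that you can use ordinary double quotes inside the string without having to precede them with backslashes.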

julia> """this is "a" string"""
"this is \"a\" string"

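You will also encounter some special string types, written as one or more characters immediately followed by the opening double quote.

r" " indicates a regular expression
v" " indicates a version string
b" " indicates a byte literal
raw" " indicates a raw string that performs no interpolation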
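
As a brief illustration of the four prefixes listed above, here is a minimal sketch. The sample path, pattern, and version numbers are made up for the example.

```julia
# raw"..." keeps backslashes and dollar signs exactly as typed.
path = raw"C:\temp\$HOME"
println(path)                          # prints C:\temp\$HOME

# r"..." builds a Regex object.
println(occursin(r"^\d+$", "12345"))   # true: the string is all digits

# v"..." builds a VersionNumber, compared component by component.
println(v"1.10.0" > v"1.9.2")          # true, although "1.10.0" < "1.9.2" as plain strings

# b"..." gives the bytes (UInt8 values) of the literal.
println(collect(b"abc"))               # UInt8[0x61, 0x62, 0x63]
```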

String interpolation

You often want to use the results of Julia expressions inside strings. For example, suppose you want to say

"The value of x is n."

where n is the current value of x. Any Julia expression can be inserted into a string with the $() construction.

julia> x = 42
42

julia> "The value of x is $(x)."
"The value of x is 42."

You don't need the parentheses if you're just using the name of a variable.

julia> "The value of x is $x."
"The value of x is 42."

To include the result of a Julia expression in a string, enclose the expression in parentheses first, then precede it with a dollar sign.

julia> "The value of 2 + 2 is $(2 + 2)."
"The value of 2 + 2 is 4."

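Substrings

To extract a smaller string from a string, use getindex(s, range) or the s[range] syntax. For basic ASCII strings, you can use the same techniques that you use to extract elements from arrays.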

julia> s ="a load of characters"
"a load of characters"

julia> s[1:end]
"a load of characters"

julia> s[3:6]
"load"

julia> s[3:end-6]
"load of char"

which is equivalent to

julia> s[begin+2:end-6]
"load of char"

You can easily iterate over a string.

for char in s
    print(char, "_")
end
a_ _l_o_a_d_ _o_f_ _c_h_a_r_a_c_t_e_r_s_

Watch out if you take a single element from a string, rather than a string of length 1 (that is, a range with the same start and end position).

julia> s[1:1]
"a" 

julia> s[1]
'a'

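The second result isn't a string but a character (enclosed in single quotes).

Unicode strings

Not all strings are ASCII. To access individual characters in a Unicode string, you can't always use simple indexing, because some characters occupy more than one index position. Don't be misled just because some index numbers appear to work.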

julia> su = "AéB𐅍CD"
"AéB𐅍CD"

julia> su[1]
'A'

julia> su[2]
'é'

julia> su[3]
ERROR: StringIndexError: invalid index [3], valid nearby indices [2]=>'é', [4]=>'B'

Use length(str) to count the characters in a string. To find the index of the last character, use lastindex(str) rather than length(str).

julia> length(su)
6

julia> lastindex(su)
10

The isascii() function tests whether a string is pure ASCII or contains non-ASCII Unicode characters.

julia> isascii(su)
false

In this string, the "second" character, é, occupies 2 bytes, and the "fourth" character, 𐅍, occupies 4 bytes.

for i in eachindex(su)
    println(i, " -> ", su[i])
end

1 -> A
2 -> é
4 -> B
5 -> 𐅍
9 -> C
10 -> D

The "third" character, B, starts at the 4th element in the string.

You can also do this more easily with the pairs() function.

for pair in pairs(su)
    println(pair)
end

1 => A
2 => é
4 => B
5 => 𐅍
9 => C
10 => D

Alternatively, use the eachindex iterator.

for charindex in eachindex(su)
    @show su[charindex]
end
su[charindex] = 'A'
su[charindex] = 'é'
su[charindex] = 'B'
su[charindex] = '𐅍'
su[charindex] = 'C'
su[charindex] = 'D'

Other useful functions for working with strings like this include collect(), thisind(), nextind(), and prevind().

julia> collect(su)
 6-element Array{Char,1}:
 'A'
 'é'
 'B'
 '𐅍'
 'C'
 'D'

for i in 1:10
    print(thisind(su, i), " ")
end

1 2 2 4 5 5 5 5 9 10
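
To show how nextind() and prevind() fit in, here is a minimal sketch that steps through the same string by hand. The helper name char_starts is hypothetical, introduced only for this example.

```julia
su = "AéB𐅍CD"

# Hypothetical helper: collect the index at which each character starts.
function char_starts(s)
    starts = Int[]
    i = firstindex(s)
    while i <= lastindex(s)
        push!(starts, i)
        i = nextind(s, i)       # jump to the start of the next character
    end
    return starts
end

println(char_starts(su))            # [1, 2, 4, 5, 9, 10]

# nextind(su, 0, n) gives the index of the n-th character.
third = su[nextind(su, 0, 3)]       # 'B', stored at index 4

# prevind() steps back to the start of the previous character.
before_c = prevind(su, 9)           # 5, where '𐅍' starts
```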

Splitting and joining strings

You can glue strings together with the multiplication (*) operator (a process usually called concatenation).

julia> "s" * "t"
"st"

If you've used other programming languages, you might expect to use the addition (+) operator.

julia> "s" + "t"
ERROR: MethodError: no method matching +(::String, ::String)

So use * instead.

If you can "multiply" strings, you can also raise them to a power.

julia> "s" ^ 18
"ssssssssssssssssss"

You can also use string()

julia> string("s", "t")
"st"

But if you want to do a lot of concatenation, inside a loop for example, it is better to use the string buffer approach (see below).
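
As a rough sketch of that buffer approach (the Streams section covers it in more detail), the following builds one string from many pieces. The helper name join_with_semicolons and the sample words are made up for the example.

```julia
# Hypothetical helper: append many pieces to an in-memory buffer,
# then turn the buffer into a String once at the end.
function join_with_semicolons(words)
    io = IOBuffer()
    for w in words
        print(io, w, ';')
    end
    return String(take!(io))
end

println(join_with_semicolons(["alpha", "beta", "gamma"]))   # alpha;beta;gamma;
```

Each print() call appends to the buffer, so the loop avoids creating a new intermediate string on every iteration, which is what repeated use of * would do.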

To split a string, use the split() function. Given this simple string

julia> s = "You know my methods, Watson."
"You know my methods, Watson."

a simple call to split() divides the string at the spaces, returning an array of five elements.

julia> split(s)
5-element Array{SubString{String},1}:
"You"
"know"
"my"
"methods,"
"Watson."

Or you can specify a string of one or more characters to split on.

julia> split(s, "e")
2-element Array{SubString{String},1}:
"You know my m"
"thods, Watson."

julia> split(s, " m")
3-element Array{SubString{String},1}:
"You know"    
"y"       
"ethods, Watson."

The characters you use to do the splitting don't appear in the final result.

julia> split(s, "hod")
2-element Array{SubString{String},1}:
"You know my met"
"s, Watson."

To split a string into separate single-character strings, use the empty string (""), which splits the string between the characters.

julia> split(s,"")
28-element Array{SubString{String},1}:
"Y"
"o"
"u"
" "
"k"
"n"
"o"
"w"
" "
"m"
"y"
" "
"m"
"e"
"t"
"h"
"o"
"d"
"s"
","
" "
"W"
"a"
"t"
"s"
"o"
"n"
"."

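You can also split strings using a regular expression to define the split points. Use the special regex string construction r" ". Inside it, you can use regex characters with their special meanings.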

julia> split(s, r"a|e|i|o|u")
8-element Array{SubString{String},1}:
"Y"
""
" kn"
"w my m"
"th"
"ds, W"
"ts"
"n."

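Here, r"a|e|i|o|u" is a regular expression string and, if you like regular expressions, you'll know that it matches any lowercase vowel. So the resulting array consists of the string split at every vowel. Notice the empty string in the result. If you don't want empty strings, add the keepempty=false keyword argument.

julia> split(s, r"a|e|i|o|u", keepempty=false)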
7-element Array{SubString{String},1}:
"Y"   
" kn"  
"w my m"
"th"  
"ds, W" 
"ts"  
"n."

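If you want to keep the vowels, rather than use them up as split points, you have to delve deeper into the world of regex literal strings. Read on.

You can join the elements of a split string in array form using join().

julia> join(split(s, r"a|e|i|o|u", keepempty=false), "aiou")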
"Yaiou knaiouw my maiouthaiouds, Waioutsaioun."

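Splitting using a function

Many functions in Julia let you pass a function as part of the call. Anonymous functions are useful here, because you can build the selection logic directly into the function call. For example, split() lets you supply a function in place of the delimiter characters. In the next example, the delimiters are (oddly) specified as any uppercase character whose ASCII code is a multiple of 8.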

julia> split(join(Char.(65:90)),  c -> Int(c) % 8 == 0)
4-element Array{SubString{String},1}:
 "ABCDEFG"
 "IJKLMNO"
 "QRSTUVW"
 "YZ"

Character objects

Above, we extracted smaller strings from larger strings.

julia> s[1:1]
"a"

But when we extracted a single element from the string

julia> s[1]
'a'

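Notice the single quotes. In Julia, these mark character objects, so 'a' is a character object, but "a" is a string of length 1. The two are not equivalent.

You can easily convert character objects to strings.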

julia> string('s') * string('d')
"sd"

or

julia> string('s', 'd')
"sd"

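It's easy to enter 32-bit Unicode characters with the `\U` escape sequence (upper case means 32 bits). The lower case escape sequence `\u` can be used for 16-bit and 8-bit characters.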

julia> ('\U1014d', '\u2640', '\u26')
('𐅍','♀','&')

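For strings, the `\Uxxxxxxxx` and `\uxxxx` syntax is stricter: the escape consumes as many hexadecimal digits as it can (up to 8 for `\U`, up to 4 for `\u`), so pad the code with leading zeros when a hexadecimal digit follows it.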

julia> "\U0001014d2\U000026402\u26402\U000000a52\u00a52\U000000352\u00352\x352"
"𐅍2♀2♀2¥2¥2525252"

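Converting between numbers and strings

Converting integers to strings is the job of the `string()` function. The `base` keyword lets you specify the number base for the conversion, so you can use it to convert decimal numbers to binary, octal, or hexadecimal strings.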

julia> string(11, base=2)
"1011"

julia> string(11, base=8)
"13"

julia> string(11, base=16)
"b"

julia> string(11)
"11"

julia> a = BigInt(2)^200
1606938044258990275541962092341162602522202993782792835301376

julia> string(a)
"1606938044258990275541962092341162602522202993782792835301376"

julia> string(a, base=16)
"1000000000000000000000000000000000000000000000000"

To convert a string to a number, use `parse()`. You can also specify a number base (such as binary or hexadecimal) if the string should be interpreted in that base.

julia> parse(Int, "100")
100

julia> parse(Int, "100", base=2)
4

julia> parse(Int, "100", base=16)
256

julia> parse(Float64, "100.32")
100.32

julia> parse(Complex{Float64}, "0 + 1im")
0.0 + 1.0im
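
`parse()` throws an error when the string isn't a valid number. When the input may be malformed, `tryparse()` returns `nothing` instead. The following is a minimal sketch; the helper name to_int and its default keyword are hypothetical, introduced only for illustration.

```julia
# Hypothetical helper: parse an integer, falling back to a default
# when the text is not a valid number.
function to_int(str; default = 0)
    value = tryparse(Int, str)      # returns nothing instead of throwing
    return value === nothing ? default : value
end

println(to_int("100"))                   # 100
println(to_int("12a"))                   # 0
println(to_int("oops", default = -1))    # -1
println(tryparse(Float64, "100.32"))     # 100.32
```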

Converting characters to integers and back

`Int()` converts a character to an integer, and `Char()` converts an integer to a character.

julia> Char(8253)
'‽': Unicode U+203d (category Po: Punctuation, other)

julia> Char(0x203d) # the Interrobang is Unicode U+203d in hexadecimal
'‽': Unicode U+203d (category Po: Punctuation, other)

julia> Int('‽')
8253

julia> string(Int('‽'), base=16)
"203d"

To go from a single-character string to its code number (such as its ASCII or UTF code number), try this

julia> Int("S"[1])
83

A quick alphabet

julia> string.(Char.("A"[1]:"Z"[1])) |> collect 
26-element Array{String,1}:
 "A"
 "B"
 ...
 "Y"
 "Z"

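printf formatting

If you're deeply attached to C-style `printf()` functionality, you can use a Julia macro (you call a macro by prefixing its name with the `@` sign). The macro is provided in the Printf package, which you need to load first.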

julia> using Printf

julia> @printf("pi = %0.20f", float(pi))
pi = 3.14159265358979311600
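
The usual C-style width, precision, and padding flags are available. Here is a small sketch with made-up values:

```julia
using Printf

# %5.2f : width 5, with 2 digits after the decimal point
# %-6s  : string left-justified in a field of width 6
# %03d  : integer padded with zeros to width 3
@printf("%5.2f|%-6s|%03d\n", 3.14159, "ab", 7)
# prints " 3.14|ab    |007"
```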

Or you can create another string with the `@sprintf` macro, also found in the Printf package.

julia> @sprintf("pi = %0.20f", float(pi))
"pi = 3.14159265358979311600"

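Converting strings to arrays

To read from a string into an array, you can use the `IOBuffer()` function. This works with many Julia functions (including `@printf`). Here's a string of data (it could have been read from a file)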

data="1 2 3 4
5 6 7 8
9 0 1 2"

"1 2 3 4\n5 6 7 8\n9 0 1 2"

Now you can "read" this string using functions such as `readdlm()`, the "read with delimiters" function. This is found in the DelimitedFiles package.

julia> using DelimitedFiles
julia> readdlm(IOBuffer(data))
3x4 Array{Float64,2}:
1.0 2.0 3.0 4.0
5.0 6.0 7.0 8.0
9.0 0.0 1.0 2.0

You can add an optional type specification

julia> readdlm(IOBuffer(data), Int)
3x4 Array{Int64,2}:
1 2 3 4
5 6 7 8
9 0 1 2

Sometimes you want to do things to strings that are better done with arrays. Here's an example.

julia> s = "/Users/me/Music/iTunes/iTunes Media/Mobile Applications";

You can break the pathname string into an array of character objects with `collect()`, which gathers the items in a collection or string into an array.

julia> collect(s)
55-element Array{Char,1}:
'/'
'U'
's'
'e'
'r'
's'
'/'
...

Similarly, you can use `split()` to split the string into single-character strings and count the results.

julia> split(s, "")
55-element Array{SubString{String},1}:
"/"
"U"
"s"
"e"
"r"
"s"
"/"
...

To count the occurrences of a particular character object, you can use an anonymous function.

julia> count(c -> c == '/', collect(s))
6

although converting to an array here is unnecessary and inefficient. Here's a better way

julia> count(c -> c == '/', s)
6

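Finding and replacing things in strings

If you want to know whether a string contains a particular character, use the general-purpose `in()` function.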

julia> s = "Elementary, my dear Watson";
julia> in('m', s)
true

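But the `occursin()` function, which accepts two strings, is more generally useful, because you can search for substrings of one or more characters. Notice that you put the search term first, then the string to search in: `occursin(needle, haystack)`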

julia> occursin("Wat", s)
true

julia> occursin("m", s)
true

julia> occursin("mi", s)
false

julia> occursin("me", s)
true

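You can get the location of the first occurrence of a substring with `findfirst(needle, haystack)`. The first argument can be a single character (wrapped in a test such as `isequal('a')`), a string, or a regular expression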

julia> s ="You know my methods, Watson.";

julia> findfirst("meth", s)
13:16

julia> findfirst(r"[aeiou]", s)  # first vowel
2:2

julia> findfirst(isequal('a'), s) # first occurrence of character 'a'
23

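In each case, the result contains the index, or range of indices, of the matching characters if they are present; otherwise the result is nothing.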
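
Because a failed search returns nothing, it's worth checking the result before using it as an index. The following minimal sketch also shows findnext() and findall(), which are related Base functions; the search term "Moriarty" is made up for the example.

```julia
s = "You know my methods, Watson."

# findfirst returns nothing when there is no match, so test before indexing.
r = findfirst("Moriarty", s)
println(r === nothing ? "not found" : s[r])   # not found

# A range that was found can be used directly as an index.
r = findfirst("meth", s)
println(s[r])                                 # meth

# findnext continues the search from a given index.
println(findnext(isequal('o'), s, 3))         # 7

# findall collects the index of every match.
println(findall(isequal('o'), s))             # [2, 7, 17, 26]
```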

Replacing

The `replace()` function returns a new string in which a substring of characters has been replaced with something else

julia> replace("Sherlock Holmes", "e" => "ee")
"Sheerlock Holmees"

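You use the => operator to specify the pattern to look for and its replacement. Usually the replacement is another string, as here. But you can also supply a function that processes the result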

julia> replace("Sherlock Holmes", "e" => uppercase)
"ShErlock HolmEs"

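where the function (here, the built-in `uppercase()` function) is applied to the matching substring.

There's no `replace!` function, where the "!" indicates a function that changes its argument. That's because you can't change a string: strings are immutable.

Replacing using functions

Many functions in Julia let you supply functions as part of the function call, and you can make good use of anonymous functions for this. Here, for example, is how to use a function to provide random replacements in a `replace()` call.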

julia>  t = "You can never foretell what any one man will do, but you can say with precision what an average number will be up to. Individuals vary, but percentages remain constant.";

julia> replace(t, r"a|e|i|o|u" => (c) -> rand(Bool) ? "0" : "1") 
"Y00 c1n n0v0r f1r0t1ll wh1t 0ny 0n0 m0n w1ll d0, b0t y01 c1n s1y w0th pr1c1s10n wh0t 1n 1v0r0g0 n1mb0r w0ll b0 0p t1. Ind1v0d11ls v0ry, b0t p1rc0nt0g0s r0m01n c1nst0nt."

julia> replace(t, r"a|e|i|o|u" => (c) -> rand(Bool) ? "0" : "1")
"Y11 c0n n1v0r f0r1t0ll wh1t 1ny 0n1 m0n w1ll d1, b1t y10 c1n s1y w1th pr0c1s01n wh0t 0n 0v1r0g0 n1mb1r w0ll b0 1p t1. Ind1v0d01ls v0ry, b1t p0rc1nt1g0s r0m01n c1nst0nt."

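Regular expressions

You can use regular expressions to find matches for substrings. Some functions that accept a regular expression are

`replace()` changes occurrences of a regular expression
`match()` returns the first match, or nothing
`eachmatch()` returns an iterator that lets you search through all matches
`split()` splits a string at every match

Use `replace()` to replace every character that isn't a lowercase vowel (consonants, but also spaces, punctuation, and capital letters) with an underscore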

julia> replace("Elementary, my dear Watson!", r"[^aeiou]" => "_")
"__e_e__a________ea___a__o__"

The following code replaces each vowel with the result of running a function on each match

julia> replace("Elementary, my dear Watson!", r"[aeiou]" => uppercase)
"ElEmEntAry, my dEAr WAtsOn!"

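With `replace()` you can access the captured groups if you supply a special substitution string `s""`, where `\1` refers to the first captured group, `\2` to the second, and so on. With this regex operation, every lowercase letter preceded by a space is repeated three times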

julia> replace("Elementary, my dear Watson!", r"(\s)([a-z])" => s"\1\2\2\2")
"Elementary, mmmy dddear Watson!"

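For more regular expression fun, there are the `match()` and `eachmatch()` functions.

Here I've loaded the complete text of "The Adventures of Sherlock Holmes" from a file into a string called `text`.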

julia> f = "/tmp/adventures-of-sherlock-holmes.txt"
julia> text = read(f, String);

To use the possibility of a match as a Boolean condition, suitable for use in an `if` statement for example, use `occursin()`.

julia> occursin(r"Opium", text)
false

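That's odd. We expected to find evidence of the great detective's peculiar pharmacological recreations. In fact, the word "opium" does appear in the text, but only in lower case, hence this `false` result: regular expressions are case-sensitive.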

julia> occursin(r"(?i)Opium", text)
true

This is a case-insensitive search, set by the flag `(?i)`, and it returns `true`.

You can check every line for the word with a simple loop.

for l in split(text, "\n")
    occursin(r"opium", l) && println(l)
end

opium. The habit grew upon him, as I understand, from some
he had, when the fit was on him, made use of an opium den in the
brown opium smoke, and terraced with wooden berths, like the
wrinkled, bent with age, an opium pipe dangling down from between
very short time a decrepit figure had emerged from the opium den,
opium-smoking to cocaine injections, and all the other little
steps - for the house was none other than the opium den in which
lives upon the second floor of the opium den, and who was
learn to have been the lodger at the opium den, and to have been
doing in the opium den, what happened to him when there, where is
"Had he ever showed any signs of having taken opium?"
room above the opium den when I looked out of my window and saw,

For more usable output (in the REPL), add `enumerate()` and some highlighting.

red = Base.text_colors[:red]; default = Base.text_colors[:default];
for (n, l) in enumerate(split(text, "\n"))
    occursin(r"opium", l) && println("$n $(replace(l, "opium" => "$(red)opium$(default)"))")
end

5087 opium. The habit grew upon him, as I understand, from some
5140 he had, when the fit was on him, made use of an opium den in the
5173 brown opium smoke, and terraced with wooden berths, like the
5237 wrinkled, bent with age, an opium pipe dangling down from between
5273 very short time a decrepit figure had emerged from the opium den,
5280 opium-smoking to cocaine injections, and all the other little
5429 steps - for the house was none other than the opium den in which
5486 lives upon the second floor of the opium den, and who was
5510 learn to have been the lodger at the opium den, and to have been
5593 doing in the opium den, what happened to him when there, where is
5846 "Had he ever showed any signs of having taken opium?"
6129 room above the opium den when I looked out of my window and saw,

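There's an alternative syntax for adding regex modifiers, such as case-insensitive matching. Notice the "i" immediately following the regex string in the second example.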

julia> occursin(r"Opium", text)
false

julia> occursin(r"Opium"i, text)
true

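With the `eachmatch()` function, you apply the regex to the string to produce an iterator. For example, to look for substrings in our text that match the letter "L", followed by some other characters, ending with "ed".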

julia> lmatch = eachmatch(r"L.*?ed", text)

The result in `lmatch` is an iterable object containing all the matches, as RegexMatch objects.

julia> collect(lmatch)[1:10]
10-element Array{RegexMatch,1}:
RegexMatch("London, and proceed")         
RegexMatch("London is a pleasant thing indeed")  
RegexMatch("Looking for lodgings,\" I answered") 
RegexMatch("London he had received")       
RegexMatch("Lied")                
RegexMatch("Life,\" and it attempted")      
RegexMatch("Lauriston Gardens wore an ill-omened")
RegexMatch("Let\" card had developed")      
RegexMatch("Lestrade, is here. I had relied")   
RegexMatch("Lestrade grabbed")

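We can step through the iterator and look at each match in turn. You can access a number of fields of a RegexMatch to extract information about the match. These include `captures`, `match`, `offset`, `offsets`, and `regex`. For example, the `match` field contains the matched substring.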

for i in lmatch
    println(i.match)
end

London - quite so! Your Majesty, as I understand, became entangled
Lodge. As it pulled
Lord, Mr. Wilson, that I was a red
League of the Red
League was founded
London when he was young, and he wanted
LSON" in white letters, upon a corner house, announced
League, and the copying of the 'Encyclopaed
Leadenhall Street Post Office, to be left till called
Let the whole incident be a sealed
Lestrade, being rather puzzled
Lestrade would have noted
...
Lestrade," drawled
Lestrade looked
Lord St. Simon has not already arrived
Lord St. Simon sank into a chair and passed
Lord St. Simon had by no means relaxed
Lordship. "I may be forced
London. What could have happened
London, and I had placed

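Other fields include `captures`, the captured substrings as an array of strings; `offset`, the offset into the string at which the whole match begins; and `offsets`, the offsets of the captured substrings.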
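
Here is a minimal sketch of these fields, using a made-up sample string and pattern (neither comes from the Sherlock Holmes text):

```julia
m = match(r"(\w+)@(\w+)\.com", "contact: holmes@bakerstreet.com")

println(m.match)      # holmes@bakerstreet.com
println(m.captures)   # the two captured groups, "holmes" and "bakerstreet"
println(m.offset)     # 10, the index where the whole match starts

# Named groups can be looked up by name.
n = match(r"(?<user>\w+)@(?<host>\w+)", "holmes@bakerstreet.com")
println(n[:user], " at ", n[:host])

# match() returns nothing when the regex does not match at all.
println(match(r"\d+", "no digits here") === nothing)   # true
```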

To get an array of matching strings, use something like this

julia> collect(m.match for m in eachmatch(r"L.*?ed", text))
58-element Array{SubString{String},1}:
"London - quite so! Your Majesty, as I understand, became entangled"
"Lodge. As it pulled"                        
"Lord, Mr. Wilson, that I was a red"                
"League of the Red"                         
"League was founded"                        
"London when he was young, and he wanted"              
"Leadenhall Street Post Office, to be left till called"       
"Let the whole incident be a sealed"                
"Lestrade, being rather puzzled"                  
"Lestrade would have noted"                     
"Lestrade looked"                          
"Lestrade laughed"                         
"Lestrade shrugged"                         
"Lestrade called"                          
... 
"Lord St. Simon shrugged"                      
"Lady St. Simon was decoyed"                    
"Lestrade,\" drawled"                        
"Lestrade looked"                          
"Lord St. Simon has not already arrived"              
"Lord St. Simon sank into a chair and passed"            
"Lord St. Simon had by no means relaxed"              
"Lordship. \"I may be forced"                    
"London. What could have happened"                 
"London, and I had placed"

The basic `match()` function looks for the first match for your regex. Use the `match` field to extract the information from the RegexMatch object.

julia> match(r"She.*",text).match
"Sherlock Holmes she is always THE woman. I have seldom heard\r"

A more streamlined way of obtaining matching lines from a file is this

julia> f = "adventures of sherlock holmes.txt"

julia> filter(s -> occursin(r"(?i)Opium", s), map(chomp, readlines(open(f))))
12-element Array{SubString{String},1}:
"opium. The habit grew upon him, as I understand, from some"    
"he had, when the fit was on him, made use of an opium den in the" 
"brown opium smoke, and terraced with wooden berths, like the"   
"wrinkled, bent with age, an opium pipe dangling down from between"
"very short time a decrepit figure had emerged from the opium den,"
"opium-smoking to cocaine injections, and all the other little"  
"steps - for the house was none other than the opium den in which" 
"lives upon the second floor of the opium den, and who was"    
"learn to have been the lodger at the opium den, and to have been" 
"doing in the opium den, what happened to him when there, where is"
"\"Had he ever showed any signs of having taken opium?\""     
"room above the opium den when I looked out of my window and saw,"

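Making regular expressions

Sometimes you want to make regular expressions from within your code. You can do this by creating a Regex object. Here is one way you could count the number of vowels in a text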

f = open("sherlock-holmes.txt")

text = read(f, String)

for vowel in "aeiou"
    r = Regex(string(vowel))
    l = [m.match for m = eachmatch(r, text)]
    println("there are $(length(l)) letter \"$vowel\"s in the text.")
end

there are 219626 letter "a"s in the text.
there are 337212 letter "e"s in the text.
there are 167552 letter "i"s in the text.
there are 212834 letter "o"s in the text.
there are 82924 letter "u"s in the text.

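Making substitution strings

Sometimes you need to assemble a substitution string. For this, you can use `SubstitutionString()` instead of `s"..."`.

For example, suppose you want to do some string interpolation in the substitution string. Perhaps you have a list of files and you want to renumber them, so that "file2.png" becomes "file1.png".

files = ["file2.png", "file3.png", "file4.png", "file5.png", "file6.png", "file7.png"] 

for (n, f) in enumerate(files)
    newfilename = replace(f, r"(.*)\d\.png" => SubstitutionString("\\g<1>$(n).png"))
    # now to do the renaming...
end

Notice that you can't simply write `\1` in a SubstitutionString to refer to the first captured expression: you have to escape the backslash as `\\1`. Here the group reference is written as `\g<1>` (escaped as `\\g<1>`) so that it stays separate from the interpolated digits that follow it; `\g` (escaped as `\\g`) also refers to named capture groups.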

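Testing and changing strings

There are lots of functions for testing and changing strings.

length(str) number of characters in the string
sizeof(str) size of the string in bytes
startswith(strA, strB) does strA start with strB?
endswith(strA, strB) does strA end with strB?
occursin(strA, strB) does strA occur in strB?
all(isletter, str) does str consist entirely of letters?
all(isnumeric, str) does str consist entirely of numeric characters?
isascii(str) is str an ASCII string?
all(iscntrl, str) does str consist entirely of control characters?
all(isdigit, str) does str consist entirely of the digits 0-9?
all(ispunct, str) does str consist entirely of punctuation?
all(isspace, str) does str consist entirely of whitespace characters?
all(isuppercase, str) is str entirely upper case?
all(islowercase, str) is str entirely lower case?
all(isxdigit, str) does str consist entirely of hexadecimal digits?
uppercase(str) returns a copy of str converted to upper case
lowercase(str) returns a copy of str converted to lower case
titlecase(str) returns a copy of str with the first character of each word converted to upper case
uppercasefirst(str) returns a copy of str with the first character converted to upper case
lowercasefirst(str) returns a copy of str with the first character converted to lower case
chop(str) returns a copy with the last character removed
chomp(str) returns a copy with a single trailing newline removed, if there is one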
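
A few of these functions in action, as a quick sketch on a made-up sample string:

```julia
str = "the sign of four\n"

clean = chomp(str)                    # trailing newline removed
println(titlecase(clean))             # The Sign Of Four
println(uppercasefirst(clean))        # The sign of four
println(startswith(clean, "the"))     # true
println(endswith(clean, "four"))      # true
println(chop(clean))                  # the sign of fou
println(all(isdigit, "2215"))         # true
println(all(isletter, clean))         # false: the spaces are not letters

# length counts characters, sizeof counts bytes ("\u00e9" is é).
println(length("\u00e9"), " ", sizeof("\u00e9"))   # 1 2
```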

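Streams

To write to a string, you can use a Julia stream. The sprint() (String Print) function lets you use a function as the first argument; it uses that function and the remaining arguments to send information to a stream, and returns the result as a string.

For example, consider the following function, f. The body of the function maps an anonymous "print" function over the arguments, enclosing them in angle brackets. When used by sprint, the function f processes the remaining arguments and sends them to the stream.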

function f(io::IO, args...)
    map((a) -> print(io,"<",a, ">"), args)
end
f (generic function with 1 method)

julia> sprint(f, "fred", "jim", "bill", "fred blogs")
"<fred><jim><bill><fred blogs>"

Functions like println() can take an IOBuffer or stream as their first argument. This lets you print to streams instead of to the standard output device.

julia> iobuffer = IOBuffer()
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)

julia> for i in 1:100
           println(iobuffer, string(i))
       end

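After this, the in-memory stream called iobuffer is full of numbers and newlines, even though nothing was printed on the terminal. To copy the contents of iobuffer from the stream to a string or array, you can use take!()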

julia> String(take!(iobuffer))
"1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n11\n12\n13\n14 ... \n98\n99\n100\n"

Colored/styled output

The following uses printstyled to print each message in the corresponding color.

julia> for color in [:red, :green, :blue, :magenta]
           printstyled("Hello in $(color)\n"; color = color)
       end
Hello in red
Hello in green
Hello in blue
Hello in magenta

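Printing a formatted backtrace

In the middle of a try catch statement, the following code prints the original backtrace that led to the exception.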

try
    # some code that can fail, but you want to continue even after a failure
catch e
    # show the error, but with its backtrace
    showerror(stderr, e, catch_backtrace())
end

If you're not inside a try-catch and want to print a stack trace without stopping execution, use the following.

showerror(stderr, ErrorException("show stacktrace"), stacktrace())