diff --git a/content/strings.md b/content/strings.md
index dfaf6c2..58b3eb1 100644
--- a/content/strings.md
+++ b/content/strings.md
@@ -16,7 +16,7 @@ echo """
 
 proc re(s: string): string = s
 
-echo r" "" "
+echo r".""."
 echo re"\b[a-z]++\b"
 ```
 ``` console
@@ -29,17 +29,23 @@ words words words ⚑
- "
+.".
 \b[a-z]++\b
 ```
 
-There are several types of strings literals:
+There are several types of string literals:
 
  - Quoted Strings: Created by wrapping the body in triple quotes, they never interpret escape codes
- - Raw Strings: created by prefixing the string with an `r`. There are no escape sequences don't work, except for `"`, which can be escaped as `""`
- - Proc Strings: raw strings, but the method name that prefixes the string is called
+ - Raw Strings: created by prefixing the string with an `r`. They do not interpret escape sequences, except for `""`, which is interpreted as `"`. This means that `r"\b[a-z]\b"` is interpreted as `\b[a-z]\b` instead of failing to compile with a syntax error.
+ - Proc Strings: raw strings, but the method name that prefixes the string is called, so that `foo"12\"` -> `foo(r"12\")`.
+
+Strings are null-terminated, so that `cstring("foo")` requires zero copying. However, you should be careful that the lifetime of the cstring does not exceed the lifetime of the string it is based upon.
+
+Strings can also almost be thought of as `seq[char]` with respect to assignment semantics. See [seqs][]
+
+[seqs]: /seqs/#immutability
 
 ## A note about unicode
 
 Unicode symbols are allowed in strings, but are not treated in any special way, so if you want count glyphs or uppercase unicode symbols, you must use the `unicode` module.
 
-Strings are generally considered to be encoded as UTF-8, so because of unicode's backwards compatibility, can be treated exactly as ASCII, with all values above 127 ignored.
\ No newline at end of file
+Strings are generally considered to be encoded as UTF-8, so because of unicode's backwards compatibility, can be treated exactly as ASCII, with all values above 127 ignored.