Commit

Clarify relationship of Strings and Runes
As discussed in [the forums], this patch makes the changes desired to
clarify how Strings are related to Runes and hopefully clears up some
confusing and potentially misleading statements.

Ref: #2768

[the forums]: https://forum.exercism.org/t/potential-misleading-information-on-the-golang-runes-chapter/10082/1
kotp committed Jul 28, 2024
1 parent e0e74ca commit f14919e
Showing 2 changed files with 10 additions and 8 deletions.
9 changes: 5 additions & 4 deletions concepts/runes/about.md
@@ -83,12 +83,13 @@ fmt.Printf("myRune Unicode character: %c\n", myRune)
## Runes and Strings

Strings in Go are encoded using UTF-8 which means they contain Unicode characters.
-Since the `rune` type represents a Unicode character, a string in Go is often referred to as a sequence of runes.
-However, runes are stored as 1, 2, 3, or 4 bytes depending on the character.
-Due to this, strings are really just a sequence of bytes.
-In Go, slices are used to represent sequences and these slices can be iterated over using `range`.
+Characters in strings are stored and encoded as 1, 2, 3, or 4 bytes depending on the Unicode character they represent.
+
+In Go, slices are used to represent sequences and these slices can be iterated over using range.
+When we iterate over a string, Go converts the string into a series of Runes, each of which is 4 bytes (remember, the rune type is an alias for an `int32`!)
+
Even though a string is just a slice of bytes, the `range` keyword iterates over a string's runes, not its bytes.

In this example, the `index` variable represents the starting index of the current rune's byte sequence and the `char` variable represents the current rune:

```go
9 changes: 5 additions & 4 deletions concepts/runes/introduction.md
@@ -74,12 +74,13 @@ fmt.Printf("myRune Unicode code point: %U\n", myRune)
## Runes and Strings

Strings in Go are encoded using UTF-8 which means they contain Unicode characters.
-Since the `rune` type represents a Unicode character, a string in Go is often referred to as a sequence of runes.
-However, runes are stored as 1, 2, 3, or 4 bytes depending on the character.
-Due to this, strings are really just a sequence of bytes.
-In Go, slices are used to represent sequences and these slices can be iterated over using `range`.
+Characters in strings are stored and encoded as 1, 2, 3, or 4 bytes depending on the Unicode character they represent.
+
+In Go, slices are used to represent sequences and these slices can be iterated over using range.
+When we iterate over a string, Go converts the string into a series of Runes, each of which is 4 bytes (remember, the rune type is an alias for an `int32`!)
+
Even though a string is just a slice of bytes, the `range` keyword iterates over a string's runes, not its bytes.

In this example, the `index` variable represents the starting index of the current rune's byte sequence and the `char` variable represents the current rune:

```go
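
For reference, the behaviour described in the changed paragraphs can be sketched as follows. This example is not part of the commit; the sample string and the `runeStarts` helper are hypothetical and chosen only for illustration. It contrasts the byte length of a string with its rune count, and shows that `range` yields the starting byte index of each rune along with the decoded rune.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeStarts is a hypothetical helper that returns the byte index at which
// each rune of s begins, as reported by ranging over the string.
func runeStarts(s string) []int {
	var starts []int
	for index := range s {
		starts = append(starts, index)
	}
	return starts
}

func main() {
	// 'é' (U+00E9) is encoded as 2 bytes in UTF-8; the other letters take 1 byte each.
	s := "héllo"

	fmt.Println(len(s))                    // 6: length in bytes
	fmt.Println(utf8.RuneCountInString(s)) // 5: number of runes

	// range decodes one rune per iteration; index is the rune's starting byte offset.
	for index, char := range s {
		fmt.Printf("%d: %c (%d byte(s) in the string)\n", index, char, utf8.RuneLen(char))
	}

	fmt.Println(runeStarts(s)) // [0 1 3 4 5]
}
```

The index jumps from 1 to 3 because the two-byte `é` occupies byte offsets 1 and 2, which is why the index from `range` should not be read as a character position.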
