common/validate.go: redundant check for invalidChar #225
Comments
Cool, thanks. In my head I have a ToDo item to better understand the Unicode character er... sets, so we can't be accidentally caught out by things. Haven't gotten around to it yet (thus our mostly not supporting unicode), but this is good info that will likely help. 😄
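another issue with the current code is that it's not checking for invalid UTF-8. when you iterate a string with invalid UTF-8, such as "\xA0\xA1", each bad byte comes back as utf8.RuneError (U+FFFD). improved code:

    package unicode

    import (
        "unicode"
        "unicode/utf8"
    )

    // binary reports whether src contains invalid UTF-8 or any rune in
    // Unicode category C (Cc, Cf, Co, Cs).
    func binary(src []byte) bool {
        for len(src) >= 1 {
            r, size := utf8.DecodeRune(src)
            // RuneError with size 1 is a decoding failure, not a literal U+FFFD.
            if r == utf8.RuneError && size == 1 {
                return true
            }
            if unicode.Is(unicode.C, r) {
                return true
            }
            src = src[size:]
        }
        return false
    }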
Oh crap. We'd better fix that. I should have time to look into this tonight. 😄
Hmmm:
Shouldn't a validation function for unicode look at things as runes rather than bytes?
if you prefer you can use this instead: https://godocs.io/unicode/utf8#DecodeRuneInString but it's basically the same thing. the input is https://godocs.io/unicode/utf8#Valid but if you need any other processing (which you do to detect |
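For anyone who would rather work on strings and runes, a rough sketch of a string-based variant follows. It assumes the same policy as the byte version (reject invalid UTF-8 and anything in unicode.C); the name `rejectString` is made up for this example and is not taken from validate.go.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// rejectString (hypothetical name) reports whether s contains invalid
// UTF-8 or any rune in Unicode category C.
func rejectString(s string) bool {
	// Check validity first: a range loop on its own would silently turn
	// bad bytes into U+FFFD, which is not in category C.
	if !utf8.ValidString(s) {
		return true
	}
	for _, r := range s {
		if unicode.Is(unicode.C, r) {
			return true
		}
	}
	return false
}

func main() {
	for _, s := range []string{"hello", "\xA0\xA1", "a\x00b", "tab\there"} {
		fmt.Printf("%q rejected=%v\n", s, rejectString(s))
	}
}
```

Note that this policy also rejects tab and newline, since both are control characters in category Cc. Whether that is wanted depends on the field being validated.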
dbhub.io/common/validate.go
Lines 91 to 93 in f467849
dbhub.io/common/validate.go
Lines 181 to 183 in f467849
the IsControl checks are redundant, because unicode.C includes all control characters.