Commit

0.7.26 - improved wordlists.
finnbear committed Jul 11, 2024
1 parent cf67ec4 commit 96d2fce
Showing 9 changed files with 112 additions and 7 deletions.
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,7 +1,7 @@
[package]
name = "rustrict"
authors = ["Finn Bear"]
-version = "0.7.25"
+version = "0.7.26"
edition = "2021"
license = "MIT OR Apache-2.0"
repository = "https://github.com/finnbear/rustrict/"
5 changes: 2 additions & 3 deletions README.md
@@ -132,8 +132,7 @@ If you want to add custom profanities or safe words, enable the `customize` feat
}
```

-But wait, there's more! If your use-case is chat moderation, and you can store data on a per-user basis, you
-might benefit from the `context` feature.
+If your use-case is chat moderation, and you store data on a per-user basis, you can use `rustrict::Context` as a reference implementation:

```rust
#[cfg(feature = "context")]
@@ -178,7 +177,7 @@ is used as a dataset. Positive accuracy is the percentage of profanity detected

| Crate | Accuracy | Positive Accuracy | Negative Accuracy | Time |
|-------|----------|-------------------|-------------------|------|
-| [rustrict](https://crates.io/crates/rustrict) | 79.74% | 94.00% | 76.18% | 9s |
+| [rustrict](https://crates.io/crates/rustrict) | 79.74% | 94.00% | 76.19% | 9s |
| [censor](https://crates.io/crates/censor) | 76.16% | 72.76% | 77.01% | 23s |

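As a rough illustration of how the three accuracy columns above relate, here is a minimal sketch. It assumes positive accuracy is the share of profane samples that get flagged, negative accuracy is the share of clean samples that pass, and overall accuracy is the share of all samples classified correctly. The `Counts` struct and `percent` helper are hypothetical and are not part of rustrict or its benchmark code.

```rust
/// Confusion-matrix counts for a binary profanity classifier.
/// Hypothetical helper type, not part of rustrict.
#[derive(Debug, Clone, Copy)]
struct Counts {
    /// Profane samples that were flagged.
    true_positives: u64,
    /// Profane samples that were missed.
    false_negatives: u64,
    /// Clean samples that were passed.
    true_negatives: u64,
    /// Clean samples that were wrongly flagged.
    false_positives: u64,
}

impl Counts {
    /// Share of profane samples detected as profane, in percent.
    fn positive_accuracy(&self) -> f64 {
        percent(
            self.true_positives,
            self.true_positives + self.false_negatives,
        )
    }

    /// Share of clean samples detected as clean, in percent.
    fn negative_accuracy(&self) -> f64 {
        percent(
            self.true_negatives,
            self.true_negatives + self.false_positives,
        )
    }

    /// Share of all samples classified correctly, in percent.
    fn accuracy(&self) -> f64 {
        let correct = self.true_positives + self.true_negatives;
        let total = correct + self.false_negatives + self.false_positives;
        percent(correct, total)
    }
}

/// Returns `numerator / denominator` as a percentage, or 0.0 for an empty denominator.
fn percent(numerator: u64, denominator: u64) -> f64 {
    if denominator == 0 {
        0.0
    } else {
        100.0 * numerator as f64 / denominator as f64
    }
}
```

With made-up counts of 94 flagged and 6 missed profane samples, plus 300 passed and 100 wrongly flagged clean samples, this gives 94% positive accuracy, 75% negative accuracy, and 78.8% overall accuracy. When clean samples outnumber profane ones, the overall figure sits closer to the negative accuracy.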
## Development
1 change: 1 addition & 0 deletions src/character_analyzer.rs
@@ -49,6 +49,7 @@ fn main() {
'🐿' => 20,
'𒐫' => 40,
'𒈙' => 35,
'༺' | '༻' => 25,
_ => {
let max_width = (max_width(c, &fonts) as f32 / 100f32).round() as u16;
if max_width > u8::MAX as u16 {
Binary file modified src/character_widths.bin
Binary file not shown.
4 changes: 4 additions & 0 deletions src/context.rs
@@ -8,6 +8,10 @@ use std::time::{Duration, Instant};

/// Context is useful for taking moderation actions on a per-user basis i.e. each user would get
/// their own Context.
///
/// # Recommendation
///
/// Use this as a reference implementation e.g. by copying and adapting it.
#[derive(Clone)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[cfg_attr(doc, doc(cfg(feature = "context")))]
11 changes: 11 additions & 0 deletions src/dictionary_extra.txt
@@ -1,16 +1,21 @@
#8
# of
(until
2 secs
3 secs
4 secs
45s
5 secs
6 secs
7 secs
8 secs
88
9 secs
9 is still
99
0 secs
300 bot
600 bot
twinkie
two secs
three secs
@@ -22,6 +27,7 @@ eight secs
nine secs
ten secs
aboutit
admit it's
ain't it
alt
an ai
@@ -78,6 +84,7 @@ few secs
ffa game
fire cracker
fire crackers
forgot it's
francoitalian
franco italian
freakin
@@ -101,6 +108,7 @@ hellen
hellp
h on keyboard
h tier
hi @Bla
hi tirp
ho ho ho
honkeytonk
@@ -184,6 +192,7 @@ pp. 9
pussinboots
puss in boots
ref'd
refresh at
rip
saturated fat
shoehorn your
@@ -197,6 +206,7 @@ suicide squad
superbowlxxx
tally ho
tally-ho
tea the
test test test
then i guess
then talk
@@ -229,6 +239,7 @@ virgin islands
wassup
wasn't it
wouldn't it
xp or no
yass
yesturday
zenga
24 changes: 24 additions & 0 deletions src/false_positives.txt
@@ -1,13 +1,18 @@
# of
#8
(until
0 secs
2 secs
3 secs
300 bot
4 secs
45s
5 secs
6 secs
600 bot
7 secs
8 secs
9 is still
9 secs
a analog
a analyse
@@ -147,6 +152,7 @@ adipex nissan
adipex pee
adipex rated
adiposogenital
admit it's
ado lif
adramelech
adrammelech
@@ -2749,6 +2755,14 @@ bol lock
bol locks
bol look
bol looks
bomb china
bomb india
bomb iran
bomb israel
bomb palestine
bomb russia
bomb ukraine
bomb usage
bon ed
bon eric
bon erik
@@ -6863,6 +6877,7 @@ fore skin
forebreast
forget lost
forget married
forgot it's
fork cocktail
fork commission
fork cook
@@ -7800,6 +7815,7 @@ heterosex
heterotic
hexadic
hexanal
hi @Bla
hi little
hi tier
hi tile
@@ -7822,6 +7838,7 @@ highs perm
highs seeks
hilar
hildebrandic
hill hitting
hill illus
hill iv
hill ju
@@ -9654,6 +9671,7 @@ junk until
junk untitled
junk unto
jurisprude
just cumulative
justments cumulative
justments ext
justments hilt
@@ -9998,6 +10016,7 @@ kill twelve
kill twenty
kill twi
kill ty
killed yourself
killian
killing jewel
killing palestinian
@@ -10351,6 +10370,7 @@ less blin
less bo
lets cumulative
lets ext
lets fake
lets hilt
lets hit
lets lut
@@ -13515,6 +13535,7 @@ plumbaginaceous
plumbum
plumigerous
plzz
pmsg
pn lips
pn nigeria
pnigerophobia
@@ -15134,6 +15155,7 @@ res perm
res seeks
resex
resh aging
resh at
resh hilt
resh hit
resh it
@@ -17647,6 +17669,7 @@ tch linking
tch links
tch little
tchincou
tea the
teanal
teapottykin
teataster
@@ -19880,6 +19903,7 @@ xnxx until
xnxx untitled
xnxx unto
xnxx vie
xp or no
ya holes
yacht its
yacht texts