- TODOs are marked throughout the source files.
- Derivation -Ik that makes adjectives from verbs.
- Causative forms of verbs ending in -Il need attention.
- The formation of pronouns by attaching -(s)I to "place adverbials", such as
içeri, aşağı, öte ...
- The adjectives (or nouns or postpositions) of location 'iç, dış,
üst, etc.' form complex adjectivals 'insanlık dışı, doğa üstü...'.
This is similar to what is described as 'postpositional usage' in
G&K, but it is different from the patterns listed by them. It also
looks like a noun compound.
- Relate irregular causatives to their root forms, for example:
kalk -> kaldır
öğren -> öğret
- Derivational morphemes need review.
- Apostrophe insertion does not currently comply with official spelling
rules.
- <acr> is not analyzed.
- Mark auxiliary verbs like ol/bul.
- Mark verbs for transitivity.
- Mark verbs for (typical) cases their indirect objects take.
- Handling of reduplication is less than acceptable. We have a
lexicon, Redup, with a few non-word reduplication pieces, like
'maşkım'. However,
(1) Ideally, these items should be tokenized together with the
previous word: e.g., 'aşkım maşkım'
(2) They should be allowed to inflect (the entry should be
something like 'aşk maşk<n>').
- apply up> okuyorsak mı
???
- yan- and yak- relation
- Use a more natural lexical representation for compound noun roots.
Currently, we require the -(s)I at the end to be removed, which can be
done by xfst rules just fine.
- Optional apostrophe choice is buggy: it always seems to be
compulsory.
- -ken is allowed after all verbal noun suffixes. Not quite correct.
. gelmişken is ok
. geldikken is not
. gelişken is not
....
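
A rough sketch for point (1) of the reduplication item above: joining an
m-reduplicated echo word with the word before it, so that 'aşkım maşkım'
becomes a single token. This is only an illustration in Python, not part of
the analyzer; the function names are hypothetical. It assumes lowercase
input and the usual pattern: a vowel-initial word gets 'm' prefixed, a
consonant-initial word has its first consonant replaced by 'm', and words
that already start with 'm' are skipped.

```python
# Hypothetical pre-tokenization helper; not part of the analyzer itself.
VOWELS = set("aeıioöuü")


def m_reduplicate(word):
    """Return the expected m-echo of `word`, or None if there is none.

    Assumes lowercase input, because Python's str.lower() does not
    follow Turkish casing rules for I/ı and İ/i.
    """
    if not word or word[0] == "m":
        return None
    if word[0] in VOWELS:
        return "m" + word       # aşkım -> maşkım
    return "m" + word[1:]       # kitap -> mitap


def merge_reduplications(tokens):
    """Join each echo word with the token that precedes it."""
    merged = []
    for token in tokens:
        if merged and m_reduplicate(merged[-1]) == token:
            merged[-1] = merged[-1] + " " + token
        else:
            merged.append(token)
    return merged
```

For example, merge_reduplications(["aşkım", "maşkım", "dedi"]) keeps
'aşkım maşkım' together as one token. Point (2), letting the pair inflect,
is not covered by this sketch.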