TODO Items:
1.Modify pinyin large table to merge scim-pinyin phrase lib into gb_char.table.
2. Write phrase to token conversion. (phrase_large_table)
3. Write n-gram segment to bootstrap phrase generation. (replace current mmseg.)
4. Larger corpus learning.
5. Entropy-based n-gram prune.
6. Add professional phrase libraries support.
7. Better fuzzy pinyin support.(like ms-pinyin)
Friday, August 08, 2008
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment