Monday, August 25, 2008

Sorry for the bugs in the novel-pinyin 0.2.3 release.

I put the first release of the novel-pinyin 0.2.x series on SourceForge, but later withdrew the package because a serious bug had been found.
During the Beijing Olympics, I finally released the novel-pinyin 0.2.3 package.
Thanks to lyman for reporting the bug in the initialization code.
Since novel-pinyin 0.2.3 has already been released and the fix is relatively small, I decided to release it as a separate patch.
To apply it, run the following command in the novel-pinyin-0.2.3 directory:
patch -p2 < ../../urgent-patch-fix-novel-pinyin-first-load.patch
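
For readers unfamiliar with the -p option: -p2 tells patch to strip the first two path components from the file names recorded in the patch before looking for the files. The sketch below illustrates this in a scratch directory. All file and directory names in it are made up for the demonstration and have nothing to do with the contents of the real novel-pinyin patch. It also runs patch with --dry-run first (a GNU patch option), which reports whether the patch would apply without changing any file.

```shell
# Hypothetical demo: every path below is invented for illustration.
set -e
work=$(mktemp -d)
cd "$work"

# Build an "old" and a "new" tree and record the difference as a patch.
mkdir -p a/src/demo b/src/demo
printf 'old line\n' > a/src/demo/hello.txt
printf 'new line\n' > b/src/demo/hello.txt
# diff exits with status 1 when the files differ, so tolerate that under set -e.
diff -u a/src/demo/hello.txt b/src/demo/hello.txt > fix.patch || true

# The tree to be patched only contains demo/hello.txt.
mkdir -p target/demo
printf 'old line\n' > target/demo/hello.txt
cd target

# The patch names a/src/demo/hello.txt; -p2 strips "a/src/", leaving demo/hello.txt.
patch -p2 --dry-run < ../fix.patch   # check only, nothing is written
patch -p2 < ../fix.patch             # apply for real
cat demo/hello.txt
```

If the dry run reports that it cannot find the file to patch, the strip level or the current directory is usually what needs adjusting.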

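By the way, the Chinese name of the input method has changed to 新智能拼音 ("New Intelligent Pinyin"); the English name, Novel Pinyin, remains unchanged.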

Friday, August 08, 2008

novel-pinyin 0.3.x wishlist

TODO Items:
1. Modify the pinyin large table to merge the scim-pinyin phrase library into gb_char.table.
2. Write phrase-to-token conversion. (phrase_large_table)
3. Write an n-gram segmenter to bootstrap phrase generation. (replaces the current mmseg.)
4. Larger corpus learning.
5. Entropy-based n-gram pruning.
6. Add support for professional phrase libraries.
7. Better fuzzy pinyin support. (like ms-pinyin)

novel-pinyin 0.2.3 released

Done Items:
1. Import the entire scim-pinyin phrases as corpus.
2. Better HMM parameter tuning.
3. Better candidate adjustment.
4. Add version check.
5. Add data file corruption detection.
6. Protect against integer overflow.

TODO Items:
An input pad module for temporary input of Chinese characters by stroke lookup.
(Maybe this can be done in Hacker Week.)