Friday, November 30, 2007

Uploaded new novel-pinyin source code to sourceforge.net

Finished the n-gram storage code and updated the code at http://novel-pinyin.cvs.sourceforge.net/novel-pinyin/novel-pinyin/.

In the current implementation, I changed P(P|W) from the ratio computed from scim-pinyin to k/n, where k is the number of matched pinyins and n is the total number of pinyins for word W.
I don't know how this will affect the correct rate of the HMM; I hope it will not hurt it much.
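
To make the k/n estimate concrete, here is a minimal sketch in Python. It is not the novel-pinyin code: the function name, the `matches_input` predicate, and the example readings are hypothetical, and it assumes that "pinyins for word W" means the readings listed for W and that "matched" means a reading agrees with what was typed.

```python
from fractions import Fraction


def emission_probability(readings, matches_input):
    """Estimate P(P|W) as k/n for a single word W (illustrative only).

    readings      -- all pinyin readings listed for W, so n = len(readings)
    matches_input -- predicate returning True when a reading matches the
                     typed pinyin (assumed definition of "matched")
    """
    n = len(readings)
    if n == 0:
        # No readings known for W: treat the probability as zero.
        return Fraction(0)
    # k counts the readings of W that match the input.
    k = sum(1 for reading in readings if matches_input(reading))
    return Fraction(k, n)


# Hypothetical word with two readings; the user typed "hang".
readings = ["hang", "xing"]
p = emission_probability(readings, lambda reading: reading == "hang")
print(p)  # 1/2
```

Unlike the scim-pinyin ratio, this estimate needs no frequency data: every reading of a word gets the same weight, which is simple to compute but ignores how common each reading is.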

I tested this approach on the research prototype, and the correct rate is lower. In tests with manual input, though, it doesn't work too badly.

Judging by the computational complexity, the speed seems sufficient.

Wednesday, November 14, 2007

Novel-Pinyin Ver1 TODO List

Storage:
  • N-gram file
Training:
  • Train using the parameters from the prototype system.
Lookup:
  • Original Lookup with Candidate Selection.
Self-Learning:
  • Learn the user's sentence when the string is committed.
novel-imengine:
  • Special Table support, rewritten in C from scim-pinyin.
  • Scim UI Config Module.