Finished the n-gram storage code and updated the code at http://novel-pinyin.cvs.sourceforge.net/novel-pinyin/novel-pinyin/.
In the current implementation, I changed P(P|W) from the ratio computed from scim-pinyin to k/n, where k is the number of matched pinyins and n is the total number of pinyins for word W.
I don't know how this will affect the accuracy of the HMM; I hope it will not hurt it.
I tested this approach on the research prototype, and the accuracy is lower. With manual input, though, it does not work too badly.
Judging by the computational complexity, the speed seems sufficient.
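
To make the k/n estimate concrete, here is a minimal sketch in Python (not the project's actual code). It assumes that "matched" means a reading of word W is accepted by some matching rule for the typed pinyin; the function name `pinyin_given_word` and the `matches` predicate are hypothetical names introduced only for illustration.

```python
from typing import Callable, Sequence


def pinyin_given_word(readings: Sequence[str],
                      matches: Callable[[str], bool]) -> float:
    """Estimate P(P|W) as k/n (hypothetical helper, for illustration only).

    readings: all n pinyin readings listed for word W.
    matches:  assumed predicate that says whether a reading matches
              the pinyin the user typed.
    """
    n = len(readings)
    if n == 0:
        # A word without any listed reading cannot emit a pinyin.
        return 0.0
    # k = number of readings accepted by the matching rule.
    k = sum(1 for reading in readings if matches(reading))
    return k / n


# Example: a word with two readings, one of which matches the input.
p = pinyin_given_word(["xing", "hang"], lambda r: r == "hang")
```

With two readings and one match, the estimate is 1/2. Unlike the frequency ratio taken from scim-pinyin, this treats every reading of a word as equally likely, which may explain part of the accuracy drop seen on the prototype.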
Friday, November 30, 2007
Wednesday, November 14, 2007
- N-gram file
- Training using parameters in prototype system.
- Original Lookup with Candidate Selection.
- Learn User Sentence when Committing a String.
- Special Table support, rewritten in C from scim-pinyin.
- SCIM UI Config Module.