Friday, November 30, 2007

Upload new novel-pinyin source code to sourceforge.net

Finished the n-gram storage code and updated the code at http://novel-pinyin.cvs.sourceforge.net/novel-pinyin/novel-pinyin/.

In the current implementation, I changed P(P|W) from the ratio computed from scim-pinyin to k/n, where k is the number of matched pinyins and n is the total number of pinyins for word W.
I don't know how this will influence the correct rate of the HMM; I hope it will not be bad.
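
As a rough illustration of the k/n estimate, here is a minimal Python sketch. It is not the novel-pinyin code: it assumes one reading of the description, namely that word W has a list of n pinyins and that k of them match the user's input. The function name `pinyin_given_word`, the `matches` predicate, and the sample readings are all hypothetical.

```python
def pinyin_given_word(pinyins, matches):
    """Estimate P(P|W) as k/n.

    pinyins: all pinyins recorded for word W (n = len(pinyins)).
    matches: hypothetical predicate returning True when a pinyin
             matches the user's input.
    """
    n = len(pinyins)
    if n == 0:
        # No pinyin recorded for this word: treat the probability as zero.
        return 0.0
    # k = number of the word's pinyins that match the input.
    k = sum(1 for p in pinyins if matches(p))
    return k / n


# Hypothetical word with two pinyins, one of which matches the input exactly.
readings = ["hang", "xing"]
print(pinyin_given_word(readings, lambda p: p == "xing"))  # 0.5
```

Unlike a frequency-based ratio, this estimate treats every pinyin of a word as equally likely, which may be one reason the correct rate changes.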

I tested this approach on the research prototype, and the correct rate is lower. With manual input, however, it does not work too badly.

Counting the computational complexity, the speed seems to be sufficient.
