Thursday, December 20, 2007

Finished the segmentation & training parts, uploaded to SourceForge

I uploaded the new novel-pinyin code to SourceForge; the segmentation and training parts are now finished.

Here I use a modified interpolation method to ease implementation.
The parameter optimization is done in the research prototype,
so the code in novel-pinyin is relatively simple: it just uses the parameters computed by the prototype.
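
To illustrate the idea of training with fixed, precomputed parameters, here is a minimal sketch of plain linear interpolation between bigram and unigram estimates. It is not the modified method used in novel-pinyin, which this post does not spell out; the names `train`, `interpolated_prob`, and the weight `LAMBDA = 0.7` are assumptions made up for the example.

```python
from collections import Counter

# Hypothetical weight; in novel-pinyin the real parameters come from the prototype.
LAMBDA = 0.7

def train(sentences):
    """Count unigrams and bigrams over already-segmented sentences."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def interpolated_prob(prev, word, unigrams, bigrams, lam=LAMBDA):
    """P(word | prev) = lam * P_ML(word | prev) + (1 - lam) * P_ML(word)."""
    total = sum(unigrams.values())
    p_uni = unigrams[word] / total if total else 0.0
    # Approximation: the unigram count of prev stands in for its count as a history.
    p_bi = bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0
    return lam * p_bi + (1 - lam) * p_uni
```

Because the weight is a constant, the training code only has to count n-grams; all the tuning stays outside of it.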

The word segmenter uses a shortest-path algorithm to segment words and prepares the data for the training part.
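
A shortest-path segmenter can be sketched roughly as follows. This is only an illustration, assuming every dictionary word is an edge of cost 1 between character positions, so the shortest path is the split with the fewest words; the actual edge weights in novel-pinyin may differ, and `segment`, `dictionary`, and `max_word_len` are hypothetical names.

```python
def segment(text, dictionary, max_word_len=4):
    """Split text into the fewest dictionary words (shortest path over positions)."""
    n = len(text)
    best = [float("inf")] * (n + 1)  # best[i]: fewest words needed to cover text[:i]
    back = [0] * (n + 1)             # back[i]: start index of the last word on that path
    best[0] = 0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            piece = text[start:end]
            # Single characters are always allowed so every text can be segmented.
            if piece in dictionary or end - start == 1:
                if best[start] + 1 < best[end]:
                    best[end] = best[start] + 1
                    back[end] = start
    # Walk the back pointers from the end to recover the words.
    words, i = [], n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    return words[::-1]
```

For example, with the dictionary {"abc", "ab"}, segment("abcab", ...) picks the two-word split "abc" + "ab" over the three-word split "ab" + "c" + "ab".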

Thursday, December 13, 2007

AIGLX on openSUSE 10.3

My notebook has a 945GM graphics chip, which supports AIGLX.
The default 3D desktop on openSUSE uses XGL, but I want other OpenGL applications to benefit from
hardware acceleration too, so I switched to AIGLX.
See http://en.opensuse.org/AIGLX.

I use the following X.org config settings:
Section "Device"
    Identifier "** Intel i810 (generic) [i810]"
    Driver "i810"
    VideoRam 262144
    Option "DRI" "true"
    Option "XAANoOffscreenPixmaps" "true"
EndSection

Section "ServerLayout"
    Option "AIGLX" "true"
EndSection

Section "DRI"
    Mode 0666
EndSection

Section "Extensions"
    Option "Composite" "Enable"
EndSection

For Java applications, set AWT_TOOLKIT=MToolkit to avoid a gray window:
export AWT_TOOLKIT=MToolkit

Friday, November 30, 2007

Uploaded new novel-pinyin source code to sourceforge.net

I finished the n-gram storage code and updated the code at http://novel-pinyin.cvs.sourceforge.net/novel-pinyin/novel-pinyin/.

In the current implementation, I changed P(P|W) from the ratio computed from scim-pinyin to k/n, where k is the number of matched pinyins and n is the total number of pinyins for word W.
I don't know how this will affect the HMM accuracy; hopefully it will not be bad.
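
As a small sketch of the k/n estimate, under one possible reading of it: the word W has n pinyin entries in the table, and k of them match what the user typed. The function name `pinyin_given_word` and the `matches` predicate are hypothetical; the real matching rules in novel-pinyin are not described here.

```python
def pinyin_given_word(word_pinyins, matches):
    """Estimate P(P|W) as k/n: k matching entries out of n entries listed for W."""
    n = len(word_pinyins)
    if n == 0:
        return 0.0
    # k counts the entries accepted by the caller-supplied matching rule.
    k = sum(1 for p in word_pinyins if matches(p))
    return k / n
```

For a character with the two readings "hang" and "xing", typing "xing" with exact matching gives k = 1 and n = 2, so the estimate is 0.5.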

I tested this approach on the research prototype, and the accuracy is lower. But when tested with manual input, it does not work too badly.

Judging by the computational complexity, the speed seems sufficient.

Wednesday, November 14, 2007

Novel-Pinyin Ver1 TODO List

Storage:
  • N-gram file
Training:
  • Training using parameters in prototype system.
Lookup:
  • Original Lookup with Candidate Selection.
Self-Learning:
  • Learn the user's sentence when the string is committed.
novel-imengine:
  • Special table support, rewritten from scim-pinyin in C.
  • SCIM UI config module.

Tuesday, October 16, 2007

Federico in Beijing

Federico has now come to Beijing. He is a great hacker and a good man.
We are looking at the Pango CJK performance issue together:
http://www.gnome.org/~federico/news-2007-10.html#pango-cjk-1

He also went to the Great Wall:
http://www.gnome.org/~federico/news-2007-10.html#15

Tuesday, September 11, 2007

Planet SUSE blog

From now on, I will post my blog in English because of planetsuse.org.