As the first release of the 0.2.x series, I put novel-pinyin on SourceForge, but I later withdrew the package because a serious bug was found.
During the Beijing Olympics, I finally released the novel-pinyin 0.2.3 package.
Thanks to lyman for reporting the bug in the initialization code.
Since novel-pinyin has already been released and the fix is relatively small, I decided to release the fix as a separate patch.
Apply it with the following command in the novel-pinyin-0.2.3 directory:
patch -p2 < ../../urgent-patch-fix-novel-pinyin-first-load.patch
PS:
By the way, the input method's Chinese name has changed to 新智能拼音; the English name, Novel Pinyin, is unchanged.
Monday, August 25, 2008
Friday, August 08, 2008
novel-pinyin 0.3.x wishlist
TODO Items:
1. Modify the pinyin large table to merge the scim-pinyin phrase library into gb_char.table.
2. Write phrase-to-token conversion. (phrase_large_table)
3. Write an n-gram segmenter to bootstrap phrase generation. (Replaces the current mmseg.)
4. Learn from a larger corpus.
5. Entropy-based n-gram pruning.
6. Add support for professional phrase libraries.
7. Better fuzzy pinyin support. (like ms-pinyin)
Labels:
novel-pinyin
novel-pinyin 0.2.3 released
Done Items:
1. Import the entire scim-pinyin phrase set as a corpus.
2. Better HMM parameter adjustment.
3. Better candidate adjustment.
4. Add a version check.
5. Add data file corruption detection.
6. Protect against integer overflow.
Todo Items:
An input pad module for temporarily entering Chinese characters by stroke lookup.
(Maybe this can be done in Hacker Week.)
Labels:
novel-pinyin
Wednesday, May 14, 2008
novel-pinyin 0.2.x wishlist
Now that the first version of novel-pinyin has been released, some feedback has come in.
The next version of novel-pinyin will try to finish the following todo tasks:
1. Model modification. Change P(P|W) from k/n to C(P,W)/C(W).
(C(P,W) is the count of the pinyin and word combination;
C(W) is the count of the word.)
2. Dynamically adjust phrase positions according to bi-gram probabilities.
In the HMM training process, the frequency adjustment is very small (1 or 6).
To magnify the position changes, replace the unigram with the bi-gram when possible.
3. Versioned data file format.
As the data file format will change in the next release, I will add a version file in
~/.scim/novel-pinyin to indicate the file format version.
When a different version is detected, the files of the old version will be flushed.
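The version check in item 3 could be sketched roughly as follows. This is only an illustration in Python, not the actual novel-pinyin code: the file name `version`, the version string, and the function `ensure_data_version` are all assumptions made up for the example.

```python
import shutil
from pathlib import Path

FORMAT_VERSION = "0.2"         # hypothetical format version string
VERSION_FILE_NAME = "version"  # hypothetical name of the version file


def ensure_data_version(data_dir: Path, expected: str = FORMAT_VERSION) -> bool:
    """Flush data_dir if its recorded format version differs from `expected`.

    Returns True when the directory was flushed (or freshly created),
    False when the stored version already matched.
    """
    data_dir.mkdir(parents=True, exist_ok=True)
    version_file = data_dir / VERSION_FILE_NAME
    found = version_file.read_text().strip() if version_file.is_file() else None
    if found == expected:
        return False
    # Missing or different version: remove every stale file in the directory.
    for entry in data_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    version_file.write_text(expected + "\n")
    return True
```

Called once at start-up, for example as `ensure_data_version(Path.home() / ".scim" / "novel-pinyin")`, this keeps user data only while the format version stays the same.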
Optional:
skim integration.
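To make the count-based estimate in item 1 concrete, here is a small Python sketch of P(P|W) = C(P,W)/C(W). The function name and the toy observations are invented for illustration; the real training code may organize its counts differently.

```python
from collections import Counter


def estimate_pinyin_given_word(observations):
    """Estimate P(pinyin | word) as C(pinyin, word) / C(word).

    `observations` is an iterable of (word, pinyin) pairs, one per
    occurrence in the training data.
    """
    observations = list(observations)
    pair_counts = Counter(observations)                     # C(P, W)
    word_counts = Counter(word for word, _ in observations)  # C(W)
    return {
        (word, pinyin): count / word_counts[word]
        for (word, pinyin), count in pair_counts.items()
    }


# Toy data: the character 行 seen three times as "xing" and once as "hang".
probs = estimate_pinyin_given_word(
    [("行", "xing"), ("行", "xing"), ("行", "hang"), ("行", "xing")]
)
print(probs[("行", "xing")])  # 0.75
print(probs[("行", "hang")])  # 0.25
```

With this estimate, a pronunciation that is seen more often for a word gets a proportionally larger share of that word's probability.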
Labels:
novel-pinyin
Tuesday, February 19, 2008
novel-pinyin 0.1.0 internal test
You can get the newest novel-pinyin 0.1.0 from the following URL:
http://download.opensuse.org/repositories/home:/wupeng/
The source code on sourceforge.net is missing the data file, so it will not run.
Please use the rpm from the above URL.
Labels:
novel-pinyin
Thursday, February 14, 2008
2008 New Year!
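My own input method, Novel Pinyin, is finally up and running. It still has some bugs, but they don't matter much. I am now writing my own blog with the input method I wrote myself.
Starting next week, I will test the new input method among my colleagues.
First, this weekend, I need to build the rpm on the openSUSE Build Service.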
Labels:
novel-pinyin
Thursday, December 20, 2007
Finished the segment & training parts, uploaded them to SourceForge.
I uploaded new novel-pinyin code to SourceForge; the segment and training parts are now finished.
Here I use a modified interpolation method to ease implementation.
The parameter optimization is done in a research prototype,
so the code in novel-pinyin is relatively simple: it just uses the parameters computed by the prototype.
The word segmenter uses a shortest-path algorithm to segment words and prepares the data for the training part.
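The shortest-path idea can be sketched as follows. This Python version is only an assumption-laden illustration, not the novel-pinyin code: it treats every lexicon word (or single character) as an edge of cost 1, so the shortest path is the segmentation with the fewest words. The function name, the `max_len` limit, and the toy lexicon are all hypothetical.

```python
def segment(text, lexicon, max_len=4):
    """Split `text` into the fewest pieces, each a lexicon word or one character."""
    n = len(text)
    best = [float("inf")] * (n + 1)  # best[i]: fewest words covering text[:i]
    back = [0] * (n + 1)             # back[i]: start index of the last word
    best[0] = 0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            # Single characters are always allowed so every text has a path.
            if len(piece) == 1 or piece in lexicon:
                if best[start] + 1 < best[end]:
                    best[end] = best[start] + 1
                    back[end] = start
    # Walk the back-pointers from the end to recover the words.
    words = []
    pos = n
    while pos > 0:
        words.append(text[back[pos]:pos])
        pos = back[pos]
    return words[::-1]


print(segment("北京奥运会", {"北京", "奥运", "奥运会"}))  # ['北京', '奥运会']
```

A real segmenter could replace the unit edge cost with a probability-based weight; the dynamic programming stays the same.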
Labels:
novel-pinyin
Thursday, December 13, 2007
AIGLX on OpenSUSE 10.3
My notebook has a 945GM graphics card, which supports AIGLX.
The default 3D desktop on openSUSE uses XGL, but I want other OpenGL applications to benefit from
hardware acceleration too, so I switched to AIGLX.
Refer to http://en.opensuse.org/AIGLX.
I use the following X.org config file settings:
Section "Device"
    Identifier "** Intel i810 (generic) [i810]"
    Driver "i810"
    VideoRam 262144
    Option "DRI" "true"
    Option "XAANoOffscreenPixmaps" "true"
EndSection
Section "ServerLayout"
    Option "AIGLX" "true"
EndSection
Section "DRI"
    Mode 0666
EndSection
Section "Extensions"
    Option "Composite" "Enable"
EndSection
For Java applications, set AWT_TOOLKIT=MToolkit to avoid a gray window:
export AWT_TOOLKIT=MToolkit
Labels:
AIGLX
