1. Install mmseg
wget http://www.coreseek.cn/uploads/csft/3.1/Source/mmseg-3.1.tar.gz
# ./configure --prefix=/usr/local/mmseg && make && make install
2. Install Python 2.5
wget http://www.python.org/ftp/python/2.5/Python-2.5.tar.bz2
./configure
3. Set environment variables
# cd ~
# vim .profile
Add these two lines:
export CPPFLAGS=-I/usr/local/include/python2.5
export LDFLAGS=-lpython2.5
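If you would rather not edit the file by hand, the two exports can also be appended from the shell. The following is a minimal sketch, assuming a POSIX shell; append_once is a hypothetical helper name, and it only adds a line when that exact line is not already in the file.

```shell
# append_once FILE LINE: add LINE to FILE unless an identical line already exists.
# (Hypothetical helper, not part of mmseg or csft.)
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -e "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example usage (values taken from the step above):
# append_once "$HOME/.profile" 'export CPPFLAGS=-I/usr/local/include/python2.5'
# append_once "$HOME/.profile" 'export LDFLAGS=-lpython2.5'
```

Running it twice leaves the file unchanged the second time, so it is safe to repeat.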
4. Install csft
wget http://www.coreseek.cn/uploads/csft/3.1/Source/csft-3.1.tar.gz
./configure --prefix=/usr/local/webserver/csft --with-mysql=/usr/local/webserver/mysql/ --with-mmseg-includes=/usr/local/mmseg/include/mmseg --with-mmseg-libs=/usr/local/mmseg/lib/ --without-python
$ make
Making all in src
make[1]: Entering directory `/home/zyf/zyfwork/csft3.1b3/src'
if test -d ../.svn; then svn info .. --xml | perl svnxrev.pl; fi;
make all-am
make[2]: Entering directory `/home/zyf/zyfwork/csft3.1b3/src'
if g++ -DHAVE_CONFIG_H -I. -I. -I../config -DSYSCONFDIR="\"/usr/local/webserver/csft/etc\"" -I/usr/local/include -pthread -I/usr/local/webserver/mysql/include/mysql -DUNIV_LINUX -I/usr/local/include/mmseg -Wall -g -D_FILE_OFFSET_BITS=64 -O3 -DNDEBUG -MT tokenizer_zhcn.o -MD -MP -MF ".deps/tokenizer_zhcn.Tpo" -c -o tokenizer_zhcn.o tokenizer_zhcn.cpp; \
then mv -f ".deps/tokenizer_zhcn.Tpo" ".deps/tokenizer_zhcn.Po"; else rm -f ".deps/tokenizer_zhcn.Tpo"; exit 1; fi
In file included from /usr/local/include/mmseg/Segmenter.h:38,
from /usr/local/include/mmseg/SegmenterManager.h:32,
from tokenizer_zhcn.cpp:1:
/usr/local/include/mmseg/mmthunk.h: In member function `u2 css::ChunkQueue::getToken()':
/usr/local/include/mmseg/mmthunk.h:140: warning: comparison between signed and unsigned integer expressions
/usr/local/include/mmseg/mmthunk.h:157: warning: converting of negative value `-0x000000001' to `size_t'
/usr/local/include/mmseg/mmthunk.h:158: warning: comparison between signed and unsigned integer expressions
tokenizer_zhcn.cpp: In member function `css::Segmenter* CSphTokenizer_zh_CN_UTF8_Private::GetSegmenter(const char*)':
tokenizer_zhcn.cpp:54: error: no matching function for call to `css::SegmenterManager::getSegmenter(bool)'
/usr/local/include/mmseg/SegmenterManager.h:46: note: candidates are: css::Segmenter* css::SegmenterManager::getSegmenter()
tokenizer_zhcn.cpp: In member function `virtual bool CSphTokenizer_zh_CN_UTF8::IsSentenceEnd()':
tokenizer_zhcn.cpp:155: error: 'class css::Segmenter' has no member named 'isSentenceEnd'
tokenizer_zhcn.cpp: In member function `virtual bool CSphTokenizer_zh_CN_UTF8::IsKeyWord(BYTE*, int)':
tokenizer_zhcn.cpp:161: error: 'class css::Segmenter' has no member named 'isKeyWord'
tokenizer_zhcn.cpp: In member function `virtual int CSphTokenizer_zh_CN_UTF8::GetWordWeight(BYTE*, int) const':
tokenizer_zhcn.cpp:167: error: 'class css::Segmenter' has no member named 'getWordWeight'
tokenizer_zhcn.cpp: In member function `virtual BYTE* CSphTokenizer_zh_CN_UTF8::GetToken(int)':
tokenizer_zhcn.cpp:180: error: no matching function for call to `css::Segmenter::peekToken(u2&, u2&, int&)'
/usr/local/include/mmseg/Segmenter.h:110: note: candidates are: const u1* css::Segmenter::peekToken(u2&, u2&)
tokenizer_zhcn.cpp:181: error: no matching function for call to `css::Segmenter::popToken(u2&, int&)'
/usr/local/include/mmseg/Segmenter.h:111: note: candidates are: void css::Segmenter::popToken(u2)
make[2]: *** [tokenizer_zhcn.o] Error 1
make[2]: Leaving directory `/home/zyf/zyfwork/csft3.1b3/src'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/home/zyf/zyfwork/csft3.1b3/src'
make: *** [all-recursive] Error 1
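The errors above say that tokenizer_zhcn.cpp calls members (isSentenceEnd, isKeyWord, getWordWeight) that the headers under /usr/local/include/mmseg do not declare. One possible cause, which is an assumption and not something the log confirms, is that the compiler is picking up headers from a different mmseg version than the one csft expects; note that the -I path in the log differs from the --with-mmseg-includes path passed to configure. A rough way to see which include directory declares those members is sketched below; has_member and check_mmseg_headers are hypothetical helper names.

```shell
# has_member DIR NAME: succeed if any header directly under DIR mentions NAME.
has_member() {
    dir=$1
    name=$2
    grep -qw -e "$name" "$dir"/*.h 2>/dev/null
}

# For each candidate include directory, report whether the members named in
# the compiler errors are declared there.
check_mmseg_headers() {
    for dir in "$@"; do
        for name in isSentenceEnd isKeyWord getWordWeight; do
            if has_member "$dir" "$name"; then
                echo "$dir: $name found"
            else
                echo "$dir: $name MISSING"
            fi
        done
    done
}

# Example usage:
# check_mmseg_headers /usr/local/include/mmseg /usr/local/mmseg/include/mmseg
```

If one directory reports the members as found and the other does not, pointing the build at the matching directory is the first thing to try.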
Another build error, this one from the linker (the libiconv symbols cannot be resolved):
/usr/local/sphinx/src/sphinx.cpp:15557: undefined reference to `libiconv_open'
libsphinx.a(sphinx.o)(.text+0x53a01):/usr/local/sphinx/src/sphinx.cpp:15575: undefined reference to `libiconv'
libsphinx.a(sphinx.o)(.text+0x53a28):/usr/local/sphinx/src/sphinx.cpp:15581: undefined reference to `libiconv_close'
collect2: ld returned 1 exit status
make[2]: *** [indexer] Error 1
make[2]: Leaving directory `/usr/local/sphinx/src'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/usr/local/sphinx/src'
make: *** [all-recursive] Error 1
Solution from the official site:
In the meantime I've change the configuration file and set
#define USE_LIBICONV 0 in line 8179.
Edit the configure file: on the #define USE_LIBICONV line, change the trailing value from 1 to 0.
Then rebuild.
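The same change can be made without opening an editor. This is only a sketch: it assumes the define sits on a single line of the form "#define USE_LIBICONV 1" in the file you pass in, and disable_libiconv is a hypothetical helper name. It keeps a .bak copy so the change is easy to undo.

```shell
# disable_libiconv FILE: rewrite "#define USE_LIBICONV 1" as "... 0" in FILE,
# keeping the original as FILE.bak. Lines that do not match are left untouched.
disable_libiconv() {
    file=$1
    cp "$file" "$file.bak" || return 1
    # Write back into the existing file so its permissions (e.g. the
    # executable bit on configure) are preserved.
    sed 's/^\(#define[[:space:]][[:space:]]*USE_LIBICONV[[:space:]][[:space:]]*\)1[[:space:]]*$/\10/' \
        "$file.bak" > "$file"
}

# Example usage, from the source directory:
# disable_libiconv ./configure
# grep -n 'USE_LIBICONV' ./configure   # confirm the value is now 0
```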
The configure options I used:
./configure --prefix=/usr/local/coreseek --with-mysql=/opt/lampp --with-mysql-includes=/opt/lampp/include/mysql/ --with-mysql-libs=/opt/lampp/lib/mysql/ --with-mmseg-includes=/usr/local/mmseg/include/mmseg --with-mmseg-libs=/usr/local/mmseg/lib/
Build error 1
make[2]: *** [indexer] Error 1
make[2]: Leaving directory `/www/tmp/csft-3.1/src'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/www/tmp/csft-3.1/src'
make: *** [all-recursive] Error 1
Solution
vi ./src/sphinx.cpp
Comment out the following code (using //, since # does not start a comment in C++):
//case TOKENIZER_ZHCN_GBK:
//pTokenizer = sphCreateGBKChineseTokenizer
//(tSettings.m_sDictPath.cstr(), tSettings.m_nBest); break;
Then rebuild:
make clean
make
make install
Error 2: indexing fails
[root@localhost csft-3.1]# /usr/local/coreseek/bin/indexer --config /usr/local/coreseek/etc/sphinx.conf --all
/usr/local/coreseek/bin/indexer: error while loading shared libraries: libmysqlclient.so.16: cannot open shared object file: No such file or directory
Solution
ln -s /opt/lampp/lib/mysql/libmysqlclient.so.16 /usr/lib/libmysqlclient.so.16
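Before creating the link, it can help to list every shared library the binary cannot resolve, in case more than one is missing. A small sketch, assuming ldd is available (as on most Linux systems); missing_libs is a hypothetical helper that filters ldd output read from standard input.

```shell
# missing_libs: read ldd output on stdin and print the names of libraries
# that the dynamic loader reports as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example usage:
# ldd /usr/local/coreseek/bin/indexer | missing_libs
```

After adding the symlink, running the same pipeline again should print nothing for libmysqlclient.so.16.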
Error 3: no index is generated
Solution: this turned out to be a typo. The correct command is
/usr/local/coreseek/bin/indexer --config /usr/local/coreseek/etc/sphinx.conf --all
but I had typed
/usr/local/coreseek/bin/indexer --config /usr/local/coreseek/etc/sphinx.conf -all
Error 4: the service reports an error on startup
[root@localhost coreseek]# ./bin/searchd --config /usr/local/coreseek/etc/sphinx.conf
Coreseek Full Text Server 3.1
Copyright (c) 2006-2008 coreseek.com
using config file '/usr/local/coreseek/etc/sphinx.conf'...
listening on all interfaces, port=3312
iniparser: cannot open /usr/local/coreseek/data/dict/mmseg.ini
Solution
vi /usr/local/coreseek/data/dict/mmseg.ini
Enter the following:
[mmseg]
merge_number_and_ascii=1;
number_and_ascii_joint=-;
compress_space=0;
seperate_number_ascii=1;
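The settings above mean:
/*
merge_number_and_ascii: whether a run of consecutive letters and digits is split or kept together
number_and_ascii_joint: characters that may join digits and letters, such as '-' or '.'
compress_space: currently has no effect
seperate_number_ascii: whether to split numbers into single digits, e.g. 1988 -> 1/x 9/x 8/x 8/x
*/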
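The file can also be created from a script, which is handy when repeating the install on another machine. A minimal sketch, using exactly the settings listed above; write_mmseg_ini is a hypothetical helper name, and it deliberately refuses to overwrite a file that already exists.

```shell
# write_mmseg_ini PATH: create PATH with the settings shown above,
# unless a file is already there.
write_mmseg_ini() {
    ini=$1
    if [ -e "$ini" ]; then
        return 0                      # never overwrite an existing file
    fi
    mkdir -p "$(dirname "$ini")" || return 1
    cat > "$ini" <<'EOF'
[mmseg]
merge_number_and_ascii=1;
number_and_ascii_joint=-;
compress_space=0;
seperate_number_ascii=1;
EOF
}

# Example usage:
# write_mmseg_ini /usr/local/coreseek/data/dict/mmseg.ini
```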