<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title><![CDATA[沧海一粟]]></title> 
<link>http://www.dzhope.com/index.php</link> 
<description><![CDATA[Web system architecture, server operations, and PHP development]]></description> 
<language>zh-cn</language> 
<copyright><![CDATA[沧海一粟]]></copyright>
<item>
<link>http://www.dzhope.com/post//</link>
<title><![CDATA[Notes on adding Sphinx full-text search support to Discuz! X2]]></title> 
<author>jed &lt;jed521@163.com&gt;</author>
<category><![CDATA[Server Technology]]></category>
<pubDate>Sat, 28 Apr 2012 08:35:26 +0000</pubDate> 
<guid>http://www.dzhope.com/post//</guid> 
<description>
<![CDATA[ 
	Sphinx is a solid full-text search engine that supports MySQL and PostgreSQL as data sources.<br/><br/>Stock Sphinx handles English full-text search well, but for Chinese full-text search the usual choice is coreseek, a modified build maintained by Chinese developers.<br/><br/>Installing coreseek is somewhat more involved. Below are my installation notes:<br/><br/>First, download the coreseek source:<br/><div class="code"><br/>wget http://www.coreseek.cn/uploads/csft/3.2/coreseek-3.2.14.tar.gz<br/></div><br/>Unpack it:<br/><div class="code"><br/>tar xzvf coreseek-3.2.14.tar.gz<br/>cd coreseek-3.2.14<br/></div><br/>The coreseek directory contains three subdirectories: mmseg-3.2.14, csft-3.2.14, and testpack. Install mmseg first, then csft.<br/><br/>Install mmseg, the Chinese word segmentation library:<br/><div class="code"><br/>cd mmseg-3.2.14<br/>aclocal<br/>libtoolize --force<br/>automake --add-missing<br/>autoconf<br/>autoheader<br/>make clean # any errors at this point can be ignored<br/>./configure --prefix=/usr/local/mmseg3<br/>make &amp;&amp; make install<br/></div><br/>Next, a small optimization: link the mmseg binary into the PATH and rebuild the dictionary:<br/><div class="code"><br/>cd data<br/>ln -s /usr/local/mmseg3/bin/mmseg /usr/bin/mmseg<br/>mmseg -u unigram.txt<br/>cp unigram.txt.uni /usr/local/mmseg3/etc/uni.lib<br/>cd ..<br/></div><br/>Return to the parent directory:<br/><div class="code"><br/>cd ..<br/></div><br/>Install csft, the main coreseek program:<br/><div class="code"><br/>cd csft-3.2.14<br/>sh buildconf.sh<br/>./configure --prefix=/usr/local/coreseek --without-python &#92;<br/>--without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ &#92;<br/>--with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql<br/>make &amp;&amp; make install<br/>cd ..<br/></div><br/>coreseek is now installed.<br/><br/>Run a quick test to check that coreseek works:<br/><div class="code"><br/>cd testpack<br/>/usr/local/coreseek/bin/indexer -c etc/csft.conf<br/>## Expected output when everything is working:<br/>Coreseek Fulltext 3.2 &#91; Sphinx 0.9.9-release (r2117)&#93;<br/>Copyright (c) 2007-2010,<br/>Beijing Choice Software Technologies Inc (http://www.coreseek.com)<br/><br/>using config file &#039;etc/csft.conf&#039;...<br/>total 0 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg<br/>total 0 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg<br/><br/>/usr/local/coreseek/bin/indexer -c etc/csft.conf --all<br/></div><br/>Next, edit the configuration to support Discuz! 
X2:<br/><div class="code"><br/>vi /usr/local/coreseek/etc/sphinx.conf<br/><br/>#<br/># Sphinx configuration file sample<br/>#<br/># WARNING! While this sample file mentions all available options,<br/># it contains (very) short helper descriptions only. Please refer to<br/># doc/sphinx.html for details.<br/>#<br/><br/>#############################################################################<br/>## data source definition<br/>#############################################################################<br/><br/>source threads<br/>&#123;<br/># data source type. mandatory, no default value<br/># known types are mysql, pgsql, mssql, xmlpipe, xmlpipe2, odbc<br/>type&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= mysql<br/><br/>#####################################################################<br/>## SQL settings (for &#039;mysql&#039; and &#039;pgsql&#039; types)<br/>#####################################################################<br/><br/># some straightforward parameters for SQL source types<br/>sql_host&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= localhost<br/>sql_user&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= XXX<br/>sql_pass&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= XXX<br/>sql_db&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= XXX<br/>sql_port&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 3306&nbsp;&nbsp;&nbsp;&nbsp;# optional, default is 3306<br/><br/># UNIX socket name<br/># optional, default is empty (reuse client library defaults)<br/># usually &#039;/var/lib/mysql/mysql.sock&#039; on Linux<br/># usually &#039;/tmp/mysql.sock&#039; on FreeBSD<br/>#<br/># 
sql_sock&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /tmp/mysql.sock<br/><br/># MySQL specific client connection flags<br/># optional, default is 0<br/>#<br/># mysql_connect_flags&nbsp;&nbsp;&nbsp;&nbsp;= 32 # enable compression<br/><br/># MySQL specific SSL certificate settings<br/># optional, defaults are empty<br/>#<br/># mysql_ssl_cert&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /etc/ssl/client-cert.pem<br/># mysql_ssl_key&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /etc/ssl/client-key.pem<br/># mysql_ssl_ca&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /etc/ssl/cacert.pem<br/><br/># MS SQL specific Windows authentication mode flag<br/># MUST be in sync with charset_type index-level setting<br/># optional, default is 0<br/>#<br/># mssql_winauth&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1 # use currently logged on user credentials<br/><br/># MS SQL specific Unicode indexing flag<br/># optional, default is 0 (request SBCS data)<br/>#<br/># mssql_unicode&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1 # request Unicode data from server<br/><br/># ODBC specific DSN (data source name)<br/># mandatory for odbc source type, no default value<br/>#<br/># odbc_dsn&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= DBQ=C:&#92;data;DefaultDir=C:&#92;data;Driver=&#123;Microsoft Text Driver (*.txt; *.csv)&#125;;<br/># sql_query&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SELECT id, data FROM documents.csv<br/><br/># pre-query, executed before the main fetch query<br/># multi-value, optional, default is empty list of queries<br/>#<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SET NAMES utf8<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SET SESSION 
query_cache_type=OFF<br/><br/>#timy<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = CREATE TABLE IF NOT EXISTS sph_counter ( counter_id INTEGER PRIMARY KEY NOT NULL,max_doc_id INTEGER NOT NULL)<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= REPLACE INTO sph_counter SELECT 1, MAX(tid)-100 FROM pre_forum_thread<br/>#timy<br/><br/># main document fetch query<br/># mandatory, integer document ID field MUST be the first selected column<br/>sql_query&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= &#92;<br/>SELECT t.tid AS id,t.tid,t.subject,t.digest,t.displayorder,t.authorid,t.lastpost,t.special &#92;<br/>FROM pre_forum_thread AS t &#92;<br/>WHERE t.tid&gt;=$start AND t.tid&lt;=$end<br/><br/># range query setup, query that must return min and max ID values<br/># optional, default is empty<br/>#<br/># sql_query will need to reference $start and $end boundaries<br/># if using ranged query:<br/>#<br/># sql_query&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= &#92;<br/>#&nbsp;&nbsp;&nbsp;&nbsp;SELECT doc.id, doc.id AS group, doc.title, doc.data &#92;<br/>#&nbsp;&nbsp;&nbsp;&nbsp;FROM documents doc &#92;<br/>#&nbsp;&nbsp;&nbsp;&nbsp;WHERE id&gt;=$start AND id&lt;=$end<br/>#<br/>sql_query_range&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SELECT (SELECT MIN(tid) FROM pre_forum_thread),max_doc_id FROM sph_counter WHERE counter_id=1<br/><br/># range query step<br/># optional, default is 1024<br/>#<br/># sql_range_step&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1000<br/><br/># unsigned integer attribute declaration<br/># multi-value (an arbitrary number of attributes is allowed), optional<br/># optional bit size can be specified, default is 32<br/>#<br/># sql_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= author_id<br/># 
sql_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= forum_id:9 # 9 bits for forum_id<br/>sql_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= tid<br/>sql_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= digest<br/>sql_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= displayorder<br/>sql_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= authorid<br/>sql_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= special<br/><br/># boolean attribute declaration<br/># multi-value (an arbitrary number of attributes is allowed), optional<br/># equivalent to sql_attr_uint with 1-bit size<br/>#<br/># sql_attr_bool&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= is_deleted<br/><br/># bigint attribute declaration<br/># multi-value (an arbitrary number of attributes is allowed), optional<br/># declares a signed (unlike uint!) 
64-bit attribute<br/>#<br/># sql_attr_bigint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= my_bigint_id<br/><br/># UNIX timestamp attribute declaration<br/># multi-value (an arbitrary number of attributes is allowed), optional<br/># similar to integer, but can also be used in date functions<br/>#<br/># sql_attr_timestamp&nbsp;&nbsp;&nbsp;&nbsp;= posted_ts<br/># sql_attr_timestamp&nbsp;&nbsp;&nbsp;&nbsp;= last_edited_ts<br/>sql_attr_timestamp&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= lastpost<br/><br/># string ordinal attribute declaration<br/># multi-value (an arbitrary number of attributes is allowed), optional<br/># sorts strings (bytewise), and stores their indexes in the sorted list<br/># sorting by this attr is equivalent to sorting by the original strings<br/>#<br/># sql_attr_str2ordinal&nbsp;&nbsp;&nbsp;&nbsp;= author_name<br/><br/># floating point attribute declaration<br/># multi-value (an arbitrary number of attributes is allowed), optional<br/># values are stored in single precision, 32-bit IEEE 754 format<br/>#<br/># sql_attr_float = lat_radians<br/># sql_attr_float = long_radians<br/><br/># multi-valued attribute (MVA) attribute declaration<br/># multi-value (an arbitrary number of attributes is allowed), optional<br/># MVA values are variable length lists of unsigned 32-bit integers<br/>#<br/># syntax is ATTR-TYPE ATTR-NAME &#039;from&#039; SOURCE-TYPE &#91;;QUERY&#93; &#91;;RANGE-QUERY&#93;<br/># ATTR-TYPE is &#039;uint&#039; or &#039;timestamp&#039;<br/># SOURCE-TYPE is &#039;field&#039;, &#039;query&#039;, or &#039;ranged-query&#039;<br/># QUERY is SQL query used to fetch all ( docid, attrvalue ) pairs<br/># RANGE-QUERY is SQL query used to fetch min and max ID values, similar to &#039;sql_query_range&#039;<br/>#<br/># sql_attr_multi&nbsp;&nbsp;&nbsp;&nbsp;= uint tag from query; SELECT id, tag FROM tags<br/># sql_attr_multi&nbsp;&nbsp;&nbsp;&nbsp;= uint tag from ranged-query; 
&#92;<br/>#&nbsp;&nbsp;&nbsp;&nbsp;SELECT id, tag FROM tags WHERE id&gt;=$start AND id&lt;=$end; &#92;<br/>#&nbsp;&nbsp;&nbsp;&nbsp;SELECT MIN(id), MAX(id) FROM tags<br/><br/># post-query, executed on sql_query completion<br/># optional, default is empty<br/>#<br/># sql_query_post&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;=<br/><br/># post-index-query, executed on successful indexing completion<br/># optional, default is empty<br/># $maxid expands to max document ID actually fetched from DB<br/>#<br/># sql_query_post_index = REPLACE INTO counters ( id, val ) &#92;<br/>#&nbsp;&nbsp;&nbsp;&nbsp;VALUES ( &#039;max_indexed_id&#039;, $maxid )<br/><br/># ranged query throttling, in milliseconds<br/># optional, default is 0 which means no delay<br/># enforces given delay before each query step<br/>sql_ranged_throttle&nbsp;&nbsp;&nbsp;&nbsp;= 0<br/><br/># document info query, ONLY for CLI search (ie. testing and debugging)<br/># optional, default is empty<br/># must contain $id macro and must fetch the document by that id<br/>sql_query_info&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SELECT * FROM pre_forum_thread WHERE tid=$id<br/><br/># kill-list query, fetches the document IDs for kill-list<br/># k-list will suppress matches from preceding indexes in the same query<br/># optional, default is empty<br/>#<br/># sql_query_killlist&nbsp;&nbsp;&nbsp;&nbsp;= SELECT id FROM documents WHERE edited&gt;=@last_reindex<br/><br/># columns to unpack on indexer side when indexing<br/># multi-value, optional, default is empty list<br/>#<br/># unpack_zlib = zlib_column<br/># unpack_mysqlcompress = compressed_column<br/># unpack_mysqlcompress = compressed_column_2<br/><br/># maximum unpacked length allowed in MySQL COMPRESS() unpacker<br/># optional, default is 16M<br/>#<br/># unpack_mysqlcompress_maxsize = 16M<br/><br/>#####################################################################<br/>## xmlpipe 
settings<br/>#####################################################################<br/><br/># type&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= xmlpipe<br/><br/># shell command to invoke xmlpipe stream producer<br/># mandatory<br/>#<br/># xmlpipe_command&nbsp;&nbsp;&nbsp;&nbsp;= cat /usr/local/coreseek/var/test.xml<br/><br/>#####################################################################<br/>## xmlpipe2 settings<br/>#####################################################################<br/><br/># type&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= xmlpipe2<br/># xmlpipe_command&nbsp;&nbsp;&nbsp;&nbsp;= cat /usr/local/coreseek/var/test2.xml<br/><br/># xmlpipe2 field declaration<br/># multi-value, optional, default is empty<br/>#<br/># xmlpipe_field&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= subject<br/># xmlpipe_field&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= content<br/><br/># xmlpipe2 attribute declaration<br/># multi-value, optional, default is empty<br/># all xmlpipe_attr_XXX options are fully similar to sql_attr_XXX<br/>#<br/># xmlpipe_attr_timestamp&nbsp;&nbsp;&nbsp;&nbsp;= published<br/># xmlpipe_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= author_id<br/><br/># perform UTF-8 validation, and filter out incorrect codes<br/># avoids XML parser choking on non-UTF-8 documents<br/># optional, default is 0<br/>#<br/># xmlpipe_fixup_utf8&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/>&#125;<br/><br/>#############################################################################<br/>## index definition<br/>#############################################################################<br/><br/># local index example<br/>#<br/># this is an index which is stored locally in the filesystem<br/>#<br/># all indexing-time 
options (such as morphology and charsets)<br/># are configured per local index<br/>index threads<br/>&#123;<br/># document source(s) to index<br/># multi-value, mandatory<br/># document IDs must be globally unique across all sources<br/>source&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= threads<br/><br/># index files path and file name, without extension<br/># mandatory, path must be writable, extensions will be auto-appended<br/>path&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /usr/local/coreseek/var/data/threads<br/><br/># document attribute values (docinfo) storage mode<br/># optional, default is &#039;extern&#039;<br/># known values are &#039;none&#039;, &#039;extern&#039; and &#039;inline&#039;<br/>docinfo&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= extern<br/>#charset_dictpath = /etc<br/><br/># memory locking for cached data (.spa and .spi), to prevent swapping<br/># optional, default is 0 (do not mlock)<br/># requires searchd to be run from root<br/>mlock&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 0<br/><br/># a list of morphology preprocessors to apply<br/># optional, default is empty<br/>#<br/># builtin preprocessors are &#039;none&#039;, &#039;stem_en&#039;, &#039;stem_ru&#039;, &#039;stem_enru&#039;,<br/># &#039;soundex&#039;, and &#039;metaphone&#039;; additional preprocessors available from<br/># libstemmer are &#039;libstemmer_XXX&#039;, where XXX is algorithm code<br/># (see libstemmer_c/libstemmer/modules.txt)<br/>#<br/># morphology&nbsp;&nbsp;&nbsp;&nbsp; = stem_en, stem_ru, soundex<br/># morphology&nbsp;&nbsp;&nbsp;&nbsp;= libstemmer_german<br/># morphology&nbsp;&nbsp;&nbsp;&nbsp;= libstemmer_sv<br/>morphology&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= none<br/><br/># minimum word length at which to enable stemming<br/># optional, default is 1 (stem everything)<br/>#<br/># min_stemming_len&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/><br/># 
stopword files list (space separated)<br/># optional, default is empty<br/># contents are plain text, charset_table and stemming are both applied<br/>#<br/># stopwords&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /usr/local/coreseek/var/data/stopwords.txt<br/><br/># wordforms file, in &quot;mapfrom &gt; mapto&quot; plain text format<br/># optional, default is empty<br/>#<br/># wordforms&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /usr/local/sphinx/var/data/wordforms.txt<br/><br/># tokenizing exceptions file<br/># optional, default is empty<br/>#<br/># plain text, case sensitive, space insensitive in map-from part<br/># one &quot;Map Several Words =&gt; ToASingleOne&quot; entry per line<br/>#<br/># exceptions&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /usr/local/sphinx/var/data/exceptions.txt<br/><br/># minimum indexed word length<br/># default is 1 (index everything)<br/>min_word_len&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/><br/># charset encoding type<br/># optional, default is &#039;sbcs&#039;<br/># known types are &#039;sbcs&#039; (Single Byte CharSet) and &#039;utf-8&#039;<br/>charset_type&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= utf-8<br/>charset_dictpath = /usr/local/mmseg3/etc/<br/><br/>##### Character table. Note: with this approach Sphinx splits Chinese text into single characters,<br/>##### i.e. it builds a per-character index; for real Chinese word segmentation you must use a segmentation plugin such as coreseek or sfc<br/>charset_table = U+FF10..U+FF19-&gt;0..9, 0..9, U+FF41..U+FF5A-&gt;a..z, U+FF21..U+FF3A-&gt;a..z,&#92;<br/>A..Z-&gt;a..z, a..z, U+0149, U+017F, U+0138, U+00DF, U+00FF, U+00C0..U+00D6-&gt;U+00E0..U+00F6,&#92;<br/>U+00E0..U+00F6, U+00D8..U+00DE-&gt;U+00F8..U+00FE, U+00F8..U+00FE, U+0100-&gt;U+0101, U+0101,&#92;<br/>U+0102-&gt;U+0103, U+0103, U+0104-&gt;U+0105, U+0105, U+0106-&gt;U+0107, U+0107, U+0108-&gt;U+0109,&#92;<br/>U+0109, U+010A-&gt;U+010B, U+010B, U+010C-&gt;U+010D, U+010D, U+010E-&gt;U+010F, U+010F,&#92;<br/>U+0110-&gt;U+0111, U+0111, U+0112-&gt;U+0113, U+0113, U+0114-&gt;U+0115, U+0115, 
&#92;<br/>U+0116-&gt;U+0117,U+0117, U+0118-&gt;U+0119, U+0119, U+011A-&gt;U+011B, U+011B, U+011C-&gt;U+011D,&#92;<br/>U+011D,U+011E-&gt;U+011F, U+011F, U+0130-&gt;U+0131, U+0131, U+0132-&gt;U+0133, U+0133, &#92;<br/>U+0134-&gt;U+0135,U+0135, U+0136-&gt;U+0137, U+0137, U+0139-&gt;U+013A, U+013A, U+013B-&gt;U+013C, &#92;<br/>U+013C,U+013D-&gt;U+013E, U+013E, U+013F-&gt;U+0140, U+0140, U+0141-&gt;U+0142, U+0142, &#92;<br/>U+0143-&gt;U+0144,U+0144, U+0145-&gt;U+0146, U+0146, U+0147-&gt;U+0148, U+0148, U+014A-&gt;U+014B, &#92;<br/>U+014B,U+014C-&gt;U+014D, U+014D, U+014E-&gt;U+014F, U+014F, U+0150-&gt;U+0151, U+0151, &#92;<br/>U+0152-&gt;U+0153,U+0153, U+0154-&gt;U+0155, U+0155, U+0156-&gt;U+0157, U+0157, U+0158-&gt;U+0159,&#92;<br/>U+0159,U+015A-&gt;U+015B, U+015B, U+015C-&gt;U+015D, U+015D, U+015E-&gt;U+015F, U+015F, &#92;<br/>U+0160-&gt;U+0161,U+0161, U+0162-&gt;U+0163, U+0163, U+0164-&gt;U+0165, U+0165, U+0166-&gt;U+0167, &#92;<br/>U+0167,U+0168-&gt;U+0169, U+0169, U+016A-&gt;U+016B, U+016B, U+016C-&gt;U+016D, U+016D, &#92;<br/>U+016E-&gt;U+016F,U+016F, U+0170-&gt;U+0171, U+0171, U+0172-&gt;U+0173, U+0173, U+0174-&gt;U+0175,&#92;<br/>U+0175,U+0176-&gt;U+0177, U+0177, U+0178-&gt;U+00FF, U+00FF, U+0179-&gt;U+017A, U+017A, &#92;<br/>U+017B-&gt;U+017C,U+017C, U+017D-&gt;U+017E, U+017E, U+0410..U+042F-&gt;U+0430..U+044F, &#92;<br/>U+0430..U+044F,U+05D0..U+05EA, U+0531..U+0556-&gt;U+0561..U+0586, U+0561..U+0587, &#92;<br/>U+0621..U+063A, U+01B9,U+01BF, U+0640..U+064A, U+0660..U+0669, U+066E, U+066F, &#92;<br/>U+0671..U+06D3, U+06F0..U+06FF,U+0904..U+0939, U+0958..U+095F, U+0960..U+0963, &#92;<br/>U+0966..U+096F, U+097B..U+097F,U+0985..U+09B9, U+09CE, U+09DC..U+09E3, U+09E6..U+09EF, &#92;<br/>U+0A05..U+0A39, U+0A59..U+0A5E,U+0A66..U+0A6F, U+0A85..U+0AB9, U+0AE0..U+0AE3, &#92;<br/>U+0AE6..U+0AEF, U+0B05..U+0B39,U+0B5C..U+0B61, U+0B66..U+0B6F, U+0B71, U+0B85..U+0BB9, &#92;<br/>U+0BE6..U+0BF2, U+0C05..U+0C39,U+0C66..U+0C6F, U+0C85..U+0CB9, U+0CDE..U+0CE3, 
&#92;<br/>U+0CE6..U+0CEF, U+0D05..U+0D39, U+0D60,U+0D61, U+0D66..U+0D6F, U+0D85..U+0DC6, &#92;<br/>U+1900..U+1938, U+1946..U+194F, U+A800..U+A805,U+A807..U+A822, U+0386-&gt;U+03B1, &#92;<br/>U+03AC-&gt;U+03B1, U+0388-&gt;U+03B5, U+03AD-&gt;U+03B5,U+0389-&gt;U+03B7, U+03AE-&gt;U+03B7, &#92;<br/>U+038A-&gt;U+03B9, U+0390-&gt;U+03B9, U+03AA-&gt;U+03B9,U+03AF-&gt;U+03B9, U+03CA-&gt;U+03B9, &#92;<br/>U+038C-&gt;U+03BF, U+03CC-&gt;U+03BF, U+038E-&gt;U+03C5,U+03AB-&gt;U+03C5, U+03B0-&gt;U+03C5, &#92;<br/>U+03CB-&gt;U+03C5, U+03CD-&gt;U+03C5, U+038F-&gt;U+03C9,U+03CE-&gt;U+03C9, U+03C2-&gt;U+03C3, &#92;<br/>U+0391..U+03A1-&gt;U+03B1..U+03C1,U+03A3..U+03A9-&gt;U+03C3..U+03C9, U+03B1..U+03C1, &#92;<br/>U+03C3..U+03C9, U+0E01..U+0E2E,U+0E30..U+0E3A, U+0E40..U+0E45, U+0E47, U+0E50..U+0E59, &#92;<br/>U+A000..U+A48F, U+4E00..U+9FBF,U+3400..U+4DBF, U+20000..U+2A6DF, U+F900..U+FAFF, &#92;<br/>U+2F800..U+2FA1F, U+2E80..U+2EFF,U+2F00..U+2FDF, U+3100..U+312F, U+31A0..U+31BF, &#92;<br/>U+3040..U+309F, U+30A0..U+30FF,U+31F0..U+31FF, U+AC00..U+D7AF, U+1100..U+11FF, &#92;<br/>U+3130..U+318F, U+A000..U+A48F,U+A490..U+A4CF<br/>min_prefix_len = 0<br/>min_infix_len = 1<br/>ngram_len = 1<br/><br/># charset definition and case folding rules &quot;table&quot;<br/># optional, default value depends on charset_type<br/>#<br/># defaults are configured to include English and Russian characters only<br/># you need to change the table to include additional ones<br/># this behavior MAY change in future versions<br/>#<br/># &#039;sbcs&#039; default value is<br/># charset_table&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 0..9, A..Z-&gt;a..z, _, a..z, U+A8-&gt;U+B8, U+B8, U+C0..U+DF-&gt;U+E0..U+FF, U+E0..U+FF<br/>#<br/># &#039;utf-8&#039; default value is<br/># charset_table&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 0..9, A..Z-&gt;a..z, _, a..z, U+410..U+42F-&gt;U+430..U+44F, U+430..U+44F<br/><br/># ignored characters list<br/># optional, default value is empty<br/>#<br/># 
ignore_chars&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= U+00AD<br/><br/># minimum word prefix length to index<br/># optional, default is 0 (do not index prefixes)<br/>#<br/># min_prefix_len&nbsp;&nbsp;&nbsp;&nbsp;= 0<br/><br/># minimum word infix length to index<br/># optional, default is 0 (do not index infixes)<br/>#<br/># min_infix_len&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 0<br/><br/># list of fields to limit prefix/infix indexing to<br/># optional, default value is empty (index all fields in prefix/infix mode)<br/>#<br/># prefix_fields&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= filename<br/># infix_fields&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= url, domain<br/><br/># enable star-syntax (wildcards) when searching prefix/infix indexes<br/># known values are 0 and 1<br/># optional, default is 0 (do not use wildcard syntax)<br/>#<br/># enable_star&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/><br/># n-gram length to index, for CJK indexing<br/># only supports 0 and 1 for now, other lengths to be implemented<br/># optional, default is 0 (disable n-grams)<br/>#<br/># ngram_len&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/><br/># n-gram characters list, for CJK indexing<br/># optional, default is empty<br/>#<br/># ngram_chars&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= U+3000..U+2FA1F<br/><br/># phrase boundary characters list<br/># optional, default is empty<br/>#<br/># phrase_boundary&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= ., ?, !, U+2026 # horizontal ellipsis<br/><br/># phrase boundary word position increment<br/># optional, default is 0<br/>#<br/># phrase_boundary_step&nbsp;&nbsp;&nbsp;&nbsp;= 100<br/><br/># whether to strip HTML tags from incoming documents<br/># known values are 0 (do not strip) and 1 (do strip)<br/># optional, default is 
0<br/>html_strip&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 0<br/><br/># what HTML attributes to index if stripping HTML<br/># optional, default is empty (do not index anything)<br/>#<br/># html_index_attrs&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= img=alt,title; a=title;<br/><br/># what HTML elements contents to strip<br/># optional, default is empty (do not strip element contents)<br/>#<br/># html_remove_elements&nbsp;&nbsp;&nbsp;&nbsp;= style, script<br/><br/># whether to preopen index data files on startup<br/># optional, default is 0 (do not preopen), searchd-only<br/>#<br/># preopen&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/><br/># whether to keep dictionary (.spi) on disk, or cache it in RAM<br/># optional, default is 0 (cache in RAM), searchd-only<br/>#<br/># ondisk_dict&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/><br/># whether to enable in-place inversion (2x less disk, 90-95% speed)<br/># optional, default is 0 (use separate temporary files), indexer-only<br/>#<br/># inplace_enable&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/><br/># in-place fine-tuning options<br/># optional, defaults are listed below<br/>#<br/># inplace_hit_gap&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 0&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# preallocated hitlist gap size<br/># inplace_docinfo_gap&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 0&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# preallocated docinfo gap size<br/># inplace_reloc_factor&nbsp;&nbsp;&nbsp;&nbsp;= 0.1&nbsp;&nbsp;&nbsp;&nbsp;# relocation buffer size within arena<br/># inplace_write_factor&nbsp;&nbsp;&nbsp;&nbsp;= 0.1&nbsp;&nbsp;&nbsp;&nbsp;# write buffer size within arena<br/><br/># whether to index original keywords along with 
stemmed versions<br/># enables &quot;=exactform&quot; operator to work<br/># optional, default is 0<br/>#<br/># index_exact_words&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/><br/># position increment on overshort (less that min_word_len) words<br/># optional, allowed values are 0 and 1, default is 1<br/>#<br/># overshort_step&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/><br/># position increment on stopword<br/># optional, allowed values are 0 and 1, default is 1<br/>#<br/># stopword_step&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/>&#125;<br/><br/>#threads_minute<br/>source threads_minute : threads<br/>&#123;<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;=<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SET NAMES UTF8<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = SET SESSION query_cache_type=OFF<br/><br/>sql_query_range&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SELECT max_doc_id+1,(SELECT MAX(tid) FROM pre_forum_thread) FROM sph_counter WHERE counter_id=1<br/>&#125;<br/><br/>#threads_minute<br/>index threads_minute : threads<br/>&#123;<br/>source&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= threads_minute<br/>path&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /usr/local/coreseek/var/data/threads_minute<br/>&#125;<br/><br/>#posts<br/>source posts : threads<br/>&#123;<br/>type&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= mysql<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;=<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SET NAMES UTF8<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = SET SESSION 
query_cache_type=OFF<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = CREATE TABLE IF NOT EXISTS sph_counter ( counter_id INTEGER PRIMARY KEY NOT NULL,max_doc_id INTEGER NOT NULL)<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= REPLACE INTO sph_counter SELECT 2, MAX(pid)-5000 FROM pre_forum_post<br/><br/>sql_query&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SELECT p.pid AS id,p.tid,p.subject,p.message,t.digest,t.displayorder,t.authorid,t.lastpost,t.special &#92;<br/>FROM pre_forum_post AS p LEFT JOIN pre_forum_thread AS t USING(tid) &#92;<br/>WHERE p.pid&gt;=$start AND p.pid&lt;=$end<br/><br/>sql_query_range&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SELECT (SELECT MIN(pid) FROM pre_forum_post),max_doc_id FROM sph_counter WHERE counter_id=2<br/>sql_range_step&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 4096<br/><br/>sql_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= tid<br/>sql_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= digest<br/>sql_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= displayorder<br/>sql_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= authorid<br/>sql_attr_uint&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= special<br/><br/>sql_attr_timestamp&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;=lastpost<br/><br/>sql_query_info&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SELECT * FROM pre_forum_post WHERE pid=$id<br/>&#125;<br/><br/>#posts<br/>index posts<br/>&#123;<br/>source&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= posts<br/>path&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 
/usr/local/coreseek/var/data/posts<br/>docinfo&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= extern<br/>mlock&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 0<br/>morphology&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= none<br/>charset_dictpath= /usr/local/coreseek/etc/<br/>charset_debug&nbsp;&nbsp; =&nbsp;&nbsp; 0<br/>#### minimum indexed word length<br/>min_word_len = 1<br/>charset_type = utf-8<br/>html_strip = 0<br/><br/>##### Character table. Note: with this approach Sphinx splits Chinese text into single characters,<br/>##### i.e. it builds a per-character index; for real Chinese word segmentation you must use a segmentation plugin such as coreseek or sfc<br/>charset_table = U+FF10..U+FF19-&gt;0..9, 0..9, U+FF41..U+FF5A-&gt;a..z, U+FF21..U+FF3A-&gt;a..z,&#92;<br/>A..Z-&gt;a..z, a..z, U+0149, U+017F, U+0138, U+00DF, U+00FF, U+00C0..U+00D6-&gt;U+00E0..U+00F6,&#92;<br/>U+00E0..U+00F6, U+00D8..U+00DE-&gt;U+00F8..U+00FE, U+00F8..U+00FE, U+0100-&gt;U+0101, U+0101,&#92;<br/>U+0102-&gt;U+0103, U+0103, U+0104-&gt;U+0105, U+0105, U+0106-&gt;U+0107, U+0107, U+0108-&gt;U+0109,&#92;<br/>U+0109, U+010A-&gt;U+010B, U+010B, U+010C-&gt;U+010D, U+010D, U+010E-&gt;U+010F, U+010F,&#92;<br/>U+0110-&gt;U+0111, U+0111, U+0112-&gt;U+0113, U+0113, U+0114-&gt;U+0115, U+0115, &#92;<br/>U+0116-&gt;U+0117,U+0117, U+0118-&gt;U+0119, U+0119, U+011A-&gt;U+011B, U+011B, U+011C-&gt;U+011D,&#92;<br/>U+011D,U+011E-&gt;U+011F, U+011F, U+0130-&gt;U+0131, U+0131, U+0132-&gt;U+0133, U+0133, &#92;<br/>U+0134-&gt;U+0135,U+0135, U+0136-&gt;U+0137, U+0137, U+0139-&gt;U+013A, U+013A, U+013B-&gt;U+013C, &#92;<br/>U+013C,U+013D-&gt;U+013E, U+013E, U+013F-&gt;U+0140, U+0140, U+0141-&gt;U+0142, U+0142, &#92;<br/>U+0143-&gt;U+0144,U+0144, U+0145-&gt;U+0146, U+0146, U+0147-&gt;U+0148, U+0148, U+014A-&gt;U+014B, &#92;<br/>U+014B,U+014C-&gt;U+014D, U+014D, U+014E-&gt;U+014F, U+014F, U+0150-&gt;U+0151, U+0151, &#92;<br/>U+0152-&gt;U+0153,U+0153, U+0154-&gt;U+0155, U+0155, U+0156-&gt;U+0157, U+0157, U+0158-&gt;U+0159,&#92;<br/>U+0159,U+015A-&gt;U+015B, U+015B, U+015C-&gt;U+015D, U+015D, U+015E-&gt;U+015F, U+015F, 
&#92;<br/>U+0160-&gt;U+0161,U+0161, U+0162-&gt;U+0163, U+0163, U+0164-&gt;U+0165, U+0165, U+0166-&gt;U+0167, &#92;<br/>U+0167,U+0168-&gt;U+0169, U+0169, U+016A-&gt;U+016B, U+016B, U+016C-&gt;U+016D, U+016D, &#92;<br/>U+016E-&gt;U+016F,U+016F, U+0170-&gt;U+0171, U+0171, U+0172-&gt;U+0173, U+0173, U+0174-&gt;U+0175,&#92;<br/>U+0175,U+0176-&gt;U+0177, U+0177, U+0178-&gt;U+00FF, U+00FF, U+0179-&gt;U+017A, U+017A, &#92;<br/>U+017B-&gt;U+017C,U+017C, U+017D-&gt;U+017E, U+017E, U+0410..U+042F-&gt;U+0430..U+044F, &#92;<br/>U+0430..U+044F,U+05D0..U+05EA, U+0531..U+0556-&gt;U+0561..U+0586, U+0561..U+0587, &#92;<br/>U+0621..U+063A, U+01B9,U+01BF, U+0640..U+064A, U+0660..U+0669, U+066E, U+066F, &#92;<br/>U+0671..U+06D3, U+06F0..U+06FF,U+0904..U+0939, U+0958..U+095F, U+0960..U+0963, &#92;<br/>U+0966..U+096F, U+097B..U+097F,U+0985..U+09B9, U+09CE, U+09DC..U+09E3, U+09E6..U+09EF, &#92;<br/>U+0A05..U+0A39, U+0A59..U+0A5E,U+0A66..U+0A6F, U+0A85..U+0AB9, U+0AE0..U+0AE3, &#92;<br/>U+0AE6..U+0AEF, U+0B05..U+0B39,U+0B5C..U+0B61, U+0B66..U+0B6F, U+0B71, U+0B85..U+0BB9, &#92;<br/>U+0BE6..U+0BF2, U+0C05..U+0C39,U+0C66..U+0C6F, U+0C85..U+0CB9, U+0CDE..U+0CE3, &#92;<br/>U+0CE6..U+0CEF, U+0D05..U+0D39, U+0D60,U+0D61, U+0D66..U+0D6F, U+0D85..U+0DC6, &#92;<br/>U+1900..U+1938, U+1946..U+194F, U+A800..U+A805,U+A807..U+A822, U+0386-&gt;U+03B1, &#92;<br/>U+03AC-&gt;U+03B1, U+0388-&gt;U+03B5, U+03AD-&gt;U+03B5,U+0389-&gt;U+03B7, U+03AE-&gt;U+03B7, &#92;<br/>U+038A-&gt;U+03B9, U+0390-&gt;U+03B9, U+03AA-&gt;U+03B9,U+03AF-&gt;U+03B9, U+03CA-&gt;U+03B9, &#92;<br/>U+038C-&gt;U+03BF, U+03CC-&gt;U+03BF, U+038E-&gt;U+03C5,U+03AB-&gt;U+03C5, U+03B0-&gt;U+03C5, &#92;<br/>U+03CB-&gt;U+03C5, U+03CD-&gt;U+03C5, U+038F-&gt;U+03C9,U+03CE-&gt;U+03C9, U+03C2-&gt;U+03C3, &#92;<br/>U+0391..U+03A1-&gt;U+03B1..U+03C1,U+03A3..U+03A9-&gt;U+03C3..U+03C9, U+03B1..U+03C1, &#92;<br/>U+03C3..U+03C9, U+0E01..U+0E2E,U+0E30..U+0E3A, U+0E40..U+0E45, U+0E47, U+0E50..U+0E59, &#92;<br/>U+A000..U+A48F, U+4E00..U+9FBF,U+3400..U+4DBF, 
U+20000..U+2A6DF, U+F900..U+FAFF, &#92;<br/>U+2F800..U+2FA1F, U+2E80..U+2EFF,U+2F00..U+2FDF, U+3100..U+312F, U+31A0..U+31BF, &#92;<br/>U+3040..U+309F, U+30A0..U+30FF,U+31F0..U+31FF, U+AC00..U+D7AF, U+1100..U+11FF, &#92;<br/>U+3130..U+318F, U+A000..U+A48F,U+A490..U+A4CF<br/>min_prefix_len = 0<br/>min_infix_len = 1<br/>ngram_len = 1<br/><br/>&#125;<br/><br/>#posts_minute<br/>source posts_minute : posts<br/>&#123;<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;=<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SET NAMES UTF8<br/>sql_query_pre&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; = SET SESSION query_cache_type=OFF<br/><br/>sql_query_range&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= SELECT max_doc_id+1,(SELECT MAX(pid) FROM pre_forum_post) FROM sph_counter WHERE counter_id=2<br/>&#125;<br/><br/>#posts_minute<br/>index posts_minute : posts<br/>&#123;<br/>source&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= posts_minute<br/>path&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /usr/local/coreseek/var/data/posts_minute<br/>&#125;<br/><br/>#############################################################################<br/>## indexer settings<br/>#############################################################################<br/><br/>indexer<br/>&#123;<br/># memory limit, in bytes, kilobytes (16384K) or megabytes (256M)<br/># optional, default is 32M, max is 2047M, recommended is 256M to 1024M<br/>mem_limit&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 256M<br/><br/># maximum IO calls per second (for I/O throttling)<br/># optional, default is 0 (unlimited)<br/>#<br/># max_iops&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 40<br/><br/># maximum IO call size, bytes (for I/O throttling)<br/># optional, default is 0 (unlimited)<br/>#<br/># 
max_iosize&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1048576<br/><br/># maximum xmlpipe2 field length, bytes<br/># optional, default is 2M<br/>#<br/># max_xmlpipe2_field&nbsp;&nbsp;&nbsp;&nbsp;= 4M<br/><br/># write buffer size, bytes<br/># several (currently up to 4) buffers will be allocated<br/># write buffers are allocated in addition to mem_limit<br/># optional, default is 1M<br/>#<br/># write_buffer&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1M<br/>&#125;<br/><br/>#############################################################################<br/>## searchd settings<br/>#############################################################################<br/><br/>searchd<br/>&#123;<br/># hostname, port, or hostname:port, or /unix/socket/path to listen on<br/># multi-value, multiple listen points are allowed<br/># optional, default is 0.0.0.0:9312 (listen on all interfaces, port 9312)<br/>#<br/># listen&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 127.0.0.1<br/># listen&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 192.168.0.1:9312<br/>listen&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 9312<br/># listen&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /var/run/searchd.sock<br/><br/># log file, searchd run info is logged here<br/># optional, default is &#039;searchd.log&#039;<br/>log&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /usr/local/coreseek/var/log/searchd.log<br/><br/># query log file, all search queries are logged here<br/># optional, default is empty (do not log queries)<br/>query_log&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /usr/local/coreseek/var/log/query.log<br/><br/># client read timeout, seconds<br/># optional, default is 
5<br/>read_timeout&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 5<br/><br/># request timeout, seconds<br/># optional, default is 5 minutes<br/>client_timeout&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 300<br/><br/># maximum amount of children to fork (concurrent searches to run)<br/># optional, default is 0 (unlimited)<br/>max_children&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 30<br/><br/># PID file, searchd process ID file name<br/># mandatory<br/>pid_file&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /usr/local/coreseek/var/log/searchd.pid<br/><br/># max amount of matches the daemon ever keeps in RAM, per-index<br/># WARNING, THERE&#039;S ALSO PER-QUERY LIMIT, SEE SetLimits() API CALL<br/># default is 1000 (just like Google)<br/>max_matches&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1000<br/><br/># seamless rotate, prevents rotate stalls if precaching huge datasets<br/># optional, default is 1<br/>seamless_rotate&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/><br/># whether to forcibly preopen all indexes on startup<br/># optional, default is 0 (do not preopen)<br/>preopen_indexes&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 0<br/><br/># whether to unlink .old index copies on successful rotation.<br/># optional, default is 1 (do unlink)<br/>unlink_old&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/><br/># attribute updates periodic flush timeout, seconds<br/># updates will be automatically dumped to disk this frequently<br/># optional, default is 0 (disable periodic flush)<br/>#<br/># attr_flush_period&nbsp;&nbsp;&nbsp;&nbsp;= 900<br/><br/># instance-wide ondisk_dict defaults (per-index values take precedence)<br/># optional, default is 0 (precache all dictionaries in RAM)<br/>#<br/># ondisk_dict_default&nbsp;&nbsp;&nbsp;&nbsp;= 1<br/><br/># MVA updates pool size<br/># shared between all instances of searchd, disables attr flushes!<br/># optional, 
default size is 1M<br/>mva_updates_pool&nbsp;&nbsp;&nbsp;&nbsp;= 1M<br/><br/># max allowed network packet size<br/># limits both query packets from clients, and responses from agents<br/># optional, default size is 8M<br/>max_packet_size&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 8M<br/><br/># crash log path<br/># searchd will (try to) log crashed query to &#039;crash_log_path.PID&#039; file<br/># optional, default is empty (do not create crash logs)<br/>#<br/># crash_log_path&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= /usr/local/coreseek/var/log/crash<br/><br/># max allowed per-query filter count<br/># optional, default is 256<br/>max_filters&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 256<br/><br/># max allowed per-filter values count<br/># optional, default is 4096<br/>max_filter_values&nbsp;&nbsp;&nbsp;&nbsp;= 4096<br/><br/># socket listen queue length<br/># optional, default is 5<br/>#<br/># listen_backlog&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 5<br/><br/># per-keyword read buffer size<br/># optional, default is 256K<br/>#<br/># read_buffer&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 256K<br/><br/># unhinted read size (currently used when reading hits)<br/># optional, default is 32K<br/>#<br/># read_unhinted&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;= 32K<br/>&#125;<br/><br/># --eof--<br/></div><br/>Next, build the indexes:<br/><div class="code"><br/>/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/sphinx.conf --all<br/></div><br/>Once that finishes, start the searchd service process:<br/><div class="code"><br/>/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/sphinx.conf<br/></div><br/>At this point coreseek (Sphinx) is up and serving queries. Next, Discuz! X2 itself has to be configured:<br/>Log in to the X2 admin panel and go to Global =&gt; Search Settings =&gt; enable Sphinx as the full-text search engine. The exact settings are shown in the figure below:<br/><br/><br/>Generally speaking, that is all you need in order to use Sphinx full-text search in Discuz! X2. But because Sphinx full-text search is so capable, I decided to have it replace the regular search entirely, so one of the Discuz! X2 search scripts also needs a change:<br/>Edit source&#92;module&#92;search&#92;search_forum.php<br/><br/>Find the line $srchtype = empty($_G[&#039;gp_srchtype&#039;]) ? &#039;&#039; : trim($_G[&#039;gp_srchtype&#039;]); and comment it out by putting a # in front of it, then add the following on a new line: $srchtype = empty($_G[&#039;gp_srchtype&#039;]) ? &#039;&#039; :&#039;fulltext&#039;; This way, whenever a search type is submitted, Sphinx full-text search is used regardless of whether the user chose full-text search. (A note: I originally used $srchtype = &#039;fulltext&#039;; but later found that this breaks the &quot;View new posts&quot; feature. I fixed that today.)<br/><br/>Next, set up Sphinx delta (incremental) indexing:<br/><br/>Delta index: build_delta_index.sh<br/><div class="code"><br/>#!/bin/sh<br/>/usr/local/coreseek/bin/indexer --config /usr/local/coreseek/etc/sphinx.conf threads_minute posts_minute --rotate &gt;&gt; /var/log/sphinx_delta.log<br/>#/usr/local/coreseek/bin/indexer --config /usr/local/coreseek/etc/sphinx.conf --merge threads threads_minute&nbsp;&nbsp;--rotate &gt;&gt; /var/log/sphinx_delta.log<br/>#/usr/local/coreseek/bin/indexer --config /usr/local/coreseek/etc/sphinx.conf --merge posts posts_minute&nbsp;&nbsp;--rotate &gt;&gt; /var/log/sphinx_delta.log<br/></div><br/>Note the # in front of the last two lines. I originally planned to merge the indexes every time the delta index was rebuilt, but decided against it for two reasons. First, the forum currently has about 230 million posts, so every merge takes a long time. Second, when Sphinx merges, it does not delete duplicate records but simply adds a new one, so the index files double in size. The index files already take up tens of GB, so doubling them is not worth it. Fortunately, Discuz! X2 searches both the main index and the delta index, so I merge the delta index only once a day. So far this has been running well.<br/><br/>Merge the delta index: merge_delta_index.sh<br/><div class="code"><br/>#!/bin/sh<br/>/usr/local/coreseek/bin/indexer --config /usr/local/coreseek/etc/sphinx.conf --merge threads threads_minute&nbsp;&nbsp;--rotate &gt;&gt; /var/log/sphinx_delta.log<br/>/usr/local/coreseek/bin/indexer --config /usr/local/coreseek/etc/sphinx.conf --merge posts posts_minute&nbsp;&nbsp;--rotate &gt;&gt; /var/log/sphinx_delta.log<br/></div><br/>Main index: build_main_index.sh<br/><div class="code"><br/>#!/bin/sh<br/>/usr/local/coreseek/bin/indexer --config /usr/local/coreseek/etc/sphinx.conf threads posts --rotate &gt;&gt; /var/log/sphinx_main.log<br/></div><br/>Scheduled index updates with cron:<br/><div class="code"><br/>crontab -e<br/><br/>*/5 * * * * /root/build_delta_index.sh &gt; /dev/null 2&gt;&amp;1<br/>0 3 * * 1-6 /root/merge_delta_index.sh &gt; /dev/null 2&gt;&amp;1<br/>0 3 * * 0 /root/build_main_index.sh &gt; /dev/null 2&gt;&amp;1<br/></div><br/>Save and exit.<br/><br/>Some problems I ran into while installing Sphinx, and how I solved them:<br/><br/>/usr/local/coreseek/bin/search --config /usr/local/sphinx/etc/sphinx.conf<br/>/usr/local/sphinx/bin/indexer: error while loading shared libraries: libmysqlclient.so.16: cannot open shared object file: No such file or directory<br/><br/>This happens because coreseek (Sphinx) cannot find libmysqlclient.so. Solution:<br/><br/>vi /etc/ld.so.conf<br/>Append this line at the end of the file:<br/>/usr/lib/mysql<br/>Then run:<br/>ldconfig<br/><br/>./bootstrap: line 23: aclocal: command not found<br/>./bootstrap: line 24: libtoolize: command not found<br/>The autotools packages, including libtool, need to be installed. Solution:<br/>yum install autoconf automake libtool<br/><br/>WARNING: source &#039;xml&#039;: xmlpipe2 support NOT compiled in. To use xmlpipe2, install missing XML libraries, reconfigure, and rebuild Sphinx<br/>libxml2 and its development headers need to be installed. Solution:<br/><br/>yum install libxml2 libxml2-devel<br/>Then recompile coreseek (Sphinx).<br/><br/>If you get an error saying that charset_dictpath is not supported, you are most likely running stock Sphinx rather than coreseek: charset_dictpath is a coreseek-only option (charset_table itself is also accepted by stock Sphinx). Running the binaries from the correct coreseek path fixes it.<br/><br/>
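When a configuration option is rejected as unsupported, one quick way to see which build is actually being run is to look at the startup banner: in the test run earlier in this post, the coreseek indexer announces itself as &quot;Coreseek Fulltext&quot;. The script below is only a minimal sketch under assumptions. is_coreseek_banner is a hypothetical helper name, not part of coreseek. The paths are the install locations used in this post. It also assumes the banner is written to standard output, as that test run suggests:<br/>
<div class="code"><br/>
#!/bin/sh<br/>
# Hypothetical helper (not part of coreseek): succeeds if the given text contains the Coreseek banner.<br/>
is_coreseek_banner() {<br/>
printf '%s' "$1" | grep -q 'Coreseek Fulltext'<br/>
}<br/>
<br/>
# Install paths as used in this post; adjust them if yours differ.<br/>
INDEXER=/usr/local/coreseek/bin/indexer<br/>
CONF=/usr/local/coreseek/etc/sphinx.conf<br/>
<br/>
# Assumption: this invocation prints the banner on standard output.<br/>
banner=$("$INDEXER" -c "$CONF")<br/>
if is_coreseek_banner "$banner"; then<br/>
echo "coreseek build"<br/>
else<br/>
echo "no Coreseek banner found; check which indexer binary is being run"<br/>
fi<br/>
</div><br/>
If the second message appears, compare the path of the binary being invoked with /usr/local/coreseek/bin before changing the configuration.<br/><br/>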
]]>
</description>
</item>
</channel>
</rss>