eZ Publish / Platform / EZP-21239

eZ Find's auto-complete functionality does not work with Kanji and Hiragana Japanese characters

Details

    Description

      eZ Find's autocomplete functionality does not work with Kanji and Hiragana Japanese characters, on either backend or frontend siteaccesses. However, it does work with Katakana characters.

      Steps to reproduce:

      1. Configure CJKTokenizer in Solr. Following Solr's example schema (http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/solr/example/solr/conf/schema.xml), I added the following block to ./ezpublish_legacy/extension/ezfind/java/solr/conf/schema.xml:

      <fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
      	<analyzer>
      		<tokenizer class="solr.StandardTokenizerFactory"/>
      		<!-- normalize width before bigram, as e.g. half-width dakuten combine  -->
      		<filter class="solr.CJKWidthFilterFactory"/>
      		<!-- for any non-CJK -->
      		<filter class="solr.LowerCaseFilterFactory"/>
      		<filter class="solr.CJKBigramFilterFactory"/>
      	</analyzer>
      </fieldType>
      

      ...just after:

      <fieldtype name="geohash" class="solr.GeoHashField"/>
      

      Please note that you must restart Solr for the changes to take effect. Re-indexing is not necessary, though.
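
      Note for anyone reproducing this: in a Solr schema, a fieldType only affects analysis once a field (or dynamicField) references it by name. The report does not say which field, if any, was switched to text_cjk. The fragment below is a minimal sketch of such a reference. The field name example_text_cjk is hypothetical and is not part of the eZ Find schema.

```xml
<!-- Hypothetical field, for illustration only: not taken from the eZ Find schema. -->
<!-- The type attribute must match the name of the fieldType defined in step 1. -->
<field name="example_text_cjk" type="text_cjk" indexed="true" stored="true"/>
```

      Assuming a field like this were added, it would have to be declared inside the schema's <fields> section.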

      2. Create Japanese content. For the sake of completeness, I created content in Kanji, Hiragana and Katakana:

      Kanji: 漢字(かんじ) no auto-complete

      Hiragana: ひらがな no auto-complete

      Katakana: カタカナ auto-complete works

          People

            Unassigned
            nuno.oliveira-obsolete@ez.no Nuno Oliveira (Inactive)

              Time Tracking

                Estimated: Not Specified
                Remaining: 0m
                Logged: 1w 4d 30m (1 week, 4 days, 30 minutes)