  1. eZ Publish / Platform
  2. EZP-19902

Add "age" based boosting to ezfind/solr relevancy score

    Details

      Description

      The current "relevancy" sorting in Solr does not take the date of the documents into account.

      The desired solution is a proper age-based boost factor to tune the relevancy score of matched documents.
      An introduction to the problem can be found here:
      http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents

        Activity

        Ricardo Correia (Inactive) added a comment - The following documents have been updated:
        http://doc.ez.no/Extensions/eZ-Publish-extensions/eZ-Find/eZ-Find-LS-5.0.0/Basic-Configuration/Configuration-settings-eZ-Find#queryboost
        http://doc.ez.no/Extensions/eZ-Publish-extensions/eZ-Find/eZ-Find-LS-5.0.0/Advanced-Configuration/Boosting-Dedicated-Functions
        Paul Borgermans (Inactive) added a comment (edited)

        Minimal doc:

        Since eZ Publish 5, the boost_functions parameter has been extended and now properly
        supports all boosting capabilities of Solr, so that search result relevancy scores
        can be fine-tuned.

        A) Fetch function parameters

        Most boost features are exposed via ezfind's native search fetch function parameters.

        The existing boost_functions parameter now has four distinct subparameters, each of them
        accepting an array of expressions corresponding to the Solr edismax boost parameters.

        1) 'fields'
        -----------

        Fields can be specified either in the form <class_identifier>/<attribute_identifier>
        or as their raw field identifier in the Solr index (typically for custom fields).

        Example:

        ... 'boost_functions', hash('fields',array('article/tags:3')) ....

        or with a raw Solr field identifier

        ... 'boost_functions', hash('fields',array('attr_tags_lk:3')) ....

        Either form applies a boost factor of 3 to the tags attribute
        at query time when matches are found there.

        2) 'mfunctions' (since eZ Find 2.8 / eZ Publish 5)
        --------------------------------------------------

        These correspond to the Solr edismax 'boost' parameter and multiply the score
        by the function value. This is usually what you should use, rather than the additive 'functions'.

        Example for age-based boosting:

        ... 'boost_functions', hash('mfunctions', array('recip(ms(NOW/DAY,meta_published_dt),3.16e-11,0.5,0.5)' )) ....
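
To see what the constants in this expression do, the multiplier can be worked out by hand. Solr defines recip(x,m,a,b) as a / (m*x + b), and ms(NOW/DAY,meta_published_dt) is the document age in milliseconds. The Python sketch below is only an illustration of that arithmetic outside Solr; the helper names age_boost and MS_PER_YEAR are made up for this example.

```python
# Illustration only: reproduces the arithmetic of
# recip(ms(NOW/DAY,meta_published_dt),3.16e-11,0.5,0.5) outside Solr.

MS_PER_YEAR = 365.25 * 24 * 3600 * 1000  # roughly 3.16e10 milliseconds


def recip(x, m, a, b):
    """Solr's recip(x, m, a, b) function: a / (m * x + b)."""
    return a / (m * x + b)


def age_boost(age_ms):
    """Score multiplier for a document published age_ms milliseconds ago."""
    return recip(age_ms, 3.16e-11, 0.5, 0.5)


if __name__ == "__main__":
    # Print the multiplier for documents of increasing age.
    for years in (0, 1, 2, 5):
        print(years, "year(s):", round(age_boost(years * MS_PER_YEAR), 3))
```

Because 3.16e-11 is approximately 1 divided by the number of milliseconds in a year, m*x is roughly the age in years. A brand-new document gets a multiplier of 1, a one-year-old document about one third, and older documents decay further without ever reaching zero.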

        3) 'functions'
        --------------

        These are like mfunctions, but add their value to the relevancy score

        Example:

        .... 'boost_functions', hash('functions', array('sum(product(attr_importance_si,0.1),1)')) ...
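
The difference between additive and multiplicative boosting is easiest to see with numbers. The sketch below is a simplified illustration, not Solr's actual scoring: it treats the base relevancy score as a plain number, assumes a hypothetical importance value of 5 for the attr_importance_si field, and applies the same function value both ways.

```python
# Simplified illustration: Solr's real scoring has more moving parts.
# It only shows why an additive boost depends on the scale of the base score.


def importance_value(importance):
    """Value of sum(product(attr_importance_si,0.1),1) for one document."""
    return importance * 0.1 + 1


def additive_boost(score, value):
    """'functions' style: the function value is added to the score."""
    return score + value


def multiplicative_boost(score, value):
    """'mfunctions' style: the score is multiplied by the function value."""
    return score * value


if __name__ == "__main__":
    value = importance_value(5)  # hypothetical importance of 5
    for base in (0.2, 20.0):
        print(
            "base", base,
            "additive", round(additive_boost(base, value), 3),
            "multiplicative", round(multiplicative_boost(base, value), 3),
        )
```

With a small base score the added value dominates the result, while with a large base score it barely matters. The multiplicative form scales every score by the same factor, which is one reason 'mfunctions' is usually the better choice.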

        4) 'queries'
        ------------

        These are added to the main query. They must follow the Solr/Lucene query format and specify the boost factor explicitly.

        Example:

        'boost_functions', hash('queries', array('meta_class_identifier_ms:article^10')),
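
As a rough mental model, the four subparameters can be pictured as a translation into edismax request parameters. The comment above only states that 'mfunctions' corresponds to 'boost'; the rest of the mapping in this sketch ('fields' to boosts in qf, 'functions' to bf, 'queries' to bq) is an assumption based on the standard edismax parameter names, and to_edismax_params is a hypothetical helper, not eZ Find code. It handles raw Solr field names only, not the <class_identifier>/<attribute_identifier> form.

```python
# Hypothetical sketch: NOT the eZ Find implementation.
# Assumed mapping: fields -> qf boosts, mfunctions -> boost,
# functions -> bf, queries -> bq.


def to_edismax_params(boost_functions):
    """Translate a boost_functions hash into edismax-style parameters."""
    params = {}

    # 'attr_tags_lk:3' becomes 'attr_tags_lk^3' (assumed translation).
    qf = []
    for spec in boost_functions.get("fields", []):
        if ":" in spec:
            name, _, factor = spec.rpartition(":")
            qf.append(f"{name}^{factor}")
        else:
            qf.append(spec)
    if qf:
        params["qf"] = " ".join(qf)

    # The remaining subparameters are passed through unchanged.
    mapping = (("mfunctions", "boost"), ("functions", "bf"), ("queries", "bq"))
    for key, solr_param in mapping:
        values = boost_functions.get(key, [])
        if values:
            params[solr_param] = list(values)

    return params


if __name__ == "__main__":
    example = {
        "fields": ["attr_tags_lk:3"],
        "mfunctions": ["recip(ms(NOW/DAY,meta_published_dt),3.16e-11,0.5,0.5)"],
        "queries": ["meta_class_identifier_ms:article^10"],
    }
    for name, value in to_edismax_params(example).items():
        print(name, "=", value)
```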

        B) Through ini settings

        Site-wide raw boost queries can also be defined in ezfind.ini, for example:

        [QueryBoost]
        #RawBoostQueries[]
        RawBoostQueries[]=meta_class_identifier_ms:summary^4

        Paul Borgermans (Inactive) added a comment - See pull request https://github.com/ezsystems/ezfind/pull/70

          People

          • Assignee: Unassigned
          • Reporter: Paulo Bras (Inactive)
          • Votes: 0
          • Watchers: 5

          Time Tracking

          • Original Estimate: 0m
          • Remaining Estimate: 0m
          • Time Spent: 1w 1d 3h 24m