eZ Publish / Platform / EZP-19902

Add "age" based boosting to ezfind/solr relevancy score

    Details

      Description

      The current "relevancy" sorting in Solr does not take the date of the documents into account.

      The desired solution is a proper age-based boost factor for tuning the relevancy score of matched documents.
      An introduction to the problem can be found here:
      http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents

        Activity

        Paulo Bras (Inactive) created issue -
        Paulo Bras (Inactive) made changes -
        Field Original Value New Value
        Status Open [ 1 ] Backlog [ 10000 ]
        Gunnstein Lye made changes -
        Status Backlog [ 10000 ] InputQ [ 10001 ]
        Paul Borgermans (Inactive) made changes -
        Assignee Paul Borgermans [ paul.borgermans@ez.no ]
        Paul Borgermans (Inactive) logged work - 13/Aug/12 6:12 PM - edited
        • Time Spent:
          4 hours, 30 minutes
           

          development, porting from customer project

        Paul Borgermans (Inactive) made changes -
        Remaining Estimate 0 minutes [ 0 ]
        Time Spent 4 hours, 30 minutes [ 16200 ]
        Worklog Id 13607 [ 13607 ]
        Paul Borgermans (Inactive) made changes -
        Worklog Id 13607 [ 13607 ]
        André Rømcke made changes -
        Rank Ranked higher
        Paul Borgermans (Inactive) logged work - 23/Aug/12 2:00 AM
        • Time Spent:
          3 hours
           

          R&D

        Paul Borgermans (Inactive) logged work - 24/Aug/12 2:00 AM
        • Time Spent:
          6 hours
           

          R&D, testing

        Paul Borgermans (Inactive) logged work - 27/Aug/12 2:00 AM
        • Time Spent:
          4 hours
           

          R&D

        Paul Borgermans (Inactive) logged work - 28/Aug/12 2:00 AM - edited
        • Time Spent:
          6 hours
           

          Reviewing reviews and applying recommendations

        Paul Borgermans (Inactive) made changes -
        Status InputQ [ 10001 ] Development [ 3 ]
        Paul Borgermans (Inactive) logged work - 29/Aug/12 2:00 AM
        • Time Spent:
          1 hour
           

          Merging into master, minimal docs

        Paul Borgermans (Inactive) added a comment - See pull request https://github.com/ezsystems/ezfind/pull/70
        Paul Borgermans (Inactive) made changes -
        Status Development [ 3 ] Devlopment done [ 5 ]
        Paul Borgermans (Inactive) added a comment - - edited

        Minimal doc:

        Since eZ Publish 5, the boost_functions parameter is extended and now properly
        supports all boosting capabilities of Solr in order to fine-tune the search result
        relevancy scores.

        A) Fetch function parameters

        Most boost features are exposed via ezfind's native search fetch function parameters.

        The existing boost_functions parameter now has 4 distinct subparameters, each of them accepting an array of
        expressions corresponding to the Solr edismax boost parameters.

        1) 'fields'
        -----------

        Fields can be specified either in the form <class_identifier>/<attribute_identifier>
        or as their raw field identifier in the Solr index (typically for custom fields).

        Example:

        ... 'boost_functions', hash('fields',array('article/tags:3')) ....

        or with a raw Solr field identifier

        ... 'boost_functions', hash('fields',array('attr_tags_lk:3')) ....

        Either form applies a boost factor of 3 to the tags attribute
        at query time when matches are found there.

        2) 'mfunctions' (since eZ Find 2.8 / eZ Publish 5)
        --------------------------------------------------

        These correspond to the Solr edismax 'boost' parameter and multiply the score
        by the function value. This is usually what you should use (rather than the additive 'functions').

        Example for age-based boosting:

        ... 'boost_functions', hash('mfunctions', array('recip(ms(NOW/DAY,meta_published_dt),3.16e-11,0.5,0.5)' )) ....
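
        To get a feel for the numbers in this expression, here is a minimal Python sketch of the arithmetic. It relies on Solr's definition recip(x,m,a,b) = a / (m*x + b), with x being the document age in milliseconds as returned by ms(NOW/DAY,meta_published_dt). The helper names age_boost and MS_PER_YEAR are made up for this illustration and are not part of eZ Find or Solr.

```python
# Illustrative only: reproduces the arithmetic of
# recip(ms(NOW/DAY,meta_published_dt),3.16e-11,0.5,0.5) outside of Solr.

MS_PER_YEAR = 365.25 * 24 * 3600 * 1000  # roughly 3.16e10 milliseconds


def recip(x, m, a, b):
    """Solr's recip function: a / (m * x + b)."""
    return a / (m * x + b)


def age_boost(age_ms, m=3.16e-11, a=0.5, b=0.5):
    """Score multiplier for a document that is age_ms milliseconds old."""
    return recip(age_ms, m, a, b)


if __name__ == "__main__":
    # m is about 1 / (milliseconds per year), so m * x is the age in years.
    for years in (0, 1, 2, 5):
        print(years, "year(s):", round(age_boost(years * MS_PER_YEAR), 3))
```

        With a = b = 0.5, a document published today keeps its full score (multiplier 1), a one-year-old document is scaled to roughly a third, and older documents decay further. Raising a and b together flattens the curve; lowering them makes it steeper.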

        3) 'functions'
        --------------

        These are like mfunctions, but add their value to the relevancy score instead of multiplying it.

        Example:

        .... 'boost_functions', hash('functions', array('sum(product(attr_importance_si,0.1),1)')) ...
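
        The difference between the additive 'functions' and the multiplicative 'mfunctions' can be sketched in Python as follows. This is a simplified model: it treats the boost as a plain addition or multiplication on a base score and ignores anything else Solr does when combining clauses. The base scores and the importance value are invented for the illustration.

```python
# Simplified model of additive vs. multiplicative boosting, using the
# function value of sum(product(attr_importance_si,0.1),1).


def function_value(importance):
    """Value of sum(product(importance, 0.1), 1)."""
    return importance * 0.1 + 1


def additive(score, importance):
    """'functions' style: the function value is added to the score."""
    return score + function_value(importance)


def multiplicative(score, importance):
    """'mfunctions' style: the score is multiplied by the function value."""
    return score * function_value(importance)


if __name__ == "__main__":
    importance = 5  # hypothetical attribute value, function value 1.5
    for base in (0.2, 8.0):  # hypothetical base relevancy scores
        print(
            "base", base,
            "additive", round(additive(base, importance), 3),
            "multiplicative", round(multiplicative(base, importance), 3),
        )
```

        In this model the multiplicative boost changes every score by the same ratio, whereas the additive boost dominates low base scores and barely matters for high ones. That is one way to read the advice above to prefer 'mfunctions' in most cases.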

        4) 'queries'
        ------------

        These are added to the main query. They need to follow the Solr/Lucene query format and specify the boost factor explicitly.

        Example:

        'boost_functions', hash('queries', array('meta_class_identifier_ms:article^10')),
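
        As a rough orientation, the four subparameters can be pictured as feeding separate edismax request parameters. The Python sketch below is not eZ Find code. Only the mapping of 'mfunctions' to 'boost' is stated above; mapping 'fields' to qf, 'functions' to bf and 'queries' to bq is an assumption based on the standard edismax parameters, and the function name edismax_boost_params is hypothetical. The sketch handles raw Solr field identifiers only and does not resolve the <class_identifier>/<attribute_identifier> form.

```python
from urllib.parse import urlencode

# Assumed mapping; only 'mfunctions' -> 'boost' is confirmed by the text.
EDISMAX_PARAM = {
    "fields": "qf",
    "mfunctions": "boost",
    "functions": "bf",
    "queries": "bq",
}


def edismax_boost_params(boost_functions):
    """Flatten a boost_functions-style dict into (name, value) request pairs."""
    pairs = []
    for subparameter, expressions in boost_functions.items():
        name = EDISMAX_PARAM[subparameter]
        for expression in expressions:
            if subparameter == "fields":
                # 'attr_tags_lk:3' becomes the Lucene-style 'attr_tags_lk^3'.
                field, factor = expression.rsplit(":", 1)
                expression = f"{field}^{factor}"
            pairs.append((name, expression))
    return pairs


if __name__ == "__main__":
    example = {
        "fields": ["attr_tags_lk:3"],
        "mfunctions": ["recip(ms(NOW/DAY,meta_published_dt),3.16e-11,0.5,0.5)"],
        "functions": ["sum(product(attr_importance_si,0.1),1)"],
        "queries": ["meta_class_identifier_ms:article^10"],
    }
    pairs = edismax_boost_params(example)
    # URL-encode the pairs the way they would travel in a Solr request.
    print(urlencode([("defType", "edismax")] + pairs))
```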

        B) Through ini settings

        Site-wide raw boost queries can also be defined in ezfind.ini, for example:

        [QueryBoost]
        #RawBoostQueries[]
        RawBoostQueries[]=meta_class_identifier_ms:summary^4

        Paul Borgermans (Inactive) made changes -
        Status Devlopment done [ 5 ] Documentation [ 10010 ]
        Paul Borgermans (Inactive) made changes -
        Time Spent 4 hours, 30 minutes [ 16200 ] 7 hours, 30 minutes [ 27000 ]
        Worklog Id 14858 [ 14858 ]
        Paul Borgermans (Inactive) made changes -
        Time Spent 7 hours, 30 minutes [ 27000 ] 1 day, 5 hours, 30 minutes [ 48600 ]
        Worklog Id 14859 [ 14859 ]
        Paul Borgermans (Inactive) made changes -
        Time Spent 1 day, 5 hours, 30 minutes [ 48600 ] 2 days, 1 hour, 30 minutes [ 63000 ]
        Worklog Id 14865 [ 14865 ]
        Paul Borgermans (Inactive) made changes -
        Time Spent 2 days, 1 hour, 30 minutes [ 63000 ] 2 days, 2 hours, 30 minutes [ 66600 ]
        Worklog Id 14867 [ 14867 ]
        Paul Borgermans (Inactive) made changes -
        Time Spent 2 days, 2 hours, 30 minutes [ 66600 ] 2 days, 6 hours, 30 minutes [ 81000 ]
        Worklog Id 14869 [ 14869 ]
        Paul Borgermans (Inactive) made changes -
        Time Spent 2 days, 6 hours, 30 minutes [ 81000 ] 3 days, 30 minutes [ 88200 ]
        Worklog Id 14865 [ 14865 ]
        André Rømcke made changes -
        Assignee Paul Borgermans [ paul.borgermans@ez.no ]
        Status Documentation [ 10010 ] Backlog [ 10000 ]
        André Rømcke made changes -
        Summary add "age" based boosting to ezfind/solr relevancy score Add "age" based boosting to ezfind/solr relevancy score
        Original Estimate 0 minutes [ 0 ]
        Component/s Documentation [ 10206 ]
        Component/s eZ Publish 4.x [ 10200 ]
        André Rømcke made changes -
        Status Backlog [ 10000 ] InputQ [ 10001 ]
        Gunnstein Lye made changes -
        Project eZ Publish EE/4.x Temp Development [ 10003 ] eZ Publish [ 10401 ]
        Key EZPNEXT-689 EZP-19902
        Workflow eZ Engineering Kanban Workflow [ 12534 ] eZ Community Workflow [ 34241 ]
        Component/s Documentation [ 10793 ]
        Component/s Documentation [ 10206 ]
        Gunnstein Lye made changes -
        Fix Version/s Customer request [ 11018 ]
        Gunnstein Lye made changes -
        Status InputQ [ 10001 ] Backlog [ 10000 ]
        Paulo Bras (Inactive) made changes -
        Labels extensibility support wit extensibility support
        André Rømcke made changes -
        Workflow eZ Community Workflow [ 34241 ] eZ Engineering Scrumban Workflow [ 35735 ]
        André Rømcke made changes -
        Status Backlog [ 10000 ] Documentation [ 10010 ]
        Assignee André Rømcke [ andre.romcke@ez.no ]
        André Rømcke made changes -
        Status Documentation [ 10010 ] InputQ [ 10001 ]
        Assignee André Rømcke [ andre.romcke@ez.no ]
        André Rømcke made changes -
        Status InputQ [ 10001 ] Development [ 3 ]
        Assignee André Rømcke [ andre.romcke@ez.no ]
        André Rømcke made changes -
        Status Development [ 3 ] Devlopment done [ 5 ]
        André Rømcke made changes -
        Status Devlopment done [ 5 ] Development Acceptance Done [ 10030 ]
        André Rømcke made changes -
        Assignee André Rømcke [ andre.romcke@ez.no ] Paul Borgermans [ paul.borgermans@ez.no ]
        Ricardo Correia (Inactive) logged work - 11/Jun/13 2:00 AM
        • Time Spent:
          7 hours, 30 minutes
           

          Investigating improvement.

        Ricardo Correia (Inactive) made changes -
        Status Development Acceptance Done [ 10030 ] Documentation [ 10010 ]
        Assignee Paul Borgermans [ paul.borgermans@ez.no ] Ricardo Correia [ ricardo.correia@ez.no ]
        Ricardo Correia (Inactive) added a comment -

        The following documents have been updated:
        http://doc.ez.no/Extensions/eZ-Publish-extensions/eZ-Find/eZ-Find-LS-5.0.0/Basic-Configuration/Configuration-settings-eZ-Find#queryboost
        http://doc.ez.no/Extensions/eZ-Publish-extensions/eZ-Find/eZ-Find-LS-5.0.0/Advanced-Configuration/Boosting-Dedicated-Functions

        Ricardo Correia (Inactive) made changes -
        Time Spent 3 days, 30 minutes [ 88200 ] 4 days, 15 minutes [ 116100 ]
        Worklog Id 35219 [ 35219 ]
        Ricardo Correia (Inactive) made changes -
        Status Documentation [ 10010 ] Documentation done [ 10011 ]
        Ricardo Correia (Inactive) made changes -
        Time Spent 4 days, 15 minutes [ 116100 ] 4 days, 7 hours, 45 minutes [ 143100 ]
        Worklog Id 35220 [ 35220 ]
        Pedro Resende (Inactive) made changes -
        Status Documentation done [ 10011 ] QA [ 10008 ]
        Assignee Ricardo Correia [ ricardo.correia@ez.no ] Pedro Resende [ pedro.resende@ez.no ]
        Pedro Resende (Inactive) logged work - 14/Jun/13 10:48 AM
        • Time Spent:
          6 hours, 30 minutes
           

          Work on story

        Paul Borgermans (Inactive) logged work - 14/Jun/13 2:24 PM
        • Time Spent:
          1 hour
           

          helping Pedro

        Pedro Resende (Inactive) made changes -
        Rank Ranked lower
        Pedro Resende (Inactive) made changes -
        Rank Ranked lower
        Pedro Resende (Inactive) made changes -
        Time Spent 4 days, 7 hours, 45 minutes [ 143100 ] 1 week, 6 hours, 15 minutes [ 166500 ]
        Worklog Id 35263 [ 35263 ]
        Pedro Resende (Inactive) made changes -
        Status QA [ 10008 ] InputQ [ 10001 ]
        Assignee Pedro Resende [ pedro.resende@ez.no ]
        Ricardo Correia (Inactive) made changes -
        Status InputQ [ 10001 ] Documentation [ 10010 ]
        Assignee Ricardo Correia [ ricardo.correia@ez.no ]
        Ricardo Correia (Inactive) made changes -
        Time Spent 1 week, 6 hours, 15 minutes [ 166500 ] 1 week, 6 hours, 15 minutes [ 166515 ]
        Worklog Id 35266 [ 35266 ]
        Ricardo Correia (Inactive) made changes -
        Status Documentation [ 10010 ] Documentation done [ 10011 ]
        Ricardo Correia (Inactive) made changes -
        Time Spent 1 week, 6 hours, 15 minutes [ 166515 ] 1 week, 6 hours, 30 minutes [ 167400 ]
        Worklog Id 35266 [ 35266 ]
        Pedro Resende (Inactive) made changes -
        Status Documentation done [ 10011 ] QA [ 10008 ]
        Assignee Ricardo Correia [ ricardo.correia@ez.no ] Pedro Resende [ pedro.resende@ez.no ]
        Pedro Resende (Inactive) logged work - 17/Jun/13 12:32 PM
        • Time Spent:
          3 hours, 54 minutes
           

          Review documentation and write test case

        Pedro Resende (Inactive) made changes -
        Time Spent 1 week, 6 hours, 30 minutes [ 167400 ] 1 week, 1 day, 2 hours, 24 minutes [ 181440 ]
        Worklog Id 35310 [ 35310 ]
        Pedro Resende (Inactive) made changes -
        Assignee Pedro Resende [ pedro.resende@ez.no ]
        Status QA [ 10008 ] Closed [ 6 ]
        Resolution Fixed [ 1 ]
        Paul Borgermans (Inactive) made changes -
        Resolution Fixed [ 1 ]
        Status Closed [ 6 ] Reopened [ 4 ]
        Paul Borgermans (Inactive) made changes -
        Time Spent 1 week, 1 day, 2 hours, 24 minutes [ 181440 ] 1 week, 1 day, 3 hours, 24 minutes [ 185040 ]
        Worklog Id 36188 [ 36188 ]
        Paul Borgermans (Inactive) made changes -
        Status Reopened [ 4 ] Closed [ 6 ]
        Resolution Fixed [ 1 ]
        André Rømcke made changes -
        Workflow eZ Engineering Scrumban Workflow [ 35735 ] EZ* Development Workflow [ 70368 ]
        Alex Schuster made changes -
        Workflow EZ* Development Workflow [ 70368 ] EZEE Development Workflow [ 108901 ]
        Transition                                  Time In Source Status  Execution Times  Last Executer          Last Execution Date
        Open -> Backlog                             8m 47s                 1                Paulo Bras (Inactive)  22/Jun/12 11:27 AM
        Development Done -> Documentation           1d 23h 34m             1                paul.borgermans@ez.no  31/Aug/12 1:55 PM
        Documentation -> Backlog                    55d 2h 10m             1                André Rømcke           25/Oct/12 4:05 PM
        Backlog -> InputQ                           24d 1h 12m             2                André Rømcke           25/Oct/12 4:06 PM
        InputQ -> Backlog                           5d 1h                  1                Gunnstein Lye          30/Oct/12 4:06 PM
        Backlog -> Documentation                    196d 4h 39m            1                André Rømcke           14/May/13 9:46 PM
        Documentation -> InputQ                     21d 20h 17m            1                André Rømcke           05/Jun/13 6:04 PM
        InputQ -> Development                       42d 23h 28m            2                André Rømcke           05/Jun/13 6:04 PM
        Development -> Development Done             1d 2h 12m              2                André Rømcke           05/Jun/13 6:04 PM
        Development Done -> Removed Status          6s                     1                André Rømcke           05/Jun/13 6:04 PM
        Removed Status -> Documentation             7d 8m                  1                ricardo.correia@ez.no  12/Jun/13 6:12 PM
        QA -> InputQ                                7h 28m                 1                pedro.resende@ez.no    14/Jun/13 4:41 PM
        InputQ -> Documentation                     32m 33s                1                ricardo.correia@ez.no  14/Jun/13 5:13 PM
        Documentation -> Documentation Review done  1m 59s                 2                ricardo.correia@ez.no  14/Jun/13 5:14 PM
        Documentation Review done -> QA             1d 15h 8m              2                pedro.resende@ez.no    14/Jun/13 5:24 PM
        QA -> Closed                                2d 22h 45m             1                pedro.resende@ez.no    17/Jun/13 4:09 PM
        Closed -> Reopened                          17d 20h 15m            1                paul.borgermans@ez.no  05/Jul/13 12:25 PM
        Reopened -> Closed                          0s                     1                paul.borgermans@ez.no  05/Jul/13 12:25 PM

          People

          • Assignee: Unassigned
          • Reporter: Paulo Bras (Inactive)
          • Votes: 0
          • Watchers: 5

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated: 0 minutes
              Remaining: 0 minutes
              Logged: 1 week, 1 day, 3 hours, 24 minutes