Community Platforms / COM-19843

Forum RSS feed content contains GeSHi-formatted source code snippets that are not properly encoded into HTML entities within literal tag content, which prevents the feed content from displaying correctly

    Details

    • Type: Bug
    • Status: InputQ
    • Priority: High
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: articles, blogs, forums, share.ez.no
    • Labels:
      None

      Description

      Hello,

      This problem is simple in nature but technically complicated to explain.

      In short, we need GeSHi to encode source code snippets (HTML, PHP, or any other language) into HTML entities in the content of the RSS feeds.

      We now know the solution to this problem (it took detailed research to narrow down the cause and arrive at a fix). This is a dire need.

      Currently no one consuming the share.ez.no RSS feed content can solve this problem on their own. The solution we have tested has to be implemented on share.ez.no itself so that it is solved properly for everyone.

      Reproduce use case
      ================

      This problem is hard to reproduce because you must test with content that does not stay in the feed for very long. Alternatively, import share.ez.no forum RSS feed content into a text block datatype attribute and try to display it; the content is nearly impossible to display without rendering problems.

      Longer description
      ==============

      We are importing forum RSS content from share.ez.no.

      ezgeshi is enabled on the site's forum content to provide pretty display of literal tag content, so the rss module view receives the GeSHi output inside the pre tag. That makes it almost impossible to display the pre tag content when it contains HTML markup that has not been properly encoded into HTML entities.

      So the problem appears to be any content in a pre tag (ezliteral tag) that is mixed with GeSHi's pretty-printing HTML, again without first being properly encoded into HTML entities.

      When the content is imported, the HTML entities are all converted to tags. Displaying the forum message then breaks the page, because the pre tag contains both the HTML example code that was shared for review and the GeSHi markup.
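The failure described above can be sketched in a few lines of PHP. This is a minimal illustration, not the actual import code: the snippet text is made up, and the single `html_entity_decode()` call is an assumption that stands in for whatever the importing site does when it converts entities back to tags.

```php
<?php
// Hypothetical HTML snippet a forum user shares inside a literal tag.
$snippet = '<div class="box">Hello</div>';

// Escaped once, which is how a <pre> block has to carry it to show as text.
$escaped = '<pre>' . htmlspecialchars($snippet, ENT_QUOTES, 'UTF-8') . '</pre>';

// Assumption: the importer decodes entities once while storing the feed item.
$imported = html_entity_decode($escaped, ENT_QUOTES, 'UTF-8');

// The shared snippet is now live markup inside the <pre> element.
echo $imported, "\n";
```

Once the entities are decoded, the browser treats the shared snippet as real elements, which is what breaks the page layout.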

      We can write all kinds of special output PHP to clean up the HTML of the forum message content, but it seems virtually impossible to fix the problem once GeSHi HTML is mixed into a pre tag with user HTML (such as a snippet of HTML shared for review). How do you say "do not transform the ezgeshi HTML into entities, but do turn the HTML code that ezgeshi formatted into entities"? This is the problem that users of the share.ez.no forum RSS feeds face. We cannot solve a problem of this nature downstream, because it originates in the content rendered by the rss module view templates.

      We need a solution in the share.ez.no templates that either prevents content rendered through the rss module view from ever being run through GeSHi, or properly HTML-entity-encodes the user-submitted literal tag code snippets.

      We have even checked other feed sites that import share.ez.no forum content to see how it renders on their pages, and the results are the same (consistent, but terrible and next to impossible to read). This is definitely a share.ez.no-specific problem.

      Please help!
      ==========

      We have to moderate (hide) by hand every forum post that contains HTML in a literal tag, which is time consuming and disappointing. For most such posts, the GeSHi output causes the browser to interpret the markup and break the page display in ways that are next to impossible to predict or prevent.

      Idea
      ====
      If the template could detect that the rss module feed view is being used, and then either properly encode the source code content of the ezliteral tag into HTML entities or not push the literal tag output through GeSHi, we would be spared this problem every time a user uses the literal tag to share an HTML snippet.

      Please let us know what you think.

      Thanking you in advance for your continued support!

      Respectfully,
      Brookins Consulting
      http://ezecosystem.org

        Activity

        Brookins Consulting added a comment - - edited

        Hello,

        After another round of exhaustive testing we have once again come to what we think is a final conclusion (we will skip all the painful lessons re-learned save one).

        If GeSHi were not enabled for literal tag content in RSS output, we could write our own code to transform all HTML inside pre tags into entities to prevent rendering.

        Thus we think our original suggestion is the best possible solution.

        We think so because in the RSS feed all HTML in the description element is XML-entity encoded and cannot (to our knowledge) also contain HTML entities. Otherwise we would go a step further and suggest that GeSHi be modified to do this work: convert html4strict content from tags to entities while it adds its highlighting markup to the pre tag content.
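The consumer-side cleanup mentioned above (escaping everything inside pre tags when no GeSHi markup is mixed in) might look roughly like the sketch below. The function name `escape_pre_blocks` is hypothetical, and the regular expression assumes the shared snippet never contains a literal `</pre>` of its own.

```php
<?php
/**
 * Hypothetical consumer-side cleanup: escape everything between <pre> and
 * </pre> so it is shown as text. Assumes the block holds no highlighter
 * markup and the snippet itself contains no literal "</pre>".
 */
function escape_pre_blocks($html)
{
    return preg_replace_callback(
        '#<pre\b([^>]*)>(.*?)</pre>#is',
        function (array $m) {
            // $m[1] = attributes of the pre tag, $m[2] = raw inner content.
            return '<pre' . $m[1] . '>'
                . htmlspecialchars($m[2], ENT_QUOTES, 'UTF-8')
                . '</pre>';
        },
        $html
    );
}

echo escape_pre_blocks('<p>See:</p><pre class="html"><b>bold</b></pre>'), "\n";
```

This only works when the pre content is plain user input. As soon as highlighter spans are mixed in, the same call would escape them too, which is why the change has to happen on the feed side.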

        extension/community/design/standard/templates/content/datatype/view/ezxmltags/literal.tpl

        {if and( is_set( $language ), $language )}
        {* if and( is_set( $filename ), $filename )}<p class="filename">{$filename|wash}</p>{/if *}
        {if module_params().function_name|eq('feed')}
            <pre{if ne( $language|trim, '' )} class="{$language|wash}"{/if}>{$content|wash( xhtml )}</pre>
        {else}{$content|trim|geshi( $language )}{/if}
        {else}
            {if ne( $classification, 'html' )}
            <pre{if ne( $classification|trim, '' )} class="{$classification|wash}"{/if}>{$content|wash( xhtml )}</pre>
            {else}
            {$content}
            {/if}
        {/if}
        

        Thank you again for your continued support!

        Cheers,
        Brookins Consulting

        Brookins Consulting added a comment - - edited

        Hello,

        Apologies for our delay. We finally set up our old copy of the share.ez.no svn source code on our server again and tested the complete end-to-end use case, both on a local share.ez.no install alone and combined with eZecosystem.

        We found that, according to the specs and our testing, RSS 2.0 can contain HTML entities when properly encoded (so can Atom).
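As a rough illustration of what "properly encoded" means here, the sketch below escapes a made-up snippet twice: once for display inside a pre tag, and once more for the feed's description element. The variable names are ours, and `htmlspecialchars_decode()` only stands in for the single level of decoding an XML parser performs on the feed.

```php
<?php
// Hypothetical snippet from a forum post.
$snippet = '<a href="#">link</a>';

// Level 1: escape for display as text inside <pre> on an HTML page.
$forHtml = htmlspecialchars($snippet, ENT_QUOTES, 'UTF-8');

// Level 2: escape again when placing that HTML in the feed's <description>.
$forFeed = htmlspecialchars('<pre>' . $forHtml . '</pre>', ENT_QUOTES, 'UTF-8');

// A feed reader's XML parser undoes level 2 only (simulated here).
$afterXmlParse = htmlspecialchars_decode($forFeed, ENT_QUOTES);

echo $afterXmlParse, "\n";
```

After one level of decoding the pre tag is real markup again, while the shared snippet is still entity-encoded text.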

        We found that the following solves the problem while disabling GeSHi output for forum RSS exported content (backup solution):

        extension/community/settings/template.ini.append.php

        <snip />
        [PHP]
        PHPOperatorList[htmlentities]=htmlentities
        <snip />
        

        extension/community/design/standard/templates/content/datatype/view/ezxmltags/literal.tpl

        {if and( is_set( $language ), $language )}
        {* if and( is_set( $filename ), $filename )}<p class="filename">{$filename|wash}</p>{/if *}
        {if module_params().function_name|eq('feed')}
        <pre{if ne( $language|trim, '' )} class="{$language|wash}"{/if}>{$content|htmlentities|wash( xhtml )}</pre>
        {else}{$content|trim|geshi( $language )}{/if}
        {else}
            {if ne( $classification, 'html' )}
            <pre{if ne( $classification|trim, '' )} class="{$classification|wash}"{/if}>{$content|wash( xhtml )}</pre>
            {else}
            {$content}
            {/if}
        {/if}
        

        We found that the following changes solve the problem while retaining GeSHi output for forum RSS exported content (preferred solution):

        extension/community/design/standard/templates/content/datatype/view/ezxmltags/literal.tpl

        {if and( is_set( $language ), $language )}
        {* if and( is_set( $filename ), $filename )}<p class="filename">{$filename|wash}</p>{/if *}
        {if module_params().function_name|eq('feed')}
        {$content|trim|geshi( concat( $language, '-rss' ) )}
        {else}
        {$content|trim|geshi( $language )}
        {/if}
        {else}
            {if ne( $classification, 'html' )}
            <pre{if ne( $classification|trim, '' )} class="{$classification|wash}"{/if}>{$content|wash( xhtml )}</pre>
            {else}
            {$content}
            {/if}
        {/if}
        

        extension/community/classes/geshi.php

            function GeSHi($source, $language, $path = '') {
                $this->set_source($source);
                $this->set_language_path($path);
         
                if( strpos( $language, '-rss' ) !== false )
                {
                   $actual_language = str_replace( "-rss", "", $language );
                   $this->set_language( $actual_language );
                   $this->rss = true;
                }
                else
                {
                   $this->set_language( $language );
                   $this->rss = false;
                }
            }
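One thing to keep in mind with this constructor change: `strpos()` matches `-rss` anywhere in the language string, and `str_replace()` removes every occurrence. A stricter variant would treat `-rss` as a flag only when it is the final suffix. The helper below is hypothetical (the name `split_rss_language` is ours and it is not part of GeSHi) and only sketches that check.

```php
<?php
/**
 * Hypothetical helper, not part of GeSHi: split an optional "-rss" suffix
 * off a language name. Returns array(language, isRss).
 */
function split_rss_language($language)
{
    $suffix = '-rss';
    $length = strlen($suffix);

    // Only treat "-rss" as a flag when it is the final part of the name.
    if (strlen($language) > $length && substr($language, -$length) === $suffix) {
        return array(substr($language, 0, -$length), true);
    }

    return array($language, false);
}

var_dump(split_rss_language('php-rss'));
var_dump(split_rss_language('html4strict'));
```

The constructor could then call `set_language()` with the first element and store the second in `$this->rss`.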
        

        extension/community/classes/geshi.php

            function finalise($parsed_code) {
        <snip />
        else {
                    // No line numbers, but still need to handle highlighting lines extra.
                    // Have to use divs so the full width of the code is highlighted
                    $code = explode("\n", $parsed_code);
                    $parsed_code = '';
                    $i = 0;
                    foreach ($code as $line) {
                        // Make lines have at least one space in them if they're empty
                        // BenBE: Checking emptiness using trim instead of relying on blanks
                        if ('' == trim($line)) {
                            $line = '&nbsp;';
                        }
                        if (in_array(++$i, $this->highlight_extra_lines)) {
                            if ($this->use_classes) {
                                if (array_key_exists($i, $this->highlight_extra_lines_styles)) {
                                    $parsed_code .= "<div class=\"lx$i\">";
                                } else {
                                    $parsed_code .= "<div class=\"ln-xtra\">";
                                }
                            } else {
                                $parsed_code .= "<div style=\"" . $this->get_line_style($i) . "\">";
                            }
                            // Remove \n because it stuffs up <pre> header
                            // Parenthesize the ternary so </div> is appended in both modes
                            $parsed_code .= ( $this->rss == true ? htmlentities( $line ) : $line ) . "</div>";
                        } else {
                            $parsed_code .= $this->rss == true ? htmlentities( $line ) . "\n" : $line . "\n";
                        }
                    }
                }
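A note on operator precedence for lines that mix a ternary with string concatenation: in PHP the `.` operator binds tighter than `?:`, so a trailing `. "</div>"` belongs to the else branch only unless the ternary is parenthesized. The standalone sketch below uses made-up variable names to show the difference.

```php
<?php
$line  = '<b>';
$isRss = true;

// Without parentheses "." binds tighter than "?:", so the closing tag
// is attached to the else branch only and is lost when $isRss is true.
$unparenthesized = $isRss == true ? htmlentities($line) : $line . "</div>";

// With parentheses the closing tag is appended in both cases.
$parenthesized = ($isRss ? htmlentities($line) : $line) . "</div>";

echo $unparenthesized, "\n";
echo $parenthesized, "\n";
```

In RSS mode the first form leaves the opening div unclosed, so the parenthesized form is the safer one for the highlighted-line branch.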
        
        

        Consider this problem solved, that is, once you install the preferred solution (above).

        Thank you again for your continued support! Please let us know your thoughts.

        Cheers,
        Brookins Consulting

        Brookins Consulting added a comment - - edited

        Hello,

        In a continued effort to make this solution clear and easy to re-implement, here are the svn diffs of the preferred solution's file changes:

        svn diff extension/community/design/standard/templates/content/datatype/view/ezxmltags/literal.tpl
        Index: extension/community/design/standard/templates/content/datatype/view/ezxmltags/literal.tpl
        ===================================================================
        --- extension/community/design/standard/templates/content/datatype/view/ezxmltags/literal.tpl   (revision 66166)
        +++ extension/community/design/standard/templates/content/datatype/view/ezxmltags/literal.tpl   (working copy)
        @@ -1,6 +1,11 @@
         {if and( is_set( $language ), $language )}
         {* if and( is_set( $filename ), $filename )}<p class="filename">{$filename|wash}</p>{/if *}
        -{$content|trim|geshi( $language )}
        +{if module_params().function_name|eq('feed')}
        +{$content|trim|geshi( concat( $language, '-rss' ) )}
        +{else}{$content|trim|geshi( $language )}{/if}
         {else}
             {if ne( $classification, 'html' )}
             <pre{if ne( $classification|trim, '' )} class="{$classification|wash}"{/if}>{$content|wash( xhtml )}</pre>
        

        svn diff extension/community/classes/geshi.php
        Index: extension/community/classes/geshi.php
        ===================================================================
        --- extension/community/classes/geshi.php       (revision 66166)
        +++ extension/community/classes/geshi.php       (working copy)
        @@ -438,7 +438,19 @@
             function GeSHi($source, $language, $path = '') {
                 $this->set_source($source);
                 $this->set_language_path($path);
        -        $this->set_language($language);
        +
        +        if( strpos( $language, '-rss' ) !== false )
        +        {
        +           $actual_language = str_replace( "-rss", "", $language );
        +           $this->set_language( $actual_language );
        +           $this->rss = true;
        +        }
        +        else
        +        {
        +           $actual_language = str_replace( "-rss", "", $language );
        +           $this->set_language( $actual_language );
        +           $this->rss = false;
        +        }
             }
         
             /**
        @@ -2603,9 +2615,9 @@
                                 $parsed_code .= "<div style=\"" . $this->get_line_style($i) . "\">";
                             }
                             // Remove \n because it stuffs up <pre> header
        -                    $parsed_code .= $line . "</div>";
        +                    $parsed_code .= ( $this->rss == true ? htmlentities( $line ) : $line ) . "</div>";
                         } else {
        -                    $parsed_code .= $line . "\n";
        +                    $parsed_code .= $this->rss == true ? htmlentities( $line ) . "\n" : $line . "\n";
                         }
                     }
                 }
        

        Brookins Consulting added a comment -

        Hello Sylvain!

        We noticed this issue's due date has been changed several times now.

        Could you say when you think you will be able to implement the solution above? It is ready to go, and we know of no side effects or further problems.

        We all really need the solution implemented to prevent the whole host of problems described above that stem from source code not being encoded into HTML entities.

        Respectfully,
        Brookins Consulting

        Brookins Consulting added a comment -

        Hello Again!

        We have renamed the issue and rewritten our ticket description to clarify the problem and the solution, since we have now solved this issue and shared the fixes. We now need them implemented and posted to the share.ez.no website for group testing, to ensure that the fixes work well for you folks at eZ Systems (we know these fixes will work perfectly for us in the eZ Community).

        Cheers,
        Brookins Consulting


          People

          • Assignee:
            Sylvain Guittard
            Reporter:
            Brookins Consulting
          • Votes:
            0
            Watchers:
            2

            Dates

            • Due:
              Created:
              Updated: