Mediawiki parser and recursiveTagParse


I have an issue with rendering wikitext in a hook for tag processing.
public static function onTagRender( $input, array $args, $parser, $frame ) {
...
$text = $parser->recursiveTagParse($sometext, $frame);
...
return $text;
}
If $sometext contains e.g.
"Example from page [[XYZ]]"
then I expect the returned $text to contain
"Example from page XYZ"
But I get only
"Example from page <!--LINK 0:0-->"
I have also tried $parser->replaceInternalLinks(), but with the same result. What have I overlooked?

If some people run into the same problem, try calling replaceLinkHolders after recursiveTagParse. (I didn't have the same problem so I didn't test it.)
So in OP's code snippet, that would be:
public static function onTagRender( $input, array $args, $parser, $frame ) {
...
$text = $parser->recursiveTagParse($sometext, $frame);
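// replaceLinkHolders() takes $text by reference and rewrites it in place, so its return value is not assigned
$parser->replaceLinkHolders($text);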
...
return $text;
}
Explanation according to my understanding:
The usual parse method calls internalParse, which does most of the job, and then does some additional work. recursiveTagParse, on the other hand, does little more than call internalParse, so it skips the additional work that parse performs.
Problem is, links are parsed in two steps:
Links are first extracted into LinkHolderArray and they are replaced with <!--LINK $ns:$key--> in the text.
(This is done by replaceInternalLinks, called by internalParse, so that's fine.)
Then <!--LINK $ns:$key--> markers are parsed into HTML links.
(This is done by replaceLinkHolders which is called by parse, not by internalParse, and thus not by recursiveTagParse.)
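To make the two steps concrete, here is a toy model in plain PHP. It is only a sketch of the placeholder idea, not MediaWiki's implementation: the function names extractLinks() and replaceHolders() and the /wiki/ URL prefix are made up for illustration.

```php
<?php
// Toy model only: these functions are NOT part of MediaWiki.

/** Step 1: swap each [[Target]] for a placeholder and remember the target. */
function extractLinks(string $wikitext, array &$holders): string
{
    return preg_replace_callback('/\[\[([^\]|]+)\]\]/', function (array $m) use (&$holders) {
        $key = count($holders);
        $holders[$key] = $m[1];
        return '<!--LINK 0:' . $key . '-->';
    }, $wikitext);
}

/** Step 2: turn every placeholder back into an HTML link. */
function replaceHolders(string $text, array $holders): string
{
    return preg_replace_callback('/<!--LINK (\d+):(\d+)-->/', function (array $m) use ($holders) {
        $title = $holders[(int) $m[2]] ?? null;
        if ($title === null) {
            return $m[0]; // unknown placeholder: leave it untouched
        }
        return '<a href="/wiki/' . rawurlencode($title) . '">' . htmlspecialchars($title) . '</a>';
    }, $text);
}

$holders = [];
$partial = extractLinks('Example from page [[XYZ]]', $holders);
echo $partial, "\n";                           // Example from page <!--LINK 0:0-->
echo replaceHolders($partial, $holders), "\n"; // Example from page <a href="/wiki/XYZ">XYZ</a>
```

Stopping after the first call gives exactly the kind of output described in the question; the second call is the step that recursiveTagParse leaves out.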

Parser::recursiveTagParse only does partial rendering, afaik. That may or may not be the problem. To fully render any user input, you will have to create a parser function (http://www.mediawiki.org/wiki/Manual:Parser_functions) instead of a tag function.
See http://www.mediawiki.org/wiki/Manual:Tag_extensions#How_do_I_render_wikitext_in_my_extension.3F


How to omit counting certain p tags within p tag counting function in wordpress

I need some help with two WordPress functions that I use inside my functions.php file to push ad codes into the blog article based on paragraph count.
What code do you currently use?
Here is the code that I'm currently using inside my functions.php file:
/*Add ad after 20 paragraph of post if there is more than 21 paragraph*/
add_filter( 'the_content', 'ad_20', 15 );
function ad_20( $content ) {
global $post;
if( check_paragraph_count_blog( $content ) > 21 ) {
$ad_code = '...ad code goes here...';
if ( $post->post_type == 'post' ) {
return prefix_insert_after_paragraph( $ad_code, 20, $content );
}
}
return $content;
}
// Parent Function that makes the magic happen
function prefix_insert_after_paragraph( $insertion, $paragraph_id, $content ) {
$closing_p = '</p>';
$paragraphs = explode( $closing_p, $content );
foreach ($paragraphs as $index => $paragraph) {
if ( trim( $paragraph ) ) {
$paragraphs[$index] .= $closing_p;
}
if ( $paragraph_id == $index + 1 ) {
$paragraphs[$index] .= $insertion;
}
}
return implode( '', $paragraphs );
}
//Check paragraph count on a blog post
function check_paragraph_count_blog( $content ) {
global $post;
if ( $post->post_type == 'post' ) {
$count = substr_count( $content, '</p>' );
return $count;
} else {
return 0;
}
}
What's the problem with your code?
Well, my code works fine without any errors, but it doesn't fully do what I want.
What do you want your code to do?
The main problem with the code posted above is that both the prefix_insert_after_paragraph() function and the check_paragraph_count_blog() function count all p tags regardless of where they are located. That is not what I want. I want the following:
Don't consider the p tags present within <code>, <pre>, <code class="some-language-name">, <pre class="some-language-name">.
Also don't consider p tags present within certain div tags, for example <div class="callout some-class some-other-class">.
What's the problem with those certain div tags?
Well, I use several shortcodes inside my articles to show well-designed notes, callouts etc. If the counter includes those divs in the count, it may place the ads inside the shortcode design, which ruins the look and feel.
Sample Paragraph Input
<p>At the time of creating any blog or news based websites most webmasters gives the least amount of importance to the commenting system of their website, without even understanding the importance of it. Eventually comment section of a website is the only place where people interact with the author when they are exited or happy with the article and helps to grow the whole website community. In most cases they end up using some third party commenting system like Disqus or Spot.im etc. without even realizing what a blunder they are making. I’ve seen many websites (both big & popular as well as small websites) using Disqus commenting system, without even realizing the consequences. And by the time you will realize it, your site would have become so big & popular they you can’t take the risk of changing your commenting system. If you are thinking why, keep reading.</p>
<p><img src="I want to omit this p from counting"></p>
<p>As creating websites has become very easy now-a-days many non-techy people can make a websites too, but they don’t get the insights of an experienced personal. Before writing this article I’ve used disqus for months to research it thoroughly and at the same time I’ve also tried Spot.im (a new player in this arena) but in both cases I’ve come up with the same conclusion. Never ever use these third party commenting system on your website. Here are the 7 facts about Disqus and similar commenting system for which I will suggest you to stay away from them.</p>
What do you want from us?
I need your help, guys. It would be really helpful if someone could provide a rewritten version of the prefix_insert_after_paragraph() and check_paragraph_count_blog() functions that counts and checks p tags while skipping the cases I've described above.
Thank you in advance, looking forward to your help.
Some Update About the Answer Posted Below
The answer posted below works fine, but please note that it can only be used once. For example, if you want to push 3 ads into your blog post and therefore create 3 functions like ad_10(), ad_20() and ad_30(), the code below will only work in one of them. If you put it in more than one function within your WordPress functions.php, you might get blank content. Something to keep in mind.

Using DOMDocument, and not regexes, you can easily handle the job. The idea is to select all p tags that are not within those specific elements, or in other words all p tags that don't have such an ancestor.
It's all done by an XPath query:
//p[
not(
ancestor::div[contains(@class, 'callout') or contains(@class, 'callin')]
or ancestor::pre
or ancestor::code
or a/img
)
]
This is a negated query: it looks for all p elements that are not inside a div with a callout or callin class (you may add more classes following the same syntax) and not inside a pre or code element (note: any pre or code element, whatever its class). The a/img test, added as per the comments, also skips paragraphs that contain a linked image.
By the way, you don't need any other functions; everything is done in ad_20().
Regexes are not a tool made for this kind of complex situation (HTML parsing). I'm not saying you can't parse HTML with them. You can, but only if you know exactly what you are doing.
Live demo
add_filter('the_content', 'ad_20', 15);
function ad_20($content) {
global $post;
$adCode = '...ad code goes here...';
// Ad code will be added right after 20th paragraph
$paragraphNumber = 20;
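// Convert to HTML entities so DOMDocument does not mangle UTF-8 characters
// (note: the 'HTML-ENTITIES' target is deprecated as of PHP 8.2)
$content = mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8');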
if ($post->post_type == 'post') {
libxml_use_internal_errors(true);
// Initializing a new DOM object
$dom = new DOMDocument;
// Load HTML content
$dom->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
// Initializing a new XPath object
$xpath = new DOMXPath($dom);
// Query all `p` tags that are not inside one of those specific elements
$paragraphs = $xpath->query('//p[not(ancestor::div[contains(@class, \'callout\') or contains(@class, \'callin\')] or ancestor::pre or ancestor::code or a/img)]');
// If we have a number of satisfying paragraphs
if ($paragraphs->length > $paragraphNumber) {
// Loading and importing javascript code
// <span> is important
$script = '<span>.........code.........</span>';
$newDom = new DOMDocument;
$newDom->loadHTML($script, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$node = $newDom->getElementsByTagName('span')->item(0);
$adNode = $dom->importNode($node, true);
// Add our ad node after `$paragraphNumber`th paragraph
$paragraphs->item($paragraphNumber)->parentNode->insertBefore($adNode, $paragraphs->item($paragraphNumber));
}
libxml_use_internal_errors(false);
return $dom->saveHTML();
}
return $content;
}
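If you only need the count of eligible paragraphs (the job of check_paragraph_count_blog() in the question), the same query can be reused on its own. The following is a minimal sketch, not a drop-in replacement: the function name count_eligible_paragraphs() is hypothetical, only the callout class is checked, and the extra img test is an assumption added because the sample input has the image directly inside the p rather than inside a link.

```php
<?php
/**
 * Count <p> elements that are not inside a callout div, <pre> or <code>,
 * and that do not merely wrap an image. Hypothetical helper for illustration.
 */
function count_eligible_paragraphs(string $html): int
{
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // The XML prolog hints UTF-8 to libxml; the wrapper <div> gives the fragment one root.
    $dom->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $query = "//p[not("
        . "ancestor::div[contains(@class, 'callout')]"
        . " or ancestor::pre or ancestor::code"
        . " or img or a/img"   // 'img' is an assumption based on the sample input
        . ")]";

    return $xpath->query($query)->length;
}

$sample = '<p>First</p>'
    . '<p><img src="x.png"></p>'
    . '<div class="callout note"><p>Inside a callout</p></div>'
    . '<p>Last</p>';

echo count_eligible_paragraphs($sample), "\n"; // 2
```

Because this version never writes the document back out, it does not touch the post content at all, which makes it safe to call from several filters.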

PHP Define Function without executing

I am currently developing an advertising module for a custom CMS, and using template tags to allow customers to add adverts into their pages through a WYSIWYG page content editor.
Eg. {=advert_1}
On the frontend, this will be found through a regex and then be converted into a function, which will look to a database to select and display an advert
Template_tags.php
while ($advertRow = $advertResult->fetch_assoc()) {
$advertGroupID = $advertRow['grpID'];
$advert = "advert_";
${$advert . $advertGroupID} = showAdvert($advertGroupID);
}
This means {=advert_1} will be converted to showAdvert(1).
The problem I am having is that the showAdvert function is run for every advert, regardless of whether or not it appears on the page. This adds to the "views" count even though the advert may not be displayed.
What I want is to just define the function without executing it, so that it is only executed when the tag actually appears in the page content.

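Use a function expression to create a closure, and have it return the advert markup so the caller can use it.
${$advert . $advertGroupID} = function() use($advertGroupID) {
return showAdvert($advertGroupID);
};
To call the function, you need to put parentheses after it:
$name = 'advert_1';
echo $$name();
To use it with preg_replace_callback, look the closure up by its variable name (this assumes the closures were defined as global variables; a tag with no matching closure is left untouched):
$pageContent = preg_replace_callback("/\{=([^\{]{1,100}?)\}/", function($match) {
$closure = $GLOBALS[$match[1]] ?? null;
return $closure instanceof Closure ? $closure() : $match[0];
}, $pageContent);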
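An alternative to variable variables is to keep the closures in an array keyed by tag name. The sketch below is self-contained so it can be run as is; showAdvert() here is a stand-in that only records a view, and ViewCounter, buildAdverts() and renderTags() are hypothetical names, not part of the original code.

```php
<?php
// Stand-in for the view tracking done by the real showAdvert().
final class ViewCounter
{
    public static array $views = [];
}

// Stand-in for the real showAdvert(): records a view and returns markup.
function showAdvert(int $groupId): string
{
    ViewCounter::$views[$groupId] = (ViewCounter::$views[$groupId] ?? 0) + 1;
    return '<div class="advert">Advert group ' . $groupId . '</div>';
}

// Define one closure per advert group; nothing is executed yet.
function buildAdverts(array $groupIds): array
{
    $adverts = [];
    foreach ($groupIds as $groupId) {
        $adverts['advert_' . $groupId] = function () use ($groupId) {
            return showAdvert($groupId);
        };
    }
    return $adverts;
}

// Run only the closures whose tag actually appears in the content.
function renderTags(string $pageContent, array $adverts): string
{
    return preg_replace_callback('/\{=([^{}]{1,100})\}/', function (array $match) use ($adverts) {
        $name = $match[1];
        return isset($adverts[$name]) ? $adverts[$name]() : $match[0];
    }, $pageContent);
}

$adverts = buildAdverts([1, 2, 3]);
echo renderTags('Intro {=advert_2} outro', $adverts), "\n";
print_r(ViewCounter::$views); // only group 2 has a view
```

Only the advert whose tag is present in the page gets a view, and unknown tags are left in place instead of causing an error.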

How to test this class or rewrite it to be testable? phpspec

This class is fairly simple, it adds a twitter hashtag to a string if there is room for it. Twitter only allows 140 characters (minus 23 for a url). So the hashtags keep getting added if there is space for one.
I don't think it's 100% working as expected, but that is not relevant to my question which is located below.
class Hashtags {
private $url_character_count = 23;
private $characters_allowed = 140;
public function __construct(Article $article)
{
$this->article = $article;
$this->characters_remaining = $this->characters_allowed - $this->url_character_count;
}
public function createHashtagString()
{
$hashtags = '';
$hashtags .= $this->addEachNodeHashtag();
$hashtags .= $this->addHashtagIfSpace($this->article->topic_hashtag);
$hashtags .= $this->addHashtagIfSpace($this->article->pubissue_hashtag);
$hashtags .= $this->addHashtagIfSpace($this->article->subject_area_hashtag);
$hashtags .= $this->addHashtagIfSpace('#aviation');
return $hashtags;
}
private function addEachNodeHashtag()
{
//Returns a hashtag or calls hashtagString() if it is a comma separated list
}
private function hashtagString()
{
//Explodes a comma separated string of hashtags and calls addHashtagIfSpace()
}
private function addHashtagIfSpace($hashtag_string)
{
if((strlen($hashtag_string) + 1) <= $this->characters_remaining)
{
$this->characters_remaining = $this->characters_remaining - strlen($hashtag_string);
if(empty($hashtag_string))
{
return '';
}
return ' ' . $hashtag_string;
}
}
}
Here is my test. My problem is that it only tests one specific case, where all the fields are filled in and there is enough space to fit them all. Should I just keep making a bunch of these test functions for different cases? I am guessing there will be about 10 of them. I have never done testing before, so I am a bit out of my element and need to be pointed in the right direction.
Thank you
class HashtagsSpec extends ObjectBehavior
{
function it_creates_hashtag_string_with_all_fields_filled_in(Article $article)
{
$this->beConstructedWith($article);
$article->title = 'This is the article title';
$article->url = 'http://website.com/node/XXX';
$article->pubissue_hashtag = '#AHASHTAG';
$article->subject_area_hashtag = '#SUBAREA';
$article->topic_hashtag = '#TOPIC';
$article->node_hashtags = '#Test1,#Test2,#Test3';
$this->createHashtagString()->shouldReturn(' #Test1 #Test2 #Test3 #TOPIC #AHASHTAG #SUBAREA #aviation');
}
}

Step 0
Remove your class and start over by writing specs first.
When you drive the code with specs, you'll often find yourself writing a different (simpler) implementation. It won't take much time, but your class will benefit from a better design and testability.
I often use this practice when I don't know what code should look like. I prototype it first. Once the design starts to clarify, I remove the code and start over by speccing it.
You don't have to remove it for real, make a backup ;)
Step 1
Define your initial test list. This will be a list of behaviours you think you need to cover. It doesn't have to be complete and it will evolve as you go along.
You could start with:
it adds a topic hashtag if there is room for it in the message
it adds a pubissue hashtag if there is room for it in the message after adding a topic hashtag
it adds a subject area hashtag if there is room for it in the message after adding topic and pubissue hashtags
it does not add a topic hashtag if there is no room for it in the message
it does not add a pubissue hashtag if there is no room for it in the message after adding a topic hashtag
it does not add a subject area hashtag if there is no room for it in the message after adding topic and pubissue hashtags
Step 2
Write a first spec. Think of a better name, as Hashtags might not be specific enough. Also consider a better API for your class. I chose to accept Article in a method call rather than passing it via the constructor:
class HashtagsSpec extends ObjectBehavior
{
function it_adds_a_topic_hashtag_if_there_is_room_for_it_in_the_message(Article $article)
{
$article->pubissue_hashtag = '#AHASHTAG';
$article->subject_area_hashtag = '#SUBAREA';
$article->topic_hashtag = '#TOPIC';
$article->node_hashtags = '#Test1,#Test2,#Test3';
$this->createHashtagString($article)->shouldMatch('/#TOPIC/');
}
}
Step 3
Run phpspec and write the simplest code to make specs pass.
class Hashtags
{
public function createHashtagString(Article $article)
{
return $article->topic_hashtag;
}
}
Step 4
Refactor - improve the design of code you wrote in Step 3.
It might be that there's nothing to improve, especially in the first iteration(s).
As you go along, your code will become more generic, while your specs become more specific.
Step 5
Repeat steps 2 to 4 until you're done. Simply pick up the next behaviour you want to cover. It doesn't have to be the next one on your list, just whatever you feel is best to implement next.
During the whole process you'll often discover new behaviours or edge cases you haven't thought about before. Simply add them to your test list so it doesn't distract your flow.
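To give a rough idea of where a few iterations might lead, here is one possible shape in plain PHP. It is only a sketch under assumptions: the class name HashtagString and its create() method are made up, and the character budget is passed in explicitly so that the "no room" cases from the test list are easy to trigger.

```php
<?php
// Hypothetical result of a few spec-driven iterations; not the original class.
final class HashtagString
{
    private int $budget;

    public function __construct(int $charactersAllowed = 140, int $urlCharacterCount = 23)
    {
        $this->budget = $charactersAllowed - $urlCharacterCount;
    }

    /**
     * @param string[] $hashtags candidates in priority order
     */
    public function create(array $hashtags): string
    {
        $remaining = $this->budget;
        $result = '';

        foreach ($hashtags as $hashtag) {
            $cost = strlen($hashtag) + 1; // +1 for the separating space
            if ($hashtag === '' || $cost > $remaining) {
                continue; // empty or does not fit: skip it
            }
            $remaining -= $cost;
            $result .= ' ' . $hashtag;
        }

        return $result;
    }
}

// Plenty of room: every hashtag is added.
echo (new HashtagString())->create(['#TOPIC', '#AHASHTAG']), "\n";   // " #TOPIC #AHASHTAG"
// A budget of 7 characters: only the first hashtag fits.
echo (new HashtagString(30, 23))->create(['#TOPIC', '#AHASHTAG']), "\n"; // " #TOPIC"
```

Each behaviour on the test list then becomes a short spec or assertion with a small budget, instead of one large example that has to fill in every field.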

preg_replace not retrieving correct data

I'm building my own CMS from scratch and adding useful functions to it, but I got stuck on this:
A phrase is loaded from a lang file into an array, in this case $lang['sign']['server'] = 'Sign in with your {{servername}} registered account:';, and then a function must replace {{servername}} with $config['servername'].
What I have so far in my functions class is the following:
public function replaceTags($text)
{
global $config;
return preg_replace("/{{(.*?)}}/" , $config[strtolower("$1")], $text) ;
}
I'm calling this function here: $main->set('ssocial', $FUNC->replaceTags($lang['sign']['social']));, but the result is Sign in with your registered account: instead of Sign in with your "Server Name Goes Here" registered account.
Any ideas about why the preg_replace is not retrieving the value?
Also, when $config["$1"] is inside '' like this '$config["$1"]', the output is Sign in with your $config["servername"] registered account:, so I have no clue about what's wrong.
Thanks in advance.

This is a quick and dirty working example using preg_replace_callback
<?php
$config = array('server' => 'my custom text');
function handler($matches){
global $config;
return $config[$matches[1]];
}
function replaceTags($text)
{
return preg_replace_callback("/{{(.*?)}}/" , 'handler', $text) ;
}
print replaceTags("Hello {{server}}");
Output:
Hello my custom text
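As for why your code doesn't work: the second argument of preg_replace is $config[strtolower("$1")], and PHP evaluates that expression before preg_replace is even called. So PHP literally looks for the key "$1" in $config, which probably doesn't exist, and the replacement string ends up empty.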
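A slightly tidier variant of the same idea is sketched below. It passes the config in as a parameter instead of using a global, lowercases the tag name as the question's code tried to do, and leaves unknown tags untouched. The config value used here is made up for the example.

```php
<?php
function replaceTags(string $text, array $config): string
{
    return preg_replace_callback('/\{\{(.*?)\}\}/', function (array $matches) use ($config) {
        // $matches[1] is the text between the braces, e.g. "servername"
        $key = strtolower($matches[1]);
        // Unknown tags are returned unchanged instead of being replaced by nothing
        return array_key_exists($key, $config) ? (string) $config[$key] : $matches[0];
    }, $text);
}

$config = ['servername' => 'Example Server']; // example value only
echo replaceTags('Sign in with your {{ServerName}} registered account:', $config), "\n";
// Sign in with your Example Server registered account:
```

The callback runs once per match, at which point the captured name is actually available, which is exactly what the plain preg_replace call could not provide.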

Returning results from a custom MediaWiki hook

I'm hacking an existing MediaWiki extension, ProcessCite, that adds a custom hook to the Cite extension. Since migrating to PHP 5.4 and MW 1.22 (from PHP 5.3 and MW 1.19.2), the extension does not appear to work correctly. The problem is with the custom hook not returning the data it should do.
Here are the relevant parts of the code:
ProcessCite.php
# Declare the hook:
$wgHooks['CiteBeforeStackEntry'][] = 'wfProcessCite';
# the function itself
function wfProcessCite($str, $argv){
# process $argv and $str to create a new version of $str
# $argv remains unchanged, $str is set to new value
...
$str = "new string";
return true;
}
in Cite_body.php
function stack( $str, $key = null, $group, $follow, $call ) {
# add the call to the CiteBeforeStackEntry hook
wfRunHooks( 'CiteBeforeStackEntry', array( &$str, &$call ) );
I've added debugging statements to the beginning and end of wfProcessCite, which show that $str is being altered; however, debug statements before and after wfRunHooks show no change to $str.
Can anyone help?

Got the answer from Andru Vallance on the Mediawiki mailing list:
Prepending the function argument with an ampersand character causes it to be passed in by reference.
That means you are directly manipulating the original variable inside your function, rather than a copy.
function wfProcessCite( &$str, $argv ){
$str = 'new value';
return true;
}
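The difference is easy to reproduce outside MediaWiki. In the sketch below, runHandler() is a simplified stand-in for a hook runner (it is not MediaWiki's wfRunHooks implementation), and both handler names are made up for the demonstration.

```php
<?php
// Handler that receives a copy: the assignment is lost when it returns.
function handlerByValue($str)
{
    $str = 'new value';
    return true;
}

// Handler that receives a reference: the caller's variable is changed.
function handlerByReference(&$str)
{
    $str = 'new value';
    return true;
}

// Simplified stand-in for a hook runner that forwards an argument array.
function runHandler(callable $handler, array $args): bool
{
    return (bool) call_user_func_array($handler, $args);
}

$a = 'original';
runHandler('handlerByValue', [&$a]);
echo $a, "\n"; // original

$b = 'original';
runHandler('handlerByReference', [&$b]);
echo $b, "\n"; // new value
```

Even though the caller passes a reference in the argument array in both cases, only the handler that declares its parameter with an ampersand changes the caller's variable.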
