K2 Joomla Wrong Urls in item comments - php

K2 is turning text that is not a URL into links in item comments.
1. I created an item in the Joomla admin panel and, as a guest, entered a comment with the following text:
"node.js is a power full js engine. Enven.though this is not a valid url it has been rendered as valid.url anything with xxx.xxx are parsed as urls and even like sub domain syntax iam.not.valid i.e mail.yahoo.com how funny this is"
In the above comment, node.js, even.though, valid.url, xxx.xxx, iam.not.valid and mail.yahoo.com are all rendered as links, but only mail.yahoo.com is a valid URL.
K2 detects URLs with the following snippet in $JHOME/components/com_k2/views/item/view.html.php (lines 159-178):
$comments = $model->getItemComments($item->id, $limitstart, $limit, $commentsPublished);
$pattern = "#\b(https?://)?(([0-9a-zA-Z_!~*'().&=+$%-]+:)?[0-9a-zA-Z_!~*'().&=+$%-]+\#)?(([0-9]{1,3}\.){3}[0-9]{1,3}|([0-9a-zA-Z_!~*'()-]+\.)*([0-9a-zA-Z][0-9a-zA-Z-]{0,61})?[0-9a-zA-Z]\.[a-zA-Z]{2,6})(:[0-9]{1,4})?((/[0-9a-zA-Z_!~*'().;?:\#&=+$,%#-]+)*/?)#";
for ($i = 0; $i < sizeof($comments); $i++) {
    $comments[$i]->commentText = nl2br($comments[$i]->commentText);
    $comments[$i]->commentText = preg_replace($pattern, '<a target="_blank" rel="nofollow" href="\0">\0</a>', $comments[$i]->commentText);
    $comments[$i]->userImage = K2HelperUtilities::getAvatar($comments[$i]->userID, $comments[$i]->commentEmail, $params->get('commenterImgWidth'));
    if ($comments[$i]->userID>0) {
        $comments[$i]->userLink = K2HelperRoute::getUserRoute($comments[$i]->userID);
    }
    else {
        $comments[$i]->userLink = $comments[$i]->commentURL;
    }
    if($reportSpammerFlag && $comments[$i]->userID>0) {
        $comments[$i]->reportUserLink = JRoute::_('index.php?option=com_k2&view=comments&task=reportSpammer&id='.$comments[$i]->userID.'&format=raw');
    }
    else {
        $comments[$i]->reportUserLink = false;
    }
}
Can somebody help me fix the regular expression above? Thanks.

You are going to have this problem any time a user types.in a period with no spaces around it. You could add in some logic to test for valid TLDs, but even that would not be perfect because there are plenty of valid TLDs that would fool the logic, like .it.
If you want to try your hand at fixing the regular expression, the pattern that determines if a string is a URL is here -
$pattern = "#\b(https?://)?(([0-9a-zA-Z_!~*'().&=+$%-]+:)?[0-9a-zA-Z_!~*'().&=+$%-]+\#)?(([0-9]{1,3}\.){3}[0-9]{1,3}|([0-9a-zA-Z_!~*'()-]+\.)*([0-9a-zA-Z][0-9a-zA-Z-]{0,61})?[0-9a-zA-Z]\.[a-zA-Z]{2,6})(:[0-9]{1,4})?((/[0-9a-zA-Z_!~*'().;?:\#&=+$,%#-]+)*/?)#";
Personally, I would just disable links in comments altogether by removing or commenting out this code -
$comments[$i]->commentText = preg_replace($pattern, '<a target="_blank" rel="nofollow" href="\0">\0</a>', $comments[$i]->commentText);
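If disabling links entirely is too drastic, a middle ground is to link only text that starts with http://, https:// or www. The following is a minimal sketch of that idea, not K2's own code: the function name linkify_strict and its pattern are assumptions made for illustration, and it would deliberately leave bare hostnames such as mail.yahoo.com unlinked.

```php
<?php
// Hypothetical helper (not part of K2): link only text that starts with
// http://, https:// or www., and leave everything else untouched.
function linkify_strict($text)
{
    // The final character class keeps trailing punctuation out of the link.
    $pattern = '#\b(?:https?://|www\.)[^\s<>"]+[^\s<>".,;:!?)]#i';

    return preg_replace_callback($pattern, function ($m) {
        $url = $m[0];
        // Add a scheme for the "www." form so the href is absolute.
        $href = (stripos($url, 'http') === 0) ? $url : 'http://' . $url;

        return '<a target="_blank" rel="nofollow" href="'
            . htmlspecialchars($href, ENT_QUOTES) . '">'
            . htmlspecialchars($url, ENT_QUOTES) . '</a>';
    }, $text);
}
```

In view.html.php it could stand in for the preg_replace call, for example $comments[$i]->commentText = linkify_strict($comments[$i]->commentText); whether the htmlspecialchars calls are appropriate depends on how the comment text was escaped upstream, which this sketch does not know.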

Related

php preg_replace reuse of subject

I've got a big problem and hopefully you can help me out.
I'd like to code a plugin. The plugin should search a website (for example www.example.com/index.php) for a specific word, for example Number1. Afterwards, the word found should be replaced by a hyperlink. The link should point to an external website, for example www.example2.com/index.php.
The found word should be appended to the hyperlink. So if the current site is www.example.com/index.php and the found word is Number1, the hyperlink should look like this:
www.example2.com/index.php/Number1
So it shouldn't always be the same hyperlink; it should be created dynamically from the word the plugin finds.
Below is my current code.
I hope someone can help me out. Thanks.
public function onContentPrepare($context, &$row, &$params, $page = 0)
{
    $text = $row->text;

    $pattern = array();
    $pattern[0] = '/Number1/';
    $pattern[1] = '/Number2/';
    $pattern[2] = '/Number3/';

    // preg_replace() pairs patterns and replacements by array order, not by key,
    // so both arrays are filled in the same order.
    $subject = array();
    $subject[0] = '(...)';
    $subject[1] = '(...)';
    $subject[2] = 'https://www.example.com/index.php/';

    $row->text = preg_replace($pattern, $subject, $text);
}
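One way to get the matched word into the link is to build the replacement from the match itself. The sketch below is an illustration under assumptions rather than a drop-in plugin: link_keywords is a hypothetical helper, and the base URL is the www.example2.com address from the question.

```php
<?php
// Hypothetical helper: wrap each listed word in a link whose URL ends
// with the word that was actually found.
function link_keywords($text, array $words, $baseUrl)
{
    // Escape each word so it is matched literally, then join into one pattern.
    $quoted = array();
    foreach ($words as $word) {
        $quoted[] = preg_quote($word, '/');
    }
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/';

    return preg_replace_callback($pattern, function ($m) use ($baseUrl) {
        $word = $m[0];
        $href = $baseUrl . rawurlencode($word);

        return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '">'
            . htmlspecialchars($word, ENT_QUOTES) . '</a>';
    }, $text);
}
```

Inside onContentPrepare this could be called as $row->text = link_keywords($row->text, array('Number1', 'Number2', 'Number3'), 'https://www.example2.com/index.php/');. Note that the sketch does not skip words that already sit inside HTML tags or attributes.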

Allow links with preg-replace

I have this so far:
preg_replace("/[^a-zA-Z0-9\/!?\" \' :,.;><_ ]/", "",
html_entity_decode($text, ENT_QUOTES));
It works well as long as the string does not contain links. How do I also allow
<script></script> <iframe> http:// https:// ?
I have done many projects with RegEx in the past; here are a few of my patterns.
Match "every" link on a page:
$links = preg_match_all('#(?:<a\s+.*?href=[\'"]([^\'"]+)[\'"]\s*?.*?>((?:\s*(?!<\s*\/\s*a\s*>).\s*)*)<\s*\/\s*a\s*>)#i',$html,$patterns);
// $patterns[0] (array) will give you the full tag <a href="" ...etc
// $patterns[1] (array) will give you the urls
You should print_r($patterns) to see what the arrays actually look like and decide how you want to use them.
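As a rough illustration of how those arrays can be used, the sketch below wraps the same link pattern in a function and pairs each URL with its link text. The name extract_links and the shape of the returned array are assumptions for this example only.

```php
<?php
// Hypothetical wrapper around the link pattern shown above.
function extract_links($html)
{
    $pattern = '#(?:<a\s+.*?href=[\'"]([^\'"]+)[\'"]\s*?.*?>((?:\s*(?!<\s*\/\s*a\s*>).\s*)*)<\s*\/\s*a\s*>)#i';

    $links = array();
    if (preg_match_all($pattern, $html, $patterns)) {
        // $patterns[1] holds the URLs, $patterns[2] the content between <a> and </a>.
        foreach ($patterns[1] as $i => $url) {
            $links[] = array(
                'url'  => $url,
                'text' => trim(strip_tags($patterns[2][$i])),
            );
        }
    }

    return $links;
}
```

Because the pattern uses . without the s modifier, a link whose tag or text spans several lines would not be found by this sketch.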
To match <script> tags, you can use the following. It actually captures full JavaScript blocks, which may not be exactly what you're asking for, but you can modify the pattern as needed.
preg_match_all("#<\s*script[^>]*[^/]>(.*?)<\s*/\s*script\s*>#i",$html,$scripts);
To match <iframe> you can use this function (matches "every" iframe tag within html)
function html_iframe_tags($str)
{
    $iframes = array();
    $iframeSearch = preg_match_all('#(?:<iframe[^>]*)(?:(?:/>)|(?:>.*?</\s*iframe>))#i', $str, $rawiframes);
    if (count($rawiframes[0])<1) return false;
    for ($i = 0; $i < count($rawiframes[0]); $i++)
    {
        $iframes[$i]['tag'] = $rawiframes[0][$i];
        preg_match_all('/src="([^"]*)"/i',$iframes[$i]['tag'], $iframesrc);
        $iframes[$i]['src'] = (isset($iframesrc[1][0]) ? $iframesrc[1][0] : '');
        preg_match_all('/\swidth="([^"]*)"/i',$iframes[$i]['tag'], $iframewidth);
        $iframes[$i]['width'] = (isset($iframewidth[1][0]) ? $iframewidth[1][0] : '');
        preg_match_all('/\sheight="([^"]*)"/i',$iframes[$i]['tag'], $iframeheight);
        $iframes[$i]['height'] = (isset($iframeheight[1][0]) ? $iframeheight[1][0] : '');
    }
    return $iframes;
}
Then print_r() the results to see how the array looks for your exact usage. The function extracts more than you need (such as width and height), but it also includes the src you are looking for.
Hopefully this stuff can give you direction for your project.
Here is a website that has some reference to regex in html
http://www.the-art-of-web.com/php/parse-links/

Tags in email template editor

I am building a system with an email function, and I want the admin to be able to edit the email messages from the admin panel. I plan to do this with CKEditor, but I also want to support tags that mark where, for example, the user's name will be placed. Something like this:
Dear %%name%%,
Thank you for purchasing %%item%%.
Kind Regards,
%%business%%
Thanks!
A part of your code could look like this:
// text from the editor
$text = "Dear %%name%%,\nThank you for purchasing %%item%%.\nKind Regards,\n%%business%%";
$tokens = array();
$tokens['name'] = 'Mr. Jones';
$tokens['item'] = 'left shoe';
$tokens['business'] = 'WalkThisWay';
// match anything between '%%' that's not a control chr or '%', min 3 and max 16 chrs.
if (preg_match_all('/%%[^[:cntrl:]%]{3,16}%%/',$text,$matches))
{
    // replace all found tokens
    foreach ($matches[0] as $match)
    {
        // get token
        $tokenKey = trim($match,'%');
        // replace known tokens only; unknown ones are left untouched
        if (isset($tokens[$tokenKey]))
        {
            $text = str_replace($match,$tokens[$tokenKey],$text);
        }
    }
}
It seeks out matching tokens and replaces them with the real content. You still need to think about how to get your $tokens array.
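The same idea can also be written as a single pass with preg_replace_callback, so that replacement values are never scanned again for tokens. This is only a sketch: render_template is a hypothetical name, and restricting token names to letters and underscores is an assumption that is narrower than the pattern above.

```php
<?php
// Hypothetical helper: replace %%token%% placeholders in one pass.
// Unknown tokens are left in the text so they are easy to spot.
function render_template($template, array $tokens)
{
    return preg_replace_callback('/%%([a-z_]{3,16})%%/i', function ($m) use ($tokens) {
        $key = $m[1];

        return array_key_exists($key, $tokens) ? (string) $tokens[$key] : $m[0];
    }, $template);
}
```

The $tokens array could then be filled from whatever order or customer record triggers the email, for example render_template($text, array('name' => 'Mr. Jones', 'item' => 'left shoe', 'business' => 'WalkThisWay')).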

Matching a regex

My regex skills are poor. For my js/css to show the correct menu I am adding in a "selected" class to my li so that the js knows which is the current item (so it can display the rest of the drop down).
My problem is matching the correct URI string in CodeIgniter:
<li <? if((strstr($this->uri->uri_string(),"rfid_finder/finder")) || (strstr($this->uri->uri_string(),"finder"))) {?>class="selected"<?}?> rel="home"> <?= anchor('finder','HOME')?></li>
I was using this method initially but my routes are now a bit more complex to allow for searching and pagination.
I need a regex that would match all of the following routes:
$route['finder/(:any)/(:any)/(:num)'] = "rfid_finder/finder/$1/$2/$3";
$route['finder/(:any)/(:any)'] = "rfid_finder/finder/$1/$2";
$route['finder/(:any)'] = "rfid_finder/finder/$1";
$route['finder'] = 'rfid_finder/finder';
but when a user visits:
rfid_finder/search_form
the first menu item should not be given the selected class.
Update
I want the first code snippet to match the routes above and not the rfid_finder/search route; I have a second line of code which matches the rfid_finder/search_form route. My problem is that capturing the route with strstr($this->uri->uri_string(),"finder") matches all my routes, even rfid_finder/search_form.
Hmm, let me know if this works for you. It seems to work in my tests.
preg_match('|^(rfid_finder/)?finder/?([a-z0-9]+?)?(/[a-z0-9]+?)?(/\d+?)?$|i', $string)
Here's my testing example:
$route = array();
$route['finder/(:any)/(:any)/(:num)'] = "rfid_finder/finder/$1/$2/$3";
$route['finder/(:any)/(:any)'] = "rfid_finder/finder/$1/$2";
$route['finder/(:any)'] = "rfid_finder/finder/$1";
$route['finder'] = 'rfid_finder/finder';
$route['finder/2313'] = 'rfid_finder/finder';
$route['finder/asd/dsda'] = 'rfid_finder/finder';
$route['rfid_finder/finder/dd/122'] = 'rfid_finder/finder';
$route['rfid_finder/finder/dd/122/qwewe'] = 'rfid_finder/finder';
$route['rfid_finder/finder/dd/asdsad/333'] = 'rfid_finder/finder';
foreach ($route as $string=>$match) {
    if ( preg_match('|^(rfid_finder/)?finder/?([a-z0-9]+?)?(/[a-z0-9]+?)?(/\d+?)?$|i', $string) ) {
        echo $string.' - yes';
    } else {
        echo $string.' - no';
    }
    echo '<br />';
}
exit();
It checks that the string starts with either finder or rfid_finder/finder (I wasn't really sure which you preferred). If you want it to match only strings that start with rfid_finder/, remove the parentheses around rfid_finder/ and the ? that follows them.
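To use the pattern in a view, it may help to wrap it in a small function. The sketch below reuses the regular expression from this answer unchanged; the function name is_finder_uri is an assumption, and in CodeIgniter the argument would presumably be $this->uri->uri_string().

```php
<?php
// Hypothetical helper: true when the URI string belongs to the finder routes.
function is_finder_uri($uri)
{
    $pattern = '|^(rfid_finder/)?finder/?([a-z0-9]+?)?(/[a-z0-9]+?)?(/\d+?)?$|i';

    return preg_match($pattern, $uri) === 1;
}

// A few sample URI strings and whether they would get the "selected" class.
foreach (array('finder', 'finder/asd/dsda', 'rfid_finder/finder/dd/122', 'rfid_finder/search_form') as $uri) {
    echo $uri . ' - ' . (is_finder_uri($uri) ? 'yes' : 'no') . "\n";
}
```

The first three sample strings match and rfid_finder/search_form does not, which is the distinction the strstr() check could not make.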

Make Automatic Links in a Text from File : Like Internal Wikipedia - PHP

I want help with a script I am making.
I want my website to work like an internal Wikipedia. For example, I have a PHP website on which I publish articles daily.
Suppose I publish two articles, on Jenna Bush and Michael Jackson respectively.
I then save each text and its link in a text file, XML file, or database,
for example:
jenna bush, http://www.domain.com/jenna.html
michael jackson, http://www.domain.com/michael.html
or in any other way required, like
<xml>
<item>
<text>jenna bush</text>
<link>http://www.domain.com/jenna.html</link>
</item>
... etc
</xml>
Now what I want is for the PHP script to automatically convert every occurrence of jenna bush or michael jackson, anywhere on my website, into a link to its respective URL.
Any help is much appreciated.
Assuming that the text containing those words is in the database, the simplest way to achieve something like that is str_replace: http://ie2.php.net/manual/en/function.str-replace.php
Right before the text is submitted to the database, you run a function on it that looks for certain phrases and replaces them with other phrases.
Alternatively, and probably a better approach, is the one that MediaWiki (the software that Wikipedia runs on) uses: every time you want to create a link to another article in a MediaWiki wiki, you put [[ ]] around its title, for example [[Michael Jackson]].
That way you have more control over what becomes a link.
Example: if you had an article on Prince the musician and one on Prince Charles and you wanted to link to Prince Charles, the first method might find Prince first and link to him. With the MediaWiki method you would write [[Prince Charles]] and it would know what to look for.
To do that I'd recommend preg_match: http://www.php.net/manual/en/function.preg-match.php
It may be worth having a look at how MediaWiki does the same thing; you can download it for free and it's written in PHP.
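A minimal sketch of the [[ ]] idea, using preg_replace_callback rather than preg_match so that the replacement happens in one step. This is not MediaWiki's code: render_wiki_links is a hypothetical name, the lowercase title to URL map is an assumed data shape, and the URLs in the usage note are invented for the Prince example.

```php
<?php
// Hypothetical helper: turn [[Title]] markers into links using a
// lowercase-title => URL map. Unknown titles become plain text.
function render_wiki_links($text, array $pages)
{
    return preg_replace_callback('/\[\[([^\[\]]+)\]\]/', function ($m) use ($pages) {
        $title = trim($m[1]);
        $key = strtolower($title);

        if (!isset($pages[$key])) {
            return $title;
        }

        return '<a href="' . htmlspecialchars($pages[$key], ENT_QUOTES) . '">'
            . htmlspecialchars($title, ENT_QUOTES) . '</a>';
    }, $text);
}
```

With a map such as array('prince charles' => 'http://www.domain.com/charles.html', 'prince' => 'http://www.domain.com/prince.html'), the text [[Prince Charles]] links to the first URL and [[Prince]] to the second, because each marker is looked up as a whole.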
I customized it, and here it is for everyone interested:
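function tags_autolink($text)
{
    $text = " $text ";
    // Note: the mysql_* functions were removed in PHP 7; use mysqli or PDO on current versions.
    $query_tags_autolink = "SELECT tag from tags";
    $rs_tags_autolink = mysql_query($query_tags_autolink) or print "error getting tags";
    while($row_tags_autolink = mysql_fetch_array($rs_tags_autolink))
    {
        $tag_name = trim($row_tags_autolink['tag']);
        $tag_url = "http://www.domain.com/tag/".createLink(trim(htmlentities($tag_name)))."/";
        // escape regex metacharacters in the tag, then link each match that is not inside an HTML tag
        $tag_pattern = preg_quote($tag_name, '|');
        $text = preg_replace("|(?!<[^<>]*?)(?<![?./&])\b($tag_pattern)\b(?!:)(?![^<>]*?>)|imsU","<a href=\"$tag_url\">$1</a>" , $text);
    }
    return trim( $text );
}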
The createLink function simply turns a string like "abcd is kk" into "abcd-is-kk" for the tag page URL ;)
cheers !
function auto_href($x)
{
    $z = array();
    $x = explode(' ', $x);
    foreach ($x as $y)
    {
        // wrap every word that starts with http:// in a link
        if (substr($y, 0, 7) == 'http://')
            $y = '<a href="'.$y.'">'.$y.'</a>';
        $z[] = $y;
    }
    return implode(' ', $z);
}
function tags_autolink()
{
    $conn = mysqli_connect("localhost", "root", "", "sample")
        or die ("Could not connect to mysql because ".mysqli_connect_error());
    $text = 'Your paragraph or text here';
    $query_tags_autolink = "SELECT tag from tags";
    $rs_tags_autolink = mysqli_query($conn,$query_tags_autolink) or print "error getting tags";
    while($row_tags_autolink = mysqli_fetch_array($rs_tags_autolink))
    {
        $tag_name = trim($row_tags_autolink['tag']);
        // build the tag URL: spaces become hyphens, everything lowercase
        $trimedurl = str_replace(' ', '-',$tag_name);
        $trimedurl = strtolower($trimedurl);
        $tag_url = "http://yourdomain/tag/$trimedurl";
        // escape regex metacharacters in the tag, then link each match that is not inside an HTML tag
        $tag_pattern = preg_quote($tag_name, '|');
        $text = preg_replace("|(?!<[^<>]*?)(?<![?./&])\b($tag_pattern)\b(?!:)(?![^<>]*?>)|imsU","<a href=\"$tag_url\">$1</a>" , $text);
    }
    return trim($text);
}
echo tags_autolink();
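For a fixed phrase to URL list like the one in the question, a database-free variant can be sketched with strtr(), which tries the longest phrases first and never rescans text it has already replaced. That sidesteps the Prince versus Prince Charles ordering problem mentioned earlier in this thread. The function name autolink_phrases and the charles.html and prince.html URLs used to illustrate it are assumptions.

```php
<?php
// Hypothetical helper: link every listed phrase in one pass.
// $links maps phrase => URL, for example loaded from a text or XML file.
function autolink_phrases($text, array $links)
{
    $map = array();
    foreach ($links as $phrase => $url) {
        $map[$phrase] = '<a href="' . htmlspecialchars($url, ENT_QUOTES) . '">'
            . htmlspecialchars($phrase, ENT_QUOTES) . '</a>';
    }

    // strtr() with an array prefers the longest key and does not
    // search inside text it has already replaced.
    return strtr($text, $map);
}
```

Unlike the regular expression above, this sketch is case-sensitive, has no word-boundary check (Prince would also match inside Princeton), and does not skip text inside existing HTML tags, so it suits plain text input best.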
Wikipedia's automatic hyperlinking code is in mediawiki:Parser.php, methods handleMagicLinks and makeFreeExternalLink.
The first searches for protocols, the latter removes stuff like trailing punctuation.
