regular expression in PHP to create wiki-style links - php

I'm developing a site that will use wiki-style links to internal content, e.g. [[Page Name]].
I'm trying to write a regex to achieve this. So far I can turn the text into a link and replace spaces with dashes (our space substitute, rather than underscores), but only for page names of exactly two words.
I could write a separate regex for each likely number of words (say from 10 downwards), but I'm sure there must be a neater way of doing it.
Here's what I have at the moment:
$regex = "#[\[][\[]([^\s\]]*)[\s]([^\s\]]*)[\]][\]]#";
$description = preg_replace($regex,"$1 $2",$description);
If someone can advise me how I can modify this regex so it works for any number of words that would be really helpful.

You can use the preg_replace_callback() function which accepts a callback to process the replacement string. You can also use lazy quantifiers in the pattern instead of a lot of negations inside character classes.
preg_replace_callback() passes each match to the callback function, which returns the replacement text for that match.
$str = '[[Page Name with many words]]';
echo preg_replace_callback('/\[\[(.*?)\]\]/', 'parse_tags', $str);
function parse_tags($match) {
$text = $match[1];
$slug = preg_replace('/\s+/', '-', $text);
return '<a href="' . $slug . '">' . $text . '</a>';
}

You should use a callback function to do the replacement (using preg_replace_callback):
$str = preg_replace_callback('/\[\[([^\]]+)\]\]/', function($matches) {
return '<a href="' . preg_replace('/\s+/', '-', $matches[1]) . '">' . $matches[1] . '</a>';
}, $str);
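Putting the two answers together, here is a minimal sketch of the whole conversion. The function name wiki_links() and the use of htmlspecialchars() for escaping are assumptions for illustration, not part of either answer.

```php
<?php
// Hypothetical helper: turns [[Page Name]] into a dash-separated link.
function wiki_links(string $text): string
{
    return preg_replace_callback(
        '/\[\[([^\]]+)\]\]/',
        function (array $m): string {
            $title = trim($m[1]);
            // Collapse each run of whitespace into a single dash.
            $slug = preg_replace('/\s+/', '-', $title);
            // Escape both values before placing them in HTML.
            return '<a href="' . htmlspecialchars($slug, ENT_QUOTES) . '">'
                . htmlspecialchars($title, ENT_QUOTES) . '</a>';
        },
        $text
    );
}

echo wiki_links('See [[Page Name with many words]] and [[Home]].');
```

Because the slug is built inside the callback, the same code handles page names with any number of words.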

Related

Replace words in a string including plural variations with apostrophes

I want to link matches for specific words in a sentence. Overall this is easy, and sample code could go like this:
$words = array("Facebook", "Apple");
$text = "Is Facebook's vr hardware better than Apple's current prototype?";
foreach($words as $w) {
$pattern = '/' . $w .'\b/i';
$link = '' . $w . '';
$text = preg_replace($pattern, $link, $text);
}
print $text;
However I would like to catch variations of words that have 's (apostrophe-s).
To do that I need to search for the two possible variations (with and without the 's), but the outcome also affects what text is used in the replacement.
I'm drawing a blank on how to proactively use preg_match and then alter preg_replace based on the outcome. Any advice appreciated.
Try using the optional ? quantifier and parentheses.
$pattern = '/' . $w .'(\'s)?\b/i';
This should match either version.
Now, to use the match in your replacement, you can add an extra set of parentheses, like this:
$pattern = '/(' . $w .'(\'s)?)\b/i';
Then insert the matched string into your replacement, like this:
$link = '$1';
The $1 in the replacement string will be replaced with whatever the outer parentheses of the match contain.
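As a rough sketch of the same idea in a single pass, the words can be joined into one alternation and handled by a callback. The function name link_words() and the /tag/ URL scheme are assumptions made up for this example.

```php
<?php
// Hypothetical helper; the /tag/ URL scheme is an assumption for illustration.
function link_words(string $text, array $words): string
{
    // Escape each word so regex metacharacters in it are matched literally.
    $alternatives = implode('|', array_map(
        fn(string $w): string => preg_quote($w, '/'),
        $words
    ));
    // Group 1 is the bare word, group 2 the optional 's suffix.
    $pattern = '/\b(' . $alternatives . ')(\'s)?\b/i';

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            $href = '/tag/' . rawurlencode(strtolower($m[1]));
            // $m[0] is the full match, with or without the 's.
            return '<a href="' . $href . '">' . $m[0] . '</a>';
        },
        $text
    );
}

echo link_words(
    "Is Facebook's vr hardware better than Apple's current prototype?",
    ['Facebook', 'Apple']
);
```

Matching all words in one pass also avoids a later word being matched inside the markup inserted for an earlier one, which a foreach loop over preg_replace can do.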

Using the Preg_Replace_Callback Function

Okay, I think the reason the search bar on this page is broken is that PHP was updated and preg_replace is deprecated. https://sparklewash.com/
I tried replacing the preg_replace function with preg_replace_callback like so, but I'm still getting some issues.
Original:
function clean($string) {
$string = str_replace(' ', '-', $string); // Replaces all spaces with hyphens.
return preg_replace('/[^A-Za-z0-9\-]/', '', $string); // Removes special chars.
}
New Version:
function clean($string) {
$string = str_replace(' ', '-', $string); // Replaces all spaces with hyphens.
return preg_replace_callback('/(^|_)([a-z])/',
create_function ('$matches', 'return strtoupper($matches[2]);'), $string); // Removes special chars.
}
I apologize if this is easy for you, I was trying to follow an article on here but I'm still relatively new to PHP.
Edit: Based on some of the comments, I believe preg_replace isn't what broke it. I've made a new question here to stay on topic: Redirect Loop on $_GET Request
I would not recommend create_function(): it was deprecated in PHP 7.2 and removed in PHP 8.0, so it is the most likely cause of the error. Try an anonymous function instead:
$result = preg_replace_callback('/(^|_)([a-z])/', function($matches){
return strtoupper($matches[0]);
/*
$matches[0] is the complete match of your regular expression
$matches[1] is the match of the 1st round brackets () similarly for $matches[2]...and so on.
*/
}, $string);
//Also $result will contain the resultant string
You just have to accept $matches as the parameter of your callback. You can also declare the callback separately, as a standalone function:
function make_upper($matches){
return strtoupper($matches[0]);
}
$result = preg_replace_callback('/(^|_)([a-z])/','make_upper' , $string);
Hope my solution works for you, Thanks. :)
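One detail worth checking is which element of $matches the callback returns. The sketch below uses $matches[2], as the question's create_function version did, so the underscore captured in group 1 is dropped; the function name camelize() is hypothetical.

```php
<?php
// Converts snake_case to CamelCase; the function name is hypothetical.
function camelize(string $string): string
{
    return preg_replace_callback(
        '/(^|_)([a-z])/',
        // $m[2] is the letter alone, so the underscore in $m[1] is dropped.
        fn(array $m): string => strtoupper($m[2]),
        $string
    );
}

echo camelize('search_bar_query'); // SearchBarQuery
```

Returning strtoupper($m[0]) instead would keep the underscores and give Search_Bar_Query. Note also that preg_replace() itself is not deprecated (only its /e modifier was removed), so the original clean() function still works on current PHP.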

replacing text with preg_replace_callback and str_replace

Okay, so I'm trying to replace really long quotes in my website's comments section, which uses bbcode. What I'm trying to do is encase long quotes in a collapse I already have coded in js and css.
My problem is that it will do the first quote, then any other quotes vanish. I'm obviously missing something, but this is my first time using callbacks like this.
Here's my php code right now to do this:
$body = preg_replace_callback("/\[quote\](.*?)\[\/quote\]/is",
function($matches)
{
if (strlen($matches[1]) >= '1000')
{
$matches[0] = str_replace($matches[0], '<div class="box"><div class="collapse_container"><div class="collapse_header"><span>Long quote, click to expand</span></div><div class="collapse_content">' . $matches[1] . '</div></div></div>', $matches[0]);
return $matches[0];
}
}, $body);
Some example text:
[quote]aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa[/quote]
[quote]booohoo[/quote]
[quote]new quoting[/quote]
[b]test[/b]
You need to move the return $matches[0] statement outside the if block. Otherwise the callback returns nothing for short quotes, and they are replaced with an empty string:
function($matches)
{
if (strlen($matches[1]) >= 1000) {
// Wrap long quotes in the collapse markup.
$matches[0] = '<div class="box"><div class="collapse_container"><div class="collapse_header"><span>Long quote, click to expand</span></div><div class="collapse_content">' . $matches[1] . '</div></div></div>';
}
// Short quotes are returned unchanged.
return $matches[0];
}
Also, I advise unrolling your lazy matching regex as follows:
'~\[quote\]([^[]*(?:\[(?!/quote\])[^[]*)*)\[/quote\]~i'
In the regex demos, the unrolled pattern needs 30 steps, while your lazy pattern needs 2025 steps.
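A minimal sketch combining both suggestions follows. The function name collapse_long_quotes(), the $limit parameter, and the shortened markup are assumptions for illustration; the pattern is the unrolled one shown above.

```php
<?php
// Hypothetical helper; $limit and the shortened markup are assumptions.
function collapse_long_quotes(string $body, int $limit = 1000): string
{
    // Unrolled pattern: matches up to the nearest [/quote].
    $pattern = '~\[quote\]([^[]*(?:\[(?!/quote\])[^[]*)*)\[/quote\]~i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($limit): string {
            if (strlen($m[1]) >= $limit) {
                return '<div class="collapse_content">' . $m[1] . '</div>';
            }
            // Short quotes are returned unchanged so they do not vanish.
            return $m[0];
        },
        $body
    );
}

// A small limit keeps the demonstration short.
echo collapse_long_quotes('[quote]aaaaaaaaaaaa[/quote][quote]hi[/quote]', 10);
```

Every code path in the callback returns a string, which is what keeps the short quotes in the output.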
PHP's preg_* functions are greedy by default: a plain .* matches the longest possible string, which here would run from the first [quote] to the very last [/quote]. To turn this behavior off, use either a lazy quantifier (.*?), as your pattern already does, or the "U" modifier. Do not combine the two, because U inverts greediness and makes .*? greedy again:
$body = preg_replace_callback("/\[quote\](.*)\[\/quote\]/isU",...);
For a list of modifiers see http://php.net/manual/en/reference.pcre.pattern.modifiers.php
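How the lazy quantifier and the U modifier interact is easy to check with a short script. This is only a sketch; the sample string is made up for the demonstration.

```php
<?php
$body = '[quote]one[/quote] text [quote]two[/quote]';

// Lazy quantifier: each quote is matched separately.
preg_match_all('/\[quote\](.*?)\[\/quote\]/is', $body, $lazy);

// U modifier with a plain quantifier: same result as the lazy version.
preg_match_all('/\[quote\](.*)\[\/quote\]/isU', $body, $ungreedy);

// Both together: U flips .*? back to greedy, so one match spans everything.
preg_match_all('/\[quote\](.*?)\[\/quote\]/isU', $body, $both);

print_r($lazy[1]);     // one, two
print_r($ungreedy[1]); // one, two
print_r($both[1]);     // one[/quote] text [quote]two
```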

Get rid of multiple white spaces in php or mysql

I have a form which takes user input. Recently, I have come across many inputs containing multiple consecutive white spaces.
Eg.
"My tests are working fine!"
Is there any way I can get rid of these white spaces at PHP level or MySQL level?
Clearly trim doesn't work here.
I was thinking of using a recursive function, but I'm not sure if there's an easier and faster way of doing this.
my code so far is as below:
function noWhiteSpaces($string) {
if (!empty($string)) {
$string = trim($string);
$new_str = str_replace(' ', ' ', $string);
} else {
return false;
}
return $new_str;
}
echo noWhiteSpaces("My tests are working fine here !");
If the input is actual whitespaces and you want to replace them with a single space, you could use a regular expression.
$stripped = preg_replace('/\s+/', ' ', $input);
\s means a 'whitespace' character. + means one or more. So, this replaces every run of one or more whitespace characters in $input with a single space. See the preg_replace() docs, and a nice tutorial on regexes.
If you're not looking to replace real whitespace but rather stuff like &nbsp;, you could do the same, but not with \s. Use this instead:
$stripped = preg_replace('/(&nbsp;)+/', ' ', $input);
Note how the brackets enclose &nbsp;.
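For the form-input case in the question, the whitespace approach can be wrapped in a small function, roughly like this. The function name collapse_spaces() is hypothetical.

```php
<?php
// Hypothetical helper: trims the input and collapses runs of whitespace.
function collapse_spaces(string $input): string
{
    // \s+ covers spaces, tabs and newlines; each run becomes one space.
    return trim(preg_replace('/\s+/', ' ', $input));
}

echo collapse_spaces("  My   tests\t are    working fine!  ");
// My tests are working fine!
```

No recursion is needed, because the + quantifier already consumes a run of any length in one replacement.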

Put URLs from string into array using regex (problem with trailing period)

I am trying to write a function that pulls all URLs from a string and removes a potential trailing period from the end.
function getUrls($string) {
$regex = '/https?\:\/\/[^\" ]+/i';
preg_match_all($regex, $string, $matches);
return ($matches[0]);
}
However, it keeps a trailing period. Given the following input, it returns http://test.com. with the period at the end:
$string = "Hi I am sharing http://test.com.";
$urls = getUrls($string);
This one seems to work (taken from here)
$regex="/(https?:\/\/+[\w\-]+\.[\w\-]+)/i";
In case anyone comes across this, here is what I put together:
$aProtocols = array('http:\/\/', 'https:\/\/', 'ftp:\/\/', 'news:\/\/', 'nntp:\/\/', 'telnet:\/\/', 'irc:\/\/', 'mms:\/\/', 'ed2k:\/\/', 'xmpp:', 'mailto:');
$aSubdomains = array('www'=>'http://', 'ftp'=>'ftp://', 'irc'=>'irc://', 'jabber'=>'xmpp:');
$sRELinks = '/(?:(' . implode('|', $aProtocols) . ')[^\^\[\]{}|\\"\'<>`\s]*[^!#\^()\[\]{}|\\:;"\',.?<>`\s])|(?:(?:(?:(?:[^#:<>(){}`\'"\/\[\]\s]+:)?[^#:<>(){}`\'"\/\[\]\s]+#)?(' . implode('|', array_keys($aSubdomains)) . ')\.(?:[^`~!##$%^&*()_=+\[{\]}\\|;:\'",<.>\/?\s]+\.)+[a-z]{2,6}(?:[\/#?](?:[^\^\[\]{}|\\"\'<>`\s]*[^!#\^()\[\]{}|\\:;"\',.?<>`\s])?)?)|(?:(?:[^#:<>(){}`\'"\/\[\]\s]+#)?((?:(?:(?:(?:[0-1]?[0-9]?[0-9])|(?:2[0-4][0-9])|(?:25[0-5]))(?:\.(?:(?:[0-1]?[0-9]?[0-9])|(?:2[0-4][0-9])|(?:25[0-5]))){3})|(?:[A-Fa-f0-9:]{16,39}))|(?:(?:[^`~!##$%^&*()_=+\[{\]}\\|;:\'",<.>\/?\s]+\.)+[a-z]{2,6}))\/(?:[^\^\[\]{}|\\"\'<>`\s]*[^!#\^()\[\]{}|\\:;"\',.?<>`\s](?:[#?](?:[^\^\[\]{}|\\"\'<>`\s]*[^!#\^()\[\]{}|\\:;"\',.?<>`\s])?)?)?)|(?:[^#:<>(){}`\'"\/\[\]\s]+:[^#:<>(){}`\'"\/\[\]\s]+#((?:(?:(?:(?:[0-1]?[0-9]?[0-9])|(?:2[0-4][0-9])|(?:25[0-5]))(?:\.(?:(?:[0-1]?[0-9]?[0-9])|(?:2[0-4][0-9])|(?:25[0-5]))){3})|(?:[A-Fa-f0-9:]{16,39}))|(?:(?:[^`~!##$%^&*()_=+\[{\]}\\|;:\'",<.>\/?\s]+\.)+[a-z]{2,6}))(?:\/(?:(?:[^\^\[\]{}|\\"\'<>`\s]*[^!#\^()\[\]{}|\\:;"\',.?<>`\s])?)?)?(?:[#?](?:[^\^\[\]{}|\\"\'<>`\s]*[^!#\^()\[\]{}|\\:;"\',.?<>`\s])?)?))|([^#:<>(){}`\'"\/\[\]\s]+#(?:(?:(?:[^`~!##$%^&*()_=+\[{\]}\\|;:\'",<.>\/?\s]+\.)+[a-z]{2,6})|(?:(?:(?:(?:(?:[0-1]?[0-9]?[0-9])|(?:2[0-4][0-9])|(?:25[0-5]))(?:\.(?:(?:[0-1]?[0-9]?[0-9])|(?:2[0-4][0-9])|(?:25[0-5]))){3})|(?:[A-Fa-f0-9:]{16,39}))))(?:[^\^*\[\]{}|\\"<>\/`\s]+[^!#\^()\[\]{}|\\:;"\',.?<>`\s])?)/i';
function getUrls($string) {
global $sRELinks;
preg_match_all($sRELinks, $string, $matches);
return ($matches[0]);
}
From http://yellow5.us/journal/server_side_text_linkification/
Depending on how strict you want to be, consider the "Liberal, Accurate Regex Pattern for Matching URLs" discussed on Daring Fireball. The pattern in full is:
\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))
If you are interested in how it works, Alan Storm has a great explanation.
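If a full URL grammar is more than you need, a simpler option is to match loosely and then strip trailing punctuation afterwards. The following is only a sketch: the function name get_urls() and the set of stripped characters are assumptions, not taken from any of the answers.

```php
<?php
// Hypothetical variant of getUrls(); the stripped character set is an assumption.
function get_urls(string $string): array
{
    preg_match_all('~https?://[^\s"<>]+~i', $string, $matches);

    // Drop punctuation that usually belongs to the sentence, not the URL.
    return array_map(
        fn(string $url): string => rtrim($url, '.,;:!?)'),
        $matches[0]
    );
}

print_r(get_urls('Hi I am sharing http://test.com.'));
```

The trade-off is that a URL which legitimately ends in one of those characters, such as a closing parenthesis, will be shortened as well.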
