I'm still totally lost when it comes to the preg_replace function, so I would be very happy if someone helped me with this one.
I have a string which can contain a function call like Published("today"), and I need to convert it through a regular expression to Published("today", 1).
I basically need to add a second parameter to the function via a regular expression.
I can't use str_replace because the first parameter can be any alphanumeric text.
$string = 'Published("today")';
$foo = preg_replace('/Published\("(\w+)"\)/', 'Published("$1", 1)', $string);
preg_replace_callback should do the job I reckon.
<?php
$string = 'Published("today"); Published("yesterday"); Published("5 days ago");';
$callback = function($match) {
return sprintf('%s, 1', $match[0]);
};
$string = preg_replace_callback(
'~(?<=Published\()"[^"]+"(?=\))~',
$callback,
$string
);
echo $string;
/*
Published("today", 1); Published("yesterday", 1); Published("5 days ago", 1);
*/
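For comparison, here is a minimal sketch of the same replacement with plain preg_replace. It assumes the first argument is always a double-quoted string with no embedded quotes; the variable names are only illustrative.

```php
<?php
// Sketch: append a second argument to every Published("...") call.
// Assumes the first argument never contains a double quote.
$input = 'Published("today"); Published("5 days ago");';

// [^"]* also accepts spaces, which \w+ would reject.
$output = preg_replace('/Published\("([^"]*)"\)/', 'Published("$1", 1)', $input);

echo $output, "\n";
// Published("today", 1); Published("5 days ago", 1);
```

No callback is needed in this case because the replacement is a fixed template around the captured group.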
I have URL strings like this:
$url = 'http://mysite.com/sub1/sub2/%d8%f01';
I'd like to capitalize only the encoded parts of the URL (the %** substrings). In this example that is '%d8%f01', so the final URL would be:
http://mysite.com/sub1/sub2/%D8%F01
I could probably use preg_replace(), but I can't come up with a correct regex.
Any clues? Thanks!
You could use preg_replace_callback to convert matched %** substrings to upper case:
$url = 'http://example.com/sub1/sub2/%d8%f01';
echo preg_replace_callback('/(%..)/', function ($m) { return strtoupper($m[1]); }, $url);
Output:
http://example.com/sub1/sub2/%D8%F01
Note this will also work if not all of the URL is encoded, for example:
$url = 'http://example.com/sub1/sub2/%cf%81abcd%ce%b5';
echo preg_replace_callback('/(%..)/', function ($m) { return strtoupper($m[1]); }, $url);
Output:
http://example.com/sub1/sub2/%CF%81abcd%CE%B5
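The pattern /(%..)/ matches a percent sign followed by any two characters. A slightly stricter sketch, assuming that only well-formed percent-escapes (a % followed by two hex digits) should be touched, could look like this. The sample URL is made up for illustration.

```php
<?php
// Sketch: uppercase only well-formed percent-escapes (% followed by two hex digits).
$url = 'http://example.com/sub1/sub2/%d8%f01?rate=5%zz';

$fixed = preg_replace_callback(
    '/%[0-9a-f]{2}/i',
    function ($m) {
        // $m[0] is the whole escape, for example "%d8".
        return strtoupper($m[0]);
    },
    $url
);

echo $fixed, "\n";
// http://example.com/sub1/sub2/%D8%F01?rate=5%zz
```

The trailing "%zz" is left alone because it is not a valid escape.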
Update
It is also possible to solve this with a straight preg_replace, although the patterns and replacements are quite repetitive as you have to consider all possible hex digits in each position after the %:
$url = 'http://example.com/sub1/sub2/%cf%81abcd%ce%5b';
echo preg_replace(array('/%a/', '/%b/', '/%c/', '/%d/', '/%e/', '/%f/',
'/%(.)a/', '/%(.)b/', '/%(.)c/', '/%(.)d/', '/%(.)e/', '/%(.)f/'),
array('%A', '%B', '%C', '%D', '%E', '%F',
'%$1A', '%$1B', '%$1C', '%$1D', '%$1E', '%$1F'),
$url);
Output:
http://example.com/sub1/sub2/%CF%81abcd%CE%5B
Update 2
Inspired by @Martin, I did some performance testing, and the preg_replace_callback solution typically ran about 25% faster than the preg_replace one (0.0156 seconds vs 0.0220 seconds for 10000 iterations).
I am not at all knowledgeable about PHP, but here's an example (albeit a bodged-together one which could probably be refactored) of how to do it without regex. There's no need for regex with something like this.
$str = "http://example.com/sub1/sub2/%d8%f01";
$expl = explode('%', $str);
foreach ($expl as $i => &$val) {
    // Element 0 is everything before the first '%', so leave it alone.
    if ($i > 0) {
        // Uppercase only the two hex digits that follow the '%'.
        $val = '%' . strtoupper(substr($val, 0, 2)) . substr($val, 2);
    }
}
unset($val); // break the reference left over from the loop
echo join('', $expl);
http://ronaldarichardson.com/2011/09/23/recursive-php-spintax-class-3-0/
I like this script, but it isn't perfect. If you use this test input case:
{This is my {spintax|spuntext} formatted string, my {spintax|spuntext} formatted string, my {spintax|spuntext} formatted string example.}
You can see that the result ALWAYS contains 3 repetitions of either "spintax" or "spuntext". It never contains 1 "spintax" and 2 "spuntext", for example.
Example:
This is my spuntext formatted string, my spuntext formatted string, my spuntext formatted string example.
To be truly random it needs to generate a random iteration for each spintax {|} block and not repeat the same selection for identical blocks, like {spintax|spuntext}.
If you look at comment #7 on that page, fransberns is onto something, however when using his modified code in a live environment, the script would repeatedly run in an infinite loop and eat up all the server memory. So there must be a bug there, but I'm not sure what it is.
Any ideas? Or does anyone know of a robust PHP spintax script that allows for nested spintax and is truly random?
Please check this gist, it is working (and it is far simpler than original code ..).
The reason the Spintax class replaces all instances of {spintax|spuntext} with the same randomly chosen option is because of this line in the class:
$str = str_replace($match[0], $new_str, $str);
The str_replace function replaces all instances of the substring with the replacement in the search string. To replace only the first instance, progressing in a serial fashion as you desired, we need to use the function preg_replace with a passed "count" argument of 1. However, when I looked over your link to the Spintax class and reference to post #7 I noticed an error in his suggested augmentation to the Spintax class.
fransberns suggested replacing:
$str = str_replace($match[0], $new_str, $str);
with this:
//one match at a time
$match_0 = str_replace("|", "\|", $match[0]);
$match_0 = str_replace("{", "\{", $match_0);
$match_0 = str_replace("}", "\}", $match_0);
$reg_exp = "/".$match_0."/";
$str = preg_replace($reg_exp, $new_str, $str, 1);
The problem with fransberns' suggestion is that in his code he did not properly construct the regular expression for the preg_replace function. His error came from not properly escaping the \ character. His replacement code should have looked like this:
//one match at a time
$match_0 = str_replace("|", "\\|", $match[0]);
$match_0 = str_replace("{", "\\{", $match_0);
$match_0 = str_replace("}", "\\}", $match_0);
$reg_exp = "/".$match_0."/";
$str = preg_replace($reg_exp, $new_str, $str, 1);
Consider replacing the original class with this augmented version, which uses my correction of fransberns' suggested replacement:
class Spintax {
function spin($str, $test=false)
{
if(!$test){
do {
$str = $this->regex($str);
} while ($this->complete($str));
return $str;
} else {
do {
echo "<b>PROCESS: </b>";var_dump($str = $this->regex($str));echo "<br><br>";
} while ($this->complete($str));
return false;
}
}
function regex($str)
{
preg_match("/{[^{}]+?}/", $str, $match);
// Now spin the first captured string
$attack = explode("|", $match[0]);
$new_str = preg_replace("/[{}]/", "", $attack[rand(0,(count($attack)-1))]);
// $str = str_replace($match[0], $new_str, $str); //this line was replaced
$match_0 = str_replace("|", "\\|", $match[0]);
$match_0 = str_replace("{", "\\{", $match_0);
$match_0 = str_replace("}", "\\}", $match_0);
$reg_exp = "/".$match_0."/";
$str = preg_replace($reg_exp, $new_str, $str, 1);
return $str;
}
function complete($str)
{
$complete = preg_match("/{[^{}]+?}/", $str, $match);
return $complete;
}
}
When I tried using fransberns' suggested replacement "as is", because of the improper escaping of the \ character, I got an infinite loop. I assume that this is where your memory problem came from. After correcting fransberns' suggested replacement with the correct escaping of the \ character I did not enter an infinite loop.
Try the class above with the corrected augmentation and see if it works on your server (I can't see a reason why it shouldn't).
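As an alternative, here is a minimal sketch that avoids building a pattern from the matched text at all. Escaping only |, { and } leaves other regex metacharacters in the matched block unescaped (for example ?, ., ( or /), and a pattern that fails to match leaves the string unchanged, so the loop would never end. The function name spin_text() is hypothetical and not part of the Spintax class.

```php
<?php
// Sketch: resolve spintax from the innermost blocks outwards.
// spin_text() is an illustrative name, not part of the Spintax class above.
function spin_text($text)
{
    // Matches a {...} block that contains no further braces (an innermost block).
    $pattern = '/\{([^{}]*)\}/';

    // Each pass replaces every innermost block. The callback runs once per
    // block, so identical blocks get independent random picks.
    while (preg_match($pattern, $text)) {
        $text = preg_replace_callback($pattern, function ($m) {
            $options = explode('|', $m[1]);
            return $options[mt_rand(0, count($options) - 1)];
        }, $text);
    }

    return $text;
}

echo spin_text('{This is my {spintax|spuntext} string, my {spintax|spuntext} string.}'), "\n";
```

Every pass removes at least one pair of braces, so the loop ends once no complete {...} block is left. Because the callback returns the replacement directly, nothing has to be escaped.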
$string = 'test check one two test3';
$result = mb_eregi_replace ( 'test|test2|test3' , '<$1>' ,$string ,'i');
echo $result;
This should deliver: <test> check one two <test3>
Is it possible to find out that test and test3 were matched, without using another match function?
You can use preg_replace_callback instead:
$string = 'test check one two test3';
$matches = array();
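// Longer alternatives come first: PCRE takes the first alternative that matches.
// $matches is captured by reference so that the callback's additions are visible outside.
$result = preg_replace_callback('/test3|test2|test/i' , function($match) use (&$matches) {
$matches[] = $match;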
return '<'.$match[0].'>';
}, $string);
echo $result;
Here preg_replace_callback will call the passed callback function for each match of the pattern (note that its syntax differs from POSIX). In this case the callback function is an anonymous function that adds the match to the $matches array and returns the substitution string that the matches are to be replaced by.
Another approach would be to use preg_split to split the string at the matched delimiters while also capturing the delimiters:
$parts = preg_split('/(test3|test2|test)/i', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
The result is an array of alternating non-matching and matching parts.
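A short sketch of what that array looks like. It assumes the pattern is wrapped in a capturing group, because PREG_SPLIT_DELIM_CAPTURE only returns delimiters matched by parenthesized parts of the pattern.

```php
<?php
$string = 'test check one two test3';

// The parentheses make the delimiters part of the result.
// Longer alternatives are listed first so "test3" is not cut short to "test".
$parts = preg_split('/(test3|test2|test)/i', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($parts);
// ['', 'test', ' check one two ', 'test3', '']

// With a single capturing group, the matched words sit at the odd indexes.
$found = [];
foreach ($parts as $i => $part) {
    if ($i % 2 === 1) {
        $found[] = $part;
    }
}
print_r($found);
// ['test', 'test3']
```

The empty strings at both ends appear because the input starts and ends with a delimiter.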
As far as I know, eregi is deprecated.
You could do something like this:
<?php
$str = 'test check one two test3';
$to_match = array("test", "test2", "test3");
$rep = array();
foreach($to_match as $val){
$rep[$val] = "<$val>";
}
echo strtr($str, $rep);
?>
This too allows you to easily add more strings to replace.
Hi, the following function checks whether all of the given words are found in a string:
<?php
function searchword($string, $words)
{
$matchFound = count($words); // number of words that must all be found
$tempMatch = 0;
foreach ( $words as $word )
{
// preg_quote() makes sure the word is matched literally
preg_match('/'.preg_quote($word, '/').'/',$string,$matches);
//print_r($matches);
if(!empty($matches))
{
$tempMatch++;
}
}
if($tempMatch==$matchFound)
{
return "found";
}
else
{
return "notFound";
}
}
$string = "test check one two test3";
/*** an array of words to highlight ***/
$words = array('test', 'test3');
$string = searchword($string, $words);
echo $string;
?>
If your string is UTF-8, you could use preg_replace instead:
$string = 'test check one two test3';
$result = preg_replace('/(test|test2|test3)/ui' , '<$1>' ,$string);
echo $result;
Obviously, with this kind of data the result will be suboptimal:
<test> check one two <test>3
You'll need a longer approach than a direct search and replace with regular expressions (certainly if some of your patterns are prefixes of other patterns).
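One possible shape for such a longer approach, as a sketch only: sort the words so that longer ones are tried first, escape them, and collect the matches in a callback. The helper name wrap_words() is hypothetical, and the word list is assumed to be non-empty.

```php
<?php
// Sketch: wrap each listed word in <...>, preferring the longest word at each position.
// wrap_words() is an illustrative helper name; $words is assumed to be non-empty.
function wrap_words($text, array $words)
{
    // Longest first, so "test3" is tried before its prefix "test".
    usort($words, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Escape each word so that it is matched literally.
    $quoted = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);

    $found = [];
    $pattern = '/' . implode('|', $quoted) . '/iu';
    $result = preg_replace_callback($pattern, function ($m) use (&$found) {
        $found[] = $m[0];          // remember what was matched
        return '<' . $m[0] . '>';  // and wrap it
    }, $text);

    return [$result, $found];
}

list($result, $found) = wrap_words('test check one two test3', ['test', 'test2', 'test3']);
echo $result, "\n"; // <test> check one two <test3>
print_r($found);    // ['test', 'test3']
```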
To begin with, the code you want to enhance does not seem to do what it was meant to do (at least not on my computer). You can try something like this:
$string = 'test check one two test3';
$result = mb_eregi_replace('(test|test2|test3)', '<\1>', $string);
echo $result;
I've removed the i flag (which of course makes little sense here, since mb_eregi_replace already ignores case). You would still need to make the expression greedy.
As for the original question, here's a little proof of concept:
function replace($match){
$GLOBALS['matches'][] = $match;
return "<$match>";
}
$string = 'test check one two test3';
$matches = array();
$result = mb_eregi_replace('(test|test2|test3)', 'replace(\'\1\')', $string, 'e');
var_dump($result, $matches);
Please note this code is horrible and potentially insecure. I'd honestly go with the preg_replace_callback() solution proposed by Gumbo.
I'm using a Function to parse UBBC and I want to use a function to find data from a database to replace text (a [user] kind of function). However the code is ignoring the RegExp Variable. Is there any way I can get it to recognise the RegExp variable?
PHP Function:
function parse_ubbc($string){
$string = $string;
$tags = array(
"user" => "#\[user\](.*?)\[/user\]#is"
);
$html = array(
"user" => user_to_display("$1", 0)
);
return preg_replace($tags, $html, $string);
}
My function uses the username of the user to get their display name, 0 denotes that it is the username being used and can be ignored for the sake of this.
Any help would be greatly appreciated.
You can either rewrite your code to use preg_replace_callback, as advised,
or rewrite the regex to use the e flag (deprecated in PHP 5.5 and removed in PHP 7.0):
function parse_ubbc($string){
$string = $string;
$tags = array(
"user" => "#\[user\](.*?)\[/user\]#ise"
);
$html = array(
"user" => 'user_to_display("$1", 0)'
);
return preg_replace($tags, $html, $string);
}
For that to work, PHP must not execute the function in the replacement array immediately. That is why the function call has to be put in single quotes, as 'user_to_display("$1", 0)', so that preg_replace evaluates it later because of the e flag.
A significant gotcha here is that the username must never contain a " double quote, which would let the regex placeholder $1 break out of the evaluated function call and cause havoc. That is why you would have to rewrite the regex itself to use \w+ instead of .*?. Or, again, just use preg_replace_callback for safety.
You need to use preg_replace_callback if you want to source replacements from a database.
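function parse_ubbc($string){
    // Use a closure: a named function declared in here would be redeclared
    // (a fatal error) the second time parse_ubbc() is called.
    return preg_replace_callback('#\[user\](.*?)\[/user\]#is', function ($m) {
        // The callback has to return the replacement text.
        return user_to_display($m[1], 0);
    }, $string);
}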
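To try the callback approach without a database, here is a self-contained sketch. The user_to_display() below is a hypothetical stand-in that looks names up in an array; the real function is assumed to query the database instead.

```php
<?php
// Hypothetical stand-in for the real user_to_display(), which would query the database.
function user_to_display($username, $mode)
{
    $names = ['jdoe' => 'John Doe'];
    // Fall back to the username when no display name is known.
    return isset($names[$username]) ? $names[$username] : $username;
}

function parse_ubbc($string)
{
    return preg_replace_callback(
        '#\[user\](.*?)\[/user\]#is',
        function ($m) {
            // $m[1] is the text between [user] and [/user].
            return user_to_display($m[1], 0);
        },
        $string
    );
}

echo parse_ubbc('Posted by [user]jdoe[/user] and [user]guest[/user]'), "\n";
// Posted by John Doe and guest
```

The matched username reaches user_to_display() as a plain string, so quotes in it cannot change the code that runs.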
You're calling user_to_display() with the string '$1', not the actual found string. Try:
function parse_ubbc($string){
$string = $string;
$tags = array(
"user" => "#\[user\](.*?)\[/user\]#ise"
);
$html = array(
"user" => 'user_to_display("$1", 0)'
);
return preg_replace($tags, $html, $string);
}
The changes are adding 'e' to the end of the regexp string, and putting the function call in quotes.
How can I use the PHP strtoupper function for the first two characters of a string? Or is there another function for that?
So the string 'hello' or 'Hello' must be converted to 'HEllo'.
$txt = strtoupper( substr( $txt, 0, 2 ) ).substr( $txt, 2 );
This works also for strings that are less than 2 characters long.
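A quick sketch of the short-string behaviour, with the one-liner wrapped in a hypothetical helper named upper_first_two():

```php
<?php
// upper_first_two() is an illustrative wrapper around the one-liner above.
function upper_first_two($txt)
{
    return strtoupper(substr($txt, 0, 2)) . substr($txt, 2);
}

foreach (['hello', 'Hello', 'h', ''] as $txt) {
    var_dump(upper_first_two($txt));
}
// string(5) "HEllo"
// string(5) "HEllo"
// string(1) "H"
// string(0) ""
```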
$string = "hello";
// Square brackets are used because the curly-brace offset syntax was removed in PHP 8.0.
$string[0] = strtoupper($string[0]);
$string[1] = strtoupper($string[1]);
var_dump($string);
//output: string(5) "HEllo"
Assuming it's just a single word you need to do:
$ucfirsttwo = strtoupper(substr($word, 0, 2)) . substr($word, 2);
Basically, extract the first two characters and uppercase them, then attach the remaining characters.
If you need to handle multiple words in the string, then it gets a bit uglier.
Oh, and if you're using multi-byte characters, prefix the two functions with mb_ to get a multibyte-aware version.
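For the multi-word case, one possible sketch uses preg_replace_callback. It assumes ASCII text and treats a "word" as a run of \w characters; the sample sentence is made up.

```php
<?php
// Sketch for the multi-word case: uppercase the first two characters of every word.
// Assumes ASCII words; "word" here means a run of \w characters.
$sentence = 'hello big world';

// \b anchors the match at the start of a word; \w{1,2} takes up to two characters.
$result = preg_replace_callback('/\b\w{1,2}/', function ($m) {
    return strtoupper($m[0]);
}, $sentence);

echo $result, "\n"; // HEllo BIg WOrld
```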
$str = substr_replace($str, strtoupper($str[0].$str[1]), 0, 2);
Using preg_replace() with the e pattern modifier could be interesting here:
$str = 'HELLO';
echo preg_replace('/^(\w{1,2})/e', 'strtoupper(\\1)', strtolower($str));
EDIT: It is recommended that you not use this approach. From the PHP manual:
Use of this modifier is discouraged, as it can easily introduce security vulnerabilities:
<?php
$html = $_POST['html'];
// uppercase headings
$html = preg_replace(
'(<h([1-6])>(.*?)</h\1>)e',
'"<h$1>" . strtoupper("$2") . "</h$1>"',
$html
);
The above example code can be easily exploited by passing in a string
such as <h1>{${eval($_GET[php_code])}}</h1>. This gives the attacker
the ability to execute arbitrary PHP code and as such gives him nearly
complete access to your server.
To prevent this kind of remote code execution vulnerability the
preg_replace_callback() function should be used instead:
<?php
$html = $_POST['html'];
// uppercase headings
$html = preg_replace_callback(
'(<h([1-6])>(.*?)</h\1>)',
function ($m) {
return "<h$m[1]>" . strtoupper($m[2]) . "</h$m[1]>";
},
$html
);
As recommended, instead of using the e pattern, consider using preg_replace_callback():
$str = 'HELLO';
echo preg_replace_callback(
'/^(\w{1,2})/'
, function( $m )
{
return strtoupper($m[1]);
}
, strtolower($str)
);
This should work: strtoupper(substr($target, 0, 2)) . substr($target, 2), where $target is your 'hello' or whatever.
ucfirst uppercases only the first character, but offset access on strings works, too:
$str = ucfirst($str);
$str[1] = strtoupper($str[1]);
Remark: This works, but you will get notices on shorter strings if offset 1 is not defined, so it is not that safe; empty strings will even be converted to an array. It is merely here to show some options.