I have a script that generates content containing certain tokens, and I need to replace each occurrence of a token, with different content resulting from a separate loop.
It's simple to use str_replace to replace all occurrences of the token with the same content, but I need to replace each occurrence with the next result of the loop.
I did see this answer: Search and replace multiple values with multiple/different values in PHP5?
however it is working from pre-defined arrays, which I don't have.
Sample content:
This is an example of %%token%% that might contain multiple instances of a particular
%%token%%, that need to each be replaced with a different piece of %%token%% generated
elsewhere.
I need to replace each occurrence of %%token%% with content generated, for argument's sake, by this simple loop:
for($i=0;$i<3;$i++){
$token = rand(100,10000);
}
So replace each %%token%% with a different random number value $token.
Is this something simple that I'm just not seeing?
Thanks!
I don't think you can do this using any of the search and replace functions, so you'll have to code up the replace yourself.
It looks to me like this problem works well with explode(). So, using the example token generator you provided, the solution looks like this:
$shrapnel = explode('%%token%%', $str);
$newStr = '';
for ($i = 0; $i < count($shrapnel); ++$i) {
// The last piece of the string has no token after it, so we special-case it
if ($i == count($shrapnel) - 1)
$newStr .= $shrapnel[$i];
else
$newStr .= $shrapnel[$i] . rand(100,10000);
}
I know this is an old thread, but I stumbled across it while trying to achieve something similar. If anyone else sees this, I think this is a little nicer:
Create some sample text:
$text="This is an example of %%token%% that might contain multiple instances of a particular
%%token%%, that need to each be replaced with a different piece of %%token%% generated
elsewhere.";
Find the search string with regex:
$new_text = preg_replace_callback("|%%token%%|", "_rand_preg_call", $text);
Define a callback function to change the matches
function _rand_preg_call($matches){
return rand(100,10000);
}
Echo the results:
echo $new_text;
So as a function set:
function _preg_replace_rand($text,$pattern){
return preg_replace_callback("|$pattern|", "_rand_preg_call", $text);
}
function _rand_preg_call($matches){
return rand(100,10000);
}
I had a similar issue where I had a file that I needed to read. It had multiple occurrences of a token, and I needed to replace each occurrence with a different value from an array.
This function will replace each occurrence of the "token"/"needle" found in the "haystack" and will replace it with a value from an indexed array.
function mostr_replace($needle, $haystack, $replacementArray, $needle_position = 0, $offset = 0)
{
$counter = 0;
while (substr_count($haystack, $needle)) {
$needle_position = strpos($haystack, $needle, $offset);
if ($needle_position + strlen($needle) > strlen($haystack)) {
break;
}
$haystack = substr_replace($haystack, $replacementArray[$counter], $needle_position, strlen($needle));
$offset = $needle_position + strlen($needle);
$counter++;
}
return $haystack;
}
By the way, 'mostr_replace' is short for "Multiple Occurrence String Replace".
You can use the following code:
$content = "This is an example of %%token%% that might contain multiple instances of a particular %%token%%, that need to each be replaced with a different piece of %%token%% generated elsewhere.";
while (true)
{
$needle = "%%token%%";
$pos = strpos($content, $needle);
$token = rand(100, 10000);
if ($pos === false)
{
break;
}
else
{
$content = substr($content, 0,
$pos).$token.substr($content, $pos + strlen($token) + 1);
}
}
Related
I've spent my last 4 hours figuring out how to ... I got to ask for your help now.
I'm trying to extract from a text multiple substring match my starting_words_array and ending_words_array.
$str = "Do you see that ? Indeed, I can see that, as well as this." ;
$starting_words_array = array('do','I');
$ending_words_array = array('?',',');
expected output : array ([0] => 'Do you see that ?' [1] => 'I can see that,')
I manage to write a first function that can find the first substring matching one of both arrays items. But i'm not able to find how to loop it in order to get all the substring matching my requirement.
function SearchString($str, $starting_words_array, $ending_words_array ) {
forEach($starting_words_array as $test) {
$pos = strpos($str, $test);
if ($pos===false) continue;
$found = [];
forEach($ending_words_array as $test2) {
$posStart = $pos+strlen($test);
$pos2 = strpos($str, $test2, $posStart);
$found[] = ($pos2!==false) ? $pos2 : INF;
}
$min = min($found);
if ($min !== INF)
return substr($str,$pos,$min-$pos) .$str[$min];
}
return '';
}
Do you guys have any idea about how to achieve such thing ?
I use preg_match for my solution. However, the start and end strings must be escaped with preg_quote. Without that, the solution will be wrong.
function searchString($str, $starting_words_array, $ending_words_array ) {
$resArr = [];
forEach($starting_words_array as $i => $start) {
$end = $ending_words_array[$i] ?? "";
$regEx = '~'.preg_quote($start,"~").".*".preg_quote($end,"~").'~iu';
if(preg_match_all($regEx,$str,$match)){
$resArr[] = $match[0];
}
}
return $resArr;
}
The result is what the questioner expects.
If the expressions can occur more than once, preg_match_all must also be used. The regex must be modify.
function searchString($str, $starting_words_array, $ending_words_array ) {
$resArr = [];
forEach($starting_words_array as $i => $start) {
$end = $ending_words_array[$i] ?? "";
$regEx = '~'.preg_quote($start,"~").".*?".preg_quote($end,"~").'~iu';
if(preg_match_all($regEx,$str,$match)){
$resArr = array_merge($resArr,$match[0]);
}
}
return $resArr;
}
The resut for the second variant:
array (
0 => "Do you see that ?",
1 => "Indeed,",
2 => "I can see that,",
)
I would definitely use regex and preg_match_all(). I won't give you a full working code example here but I will outline the necessary steps.
First, build a regex from your start-end-pairs like that:
$parts = array_map(
function($start, $end) {
return $start . '.+' . $end;
},
$starting_words_array,
$ending_words_array
);
$regex = '/' . join('|', $parts) . '/i';
The /i part means case insensitive search. Some characters like the ? have a special purpose in regex, so you need to extend above function in order to escape it properly.
You can test your final regex here
Then use preg_match_all() to extract your substrings:
preg_match_all($regex, $str, $matches); // $matches is passed by reference, no need to declare it first
print_r($matches);
The exact structure of your $matches array will be slightly different from what you asked for but you will be able to extract your desired data from it
Benni answer is best way to go - but let just point out the problem in your code if you want to fix those:
strpos is not case sensitive and find also part of words so you need to changes your $starting_words_array = array('do','I'); to $starting_words_array = array('Do','I ');
When finding a substring you use return which exit the function so you want find any other substring. In order to fix that you can define $res = []; at the beginning of the function and replace return substr($str,$pos,... with $res[] = substr($str,$pos,... and at the end return the $res var.
You can see example in 3v4l - in that example you get the output you wanted
If one is experienced in PHP, then one knows how to find whole words in a string and their position using a regex and preg_match() or preg_match_all. But, if you're looking instead for a lighter solution, you may be tempted to try with strpos(). The question emerges as to how one can use this function without it detecting substrings contained in other words. For example, how to detect "any" but not those characters occurring in "company"?
Consider a string like the following:
"Will *any* company do *any* job, (are there any)?"
How would one apply strpos() to detect each appearance of "any" in the string? Real life often involves more than merely space delimited words. Unfortunately, this sentence didn't appear with the non-alphabetical characters when I originally posted.
I think you could probably just remove all the whitespace characters you care about (e.g., what about hyphenations?) and test for " word ":
var_dump(firstWordPosition('Will company any do any job, (are there any)?', 'any'));
var_dump(firstWordPosition('Will *any* company do *any* job, (are there any)?', 'any'));
function firstWordPosition($str, $word) {
// There are others, maybe also pass this in or array_merge() for more control.
$nonchars = ["'",'"','.',',','!','?','(',')','^','$','#','\n','\r\n','\t',];
// You could also do a strpos() with an if and another argument passed in.
// Note that we're padding the $str to with spaces to match begin/end.
$pos = stripos(str_replace($nonchars, ' ', " $str "), " $word ");
// Have to account for the for-space on " $str ".
return $pos ? $pos - 1: false;
}
Gives 12 (offset from 0)
https://3v4l.org/qh9Rb
<?php
$subject = "any";
$b = " ";
$delimited = "$b$subject$b";
$replace = array("?","*","(",")",",",".");
$str = "Will *any* company do *any* job, (are there any)?";
echo "\nThe string: \"$str\"";
$temp = str_replace($replace,$b,$str);
while ( ($pos = strpos($temp,$delimited)) !== false )
{
echo "\nThe subject \"$subject\" occurs at position ",($pos + 1);
for ($i=0,$max=$pos + 1 + strlen($subject); $i <= $max; $i++) {
$temp[$i] = $b;
}
}
See demo
The script defines a word boundary as a blank space. If the string has non-alphabetical characters, they are replaced with blank space and the result is stored in $temp. As the loop iterates and detects $subject, each of its characters changes into a space in order to locate the next appearance of the subject. Considering the amount of work involved one may wonder if such effort really pays off compared to using a regex with a preg_ function. That is something that one will have to decide themselves. My purpose was to show how this may be achieved using strpos() without resorting to the oft repeated conventional wisdom of SO which advocates using a regex.
There is an option if you are loathe to create a replacement array of non-alphabetical characters, as follows:
<?php
function getAllWholeWordPos($s,$word){
$b = " ";
$delimited = "$b$word$b";
$retval = false;
for ($i=0, $max = strlen( $s ); $i < $max; $i++) {
if ( !ctype_alpha( $s[$i] ) ){
$s[$i] = $b;
}
}
while ( ( $pos = stripos( $s, $delimited) ) !== false ) {
$retval[] = $pos + 1;
for ( $i=0, $max = $pos + 1 + strlen( $word ); $i <= $max; $i++) {
$s[$i] = $b;
}
}
return $retval;
}
$whole_word = "any";
$str = "Will *$whole_word* company do *$whole_word* job, (are there $whole_word)?";
echo "\nString: \"$str\"";
$result = getAllWholeWordPos( $str, $whole_word );
$times = count( $result );
echo "\n\nThe word \"$whole_word\" occurs $times times:\n";
foreach ($result as $pos) {
echo "\nPosition: ",$pos;
}
See demo
Note, this example with its update improves the code by providing a function which uses a variant of strpos(), namely stripos() which has the added benefit of being case insensitive. Despite the more labor-intensive coding, the performance is speedy; see performance.
Try the following code
<!DOCTYPE html>
<html>
<body>
<?php
echo strpos("I love php, I love php too!","php");
?>
</body>
</html>
Output: 7
We want to censor certain words on our site but each word has different censored output.
For example:
PHP => P*P, javascript => j*vascript
(However not always the second letter.)
So we want a simple "one star" censor system but with keeping the original caps. The datas coming from the database are uncensored so we need the fastest way that possible.
$data="Javascript and php are awesome!";
$word[]="PHP";
$censor[]="H";//the letter we want to replace
$word[]="javascript";
$censor[]="a"//but only once (j*v*script would look wierd)
//Of course if it needed we can use the full censored word in $censor variables
Expected value:
J*vascript and p*p are awesome!
Thanks for all the answers!
You can put your censored words in key-based array, and value of the array should be the position of what char is replaced with * (see $censor array example bellow).
$string = 'JavaSCRIPT and pHp are testing test-ground for TEST ŠĐČĆŽ ŠĐčćŽ!';
$censor = [
'php' => 2,
'javascript' => 2,
'test' => 3,
'šđčćž' => 4,
];
function stringCensorSlow($string, array $censor) {
foreach ($censor as $word => $position) {
while (($pos = mb_stripos($string, $word)) !== false) {
$string =
mb_substr($string, 0, $pos + $position - 1) .
'*' .
mb_substr($string, $pos + $position);
}
}
return $string;
}
function stringCensorFast($string, array $censor) {
$pattern = [];
foreach ($censor as $word => $position) {
$word = '~(' . mb_substr($word, 0, $position - 1) . ')' . mb_substr($word, $position - 1, 1) . '(' . mb_substr($word, $position) . ')~iu';
$pattern[$word] = '$1*$2';
}
return preg_replace(array_keys($pattern), array_values($pattern), $string);
}
Use example :
echo stringCensorSlow($string, $censor);
# J*vaSCRIPT and p*p are te*ting te*t-ground for TE*T ŠĐČ*Ž ŠĐč*Ž!
echo stringCensorFast($string, $censor) . "\n";
# J*vaSCRIPT and p*p are te*ting te*t-ground for TE*T ŠĐČ*Ž ŠĐč*Ž!
Speed test :
foreach (['stringCensorSlow', 'stringCensorFast'] as $func) {
$time = microtime(true);
for ($i = 0; $i < 10000; $i++) {
$func($string, $censor);
}
$time = microtime(true) - $time;
echo "{$func}() took $time\n";
}
output on my localhost was :
stringCensorSlow() took 1.9752140045166
stringCensorFast() took 0.11587309837341
Upgrade #1: added multibyte character safe.
Upgrade #2: added example for preg_replace, which is faster than mb_substr. Tnx to AbsoluteƵERØ
Upgrade #3: added speed test loop and result on my local PC machine.
Make an array of words and replacements. This should be your fastest option in terms of processing, but a little more methodical to setup. Remember when you're setting up your patterns to use the i modifier to make each pattern case insensitive. You could ultimately pull these from a database into the arrays. I've hard-coded the arrays here for the example.
<!DOCTYPE html>
<html>
<meta content="text/html; charset=UTF-8" http-equiv="content-type">
<?php
$word_to_alter = array(
'!(j)a(v)a(script)(s|ing|ed)?!i',
'!(p)h(p)!i',
'!(m)y(sql)!i',
'!(p)(yth)o(n)!i',
'!(r)u(by)!i',
'!(ВЗЛ)О(М)!iu',
);
$alteration = array(
'$1*$2*$3$4',
'$1*$2',
'$1*$2',
'$1$2*$3',
'$1*$2',
'$1*$2',
);
$string = "Welcome to the world of programming. You can learn PHP, MySQL, Python, Ruby, and Javascript all at your own pace. If you know someone who uses javascripting in their daily routine you can ask them about becoming a programmer who writes JavaScripts. взлом прохладно";
$newstring = preg_replace($word_to_alter,$alteration,$string);
echo $newstring;
?>
</html>
Output
Welcome to the world of programming. You can learn P*P, M*SQL, Pyth*n,
R*by, and J*v*script all at your own pace. If you know someone who
uses j*v*scripting in their daily routine you can ask them about
becoming a programmer who writes J*v*Scripts. взл*м прохладно
Update
It works the same with UTF-8 characters, note that you have to specify a u modifier to make the pattern treated as UTF-8.
u (PCRE_UTF8)
This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern strings are treated as UTF-8. This
modifier is available from PHP 4.1.0 or greater on Unix and from PHP
4.2.3 on win32. UTF-8 validity of the pattern is checked since PHP 4.3.5.
Why not just use a little helper function and pass it a word and the desired censor?
function censorWord($word, $censor) {
if(strpos($word, $censor)) {
return preg_replace("/$censor/",'*', $word, 1);
}
}
echo censorWord("Javascript", "a"); // returns J*avascript
echo censorWord("PHP", "H"); // returns P*P
Then you can check the word against your wordlist and if it is a word that should be censored, you can pass it to the function. Then, you also always have the original word as well as the censored one to play with or put back in your sentence.
This would also make it easy to change the number of letters censored by just changing the offset in the preg_replace. All you have to do is keep an array of words, explode the sentence on spaces or something, and then check in_array. If it is in the array, send it to censorWord().
Demo
And here's a more complete example doing exactly what you said in the OP.
function censorWord($word, $censor) {
if(strpos($word, $censor)) {
return preg_replace("/$censor/",'*', $word, 1);
}
}
$word_list = ['php','javascript'];
$data = "Javascript and php are awesome!";
$words = explode(" ", $data);
// pass each word by reference so it can be modified inside our array
foreach($words as &$word) {
if(in_array(strtolower($word), $word_list)) {
// this just passes the second letter of the word
// as the $censor argument
$word = censorWord($word, $word[1]);
}
}
echo implode(" ", $words); // returns J*vascript and p*p are awesome!
Another Demo
You could store a lowercase list of the censored words somewhere, and if you're okay with starring the second letter every time, do something like this:
if (in_array(strtolower($word), $censored_words)) {
$word = substr($word, 0, 1) . "*" . substr($word, 2);
}
If you want to change the first occurrence of a letter, you could do something like:
$censored_words = array('javascript' => 'a', 'php' => 'h', 'ruby' => 'b');
$lword = strtolower($word);
if (in_array($lword, array_keys($censored_words))) {
$ind = strpos($lword, $censored_words[$lword]);
$word = substr($word, 0, $ind) . "*" . substr($word, $ind + 1);
}
This is what I would do:
Create a simple database (text file) and make a "table" of all your censored words and expected censored results. E.G.:
PHP --- P*P
javascript --- j*vascript
HTML --- HT*L
Write PHP code to compare the database information to your simple censored file. You will have to use array explode to create an array of only words. Something like this:
/* Opening database of censored words */
$filename = "/files/censored_words.txt";
$file = fopen( $filename, "r" );
if( $file == false )
{
echo ( "Error in opening file" );
exit();
}
/* Creating an array of words from string*/
$data = explode(" ", $data); // What was "Javascript and PHP are awesome!" has
// become "Javascript", "and", "PHP", "are",
// "awesome!". This is useful.
If your script finds matching words, replace the word in your data with the censored word from your list. You would have to delimit the file first by \r\n and finally by ---. (Or whatever you choose for separating your table with.)
Hope this helped!
Assuming I have a string
$str="0000,1023,1024,1025,1024,1023,1027,1025,1024,1025,0000";
there are three 1024, I want to replace the third with JJJJ, like this :
output :
0000,1023,1024,1025,1024,1023,1027,1025,JJJJ,1025,0000
how to make str_replace can do it
thanks for the help
As your question asks, you want to use str_replace to do this. It's probably not the best option, but here's what you do using that function. Assuming you have no other instances of "JJJJ" throughout the string, you could do this:
$str = "0000,1023,1024,1025,1024,1023,1027,1025,1024,1025,0000";
$str = str_replace('1024','JJJJ',$str,3)
$str = str_replace('JJJJ','1024',$str,2);
Here is what I would do and it should work regardless of values in $str:
function replace_str($str,$search,$replace,$num) {
$pieces = explode(',',$str);
$counter = 0;
foreach($pieces as $key=>$val) {
if($val == $search) {
$counter++;
if($counter == $num) {
$pieces[$key] = $replace;
}
}
}
return implode(',',$pieces);
}
$str="0000,1023,1024,1025,1024,1023,1027,1025,1024,1025,0000";
echo replace_str($str, '1024', 'JJJJ', 3);
I think this is what you are asking in your comment:
function replace_element($str,$search,$replace,$num) {
$num = $num - 1;
$pieces = explode(',',$str);
if($pieces[$num] == $search) {
$pieces[$num] = $replace;
}
return implode(',',$pieces);
}
$str="0000,1023,1024,1025,1024,1023,1027,1025,1024,1025,0000";
echo replace_element($str,'1024','JJJJ',9);
strpos has an offset, detailed here: http://php.net/manual/en/function.strrpos.php
So you want to do the following:
1) strpos with 1024, keep the offset
2) strpos with 1024 starting at offset+1, keep newoffset
3) strpos with 1024 starting at newoffset+1, keep thirdoffset
4) finally, we can use substr to do the replacement - get the string leading up to the third instance of 1024, concatenate it to what you want to replace it with, then get the substr of the rest of the string afterwards and concatenate it to that. http://www.php.net/manual/en/function.substr.php
You can either use strpos() three times to get the position of the third 1024 in your string and then replace it, or you could write a regex to use with preg_replace() that matches the third 1024.
if you want to find the last occurence of your string you can used strrpos
Do it like this:
$newstring = substr_replace($str,'JJJJ', strrpos($str, '1024'), strlen('1024') );
See working demo
Here's a solution with less calls to one and the same function and without having to explode, iterate over the array and implode again.
// replace the first three occurrences
$replaced = str_replace('1024', 'JJJJ', $str, 3);
// now replace the firs two, which you wanted to keep
$final = str_replace('JJJJ', '1024', $replaced, 2);
I'm trying to translate a javascript script in PHP. So far is going good, but I stumbled across some code on which I'm clueless:
while (match = someRegex.exec(text)) {
m = match[0];
if (m === "-") {
var lastIndex = someRegex.lastIndex,
nextToken = someRegex.exec(parts.content);
if (nextToken) {
...
}
someRegex.lastIndex = lastIndex;
}
}
The someRegex variable looks like this:
/[^\\-]+|-|\\(?:[0-3][0-7]{0,2}|[4-7][0-7]?|x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}|c[A-Za-z]|[\S\s]?)/g
exec should be equivalent to preg_match_all in PHP:
preg_match_all($someRegex, $text, $match);
$match = $match[0]; // I get the same results so it works
foreach($match as $m){
if($m === '-'){
// here I don't know how to handle lastIndex and the 2nd exec :(
}
}
I wouldn't use that lastIndex magic at all - essentially you're executing the regex twice on each index. If you really want to do that, you'd need to set the PREG_OFFSET_CAPTURE flag in preg_match_all so that you can get the position, add capture length and use it as the next preg_match offset.
Better use something like this:
preg_match_all($someRegex, $text, $match);
$match = $match[0]; // get all matches (no groups)
$len = count($match);
foreach($match as $i=>$m){
if ($m === '-') {
if ($i+1 < $len) {
$nextToken = $match[$i+1];
…
}
…
}
…
}
Actually, exec is not equivalent to preg_match_all, as exec stops at the first match (the g modifier only sets the lastIndex value to loop through the string). It's equivalent to preg_match.
So you find the first match, get the value thanks to the $array argument, the offset of this value (contained in $flags) and continue your search by setting the offset (last argument).
I guess the second execution won't be a problem as you'll do exactly the same thing as in the javascript version.
Note that I haven't tried the loop, but it should be pretty straightforward once you've figured out how preg_match works exactly with the optional arguments (I'll run some test).
$lastIndex = 0;
while(preg_match($someRegex, $text, $match, PREG_OFFSET_CAPTURE, $lastIndex) {
$m = $match[0][0];
$lastIndex = $match[0][1] + strlen($m); //thanks Bergi for the correction
if($m === '-') {
// assuming the $otherText relate to the parts.content thing
if(preg_match($someRegex, $otherText, $secondMatch, 0, $lastIndex)) {
$nextToken = $secondMatch[0];
...
}
}
}
I guess that should be it (excuse any small error, haven't done php for a while).