How to Skip strrpos Entry - php

Is it possible to skip a strpos/strrpos position?
$string = "This is a cookie 'cookie'.";
$finder = "cookie";
$replacement = "monster";
if (strrpos($string, $finder) !== false) {
    $string = str_replace($finder, $replacement, $string);
}
I want to skip the 'cookie' and replace the plain cookie so it'll result in "This is a monster 'cookie'."
I don't have qualms with it finding 'cookie' first and then checking it (obviously necessary to determine that it shouldn't be replaced), but I want to make sure that, while 'cookie' is still there, I can use the same function to find the unquoted cookie.
Alternatively, is there a function I haven't found yet (after hours of searching) that gets all indices of a particular word, so I can check them all in a loop without using regex?
It's important that it's the index, not the word itself, as there are other checks that have to be done based on where in the string the word's located.

You can try a regex instead:
$string = "This is a cookie 'cookie'.";
var_dump(preg_replace("/(?<!')(cookie)/", 'monster', $string));
This uses preg_replace instead of str_replace to replace the string.
Edit: You can use preg_match to get the position of the matched regex in the string like:
$string = "This is a cookie 'cookie'.";
$finder = "cookie";
preg_match("/(?<!')(" . preg_quote($finder, '/') . ")/", $string, $matches, PREG_OFFSET_CAPTURE);
var_dump($matches);
Use preg_quote to make sure that preg_match and preg_replace don't treat the contents of $finder as regex syntax. The performance difference between the preg functions and PHP's other string functions is usually small; you can run some benchmarks to see how it varies in your case.
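Since the question asks for all indices rather than just the first, here is a minimal sketch of the same idea with preg_match_all. It assumes the only thing that disqualifies a match is a single quote directly before it; the $positions variable is a name introduced here for illustration.

```php
<?php
$string = "This is a cookie 'cookie'.";
$finder = "cookie";

// Same negative lookbehind as above: skip any match directly preceded by a quote.
$pattern = "/(?<!')" . preg_quote($finder, '/') . "/";
preg_match_all($pattern, $string, $matches, PREG_OFFSET_CAPTURE);

// With PREG_OFFSET_CAPTURE every match is a [text, byte offset] pair,
// so column 1 of $matches[0] is the list of positions.
$positions = array_column($matches[0], 1);
print_r($positions);
```

Each offset can then go through whatever other position-based checks you need.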

The following gives the required replacement as well as the position of the replaced word.
$string = "This is a cookie 'cookie'.";
$finder = "cookie";
$replacement = "monster";
$p = -1; // helps get position of current word
$position = -1; // the position of the word replaced
$arr = explode(' ',$string);
for ($i = 0; $i < count($arr); $i += 1) {
    // Find the position $p of each word and
    // catch $position when a replacement is made
    if ($i == 0) {
        $p = 0;
    } else {
        $w = $arr[$i - 1];
        $p += strlen($w) + 1;
    }
    if ($arr[$i] == $finder) {
        if ($position < 0) {
            $position = $p;
        }
        $arr[$i] = $replacement;
    }
}
$newstring = implode(' ', $arr);
echo $newstring; // gives: This is a monster 'cookie'.
echo '<br/>';
echo $position; // gives 10, the position of replaced element.
The position calculation assumes the sentence contains only single spaces, because a single space is the separator passed to explode and implode. Runs of two or more spaces would require modifications, possibly by first replacing the spaces with a unique character sequence such as #$#, which would then be used as the first argument of explode and implode.
The code could be modified to capture more than one replacement, e.g. by storing each replaced position in an array instead of testing if ($position < 0). This would also require changing the way positions are computed, because $p is affected by the lengths of previous replacements.
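If you want every index without regex and without relying on single spaces, one option is a plain strpos loop that advances the offset. This is only a sketch: findAllPositions is a hypothetical helper name, and the "is it quoted?" check assumes a quoted word is always directly preceded by a single quote.

```php
<?php
// Hypothetical helper: byte offsets of every occurrence of $needle in $haystack.
function findAllPositions(string $haystack, string $needle): array
{
    $positions = [];
    if ($needle === '') {
        return $positions; // an empty needle would never advance the offset
    }
    $offset = 0;
    while (($pos = strpos($haystack, $needle, $offset)) !== false) {
        $positions[] = $pos;
        $offset = $pos + strlen($needle);
    }
    return $positions;
}

$string = "This is a cookie 'cookie'.";
$finder = "cookie";
$replacement = "monster";

$replaced = [];
// Walk from right to left so that earlier offsets stay valid after a replacement.
foreach (array_reverse(findAllPositions($string, $finder)) as $pos) {
    $quoted = $pos > 0 && $string[$pos - 1] === "'";
    if (!$quoted) {
        $string = substr_replace($string, $replacement, $pos, strlen($finder));
        array_unshift($replaced, $pos);
    }
}
echo $string;
```

Because the replacements are applied from the end of the string, the recorded positions refer to the original string and do not need adjusting for the length difference between the two words.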

As a shorthand, you can also do it like this:
Include the preceding word in the str_replace search string:
$string = "This is a cookie 'cookie'.";
echo str_replace('a cookie','a monster',$string);

Related

Replace text and add to trailing numerical characters as comma separated

I have a string as follows...
$myString = "2,4,5,8,9,11,Inventory2,Inventory3,Inventory4,Inventory5";
I want to search for anything with the prefix "Inventory" and replace it with a number that is dynamically generated. As an example, say the number is 24: it will add 24 to 2, making the first matching result 26.
The end result should turn the string to "2,4,5,8,9,11,26,27,28,29"
I know how to search and replace inventory however I am unable to figure out how to add to the trailing number. Thoughts?
$str = "$comma_separated";
$expression = 'Inventory(\*),';
$replace = '24';
$newStr = str_replace("Inventory","24","$comma_separated");
I am using a static number for testing purposes
preg_replace_callback can do it:
$v = 24;
$myString = "2,4,5,8,9,11,Inventory2,Inventory3,Inventory4,Inventory5";
echo preg_replace_callback(
'/Inventory(\d+)/',
function ($m) use ($v) {
return $v + $m[1];
},
$myString
);

preg replace would ignore non-letter characters when detecting words

I have an array of words and a string and want to add a hashtag to the words in the string that they have a match inside the array. I use this loop to find and replace the words:
foreach($testArray as $tag){
$str = preg_replace("~\b".$tag."~i","#\$0",$str);
}
Problem: let's say I have the words "is" and "isolated" in my array. I will get ##isolated in the output. This means that the word "isolated" is found once for "is" and once for "isolated". The pattern ignores the fact that "#isolated" no longer starts with "is"; it starts with "#".
I give an example, BUT this is only an example; I don't want to solve just this one case but every other possibility:
$str = "this is isolated is an example of this and that";
$testArray = array('is','isolated','somethingElse');
Output will be:
this #is ##isolated #is an example of this and that
You may build a regex with an alternation group enclosed with word boundaries on both ends and replace all the matches in one pass:
$str = "this is isolated is an example of this and that";
$testArray = array('is','isolated','somethingElse');
echo preg_replace('~\b(?:' . implode('|', $testArray) . ')\b~i', '#$0', $str);
// => this #is #isolated #is an example of this and that
The regex will look like
~\b(?:is|isolated|somethingElse)\b~i
If you want to make your approach work, you might add a negative lookbehind after \b: preg_replace("~\b(?<!#)".$tag."~i", "#\$0", $str). The lookbehind fails every match that is immediately preceded by #.
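Put together, the loop-based variant might look like the sketch below. The preg_quote call is an addition not present in the question's loop; it only matters if a tag contains regex metacharacters.

```php
<?php
$str = "this is isolated is an example of this and that";
$testArray = array('is', 'isolated', 'somethingElse');

foreach ($testArray as $tag) {
    // (?<!#) rejects any word that an earlier iteration has already tagged.
    $pattern = '~\b(?<!#)' . preg_quote($tag, '~') . '~i';
    $str = preg_replace($pattern, '#$0', $str);
}
echo $str;
// => this #is #isolated #is an example of this and that
```

Note that, as in the question's loop, there is no trailing \b here, so "is" tags "isolated" on the first pass and the lookbehind then stops "isolated" from being tagged a second time.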
Another way to do that is to split your string into words and build an associative array from your original array of words (to avoid the use of in_array):
$str = "this is isolated is an example of this and that";
$testArray = array('is','isolated','somethingElse');
$hash = array_flip(array_map('strtolower', $testArray));
$parts = preg_split('~\b~', $str);
for ($i=1; $i<count($parts); $i+=2) {
$low = strtolower($parts[$i]);
if (isset($hash[$low])) $parts[$i-1] .= '#';
}
$result = implode('', $parts);
echo $result;
This way, your string is processed only once, whatever the number of words in your array.

php regex replace each character with asterisk

I am trying to do something like this:
Mask user names except for the first 3 characters.
Example:
apple -> app**
google -> goo***
abc12345 -> abc*****
I am currently using php like this:
$string = "abcd1234";
$regex = '/(?<=^(.{3}))(.*)$/';
$replacement = '*';
$changed = preg_replace($regex,$replacement,$string);
echo $changed;
and the result is:
abc*
But I want to replace every single character except the first 3, like:
abc*****
How should I do this?
Don't use regex, use substr_replace:
$var = "abcdef";
$charToKeep = 3;
echo strlen($var) > $charToKeep ? substr_replace($var, str_repeat ( '*' , strlen($var) - $charToKeep), $charToKeep) : $var;
Keep in mind that regexes are good for matching patterns in strings, but there are a lot of functions already designed for plain string manipulation.
Will output:
abc***
Try this function. You can specify how many characters should stay visible and which character should be used as the mask:
$string = "abcd1234";
echo hideCharacters($string, 3, "*");
function hideCharacters($string, $visibleCharactersCount, $mask)
{
if(strlen($string) < $visibleCharactersCount)
return $string;
$part = substr($string, 0, $visibleCharactersCount);
return str_pad($part, strlen($string), $mask, STR_PAD_RIGHT);
}
Output:
abc*****
Your regex matches all the characters after the first 3 at once, so the whole run is replaced with a single hard-coded *.
You can use
'~(^.{3}|(?!^)\G)\K.~'
and replace with *.
This regex matches the first 3 characters (with ^.{3}) or the end of the previous successful match, excluding the start of the string (with (?!^)\G), then drops what has been matched so far from the match value (with \K), and finally matches any single character other than a newline (with .).
$re = '~(^.{3}|(?!^)\G)\K.~';
$strs = array("aa","apple", "google", "abc12345", "asdddd");
foreach ($strs as $s) {
$result = preg_replace($re, "*", $s);
echo $result . PHP_EOL;
}
Another possible solution is to concatenate the first three characters with a string of * repeated the correct number of times:
$text = substr($string, 0, 3).str_repeat('*', max(0, strlen($string) - 3));
The max() call is needed to keep str_repeat() from receiving a negative argument, which triggers a warning (a ValueError in PHP 8). This situation happens when the length of $string is less than 3.

match whole word only without regex

Since I can't use preg_match (UTF-8 support is somehow broken; it works locally but breaks in production), I want to find another way to match words against a blacklist. The problem is that I want to search a string for exact matches only, not the first occurrence of the substring.
This is how I do it with preg_match:
preg_match('/\b(badword)\b/', strtolower($string));
Example string:
$string = "This is a string containing badwords and one badword";
I want to match only "badword" (at the end) and not "badwords".
strpos($string, 'badword') matches the first one.
Any ideas?
Assuming you can do some pre-processing, you could replace all punctuation marks with white space, put everything in lowercase, and then either:
Use strpos with something like strpos($string, ' badword ') in a while loop to keep iterating through the entire document;
Split the string at white space and compare each word with your list of bad words.
So if you were trying the first option, it would look something like this (a rough sketch):
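// Pad with a space on both sides so that words at the very start or end are also surrounded by spaces
$document = ' ' . strtolower($string) . ' ';
// Replace punctuation with spaces (extend the list as needed)
$document = str_replace(array('!', '@', '#', '$', '%', '^', '&', '*', '(', ')', ',', '.', '/' /* , ... */), ' ', $document);
$arr_badWords = array('badword' /* , ... */);
$found = array();
foreach ($arr_badWords as $word) {
    $offset = 0;
    while (($badwordIndex = strpos($document, ' ' . $word . ' ', $offset)) !== false) {
        // Because of the leading pad, the index of the space in $document
        // equals the index of the word itself in the original $string
        $found[] = array($word, $badwordIndex);
        $offset = $badwordIndex + 1;
    }
}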
EDIT: As per @jonhopkins' suggestion, adding a white space at the end should cater for the scenario where the wanted word is at the end of the document and is not followed by a punctuation mark.
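The second option from the list above (splitting at white space and comparing each word) could be sketched as follows. The punctuation list and the $badWords blacklist are placeholders chosen for illustration, and strtolower only handles ASCII letters, so non-ASCII text may need a multibyte-aware alternative.

```php
<?php
$string = "This is a string containing badwords and one badword";
$badWords = array('badword'); // hypothetical blacklist

// Lowercase the text, then turn common punctuation into spaces.
$clean = str_replace(
    array('.', ',', '!', '?', ';', ':', '(', ')'),
    ' ',
    strtolower($string)
);

// Split on spaces and drop the empty pieces left behind by repeated spaces.
$words = array_filter(explode(' ', $clean), 'strlen');

// Keep only the words that appear in the blacklist as whole words.
$found = array_values(array_intersect($words, $badWords));
print_r($found);
```

This tells you which bad words occur as whole words, but not where; if you need the positions, the strpos loop is the better fit.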
If you want to mimic the \b word boundary of regex, you can try something like this:
$offset = 0;
$word = 'badword';
$matched = array();
$boundaryChars = array(' ', '-' /* , ... */);
$wordLength = strlen($word);
$stringLength = strlen($string);
while (($pos = strpos($string, $word, $offset)) !== false) {
    $end = $pos + $wordLength; // index of the character right after the match
    // Left boundary: the match starts the string, or the previous char is a boundary char
    $leftBoundary = $pos === 0 || in_array($string[$pos - 1], $boundaryChars);
    // Right boundary: the match ends the string, or the next char is a boundary char
    $rightBoundary = $end === $stringLength || in_array($string[$end], $boundaryChars);
    // If it has both boundaries, we add the index to the matched ones...
    if ($leftBoundary && $rightBoundary) {
        $matched[] = $pos;
    }
    $offset = $end;
}
You can use strrpos() instead of strpos:
strrpos — Find the position of the last occurrence of a substring in a string
$string = "This is a string containing badwords and one badword";
var_dump(strrpos($string, 'badword'));
Output:
45
A simple way to emulate word boundaries with Unicode properties:
preg_match('/(?:^|[^\pL\pN_])(badword)(?:[^\pL\pN_]|$)/u', $string);
In fact, it's much more complicated than that.
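If you also need the position of the word, a variant with lookarounds keeps the neighbouring characters out of the match. This is a sketch that assumes your PCRE build supports the /u modifier and Unicode properties, which may be exactly what is broken on the production server from the question.

```php
<?php
$string = "This is a string containing badwords and one badword";

// (?<!...) and (?!...) assert that no letter, digit or underscore touches the word,
// without making those neighbouring characters part of the match.
$pattern = '/(?<![\pL\pN_])badword(?![\pL\pN_])/u';

if (preg_match($pattern, $string, $m, PREG_OFFSET_CAPTURE)) {
    echo $m[0][1]; // byte offset of the whole-word match
}
```

Here "badwords" is skipped because the s directly after it is a letter, so the reported offset belongs to the final "badword".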

Parse text between 2 words

For sure this has already been asked by someone else, however I've searched here on SO and found nothing https://stackoverflow.com/search?q=php+parse+between+words
I have a string and want to get an array with all the words contained between 2 delimiters (2 words). I am not confident with regex, so I ended up with this solution, but it is not appropriate because I need to get all the words that match those requirements and not only the first one.
$start_limiter = 'First';
$end_limiter = 'Second';
$haystack = $string;
# Step 1. Find the start limiter's position
$start_pos = strpos($haystack,$start_limiter);
if ($start_pos === FALSE)
{
die("Starting limiter ".$start_limiter." not found in ".$haystack);
}
# Step 2. Find the ending limiters position, relative to the start position
$end_pos = strpos($haystack,$end_limiter,$start_pos);
if ($end_pos === FALSE)
{
die("Ending limiter ".$end_limiter." not found in ".$haystack);
}
# Step 3. Extract the string between the starting position and ending position
# Our starting point is the end of the start limiter. The length to extract is
# the position of the end limiter minus that starting point
$start_from = $start_pos + strlen($start_limiter);
$needle = substr($haystack, $start_from, $end_pos - $start_from);
echo "Found $needle";
I thought also about using explode() but I think a regex could be better and faster.
I'm not very familiar with PHP, but it seems to me that you can use something like:
if (preg_match("/(?<=First).*?(?=Second)/s", $haystack, $result))
print_r($result[0]);
(?<=First) looks behind for First but doesn't consume it,
.*? lazily matches everything between First and Second,
(?=Second) looks ahead for Second but doesn't consume it.
The s at the end makes the dot . match newlines, if any.
To get all the text between those delimiters, use preg_match_all and loop over the captured matches:
if (preg_match_all("/(?<=First)(.*?)(?=Second)/s", $haystack, $result)) {
    // $result[1] holds the text captured by group 1 for every match
    foreach ($result[1] as $match) {
        echo $match, PHP_EOL;
    }
}
Not sure that the result will be faster than your code, but you can do it like this with regex:
$pattern = '~(?<=' . preg_quote($start, '~')
. ').+?(?=' . preg_quote($end, '~') . ')~si';
if (preg_match($pattern, $subject, $match))
print_r($match[0]);
I use preg_quote to escape all characters that have a special meaning in a regex (like +*|()[]{}.? and the pattern delimiter ~).
(?<=..) is a lookbehind assertion that checks for a substring before what you want to find.
(?=..) is a lookahead assertion (the same thing, for what comes after).
.+? means any character, one or more times, but as few as possible (the question mark makes the quantifier lazy).
s allows the dot to match newlines (not the default behavior).
i makes the search case-insensitive (you can remove it if you don't need it).
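Since the question asks for every match rather than just the first, the same pattern can be handed to preg_match_all. The sample $subject, $start and $end values below are made up for illustration.

```php
<?php
// Hypothetical sample data; $start and $end are the two delimiter words.
$subject = "First apple Second and First banana Second";
$start = 'First';
$end = 'Second';

$pattern = '~(?<=' . preg_quote($start, '~')
         . ').+?(?=' . preg_quote($end, '~') . ')~si';

// $matches[0] receives one entry per occurrence of "start ... end".
preg_match_all($pattern, $subject, $matches);

// Trim the spaces that surrounded each captured piece.
$between = array_map('trim', $matches[0]);
print_r($between);
```

For this sample the result contains "apple" and "banana", one entry per delimiter pair.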
The function below lets you run the same logic with different parameters, so you don't have to rewrite this bit of code every time. It also uses strpos, which you used. It has been working great for me.
function get_string_between($string, $start, $end){
$string = " ".$string;
$ini = strpos($string,$start);
if ($ini == 0) return "";
$ini += strlen($start);
$len = strpos($string,$end,$ini) - $ini;
return substr($string,$ini,$len);
}
$fullstring = 'This is a long set of words that I am going to use.';
$parsed = get_string_between($fullstring, 'This', "use");
echo $parsed;
Will output:
is a long set of words that I am going to
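To collect every occurrence instead of only the first one, the same strpos idea can be run in a loop. This is a sketch rather than a drop-in replacement: get_all_strings_between is a hypothetical name, and it assumes delimiter pairs do not nest.

```php
<?php
// Hypothetical helper: returns every substring found between $start and $end.
function get_all_strings_between(string $string, string $start, string $end): array
{
    $results = [];
    if ($start === '' || $end === '') {
        return $results; // empty delimiters would never advance the offset
    }
    $offset = 0;
    while (($ini = strpos($string, $start, $offset)) !== false) {
        $ini += strlen($start);               // first character after the start delimiter
        $stop = strpos($string, $end, $ini);  // next end delimiter after that point
        if ($stop === false) {
            break;                            // unmatched start delimiter: stop searching
        }
        $results[] = substr($string, $ini, $stop - $ini);
        $offset = $stop + strlen($end);       // continue after this end delimiter
    }
    return $results;
}

print_r(get_all_strings_between('First apple Second and First banana Second', 'First', 'Second'));
```

The surrounding spaces are kept in each result, so call trim on the entries if you only want the words.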
Here's a simple example for finding everything between the words 'mega' and 'yo' for the string $t.
PHP Example
$t = "I am super mega awesome-sauce, yo!";
$arr = [];
preg_match("/mega\ (.*?)\ yo/ims", $t, $arr);
echo $arr[1];
PHP Output
awesome-sauce,
You can also use two explode statements.
For example, say you want to get "z" in y=mx^z+b. To get z:
$formula="y=mx^z+b";
$z=explode("+",explode("^",$formula)[1])[0];
First I get everything after ^: explode("^",$formula)[1]
Then I get everything before +: explode("+",$previousExplode)[0]
