REGEX - php preg_replace

How to replace part of a string based on a "." (period) character
only if it appears after/before/between a word(s),
not when it is before/between any number(s).
Example:
This is a text string.
-Should be able to replace the "string." with "string ."
(note the Space between the end of the word and the period)
Example 2
This is another text string. 2.0 times longer.
-Should be able to replace "string." with "string ."
(note the Space between the end of the word and the period)
-Should Not replace "2.0" with "2 . 0"
It should only do the replacement if the "." appears at the end/start of a word.
Yes - I've tried various bits of regex.
But everything I do results in either nothing happening,
or the numbers are fine, but I take the last letter from the word preceding the "."
(thus instead of "string." I end up with "strin g.")
Yes - I've looked through numerous posts here - I have seen Nothing that deals with the desire, nor the "strange" problem of grabbing the char before the ".".

You can use a lookbehind (?<=REXP)
preg_replace("/(?<=[a-z])\./i", "XXX", "2.0 test. abc") // <- "2.0 testXXX abc"
which will only match if the text before matches the corresponding regex (in this case [a-z]). You may use a lookahead (?=REXP) in the same way to test text after the match.
Note: There is also a negative lookbehind (?<!REXP) and a negative lookahead (?!REXP) available, which will reject matches if the REXP does match before or after.
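As a rough sketch of how the lookarounds above could be applied to the question's sample text: the first call uses a positive lookbehind, the second uses negative lookarounds. The variable names and the " ." replacement string are assumptions made for illustration, not part of the answer.

```php
<?php
$input = "This is another text string. 2.0 times longer.";

// Positive lookbehind: match a period only when a letter comes right before it.
$afterLetter = preg_replace('/(?<=[a-z])\./i', ' .', $input);

// Negative lookarounds: match a period only when no digit sits on either side.
$notInNumber = preg_replace('/(?<!\d)\.(?!\d)/', ' .', $input);

echo $afterLetter, "\n";
echo $notInNumber, "\n";
```

For this input both calls leave "2.0" untouched and put a space before the two sentence-ending periods.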

$input = "This is another text string. 2.0 times longer.";
echo preg_replace("/(^\.|\s\.|\.\s|\.$)/", " $1", $input);
http://ideone.com/xJQzQ

I'm not too good with regex, but this is what I would do to accomplish the task with basic PHP. Basically, explode the entire string by its . values, look at each piece to see if its last character is a letter, add a space if it is, then put the pieces back together.
<?php
$string = "This is another text string. 2.0 times longer.";
print("$string<br>");
$string = explode(".", $string);
$stringcount = count($string);
// The last piece is not followed by a period, so it is skipped.
for ($i = 0; $i < $stringcount - 1; $i++) {
    // Add a space only when the piece ends in a letter, so "2.0" stays intact.
    if (ctype_alpha(substr($string[$i], -1))) {
        $string[$i] = $string[$i] . " ";
    }
}
$newstring = implode('.', $string);
print("$newstring<br>");
?>

It should only do the replacement if the "." appears at the end/start of a word.
search: /([a-z](?=\.)|\.(?=[a-z]))/
replace: "$1 "
modifiers: ig (case insensitive, globally)
Test in Perl:
use strict;
use warnings;
my $samp = '
This is another text string. 2.0 times longer.
I may get a string of "this and that.that and this.this 2.34 then that 78.01."
';
$samp =~ s/([a-z](?=\.)|\.(?=[a-z]))/$1 /ig;
print "$samp";
Input:
This is another text string. 2.0 times longer.
I may get a string of "this and that.that and this.this 2.34 then that 78.01."
Output:
This is another text string . 2.0 times longer .
I may get a string of "this and that . that and this . this 2.34 then that 78.01."
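Since the question is about PHP, here is a sketch of the same search and replace with preg_replace. It assumes the Perl pattern carries over unchanged; the variable names are made up for the example.

```php
<?php
$samp = 'This is another text string. 2.0 times longer.';
$samp2 = 'that.that and this.this 2.34 then that 78.01.';

// preg_replace() replaces every match by default, so only the "i" modifier is needed.
$pattern = '/([a-z](?=\.)|\.(?=[a-z]))/i';

$result = preg_replace($pattern, '$1 ', $samp);
$result2 = preg_replace($pattern, '$1 ', $samp2);

echo $result, "\n";
echo $result2, "\n";
```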

Related

get the portion of a string between two positions with php

I have a string like "some words 12345cm some more words"
and I want to extract the 12345cm bit from that string. So I get the position of the first number:
$position_of_first_number = strcspn( "some words 12345cm some more words" , '0123456789' );
Then the position of the first space after $position_of_first_number
$position_of_space_after_numbers = strpos("some words 12345cm some more words", " ", $position_of_first_number);
Then I want to have a function which return the portion of the string between $position_of_first_number and $position_of_space_after_numbers.
How do I do it?
You can use the substr function. Note that it takes a starting position and a length, which you can calculate as the difference between the start and end positions.
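A minimal sketch of that substr approach, using the two positions from the question. The helper name substringBetween is hypothetical, and the fallback for a missing trailing space is an added assumption.

```php
<?php
// Hypothetical helper: returns the part of $s from $start up to (not including) $end.
function substringBetween(string $s, int $start, int $end): string
{
    // substr() takes a start offset and a length, so the length is $end - $start.
    return substr($s, $start, $end - $start);
}

$s = "some words 12345cm some more words";
$start = strcspn($s, '0123456789');   // offset of the first digit
$end = strpos($s, ' ', $start);       // offset of the first space after it
if ($end === false) {
    $end = strlen($s);                // no space found: take the rest of the string
}

$portion = substringBetween($s, $start, $end);
echo $portion, "\n";
```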
Since you are looking for a pattern like blank-digits-letters-blank, I would recommend a regular expression using preg_match:
$s = "some words 12345cm some more words";
preg_match("/\s(?P<result>\d+[^\W\d_]+)\s/", $s, $matches);
echo $matches["result"];
12345cm
Explaining the pattern:
"/.../" delimits the pattern in PHP
\s matches any whitespace character
(?P<name>...) names the enclosed group
\d+ matches 1 or more digits
[^\W\d_]+ matches 1 or more letters, i.e. any character that is not a non-word character, a digit, or an underscore (by default only ASCII letters; Unicode letters need the u modifier; see this answer)

Find and replace string with condition in php

I am a newbie in PHP. I want to replace certain characters in a string. My code is below:
$str="this 'is' a new 'string and i wanna' replace \"in\" \"it here\"";
$find = [
'\'',
'"'
];
$replace = [
['^', '*'],
['#', '#']
];
$result = null;
$odd = true;
for ($i=0; $i < strlen($str); $i++) {
if (in_array($str[$i], $find)) {
$key = array_search($str[$i], $find);
$result .= $odd ? $replace[$key][0] : $replace[$key][1];
$odd = !$odd;
} else {
$result .= $str[$i];
}
}
echo $result;
the output of the above code is:
this ^is* a new ^string and i wanna* replace #in# #it here#.
but I want the output to be:
this ^is* a new 'string and i wanna' replace #in# "it here".
That means the characters should be replaced only when a word has both quotation marks (left and right); this condition applies to both ' and ". A string that has only a left or only a right quotation mark around a word should not be replaced.
Ok, I don't know what all that code is trying to accomplish.
But anyway here is my go at it
$str = "this 'is' a new 'string and i wanna' replace \"in\" \"it here\"";
$str = preg_replace(["/'([^']+)'/",'/"([^"]+)"/'], ["^$1*", "#$1#"], $str, 1);
print_r($str);
You can test it here
Output
this ^is* a new 'string and i wanna' replace #in# "it here"
Using preg_replace and a fairly simple regular expression, we can replace the quotes. The trick here is the fourth parameter of preg_replace, $limit, which is defined like this:
limit The maximum possible replacements for each pattern in each subject string. Defaults to -1 (no limit).
Therefore, setting this to 1 limits it to the first match only. (The fifth parameter, $count, is the one that gets filled with the number of replacements done.) Because it's an array of patterns, each pattern is treated as a separate operation, and thus each is allowed $limit matches, or 1 match/replacement each in this case.
Now whether or not this fits every use case you have I cannot say, but it's the most straightforward way to do it for the example you provided.
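To make the effect of that argument concrete, here is a small sketch that runs the same patterns with and without a limit. In preg_replace's signature the fourth parameter is $limit and the fifth is the by-reference $count; the variable names below are assumptions for the example.

```php
<?php
$str = "this 'is' a new 'string and i wanna' replace \"in\" \"it here\"";
$patterns = ["/'([^']+)'/", '/"([^"]+)"/'];
$replacements = ['^$1*', '#$1#'];

// Limit of 1: each pattern may replace only its first match.
$limited = preg_replace($patterns, $replacements, $str, 1, $limitedCount);

// Default limit of -1: every match of each pattern is replaced.
$unlimited = preg_replace($patterns, $replacements, $str, -1, $unlimitedCount);

echo $limited, " ($limitedCount replacements)\n";
echo $unlimited, " ($unlimitedCount replacements)\n";
```

The limit applies per pattern, which is why the limited call still performs two replacements in total.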
As for the match itself /'([^']+)'/
/ opening and closing "delimiters" for the expression (it's a required thing, although it doesn't have to be /)
' literal match, matches ' one time (the opening quote)
( ... ) capture group (group1) so we can use it in the replacement, as $1
[^']+ character set with a [^ not modifier, match anything not in the set, so anything that is not a ' one or more times, greedy
' literal match, matches ' one time (the ending quote)
The replacement "^$1*"
^ literal, adds this char in
$1 use the contents of the capture group (group1)
* literal, adds the char in
Hope that helps understand how it works.
UPDATE
Ok I think I finally deciphered what you want:
string will be replaced for if any word have left and right quotation. example..'word'..here string will be changed..but 'word...in this case not change or word' also not be changed.
This seems like you are trying to say only "whole" words with no spaces.
So in that case we have to adjust our regular expression like this:
$str = preg_replace(["/'([-\w]+)'/",'/"([-\w]+)"/'], ["^$1*", "#$1#"], $str);
So we removed the $limit argument and we changed what is in the character class to be more strict:
[-\w]+ the \w means the "word" character set, or in other words a-zA-Z0-9_, and the - is a literal hyphen (it should go first in this case so it is not read as a range)
What we are saying with this is to match only strings that start and end with a quote (single|double) and only if the string within them matches the word set plus the hyphen. This does not include the space. This way in the first case, your example, it produces the same result, but if you were to flip it to
//[ORIGINAL] this 'is' a new 'string and i wanna' replace \"in\" \"it here\"
this a new 'string and i wanna' replace 'is' \"it here\" \"in\"
You would get this output
this a new 'string and i wanna' replace ^is* "it here" #in#
Before this change you would have gotten
this a new ^string and i wanna* replace 'is' #it here# "in"
In other words it would have only replaced the first occurrence, now it will replace anything between the quotes if and only if it's a whole word.
As a final note you can be even more strict if you only want alpha characters by changing the character set to this [a-zA-Z]+, then it will match only a to z, upper or lower case. Whereas the example above will match 0 to 9 (or any combination of them) the - hyphen, the _ underline and the previously mentioned alpha sets.
Hope that is what you need.
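As a quick check of the whole-word version, this sketch applies the [-\w]+ patterns to both the original string and the flipped one. The variable names are assumptions for the example.

```php
<?php
$patterns = ["/'([-\w]+)'/", '/"([-\w]+)"/'];
$replacements = ['^$1*', '#$1#'];

$original = "this 'is' a new 'string and i wanna' replace \"in\" \"it here\"";
$flipped = "this a new 'string and i wanna' replace 'is' \"it here\" \"in\"";

// No limit is passed: every quoted single word is replaced, wherever it appears.
$originalResult = preg_replace($patterns, $replacements, $original);
$flippedResult = preg_replace($patterns, $replacements, $flipped);

echo $originalResult, "\n";
echo $flippedResult, "\n";
```

Quoted phrases that contain a space are left alone in both strings because the space is not in the character class.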

PHP regex replace 'minus'

I am trying to remove the 'minus' from a string, but if there are three in sequence I want to keep one.
For example:
today-is---sunny--but-yesterday---it-wasnt
Become:
today is - sunny but yesterday - it wasnt
I was trying to str_replace the - but obviously it is removing all of them.
Basically I want to remove a maximum of 2 in sequence. If there are more, keep it.
Not smart enough to make it into one regex so here's 2:
$string = "today-is---sunny--but-yesterday---it-wasnt";
$string = preg_replace("/\b-{1,2}\b/", " ", $string);
$string = preg_replace("/\b-{3,}\b/", " - ", $string);
Seems to work
I would do this in 2 steps using regex.
First, replace the minus symbol with a space if there are 1 or 2 surrounded by word boundaries.
preg_replace("/(\b(-){1,2}\b)/", " ", $string);
Pattern (regex101):
word boundary | minus sign (1 or 2) | word boundary
Then, replace all instances of 3 or more minus signs with a minus sign surrounded by spaces.
preg_replace("/(\b(-){3,}\b)/", " - ", $string);
Pattern (regex101):
word boundary | minus sign (3 or more) | word boundary
Note: None of the parentheses in my example code patterns are required, but I believe they help readability.
I personally love the way regex101 lays out exactly what is happening in the top right corner of the website with a given pattern, so if you'd like to learn more about how this (or other regex patterns) work, then regex101 is a wonderful resource.
Solution with a callback:
$new = preg_replace_callback(
'/[-]+/',
function ($m) {
return 2 < strlen($m[0])? ' - ' : ' ';
},
'today-is---sunny--but-yesterday---it-wasnt'
);
// today is - sunny but yesterday - it wasnt
Okay, I think this handles the cases mentioned by your updates
$string = "today-is---sunny--but-yesterday---it-wasnt----nothing-----five";
$newstring = preg_replace("/(\-{1,2})(?!\-)/", " ", $string);
$newstring = preg_replace("/(\-+)/", " $1", $newstring);
echo $newstring;
Output is:
today is - sunny but yesterday - it wasnt -- nothing --- five
DEMO
So it matches 1 or 2 dashes that are not followed by a dash and replaces with a space. In the case of more than 2 consecutive dashes, this means it matches only the last 2 in the consecutive string. Then we match a group of 1 or more dashes and precede it with a space.
Do it in three steps:
first replace '/---+/' by '#minus#' (or some other recognisable placeholder)
then replace all /[- ]+/ by ' ' (a single blank)
finally replace all '#minus#' by ' - '
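The three steps could be sketched in PHP like this. It assumes the placeholder '#minus#' never occurs in the input; the variable names are made up for the example.

```php
<?php
$string = "today-is---sunny--but-yesterday---it-wasnt";

// Step 1: protect runs of three or more dashes with a placeholder.
$step1 = preg_replace('/---+/', '#minus#', $string);

// Step 2: collapse the remaining dashes (and spaces) into a single blank.
$step2 = preg_replace('/[- ]+/', ' ', $step1);

// Step 3: turn each placeholder into a dash surrounded by spaces.
$result = str_replace('#minus#', ' - ', $step2);

echo $result, "\n";
```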
Just replace all "--" occurrences, and you should get only "-" after that
Have you tried str_replace("---","-",$string)?. This way if there are three minus in sequence, they will be replaced by only one.

How to get the index of last word with an uppercase letter in PHP

Considering this input string:
"this is a Test String to get the last index of word with an uppercase letter in PHP"
How can I get the position of the last uppercase letter (in this example the position of the first "P" (not the last one "P") of "PHP" word)?
I think this regex works. Give it a try.
https://regex101.com/r/KkJeho/1
$pattern = "/.*\s([A-Z])/";
//$pattern = "/.*\s([A-Z])[A-Z]+/"; pattern to match only all caps word
Edit: to solve what Wiktor wrote in the comments, I think you could str_replace all newlines with spaces in the input string before applying the regex.
That should make the regex treat the input as a single line and still give the correct output.
Not tested though.
To find the position of the letter/word:
$str = "this is a Test String to get the last index of word with an uppercase letter in PHP";
$pattern = "/.*\s([A-Z])(\w+)/";
//$pattern = "/.*\s([A-Z])([A-Z]+)/"; pattern to match only all caps word
preg_match($pattern, $str, $match);
$letter = $match[1];
$word = $match[1] . $match[2];
$position = strrpos($str, $match[1].$match[2]);
echo "Letter to find: " . $letter . "\nWord to find: " . $word . "\nPosition of letter: " . $position;
https://3v4l.org/sJilv
If you also want to consider a non-regex version: You can try splitting the string at the whitespace character, iterating the resulting string array backwards and checking if the current string's first character is an upper case character, something like this (you may want to add index/null checks):
<?php
$str = "this is a Test String to get the last index of word with an uppercase letter in PHP";
$explodeStr = explode(" ", $str);
$i = count($explodeStr) - 1;
$characterCount = 0; // characters (including spaces) to the right of the current word
while ($i >= 0) {
    $word = $explodeStr[$i];
    // ctype_upper() avoids treating digits or punctuation as uppercase
    if ($word !== '' && ctype_upper($word[0])) {
        echo $word . ' at index: ';
        $idx = strlen($str) - strlen($word) - $characterCount;
        echo $idx;
        break;
    }
    $characterCount += strlen($word) + 1; // +1 for whitespace
    $i--;
}
This prints "PHP at index: 80"; 80 is indeed the index of the first P in PHP (counting whitespace).
Andreas' pattern looks pretty solid, but this will find the position faster...
.* \K[A-Z]{2,}
Pattern Demo
Here is the PHP implementation: Demo
$str='this is a Test String to get the last index of word with an uppercase letter in PHP test';
var_export(preg_match('/.* \K[A-Z]{2,}/',$str,$out,PREG_OFFSET_CAPTURE)?$out[0][1]:'fail');
// 80
If you want to see a condensed non-regex method, this will work:
Code: Demo
$str='this is a Test String to get the last index of word with an uppercase letter in PHP test';
$allcaps=array_filter(explode(' ',$str),'ctype_upper');
echo "Position = ",strrpos($str,end($allcaps));
Output:
Position = 80
This assumes that there is an all caps word in the input string. If there is a possibility of no all-caps words, then a conditional would sort it out.
Edit, after re-reading the question, I am unsure what exactly makes PHP the targeted substring -- whether it is because it is all caps, or just the last word to start with a capitalized letter.
If just the last word starting with an uppercase letter then this pattern will do: /.* \K[A-Z]/
If the word needs to be all caps, then it is possible that \b word boundaries may be necessary.
Some more samples and explanation from the OP would be useful.
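For the all-caps case, one possible way to add the \b word boundaries is sketched below. The function name lastAllCapsWordOffset and the second test string are hypothetical, introduced only to show where the boundaries make a difference.

```php
<?php
// Hypothetical helper: offset of the last whole word made only of capitals, or null.
function lastAllCapsWordOffset(string $str): ?int
{
    // \K drops the greedy prefix from the reported match; \b on both sides
    // rejects capitals that are only part of a longer word.
    if (preg_match('/.*\b\K[A-Z]{2,}\b/', $str, $out, PREG_OFFSET_CAPTURE)) {
        return $out[0][1];
    }
    return null;
}

$first = lastAllCapsWordOffset('this is a Test String to get the last index of word with an uppercase letter in PHP test');
$second = lastAllCapsWordOffset('PHP then HELLOworld');

var_export($first);
echo "\n";
var_export($second);
echo "\n";
```

In the second string the pattern without boundaries would stop at HELLO, while the bounded version falls back to the standalone PHP at offset 0.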
Another edit, you can declare a set of characters to exclude and use just two string functions. I am using a-z and a space with rtrim() then finding the right-most space, and adding 1 to it.
$str='this is a Test String to get the last index of word with an uppercase letter in PHP test';
echo strrpos(rtrim($str,'abcdefghijklmnopqrstuvwxyz '),' ')+1;
// 80

Attempting to understand handling regular expressions with php

I am trying to make sense of handling regular expression with php. So far my code is:
PHP code:
$string = "This is a 1 2 3 test.";
$pattern = '/^[a-zA-Z0-9\. ]$/';
$match = preg_match($pattern, $string);
echo "string: " . $string . " regex response: " , $match;
Why is $match always returning 0 when I think it should be returning a 1?
[a-zA-Z0-9\. ] means one character which is alphanumeric or "." or " ". You will want to repeat this pattern:
$pattern = '/^[a-zA-Z0-9. ]+$/';
                           ^
                     "one or more"
Note: you don't need to escape . inside a character class.
Here's what your pattern is saying:
'/: Start the expressions
^: Beginning of the string
[a-zA-Z0-9\. ]: Any one alphanumeric character, period or space (you should actually be using \s for spaces if your intention is to match any whitespace character).
$: End of the string
/': End the expression
So, an example of a string that would yield a match result is:
$string = 'a';
Of other note, if you're actually trying to get the matches from the result, you'll want to use the third parameter of preg_match:
$numResults = preg_match($pattern, $string, $matches);
You need a quantifier on the end of your character class, such as +, which means match 1 or more times.
Ideone.
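Putting the answers together, a short sketch that compares the original pattern with the quantified one on the question's string. The variable names are assumptions for the example.

```php
<?php
$string = "This is a 1 2 3 test.";

// Without a quantifier the class must match the whole string as a single character.
$single = preg_match('/^[a-zA-Z0-9. ]$/', $string);

// With + the class may repeat, so the whole string can match.
$repeated = preg_match('/^[a-zA-Z0-9. ]+$/', $string, $matches);

echo "without +: ", $single, "\n";
echo "with +: ", $repeated, "\n";
echo "matched text: ", $matches[0], "\n";
```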
