Having trouble with preg_split and using regular expressions - php

I'm having a tough time learning regex and preg_split.
I'm trying to apply what i've learned and can't seem to get a simple search going..
I've tried many variations, but can't separate between the bold tags, and only bold tags
<?php
$string = "<b>this is</b> <i>not</b> <b>bold</b>";
$find = '/<b>/'; // works as expected, separating at <b>
$find = '/<b>|<\/b>/'; // works as expected, separating at either <b> or </b>
$find = '/<b>*<\/b>/'; // why doesn't this work?
$find = '/^<b>*<\/b>/'; // why doesn't this work?
$find = '/<b>.<\/b>/'; // why doesn't this work
$result = preg_split($find, $string);
print_r($result);
?>
As you can see, i'm trying to incorporate the . + or start ^/ finish $ characters.
What am I doing so very wrong where it isn't working the way I expected?
Thanks for all your help!
p.s. found this which is very helpful too

The first two "why doesn't this work" are matching <b followed by zero or more > characters, followed by </b>. The last one matches <b> then any single character then </b>.
I'm not sure what you're trying to do exactly, but this would split on start and end bold tags: <\/?b> - it matches <, followed by an optional /, followed by b>.

$find = '/<b>*<\/b>/'; // why doesn't this work?
Matches "<b", zero or more ">", followed by "</b>".
Perhaps you meant this:
$find = '/<b>.*?<\/b>/';
That would match "<b>", followed by a string of unknown length, ending at the first occurrence of "</b>". I'm not sure why you would split on that though; applied on the above you would get an array of three elements:
" "
"<i>not</b> "
""
To match everything inside "<b>" and "</b>" you need preg_match_all():
preg_match_all('#<b>(.*?)</b>#i', $str, $matches);
// $matches[1] will contain the patterns inside the bold tag, theoratically
Do note that nested tags are not a great fit for regular expressions and you'd be wanting to use DOMDocument.
$find = '/^<b>*<\/b>/'; // why doesn't this work?
Matches "<b" at the start of the string, zero or more ">", followed by "</b>".
$find = '/<b>.<\/b>/'; // why doesn't this work
Matches "<b>", followed by any character, followed by "</b>".

Related

Replace all occurrences using preg_replace

The code below works perfectly:
$string = '(test1)';
$new = preg_replace('/^\(+.+\)+$/','word',$string);
echo $new;
Output:
word
If the code is this:
$string = '(test1) (test2) (test3)';
How to generate output:
word word word?
Why my regex do not work ?
^ and $ are anchors which means match should start from start of string and expand upto end of string
. means match anything except newline, + means one or more, by default regex is greedy in nature so it tries to match as much as possible where as we want to match ( ) so we need to change the pattern a bit
You can use
\([^)]+\)
$string = '(test1) (test2) (test3)';
$new = preg_replace('/\([^)]+\)/','word',$string);
echo $new;
Regex Demo

How to get the index of last word with an uppercase letter in PHP

Considering this input string:
"this is a Test String to get the last index of word with an uppercase letter in PHP"
How can I get the position of the last uppercase letter (in this example the position of the first "P" (not the last one "P") of "PHP" word?
I think this regex works. Give it a try.
https://regex101.com/r/KkJeho/1
$pattern = "/.*\s([A-Z])/";
//$pattern = "/.*\s([A-Z])[A-Z]+/"; pattern to match only all caps word
Edit to solve what Wiktor wrote in comments I think you could str_replace all new lines with space as the input string in the regex.
That should make the regex treat it as a single line regex and still give the correct output.
Not tested though.
To find the position of the letter/word:
$str = "this is a Test String to get the last index of word with an uppercase letter in PHP";
$pattern = "/.*\s([A-Z])(\w+)/";
//$pattern = "/.*\s([A-Z])([A-Z]+)/"; pattern to match only all caps word
preg_match($pattern, $str, $match);
$letter = $match[1];
$word = $match[1] . $match[2];
$position = strrpos($str, $match[1].$match[2]);
echo "Letter to find: " . $letter . "\nWord to find: " . $word . "\nPosition of letter: " . $position;
https://3v4l.org/sJilv
If you also want to consider a non-regex version: You can try splitting the string at the whitespace character, iterating the resulting string array backwards and checking if the current string's first character is an upper case character, something like this (you may want to add index/null checks):
<?php
$str = "this is a Test String to get the last index of word with an uppercase letter in PHP";
$explodeStr = explode(" ",$str);
$i = count($explodeStr) - 1;
$characterCount=0;
while($i >= 0) {
$firstChar = $explodeStr[$i][0];
if($firstChar == strtoupper($firstChar)){
echo $explodeStr[$i]. ' at index: ';
$idx = strlen($str)-strlen($explodeStr[$i] -$characterCount);
echo $idx;
break;
}
$characterCount += strlen($explodeStr[i]) +1; //+1 for whitespace
$i--;
}
This prints 80 which is indeed the index of the first P in PHP (including whitespaces).
Andreas' pattern looks pretty solid, but this will find the position faster...
.* \K[A-Z]{2,}
Pattern Demo
Here is the PHP implementation: Demo
$str='this is a Test String to get the last index of word with an uppercase letter in PHP test';
var_export(preg_match('/.* \K[A-Z]{2,}/',$str,$out,PREG_OFFSET_CAPTURE)?$out[0][1]:'fail');
// 80
If you want to see a condensed non-regex method, this will work:
Code: Demo
$str='this is a Test String to get the last index of word with an uppercase letter in PHP test';
$allcaps=array_filter(explode(' ',$str),'ctype_upper');
echo "Position = ",strrpos($str,end($allcaps));
Output:
Position = 80
This assumes that there is an all caps word in the input string. If there is a possibility of no all-caps words, then a conditional would sort it out.
Edit, after re-reading the question, I am unsure what exactly makes PHP the targeted substring -- whether it is because it is all caps, or just the last word to start with a capitalized letter.
If just the last word starting with an uppercase letter then this pattern will do: /.* \K[A-Z]/
If the word needs to be all caps, then it is possible that /b word boundaries may be necessary.
Some more samples and explanation from the OP would be useful.
Another edit, you can declare a set of characters to exclude and use just two string functions. I am using a-z and a space with rtrim() then finding the right-most space, and adding 1 to it.
$str='this is a Test String to get the last index of word with an uppercase letter in PHP test';
echo strrpos(rtrim($str,'abcdefghijklmnopqrstuvwxyz '),' ')+1;
// 80

Php select from string

Hi I'm new to php and I need a little help
I need to change the text that is between ** in php string and put it between html tag
$text = "this is an *example*";
But I really don't know how and i need help
personally I would use explode, you can then piece the sentence back together if the example appears in the middle of a sentence
<?php
$text = "this is an *example*";
$pieces = explode("*", $text);
echo $pieces[0];
?>
Edit:
Since you're looking for what basically amounts to custom BB Code use this
$text = "this is an *example*";
$find = '~[\*](.*?)[\*]~s';
$replace = '<span style="color: green">$1</span>';
echo preg_replace($find,$replace,$text);
You can add this to a function and have it parse any text that gets passed to it, you can also make the find and replace variables into arrays and add more codes to it
You really should use a DOM parser for things like this, but if you can guaratee it will always be the * character you can use some regex:
$text = "this is an *example*";
$regex = '/(?<=\*)(.*?)(?=\*)/';
$replacement = 'ostrich';
$new_text = preg_replace($regex, $replacement, $text);
echo $new_text;
Returns
this is an *ostrich*
Here is how the regex works:
Positive Lookbehind (?<=\*)
\* matches the character * literally (case sensitive)
1st Capturing Group (.*?)
.*? matches any character (except for line terminators)
*? Quantifier — Matches between zero and unlimited times, as few times as possible, expanding as needed (lazy)
Positive Lookahead (?=\*)
\* matches the character * literally (case sensitive)
This regex essentially starts and ends by looking at what is ahead of and behind the search character you specified and leaves those characters intact during the replacement with preg_replace().

REGEX - php preg_replace :

How to replace part of a string based on a "." (period) character
only if it appears after/before/between a word(s),
not when it is before/between any number(s).
Example:
This is a text string.
-Should be able to replace the "string." with "string ."
(note the Space between the end of the word and the period)
Example 2
This is another text string. 2.0 times longer.
-Should be able to replace "string." with "string ."
(note the Space between the end of the word and the period)
-Should Not replace "2.0" with "2 . 0"
It should only do the replacement if the "." appears at the end/start of a word.
Yes - I've tried various bits of regex.
But everything I do results in either nothing happening,
or the numbers are fine, but I take the last letter from the word preceeding the "."
(thus instead of "string." I end up with "strin g.")
Yes - I've looked through numerous posts here - I have seen Nothing that deals with the desire, nor the "strange" problem of grabbing the char before the ".".
You can use a lookbehind (?<=REXP)
preg_replace("/(?<=[a-z])\./i", "XXX", "2.0 test. abc") // <- "2.0 testXXX abc"
which will only match if the text before matches the corresponding regex (in this case [a-z]). You may use a lookahead (?=REXP) in the same way to test text after the match.
Note: There is also a negative lookbehind (?<!REXP) and a negative lookahead (?!REXP) available which will reject matches if the REXP does not match before or after.
$input = "This is another text string. 2.0 times longer.";
echo preg_replace("/(^\.|\s\.|\.\s|\.$)/", " $1", $input);
http://ideone.com/xJQzQ
I'm not too good with the regex, but this is what I would do to accomplish the task with basic PHP. Basically explode the entire string by it's . values, look at each variable to see if the last character is a letter or number and add a space if it's a number, then puts the variable back together.
<?
$string = "This is another text string. 2.0 times longer.";
print("$string<br>");
$string = explode(".",$string);
$stringcount = count($string);
for($i=0;$i<$stringcount;$i++){
if(is_numeric(substr($string[$i],-1))){
$string[$i] = $string[$i] . " ";
}
}
$newstring = implode('.',$string);
print("$newstring<br>");
?>
It should only do the replacement if the "." appears at the end/start of a word.
search: /([a-z](?=\.)|\.(?=[a-z]))/
replace: "$1 "
modifiers: ig (case insensitive, globally)
Test in Perl:
use strict;
use warnings;
my $samp = '
This is another text string. 2.0 times longer.
I may get a string of "this and that.that and this.this 2.34 then that 78.01."
';
$samp =~ s/([a-z](?=\.)|\.(?=[a-z]))/$1 /ig;
print "$samp";
Input:
This is another text string. 2.0 times longer.
I may get a string of "this and that.that and this.this 2.34 then that 78.01."
Output:
This is another text string . 2.0 times longer .
I may get a string of "this and that . that and this . this 2.34 then that 78.01."

Attempting to understand handling regular expressions with php

I am trying to make sense of handling regular expression with php. So far my code is:
PHP code:
$string = "This is a 1 2 3 test.";
$pattern = '/^[a-zA-Z0-9\. ]$/';
$match = preg_match($pattern, $string);
echo "string: " . $string . " regex response: " , $match;
Why is $match always returning 0 when I think it should be returning a 1?
[a-zA-Z0-9\. ] means one character which is alphanumeric or "." or " ". You will want to repeat this pattern:
$pattern = '/^[a-zA-Z0-9. ]+$/';
^
"one or more"
Note: you don't need to escape . inside a character group.
Here's what you're pattern is saying:
'/: Start the expressions
^: Beginning of the string
[a-zA-Z0-9\. ]: Any one alphanumeric character, period or space (you should actually be using \s for spaces if your intention is to match any whitespace character).
$: End of the string
/': End the expression
So, an example of a string that would yield a match result is:
$string = 'a'
Of other note, if you're actually trying to get the matches from the result, you'll want to use the third parameter of preg_match:
$numResults = preg_match($pattern, $string, $matches);
You need a quantifier on the end of your character class, such as +, which means match 1 or more times.
Ideone.

Categories