I'm trying to use regular expressions (preg_match and preg_replace) to do the following:
Find a string like this:
{%title=append me to the title%}
Then extract out the title part and the append me to the title part. Which I can then use to perform a str_replace(), etc.
Given that I'm terrible at regular expressions, my code is failing...
preg_match('/\{\%title\=(\w+.)\%\}/', $string, $matches);
What pattern do I need? :/
I think it's because \w doesn't match spaces. Everything between the equals sign and the closing %} has to be matched by what is inside the parentheses, or else the entire expression fails to match.
This bit of code worked for me:
$str = '{%title=append me to the title%}';
preg_match('/{%title=([\w ]+)%}/', $str, $matches);
print_r($matches);
//gives:
//Array ([0] => {%title=append me to the title%} [1] => append me to the title )
Note that the use of + (one or more) means that an empty value, i.e. {%title=%}, won't match. Depending on what you expect for whitespace, you might want to use \s inside the character class alongside \w, i.e. [\w\s], instead of an actual space character. \s will match tabs, newlines, etc.
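To illustrate the note about + and \s, here is a small sketch. It is my own variation rather than code from the question: it swaps the space for \s and + for *, on the assumption that an empty title should be accepted.

```php
<?php
// Variation on the pattern above: [\w\s]* accepts any whitespace and also an empty value.
$pattern = '/{%title=([\w\s]*)%}/';

preg_match($pattern, '{%title=append me to the title%}', $m1);
preg_match($pattern, "{%title=two\twords%}", $m2); // tab instead of a space
preg_match($pattern, '{%title=%}', $m3);           // empty value

var_dump($m1[1], $m2[1], $m3[1]);
```

With the original [\w ]+ the second and third strings would not match at all.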
You can try:
$str = '{%title=append me to the title%}';
// capture the thing between % and = as title
// and between = and % as the other part.
if(preg_match('#{%(\w+)\s*=\s*(.*?)%}#',$str,$matches)) {
$title = $matches[1]; // extract the title.
$append = $matches[2]; // extract the appending part.
// find these (preg_quote escapes any regex metacharacters in the captured text).
$find = array('/'.preg_quote($append,'/').'/','/'.preg_quote($title,'/').'/');
// replace the found things with these.
$replace = array('IS GOOD','TITLE');
// use preg_replace for replacement, only when the match succeeded.
$str = preg_replace($find,$replace,$str);
}
var_dump($str);
Output:
string(17) "{%TITLE=IS GOOD%}"
Note:
In your regex: /\{\%title\=(\w+.)\%\}/
There is no need to escape %, as it's not a metacharacter.
There is no need to escape { and } either. These are metacharacters only when used as a quantifier in the form {min,max}, {min,} or {num}, so in your case they are treated literally.
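A quick way to check the escaping claim is to run both spellings against the same string. This is only a sketch; the variable names are mine.

```php
<?php
$str = '{%title=append me to the title%}';

// Fully escaped, as in the question (with a capture that allows spaces).
$escaped = preg_match('/\{\%title\=(.*?)\%\}/', $str, $a);
// No escaping at all.
$unescaped = preg_match('/{%title=(.*?)%}/', $str, $b);

var_dump($escaped === 1 && $unescaped === 1); // bool(true)
var_dump($a === $b);                          // bool(true)
```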
Try this:
preg_match('/(title)\=(.*?)([%}])/s', $string, $matches);
$matches[1] has your title and $matches[2] has the other part.
Thanks for your help.
My goal is to use preg_replace with a pattern to remove some very simple strings.
Using only preg_replace, on this string or others, I need to remove ANY content between <tag and the next > symbol. The pattern is simple:
$x = '#<\w+(\s+[^>]*)>#is';
$s = 'DATA<td class="td1">111</td><td class="td2">222</td>DATA';
preg_match_all($x, $s, $Q);
print_r($Q[1]);
[1] => Array
(
[0] => class="td1"
[1] => class="td2"
)
Works great!
Now I try to remove the strings using the same pattern:
$new_string = '';
$Q = preg_replace($x, "\\1$new_string", $s);
print_r($Q);
The result is completely different.
What is wrong with my use of preg_replace?
Using only preg_replace(), how can I remove these strings?
(We could use foreach(...) to remove each string, but where is the error in my code?)
The result I expect when I pass in this value:
$s = 'DATA<td class="td1">111</td><td class="td2">222</td>DATA';
is this output:
$Q = 'DATA<td>111</td><td>222</td>DATA';
Let's break down your RegEx, #<\w+(\s+[^>]*)>#is, and see if that helps.
# // Start delimiter
< // Literal `<` character
\w+ // One or more word-characters, a-z, A-Z, 0-9 or _
( // Start capturing group
\s+ // One or more whitespace characters
[^>]* // Zero or more characters that are not the literal `>`
) // End capturing group
> // Literal `>` character
# // End delimiter
is // Ignore case and `.` matches all characters including newline
Given the input DATA<td class="td1">DATA, this matches <td class="td1"> and captures  class="td1" (including the leading space, because \s+ is inside the group). The difference between match and capture is very important.
When you use preg_match you'll see the entire match at index 0, and any subsequent captures at incrementing indexes.
When you use preg_replace the entire match will be replaced. You can use the captures, if you so choose, but you are replacing the match.
I'm going to say that again: whatever you pass as the replacement string will replace the entirety of the found match. If you say $1 or \\1, you are saying replace the entire match with just the capture.
Going back to the sample after the breakdown, using $1 is the equivalent of calling:
str_replace('<td class="td1">', ' class="td1"', $string);
which you can see here: https://3v4l.org/ZkPFb
To your question "how to change [0] by $new_string", you are doing it correctly, it is your RegEx itself that is wrong. To do what you are trying to do, your pattern must capture the tag itself so that you can say "replace the HTML tag with all of the attributes with just the tag".
As one of my comments noted, this is where you'd invert the capturing. You aren't interested in capturing the attributes; you are throwing those away. Instead, you are interested in capturing the tag itself:
$string = 'DATA<td class="td1">DATA';
$pattern = '#<(\w+)\s+[^>]*>#is';
echo preg_replace($pattern, '<$1>', $string);
Demo: https://3v4l.org/oIW7d
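To see both behaviours side by side on the full sample string from the question, here is a short sketch. The variable names are mine; the two patterns are the ones discussed above.

```php
<?php
$s = 'DATA<td class="td1">111</td><td class="td2">222</td>DATA';

// Capturing the attributes: each whole match is replaced by the attributes alone.
$attributesKept = preg_replace('#<\w+(\s+[^>]*)>#is', '$1', $s);

// Capturing the tag name: each whole match is replaced by the bare tag.
$attributesRemoved = preg_replace('#<(\w+)\s+[^>]*>#is', '<$1>', $s);

echo $attributesKept, "\n";    // DATA class="td1"111</td> class="td2"222</td>DATA
echo $attributesRemoved, "\n"; // DATA<td>111</td><td>222</td>DATA
```

The closing </td> tags are untouched in both cases because \w+ cannot match the / that follows <.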
I am trying to search for this match in a string:
1. I need to take only the numbers after the character '#', as long as the match has no spaces, for example:
String = 'This is a test #VVC345RR, text, and more text 12345';
I want to take only this from my string -> 345.
My example:
$s = '\"access_token=103782364732640461|2. myemail#domain1.com ZmElnDTiZlkgXbT8e3 #DD234 4Jrw__.3600.1281891600-10000186237005';
$matches = array();
$s = preg_match('/#([0-9]+)/', $s, $matches);
print_r($matches);
This only works when I have one # and numbers.
Thanks!
Maybe:
#\D*\K(\d+)
Accomplishes what you want?
This will look for a #, skip any non-digits, and then capture the digits. The \K resets the start of the reported match, so the # and the non-digits before the number are left out of the overall match.
https://regex101.com/r/gNTccx/1/
I'm unclear what you mean by has not spaces, there are no spaces in the example string.
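If "has no spaces" means the digits must belong to the same whitespace-free token as the #, one possible reading is to swap \D for [^\d\s]. That interpretation is my assumption, not something the question confirms. A sketch comparing the two:

```php
<?php
$s = 'a # b 99 #X12';

// \D* may run across spaces, so the lone "#" reaches the 99.
preg_match('/#\D*\K\d+/', $s, $acrossSpaces);

// [^\d\s]* stops at whitespace, so only "#X12" qualifies.
preg_match('/#[^\d\s]*\K\d+/', $s, $sameToken);

// The sample string from the question.
preg_match('/#[^\d\s]*\K\d+/', 'This is a test #VVC345RR, text, and more text 12345', $sample);

var_dump($acrossSpaces[0], $sameToken[0], $sample[0]);
```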
I have a question. I need to add a + before every word and treat everything between quotes as one word.
I have this code
preg_replace("/\w+/", '+\0', $string);
which results in this
+test +demo "+bla +bla2"
But I need
+test +demo +"bla bla2"
Can someone help me :)
And is it possible to not add a + if there is already one? So you don't get ++test
Thanks!
Maybe you can use this regex:
$string = '+test demo between "double quotes" and between \'single quotes\' test';
$result = preg_replace('/\b(?<!\+)\w+|["\'].+?["\']/', '+$0', $string);
var_dump($result);
// which will result in:
string '+test +demo +between +"double quotes" +and +between +'single quotes' +test' (length=74)
I've used a 'negative lookbehind' to check for the '+' (see "Regex lookahead, lookbehind and atomic groups").
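One thing to be aware of: the lookbehind above only guards the word branch, so a quoted phrase that already has a + in front would get a second one. A possible alternative, sketched here with a hypothetical helper name and assuming only double quotes are used, is to match the optional + as part of each token and decide in a callback:

```php
<?php
// Hypothetical helper: prefixes every word or double-quoted phrase with "+",
// unless the token already starts with one.
function prefix_terms(string $input): string
{
    // A token is an optional "+", then either a quoted phrase or a run of word characters.
    return preg_replace_callback('/\+?(?:"[^"]*"|\w+)/', function (array $m): string {
        return $m[0][0] === '+' ? $m[0] : '+' . $m[0];
    }, $input);
}

echo prefix_terms('test demo "bla bla2"'), "\n";   // +test +demo +"bla bla2"
echo prefix_terms('+test demo +"bla bla2"'), "\n"; // +test +demo +"bla bla2"
```

Because the quoted alternative comes first and consumes the whole phrase, the words inside the quotes are never visited on their own.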
I can't test this but could you try it and let me know how it goes?
First the regex: choose either a quoted phrase (a quotation mark, any characters other than a quotation mark, then a closing quotation mark) or a series of word characters, each of which may or may not be preceded by a '+'.
I would hope this matches all your examples.
We then get all the matches of the regex in your string and store them in the variable "$matches", which is an array; the full matches are in $matches[0]. We then loop through them, testing whether the first character is a '+'. If it is, do nothing; otherwise add one.
We then implode the array into a string, separating the elements by a space.
Note: $matches is created when given as a parameter to preg_match_all.
$regex = '/\+?"[^"]+"|\+?\w+/';
preg_match_all($regex, $string, $matches);
$tokens = $matches[0];
foreach($tokens as &$token)
{
if(substr($token, 0, 1) != "+") $token = "+" . $token; // PHP concatenates with ".", not "+"
}
unset($token); // drop the reference left by the loop
$result = implode(" ", $tokens);
How can I check if a string has the format [group|any_title] and give me the title back?
[group|This is] -> This is
[group|just an] -> just an
[group|example] -> example
I would do that with explode, using [group| as the delimiter, and then remove the last ]. If the length (of the explode result) is > 0, then the string has the correct format.
But I think that is not a very good way, is it?
So you want to check if a string matches a regex?
if(preg_match('/^\[group\|(.+)\]$/', $string, $m)) {
$title = $m[1];
}
If the group part is supposed to be dynamic as well:
if(preg_match('/^\[(.+)\|(.+)\]$/', $string, $m)) {
$group = $m[1];
$title = $m[2];
}
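Note that with (.+) on both sides, a string containing a second | is still accepted and is split at the last one. If that is not wanted, a stricter variant could exclude | and ] from each part. The sketch below is one way to do it; the function name and the named groups are my own choices.

```php
<?php
// Hypothetical helper: returns ['group' => ..., 'title' => ...] or null.
function parse_group_tag(string $input): ?array
{
    // [^|\]]+ keeps each part from running across a "|" or "]".
    if (preg_match('/^\[(?<group>[^|\]]+)\|(?<title>[^|\]]+)\]$/', $input, $m)) {
        return ['group' => $m['group'], 'title' => $m['title']];
    }
    return null;
}

var_dump(parse_group_tag('[group|This is]')); // group => "group", title => "This is"
var_dump(parse_group_tag('group|This is'));   // NULL (missing brackets)
var_dump(parse_group_tag('[a|b|c]'));         // NULL (extra "|")
```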
Use regular expression matching with the PHP function preg_match.
You can use for example regexr.com to create and test a regular expression and when you're done, then implement it in your PHP script (replace the first parameter of preg_match with your regular expression):
$text = '[group|This is]';
// replace "pattern" with regular expression pattern
if (preg_match('/pattern/', $text, $matches)) {
// OK, you have parts of $text in $matches array
}
else {
// $text doesn't contain text in expected format
}
The specific regular expression pattern depends on how strictly you want to check your input string. It can be, for example, something like /^\[.+\|(.+)\]$/ or /\|([A-Za-z ]+)\]$/. The first checks that the string starts with [, ends with ] and contains any characters delimited by | in between. The second just checks that the string ends with | followed by upper- and lower-case alphabetic characters and spaces, and finally ].
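To make the difference between the two example patterns concrete, this sketch runs both against a few inputs. The test strings and variable names are made up for illustration.

```php
<?php
$anchored   = '/^\[.+\|(.+)\]$/';     // must start with "[" and end with "]"
$suffixOnly = '/\|([A-Za-z ]+)\]$/';  // only looks at the "|title]" ending

$inputs = ['[group|This is]', 'oops|This is]', '[group|Title 2]'];
$results = [];
foreach ($inputs as $input) {
    $results[$input] = [
        'anchored'   => preg_match($anchored, $input) === 1,
        'suffixOnly' => preg_match($suffixOnly, $input) === 1,
    ];
}
print_r($results);
```

The second input lacks the opening [, so only the suffix pattern accepts it; the third has a digit in the title, so only the anchored pattern accepts it.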
I have a String which contains substrings which I have to replace. The substrings are stored in an array. When I loop through the array everything works fine, until the array has more than 120 entries.
foreach ( $activeTags as $k => $v ) {
$find = $activeTags[$k]['Tag']['tag'];
$replace = 'that';
$pattern = "/\#\#[a-zA-Z][a-zA-Z]\#\#.*\b$find\b.*\#\#END_[a-zA-Z][a-zA-Z]\#\#|$find/";
$sText = '<p>Do not replace ##DS## this ##END_DS## replace this.</p>';
$sText = preg_replace_callback($pattern, function($match) use($find, $replace){
if($match[0] == $find){
return($replace);
}else{
return($match[0]);
}
}, $sText);
}
When count($activeTags) == 121 I only get an empty string.
Does anyone have an idea why this happens?
Try this improved pattern:
$pattern = "~##([a-zA-Z]{2})##.*?\b$find\b.*?##END_\\1##|$find~s"; // \\1, because "\1" in a double-quoted PHP string is an octal escape
Discussion
The s flag indicates that the dot (.) should also match newlines. In your example, p tags are mentioned, so I guess it's an HTML fragment. Since newlines are allowed in HTML, I have added the s flag. Moreover, I have made the pattern more readable by:
using custom pattern delimiters: / becomes ~, so you avoid escaping anything...
replacing duplicate subpatterns: [a-zA-Z][a-zA-Z] becomes [a-zA-Z]{2}
taking advantage of the sequence ##DS## ##END_DS##. I use a backreference (\1) to match what was found in the first capturing group.
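Putting the pieces together, here is a sketch of how the improved pattern might be wrapped up. The function name is hypothetical, and two things are my additions rather than part of the answer above: preg_quote() on the search term, and a check for a null result, which is what preg_replace_callback returns when PCRE reports an error.

```php
<?php
// Hypothetical helper; the name and signature are illustrative only.
function replace_outside_blocks(string $text, string $find, string $replace): string
{
    $quoted = preg_quote($find, '~'); // escape regex metacharacters and the delimiter
    // Single quotes keep \b and \1 intact for the regex engine.
    $pattern = '~##([a-zA-Z]{2})##.*?\b' . $quoted . '\b.*?##END_\1##|' . $quoted . '~s';

    $result = preg_replace_callback($pattern, function (array $m) use ($find, $replace): string {
        // A bare hit equals $find; a protected block is returned unchanged.
        return $m[0] === $find ? $replace : $m[0];
    }, $text);

    if ($result === null) {
        // Surface the PCRE error instead of silently continuing with an empty value.
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }
    return $result;
}

$out = replace_outside_blocks('<p>Do not replace ##DS## this ##END_DS## replace this.</p>', 'this', 'that');
echo $out, "\n"; // <p>Do not replace ##DS## this ##END_DS## replace that.</p>
```

Checking for null will not by itself explain why a particular tag fails, but it does show which iteration produced the error instead of leaving an empty string behind.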