Looking to use preg_replace to remove characters from my strings - php

I have the right function, but I can't find the right regex pattern to remove (ID:999999) from the string. The ID value varies but is always numeric. I'd like to remove everything, including the brackets.
$string = "This is the value I would like removed. (ID:17937)";
$string = preg_replace('#(ID:['0-9']?)#si', "", $string);
Regex is not my forte, and I need help with this one.

Try this:
$string = preg_replace('# \(ID:[0-9]+\)#si', "", $string);
You need to escape the parentheses using backslashes (\).
You shouldn't use quotes around the number range.
You should use + (one or more) instead of ? (zero or one).
You can add a space at the start, to avoid having a space at the end of the resulting string.

In PHP, a regex pattern must be wrapped in delimiters. / is the most common choice, although # is also valid. Parentheses create a capture group, so you must escape them to match them literally.
A capture group is only needed if you want to reuse the matched text in the replacement. Since you are replacing with an empty string, /\(ID:[0-9]+\)/ is a suitable regular expression.

Here are two options:
Code: (Demo)
$string = "This is the value I would like removed. (ID:17937)";
var_export(preg_replace('/ \(ID:\d+\)/',"",$string));
echo "\n\n";
var_export(strstr($string,' (ID:',true));
Output: (I used var_export() to show that the technique is "clean" and gives no trailing whitespaces)
'This is the value I would like removed.'
'This is the value I would like removed.'
Some points:
Regex is a better / more flexible solution if your ID substring can exist anywhere in the string.
Your regex pattern doesn't need a character class if you use the shorthand range character \d.
Regex generally speaking should only be used when standard string function will not suffice or when it is proven to be more efficient for a specific case.
If your ID substring always occurs at the end of the string, strstr() is an elegant/perfect function.
Both of my methods include the space before (ID: so that the output has no trailing whitespace.
You don't need either s or i modifiers on your pattern, because s only matters if you use a . (dot) and your ID is probably always uppercase so you don't need a case-insensitive search.
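The difference between the two options only shows up when the ID substring is not at the end. A minimal sketch, assuming a made-up sample string (not from the question) with the ID in the middle:

```php
<?php
// Hypothetical sample: the ID substring sits in the middle of the string.
$mid = "Order (ID:42) has shipped.";

// The regex removes only the " (ID:42)" part and keeps the rest.
$viaRegex = preg_replace('/ \(ID:\d+\)/', '', $mid);

// strstr() with before_needle = true keeps only what precedes " (ID:".
$viaStrstr = strstr($mid, ' (ID:', true);

var_export($viaRegex);
echo "\n";
var_export($viaStrstr);
```

Here the regex keeps the text after the ID, while strstr() discards it, so strstr() is only a fit when nothing useful follows the ID.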

Related

'Delimiter must not be alphanumeric or backslash' and preg_replace() [duplicate]

I am trying to take a string of text like so:
$string = "This (1) is (2) my (3) example (4) text";
In every instance where there is a positive integer inside of parentheses, I'd like to replace that with simply the integer itself.
The code I'm using now is:
$result = preg_replace("\((\d+)\)", "$0", $string);
But I keep getting a
Delimiter must not be alphanumeric or backslash.
Warning
Any thoughts? I know there are other questions on here that sort of answer the question, but my knowledge of regex is not enough to switch it over to this example.
You are almost there. You are using:
$result = preg_replace("((\d+))", "$0", $string);
The regex you pass as the first argument to the preg_* family of functions must be enclosed in a pair of delimiters. Since you are not using any delimiters, you get that error.
( and ) are metacharacters in a regex, meaning they have special meaning. Since you want to match a literal opening and closing parenthesis, you need to escape them with a \. A metacharacter preceded by \ is treated literally.
You are capturing the integer correctly using (\d+), but the captured integer will be in $1 and not $0. $0 holds the entire match, that is, the integer together with its parentheses.
If you do all the above changes you'll get:
$result = preg_replace("#\((\d+)\)#", "$1", $string);
1) You need to have a delimiter, the / works fine.
2) You have to escape the ( and ) characters so it doesn't think it's another grouping.
3) Also, the replace variables here start at 1, not 0 (0 contains the FULL text match, which would include the parentheses).
$result = preg_replace("/\((\d+)\)/", "\\1", $string);
Something like this should work. Any further questions, go to PHP's preg_replace() documentation - it really is good.
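To see the difference between the whole match and the first capture group, here is a small sketch using the question's sample string. The square brackets in the first replacement are only there to make $0 visible; they are not part of any answer above.

```php
<?php
$string = "This (1) is (2) my (3) example (4) text";

// $0 is the entire match, parentheses included.
$whole = preg_replace('/\((\d+)\)/', '[$0]', $string);

// $1 is only what the inner (\d+) group captured.
$group = preg_replace('/\((\d+)\)/', '$1', $string);

echo $whole, "\n";
echo $group, "\n";
```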
Check the docs - you need to use a delimiter before and after your pattern: "/\((\d+)\)/"
You'll also want to escape the outer parentheses above as they are literals, not a nested matching group.
See: preg_replace manual page
Try:
<?php
$string = "This (1) is (2) my (3) example (4) text";
$output = preg_replace('/\((\d+)\)/', '$1', $string);
echo $output;
?>
The parenthesis chars are special chars in a regular expression. You need to escape them to use them.
If you get "Delimiter must not be alphanumeric or backslash", wrap the pattern in delimiters such as "/ ... /", as shown below. Without them, PHP emits that warning.
$yourString = 'hi there, good friend';
$dividerString = 'there';
$someString = preg_replace("/$dividerString/", '', $yourString);
echo $someString;
// hi , good friend
This worked for me.
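When a variable is interpolated into a pattern like this, any regex metacharacters or delimiter characters it contains will be interpreted as part of the pattern. A minimal sketch of one way to guard against that with preg_quote(); the sample values are made up for illustration:

```php
<?php
// Hypothetical values: the needle contains a delimiter (/) and metacharacters.
$needle   = 'a/b (v1.0)';
$haystack = 'install a/b (v1.0) now';

// preg_quote() escapes regex metacharacters; the second argument also
// escapes the delimiter used around the pattern.
$pattern = '/' . preg_quote($needle, '/') . '/';

$result = preg_replace($pattern, '', $haystack);
var_export($result);
```

For a plain literal replacement like this one, str_replace() would also do the job without any regex.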

Getting regular expression

How can I extract https://domain.com/gamer?hid=.115f12756a8641 (that is, the value of url) from the string below?
rrth:'http://www.google.co',cctp:'323',url:'https://domain.com/gamer?hid=.115f12756a8641',rrth:'https://another.com'
P.S.: I am new to regular expressions and still learning. The string above seems to follow a regular format, so there must be some sort of shortcut.
If your input string is called $str:
preg_match('/url:\'(.*?)\'/', $str, $matches);
$url = $matches[1];
(.*?) captures everything between url:' and ' and can later be retrieved with $matches[1].
The ? is particularly important. It makes the repetition ungreedy, otherwise it would consume everything until the very last '.
If your actual input string contains multiple url:'...' sections, use preg_match_all instead. $matches[1] will then be an array of all required values.
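As a sketch of the preg_match_all variant, assuming an input with two url:'...' sections (the sample URLs are invented for illustration):

```php
<?php
// Hypothetical input containing two url:'...' sections.
$str = "url:'https://a.example/x',rrth:'https://another.com',url:'https://b.example/y'";

// Same pattern as above; preg_match_all collects every match.
preg_match_all('/url:\'(.*?)\'/', $str, $matches);

// $matches[1] holds the captured group of each match, in order.
$urls = $matches[1];
print_r($urls);
```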
Simple regex:
preg_match('/url\s*\:\s*\'([^\']+)/i',$theString,$match);
echo $match[1];//should be the url
How it works:
/url\s*\:\s*: matches url + [any number of spaces] + : (colon) + [any number of spaces]. We don't need to capture this part; that's where the second part comes in.
\'([^\']+)/i: matches ', then the parentheses (()) create a group that is stored separately in the $match array. What gets matched is [^']+: any character except the apostrophe (the [] create a character class, and the ^ means: exclude these chars). So this class matches every character up to the closing/delimiting apostrophe.
/i: in case the string might contain URL:'http://www.foo.bar', I've added that i, which is the case-insensitive flag.
That's about it. Perhaps you could read up on regexes to get a better understanding.
Note: I've had to escape the single quotes because the PHP string holding the pattern is itself single-quoted: "/url\s*\:\s*'([^']+)/i" works just as well. If you don't know whether you'll be dealing with single or double quotes, you could replace the quote with a character class:
preg_match('/url\s*\:\s*[\'"]([^\'"]+)/i',$string,$match);
Obviously, in that scenario, you'll have to escape the delimiters you've used for the pattern string...
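One limitation of a character class for both quote types is that it stops at either quote, even if the value legitimately contains the other kind. A possible variation, offered only as a sketch, captures the opening quote and uses a backreference so the value ends at the same kind of quote. The function name extractUrl and the second sample input are hypothetical.

```php
<?php
// Hypothetical helper: returns the url value, or null when nothing matches.
function extractUrl(string $input): ?string
{
    // Group 1 captures the opening quote; \1 requires the same closing quote.
    if (preg_match('/url\s*:\s*([\'"])(.*?)\1/i', $input, $m)) {
        return $m[2];
    }
    return null;
}

$single = extractUrl("rrth:'http://www.google.co',url:'https://domain.com/gamer?hid=.115f12756a8641'");
$double = extractUrl('URL : "https://example.com/it\'s"');

var_export($single);
echo "\n";
var_export($double);
```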

Insert separators into a string in regular intervals

I have the following string in php:
$string = 'FEDCBA9876543210';
The string can have 2 or more hexadecimal characters (usually many more).
I want to group the string in pairs, like:
$output_string = 'FE:DC:BA:98:76:54:32:10';
I wanted to use regex for that, I think I saw a way to do like "recursive regex" but I can't remember it.
Any help appreciated :)
If you don't need to check the content, there is no use for regex.
Try this
$outputString = chunk_split($string, 2, ":");
// generates: FE:DC:BA:98:76:54:32:10:
You might need to remove the last ":".
Or this :
$outputString = implode(":", str_split($string, 2));
// generates: FE:DC:BA:98:76:54:32:10
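If the content does need a basic check, the str_split() approach can be wrapped with a validation step. This is only a sketch: the function name hexPairs and the decision to reject odd-length input are assumptions, not part of the question.

```php
<?php
// Hypothetical helper: validates, then joins the string in pairs with ":".
function hexPairs(string $hex): string
{
    // ctype_xdigit() is true only for non-empty strings of 0-9, a-f, A-F.
    if (!ctype_xdigit($hex) || strlen($hex) % 2 !== 0) {
        throw new InvalidArgumentException('Expected an even number of hex characters');
    }
    return implode(':', str_split($hex, 2));
}

$viaStrSplit   = hexPairs('FEDCBA9876543210');
// The chunk_split() variant needs the trailing ":" trimmed to give the same result.
$viaChunkSplit = rtrim(chunk_split('FEDCBA9876543210', 2, ':'), ':');

echo $viaStrSplit, "\n", $viaChunkSplit, "\n";
```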
Resources :
www.w3schools.com - chunk_split()
www.w3schools.com - str_split()
www.w3schools.com - implode()
On the same topic :
Split string into equal parts using PHP
Sounds like you want a regex like this:
/([0-9a-f]{2})/${1}:/gi
Which, in PHP is...
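<?php
$string = 'FEDCBA9876543210';
// PHP has no g modifier: preg_replace() already replaces every match
$pattern = '/([0-9A-F]{2})/i';
$replacement = '${1}:';
echo preg_replace($pattern, $replacement, $string);
// FE:DC:BA:98:76:54:32:10:
?>
Please note that this leaves a trailing ":" that you may want to strip, e.g. with rtrim().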
You can make sure there are two or more hex letters (A-F) doing this:
if (preg_match('!^\d*[A-F]\d*[A-F][\dA-F]*$!i', $string)) {
...
}
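No need for a recursive regex. Strictly speaking, "recursive regex" is a contradiction in terms, since a regular language (which a true regex describes) can't be recursive by definition. PCRE does offer recursion through (?R) as an extension, but it isn't needed here.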
If you also want to check that the characters are grouped in pairs with colons in between, ignoring the two-hex-letter requirement for a moment, use:
if (preg_match('!^[\dA-F]{2}(?::[\dA-F]{2})*$!i', $string)) {
...
}
Now if you want to add the condition requiring two hex letters, use a positive lookahead:
if (preg_match('!^(?=[\d:]*[A-F][\d:]*[A-F])[\dA-F]{2}(?::[\dA-F]{2})*$!i', $string)) {
...
}
To explain how this works: the first thing it does is check (with a positive lookahead, i.e. (?=...)) that you have zero or more digits or colons, followed by a hex letter, followed by zero or more digits or colons, and then another hex letter. This ensures there are at least two hex letters in the string.
After the positive lookahead is the original expression that makes sure the string is pairs of hex digits.
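A quick way to check the lookahead behaviour is to run it against a few inputs. The sketch below assumes each colon-separated group is exactly two hex characters; the function name isColonHexWithTwoLetters and the sample inputs are made up for illustration.

```php
<?php
// Hypothetical helper: true when the string is colon-separated hex pairs
// containing at least two hex letters (A-F, case-insensitive).
function isColonHexWithTwoLetters(string $s): bool
{
    $pattern = '!^(?=[\d:]*[A-F][\d:]*[A-F])[\dA-F]{2}(?::[\dA-F]{2})*$!i';
    return preg_match($pattern, $s) === 1;
}

// Sample inputs chosen to exercise both the lookahead and the pair structure.
foreach (['FE:DC', 'A1:2B', '12:34', 'A1:23', 'FE:D'] as $sample) {
    echo $sample, ' => ', var_export(isColonHexWithTwoLetters($sample), true), "\n";
}
```

The third and fourth samples have well-formed pairs but fewer than two hex letters, so the lookahead rejects them. The last one has enough letters but a malformed final pair.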
Recursion is not part of classic regular expressions. You may apply a regular expression repeatedly to the results of a previous one, but most regex flavors do not support recursion (PCRE, which PHP uses, is an exception with (?R)). This is the main reason why regular expressions are almost always inadequate for parsing things like HTML. In any case, what you need doesn't require recursion at all.
What you want, simply, is to match a pattern multiple times. This is quite simple:
preg_match_all("/[a-f0-9]{2}/i", $string, $matches);
This will fill $matches[0] with all occurrences of two hexadecimal digits (in a case-insensitive way). To insert the separators, use preg_replace:
echo preg_replace("/([a-f0-9]{2})/i", '\1:', $string);
For an even-length string there will be one ':' too many at the end; you can strip it with substr:
echo substr(preg_replace("/([a-f0-9]{2})/i", '\1:', $string), 0, -1);
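Another way to avoid the trailing separator, sketched here under the assumption that the input has an even number of hex characters, is to collect the pairs with preg_match_all and join them afterwards:

```php
<?php
$string = 'FEDCBA9876543210';

// Collect every run of exactly two hex digits (case-insensitive).
preg_match_all('/[0-9a-f]{2}/i', $string, $matches);

// $matches[0] holds the full matches; joining them puts ":" only between pairs.
$joined = implode(':', $matches[0]);
echo $joined, "\n";
```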
While it is not horrible practice to use rtrim(chunk_split($string, 2, ':'), ':'), I prefer to use direct techniques that avoid "mopping up" after making modifications.
Code: (Demo)
$string = 'FEDCBA9876543210';
echo preg_replace('~[\dA-F]{2}(?!$)\K~', ':', $string);
Output:
FE:DC:BA:98:76:54:32:10
Don't be intimidated by the regex. The pattern says:
[\dA-F]{2} # match exactly two numeric or A through F characters
(?!$) # that is not located at the end of the string
\K # restart the fullstring match
When I say "restart the fullstring match" I mean "forget the previously matched characters and start matching from this point forward". Because there are no additional characters matched after \K, the pattern effectively delivers the zero-width position where the colon should be inserted. In this way, no original characters are lost in the replacement.
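For comparison, the same result can be reached without \K by capturing the pair and writing it back in the replacement. This sketch is only meant to show what \K saves you from doing:

```php
<?php
$string = 'FEDCBA9876543210';

// With \K: the pair is matched but dropped from the match, so only the
// zero-width position after it is replaced with ":".
$withK = preg_replace('~[\dA-F]{2}(?!$)\K~', ':', $string);

// Without \K: the pair is part of the match, so it must be captured and
// re-emitted as $1 in the replacement.
$withGroup = preg_replace('~([\dA-F]{2})(?!$)~', '$1:', $string);

echo $withK, "\n", $withGroup, "\n";
```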

Regex Question: Matching this pattern with hard or soft quotes

I have this anchor locating regex working pretty well:
$p = '%<a.*\s+name="(.*)"\s*>(?:.*)</a>%im';
It matches <a followed by zero or more of anything followed by a space and name="
It is grabbing the names even if a class or an id precedes the name in the anchor.
What I would like to add is the ability to match on name=' with a single quote (') as well since sooner or later someone will have done this.
Obviously I could just add a second regex written for this but it seems inelegant.
Anyone know how to add the single quote and just use one regex? Any other improvements or recommendations would be very welcome. I can use all the regex help I can get!
Thanks very much for reading,
function findAnchors($html) {
    $names = array();
    $p = '%<a.*\s+name="(.*)"\s*>(?:.*)</a>%im';
    // PREG_SET_ORDER groups results per match, so $m[1] is the captured name
    preg_match_all($p, $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $names[] = $m[1];
    }
    return $names; // always an array, even when nothing matched
}
James' comment is actually a very popular, but wrong regex used for string matching. It's wrong because it doesn't allow for escaping of the string delimiter. Given that the string delimiter is ' or " the following regex works
$regex = '([\'"])(.*?)(.{0,2})(?<![^\\\]\\\)(\1)';
Group 1 is the starting delimiter, group 2 is the contents (minus up to 2 characters), group 3 is the last 2 characters, and group 4 is the ending delimiter. This regex allows for escaping of delimiters as long as the escape character is \ and the escape character itself hasn't been escaped. For example:
'Valid'
'Valid \' String'
'Invalid ' String'
'Invalid \\' String'
Try this:
/<a(?:\s+(?!name)[^"'>]+(?:"[^"]*"|'[^']*')?)*\s+name=("[^"]*"|'[^']*')\s*>/im
Here you just have to strip the surrounding quotes:
substr($match[1], 1, -1)
But using a real parser like DOMDocument would certainly be better than this regular expression approach.
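A minimal sketch of the DOMDocument route, assuming PHP's DOM extension is available. The function name findAnchorNames and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: returns the name attribute of every <a> that has one.
function findAnchorNames(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is often malformed; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $names = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('name')) {
            $names[] = $anchor->getAttribute('name');
        }
    }
    return $names;
}

$html = '<p><a class="x" name="top">Top</a> <a name=\'bottom\' id="b">B</a> <a href="#">no</a></p>';
print_r(findAnchorNames($html));
```

The parser handles single quotes, double quotes, and attribute order on its own, so none of those cases need special treatment.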
Use [] to match character sets:
$p = "%<a.*\s+name=['\"](.*)['\"]\s*>(?:.*)</a>%im";
Your current solution won't match anchors with other attributes following 'name' (e.g. <a name="foo" id="foo">).
Try:
$regex = '%<a\s+\S*\s*name=["\']([^"\']+)["\']%i';
This will extract the contents of the 'name' attribute into capture group 1.
The \s* will also allow for line breaks between attributes.
You don't need to finish off with the rest of the 'a' tag, as the negated character class [^"']+ stops at the first closing quote.
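As a variation (a sketch, not the pattern from any answer here), the opening quote can be captured and matched again with a backreference, and [^>]*? can skip any number of preceding attributes. The helper name anchorNames and the sample markup are assumptions.

```php
<?php
// Hypothetical helper: collects name="..." or name='...' values from <a> tags.
function anchorNames(string $html): array
{
    // [^>]*? skips earlier attributes without leaving the tag;
    // group 1 is the quote character, group 2 is the attribute value.
    preg_match_all('%<a\b[^>]*?\sname=(["\'])(.*?)\1%is', $html, $m);
    return $m[2];
}

$html = '<a class="x" name="top">Top</a> <a name=\'bottom\' id="b">B</a> <a href="#">no</a>';
print_r(anchorNames($html));
```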
Here's another approach:
$rgx='~<a(?:\s+(?>name()|\w+)=(?|"([^"]*)"|\'([^\']*)\'))+?\1~i';
I know this question is old, but when it resurfaced just now I thought up another use for the "empty capturing groups as checkboxes" idiom from the Cookbook. The first, non-capturing group handles the matching of all "name=value" pairs under the control of a reluctant plus (+?). If the attribute name is literally name, the empty group (()) matches nothing, then the backreference (\1) matches nothing again, breaking out of the loop. (The backreference succeeds because the group participated in the match, even though it didn't consume any characters.)
The attribute value is captured each time in group #2, overwriting whatever was captured on the previous iteration. (The branch-reset construct ((?|(...)|(...)) enables us to "re-use" group #2 to capture the value inside the quotes, whichever kind of quotes they were.) Since the loop quits after the name name comes up, the final captured value corresponds to that attribute.
See a demo on Ideone
