preg_match doesn't capture the content - php

What is wrong with my preg_match?
preg_match('numVar("XYZ-(.*)");',$var,$results);
I want to get all the CONTENT from here:
numVar("XYZ-CONTENT");
Thank you for any help!

I assume this is PHP? If so there are three problems with your code.
PHP's PCRE functions require that regular expressions be formatted with a delimiter. The usual delimiter is /, but you can use any matching pair you want.
You did not escape your parentheses in your regular expression, so you're not matching a ( character but creating a RE group.
You should use non-greedy matching in your RE. Otherwise a string like numVar("XYZ-CONTENT1");numVar("XYZ-CONTENT2"); will match both, and your "content" group will be CONTENT1");numVar("XYZ-CONTENT2.
Try this:
$var = 'numVar("XYZ-CONTENT");';
preg_match('/numVar\("XYZ-(.*?)"\);/',$var,$results);
var_dump($results);
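To see why the non-greedy quantifier matters, here is a minimal sketch that runs both variants against the two-call string mentioned above. The variable names $greedy and $lazy are only for illustration.

```php
<?php
// Input with two calls on one line, as in the greedy-matching example above.
$var = 'numVar("XYZ-CONTENT1");numVar("XYZ-CONTENT2");';

// Greedy: (.*) runs on to the last '");' in the string.
preg_match('/numVar\("XYZ-(.*)"\);/', $var, $greedy);

// Non-greedy: (.*?) stops at the first '");'.
preg_match('/numVar\("XYZ-(.*?)"\);/', $var, $lazy);

echo $greedy[1], "\n"; // CONTENT1");numVar("XYZ-CONTENT2
echo $lazy[1], "\n";   // CONTENT1
```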

Paste your example string into http://txt2re.com and look at the PHP result.
It will show that you need to escape characters that have special meaning to the regex engine (such as the parentheses).

You should escape some characters and wrap the pattern in delimiters:
preg_match('/numVar\("XYZ-(.*)"\);/',$var,$results);

preg_match("/XYZ\-(.+)\b/", $string, $result);
print_r($result[0]); // the full match, e.g. XYZ-CONTENT
print_r($result[1]); // the text captured by the first group (.+)

Related

Looking to use preg_replace to remove characters from my strings

I have the right function, but I can't find the right regex pattern to remove (ID:999999) from the string. The ID value varies but is always numeric. I'd like to remove everything, including the brackets.
$string = "This is the value I would like removed. (ID:17937)";
$string = preg_replace('#(ID:['0-9']?)#si', "", $string);
Regex is not my forte, and I need help with this one.
Try this:
$string = preg_replace('# \(ID:[0-9]+\)#si', "", $string);
You need to escape the parentheses using backslashes \.
You shouldn't use quotes around the number range.
You should use + (one or more) instead of ? (zero or one).
You can add a space at the start, to avoid having a space at the end of the resulting string.
In PHP a regex must be wrapped in delimiters; / is the most common choice, but # is valid too. Parentheses create a capture group, so you must escape them to match them literally.
preg_replace does not need a capture group to remove a match, but if you want to capture the ID block as well, /(\(ID:[0-9]+\))/si is a suitable regular expression.
Here are two options:
Code: (Demo)
$string = "This is the value I would like removed. (ID:17937)";
var_export(preg_replace('/ \(ID:\d+\)/',"",$string));
echo "\n\n";
var_export(strstr($string,' (ID:',true));
Output: (I used var_export() to show that the technique is "clean" and gives no trailing whitespaces)
'This is the value I would like removed.'
'This is the value I would like removed.'
Some points:
Regex is a better / more flexible solution if your ID substring can exist anywhere in the string.
Your regex pattern doesn't need a character class if you use the shorthand \d, which matches a digit.
Regex generally speaking should only be used when standard string function will not suffice or when it is proven to be more efficient for a specific case.
If your ID substring always occurs at the end of the string, strstr() is an elegant/perfect function.
Both of my methods match a space before (ID to make the output clean.
You don't need either s or i modifiers on your pattern, because s only matters if you use a . (dot) and your ID is probably always uppercase so you don't need a case-insensitive search.
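The difference between the two options shows up when the ID is not at the end of the string. A small sketch, assuming a made-up input sentence with the ID in the middle:

```php
<?php
// Hypothetical input: the ID sits in the middle of the sentence.
$string = 'Order received (ID:17937) and queued.';

// The regex removes only the ID block, wherever it appears.
$viaRegex = preg_replace('/ \(ID:\d+\)/', '', $string);

// strstr() with true returns everything before the first ' (ID:',
// so the rest of the sentence is lost.
$viaStrstr = strstr($string, ' (ID:', true);

var_export($viaRegex);  // 'Order received and queued.'
echo "\n";
var_export($viaStrstr); // 'Order received'
```

So strstr() is only a safe replacement when nothing you want to keep follows the ID.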

RegEx expression to hit only words with a-z and no umlauts

Can you help me out with this one? I have a list of words like this:
sachbearbeiter/-in
referent/-in
anlagenführer/-in
it-projektleiter/-in
I want to select only:
sachbearbeiter/-in
referent/-in
This is my current regex: ([a-z]+)/-(in)
The problem is that it matches all of them, even the ones with - and with ü.
Thank you in advance.
You can use anchors to match the word you want:
^([a-z]+)/-(in)$
^---- Here ----^
Working demo
Update: regarding your comment, if you want to accept umlauts you can use the Unicode flag with \w like this:
^(\w+)/-(in)$
Working demo
You need to anchor the pattern at the beginning and end of the string so that it matches only those exact characters.
Change your regex to:
^([a-z]+)/-(in)$
^ -> stands for beginning of string
$-> for end of string
Your current regex, ([a-z]+)/-(in), does not escape the / character, and because it is unanchored it also matches substrings that fit the pattern, so it reports every line as a hit.
Regex should be: ^([a-z]+)\/-(in) i.e. the line must start with only lowercase letters, followed by an escaped /.
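The question does not say which language is in use, so here is a sketch of the anchored pattern in PHP, assuming the word list from the question sits in an array. Two things are my own choices rather than anything from the answers: # is used as the delimiter so the slash needs no escaping, and \p{L} stands in for \w in the umlaut variant, because \p{L} accepts letters only (no digits or underscores).

```php
<?php
$words = ['sachbearbeiter/-in', 'referent/-in', 'anlagenführer/-in', 'it-projektleiter/-in'];

// Anchored ASCII-only pattern: rejects both the hyphenated and the umlaut entry.
$asciiOnly = array_values(preg_grep('#^([a-z]+)/-(in)$#', $words));

// With the u modifier, \p{L} matches any Unicode letter, including ü.
$withUmlauts = array_values(preg_grep('#^(\p{L}+)/-(in)$#u', $words));

print_r($asciiOnly);   // sachbearbeiter/-in, referent/-in
print_r($withUmlauts); // sachbearbeiter/-in, referent/-in, anlagenführer/-in
```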

regular expressions for url parser

<?php
$string = 'user34567';
if(preg_match('/user(^[0-9]{1,8}+$)/', $string)){
echo 1;
}
?>
I want to check whether the string consists of the word user followed by a number of at most 8 digits.
You're very close actually:
if(preg_match('/^user[0-9]{1,8}$/', $string)){
The anchor for "must match at start of string" should be all the way in front, followed by the "user" literal; then you specify the character set [0-9] and multiplier {1,8}. Finally, you end off with the "must match at end of string" anchor.
A few comments on your original expression:
The ^ matches the start of a string, so writing it anywhere but the beginning of this expression will not yield the expected results.
The + is a quantifier; {1,8} is one too, and only one is needed after an expression. (In PCRE, a + placed after another quantifier makes it possessive rather than repeating it again.)
Unless you intend to use the numbers that you found in the expression, you don't need parentheses.
Btw, instead of [0-9] you could also use \d. It's a shorthand character class that shortens the regular expression, though this particular one doesn't save all too many characters ;-)
By using ^ and $, you are only matching when the pattern is the only thing on the line. Is that what you want? If so, use the following:
preg_match( '/^user[0-9]{1,8}$/' , $string );
If you want to find this pattern anywhere in a line, and make sure that no ninth digit follows, I would try:
preg_match( '/user[0-9]{1,8}(?![0-9])/' , $string );
As always, you should use a reference tool like RegexPal to do your regular expression testing in isolation.
You were close, here is your regex : /^user[0-9]{1,8}$/
try the following regex instead:
/^user([0-9]{1,8})$/
Use this regex:
/^user\d{1,8}$/
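A quick way to check the anchored pattern is to run it over a few inputs. This is only a sketch; the helper name isUserId and the sample strings are made up for illustration.

```php
<?php
// Hypothetical helper wrapping the pattern suggested above.
function isUserId(string $s): bool
{
    return preg_match('/^user\d{1,8}$/', $s) === 1;
}

foreach (['user34567', 'user', 'user123456789', 'xuser1'] as $candidate) {
    echo $candidate, ' => ', var_export(isUserId($candidate), true), "\n";
}
// user34567 => true
// user => false
// user123456789 => false
// xuser1 => false
```

Note that $ also matches just before a trailing newline; if that matters, add the D modifier or use \z instead.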

How to match 2nd instance in regex

get_by_my_column
If I only want to match the get_by portion of the above string, how can I do this? I keep reading on this regex cheatsheet that I should use \n but I can't figure out how to implement it properly...
I've tried variations of the following...
/((_){2})/
/(_+){2}/
/(\w+?_\w+?)_\w+/ (use non greedy quantifiers, your substring should be in capture group 1)
or just /\w+?_\w+?/ <---(edit: won't work, you do need that second underscore as regex structure to force the non greedy \w up to it :])
Do you need to use a regex for this? You could use explode() and just grab the first two elements of the resulting array.
Try
preg_match('/(^[a-z]+[_][a-z]+)/', $string, $results);
This matches a string that starts with a group of letters followed by an underscore followed by another set of letters.
Edit: (lowercase letters)
Try /^get_by/. The ^ adds the condition that g must be the first character of the string.
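The explode() suggestion above can be sketched as follows, next to a regex that does the same job. The regex here uses [^_] rather than \w, which is my own substitution: \w also matches underscores, so [^_] makes the intent explicit without relying on non-greedy quantifiers.

```php
<?php
$string = 'get_by_my_column';

// Split on underscores, keep the first two pieces, and join them again.
$prefix = implode('_', array_slice(explode('_', $string), 0, 2));

// Regex equivalent: two underscore-free runs at the start of the string.
preg_match('/^[^_]+_[^_]+/', $string, $m);

echo $prefix, "\n"; // get_by
echo $m[0], "\n";   // get_by
```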

Getting a random string within a string

I need to find a random string within a string.
My string looks as follows
{theme}pink{/theme} or {theme}red{/theme}
I need to get the text between the tags, the text may differ after each refresh.
My code looks as follows
$str = '{theme}pink{/theme}';
preg_match('/{theme}*{\/theme}/',$str,$matches);
But no luck with this.
* is only the quantifier, you need to specify what the quantifier is for. You've applied it to }, meaning there can be 0 or more '}' characters. You probably want "any character", represented by a dot.
And maybe you want to capture only the part between the {..} tags with (.*)
$str = '{theme}pink{/theme}';
preg_match('/{theme}(.*){\/theme}/',$str,$matches);
var_dump($matches);
'/{theme}(.*?){\/theme}/' or even more restrictive '/{theme}(\w*){\/theme}/' should do the job
preg_match_all('/{theme}(.*?){\/theme}/', $str, $matches);
You should use ungreedy matching here. $matches[1] will contain the contents of all matched tags as an array.
$matches = array();
$str = '{theme}pink{/theme}';
preg_match('/{([^}]+)}([^{]+){\/([^}]+)}/', $str, $matches);
var_dump($matches);
That will dump out all matches of all "tags" you may be looking for. Try it out and look at $matches and you'll see what I mean. I'm assuming you're trying to build your own rudimentary template language so this code snippet may be useful to you. If you are, I may suggest looking at something like Smarty.
In any case, you need parentheses to capture values in regular expressions. There are three captured values above:
([^}]+)
will capture the value of the opening "tag," which is theme. The [^}]+ means "one or more of any character BUT the } character," which stops the match at the first } just as a non-greedy quantifier would.
([^{]+)
Will capture the value between the tags. In this case we want to match all characters BUT the { character.
([^}]+)
Will capture the value of the closing tag.
preg_match('/{theme}([^{]*){\/theme}/',$str,$matches);
[^{] matches any character except the opening brace, so the match cannot run past the next tag the way a greedy .* would. This is important if you have more than one tag per string/line.
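To illustrate the point about several tags on one line, here is a short sketch using the two-tag string from the question. The variable names $greedy and $all are just for the example.

```php
<?php
// The question's two-tag example on a single line.
$str = '{theme}pink{/theme} or {theme}red{/theme}';

// Greedy (.*) spans from the first opening tag to the last closing tag.
preg_match('/{theme}(.*){\/theme}/', $str, $greedy);

// The negated class cannot cross a '{', so each tag is matched separately.
preg_match_all('/{theme}([^{]*){\/theme}/', $str, $all);

echo $greedy[1], "\n"; // pink{/theme} or {theme}red
print_r($all[1]);      // pink, red
```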
