Regular expressions for URL parser - PHP

<?php
$string = 'user34567';
if(preg_match('/user(^[0-9]{1,8}+$)/', $string)){
echo 1;
}
?>
I want to check whether the string consists of the word user followed by a number that is at most 8 digits long.

You're very close actually:
if(preg_match('/^user[0-9]{1,8}$/', $string)){
The anchor for "must match at start of string" should be all the way in front, followed by the literal "user"; then you specify the character class [0-9] and the quantifier {1,8}. Finally, you end with the "must match at end of string" anchor.
A few comments on your original expression:
The ^ matches the start of the string, so writing it anywhere but the beginning of this expression will not yield the expected result.
The + is a quantifier, and so is {1,8}; you only need one of them. (In PCRE, a + placed directly after another quantifier makes it possessive, which is not needed here.)
Unless you intend to use the digits that were matched, you don't need the parentheses.
Btw, instead of [0-9] you could also use \d. It's a shorthand character class that shortens the regular expression, though in this particular case it doesn't save many characters ;-)
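To try the corrected pattern against a few inputs, here is a minimal sketch. The helper name isUserId and the sample strings are made up for illustration; only the pattern comes from the answer above.

```php
<?php
// Hypothetical helper: true when $s is "user" followed by 1 to 8 digits
// and nothing else.
function isUserId(string $s): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/^user[0-9]{1,8}$/', $s) === 1;
}

$candidates = ['user34567', 'user', 'user123456789', 'xuser1', 'user12a'];
foreach ($candidates as $candidate) {
    echo $candidate, ' => ', isUserId($candidate) ? 'match' : 'no match', "\n";
}
```

Only the first candidate should be reported as a match: the others have no digits, too many digits, a leading extra character, or a trailing letter.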

By using ^ and $, you only match when the pattern is the only thing in the string. Is that what you want? If so, use the following:
preg_match( '/^user[0-9]{1,8}$/' , $string );
If you want to find this pattern anywhere in a line, I would try:
preg_match( '/user[0-9]{1,8}(?![0-9])/' , $string );
The negative lookahead (?![0-9]) rejects a ninth digit without requiring another character after the number.
As always, you should use a reference tool like RegexPal to do your regular expression testing in isolation.
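The difference between matching the whole string and matching anywhere in it can be sketched as follows. The variable names and sample inputs are mine, and the negative lookahead (?![0-9]) in the unanchored pattern is one possible way to refuse a ninth digit; treat it as an assumption rather than the only option.

```php
<?php
// Anchored: the whole string must be "user" plus 1 to 8 digits.
$anchored = '/^user[0-9]{1,8}$/';
// Unanchored: the pattern may appear anywhere; the lookahead makes sure
// the digit run is not followed by yet another digit.
$unanchored = '/user[0-9]{1,8}(?![0-9])/';

$inputs = ['user34567', 'login by user34567 at noon', 'user123456789'];
$results = [];
foreach ($inputs as $input) {
    $results[$input] = [
        'anchored'   => preg_match($anchored, $input),
        'unanchored' => preg_match($unanchored, $input),
    ];
    printf(
        "%-28s anchored=%d unanchored=%d\n",
        $input,
        $results[$input]['anchored'],
        $results[$input]['unanchored']
    );
}
```

The second input only matches the unanchored pattern, and the nine-digit input matches neither.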

You were close. Here is your regex: /^user[0-9]{1,8}$/

Try the following regex instead:
/^user([0-9]{1,8})$/

Use this regex:
/^user\d{1,8}$/

Related

Looking to use preg_replace to remove characters from my strings

I have the right function, but I can't find the right regex pattern to remove (ID:999999) from the string. The ID value varies but is always numeric. I would like to remove everything, including the parentheses.
$string = "This is the value I would like removed. (ID:17937)";
$string = preg_replace('#(ID:['0-9']?)#si', "", $string);
Regex is not my forte, and I need help with this one.

Try this:
$string = preg_replace('# \(ID:[0-9]+\)#si', "", $string);
You need to escape the parentheses using backslashes (\).
You shouldn't use quotes around the number range.
You should use + (one or more) instead of ? (zero or one).
You can add a space at the start, to avoid having a space at the end of the resulting string.
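Putting those four points together, a small runnable sketch could look like this. The second sample string is invented to show an ID in the middle of a sentence; the s and i modifiers are left out here because the pattern has no dot and the ID prefix is assumed to be uppercase.

```php
<?php
// Pattern: a space, a literal "(ID:", one or more digits, a literal ")".
$pattern = '# \(ID:[0-9]+\)#';

$string = "This is the value I would like removed. (ID:17937)";
$clean = preg_replace($pattern, '', $string);
var_export($clean);
echo "\n";

// Hypothetical second input with the ID in the middle of the text.
$middle = preg_replace($pattern, '', 'Item (ID:42) shipped');
var_export($middle);
echo "\n";
```

Because the leading space is part of the match, neither result is left with a stray space.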

In PHP the pattern is usually delimited with /, although # is also a valid delimiter. Parentheses create a capture group, so you must escape them to match them literally.
A capture group is only needed if you want to refer to the matched text in the replacement. Here you replace the match with an empty string, so /\(ID:[0-9]+\)/ is enough.

Here are two options:
Code:
$string = "This is the value I would like removed. (ID:17937)";
var_export(preg_replace('/ \(ID:\d+\)/',"",$string));
echo "\n\n";
var_export(strstr($string,' (ID:',true));
Output: (I used var_export() to show that the technique is "clean" and leaves no trailing whitespace)
'This is the value I would like removed.'
'This is the value I would like removed.'
Some points:
Regex is a better / more flexible solution if your ID substring can exist anywhere in the string.
Your regex pattern doesn't need a character class if you use the shorthand range character \d.
Regex generally speaking should only be used when standard string function will not suffice or when it is proven to be more efficient for a specific case.
If your ID substring always occurs at the end of the string, strstr() is an elegant/perfect function.
Both of my methods include the space before (ID: in what they remove, which keeps the output clean.
You don't need either s or i modifiers on your pattern, because s only matters if you use a . (dot) and your ID is probably always uppercase so you don't need a case-insensitive search.

Regex to detect the colon and sides of it?

First see my string please:
$a = "[ child : parent ]";
How can I detect that the pattern is:
[(optional space)word or character(optional space) : (optional space)word or character(optional space)]

You can catch this as follows in PHP:
Your regular expression is /\[ *\w+ *: *\w+ *]/
You would write code that would look like this to see if it matched.
if (preg_match('/regex/', $string)) {
// do things
}
Explanation of the Regular Expression
There is a backslash (\) before the open bracket because [ has special meaning in regular expressions. The backslash prevents its special meaning from being used.
The asterisk (*) matches 0 or more of the previous character expression. In this case, it matches 0 or more spaces. If you instead used the expression \s*, it would match 0 or more white-space characters (space, tab, line break). Finally, if you wanted it to match 0 or 1 of the previous character, you would use ? instead of *.
The plus (+) matches 1 or more of the previous character expression. The \w character expression matches a letter, digit, or underscore. If you don't want underscores to match, you should instead use a character class. For example, you could use [A-Za-z0-9].
You can find more information on regular expressions at http://www.regular-expressions.info and http://www.regular-expressions.info/php.html
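Here is a short sketch that plugs the expression above into preg_match and runs it over a few inputs. The function name looksLikePair and the sample strings are hypothetical.

```php
<?php
// Hypothetical helper wrapping the expression explained above.
function looksLikePair(string $s): bool
{
    // "[", optional spaces, a word, optional spaces, ":",
    // optional spaces, a word, optional spaces, "]".
    return preg_match('/\[ *\w+ *: *\w+ *]/', $s) === 1;
}

$samples = ['[ child : parent ]', '[child:parent]', '[ child parent ]', '[ : parent ]'];
foreach ($samples as $sample) {
    echo $sample, ' => ', looksLikePair($sample) ? 'match' : 'no match', "\n";
}
```

The first two samples should match; the third has no colon and the fourth has no word before the colon.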

From your sample text I'd say you mean a human word and not \w regex word
preg_match('/\[ ?([a-z]+) ?: ?([a-z]+) ?\]/i', $a, $matches);
Explained demo: http://regex101.com/r/hB2oV9
$matches will save both values, test with var_dump($matches);
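As a quick illustration of what ends up in $matches, here is a minimal sketch using the string from the question. The variable $matched is only added here to hold the return value.

```php
<?php
$a = "[ child : parent ]";

// Two capture groups: the word left of the colon and the word right of it.
$matched = preg_match('/\[ ?([a-z]+) ?: ?([a-z]+) ?\]/i', $a, $matches);

if ($matched === 1) {
    // $matches[0] is the whole match, [1] and [2] are the captured words.
    echo "left: {$matches[1]}, right: {$matches[2]}\n";
}
```

Note that this pattern allows at most one space in each position, so strings with wider spacing would need \s* or a space followed by * instead of a space followed by ?.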

I'm not sure on the php-specific version of regex, but this should work:
\[ ?\w+ ? : ?\w+ ?\]
Here is a site that I've used in the past to find regular expressions for my needed patterns.

Use this regex: \[\s*\w+\s*:\s*\w+\s*\]

I would probably do it like this
preg_match('/^\[\s?\w+\s+:\s+\w+\s?\]$/', $string)

Regex to match a specific expression format

I'm trying to find a regex that will match a specific expression in the following format:
name = value
However, I need it to not match:
name.extra = value
I have the following regex:
([\w\#\-]+) *(\=|\>|\>\=|\<|\<\=) *([^\s\']+)
which matches the first expression, but also matches the second expression (extra = value).
I need a regex that will match only the first expression and not the second (i.e. with a dot).

Just add a ^ at the beginning and a $ at the end of your expression:
^([\w\#\-]+) *(\=|\>|\>\=|\<|\<\=) *([^\s\']+)$

Negative lookbehind assertion (?<!) might be what you are looking for.
For a simple assignment: (?<!\.)\b(\w+)\s*=\s*(\w+)
summary:
(?<!\.) = require that the character immediately before this position is not a .
\b = beginning of a word
The captured words are:
\1 = destination name
\2 = source name
and using the regex you specified, this should give something near this:
(?<!\.)\b([\w\#\-]+) *(\=|\>|\>\=|\<|\<\=) *([^\s\']+)
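Since the question does not name a language, the following sketch assumes PHP and uses the simple-assignment version of the lookbehind pattern. The variable names are illustrative only.

```php
<?php
// Simple assignment pattern with a negative lookbehind for a dot.
$pattern = '/(?<!\.)\b(\w+)\s*=\s*(\w+)/';

$plain = preg_match($pattern, 'name = value', $m);
$dotted = preg_match($pattern, 'name.extra = value');

if ($plain === 1) {
    echo "matched: {$m[1]} gets {$m[2]}\n";
}
echo 'dotted expression ', $dotted === 1 ? 'matched' : 'did not match', "\n";
```

In the dotted input, "extra" is the only word followed by =, and it is rejected because a dot sits directly before it.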

You don't say what language you're using, but it sounds like you don't need to use regexes at all.
If you're using PHP, then use the explode function to break apart on the =. Then check to see if the argument name has a period in it.
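A rough sketch of that regex-free approach is below. The function parseAssignment is hypothetical, and it only handles the plain = operator; the comparison operators from the question (>, >=, <, <=) would need extra handling.

```php
<?php
// Hypothetical helper: split "name = value" on the first "=" and reject
// names that contain a dot. Returns null when the input does not qualify.
function parseAssignment(string $expr): ?array
{
    $parts = explode('=', $expr, 2);
    if (count($parts) !== 2) {
        return null; // no "=" in the input
    }

    [$name, $value] = array_map('trim', $parts);
    if ($name === '' || strpos($name, '.') !== false) {
        return null; // empty name or dotted name such as "name.extra"
    }

    return ['name' => $name, 'value' => $value];
}

var_export(parseAssignment('name = value'));
echo "\n";
var_export(parseAssignment('name.extra = value'));
echo "\n";
```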

preg_match doesn't capture the content

what is wrong with my preg_match ?
preg_match('numVar("XYZ-(.*)");',$var,$results);
I want to get all the CONTENT from here:
numVar("XYZ-CONTENT");
Thank you for any help!

I assume this is PHP? If so there are three problems with your code.
PHP's PCRE functions require that regular expressions be formatted with a delimiter. The usual delimiter is /, but you can use any matching pair you want.
You did not escape your parentheses in your regular expression, so you're not matching a ( character but creating a RE group.
You should use non-greedy matching in your RE. Otherwise a string like numVar("XYZ-CONTENT1");numVar("XYZ-CONTENT2"); will match both, and your "content" group will be CONTENT1");numVar("XYZ-CONTENT2.
Try this:
$var = 'numVar("XYZ-CONTENT");';
preg_match('/numVar\("XYZ-(.*?)"\);/',$var,$results);
var_dump($results);
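To see the greedy versus non-greedy difference from the third point, here is a small sketch that runs both variants on the two-call string mentioned above. The variable names $greedy, $lazy and $all are mine.

```php
<?php
$var = 'numVar("XYZ-CONTENT1");numVar("XYZ-CONTENT2");';

// Greedy: (.*) grabs as much as possible, then backtracks to the LAST ");
preg_match('/numVar\("XYZ-(.*)"\);/', $var, $greedy);

// Non-greedy: (.*?) stops at the FIRST ");
preg_match('/numVar\("XYZ-(.*?)"\);/', $var, $lazy);

// preg_match_all with the non-greedy pattern collects every call.
preg_match_all('/numVar\("XYZ-(.*?)"\);/', $var, $all);

var_dump($greedy[1], $lazy[1], $all[1]);
```

The greedy group spans both calls, while the non-greedy group holds only CONTENT1, and preg_match_all returns CONTENT1 and CONTENT2 separately.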

Paste your example string into http://txt2re.com and look at the PHP result.
It will show that you need to escape characters that have special meaning to the regex engine (such as the parentheses).

You should escape some chars and add delimiters:
preg_match('/numVar\("XYZ-(.*)"\);/',$var,$results);

preg_match("/XYZ\-(.+)\b/", $string, $result);
print_r($result[0]); // full matches ie XYZ-CONTENT
print_r($result[1]); // match for the first parenthesized group (.+)

matching the regular expression with the whole string

I'm kind of stumped: I need to match a whole string against a regular expression, rather than finding whether the pattern exists somewhere in the string.
Suppose I have the regular expression
/\\^^\\w+\\$^/
What I want is for the code to run through various strings, compare each one with the regular expression, and perform some task if the string starts and ends with a ^.
Examples
^hello world^ is a match
my ^hello world^ should not be a match
The PHP function preg_match matches both of these.
Any clues?

Anchor the ends.
/^...$/

Here is a way to do the job:
$strs = array('^hello world^', 'my ^hello world^');
foreach($strs as $str) {
echo $str, preg_match('/^\^.*\^$/', $str) ? "\tmatch\n" : "\tdoesn't match\n";
}
Output:
^hello world^ match
my ^hello world^ doesn't match

Actually, ^\^\w+\^$ will not match "^hello world^" because you have two words there; the regex is only looking for a single word enclosed by "^"s.
What you are looking for is: ^\^.*\^$
This will match "^^", "^hello world^", "^a very long string of characters^", etc. while not matching "hello ^world^".

You can use the regex:
^\^[\w\s]+\^$
^ is a regex meta-character which is used as start anchor. To match a literal ^ you need to escape it as \^.
So we have:
^ : Start anchor
\^: A literal ^
[\w\s]+ : space separated words.
\^: A literal ^
$ : End anchor.

Another pattern is ^\^[^\^]*\^$ if you want to match "^hello world^" and not "hello ^world^", or \^[^\^]*\^ if you want to match "^hello world^" and also the ^world^ inside "hello ^world^".
For Will: ^\^.*\^$ also matches "^hello^wo^rld^", which I think isn't correct.
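The patterns suggested in these answers behave differently on edge cases. The sketch below runs three of them over the same inputs; the array keys and the extra sample strings are mine.

```php
<?php
$patterns = [
    'dot-star'   => '/^\^.*\^$/',      // anything between the carets
    'word-space' => '/^\^[\w\s]+\^$/', // only word characters and whitespace
    'no-caret'   => '/^\^[^\^]*\^$/',  // anything except another caret
];

$inputs = ['^hello world^', 'my ^hello world^', '^hello^wo^rld^', '^^'];

$results = [];
foreach ($inputs as $input) {
    foreach ($patterns as $name => $pattern) {
        $results[$input][$name] = preg_match($pattern, $input);
    }
    printf(
        "%-18s dot-star=%d word-space=%d no-caret=%d\n",
        $input,
        $results[$input]['dot-star'],
        $results[$input]['word-space'],
        $results[$input]['no-caret']
    );
}
```

All three accept "^hello world^" and reject "my ^hello world^"; they disagree on inner carets and on the empty "^^" case, so the choice depends on which of those should count as valid.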

Try
/^\^\s*(\w+\s*)+\^$/
