PHP Regex (different result between + and * ) - php

If I put * in my regex, it doesn't work, if I put +, it works.
* should mean 0 or more, + should mean 1 or more.
Case with *
$num = ' 527545855 ';
var_dump( preg_match( '/\d*/', substr( $num, 0, 18 ), $coincidencias ) );
var_dump($coincidencias);exit;
Result:
int(1)
array(1) {
[0]=>
string(0) ""
}
Case with +
$num = ' 527545855 ';
var_dump( preg_match( '/\d+/', substr( $num, 0, 18 ), $coincidencias ) );
var_dump($coincidencias);exit;
Result:
int(1)
array(1) {
[0]=>
string(9) "527545855"
}
I thought both should work, but I don't understand why the * doesn't work.

* means 0 or more occurrences, so the first match is the empty string at the very start of your test string.

The engine attempts to match starting from index 0.
Since \d* can match an empty string, the engine returns the empty string at index 0, where the first character is a space rather than a digit.
In contrast, \d+ must match at least one digit, so the engine moves forward to the first sequence of digits; since the quantifier is greedy, it takes as many digits as possible (in your case, the whole sequence of digits in the input string).
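To make the difference visible, here is a minimal sketch that asks preg_match() for the offset of each match via PREG_OFFSET_CAPTURE. The variable names $star and $plus are only illustrative, and the input is the string from the question with a single leading space.

```php
<?php
$num = ' 527545855 ';

// PREG_OFFSET_CAPTURE turns each match into [matched text, byte offset].
preg_match('/\d*/', $num, $star, PREG_OFFSET_CAPTURE);
preg_match('/\d+/', $num, $plus, PREG_OFFSET_CAPTURE);

var_dump($star[0]); // empty string found at offset 0 (in front of the space)
var_dump($plus[0]); // "527545855" found at offset 1 (after the space)
```

The * version succeeds immediately at offset 0 with nothing matched, which is why preg_match() still returns 1.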

You answered this in your question:
* should mean 0 or more, + should mean 1 or more.
The first thing that it matched was 0 or more digits.

It would match the digits or nothing
\d* means match 0 to many digits
For Example in
hello 123
the regex \d*
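would match 8 times when matched repeatedly (e.g. with preg_match_all): 6 empty matches, one in front of each character of "hello " (including the space), 1 match for 123, and 1 final empty match at the end of the string.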
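To see where each of those matches lands, a small sketch with preg_match_all() and PREG_OFFSET_CAPTURE can list them; the variable names are only illustrative.

```php
<?php
// Collect every match of \d* together with its offset in the subject.
$count = preg_match_all('/\d*/', 'hello 123', $m, PREG_OFFSET_CAPTURE);

foreach ($m[0] as [$text, $offset]) {
    printf("offset %d: '%s'\n", $offset, $text);
}
echo "total: $count\n";
```

This should report empty matches at offsets 0 through 5, then 123 at offset 6, then one more empty match at offset 9 (the end of the string).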

The * in a regex takes the character before it and looks for 0 or more instances.
The + in a regex looks for 1 or more instances.
Since you aren't really checking around the number (caring for other items) you will get more items back from the regex with * since it allows for the 0 instance clause to be met, which is valid for all your spaces before the number (a space is matching 0 instances of a number). The + ensures at least one is found.

Related

How to split a string into two parts then join them in reverse order as a new string?

This is an example:
$str="this is string 1 / 4w";
$str=preg_replace(?); var_dump($str);
I want to capture 1 / 4w in this string and move this portion to the beginning of the string.
Result: 1/4W this is string
Just give me the variable that contains the capture.
The last portion 1 / 4W may be different.
e.g. 1 / 4w can be 1/ 16W , 1 /2W , 1W , or 2w
The character W may be an upper case or a lower case.
Use capture group if you want to capture substring:
$str = "this is string 1 / 4w"; // "1 / 4w" can be 1/ 16W, 1 /2W, 1W, 2w
$str = preg_replace('~^(.*?)(\d+(?:\s*/\s*\d+)?w)~i', "$2 $1", $str);
var_dump($str);
Without seeing more sample inputs, I am assuming there are no digits in the first substring. For this reason, I use a negated character class to capture the first substring, leave out the delimiting space, and then capture the rest of the string as the second substring. This makes my pattern very efficient (6x faster than Toto's, and with no lingering whitespace characters).
Code:
$str="this is string 1 / 4w";
$str=preg_replace('/([^\d]+) (.*)/',"$2 $1",$str);
var_export($str);
Output:
'1 / 4w this is string'
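Neither answer produces the exact result the question asks for (1/4W this is string, with the spaces removed and an upper-case W). A possible sketch of that normalisation follows; the function name moveRatingToFront is hypothetical, and it assumes the rating always sits at the end of the string.

```php
<?php
// Hypothetical helper: assumes the "1 / 4w"-style part is at the end of the string.
function moveRatingToFront(string $str): string
{
    // Group 1: leading text. Group 2: digits, optional "/ digits", then w or W.
    if (!preg_match('~^(.*?)\s*(\d+(?:\s*/\s*\d+)?w)\s*$~i', $str, $m)) {
        return $str; // no rating found: leave the string unchanged
    }
    // Remove inner whitespace and upper-case the W: "1 / 4w" becomes "1/4W".
    $rating = strtoupper(preg_replace('~\s+~', '', $m[2]));
    return $rating . ' ' . $m[1];
}

echo moveRatingToFront('this is string 1 / 4w'), "\n"; // 1/4W this is string
```

The captured part is available as $m[2] inside the function, which also covers the "just give me the variable that contains the capture" request.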

Regex matching to a terminator plus a variable character sequence

Sorry to bother, I feel permanently lost when it comes to regex...
I have to match a string which occurs in a longer sequence of hex-values. My test-string is this:
BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98
Pattern is this:
starts with BF13
followed by an unknown amount of "01", "02" or "03" repetitions (\w\w)
00 marks the termination of the sequence between BF13 and 00
after the 00-terminator, there are always 4 additional chars
I tried BF13(\w\w)+?00(\w\w){1} but it's obviously wrong.
The test-string is supposed to match and output these values:
BF1301020302000017
BF1301030101010300FF6A
BF130201010300FFC0
BF1303010303030100FF98
Thanks, guys!
This one will do the job :
BF13(?:0[123])+00[A-Z0-9]{4}
Explanation
BF13 BF13 literally
(?:...)+ Followed by something (non capturing group) at least one time (+)
0[123] a zero followed by 1, 2 or 3
00 Followed by 00
[A-Z0-9]{4} Followed by uppercase char or a digit 4 times
Sample PHP code:
$re = '/BF13(?:0[123])+00[A-Z0-9]{4}/';
$str = 'BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
foreach ($matches as $val) {
echo "matched: " . $val[0] . "\n";
}
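The same pattern can also be written with the x (extended) modifier so that the explanation lives next to each part. This is only a sketch of that layout; it narrows the trailing class to hex characters ([0-9A-F]) on the assumption that the input is always upper-case hex.

```php
<?php
// With the x modifier, whitespace in the pattern is ignored and # starts a comment.
$re = '/
    BF13            # literal header
    (?:0[123])+     # one or more of 01, 02 or 03
    00              # terminator
    [0-9A-F]{4}     # four trailing hex characters (assumes upper-case hex)
/x';

$str = 'BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98';

preg_match_all($re, $str, $matches);
print_r($matches[0]);
```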
You have a couple of options:
Input:
$in = 'BF1301020302000017BF1301030101010300FF6ABF130201010300FFC0BF1303010303030100FF98';
Method #1 - preg_match_all():
var_export(preg_match_all('/BF13(?:0[123])+0{2}[A-F0-9]{4}/', $in, $out) ? $out[0] : []);
// *my pattern is a couple of steps faster than stej4n's
// and doesn't make the mistake of putting commas in the character class
Method #2 - preg_split():
var_export(preg_split('/0{2}[A-F0-9]{4}\K/', $in, 0, PREG_SPLIT_NO_EMPTY));
// \K resets the match starting point -- preserving all characters when splitting
// I prefer this method because it requires a small pattern and
// it returns an array, as opposed to true/false with a variable declaration
// Another pattern for preg_split() is just slightly slower, but needs fewer parameters:
// preg_split('/0{2}[A-F0-9]{4}\K(?!$)/', $in)
Output (either way):
array (
0 => 'BF1301020302000017',
1 => 'BF1301030101010300FF6A',
2 => 'BF130201010300FFC0',
3 => 'BF1303010303030100FF98',
)

PHP how to get the second number present in a string

I would like to get the second number present in a string.
Maybe you have better ideas than mine.
From this example:
1 PLN = 0.07 Gold
I would like to get only "0.07" from that string.
I obtain it from web scraping so it returns me as a string.
The problem that I have is the following.
The "second number in string" might contain a "." or not, might consist of a single digit, e.g. "1", or of two, e.g. "12", and might have decimals, e.g. "1.2". Even its position may change, because for some currencies I will have "1 PLN" or "1 USD", and for others "1 TW".
So I can't work on position, I can't extract only numbers (I have the "1" at the beginning of the string), I can't extract only INT cause I could have also decimals...
So the only constant of that string - I think (but if you have better ideas pls suggest me) - is that I need the second number I find in the string.
How could I get it?
Sorry if I wasn't clear enough.
Try this:
<?php
$string = '1 PLN = 0.07 Gold';
$pattern = '/\d+\.\d+/';
$matches = array();
$r = preg_match($pattern, $string, $matches);
var_dump($matches); //$matches is array with results
Output:
array(1) {
[0]=>
string(4) "0.07"
}
If it's always in this format (1 PLN = 0.07 Gold), you can just split the string on spaces:
$array = explode(" ", $string);
and then get the required number with
$number = $array[3];
Try it out and let me know if it works.
You can use a simple regex pattern to isolate the number after the equals sign, including decimals if any.
The below works if the string always has the same structure, meaning an =, a space, and then the number that you are after.
= - matches the equals sign
\s - matches the space character immediately after
(\d*\.?\d*) - matches any number of digits followed by an optional period . and then any number of digits
$str = '1 PLN = 0.07 Gold';
preg_match('#=\s(\d*\.?\d*)#',$str,$matches);
print $matches[1];
Will output
0.07
This works regardless of what you have before the = sign.
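The \d+\.\d+ pattern only finds numbers with a decimal point, and the = approach depends on the equals sign being present. A sketch that follows the question's own rule ("the second number I find in the string") could collect every number and take the one at index 1. The function name secondNumber is hypothetical.

```php
<?php
// Hypothetical helper: returns the second number found in $text, or null.
function secondNumber(string $text): ?string
{
    // \d+ matches the integer part; (?:\.\d+)? optionally matches the decimals.
    preg_match_all('/\d+(?:\.\d+)?/', $text, $m);
    return $m[0][1] ?? null; // index 0 is the first number, index 1 the second
}

var_dump(secondNumber('1 PLN = 0.07 Gold')); // string(4) "0.07"
var_dump(secondNumber('1 USD = 12 Gold'));   // string(2) "12"
```

The result is still a string; cast it with (float) if a numeric value is needed.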

Struggling with regular expression, removing number from string

I have a number of strings such as these:
Virtus.pro (13)
mousesports (16)
Natus Vincere (12)
As you can see, there is no really common way of splitting the name from the number in all cases.
I'm really new to regex. Does anyone have any ideas how I could split each of these strings into 2 variables?
Virtus.pro and 13, then mousesports and 16?
As you can see, the Natus Vincere one has a space between the two parts of the name.
Really struggling. I've only been able to come up with a regex for extracting the number, but it doesn't work every time.
I think you're looking for something like this:
$data = [
"Virtus.pro (13)",
"mousesports (16)",
"Natus Vincere (12)"
];
foreach ($data as $string) {
$matches = [];
preg_match('/(.*)\s\((\d+)\)/', $string, $matches);
list(, $team, $score) = $matches;
var_dump($team, $score);
}
Output:
string(10) "Virtus.pro"
string(2) "13"
string(11) "mousesports"
string(2) "16"
string(13) "Natus Vincere"
string(2) "12"
The idea is to look for a substring followed by a space, opening parenthesis, some digits, and a closing parenthesis. The leading substring and the digits are snagged up in capturing groups then spit out into $team and $score.
r'([a-zA-Z. ]+) (\(\d{1,2}\))'
I tried this one in python, it works for me.
You'd better provide more details I think, for example, the format of the names, which kind of punctuation it contains, and the number, how many digits it has, etc.
In my answer above, the name string can contains '.' and ' ', and the number will be 1 or 2 digits.
you can change it to
r'([a-zA-Z. ]+) \((\d+)\)'
to match a number that you don't know how many digits it contains.
By the way, it groups the match results: group 1 is the name and group 2 is the number (group 0 is the whole match).
>>> import re
>>> are=re.compile(r'([a-zA-Z. ]+) \((\d{1,2})\)')
>>> d=are.search('Virtus.pro (13)')
>>> d.group()
'Virtus.pro (13)'
>>> d.group(1)
'Virtus.pro'
>>> d.group(2)
'13'
hope it helps.
Hi, you can use something like this:
#!/usr/bin/env python
import re
regex = re.compile(r'^(.*)\((\d+)\)$')
m = regex.match('Virtus.pro (13)')
You can then do:
m.group(1) # to get 'Virtus.pro ' (note the trailing space)
m.group(2) # to get '13'
This is implemented in Python, by the way.

How do i break string into words at the position of number

I have some string data with alphanumeric values, like us01name, phc01name and others, i.e. letters + number + letters.
I would like to get the first letters + number in the first string and the remainder in the second.
How can I do it in PHP?
You can use a regular expression:
// if statement checks there's at least one match
if (preg_match('/([A-Za-z]+[0-9]+)([A-Za-z]+)/', $string, $matches) > 0) {
$firstbit = $matches[1];
$nextbit = $matches[2];
}
Just to break the regular expression down into parts so you know what each bit does:
( Begin group 1
[A-Za-z]+ One or more letters, upper or lower case (the shorter range A-z would also match [, \, ], ^, _ and `)
[0-9]+ One or more digits
) End group 1
( Begin group 2
[A-Za-z]+ One or more letters, upper or lower case
) End group 2
Try this code:
preg_match('~([^\d]+\d+)(.*)~', "us01name", $m);
var_dump($m[1]); // 1st string + number
var_dump($m[2]); // 2nd string
OUTPUT
string(4) "us01"
string(4) "name"
Even this more restrictive regex will also work for you:
preg_match('~([A-Z]+\d+)([A-Z]+)~i', "us01name", $m);
You could use preg_split on the digits with the delimiter capture flag. It returns all pieces, so you'd have to put them back together. However, in my opinion it is more intuitive and flexible than a complete pattern regex. Plus, preg_split() is underused :)
Code:
$str = 'user01jason';
$pieces = preg_split('/(\d+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($pieces);
Output:
Array
(
[0] => user
[1] => 01
[2] => jason
)
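Putting the pieces back together could look like the sketch below. The function name splitAfterFirstNumber is hypothetical; it joins the first two pieces (letters + digits) and glues everything else into the remainder.

```php
<?php
// Hypothetical helper: returns [letters + first number, remainder], or null.
function splitAfterFirstNumber(string $str): ?array
{
    // With PREG_SPLIT_DELIM_CAPTURE the digit runs are kept as their own pieces.
    $pieces = preg_split('/(\d+)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($pieces === false || count($pieces) < 3) {
        return null; // no digits in the string, so there is nothing to split on
    }
    return [
        $pieces[0] . $pieces[1],             // e.g. "us" . "01"
        implode('', array_slice($pieces, 2)) // everything after the first number
    ];
}

print_r(splitAfterFirstNumber('us01name'));
```

Because the remainder is rebuilt with implode(), any later digits (as in ab12cd34ef) stay in the second string.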
