Not entirely sure this is possible, but hoping to be pleasantly surprised.
I have a regular expression that looks like this:
$pattern = '#post/date/(\d\d\d\d)-(\d\d)-(\d\d)#';
And, I even have a string that matches it:
$string = 'post/date/2012-01-01';
Of course, I don't know the exact pattern and string beforehand, but they will look something like this.
I need to end up with an array that looks like this:
$groups = array('2012', '01', '01');
The array should contain the parts of the string that matched the three regular expression groups that are within parentheses. Is this possible?
Those things between parentheses are subpatterns (capture groups), and you can get the result for each of them when you apply the regex. You can use preg_match_all like this:
$string = 'post/date/2012-01-01';
$pattern = '#post/date/(\d\d\d\d)-(\d\d)-(\d\d)#';
preg_match_all($pattern, $string, $matches);
$groups = array($matches[1][0], $matches[2][0], $matches[3][0]);
echo '<pre>';
print_r($groups);
echo '</pre>';
Sure, this is only an example that shows the behavior; you will first need to check whether the string matched the pattern and, if it did, how many times.
You can see that piece of code working here: http://ideone.com/zVu75
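The check mentioned above could look something like this. It is only a minimal sketch, assuming you care about the first match only; the helper name extractGroups is made up for illustration and is not part of the answer.

```php
<?php
// Hypothetical helper: returns the captured groups of the first match,
// or null when the pattern does not match (or preg_match() fails).
function extractGroups(string $pattern, string $string): ?array
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match($pattern, $string, $matches) !== 1) {
        return null;
    }
    // $matches[0] is the full match; the groups start at index 1.
    return array_slice($matches, 1);
}

$groups = extractGroups('#post/date/(\d\d\d\d)-(\d\d)-(\d\d)#', 'post/date/2012-01-01');
print_r($groups);
```

Since the pattern is not known beforehand, array_slice() is used instead of hard-coded indexes, so it works for any number of groups.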
This is not a regular expression, but it will do what you want to achieve:
<?php
$string = 'post/date/2012-01-01';
$date= basename($string);
$array=explode('-',$date);
print_r($array);
?>
The output array will look like this
Array
(
[0] => 2012
[1] => 01
[2] => 01
)
$str = 'post/date/2012-01-01';
preg_match('#post/date/(\d+)-(\d+)-(\d+)#', $str, $m);
array_shift($m);
$group = $m;
print_r($group);
Output
Array
(
[0] => 2012
[1] => 01
[2] => 01
)
If you're looking for something concise:
$matches = sscanf($string, '%*[^0-9]%d-%d-%d');
makes $matches:
array(3) {
[0]=> int(2012)
[1]=> int(1)
[2]=> int(1)
}
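Note that %d yields integers, so the leading zeros are gone. If you need the zero-padded strings from the question, one option is to pad them again with sprintf(). This is just a sketch, assuming a four-digit year and two-digit month and day:

```php
<?php
$string = 'post/date/2012-01-01';

// %*[^0-9] skips everything up to the first digit without storing it.
list($year, $month, $day) = sscanf($string, '%*[^0-9]%d-%d-%d');

// Re-pad the integers to get the original string form back.
$groups = array(
    sprintf('%04d', $year),
    sprintf('%02d', $month),
    sprintf('%02d', $day),
);
print_r($groups);
```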
Related
I'm writing a PHP function to extract numeric ids from a string like:
$test = '123_123_Foo';
At first I took two different approaches, one with preg_match_all():
$test2 = '123_1256_Foo';
preg_match_all('/[0-9]{1,}/', $test2, $matches);
print_r($matches[0]); // Result: 'Array ( [0] => 123 [1] => 1256 )'
and another with preg_replace() and explode():
$test = preg_replace('/[^0-9_]/', '', $test);
$output = array_filter(explode('_', $test));
print_r($output); // Results: 'Array ( [0] => 123 [1] => 1256 )'
Either of them works well as long as the string does not contain mixed letters and numbers, like:
$test2 = '123_123_234_Foo2';
The evident result is Array ( [0] => 123 [1] => 123 [2] => 234 [3] => 2 )
So I wrote another regex to get rid of mixed strings:
$test2 = preg_replace('/([a-zA-Z]{1,}[0-9]{1,}[a-zA-Z]{1,})|([0-9]{1,}[a-zA-Z]{1,}[0-9]{1,})|([a-zA-Z]{1,}[0-9]{1,})|([0-9]{1,}[a-zA-Z]{1,})|[^0-9_]/', '', $test2);
$output = array_filter(explode('_', $test2));
print_r($output); // Results: 'Array ( [0] => 123 [1] => 1256 )'
The problem is evident too: more complicated patterns like Foo2foo12foo1 would pass the filter. And here's where I got a bit stuck.
Recap:
Extract a variable amount of chunks of numbers from a string.
The string contains at least 1 number, and may contain other numbers
and letters separated by underscores.
Only numbers not preceded or followed by letters must be extracted.
Only the numbers in the first half of the string matter.
Since only the first half is needed I decided to split in the first occurrence of letter or mixed number-letter with preg_split():
$test2 = '123_123_234_1Foo2';
$output = preg_split('/([0-9]{1,}[a-zA-Z]{1,})|[^0-9_]/', $test2, 2);
preg_match_all('/[0-9]{1,}/', $output[0], $matches);
print_r($matches[0]); // Results: 'Array ( [0] => 123 [1] => 123 [2] => 234 )'
The point of my question is whether there is a simpler, safer or more efficient way to achieve this result.
If I understand your question correctly, you want to split an underscore-delimited string, and filter out any substrings that are not numeric. If so, this can be achieved without regex, with explode(), array_filter() and ctype_digit(); e.g:
<?php
$str = '123_123_234_1Foo2';
$digits = array_filter(explode('_', $str), function ($substr) {
return ctype_digit($substr);
});
print_r($digits);
This yields:
Array
(
[0] => 123
[1] => 123
[2] => 234
)
Note that ctype_digit():
Checks if all of the characters in the provided string are numerical.
So $digits is still an array of strings, albeit numeric.
Hope this helps :)
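If you need real integers and a gap-free index, something along these lines should do. It is a sketch only; the input string is made up to show that array_filter() keeps the original keys:

```php
<?php
$str = '123_Foo_234_1Foo2';

// array_filter() preserves keys, so removing 'Foo' leaves keys 0 and 2.
$digits = array_filter(explode('_', $str), 'ctype_digit');

// array_values() reindexes from 0, intval() converts each string to int.
$ids = array_map('intval', array_values($digits));
var_dump($ids);
```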
Getting just the numeric part of the string after the explode
$test2 = "123_123_234_1Foo2";
$digits = array_filter(explode('_', $test2 ), 'is_numeric');
var_dump($digits);
Result
array(3) { [0]=> string(3) "123" [1]=> string(3) "123" [2]=> string(3) "234" }
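Be aware that is_numeric() is more permissive than ctype_digit(): it also accepts signs, decimals and exponents. A small comparison, using a made-up input string, to show the difference:

```php
<?php
$chunks = explode('_', '123_-5_1.5_1e3_Foo');

// is_numeric() accepts any numeric string, not just plain digits.
$numeric = array_values(array_filter($chunks, 'is_numeric'));

// ctype_digit() accepts only strings made up entirely of 0-9.
$digitsOnly = array_values(array_filter($chunks, 'ctype_digit'));

var_dump($numeric, $digitsOnly);
```

So if the ids must be plain digit runs, ctype_digit() is probably the safer filter.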
Use strtok
Regex isn't a magic bullet, and there are FAR simpler fixes for your problem, especially considering you're trying to split on a delimiter.
Any of the following approaches would be cleaner, and more maintainable, and the strtok() approach would probably perform better:
Use explode to create and loop through an array, checking each value.
Use preg_split to do the same, but with a more adaptable approach.
Use strtok, as it is designed exactly for this use-case.
Basic example for your case:
function strGetInts(string $str, string $delim) {
    $word = strtok($str, $delim);
    while (false !== $word) {
        // strtok() returns strings, so test for an all-digit string
        // (is_integer() would always be false here).
        if (ctype_digit($word)) {
            yield (int) $word;
        }
        $word = strtok($delim);
    }
}
$test2 = '123_1256_Foo';
foreach (strGetInts($test2, '_-') as $key) {
    print_r($key);
}
Note: the second argument to strtok is a string containing ANY delimiter to split the string on. Thus, my example will group results into strings separated by underscores or dashes.
Additional Note: If and only if the string only needs to be split on a single delimiter (underscore only), a method using explode will likely result in better performance. For such a solution, see the other answer in this thread: https://stackoverflow.com/a/46937452/1589379 .
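For comparison, the preg_split option from the list above might look like this. It is a rough sketch rather than a benchmarked alternative; the function name splitInts and the sample input are invented for the example:

```php
<?php
// Hypothetical helper (not from the answer above): split on any of the
// given delimiter characters and keep only the all-digit chunks.
function splitInts(string $str, string $delims = '_-'): array
{
    // preg_quote() escapes characters such as "-" so they are safe in [...].
    $pattern = '/[' . preg_quote($delims, '/') . ']+/';
    $chunks = preg_split($pattern, $str, -1, PREG_SPLIT_NO_EMPTY);

    // Drop mixed chunks like "Foo", reindex, and convert to integers.
    return array_map('intval', array_values(array_filter($chunks, 'ctype_digit')));
}

print_r(splitInts('123_1256-Foo_7'));
```

Unlike the generator, this builds the whole array at once, which is fine for short strings.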
I would like to remove substrings from a string that have delimiters.
Example:
$string = "Hi, I want to buy an [apple] and a [banana].";
How do I get "apple" and "banana" out of this string and in an array? And the other parts of the string "Hi, I want to buy an" and "and a" in another array.
I apologize if this question has already been answered. I searched this site and couldn't find anything that would help me. Every situation was just a little different.
You could use preg_split() thus:
<?php
$pattern = '/[\[\]]/'; // Split on either [ or ]
$string = "Hi, I want to buy an [apple] and a [banana].";
echo print_r(preg_split($pattern, $string), true);
which outputs:
Array
(
[0] => Hi, I want to buy an
[1] => apple
[2] => and a
[3] => banana
[4] => .
)
You can trim the whitespace if you like and/or ignore the final fullstop.
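If you want the two separate arrays the question asks for, one possible way is to match the bracketed parts and split on them separately. This is a sketch under the assumption that brackets are never nested:

```php
<?php
$string = "Hi, I want to buy an [apple] and a [banana].";

// Text inside the brackets: capture group 1 of every [..] pair.
preg_match_all('/\[([^\]]*)\]/', $string, $m);
$inside = $m[1];

// Text outside the brackets: split on each whole [..] pair, then trim.
$outside = array_map('trim', preg_split('/\[[^\]]*\]/', $string));

print_r($inside);
print_r($outside);
```

The trailing full stop still ends up as the last element of $outside, so drop it if you do not need it.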
preg_match_all('/(?<=\[)[a-z]+(?=\])/', $string, $matches);
Should do what you want. $matches[0] will be an array with each match.
I assume you want words as values in the array:
$words = explode(' ', $string);
$result = preg_grep('/\[[^\]]+\]/', $words);
$others = array_diff($words, $result);
Create an array of words using explode() on a space
Use a regex to find [somethings] using preg_grep()
Find the difference of all words and [somethings] using array_diff(), which will be the "other" parts of the string
I'm terrible at regex and find it hard to understand, so I need some help. I have a variable which looks something like this:
["...=", "...=", "...="]
Those are 3 values which I want to split into an array. The way I see it, I want to split it at the comma which comes after a quote ", ". Can someone please help me with the regex for preg_split?
You could try the below code to split the input string according to ", "
<?php
$yourstring = '["...=", "...=", "...="]';
$regex = '~", "~';
$splits = preg_split($regex, $yourstring);
print_r($splits);
?>
Output:
Array
(
[0] => ["...=
[1] => ...=
[2] => ...="]
)
If you don't want "[,]" in the output then you could try the below code.
<?php
$data = '["...=", "...=", "...="]';
$regex = '~(?<=\["|", ")[^"]*~';
preg_match_all($regex, $data, $matches);
print_r($matches);
?>
Output:
Array
(
[0] => Array
(
[0] => ...=
[1] => ...=
[2] => ...=
)
)
$string = '["...=", "...=", "...="]';
$parts = preg_split('/,\s/', $string);
var_dump($parts);
Program output:
array(3) {
[0]=>
string(34) ""...=""
[1]=>
string(36) ""...=""
[2]=>
string(37) ""...=""
}
So long as the double-quote symbol cannot occur within the double-quotes that contain the content, this pattern should validate and capture the three values:
^\["([^"]+)", "([^"]+)", "([^"]+)"\]$
If double-quotes can appear within the content, or the number of values is variable, then this pattern will not work.
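For the variable-length case, and only if the variable really is valid JSON (an assumption, the question does not say so), json_decode() may be simpler than any regex. The values below are placeholders, since the real ones were elided in the question:

```php
<?php
// Placeholder content; the actual values are not known from the question.
$data = '["abc=", "def=", "ghi="]';

// Decode as a PHP array; json_decode() returns null on invalid input.
$values = json_decode($data, true);
if (!is_array($values)) {
    $values = array(); // not valid JSON, fall back to an empty list
}
print_r($values);
```

This also copes with escaped double quotes inside the values, which the pattern above cannot.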
I am trying to explode / preg_split a string so that I get an array of all the values that are enclosed in ( ). I've tried the following code but I always get an empty array. I have tried many things but I can't seem to get it right.
Could anyone spot what I am missing to get my desired output?
$pattern = "/^\(.*\)$/";
$string = "(y3,x3),(r4,t4)";
$output = preg_split($pattern, $string);
print_r($output);
Current output Array ( [0] => [1] => )
Desired output Array ( [0] => "(y3,x3)," [1] => "(r4,t4)" )
With preg_split() your regex should be matching the delimiters within the string to split the string into an array. Your regex is currently matching the values, and for that, you can use preg_match_all(), like so:
$pattern = "/\(.*?\)/";
$string = "(y3,x3),(r4,t4)";
preg_match_all($pattern, $string, $output);
print_r($output[0]);
This outputs:
Array
(
[0] => (y3,x3)
[1] => (r4,t4)
)
If you want to use preg_split(), you would want to match the , between ),(, but without consuming the parentheses, like so:
$pattern = "/(?<=\)),(?=\()/";
$string = "(y3,x3),(r4,t4)";
$output = preg_split($pattern, $string);
print_r($output);
This uses a positive lookbehind and a positive lookahead to find the , between the two parenthesis groups, and splits on it. It also outputs the same as the above.
You can use a simple regex like \B,\B to split the string and improve the performance by avoiding lookahead or lookbehind regex.
\B is a non-word boundary so it will match only the , between ) and (
Here is a working example:
http://regex101.com/r/cV7bO7/1
$pattern = "/\B,\B/";
$string = "(y3,x3),(r4,t4),(r5,t5)";
$result = preg_split($pattern, $string);
$result will contain:
Array
(
[0] => (y3,x3)
[1] => (r4,t4)
[2] => (r5,t5)
)
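One caveat worth knowing: \B only looks at word versus non-word characters, so it can also match a comma inside a group when that comma sits between two non-word characters. A small comparison with the lookaround pattern, using a made-up input to show the difference:

```php
<?php
// Made-up input: the first group contains a comma between two dashes.
$string = "(a-,-b),(c,d)";

// \B,\B also splits inside the first group, because "-" is a non-word char.
$byBoundary = preg_split('/\B,\B/', $string);

// The lookaround version only splits on a comma between ")" and "(".
$byLookaround = preg_split('/(?<=\)),(?=\()/', $string);

print_r($byBoundary);
print_r($byLookaround);
```

For values made only of letters and digits, as in the question, both patterns give the same result.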
Suppose I have the following:
$string = "(a) (b) (c)";
How would I explode it to get the contents inside the parentheses? If the string's contents were separated by just one symbol instead of two, I would have used:
$string = "a-b-c";
explode("-", $string);
But how to do this when 2 delimiters are used to encapsulate the items to be exploded?
You have to use preg_split or preg_match instead.
Example:
$string = "(a) (b) (c)";
print_r(preg_split('/\\) \\(|\\(|\\)/', $string, -1, PREG_SPLIT_NO_EMPTY));
Array
(
[0] => a
[1] => b
[2] => c
)
Notice the order is important.
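To see why the order matters, compare what happens when the longer alternative is listed last. This is just an illustrative sketch; the variable names are arbitrary:

```php
<?php
$string = "(a) (b) (c)";

// Longest alternative first: ") (" is consumed as one delimiter.
$good = preg_split('/\) \(|\(|\)/', $string, -1, PREG_SPLIT_NO_EMPTY);

// Single parentheses first: ")" matches alone, so the space between
// the groups survives as its own element.
$bad = preg_split('/\(|\)|\) \(/', $string, -1, PREG_SPLIT_NO_EMPTY);

var_dump($good, $bad);
```

The regex engine tries alternatives left to right and takes the first that matches, so the ") (" branch is never reached in the second pattern.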
If there are no nested parentheses, you can use a regular expression.
$string = "(a) (b) (c)";
$res = 0;
preg_match_all("/\\(([^)]*)\\)/", $string, $res);
var_dump($res[1]);
Result:
array(3) {
[0]=>
string(1) "a"
[1]=>
string(1) "b"
[2]=>
string(1) "c"
}
See http://www.ideone.com/70ZlQ
If you know for a fact that the strings will always be of the form (a) (b) (c), with precisely one space between each pair of parentheses and with no characters at the beginning or end, you can avoid having to use regexp functions:
$myarray = explode(') (', substr($mystring, 1, -1));
Try the below code:
<?php
$s="Welcome to (London) hello ";
$data = explode('(' , $s );
$d=explode(')',$data[1]);
print_r($data);
print_r($d);
?>
Output:
Array ( [0] => Welcome to [1] => London) hello )
Array ( [0] => London [1] => hello )
Perhaps use preg_split with an alternation pattern:
http://www.php.net/manual/en/function.preg-split.php
If your delimiters are consistent like that, then you can do this
$string = "(a) (b) (c)";
$arr = explode(") (", $string);
// Then simply trim remaining parentheses off.
$arr[0] = trim($arr[0], "()");
$last = count($arr) - 1;
$arr[$last] = trim($arr[$last], "()");
You can try preg_split() function.