Split a string before the first number - php

I would like to split a string like this:
Mystreetway 123, 1.th
So that I can have the following output array:
0 => Mystreetway
1 => 123, 1.th
The code must split the string before the first integer found. The substring from the first integer to the end of the string should become the second element.
I have tried the following solution, which I found:
$key = "Mystreetway 123, 1.th";
$pattern = "/(\d+)/";
$array = preg_split($pattern, $key, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
print_r($array);
But it returns the following:
[0] => Mystreetway [1] => 123 [2] => , [3] => 1 [4] => .th

Split with a lookahead pattern instead (so you won't have to juggle captured delimiters), then limit the number of elements to just two (with the third parameter of preg_split):
$key= "Mystreetway 123, 1.th";
$pattern = '/(?=\d)/';
$array = preg_split($pattern, $key, 2);
print_r($array);

Use a lookahead assertion to match a zero-width (empty) expression before a digit:
/(?=\d)/
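A minimal sketch of that lookahead inside a full preg_split call. The plain lookahead leaves the space before the number attached to the first element; the \s* variant is my own addition (an assumption about the desired output, not part of the answers above) and lets the delimiter consume that whitespace:

```php
<?php
$key = "Mystreetway 123, 1.th";

// Zero-width split before the first digit; the limit of 2 stops after one split.
// The space before "123" stays at the end of the first element.
$plain = preg_split('/(?=\d)/', $key, 2);

// Variant (assumption): consume any whitespace directly before the first digit.
$trimmed = preg_split('/\s*(?=\d)/', $key, 2);

var_export($plain);
echo PHP_EOL;
var_export($trimmed);
echo PHP_EOL;
```

If the trailing space does not matter, the plain lookahead is enough.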

The parameter you're looking for is limit, the third parameter, which caps the number of elements returned (-1 means no limit). My first attempt was to set it to 1:
$key= "Mystreetway 123, 1.th";
$pattern = "/(\d+)/";
$array = preg_split($pattern, $key, 1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
print_r($array);
Edit: For your case, this doesn't actually work, because a limit of 1 returns the whole string as a single element. If we change your regular expression slightly instead, it works as desired:
$key= "Mystreetway 123, 1.th";
$pattern = "/(\d.*)/"; // Note we're now looking for a digit followed by anything
$array = preg_split($pattern, $key, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
print_r($array);
Output:
Array
(
[0] => Mystreetway
[1] => 123, 1.th
)

Using preg_match
preg_match('/(\D*)(\d*.*)/', $input_line, $output_array);
Output
array(3)
0 => Mystreetway 123, 1.th
1 => Mystreetway
2 => 123, 1.th
)
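The preg_match approach can be wrapped in a small helper. This is only a sketch: the function name splitBeforeFirstDigit is hypothetical, and the pattern differs slightly from the one above (lazy \D*? plus \s*) so that the first element does not keep the trailing space:

```php
<?php
// Hypothetical helper, not part of the original answer.
// Returns [text before the first digit, rest starting at that digit],
// or the whole input as a single element when it contains no digit.
function splitBeforeFirstDigit(string $input): array
{
    // \D*? grows lazily until \s* and the first digit can match.
    if (preg_match('/^(\D*?)\s*(\d.*)$/s', $input, $m) === 1) {
        return [$m[1], $m[2]];
    }
    return [$input];
}

var_export(splitBeforeFirstDigit("Mystreetway 123, 1.th"));
echo PHP_EOL;
```

Unlike preg_split, this makes the "no digit found" case explicit in the return value.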

For the cleanest output, with no lookaround, match all leading non-digit characters, then forget them with \K, then consume the unneeded space character before the first occurring digit. Limit the explosions to only produce 2 elements.
Code:
$string = "Mystreetway 123, 1.th";
var_export (
preg_split('/\D+\K /', $string, 2)
);
Output:
array (
0 => 'Mystreetway',
1 => '123, 1.th',
)

Related

How can I split a string into an array of substrings with regular expressions in PHP?

I am trying to break a string of binary ones and zeros into groups of four. After reading the manual and the posts I am missing something:
$subject = "101010101";
$pattern = "/.{1,4}/";
$blocks = preg_split ($pattern, $subject);
print_r($blocks);
The result is an array of empty strings:
Array
(
[0] =>
[1] =>
[2] =>
[3] =>
)
php >
You could just use str_split() which will split your string into an array of strings of size n.
$subject = "101010101";
$split = str_split($subject, 4);
print_r($split);
Output:
Array
(
[0] => 1010
[1] => 1010
[2] => 1
)
You get that result because you are matching 1 - 4 characters to split on. It will match all the characters in the string leaving nothing to display.
If you want to use a regex to break it up into groups of 4 (and the last one because there are 9 characters) you could use preg_match_all and match only 0 or 1 using a character class instead of using a dot which will match any character except a newline.
[01]{1,4}
$subject = "101010101";
$pattern = "/[01]{1,4}/";
$blocks = preg_match_all ($pattern, $subject, $matches);
print_r($matches[0]);
Result
Array
(
[0] => 1010
[1] => 1010
[2] => 1
)
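To make the difference concrete, here is a short sketch comparing the calls discussed in this thread on the same input. The variable names are mine; the functions and flags are the standard ones named above:

```php
<?php
$subject = "101010101";

// Every character is consumed as a delimiter, so only the empty
// strings around the three matches remain.
$onlyEmpty = preg_split('/.{1,4}/', $subject);

// With a capturing group and PREG_SPLIT_DELIM_CAPTURE the delimiters are
// returned as well; PREG_SPLIT_NO_EMPTY drops the empty pieces between them.
$captured = preg_split('/(.{1,4})/', $subject, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

// str_split produces the same chunks without a regex.
$chunks = str_split($subject, 4);

print_r($onlyEmpty);
print_r($captured);
print_r($chunks);
```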
Every character in the string matches your pattern; in other words, the string consists only of delimiters, so the result contains only zero-length strings.
To get the expected result you need to capture the delimiters. You can do that by wrapping the pattern in a capturing group, /(.{1,4})/, and adding two flags:
$blocks = preg_split ($pattern, $subject, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY );
You can set the PREG_SPLIT_DELIM_CAPTURE flag to get the captured pattern in the output. From the PHP reference:
If this flag is set, parenthesized expression in the delimiter pattern
will be captured and returned as well.
Note: you need to put the pattern inside a capturing group () to get it in the output.
$subject = "101010101";
$pattern = "/(.{1,4})/";
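// -1 means no limit; PREG_SPLIT_NO_EMPTY drops the empty strings between the captured chunks
$blocks = preg_split ($pattern, $subject, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);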
print_r($blocks);
preg_split() returns an array containing substrings of subject split along boundaries matched by pattern, or FALSE on failure. But you are trying to grab groups of 1-4 characters from that string, so preg_match_all() can be used for this purpose. Example:
$subject = "101010101";
$pattern = "/[01]{1,4}/";
preg_match_all($pattern, $subject, $match);
echo '<pre>';
print_r($match[0]); // call print_r on its own so its return value is not echoed
echo '</pre>';

How to get everything except brackets from string using php preg_split?

$str = "[10:42-23:10]part1[11:30-13:20]part2";
I wish to split it into something like:
[1] 10:42-23:10
[2] part1
[3] 11:30-13:20
[4] part2
The best I managed to come up with is:
$parts = preg_split("/(\\[*\\])\w+/", $str );
But this returns
[0] => [10:42-23:10
[1] => [11:30-13:20
[2] =>
You can also use a regex with preg_match_all() instead of preg_split():
$str = "[10:42-23:10]part1[11:30-13:20]part2";
preg_match_all("/[^\[\]]+/", $str, $parts);
print_r($parts[0]);
Split on alternative between [ and ], and use the flag PREG_SPLIT_NO_EMPTY to not catch empty parts.
$str = "[10:42-23:10]part1[11:30-13:20]part2";
$parts = preg_split("/\[|\]/", $str, -1, PREG_SPLIT_NO_EMPTY );
print_r($parts);
Output:
Array
(
[0] => 10:42-23:10
[1] => part1
[2] => 11:30-13:20
[3] => part2
)
NB.
Thanks to @WiktorStribiżew: his regex /[][]/ is much more efficient. In my benchmark the alternation is about 40% slower (put the other way, the character class is about 66% faster).
$str = "[10:42-23:10]part1[11:30-13:20]part2";
$parts = preg_split("/[][]/", $str, -1, PREG_SPLIT_NO_EMPTY );
print_r($parts);
Here is the perl script I have used to do the benchmark:
#!/usr/bin/perl
use Benchmark qw(:all);
my $str = "[10:42-23:10]part1[11:30-13:20]part2";
my $count = -5;
cmpthese($count, {
'[][]' => sub {
my @parts = split(/[][]/, $str);
},
'\[|\]' => sub {
my @parts = split(/\[|\]/, $str);
},
});
Result: (2 runs)
>perl -w benchmark.pl
Rate \[|\] [][]
\[|\] 536640/s -- -40%
[][] 891396/s 66% --
>Exit code: 0
>perl -w benchmark.pl
Rate \[|\] [][]
\[|\] 530867/s -- -40%
[][] 885242/s 67% --
>Exit code: 0
Use a simple regex to match any [...] substring (\[[^][]*]) and wrap the whole pattern with a capturing group - then you can use it with preg_split and PREG_SPLIT_DELIM_CAPTURE flag to get both the captures and the substrings in between matches:
$re = '/(\[[^][]*])/';
$str = '[10:42-23:10]part1[11:30-13:20]part2';
$matches = preg_split($re, $str, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
print_r($matches);
With this approach, you have better control over what you match inside the square brackets, as you can adjust the pattern to match only time ranges, e.g.
(\[\d{2}:\d{2}-\d{2}:\d{2}])
A [10:42-23:10]part1[11:30-13:20]part2[4][5] will get split into [10:42-23:10], part1, [11:30-13:20] and part2[4][5] (note the [4][5] are not split out).
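As a quick check of that behaviour, here is a sketch that applies the stricter time-range pattern with the same flags as above. The variable names are mine:

```php
<?php
$str = '[10:42-23:10]part1[11:30-13:20]part2[4][5]';

// Only bracketed HH:MM-HH:MM ranges count as delimiters; the capturing
// group plus PREG_SPLIT_DELIM_CAPTURE keeps them in the result.
$re = '/(\[\d{2}:\d{2}-\d{2}:\d{2}])/';
$parts = preg_split($re, $str, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

print_r($parts);
```

The brackets stay on the captured ranges here; trim($part, '[]') would remove them if only the times are wanted.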
Without regex, you can use strtok:
$result = [];
$tok = strtok($str, '[]');
do {
if (!empty($tok))
$result[] = $tok;
} while (false !== $tok = strtok('[]'));

Regex to match 2 or more words

I have a regex that tries to match 2 or more words, but it isn't working as it's supposed to. What am I doing wrong?
$string = "i dont know , do you know?";
preg_match("~([a-z']+\b){2,}~", $string, $match);
echo "<pre>";
print_r($match);
echo "</pre>";
Expected Result:
Array ( i dont know )
Actual Result:
Array ( )
This will match a string that contains 2 or more words:
/([a-zA-Z]+\s?\b){2,}/g (you can test it at http://www.regexr.com/)
PHP:
$string = "i dont know , do you know?";
preg_match("/([a-zA-Z]+\s?\b){2,}/", $string, $match);
echo "<pre>";
print_r($match);
echo "</pre>";
Note: do not use the /g in the PHP code
This one should work: ~([\w']+(\s+|[^\w\s])){2,}~ (add the g flag only in testers that need it; PHP has no g modifier), which also matches strings like "I do!"
I think you are missing how the {} are used, to match two words
preg_match_all('/([a-z]+)/i', 'one two', $match );
if( $match && count($match[1]) > 1 ){
....
}
Match is
array (
0 =>
array (
0 => 'one',
1 => 'two',
),
1 =>
array (
0 => 'one',
1 => 'two',
),
)
Match will have all matches of the pattern, so then it's trivial to just count them up...
When using
preg_match('/(\w+){2,}/', 'one two', $match );
Match is
array (
0 => 'one',
1 => 'e',
)
clearly not what you want.
The only way I see with preg_match is with this /([a-z]+\s+[a-z]+)/
preg_match ([a-z']+\b){2,} http://www.phpliveregex.com/p/frM
preg_match ([a-z]+\s+[a-z]+) http://www.phpliveregex.com/p/frO
Suggested
preg_match_all ([a-z]+) http://www.phpliveregex.com/p/frR ( may have to select preg_match_all on the site )
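Both ideas from this thread can be sketched side by side: counting words with preg_match_all, and matching the first run of two or more words with a single preg_match. The second pattern is my own variation (a word followed by one or more whitespace-plus-word groups), not one posted above:

```php
<?php
$string = "i dont know , do you know?";

// preg_match_all returns the number of full matches, i.e. the word count.
$wordCount = preg_match_all("~[a-z']+~i", $string, $words);

// One word followed by at least one more whitespace-separated word.
$found = preg_match("~[a-z']+(?:\s+[a-z']+)+~i", $string, $run);

echo $wordCount, PHP_EOL;
echo $found ? $run[0] : '(no run of two or more words)', PHP_EOL;
```

The non-capturing group (?:...) keeps $run down to the full match only, which avoids the stray last-iteration capture seen in the {2,} examples.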

Custom regular expression pattern

What's the right pattern to obtain something like this using preg_split?
Input:
Src.[VALUE1] + abs(Src.[VALUE2])
Output:
Array (
[0] => Src.[VALUE1]
[1] => Src.[VALUE2]
)
Instead of using preg_split, using preg_match_all makes more sense in this case:
preg_match_all('/\w+\.\[\w+\]/', $str, $matches);
$matches = $matches[0];
Result of $matches:
Array
(
[0] => Src.[VALUE1]
[1] => Src.[VALUE2]
)
This regex should be fine
Src\.\[[^\]]+\]
But instead of preg_split I'd suggest using preg_match_all
$string = 'Src.[VALUE1] + abs(Src.[VALUE2])';
$matches = array();
preg_match_all('/Src\.\[[^\]]+\]/', $string, $matches);
All matches you're looking for will be bound to $matches[0] array.
I guess preg_match_all is what you want. This works -
$string = "Src.[VALUE1] + abs(Src.[VALUE2])";
$regex = "/Src\.\[.*?\]/";
preg_match_all($regex, $string, $matches);
var_dump($matches[0]);
/*
OUTPUT
*/
array
0 => string 'Src.[VALUE1]' (length=12)
1 => string 'Src.[VALUE2]' (length=12)

PHP: Split string into array, like explode with no delimiter

I have a string such as:
"0123456789"
And I need to split each character into an array.
I, for the hell of it, tried:
explode('', '123545789');
But it gave me the obvious: Warning: No delimiter defined in explode) ..
How would I go about this? I can't see any method offhand, ideally just a single function.
$array = str_split("0123456789bcdfghjkmnpqrstvwxyz");
str_split takes an optional 2nd param, the chunk length (default 1), so you can do things like:
$array = str_split("aabbccdd", 2);
// $array[0] = aa
// $array[1] = bb
// $array[2] = cc etc ...
You can also get at parts of your string by treating it as an array:
$string = "hello";
echo $string[1];
// outputs "e"
You can access characters in a string just like an array:
$s = 'abcd';
echo $s[0];
prints 'a'
Try this:
$str = '123456789';
$char_array = preg_split('//', $str, -1, PREG_SPLIT_NO_EMPTY);
str_split can do the trick. Note that strings in PHP can be accessed just like a character array. In most cases, you won't need to split your string into a "new" array.
Here is an example that works with multibyte (UTF-8) strings.
$str = 'äbcd';
// PHP 5.4.8+ allows null as the length argument of mb_substr()
$arr = [];
// Compare with '' explicitly: a remaining string of "0" is falsy and would end the loop early
while ( $str !== '' ) {
    $arr[] = mb_substr( $str, 0, 1, 'utf-8' );
    $str = mb_substr( $str, 1, null, 'utf-8' );
}
It can also be done with preg_split() (preg_split( '//u', $str, -1, PREG_SPLIT_NO_EMPTY )). However, unlike the above example, which runs almost as fast regardless of the size of the string, preg_split() is fast with small strings but a lot slower with large ones.
Try this:
$str = '546788';
$char_array = preg_split('//', $str, -1, PREG_SPLIT_NO_EMPTY);
Try this:
$str = "Hello Friend";
$arr1 = str_split($str);
$arr2 = str_split($str, 3);
print_r($arr1);
print_r($arr2);
The above example will output:
Array
(
[0] => H
[1] => e
[2] => l
[3] => l
[4] => o
[5] =>
[6] => F
[7] => r
[8] => i
[9] => e
[10] => n
[11] => d
)
Array
(
[0] => Hel
[1] => lo
[2] => Fri
[3] => end
)
If you want to split the string, it's best to use:
$array = str_split($string);
When you have a delimiter that separates the parts of the string, you can use
explode($delimiter, $string);
where you pass the delimiter as the first argument (it must not be an empty string), such as:
explode(',', $string);
$array = str_split($string);
works fine for single-byte text, but if the string contains special (multibyte) characters that you want to preserve and manipulate, I would use
$array = [];
// Compare with '' explicitly so that a remaining "0" does not end the loop early
while ($string !== '') {
    $array[] = mb_substr($string, 0, 1, 'utf-8');
    $string = mb_substr($string, 1, null, 'utf-8');
}
because in some of my personal use cases it has proved more reliable when there is an issue with special characters.
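For comparison, a short sketch of the byte-based and character-based splits on the same UTF-8 string. It assumes the mbstring extension is loaded; mb_str_split() is only available from PHP 7.4, so the sketch falls back to the preg_split result on older versions:

```php
<?php
$str = "\u{00E4}bcd"; // "äbcd"; the ä occupies two bytes in UTF-8

// str_split works on bytes, so the ä is cut into two single-byte pieces.
$bytes = str_split($str);

// The u modifier makes the empty pattern match between code points.
$chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

// mb_str_split() exists from PHP 7.4; otherwise reuse the preg_split result.
$mbChars = function_exists('mb_str_split') ? mb_str_split($str, 1, 'UTF-8') : $chars;

echo count($bytes), ' ', count($chars), ' ', count($mbChars), PHP_EOL;
```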
