PHP preg_split() for new line comma colon and space - php

This is my code in .php:
$new_split = preg_split("/\s*[:, ]\s*/",$full_list,2);
print_r ($new_split);
Input ($full_list) is:
abcd : xyz
abcd efgh, ijk ,lmn
abcd lmnop
abcd: efghijk
abcd,efgh
Output is:
Array (
[0] => abcd
[1] => xyz abcd efgh, ijk ,lmn abcd lmnop abcd: efghijk abcd,efgh *
)
I want to split on newline, comma (,), colon (:), and space. Please let me know how to get the output below.
Expected output is:
Array (
[0] => abcd
[1] => xyz
[2] => abcd
[3] => efgh
[4] => ijk
[5] => lmn
[6] => abcd
[7] => lmnop
[8] => abcd
[9] => efghijk
[10] => abcd
[11] => efgh
)

Remove the \s* around the character class, replace the single space inside the character class with \s, and add a quantifier (i.e. + for one or more). Also drop the limit argument of 2, which caps the result at two elements:
$new_split = preg_split("/[:,\s]+/",$full_list);
print_r ($new_split);

Add \s inside the brackets like this: $new_split = preg_split("/\s*[:,\s]\s*/",$full_list);
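To check the character-class pattern against the sample input, here is a minimal sketch. It assumes the input ends with a trailing newline (the question does not say), which is why PREG_SPLIT_NO_EMPTY is passed; without that flag a trailing delimiter leaves an empty last element.

```php
<?php
// Sample input from the question; the trailing "\n" is an assumption.
$full_list = "abcd : xyz\nabcd efgh, ijk ,lmn\nabcd lmnop\nabcd: efghijk\nabcd,efgh\n";

// A limit of -1 means "no limit"; PREG_SPLIT_NO_EMPTY drops empty pieces.
$new_split = preg_split('/[:,\s]+/', $full_list, -1, PREG_SPLIT_NO_EMPTY);

print_r($new_split); // 12 elements, from "abcd" through "efgh"
```

Because the delimiters are matched as one run ([:,\s]+), " : " and " ," each count as a single separator, so no empty elements appear between words.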

Related

Splitting a single string to an array on more than one delimiter

Is it possible to explode the following:
08 1.2/3(1(1)2.1-1
to an array of {08, 1, 2, 3, 1, 1, 2, 1, 1}?
I tried using preg_split("/ (\s|\.|\-|\(|\)) /g", '08 1.2/3(1(1)2.1-1') but it returned nothing. I tried checking my regex here and it matched well. What am I missing here?
You should use a character class containing all the delimiters which you want to use for splitting. Regex character classes appear inside [...]:
<?php
$keywords = preg_split("/[\s,\/().-]+/", '08 1.2/3(1(1)2.1-1');
print_r($keywords);
Result:
Array ( [0] => 08 [1] => 1 [2] => 2 [3] => 3 [4] => 1 [5] => 1 [6] => 2 [7] => 1 [8] => 1 )
You can use preg_match_all():
$str = '08 1.2/3(1(1)2.1-1';
preg_match_all('!\d+!', $str, $matches);
print_r($matches);
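As for why the attempt in the question returned nothing: PHP's PCRE functions have no g modifier (preg_split() already splits on every match), and the spaces around the group in that pattern are matched literally. A small sketch comparing the two approaches from the answers above; the variable names are only illustrative:

```php
<?php
$str = '08 1.2/3(1(1)2.1-1';

// Approach 1: split on runs of delimiter characters.
$bySplit = preg_split('/[\s,\/().-]+/', $str, -1, PREG_SPLIT_NO_EMPTY);

// Approach 2: collect the runs of digits; full matches are in $matches[0].
preg_match_all('/\d+/', $str, $matches);
$byMatch = $matches[0];

var_dump($bySplit === $byMatch); // bool(true)
```

Splitting requires listing every possible delimiter, while matching \d+ only requires describing the tokens you want, so the second form tends to be more robust when new separators can show up.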

How capture part of string included delimiter?

Having a string so formed:
#foo1 foo2# foo3 foo4 #foo5# ##foo6# #foo7## #foo8 foo9#
The expected should be an array so formed:
array (
[0] => #foo1 foo2#
[1] => foo3
[2] => foo4
[3] => #foo5#
[4] => ##foo6# #foo7##
[5] => #foo8 foo9#
);
Or, more simply: split on spaces, but capture everything inside a pair of delimiters (delimiters included) as a single array element.
NOTE: The delimiter character can be repeated in the string.
You can use preg_match_all using this alternation regex:
/(#+).*?\1|\S+/
RegEx Breakup:
(#+) - Match 1 or more # in captured group #1
.*? - Match 0 or more of any characters (non-greedy)
\1 - Back-reference to captured group #1 to make sure we have same #s on RHS
| - OR
\S+ - one or more non-white-space characters
Code:
$str = '#foo1 foo2# foo3 foo4 #foo5# ##foo6# #foo7## #foo8 foo9#';
preg_match_all('/(#+).*?\1|\S+/', $str, $matches);
print_r($matches[0]);
Output:
Array
(
[0] => #foo1 foo2#
[1] => foo3
[2] => foo4
[3] => #foo5#
[4] => ##foo6# #foo7##
[5] => #foo8 foo9#
)
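To see what the back-reference \1 buys you, here is a small sketch on a made-up input (not from the question) where a single # appears inside a ##...## block. The lone # does not close the block, because \1 must repeat exactly what (#+) captured:

```php
<?php
// Hypothetical input: a lone "#" sits inside a "##...##" block.
$str = '##a# b## c';

preg_match_all('/(#+).*?\1|\S+/', $str, $matches);

print_r($matches[0]); // [0] => ##a# b##, [1] => c
```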

How to split a string in multiple ones (Php)?

I want to split a big number/string for example 123456789123456789 into 6 smaller strings/numbers of 3 characters each. So the result would be 123 456 789 123 456 789. How can I do this?
Use chunk_split():
$var = "123456789123456789";
$split_string = chunk_split($var, 3); // 3 is the length of each chunk
If you want your result as an array, you can use str_split():
$var = "123456789123456789";
$array = str_split($var, 3); // 3 is the length of each chunk
You may use the chunk_split() function.
It splits a string into smaller chunks:
$string = "123456789123456789";
echo chunk_split ($string, 3, " ");
will output
123 456 789 123 456 789
First parameter is the string to be chunked. The second is the chunk length and the third is what you want at the end of each chunk.
See PHP manual for further information
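One detail worth knowing: chunk_split() appends the separator after every chunk, including the last one, so the result ends with a trailing separator. A short sketch comparing it with str_split() plus implode(), which puts the separator only between chunks:

```php
<?php
$digits = '123456789123456789';

// chunk_split() adds the separator after each chunk, the last one included.
$withTrailing = chunk_split($digits, 3, ' ');

// str_split() + implode() places the separator only between chunks.
$joined = implode(' ', str_split($digits, 3));

var_dump($joined);                        // string(23) "123 456 789 123 456 789"
var_dump($withTrailing === $joined . ' '); // bool(true)
```

If the trailing separator is unwanted, rtrim($withTrailing) or the implode() form avoids it.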
You could do something like this:
$string = '123456789123456789';
preg_match_all('/(\d{3})/', $string, $matches);
print_r($matches[1]);
Output:
Array
(
[0] => 123
[1] => 456
[2] => 789
[3] => 123
[4] => 456
[5] => 789
)
\d matches a digit and {3} requires exactly 3 of the preceding token (in this case, a digit).
Or, if the length is not always a multiple of 3:
$string = '12345678912345678922';
preg_match_all('/(\d{1,3})/', $string, $matches);
print_r($matches[1]);
Output:
Array
(
[0] => 123
[1] => 456
[2] => 789
[3] => 123
[4] => 456
[5] => 789
[6] => 22
)
Demo: https://regex101.com/r/rX0pJ1/1
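For an input that consists only of digits, str_split() handles the leftover group the same way, without a regex. A minimal sketch checking that both give the same array:

```php
<?php
$string = '12345678912345678922';

// str_split() puts the remaining characters in a shorter last chunk.
$chunks = str_split($string, 3);

// The regex version: greedy groups of 1 to 3 digits.
preg_match_all('/\d{1,3}/', $string, $matches);

var_dump($chunks === $matches[0]); // bool(true)
```

The two differ once non-digits appear: str_split() chunks any characters, while the regex skips over anything that is not a digit.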

How to manipulate complex strings in php?

I am trying to group pieces of text from a string and create an array from them.
The string is something like this:
<em>string</em> and the <em>test</em> here.
tableBegin rowNumber:2, columnNumber:2 11 22 33 44 tableEnd
<em>end</em> text here
I was hoping to get an array like the following results
array (0 => '<em>string</em> and the <em>test</em> here.',
1=>'rowNumber:2',
2=>'columnNumber:2',
3=>'11',
4=>'22',
5=>'33',
6=>'44',
7=>'<em>end</em> text here')
11, 22, 33, 44 are the table cell data the user enters. I want each of them to get its own index, while the rest of the text stays together.
tableBegin and tableEnd are just markers around the table cell data.
Any help or tips? Thanks a lot!
You may try the following, note that you need PHP 5.3+:
$string = '<em>string</em> and the <em>test</em> here.
tableBegin rowNumber:2, columnNumber:2 11 22 33 44 tableEnd
SOme other text
tableBegin rowNumber:3, columnNumber:3 11 22 33 44 55 tableEnd
<em>end</em> text here';
$array = array();
preg_replace_callback('#tableBegin\s*(.*?)\s*tableEnd\s*|.*?(?=tableBegin|$)#s', function($m)use(&$array){
if(isset($m[1])){ // If group 1 exists, which means if the table is matched
$array = array_merge($array, preg_split('#[\s,]+#s', $m[1])); // add the split string to the array
// split by one or more whitespace or comma --^
}else{// Else just add everything that's matched
if(!empty($m[0])){
$array[] = $m[0];
}
}
}, $string);
print_r($array);
Output
Array
(
[0] => string and the test here.
[1] => rowNumber:2
[2] => columnNumber:2
[3] => 11
[4] => 22
[5] => 33
[6] => 44
[7] => SOme other text
[8] => rowNumber:3
[9] => columnNumber:3
[10] => 11
[11] => 22
[12] => 33
[13] => 44
[14] => 55
[15] => end text here
)
Regex explanation
tableBegin : match tableBegin
\s* : match a whitespace zero or more times
(.*?) : match everything ungreedy and put it in group 1
\s* : match a whitespace zero or more times
tableEnd : match tableEnd
\s* : match a whitespace zero or more times
| : or
.*?(?=tableBegin|$) : match everything until tableBegin or the end of the string (without the m modifier, $ does not match at each line end)
The s modifier : make dots also match newlines
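The same idea can also be sketched with preg_split() and PREG_SPLIT_DELIM_CAPTURE instead of a callback. This is an alternative technique, not the code above: the table blocks act as delimiters, and the captured table body lands at every odd index. The sample text is shortened, and trimming the surrounding text is my own choice:

```php
<?php
// Shortened sample input (hypothetical), with one table block.
$string = "intro text\ntableBegin rowNumber:2, columnNumber:2 11 22 33 44 tableEnd\nend text";

// With one capture group, the pieces alternate: text, table body, text, ...
$parts = preg_split('#tableBegin\s*(.*?)\s*tableEnd#s', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

$result = array();
foreach ($parts as $i => $part) {
    if ($i % 2 === 1) {
        // Odd index: table body, split into cells on whitespace or commas.
        foreach (preg_split('#[\s,]+#', $part, -1, PREG_SPLIT_NO_EMPTY) as $cell) {
            $result[] = $cell;
        }
    } else {
        // Even index: surrounding text, kept together (trimmed here).
        $text = trim($part);
        if ($text !== '') {
            $result[] = $text;
        }
    }
}

print_r($result); // intro text, rowNumber:2, columnNumber:2, 11, 22, 33, 44, end text
```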
Here is the ugly way to do it, if you can't find a regex guru out there.
So, this is your text
$string = "<em>string</em> and the <em>test</em> here.
tableBegin rowNumber:2, columnNumber:2 11 22 33 44 tableEnd
<em>end</em> text here";
And this is my code
$E = explode(' ', $string);
$A = $E[0].$E[1].$E[2].$E[3].$E[4].$E[5];
$B = $E[17].$E[18].$E[19];
$All = [$A, $E[8],$E[9], $E[11], $E[12], $E[13], $E[14], $B];
print_r($All);
And this is the output
Array
(
[0] => stringandthetesthere.
[1] => rowNumber:2,
[2] => columnNumber:2
[3] => 11
[4] => 22
[5] => 33
[6] => 44
[7] => endtexthere
)
Of course, the <em> tags won't be visible in the browser unless you view the page source.

RegEx Statement Issues - PHP

I am attempting to use RegEx to strip down the following data:
mlb_s_left1=Baltimore 3 ^NY Yankees 12 (FINAL)&mlb_s_right1_1=W: Hughes L: Britton&mlb_s_right1_count=1&mlb_s_url1=http://sports.espn.go.com/mlb/boxscore?gameId=320801110&mlb_s_left2=^Chicago Sox 3 Minnesota 2 (FINAL)&mlb_s_right2_1=W: Peavy L: Diamond S: Reed&mlb_s_right2_count=1&mlb_s_url2=http://sports.espn.go.com/mlb/boxscore?gameId=320801109
I am hoping to split it apart by home team (first city), home score (first number), away team (second city), away score (second number), and the game status (in parentheses). This is the regex I have currently, but I suspect it is quite wrong.
preg_match_all('/mlb_s_left[0-9]=(?P<hometeam>.*?) (?P<homescore>.*?) (?P<awayteam>.*?) (?P<awayscore>.*?)\((?P<time>.*?)\)/', $content, $matches);
I would appreciate any and all help in getting this working.
I have tested the following code snippet in PHP 5.4.5:
<?php
$foo = 'mlb_s_left1=Baltimore 3 ^NY Yankees 12 (FINAL)&mlb_s_right1_1=W: Hughes L: Britton&mlb_s_right1_count=1&mlb_s_url1=http://sports.espn.go.com/mlb/boxscore?gameId=320801110&mlb_s_left2=^Chicago Sox 3 Minnesota 2 (FINAL)&mlb_s_right2_1=W: Peavy L: Diamond S: Reed&mlb_s_right2_count=1&mlb_s_url2=http://sports.espn.go.com/mlb/boxscore?gameId=320801109';
preg_match_all('/mlb_s_left\d=\^?(?P<hometeam>[a-zA-Z]+(?:\s+[a-zA-Z]+)*)\s+(?P<homescore>\d+)\s+\^?(?P<awayteam>[a-zA-Z]+(?:\s+[a-zA-Z]+)*)\s+(?P<awayscore>\d+)\s+\((?P<time>\w+)\)/', $foo, $matches, PREG_SET_ORDER);
print_r($matches);
?>
output:
Array
(
[0] => Array
(
[0] => mlb_s_left1=Baltimore 3 ^NY Yankees 12 (FINAL)
[hometeam] => Baltimore
[1] => Baltimore
[homescore] => 3
[2] => 3
[awayteam] => NY Yankees
[3] => NY Yankees
[awayscore] => 12
[4] => 12
[time] => FINAL
[5] => FINAL
)
[1] => Array
(
[0] => mlb_s_left2=^Chicago Sox 3 Minnesota 2 (FINAL)
[hometeam] => Chicago Sox
[1] => Chicago Sox
[homescore] => 3
[2] => 3
[awayteam] => Minnesota
[3] => Minnesota
[awayscore] => 2
[4] => 2
[time] => FINAL
[5] => FINAL
)
)
Something like this should get you close.
preg_match_all('/mlb_s_left\d+=(?P<hometeam>\D+)\s+(?P<homescore>\d+)\s+(?P<awayteam>\D+)\s+(?P<awayscore>\d+)\s*\((?P<time>[^)]+)\)/',
$content, $matches);
Note that \d matches any digit, and \D matches anything that is not a digit.
[^)]+ matches one or more characters other than a closing parenthesis; \s+ matches one or more whitespace characters, and \s* matches zero or more whitespace characters.
This wouldn't work very well if you have a city name with a number in it, and if you have a huge string, it's possible it could get hung up somewhere; you might consider splitting it up and matching a bit more piecemeal.
Generally speaking I would avoid .*? as a pattern match, as it basically matches almost anything. It's best for your regular expression to be as specific as possible, based on what you know about the data.
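One way to do the "piecemeal" matching suggested above is to break the data into fields first and only then run a small, anchored regex on each mlb_s_left value. The sketch below assumes the data is a query-string-style payload, so parse_str() can split it; that is an assumption about the feed, and the sample is shortened:

```php
<?php
// Shortened sample of the feed (assumed to be query-string formatted).
$content = 'mlb_s_left1=Baltimore 3 ^NY Yankees 12 (FINAL)&mlb_s_right1_count=1'
         . '&mlb_s_left2=^Chicago Sox 3 Minnesota 2 (FINAL)';

parse_str($content, $fields); // splits on "&" and the first "=" of each pair

// Anchored pattern for a single "left" value; "^" marks are optional.
$pattern = '/^\^?(?P<hometeam>\D+?)\s+(?P<homescore>\d+)\s+'
         . '\^?(?P<awayteam>\D+?)\s+(?P<awayscore>\d+)\s*\((?P<time>[^)]+)\)$/';

$games = array();
foreach ($fields as $key => $value) {
    if (!preg_match('/^mlb_s_left\d+$/', $key)) {
        continue; // skip the right/count/url fields
    }
    if (preg_match($pattern, $value, $m)) {
        $games[] = array(
            'hometeam'  => $m['hometeam'],
            'homescore' => (int) $m['homescore'],
            'awayteam'  => $m['awayteam'],
            'awayscore' => (int) $m['awayscore'],
            'time'      => $m['time'],
        );
    }
}

print_r($games);
```

Working on one short value at a time keeps each regex small and anchored (^ ... $), which also limits how far a bad match can wander in a large input.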
