How to keep the delimiter in the next item in preg_split?

How to keep the delimiter in the next item in preg_split? - php

I split a string by a set of characters as
$str = 'a-1 90 b55 0 -4 4 c9';
$array = preg_split('#(?<=[abc])#',$str, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
it preserves the delimiter in the previous element as (demo)
Array
(
[0] => a
[1] => -1 90 b
[2] => 55 0 -4 4 c
[3] => 9
)
but I want to keep it in the next item as
Array
(
[0] => a-1 90
[1] => b55 0 -4 4
[2] => c9
)

Use lookahead instead of lookbehind:
$str = 'a-1 90 b55 0 -4 4 c9';
$array = preg_split('#(?=[abc])#',$str, -1, PREG_SPLIT_NO_EMPTY);
print_r($array);
Since you're not using any capture group in your regex, therefore there is no need to use PREG_SPLIT_DELIM_CAPTURE flag.
Code Demo

Related

Parse strictly formatted text containing multiple entries with no delimiting character

I have a string containing multiple products orders which have been joined together without a delimiter.
I need to parse the input string and convert sets of three substrings into separate rows of data.
I tried splitting the string using split() and strstr() function, but could not generate the desired result.
How can I convert this statement into different columns?
RM is Malaysian Ringgit
From this statement:
"2 x Brew Coffeee Panas: RM7.42 x Tongkat Ali Ais: RM8.6"
Into seperate row:
2 x Brew Coffeee Panas: RM7.4
2 x Tongkat Ali Ais: RM8.6
And this 2 row into this table in DB:
Table: Products
Product Name
Quantity
Total Amount (RM)
Brew Coffeee Panas
2
7.4
Tongkat Ali Ais
2
8.6
*Note: the "total amount" substrings will reliably have a numeric value with precision to one decimal place.

You could use regex if your string format is consistent. Here's an expression that could do that:
(\d) x (.+?): RM(\d+\.\d)
Basic usage
$re = '/(\d) x (.+?): RM(\d+\.\d)/';
$str = '2 x Brew Coffeee Panas: RM7.42 x Tongkat Ali Ais: RM8.6';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
var_export($matches);
Which gives
array (
0 =>
array (
0 => '2 x Brew Coffeee Panas: RM7.4',
1 => '2',
2 => 'Brew Coffeee Panas',
3 => '7.4',
),
1 =>
array (
0 => '2 x Tongkat Ali Ais: RM8.6',
1 => '2',
2 => 'Tongkat Ali Ais',
3 => '8.6',
),
)
Group 0 will always be the full match, after that the groups will be quantity, product and price.
Try it online

Capture one or more digits
Match the space, x, space
Capture one or more non-colon characters until the first occuring colon
Match the colon, space, then RM
Capture the float value that has a max decimal length of 1OP says in comment under question: it only take one decimal place for the amount
There are no "lazy quantifiers" in my pattern, so the regex can move most swiftly.
This regex pattern is as Accurate as the sample data and requirement explanation allows, as Efficient as it can be because it only contains greedy quantifiers, as Concise as it can be thanks to the negated character class, and as Readable as the pattern can be made because there are no superfluous characters.
Code: (Demo)
var_export(
preg_match_all('~(\d+) x ([^:]+): RM(\d+\.\d)~', $string, $m)
? array_slice($m, 1) // omit the fullstring matches
: [] // if there are no matches
);
Output:
array (
0 =>
array (
0 => '2',
1 => '2',
),
1 =>
array (
0 => 'Brew Coffeee Panas',
1 => 'Tongkat Ali Ais',
),
2 =>
array (
0 => '7.4',
1 => '8.6',
),
)
You can add the PREG_SET_ORDER argument to the preg_match_all() call to aid in iterating the matches as rows.
preg_match_all('~(\d+) x ([^:]+): RM(\d+\.\d)~', $string, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
echo '<tr><td>' . implode('</td><td>', array_slice($match, 1)) . '</td></tr>';
}

You can use a regex like this:
/(\d+)\sx\s([^:]+):\sRM(\d+\.?\d?)(?=\d|$)/
Explanation:
(\d+) captures one or more digits
\s matches a whitespace character
([^:]+): captures one or more non : characters that come before a : character (you can also use something like [a-zA-Z0-9\s]+): if you know exactly which characters can exist before the : character - in this case lower case and upper case letters, digits 0 through 9 and whitespace characters)
(\d+\.?\d?) captures one or more digits, followed by a . and another digit if they exist
(?=\d|$) is a positive lookahead which matches a digit after the main expression without including it in the result, or the end of the string
You can also add the PREG_SET_ORDER flag to preg_match_all() to group the results:
PREG_SET_ORDER
Orders results so that $matches[0] is an array of first set of matches, $matches[1] is an array of second set of matches, and so on.
Code example:
<?php
$txt = "2 x Brew Coffeee Panas: RM7.42 x Tongkat Ali Ais: RM8.62 x B026 Kopi Hainan Kecil: RM312 x B006 Kopi Hainan Besar: RM19.5";
$pattern = "/(\d+)\sx\s([^:]+):\sRM(\d+\.?\d?)(?=\d|$)/";
if(preg_match_all($pattern, $txt, $matches, PREG_SET_ORDER)) {
print_r($matches);
}
?>
Output:
Array
(
[0] => Array
(
[0] => 2 x Brew Coffeee Panas: RM7.4
[1] => 2
[2] => Brew Coffeee Panas
[3] => 7.4
)
[1] => Array
(
[0] => 2 x Tongkat Ali Ais: RM8.6
[1] => 2
[2] => Tongkat Ali Ais
[3] => 8.6
)
[2] => Array
(
[0] => 2 x B026 Kopi Hainan Kecil: RM31
[1] => 2
[2] => B026 Kopi Hainan Kecil
[3] => 31
)
[3] => Array
(
[0] => 2 x B006 Kopi Hainan Besar: RM19.5
[1] => 2
[2] => B006 Kopi Hainan Besar
[3] => 19.5
)
)
See it live here php live editor and here regex tester.

The first thing I would do would be to perform a simple replacement using preg_replace to insert, with the aid of a a back-reference to the captured item, based upon the known format of a single decimal point. Anything beyond that single decimal point forms part of the next item - the quantity in this case.
$str="2 x Brew Coffeee Panas: RM7.42 x Tongkat Ali Ais: RM8.625 x Koala Kebabs: RM15.23 x Fried Squirrel Fritters: RM32.4";
# qty price
# 2 7.4
# 2 8.6
# 25 15.2
# 3 32.4
/*
Our RegEx to find the decimal precision,
to split the string apart and the quantity
*/
$pttns=(object)array(
'repchar' => '#(RM\d{1,}\.\d{1})#',
'splitter' => '#(\|)#',
'combo' => '#^((\d{1,}) x)(.*): RM(\d{1,}\.\d{1})$#'
);
# create a new version of the string with our specified delimiter - the PIPE
$str = preg_replace( $pttns->repchar, '$1|', $str );
# split the string intp pieces - discard empty items
$a=array_filter( preg_split( $pttns->splitter, $str, null ) );
#iterate through matches - find the quantity,item & price
foreach($a as $str){
preg_match($pttns->combo,$str,$matches);
$qty=$matches[2];
$item=$matches[3];
$price=$matches[4];
printf('%s %d %d<br />',$item,$qty,$price);
}
Which yields:
Brew Coffeee Panas 2 7
Tongkat Ali Ais 2 8
Koala Kebabs 25 15
Fried Squirrel Fritters 3 32

php split / cluster binary into chunks based on next 1 in cycle

I need to figure out a method using PHP to chunk the 1's and 0's into sections.
1001 would look like: array(100,1)
1001110110010011 would look like: array(100,1,1,10,1,100,100,1,1)
It gets different when the sequence starts with 0's... I would like it to segment the first 0's into their own blocks until the first 1 is reached)
00110110 would look like (0,0,1,10,1,10)
How would this be done with PHP?

You can use preg_match_all to split your string, using the following regex:
10*|0
This matches either a 1 followed by some number of 0s, or a 0. Since a regex always tries to match the parts of an alternation in the order they occur, the second part will only match 0s that are not preceded by a 1, that is those at the start of the string. PHP usage:
$beatstr = '1001110110010011';
preg_match_all('/10*|0/', $beatstr, $m);
print_r($m);
$beatstr = '00110110';
preg_match_all('/10*|0/', $beatstr, $m);
print_r($m);
Output:
Array
(
[0] => Array
(
[0] => 100
[1] => 1
[2] => 1
[3] => 10
[4] => 1
[5] => 100
[6] => 100
[7] => 1
[8] => 1
)
)
Array
(
[0] => Array
(
[0] => 0
[1] => 0
[2] => 1
[3] => 10
[4] => 1
[5] => 10
)
)
Demo on 3v4l.org

Splitting a single string to an array on more than one delimiter

Is it possible to explode the following:
08 1.2/3(1(1)2.1-1
to an array of {08, 1, 2, 3, 1, 1, 2, 1, 1}?
I tried using preg_split("/ (\s|\.|\-|\(|\)) /g", '08 1.2/3(1(1)2.1-1') but it returned nothing. I tried checking my regex here and it matched well. What am I missing here?

You should use a character class containing all the delimiters which you want to use for splitting. Regex character classes appear inside [...]:
<?php
$keywords = preg_split("/[\s,\/().-]+/", '08 1.2/3(1(1)2.1-1');
print_r($keywords);
Result:
Array ( [0] => 08 [1] => 1 [2] => 2 [3] => 3 [4] => 1 [5] => 1 [6] => 2 [7] => 1 [8] => 1 )

You can use preg_match_all():
$str = '08 1.2/3(1(1)2.1-1';
preg_match_all('!\d+!', $str, $matches);
print_r($matches);

Php preg_split seperates number with comma in two different numbers

$line = "Type:Bid, End Time: 12/20/2018 08:10 AM (PST), Price: $8,000,Bids: 14, Age: 0, Description: , Views: 120270, Valuation: $10,75, IsTrue: false";
I need to get this array:
Array ( [0] => Bid [1] => 12/20/2018 08:10 AM (PST) [2] => $8,000 [3] => 14 [4] => 0 [5] => [6] => 120270 [7] => $10,75 [8] => false )

I agree with Andreas about using preg_match_all(), but not with his pattern.
For stability, I recommend consuming the entire string from the beginning.
Match the label and its trailing colon. [^:]+:
Match zero or more spaces. \s*
Forget what you matched so far \K
Lazily match zero or more characters (giving back when possible -- make minimal match). .*?
"Look Ahead" and demand that the matched characters from #4 are immediately followed by a comma, then 1 or more non-comma&non-colon character (the next label), then a colon ,[^,:]+: OR the end of the string $.
Code: (Demo)
$line = "Type:Bid, End Time: 12/20/2018 08:10 AM (PST), Price: $8,000,Bids: 14, Age: 0, Description: , Views: 120270, Valuation: $10,75, IsTrue: false";
var_export(
preg_match_all(
'/[^:]+:\s*\K.*?(?=\s*(?:$|,[^,:]+:))/',
$line,
$out
)
? $out[0] // isolate fullstring matches
: [] // no matches
);
Output:
array (
0 => 'Bid',
1 => '12/20/2018 08:10 AM (PST)',
2 => '$8,000',
3 => '14',
4 => '0',
5 => '',
6 => '120270',
7 => '$10,75',
8 => 'false',
)

New answer according to new request:
I use he same regex for spliting the string and I replace after what is before the colon:
$line = "Type:Bid, End Time: 12/20/2018 08:10 AM (PST), Price: $8,000,Bids: 14, Age: 0, Description: , Views: 120270, Valuation: $10,75, IsTrue: false";
$parts = preg_split("/(?<!\d),|,(?!\d)/", $line);
$result = array();
foreach($parts as $elem) {
$result[] = preg_replace('/^[^:]+:\h*/', '', $elem);
}
print_r ($result);
Output:
Array
(
[0] => Bid
[1] => 12/20/2018 08:10 AM (PST)
[2] => $8,000
[3] => 14
[4] => 0
[5] =>
[6] => 120270
[7] => $10,75
[8] => false
)

I'd use preg_match instead.
Here the pattern looks for digit(s) comma digit(s) or just digit(s) or a word and a comma.
I append a comma to the string to make the regex simpler.
$line = "TRUE,59,m,10,500";
preg_match_all("/(\d+,\d+|\d+|\w+),/", $line . ",", $match);
var_dump($match);
https://3v4l.org/HQMgu
Even with a different order of the items this code will still produce a correct output: https://3v4l.org/SRJOf

much bettter idea:
$parts=explode(',',$line,4); //explode has a limit you can use in this case 4
same result less code.
I would keep it simple and do this
$line = "TRUE,59,m,10,500";
$parts = preg_split("/,/", $line);
//print_r ($parts);
$parts[3]=$parts[3].','.$parts[4]; //create a new part 3 from 3 and 4
//$parts[3].=','.$parts[4]; //alternative syntax to the above
unset($parts[4]);//remove old part 4
print_r ($parts);
i would also just use explode(), rather than a regular expression.

How to split a string in multiple ones (Php)?

I want to split a big number/string for example 123456789123456789 into 6 smaller strings/numbers of 3 characters each. So the result would be 123 456 789 123 456 789. How can I do this?

Use chunk_split():
$var = "123456789123456789";
$split_string = chunk_split($var, 3); // 3 is the length of each chunk
If you want your result as an array, you can use str_split():
$var = "123456789123456789";
$array = str_split($var, 3); // 3 is the length of each chunk

You may use chunk_split() function.
It splits a string into smaller
$string = "123456789123456789";
echo chunk_split ($string, 3, " ");
will output
123 456 789 123 456 789
First parameter is the string to be chunked. The second is the chunk length and the third is what you want at the end of each chunk.
See PHP manual for further information

You could do something like this:
$string = '123456789123456789';
preg_match_all('/(\d{3})/', $string, $matches);
print_r($matches[1]);
Output:
Array
(
[0] => 123
[1] => 456
[2] => 789
[3] => 123
[4] => 456
[5] => 789
)
\d is a number and {3} is 3 of the previously found character (in this case a number.
....
or if there won't always be even groupings:
$string = '12345678912345678922';
preg_match_all('/(\d{1,3})/', $string, $matches);
print_r($matches[1]);
Output:
Array
(
[0] => 123
[1] => 456
[2] => 789
[3] => 123
[4] => 456
[5] => 789
[6] => 22
)
Demo: https://regex101.com/r/rX0pJ1/1

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.

How to keep the delimiter in the next item in preg_split? - php

Use lookahead instead of lookbehind: $str = 'a-1 90 b55 0 -4 4 c9'; $array = preg_split('#(?=[abc])#',$str, -1, PREG_SPLIT_NO_EMPTY); print_r($array); Since you're not using any capture group in your regex, therefore there is no need to use PREG_SPLIT_DELIM_CAPTURE flag. Code Demo

Related

Parse strictly formatted text containing multiple entries with no delimiting character

php split / cluster binary into chunks based on next 1 in cycle

Splitting a single string to an array on more than one delimiter

Php preg_split seperates number with comma in two different numbers

How to split a string in multiple ones (Php)?

Categories

Resources