PHP split comma-separated values but retain quotes - php

I'm trying to split a string that contains a comma-separated set of values. This can be done simply with str_getcsv, but it falls short of one additional requirement: I need to retain the quotes.
With an input string of:
string(30) "Hello, "San Diego, California""
I tried two approaches:
explode
$result = explode(",", $string);
Which results in
array(3) {
[0]=>
string(5) "Hello"
[1]=>
string(11) " "San Diego"
[2]=>
string(12) " California""
}
str_getcsv
$result = str_getcsv($string, ",");
This one results in
array(2) {
[0]=>
string(5) "Hello"
[1]=>
string(21) "San Diego, California"
}
I prefer str_getcsv because it splits the values properly, but it strips the enclosing quotes. I need those quotes, so I'm hoping I can call the function without it automatically removing them.
Additional Info
I am actually open to a regex solution, but I am clueless in that area.
I tried the solution here and it didn't work.

This pushed the limits of my regex knowledge, and I was unable to come up with an elegant regex that covers all possible cases for the input string (e.g. if the string ends with a comma) without leaving empty matches at the end.
$parts = preg_split('/(?:"[^"]*"|)\K\s*(,\s*|$)/', $string);
By itself, this gives:
Array
(
[0] => Hello
[1] => "San Diego, California"
[2] =>
)
And you can clean up the empty elements like this:
$result = array_filter($parts, function ($value) {
return ($value !== '');
});
Note: The regex trims white-space from the start/end of each match. Remove the \s* parts if you don't want that.
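As an alternative to splitting, the fields can be matched directly with preg_match_all(), which avoids the empty trailing element. This is only a minimal sketch: the function name split_keep_quotes is hypothetical, and it assumes each field is either unquoted or wrapped in double quotes with no escaped quotes inside. Empty fields are dropped.

```php
<?php
// Hypothetical helper (not part of PHP): splits on commas that are outside
// double quotes and keeps the enclosing quotes on quoted fields.
// Assumes there are no escaped quotes inside a quoted field.
function split_keep_quotes($string)
{
    // Group 1 captures either a whole "quoted field" or a run of
    // characters containing neither a comma nor a double quote.
    preg_match_all('/\s*("[^"]*"|[^,"]+)/', $string, $matches);

    // Trim stray whitespace, then drop fields that were whitespace only.
    $fields = array_filter(array_map('trim', $matches[1]), function ($value) {
        return $value !== '';
    });

    // Re-index so the keys are 0, 1, 2, ...
    return array_values($fields);
}

var_dump(split_keep_quotes('Hello, "San Diego, California"'));
```

For the input from the question this should give the two elements Hello and "San Diego, California", with the quotes kept on the second one.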

$wrapped = array_map(function($value) {
return "\"$value\"";
}, str_getcsv($string, ","));
UPDATE: You could try something like this:
$value = preg_split('~(?:\'[^\']*\'|"[^"]*"|)\K(,|$)~', $string);
I lifted this from PHP get comma separated values which are not enclosed

Related

How to get numbers between a long space (PHP Regex)

I'd like to extract the numbers specifically with a PHP regex. I don't understand regex very well, although I'm currently experimenting on the regex101 website. The thing is, I have this:
66
28006 MadridVer teléfono
(Literally that; in the source there are many more spaces, and 28006 MadridVer teléfono actually appears on the next line.) I'd like to extract the number 28006, or at least split the matches so that 28006 ends up on its own in one of the groups. What would my PHP regex look like? Maybe, apart from capturing spaces, I should capture a newline or something. I am totally lost here (yes, I'm still an absolute regex novice).
I don't see a need for regex.
Remove the new line and explode on space.
Then use array_filter to remove empty values from the array and rearrange the array with array_values.
$str = "66
28006 MadridVer teléfono";
$str = str_replace("\n", " ", $str);
$arr = explode(" ", $str);
$arr = array_values(array_filter($arr));
var_dump($arr);
Returns:
array(4) {
[0]=>
string(2) "66"
[1]=>
string(5) "28006"
[2]=>
string(9) "MadridVer"
[3]=>
string(9) "teléfono"
}
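Since the question asks for a regex, here is a hedged sketch of two regex variants. The first splits on any run of whitespace; the second assumes (this is not stated in the question) that the wanted number is a standalone run of exactly five digits, like a postal code. The variable names are illustrative only.

```php
<?php
$str = "66\n      28006 MadridVer teléfono";

// Variant 1: split on any run of whitespace (spaces, tabs, newlines)
// and skip empty pieces.
$parts = preg_split('/\s+/', $str, -1, PREG_SPLIT_NO_EMPTY);

// Variant 2: assumes the wanted number is a standalone run of exactly
// five digits; $postalCode stays null when nothing matches.
$postalCode = preg_match('/\b\d{5}\b/', $str, $m) ? $m[0] : null;

var_dump($parts, $postalCode);
```

Unlike array_filter() without a callback, PREG_SPLIT_NO_EMPTY only removes empty strings, so a piece such as "0" would be kept.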

split by special char and remove empty elements in php and javascript array

I have a string built by merging numbers, where each number has the & character before and after it.
Actual string: &1&&3&&5&
If you add 6 to this string, the final string will be &1&&3&&5&&6&
The problem is that when I try to get the numbers out of this string as an array, I end up with many empty elements that I don't need.
When I split with explode('&', $actualstr) the array is ["","1","","3","","5","","6",""] but I need ["1","3","5","6"].
I will do this many times, so I need the most efficient way.
There is a similar scenario in JS too; if there is a special way to do it there I'd like to know, otherwise a manual check is fine.
Remove the leading and trailing &, then explode by double &&.
$array = explode('&&',trim($str,'&'));
print_r($array);
Array
(
[0] => 1
[1] => 3
[2] => 5
[3] => 6
)
One quick fix is to use a regex, but only if you know the data you are working with and are 100% sure about its format:
preg_match_all("/[0-9]+/", "&1&&3&&5&&6&", $numbers);
var_dump($numbers);
array(1) {
[0]=>
array(4) {
[0]=>
string(1) "1"
[1]=>
string(1) "3"
[2]=>
string(1) "5"
[3]=>
string(1) "6"
}
}
Another way would be to use array_filter, if the data between the '&' characters is not suitable for matching by regex:
array_filter(explode("&", "&1&&3&&5&&6&"))
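One caveat with this approach: array_filter() without a callback removes every falsy value, which includes the string "0", and it preserves the original keys. The sketch below illustrates this with a hypothetical input that contains a 0, and shows a stricter variant that only drops empty strings.

```php
<?php
// Hypothetical input containing a "0" element.
$str = "&0&&1&&3&";

// Default array_filter(): drops "" but also "0"; keys are preserved.
$loose = array_filter(explode("&", $str));

// Stricter variant: drop only empty strings, then re-index from 0.
$strict = array_values(array_filter(explode("&", $str), function ($value) {
    return $value !== '';
}));

var_dump($loose, $strict);
```

If the numbers can never be 0, the shorter form is fine; otherwise the explicit callback is the safer choice.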
You can use the trim() function to remove a special character (or whitespace) from both ends of the string.
$str = "&1&&3&&5&&6&";
$str_clear = trim($str, '&');
$array = explode('&&',$str_clear);
print_r($array);
I don't know how you extract the numbers from this string, but if you do it like this you get an array of the numbers (in $matches[1]):
preg_match_all('/&([0-9]+)&/','&1&&3&&5&',$matches);
var_dump($matches);
In JavaScript you can do something like what @AlexAndrei has done in PHP:
var str='&1&&3&&5&';
var result=str.slice(1,-1).split('&&'); // drop the first and last '&', then split on '&&'
console.log(result);
I think this is what you're trying to do.
$str = "&1&&3&&5&&6&";
$numArray = preg_split('/&/', $str, -1, PREG_SPLIT_NO_EMPTY);
print_r($numArray);

PHP and RegEx: how to split a string including comma,space,colon to some substring

I'm trying to split a string that can be either comma, space or semi-colon delimited. It could also contain one or more spaces after each delimiter. For example
chr1:22222-333333 or
chr1 22222 333333 or
chr1 22222 333333 or
chr1:22,222-33,333
Any one of these should produce an array with three values ["chr1","22222","33333"]. I have tried some methods, but none of them is complete, especially for the fourth case.
Thank you very much for your help.
$yourString = "chr1:22222-33333"; // for instance
$output = preg_split("/:| |;|-/", $yourString);
This acts as an equivalent of explode() for when you want multiple delimiters.
Explanation of the characters in the preg_split pattern:
/ encloses the regular expression; it marks where the pattern starts and ends
| acts as an OR operator: this OR this OR that
So in the end, /:| |;|-/ means: match anything that is ":" or " " or ";" or "-"
If you want to practice or simply understand the principles of RegEx better, you can have a look at this nice collection of RegEx tutorials
you can use str_replace with explode
$str = array('chr1:22222-333333', 'chr1 22222 333333', 'chr1 22222 333333', 'chr1:22,222-33,333');
foreach($str as $val){
var_dump(explode(" ", str_replace(array(',',':','-'), array('',' ', ' '), $val)));
}
This removes every comma, replaces each : and - with a space, and then explodes the result using a space as the delimiter.
Demo
which produces
array(3) {
[0]=>
string(4) "chr1"
[1]=>
string(5) "22222"
[2]=>
string(6) "333333"
}
array(3) {
[0]=>
string(4) "chr1"
[1]=>
string(5) "22222"
[2]=>
string(6) "333333"
}
array(3) {
[0]=>
string(4) "chr1"
[1]=>
string(5) "22222"
[2]=>
string(6) "333333"
}
array(3) {
[0]=>
string(4) "chr1"
[1]=>
string(5) "22222"
[2]=>
string(5) "33333"
}
If you value conciseness and want to keep things neat, preg_split is the best way to go, in my opinion.
In the following examples, I assume you want your input separated by commas, spaces or colons:
$splitted = preg_split("/[,: ]/", $string);
If you want to treat tabs as whitespaces, you can replace the single space character with \s, which will match tabs as well:
$splitted = preg_split("/[,:\s]/", $string);
Note: \s will match newlines too, in case your input may eventually be a multiline string.
Yet, if you don't trust your input (you don't, right?) and think that consecutive spaces and/or tabs should be collapsed and treated as a single delimiter, you can add a + quantifier:
$splitted = preg_split("/[,:\s]+/", $string);
All the forms above work great provided the input you presented. If you want to play with these a little, this is a nice place to do so.
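To cover the fourth case from the question as well (chr1:22,222-33,333), the two ideas above can be combined. This is a sketch under stated assumptions: the helper name split_region is hypothetical, commas are assumed to appear only as thousands separators, and ":", ";", "-" and whitespace are assumed to be the delimiters.

```php
<?php
// Hypothetical helper: assumes commas only appear as thousands separators
// (as in "22,222") and that ":", ";", "-" and whitespace are the delimiters.
function split_region($input)
{
    // Remove thousands separators first so "22,222" becomes "22222".
    $withoutCommas = str_replace(',', '', $input);

    // Split on one or more delimiter characters and skip empty pieces.
    return preg_split('/[:;\-\s]+/', $withoutCommas, -1, PREG_SPLIT_NO_EMPTY);
}

$inputs = array('chr1:22222-333333', 'chr1   22222   333333', 'chr1:22,222-33,333');
foreach ($inputs as $input) {
    var_dump(split_region($input));
}
```

If commas can also act as delimiters in your data, this assumption breaks and the two roles of the comma would need to be told apart some other way.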

PHP Explode comma separated values with date

I'm using explode to parse a string of comma separated values into variables. No problem there. The issue I'm having is that one of the values is a date in the format: May 3, 2013. So the explode is picking up on the comma in the date. Do I have any options for getting around this? I don't have much control over the source (the original string) so I'm trying to come up with a way to work with what I've got.
$CONTENT = 'blue,red,purple,May 2, 2013,orange,green';
list($valueA, $valueB, $valueC, $valueD, $valueE, $valueF) = explode(',', $CONTENT);
Thank you!
You can use a regex to split your string. This is based on the assumption that a comma used as a separator is never followed by whitespace.
$CONTENT = 'blue,red,purple,May 2, 2013,orange,green';
$result = preg_split('/,(?! )/', $CONTENT);
Your string is then split correctly into:
array(6) {
[0]=>
string(4) "blue"
[1]=>
string(3) "red"
[2]=>
string(6) "purple"
[3]=>
string(11) "May 2, 2013"
[4]=>
string(6) "orange"
[5]=>
string(5) "green"
}
So when you use your list() expression again, your variables should be set correctly:
list($valueA, $valueB, $valueC, $valueD, $valueE, $valueF) = preg_split('/,(?! )/', $CONTENT);
Escape the comma by finding the value in the array and replacing the comma with a special symbol, then after the explode replace the special symbol with a comma.
The only thing I can think of is to use a regular expression to look for something matching the pattern of a date, replace the comma with something else, then after you explode the string, replace that special character with a comma again.
It's messy, but if you genuinely have no control over the string coming in, there's probably not a lot you can do.
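The placeholder idea described in the two answers above could be sketched as follows. The assumptions are mine, not the answerers': dates are assumed to look like "May 2, 2013" (capitalised month name, day, four-digit year), and the ASCII unit separator character is assumed never to occur in the real data.

```php
<?php
$content = 'blue,red,purple,May 2, 2013,orange,green';

// Assumption: this character never occurs in the real data.
$placeholder = "\x1F";

// Assumption: dates look like "May 2, 2013". Replace the comma inside
// such a date with the placeholder so explode() will not split on it.
$protected = preg_replace(
    '/\b([A-Z][a-z]+ \d{1,2}), (\d{4})\b/',
    '${1}' . $placeholder . ' ${2}',
    $content
);

// Split on the remaining commas, then restore the protected comma.
$values = array();
foreach (explode(',', $protected) as $value) {
    $values[] = str_replace($placeholder, ',', $value);
}

var_dump($values);
```

The result can be passed to list() exactly like the output of explode() in the question.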

Regex with multiple newlines in sequence

I'm trying to use PHP's split() (preg_split() is also an option if your answer works with it) to split up a string on 2 or more \r\n's. My current effort is:
split("(\r\n){2,}",$nb);
The problem with this is it matches every time there is 2 or 3 \r\n's, then goes on and finds the next one. This is ineffective with 4 or more \r\n's.
I need all instances of two or more \r\n's to be treated the same as two \r\n's. For example, I'd need
Hello\r\n\r\nMy\r\n\r\n\r\n\r\n\r\n\r\nName is\r\nShadow
to become
array('Hello','My','Name is\r\nShadow');
preg_split() should do it with
$pattern = "/(\\r\\n){2,}/";
What about the following suggestion:
$nb = implode("\r\n", array_filter(explode("\r\n", $nb)));
It works for me. With preg_split() (split() was removed in PHP 7.0):
$nb = "Hello\r\n\r\nMy\r\n\r\n\r\n\r\n\r\n\r\nName is\r\nShadow";
$parts = preg_split("/(\r\n){2,}/",$nb);
var_dump($parts);
var_dump($parts === array('Hello','My',"Name is\r\nShadow"));
Prints:
array(3) {
[0]=>
string(5) "Hello"
[1]=>
string(2) "My"
[2]=>
string(15) "Name is
Shadow"
}
bool(true)
Note the double quotes in the second test to get the characters represented by \r\n.
Adding the PREG_SPLIT_NO_EMPTY flag to preg_split() with Tomalak's pattern of "/(\\r\\n){2,}/" accomplished this for me.
\R is shorthand for matching newline sequences across different operating systems. You can prevent empty elements being created at the start and end of your output array by using the PREG_SPLIT_NO_EMPTY flag or you could call trim() on the string before splitting.
Code: (Demo)
$string = "\r\n\r\nHello\r\n\r\nMy\r\n\r\n\r\n\r\n\r\n\r\nName is\r\nShadow\r\n\r\n\r\n\r\n";
var_export(preg_split('~\R{2,}~', $string, 0, PREG_SPLIT_NO_EMPTY));
echo "\n---\n";
var_export(preg_split('~\R{2,}~', trim($string)));
Output from either technique:
array (
0 => 'Hello',
1 => 'My',
2 => 'Name is
Shadow',
)
