Split string in php with comma and new line - php

I'm trying to split a string in PHP using two delimiters: newline and comma. My code is:
$array = preg_split("/\n|,/", $str);
But the string is split on commas only, not on \n. Why is that? Also, do I have to take the "\r\n" sequence into account?

I can think of two possible reasons that this is happening.
1. You are using a single quoted string:
$array = preg_split("/\n|,/", 'foo,bar\nbaz');
print_r($array);
Array
(
[0] => foo
[1] => bar\nbaz
)
If so, use double quotes (") instead, so that PHP interprets \n as a newline character:
$array = preg_split("/\n|,/", "foo,bar\nbaz");
print_r($array);
Array
(
[0] => foo
[1] => bar
[2] => baz
)
2. Your input contains more than one kind of newline sequence. If so, I would recommend using \R, which matches any newline sequence, including \n, \r, and \r\n (the latter as a single line break):
$array = preg_split('/\R|,/', "foo,bar\nbaz\r\nquz");
print_r($array);
Array
(
[0] => foo
[1] => bar
[2] => baz
[3] => quz
)
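Real input often has spaces after commas, blank lines, or a trailing delimiter as well. Here is a minimal sketch of one way to tidy that up, assuming you want surrounding whitespace trimmed and empty pieces dropped (the sample string and the $pieces variable are made up for illustration):

```php
<?php
// Sample input with mixed newline styles, a space after a comma,
// a blank line, and a trailing comma (illustrative data only).
$str = "foo, bar\r\nbaz\n\nquz,";

// Split on any newline sequence (\R) or a comma.
$pieces = preg_split('/\R|,/', $str);

// Trim whitespace around each piece, then drop the empty ones.
$pieces = array_map('trim', $pieces);
$pieces = array_values(array_filter($pieces, function (string $piece): bool {
    return $piece !== '';
}));

print_r($pieces);
```

If empty pieces are meaningful in your data, skip the array_filter step.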

Related

PHP Parse custom characters inside a string

I need help parsing the characters inside these brackets:
[]
{}
<>
{|}
<|>
For example, I have this string variable (Japanese):
$question = "この<部屋|へや>[に]{椅子|いす}[が]ありません";
Expected result in HTML:
Description
1) This is a particle. I will convert every word inside [] into an HTML tag. Example: [に] will be converted into <span style="color:blue">に</span>. A full sentence can have multiple []. Note: I understand how to use str_replace.
2 and 4) This is a normal kanji word which will be used as a question to the user. A full sentence can only have one {}.
3 and 5) This is normal kanji text. A full sentence can have multiple <>.
2, 3, 4, and 5) They will be converted into ruby HTML tags. The | separator is not mandatory, so sometimes it will be missing. From what I understand, I just need to explode on the | character. If | does not exist, I will use the original value. Note: I understand how to use ruby tags (rb and rt).
My question
How do I parse items 1-5 mentioned above with PHP? What keyword do I need to get started?
Thanks.
Thanks to the question "Capturing text between square brackets in PHP", I now have my own answer.
Full code:
<?php
$text = "この<部屋|へや>[に]{椅子|いす}[が]ありません";
preg_match_all("/\[([^\]]*)\]/", $text, $square_brackets); // text inside []
preg_match_all("/{([^}]*)}/", $text, $curly_brackets); // text inside {}
preg_match_all("/<([^>]*)>/", $text, $angle_brackets); // text inside <>
print_r($square_brackets);
echo "\r\n";
print_r($curly_brackets);
echo "\r\n";
print_r($angle_brackets);
echo "\r\n";
Result:
Array
(
[0] => Array
(
[0] => [に]
[1] => [が]
)
[1] => Array
(
[0] => に
[1] => が
)
)
Array
(
[0] => Array
(
[0] => {椅子|いす}
)
[1] => Array
(
[0] => 椅子|いす
)
)
Array
(
[0] => Array
(
[0] => <部屋|へや>
)
[1] => Array
(
[0] => 部屋|へや
)
)
Thanks.
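To go from the captured groups to the HTML described in the question, something like the following might work. It is only a sketch: toRuby() is a hypothetical helper, {} and <> are rendered identically because the question does not say how the question word should differ, and the inner text is not HTML-escaped.

```php
<?php
// Hypothetical helper: turn "base|reading" into a ruby element.
// Without a "|" the text is returned unchanged.
function toRuby(string $inner): string
{
    $parts = explode('|', $inner, 2);
    if (count($parts) < 2) {
        return $parts[0];
    }
    return '<ruby><rb>' . $parts[0] . '</rb><rt>' . $parts[1] . '</rt></ruby>';
}

$text = "この<部屋|へや>[に]{椅子|いす}[が]ありません";

// Match any of the three bracket types in one pass.
$html = preg_replace_callback(
    '/\[[^\]]*\]|\{[^}]*\}|<[^>]*>/',
    function (array $m): string {
        $inner = substr($m[0], 1, -1); // strip the opening and closing bracket
        if ($m[0][0] === '[') {
            // Particle: wrap in a coloured span, as in the question.
            return '<span style="color:blue">' . $inner . '</span>';
        }
        // {} and <> are both treated as ruby text (an assumption).
        return toRuby($inner);
    },
    $text
);

echo $html, "\n";
```

Doing everything in a single preg_replace_callback pass means the < and > of the generated tags are never re-scanned as angle brackets.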

str_getcsv not parsing the data correctly

I have a problem with PHP's str_getcsv function.
I have this code:
<?php
$string = '#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=714000,RESOLUTION=640x480,CODECS="avc1.77.30, mp4a.40.34"';
$array = str_getcsv($string, ",", '"');
print_r($array);
Which should return:
Array
(
[0] => #EXT-X-STREAM-INF:PROGRAM-ID=1
[1] => BANDWIDTH=714000
[2] => RESOLUTION=640x480
[3] => CODECS=avc1.77.30, mp4a.40.34
)
But instead, it is returning:
Array
(
[0] => #EXT-X-STREAM-INF:PROGRAM-ID=1
[1] => BANDWIDTH=714000
[2] => RESOLUTION=640x480
[3] => CODECS="avc1.77.30
[4] => mp4a.40.34"
)
It ignores the enclosure around the last parameter, CODECS, and splits that value as well. I'm using str_getcsv instead of explode(",", $string) precisely because it is supposed to respect the enclosure, but it behaves the same way explode would.
The code being executed: http://eval.in/17471
The enclosure (third) parameter does not have quite that effect. The enclosure character is treated as such only when it appears next to the delimiter.
To get your desired output, the input would need to be
#EXT-X-STREAM-INF:PROGRAM-ID=1,...,"CODECS=avc1.77.30, mp4a.40.34"
See it in action.
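If you cannot change the input, one possible workaround (not str_getcsv itself, but a regex swapped in for it) is to split only on commas that sit outside double quotes. This sketch assumes the quotes are balanced and never escaped:

```php
<?php
$string = '#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=714000,RESOLUTION=640x480,CODECS="avc1.77.30, mp4a.40.34"';

// Split on a comma only when an even number of double quotes follows it,
// which means the comma is not inside a quoted value.
$fields = preg_split('/,(?=(?:[^"]*"[^"]*")*[^"]*$)/', $string);

// Remove the quote characters to match the output the question expected.
$fields = str_replace('"', '', $fields);

print_r($fields);
```

The lookahead rescans the rest of the string for every comma, so this is fine for short lines like this one but may be slow on very long input.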

Variable inside string

Let's say I have a string "My name is $name and my pet is $animal".
How can I check whether the string has variables inside it? And if it does, add them to an array like
$array = array('$name', '$animal');
Would it be some kind of preg_match? Every $ followed by text needs to be extracted, while a $ followed by a space should be left alone.
Any ideas?
You can use regular expressions for this. The following will match any dollar sign followed by 1 or more word characters (letters, numbers, or underscore):
preg_match_all('/\$(\w+)/', $string, $matches);
$matches:
Array
(
[0] => Array
(
[0] => $name
[1] => $animal
)
[1] => Array
(
[0] => name
[1] => animal
)
)
Remember that $string, if hardcoded, must be wrapped in single quotes ('); inside double quotes PHP would interpolate $name and $animal before the regex ever sees them.
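Here is a small self-contained version of the above, with a lone "$ " added to the sample string to show that it is left alone (the extra text and the $placeholders/$names variables are only for illustration):

```php
<?php
// Single quotes keep $name and $animal as literal text.
$string = 'My name is $name and my pet is $animal, price: $ 5';

// A dollar sign followed by one or more word characters.
preg_match_all('/\$(\w+)/', $string, $matches);

$placeholders = $matches[0]; // full matches, including the dollar sign
$names = $matches[1];        // captured names without the dollar sign

print_r($placeholders);
print_r($names);
```

The "$ 5" part is skipped because \w+ needs at least one word character directly after the dollar sign.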

return empty string from preg_split

Right now I'm trying to get this:
Array
(
[0] => hello
[1] =>
[2] => goodbye
)
Where index 1 is the empty string.
$toBeSplit= 'hello,,goodbye';
$textSplitted = preg_split('/[,]+/', $toBeSplit, -1);
$textSplitted looks like this:
Array
(
[0] => hello
[1] => goodbye
)
I'm using PHP 5.3.2
[,]+ matches one or more consecutive commas, as many as possible, so ",," is treated as a single separator. Use just /,/ and it works:
$textSplitted = preg_split('/,/', $toBeSplit, -1);
But you don't even need a regular expression:
$textSplitted = explode(',', $toBeSplit);
How about this:
$textSplitted = preg_split('/,/', $toBeSplit, -1);
Your split regex was grabbing all the commas, not just one.
Your pattern splits the text using a run of commas as the separator (the character class is also unnecessary for a single character), so two (or two hundred) commas count as just one.
Anyway, since you're just using a literal character as the separator, use explode():
$str = 'hello,,goodbye';
print_r(explode(',', $str));
output:
Array
(
[0] => hello
[1] =>
[2] => goodbye
)
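To see the variants from the answers side by side, here is a quick comparison (the variable names are made up for this example):

```php
<?php
$toBeSplit = 'hello,,goodbye';

// One comma per separator: the empty string between the commas is kept.
$keepEmpty = preg_split('/,/', $toBeSplit);

// A run of commas as one separator: the empty string disappears.
$collapsed = preg_split('/,+/', $toBeSplit);

// Single-comma separator, but with empty pieces filtered out explicitly.
$noEmpty = preg_split('/,/', $toBeSplit, -1, PREG_SPLIT_NO_EMPTY);

// Plain explode() behaves like the first variant.
$exploded = explode(',', $toBeSplit);

print_r($keepEmpty);
print_r($collapsed);
print_r($noEmpty);
print_r($exploded);
```

So the empty element is only lost when the pattern swallows several commas at once or when PREG_SPLIT_NO_EMPTY is passed.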

Regex for splitting on all unescaped semi-colons

I'm using PHP's preg_split to split up a string on semi-colons, but I need it to split only on non-escaped semi-colons.
<?php
$str = "abc;def\\;abc;def";
$arr = preg_split("/;/", $str);
print_r($arr);
?>
Produces:
Array
(
[0] => abc
[1] => def\
[2] => abc
[3] => def
)
When I want it to produce:
Array
(
[0] => abc
[1] => def\;abc
[2] => def
)
I've tried "/(^\\)?;/" or "/[^\\]?;/" but they both produce errors. Any ideas?
This works.
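<?php
$str = "abc;def\;abc;def";
// (?<!\\) is a negative lookbehind: match ";" only when no backslash precedes it.
// Inside a single-quoted PHP string, four backslashes produce the two the regex needs.
$arr = preg_split('/(?<!\\\\);/', $str);
print_r($arr);
?>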
It outputs:
Array
(
[0] => abc
[1] => def\;abc
[2] => def
)
You need to make use of a negative lookbehind (read about lookarounds). Think of it as "match every ';' unless it is preceded by a '\'".
I am not really proficient with PHP regexes, but try this one:
/(?<!\\);/
Since Bart asks: Of course you can also use regex to split on unescaped ; and take escaped escape characters into account. It just gets a bit messy:
<?php
$str = "abc;def\;abc\\\\;def";
preg_match_all('/((?:[^\\\\;]|\\\.)*)(?:;|$)/', $str, $arr);
print_r($arr);
?>
Array
(
[0] => Array
(
[0] => abc;
[1] => def\;abc\\;
[2] => def
)
[1] => Array
(
[0] => abc
[1] => def\;abc\\
[2] => def
)
)
What this does is to take a regular expression for “(any character except \ and ;) or (\ followed by any character)” and allow any number of those, followed by a ; or the end of the string.
I'm not sure how PHP handles $ and end-of-line characters within a string; you may need to set some regex modifiers (such as m or D) to get exactly the behavior you want.
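If the regex gets too messy, a plain character scan is an alternative that also copes with escaped backslashes. This is only a sketch: splitUnescaped() is a hypothetical helper, not something from the answers above, and it keeps the escape characters in the output just like the regex versions do.

```php
<?php
// Hypothetical helper: split $input on $sep unless the separator is
// escaped with $esc. An escaped escape character (\\) is handled too.
function splitUnescaped(string $input, string $sep = ';', string $esc = '\\'): array
{
    $parts = [];
    $current = '';
    $length = strlen($input);

    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        if ($char === $esc && $i + 1 < $length) {
            // Copy the escape character and the character it protects together.
            $current .= $char . $input[$i + 1];
            $i++;
        } elseif ($char === $sep) {
            // Unescaped separator: close the current piece.
            $parts[] = $current;
            $current = '';
        } else {
            $current .= $char;
        }
    }
    $parts[] = $current; // the last piece has no trailing separator

    return $parts;
}

// Single-quoted, so the string is: abc;def\;abc\\;def
$result = splitUnescaped('abc;def\;abc\\\\;def');
print_r($result);
```

Scanning byte by byte is safe for UTF-8 input here because ";" and "\" are single-byte ASCII characters that never occur inside a multi-byte sequence.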
