Use explode in a smarter way - php

I have the following string:
$string = "This is my string, that I would like to explode. But not\, this last part";
I want to explode(',', $string) the string, but explode() should not explode when there is a \ in front of the comma.
Wanted result:
array(2) {
[0] => This is my string
[1] => that I would like to explode. But not , this last part
}

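I would use preg_split():
$result = preg_split('/(?<!\\\\),/', $string);
print_r($result);
(?<!\\\\) is a negative lookbehind, so the pattern matches a , that is not preceded by a \. Four backslashes are needed to represent a single literal \: the PHP string literal reduces \\\\ to \\, and the regex engine then reads \\ as one escaped backslash.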
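The split keeps the escaping backslash in the second element, while the wanted result in the question shows it removed. A minimal sketch of one way to finish the job, assuming the backslash is only ever used to escape a comma (the clean-up step is an addition, not part of the answer above):

```php
<?php
// Sample input from the question; the single-quoted literal keeps the backslash as-is.
$string = 'This is my string, that I would like to explode. But not\, this last part';

// Split on commas that are not preceded by a backslash.
$parts = preg_split('/(?<!\\\\),/', $string);

// Turn each escaped "\," back into a plain comma and trim surrounding spaces.
$parts = array_map(function ($part) {
    return trim(str_replace('\\,', ',', $part));
}, $parts);

print_r($parts);
```

The trim() call is optional; it only removes the space that follows the delimiter comma.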

Related

explode string at "newline,space,newline,space" in PHP

This is the string I'm trying to explode. This string is part of a paragraph which I need to split at every "newline,space,newline,space":
s
1
A result from textmagic.com shows that it contains a \n, then a space, then a \n, and then a space.
This is what I tried:
$values = explode("\n\s\n\s",$string); // 1
$values = explode("\n \n ",$string); // 2
$values = explode("\n\r\n\r",$string); // 3
Desired output:
Array (
[0] => s
[1] => 1
)
but none of them worked. What's wrong here?
How do I do it?
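Just use explode() with PHP_EOL." ".PHP_EOL." ", which is of the format "newline, space, newline, space". PHP_EOL is the newline sequence of the platform PHP is running on ("\n" on Linux and macOS, "\r\n" on Windows), so this works when the string uses that same line ending.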
$split = explode(PHP_EOL." ".PHP_EOL." ", $string);
print_r($split);
Live demo at https://3v4l.org/WpYrJ
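If the text may have been written on a different platform than the one running the script, PHP_EOL might not match its line endings. A hedged alternative, not part of the answer above, is preg_split() with the PCRE escape \R, which matches any line break sequence including \n, \r\n and \r. The sample strings below are made up for illustration:

```php
<?php
// Same content with Unix and Windows line endings (hypothetical sample data).
$unix    = "s\n \n 1";
$windows = "s\r\n \r\n 1";

// \R matches \n, \r\n or \r, so one pattern covers both inputs.
$fromUnix    = preg_split('/\R \R /', $unix);
$fromWindows = preg_split('/\R \R /', $windows);

print_r($fromUnix);
print_r($fromWindows);
```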
Using preg_split() to explode() by multiple delimiters in PHP
Just a quick note here. To explode() a string using multiple delimiters in PHP, you have to use regular expressions. Separate the delimiters with the pipe character.
$string = "\n\ranystring";
$chunks = preg_split('/(del1|del2|del3)/', $string, -1, PREG_SPLIT_NO_EMPTY);
// Print_r to check response output.
echo '<pre>';
print_r($chunks);
echo '</pre>';
PREG_SPLIT_NO_EMPTY – To return only non-empty pieces.
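When the delimiters contain characters that are special in a regex (such as | or .), they have to be escaped before being joined with the pipe. A small sketch of that idea using preg_quote(); the delimiter list and input string are invented for the example:

```php
<?php
// Hypothetical delimiters; note that "|" is itself a regex metacharacter.
$delimiters = [', ', ';', '|'];

// Escape each delimiter, then join them into one alternation.
$escaped = array_map(function ($delimiter) {
    return preg_quote($delimiter, '/');
}, $delimiters);
$pattern = '/' . implode('|', $escaped) . '/';

$chunks = preg_split($pattern, 'one, two;three|four', -1, PREG_SPLIT_NO_EMPTY);
print_r($chunks);
```

Passing '/' as the second argument to preg_quote() also escapes the pattern delimiter itself, should it appear in one of the strings.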

Explode the string to array in php

I have a string
string(22) ""words,one","words2""
and need to explode to an array having structure
array( [0] => words,one ,[1] => words2)
To continue on the explode option you mentioned trying, you could try the following:
$str = '"words,one","words2"';
$arr = explode('","', trim($str, '"'));
print_r($arr);
Notice the trim to remove the beginning and ending quote marks, while explode uses the inner quote marks as part of the delimiter.
Output
Array
(
[0] => words,one
[1] => words2
)
I assume the doubled "" in your string is a typo for an escaped quote (\") or for a string wrapped in single quotes ('"...').
I use regex to capture what is inside of " with (.*?) where the ? means be lazy.
I escape the " with \" to make it read them literal.
You will have your words in $m[1].
$str = '"words,one","words2"';
preg_match_all("/\"(.*?)\"/", $str, $m);
var_dump($m);
https://3v4l.org/G4m4f
In case that is not a typo you can use this:
preg_match_all("/\"+(.*?)\"+/", $str, $m);
Here I add a + to each of the " which means "there can be more than one"
Using preg_split you can try :
$str = '"words,one","words2"';
$matches = preg_split("/\",\"/", trim($str, '"'));
print_r($matches);
check : https://eval.in/945572
Assuming the input string can be broken down as follows:
The surrounding double-quotes are always present and consist of one double-quote each.
"words,one","words2" is left after removing the surrounding double-quotes.
We can extract a csv formatted string that fgetcsv can parse.
Trimming the original and wrapping it in a stream allows us to use fgetcsv. See sample code on eval.in
$fullString= '""words,one","words2""';
$innerString = substr($fullString, 1, -1);
$tempFileHandle = fopen("php://memory", 'r+');
fputs($tempFileHandle , $innerString);
rewind($tempFileHandle);
$explodedString = fgetcsv($tempFileHandle, 0, ',', '"');
fclose($tempFileHandle);
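As a variation on the same idea, str_getcsv() parses a CSV string directly, so the temporary stream is not needed. This is a sketch under the same assumptions about the input (exactly one surrounding double-quote on each side); the variable names are illustrative:

```php
<?php
$fullString = '""words,one","words2""';

// Drop the single surrounding double-quote on each side.
$innerString = substr($fullString, 1, -1);

// Parse the remaining CSV-formatted string: "," separates fields, '"' encloses them.
$fields = str_getcsv($innerString, ',', '"', '\\');

print_r($fields);
```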
This method also supports other similarly formatted strings:
""words,one","words2""
""words,one","words2","words3","words,4""

How can I extract or preg_replace chinese characters in a string?

I currently have a list of strings like this
蘋果,香蕉,橙。
榴蓮, 啤梨
鳳爪,排骨,雞排
24個男,2個女,30個老人
What I want to do is extract all Chinese and alphanumeric characters from these strings.
How can I replace all special characters like , , 。 / " and spaces with - or _
then extract all chinese character with explode() like $str = explode("-",$str); or $str = explode("_",$str); ?
I currently have a RegEx like this
if(/^\S[\u0391-\uFFE5 \w]+\S$/.test(value)).....
And I modified it into
$str = preg_replace("/^\S[\x{0391}-\x{FFE5} \w]+\s+\S$/u", "-", $str);
but it seems it didn't work...
The online example: https://www.regex101.com/r/qR8aA6/1
EDIT: my expected output (for the first string):
First it should be replaced with
蘋果-香蕉-橙- or 蘋果_香蕉_橙_
then I can use $str = explode("-",$str); to make them finally become:
Array
(
[0] => 蘋果
[1] => 香蕉
[2] => 橙
)
Seems like you want something like this,
$txt = <<<EOT
蘋果,香蕉,橙。
榴蓮, 啤梨
鳳爪,排骨,雞排
24個男,2個女,30個老人
EOT;
echo preg_replace('~[^\p{L}\p{N}\n]+~u', '-', $txt);
Output:
蘋果-香蕉-橙-
榴蓮-啤梨
鳳爪-排骨-雞排
24個男-2個女-30個老人
Explanation:
\p{L} Matches any kind of letter from any language.
\p{N} matches any kind of numeric character in any script.
\n Matches a newline character.
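Putting all of these inside a negated character class [^...] does the opposite: it matches any run of characters that are not letters, numbers or newlines, and each such run is replaced with a single -.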
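Since the end goal in the question is an array, the replace-then-explode step can probably be skipped by splitting on the same negated class directly. A minimal sketch, assuming each line is processed separately and the script is saved as UTF-8:

```php
<?php
// Two of the sample lines from the question.
$lines = ['蘋果,香蕉,橙。', '24個男,2個女,30個老人'];

$result = [];
foreach ($lines as $line) {
    // Split on every run of characters that are neither letters nor numbers;
    // PREG_SPLIT_NO_EMPTY drops the empty piece left by trailing punctuation.
    $result[] = preg_split('/[^\p{L}\p{N}]+/u', $line, -1, PREG_SPLIT_NO_EMPTY);
}

print_r($result);
```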

Explode a paragraph into sentences in PHP

I have been using
explode(".",$mystring)
to split a paragraph into sentences. However this doesn't cover sentences that have been concluded with different punctuation such as ! ? : ;
Is there a way of using an array as a delimiter instead of a single character? Alternatively, is there another neat way of splitting using various punctuation?
I tried
explode(("." || "?" || "!"),$mystring)
hopefully but it didn't work...
You can use preg_split() combined with a PCRE lookbehind assertion to split the string on the whitespace after each occurrence of ., ;, :, ? or !, while keeping the actual punctuation intact:
Code:
$subject = 'abc sdfs. def ghi; this is an.email#addre.ss! asdasdasd? abc xyz';
// split on whitespace between sentences preceded by a punctuation mark
$result = preg_split('/(?<=[.?!;:])\s+/', $subject, -1, PREG_SPLIT_NO_EMPTY);
print_r($result);
Result:
Array
(
[0] => abc sdfs.
[1] => def ghi;
[2] => this is an.email#addre.ss!
[3] => asdasdasd?
[4] => abc xyz
)
You can also add a blacklist of abbreviations (Mr., Mrs., Dr., ...) that should not end a sentence by inserting a negative lookbehind assertion:
$subject = 'abc sdfs. Dr. Foo said he is not a sentence; asdasdasd? abc xyz';
// split on whitespace after a punctuation mark, unless that mark ends a listed abbreviation
$result = preg_split('/(?<!Mr\.|Mrs\.|Dr\.)(?<=[.?!;:])\s+/', $subject, -1, PREG_SPLIT_NO_EMPTY);
print_r($result);
Result:
Array
(
[0] => abc sdfs.
[1] => Dr. Foo said he is not a sentence;
[2] => asdasdasd?
[3] => abc xyz
)
You can do:
preg_split('/\.|\?|!/',$mystring);
or (simpler):
preg_split('/[.?!]/',$mystring);
Assuming that you actually want the punctuation marks in the end result, have you tried:
$mystring = str_replace("?","?---",str_replace(".",".---",str_replace("!","!---",$mystring)));
$tmp = explode("---",$mystring);
Which would leave your punctuation marks intact.
preg_split('/\s+|[.?!]/',$string);
A possible problem might be if there is an email address as it could split it onto a new line half way through.
Use preg_split and give it a regex like [.?!] to split on. Inside a character class, . and ? need no escaping, and | is not an alternation operator there; it would match a literal pipe.
You can't have multiple delimiters for explode. That's what preg_split() is for. But even then, it explodes at the delimiter, so you will get sentences returned without the punctuation marks.
You can take preg_split a step further and flag it to return the delimiters in their own elements with PREG_SPLIT_DELIM_CAPTURE, then loop over the returned array to join each sentence with the punctuation mark that follows it, or just use preg_match_all():
preg_match_all('~.*?[?.!]~s', $string, $sentences);
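The PREG_SPLIT_DELIM_CAPTURE route mentioned above is described but not shown. One possible sketch of it follows; the sample text and variable names are made up, and the loop is only one way to re-join the pieces:

```php
<?php
// Hypothetical sample text.
$text = 'Hello world. How are you? Fine!';

// The capturing group makes preg_split() return each punctuation mark
// as its own element, alternating: text, mark, text, mark, ...
$parts = preg_split('/([.?!])/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

$sentences = [];
for ($i = 0; $i < count($parts); $i += 2) {
    // Re-attach the mark that follows this piece, if there is one.
    $sentence = trim($parts[$i] . ($parts[$i + 1] ?? ''));
    if ($sentence !== '') {
        $sentences[] = $sentence;
    }
}

print_r($sentences);
```

PREG_SPLIT_NO_EMPTY is deliberately left out here: dropping empty pieces would break the text/mark alternation that the loop relies on.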
$mylist = preg_split("/[.?!:;]/", $mystring);
You can try preg_split
$sentences = preg_split("/[.?!:;]+/", $mystring);
Please note this will remove the punctuation. If you would like to strip out the whitespace that follows each mark as well:
$sentences = preg_split("/[.?!:;]+\s*/", $mystring);

Splitting a string on multiple separators in PHP

I can split a string with a comma using preg_split, like
$words = preg_split('/[,]/', $string);
How can I use a dot, a space and a semicolon to split string with any of these?
PS. I couldn't find any relevant example on the PHP preg_split page, that's why I am asking.
Try this:
<?php
$string = "foo bar baz; boo, bat";
$words = preg_split('/[,.\s;]+/', $string);
var_dump($words);
// -> ["foo", "bar", "baz", "boo", "bat"]
The Pattern explained
[] is a character class. A character class lists several characters and matches any one of them.
. matches the . character. It does not need to be escaped inside a character class, though it must be escaped outside one, because there . means "match any character".
\s matches whitespace
; splits on the semicolon. It does not need to be escaped, because it has no special meaning.
The + at the end treats a run of consecutive delimiters (such as "; ") as a single separator, so no empty strings show up in the result.
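The effect of the + quantifier is easiest to see by running the same character class with and without it. A short illustration with a made-up input:

```php
<?php
$string = "baz; boo";

// Without +, the ";" and the space are two separate delimiters,
// which leaves an empty string between them.
$without = preg_split('/[,.\s;]/', $string);

// With +, the run "; " counts as one delimiter.
$with = preg_split('/[,.\s;]+/', $string);

var_dump($without, $with);
```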
The examples are there, not literally perhaps, but a split with multiple options for delimiter
$words = preg_split('/[ ;.,]/', $string);
something like this?
<?php
$string = "blsdk.bldf,las;kbdl aksm,alskbdklasd";
$words = preg_split('/[,\ \.;]/', $string);
print_r( $words );
result:
Array
(
[0] => blsdk
[1] => bldf
[2] => las
[3] => kbdl
[4] => aksm
[5] => alskbdklasd
)
$words = preg_split('/[\,\.\ ]/', $string);
just add these chars to your expression
$words = preg_split('/[;,. ]/', $string);
EDIT: thanks to Igoris Azanovas, escaping dot in character class is not needed ;)
$words = preg_split('/[,\.\s;]/', $string);
