How do you split a string into word pairs? - php

I am trying to split a string into an array of word pairs in PHP. So for example if you have the input string:
"split this string into word pairs please"
the output array should look like
Array (
[0] => split this
[1] => this string
[2] => string into
[3] => into word
[4] => word pairs
[5] => pairs please
[6] => please
)
some failed attempts include:
$array = preg_split('/\w+\s+\w+/', $string);
which gives me an empty array, and
preg_match('/\w+\s+\w+/', $string, $array);
which splits the string into word pairs but doesn't repeat the word. Is there an easy way to do this? Thanks.

Why not just use explode ?
$str = "split this string into word pairs please";
$arr = explode(' ',$str);
$result = array();
for($i=0;$i<count($arr)-1;$i++) {
$result[] = $arr[$i].' '.$arr[$i+1];
}
$result[] = $arr[$i];

If you want to repeat with a regular expression, you'll need some sort of look-ahead or look-behind. Otherwise, the expression will not match the same word multiple times:
$s = "split this string into word pairs please";
preg_match_all('/(\w+) (?=(\w+))/', $s, $matches, PREG_SET_ORDER);
$a = array_map(
function($a)
{
return $a[1].' '.$a[2];
},
$matches
);
var_dump($a);
Output:
array(6) {
[0]=>
string(10) "split this"
[1]=>
string(11) "this string"
[2]=>
string(11) "string into"
[3]=>
string(9) "into word"
[4]=>
string(10) "word pairs"
[5]=>
string(12) "pairs please"
}
Note that it does not repeat the last word "please" as you requested, although I'm not sure why you would want that behavior.
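To see why the look-ahead matters, here is a minimal sketch comparing the two patterns on the question's string. Appending the final lone word is an assumption based on the asker's sample output, and the variable names are only illustrative.

```php
<?php
$s = "split this string into word pairs please";

// Plain pattern: each match consumes both words, so the next match
// starts after the second word and the pairs never overlap.
preg_match_all('/\w+\s+\w+/', $s, $plain);

// Look-ahead pattern: only the first word and the whitespace are consumed;
// the second word is captured inside (?=...) and stays available.
preg_match_all('/(\w+)\s+(?=(\w+))/', $s, $sets, PREG_SET_ORDER);
$pairs = array_map(function ($m) { return $m[1] . ' ' . $m[2]; }, $sets);

// Assumption: the asker's sample output ends with the last word on its own.
$words = preg_split('/\s+/', trim($s));
$pairs[] = end($words);

print_r($plain[0]);
print_r($pairs);
```

The first print_r() shows only the non-overlapping pairs, while the second shows every adjacent pair followed by the last word.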

You could explode the string and then loop through it:
$str = "split this string into word pairs please";
$strSplit = explode(' ', $str);
$final = array();
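for($i=0, $j=0; $i<count($strSplit); $i++, $j++)
{
// The last word has no successor, so it is stored on its own.
$final[$j] = isset($strSplit[$i+1]) ? $strSplit[$i] . ' ' . $strSplit[$i+1] : $strSplit[$i];
}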
I think this works, but there should be a way easier solution.
Edited to make it conform to OP's spec. - as per codaddict

$s = "split this string into word pairs please";
$b1 = $b2 = explode(' ', $s);
array_shift($b2);
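// array_map() pads the shorter array with null, so rtrim() drops the trailing space after the last word.
$r = array_map(function($a, $b) { return rtrim("$a $b"); }, $b1, $b2);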
print_r($r);
gives:
Array
(
[0] => split this
[1] => this string
[2] => string into
[3] => into word
[4] => word pairs
[5] => pairs please
[6] => please
)

Related

Converting comma-separated string of single-quoted numbers to int array

I have a string:
'24','27','38'
I want to convert it:
(
[0] => 24
[1] => 27
[2] => 38
)
The conversion: https://3v4l.org/oDPDl
array_map('intval', explode(',', $string))
gives:
Array
(
[0] => 0
[1] => 0
[2] => 0
)
Basically, array_map() works when the numbers aren't individually quoted, as in '24,27,38', but I need a technique that works with quoted numbers.
One solution is looping over the array, but I don't want to do that. Can I achieve the above using only php functions (not control structures -- e.g. foreach())?
Use the following approach:
$str = "'24','27','38'";
$result = array_map(function($v){ return (int) trim($v, "'"); }, explode(",", $str));
var_dump($result);
The output:
array(3) {
[0]=>
int(24)
[1]=>
int(27)
[2]=>
int(38)
}
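The zeros in the question come from intval() giving up at the leading quote of each piece. A small sketch, assuming the same $string as in the question, shows the difference once the quotes are removed:

```php
<?php
$string = "'24','27','38'";

// intval() reads leading digits only; each piece starts with a quote,
// so nothing numeric is found and the result is 0.
$raw = array_map('intval', explode(',', $string));

// Stripping the quotes first lets intval() see the digits.
$clean = array_map('intval', explode(',', str_replace("'", '', $string)));

var_dump($raw);
var_dump($clean);
```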
$arr = explode (",", str_replace("'", "", $str));
foreach ($arr as $elem)
$array[] = (int) trim($elem); // cast so the result holds integers, as the question asks
sscanf() can instantly return type-cast values if you ask it to.
Here is a technique that doesn't use an explicit loop: sscanf(preg_replace())
Code: (Demo)
var_export(sscanf($string, preg_replace('/\d+/', '%d', $string)));
Output:
array (
0 => 24,
1 => 27,
2 => 38,
)
Or some developers might find this more professional/intuitive (others will disagree): (Demo)
var_export(filter_var_array(explode("','", trim($string, "'")), FILTER_VALIDATE_INT));
// same output as above
or perhaps this alternative which leverages more commonly used native functions:
var_export(
array_map(
function($v) {
return (int)$v;
},
explode("','", trim($string, "'"))
)
);
which simplifies to:
var_export(array_map('intval', explode("','", trim($string, "'"))));
// same output as above
For anyone who doesn't care about the datatype of the newly generated elements in the output array, here are a few working techniques that return string elements: (Demo)
var_export(explode("','", trim($string, "'")));
var_export(preg_split('/\D+/', $string, -1, PREG_SPLIT_NO_EMPTY));
var_export(preg_match_all('/\d+/', $string, $m) ? $m[0] : []);
var_export(filter_var_array(explode(',', $string), FILTER_SANITIZE_NUMBER_INT));
Without looping:
$str= "'24','27','38'";
$arr = array_map("intval", explode(",", str_replace("'", "", $str)));
var_dump($arr);
Output:
array(3) {
[0]=>
int(24)
[1]=>
int(27)
[2]=>
int(38)
}

PHP - Splitting string lines up into two variables

I'm not sure how to do this, so if someone could point me in the right direction that would be great. Basically, I've got a single line of text in a variable which looks like this:
Lambo 1; Trabant 2; Car 3;
Then I want to split "Lambo" into its own variable and "1" into its own variable, and repeat for the others. How would I go about doing this?
I know about explode() but not sure how I would do it to split the variable up twice etc.
As requested in the comments my desired output would be like this:
$Item = "Lambo"
$Quantity = 1
Then echo them out and go back to top of loop for example and do the same for the Trabant and Car
<?php
$in = "Lambo 1; Trabant 2; Car 3;";
foreach (explode(";", $in) as $element) {
$element = trim($element);
if (strpos($element, " ") !== false ) {
list($car, $number) = explode(" ", $element);
echo "car: $car, number: $number";
}
}
You can use explode() to split the input on each ;, loop over the results, and then split each result on the space character.
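If the pairs need to be kept rather than echoed, the same two-step split can fill an array. This is only a sketch: the $quantities name and the decision to cast the number to an integer are assumptions, not part of the question.

```php
<?php
$in = "Lambo 1; Trabant 2; Car 3;";
$quantities = [];

foreach (explode(';', $in) as $element) {
    $element = trim($element);
    // Skip the empty piece after the final ";" and anything without a space.
    if (strpos($element, ' ') === false) {
        continue;
    }
    // Limit of 2: split on the first space only.
    list($item, $quantity) = explode(' ', $element, 2);
    $quantities[$item] = (int) $quantity;
}

print_r($quantities);
```

Keying by item name assumes every item appears once; a repeated name would overwrite the earlier quantity.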
You can use preg_split() and iterate over the resulting array two elements at a time (item first, then quantity).
$output = preg_split('/[ ;]+/', $input, -1, PREG_SPLIT_NO_EMPTY);
You could use preg_match_all for getting those parts:
$line = "Lambo 1; Trabant 2; Car 3;";
preg_match_all("/[^ ;]+/", $line, $matches);
$matches = $matches[0];
With that sample data, the $matches array will look like this:
Array ( "Lambo", "1", "Trabant", "2", "Car", "3" )
$new_data="Lambo 1;Trabant 2;Car 3;" ;
$new_array=explode(";", $new_data);
foreach ($new_array as $key ) {
# code...
$final_data=explode(" ", $key);
if(isset($final_data[0])){ echo "<pre>".$final_data[0]."</pre>";}
if(isset($final_data[1])){echo "<pre>".$final_data[1]."</pre>";}
}
This places each word and number under its own key of the array if you need to access them separately.
preg_match_all("/(\w+) (\d+);/", $input_lines, $output_array);
Click preg_match_all
http://www.phpliveregex.com/p/fM8
Use a global regular expression match:
<?php
$subject = 'Lambo 1; Trabant 2; Car 3;';
$pattern = '/((\w+)\s+(\d+);\s?)+/Uu';
preg_match_all($pattern, $subject, $tokens);
var_dump($tokens);
The output you get is:
array(4) {
[0] =>
array(3) {
[0] =>
string(8) "Lambo 1;"
[1] =>
string(10) "Trabant 2;"
[2] =>
string(6) "Car 3;"
}
[1] =>
array(3) {
[0] =>
string(8) "Lambo 1;"
[1] =>
string(10) "Trabant 2;"
[2] =>
string(6) "Car 3;"
}
[2] =>
array(3) {
[0] =>
string(5) "Lambo"
[1] =>
string(7) "Trabant"
[2] =>
string(3) "Car"
}
[3] =>
array(3) {
[0] =>
string(1) "1"
[1] =>
string(1) "2"
[2] =>
string(1) "3"
}
}
In there the elements 2 and 3 hold exactly the tokens you are looking for.
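For looping over item/quantity pairs, the PREG_SET_ORDER flag may be more convenient because it groups the captures per match instead of per group. A sketch with a simplified pattern (the pattern and variable names here are assumptions, not the ones from the answer above):

```php
<?php
$subject = 'Lambo 1; Trabant 2; Car 3;';

// One capture group for the name, one for the number.
preg_match_all('/(\w+)\s+(\d+);/', $subject, $rows, PREG_SET_ORDER);

foreach ($rows as $row) {
    $item = $row[1];           // e.g. "Lambo"
    $quantity = (int) $row[2]; // e.g. 1
    echo "$item: $quantity\n";
}
```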

I want to explode a variable in a little different way [duplicate]

I have this variable.
$var = "A,B,C,D,'1,2,3,4,5,6',E,F";
I want to explode it so that I get the following array.
array(
[0] => A,
[1] => B,
[2] => C,
[3] => D,
[4] => 1,2,3,4,5,6,
[5] => E,
[6] => F
);
I used explode(',',$var) but I am not getting my desired output. Any suggestions?
There is an existing function that can parse your comma-separated string: str_getcsv().
Its signature is:
array str_getcsv ( string $input [, string $delimiter = "," [, string $enclosure = '"' [, string $escape = "\\" ]]] )
The only change needed is to set the third parameter, the enclosure, to a single quote rather than the default double quote.
Here is a sample.
$var = "A,B,C,D,'1,2,3,4,5,6',E,F";
$array = str_getcsv($var,',',"'");
If you var_dump the array, you'll get the format you wanted:
array(7) {
[0]=>
string(1) "A"
[1]=>
string(1) "B"
[2]=>
string(1) "C"
[3]=>
string(1) "D"
[4]=>
string(11) "1,2,3,4,5,6"
[5]=>
string(1) "E"
[6]=>
string(1) "F"
}
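To make the effect of the enclosure argument visible, here is a short sketch that parses the same string twice. All four arguments are passed explicitly; that is a choice made for this example, not something the answer requires.

```php
<?php
$var = "A,B,C,D,'1,2,3,4,5,6',E,F";

// Default enclosure ("): the single quotes are ordinary characters,
// so every comma splits the string.
$withDefault = str_getcsv($var, ',', '"', "\\");

// Single quote as enclosure: the quoted section is kept as one field.
$withSingle = str_getcsv($var, ',', "'", "\\");

var_dump(count($withDefault));
var_dump($withSingle);
```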
Simply use preg_match_all() with the following regex:
preg_match_all("/(?<=').*(?=')|\w+/",$var,$m);
print_r($m[0]);
Regex explanation:
(?<=').*(?=') captures every character between the first and the last single quote
|\w+ or (|) otherwise grabs each run of word characters, which skips the commas
Although preg_split() along with array_map() works very well, see below for an example using explode() and trim():
$var = "A,B,C,D,'1,2,3,4,5,6',E,F";
$a = explode("'",$var);
//print_r($a);
/*
outputs
Array
(
[0] => A,B,C,D,
[1] => 1,2,3,4,5,6
[2] => ,E,F
)
*/
$firstPart = explode(',',trim($a[0],',')); //take out the trailing comma
/*
print_r($firstPart);
outputs
Array
(
[0] => A
[1] => B
[2] => C
[3] => D
)
*/
$secondPart = array($a[1]);
$thirdPart = explode(',',trim($a[2],',')); //take out the leading comma
/*
print_r($thirdPart);
Array
(
[0] => E
[1] => F
)
*/
$fullArray = array_merge($firstPart,$secondPart,$thirdPart);
print_r($fullArray);
/*
outputs
Array
(
[0] => A
[1] => B
[2] => C
[3] => D
[4] => 1,2,3,4,5,6
[5] => E
[6] => F
)
*/
You need to explode the string into an array.
But you need a comma after every element except the last one.
Here is a working example:
<?php
$var = "A,B,C,D,'1,2,3,4,5,6',E,F";
$arr = explode("'", $var);
$num = ! empty($arr[1]) ? str_replace(',', '_', $arr[1]) : '';
$nt = $arr[0] . $num . $arr[2];
$nt = explode(',', $nt);
$len = count($nt);
$na = array();
$cnt = 0;
foreach ($nt as $v) {
$v = str_replace('_', ',', $v);
$v .= ($cnt != $len - 1) ? ',' : '';
$na[] = $v;
++$cnt;
}
$var = "A,B,C,D,'1,2,3,4,5,6',E,F";
$arr = preg_split("/(,)(?=(?:[^']|'[^']*')*$)/",$var);
foreach ($arr as $data) {
$requiredData[] = str_replace("'","",$data);
}
echo '<pre>';
print_r($requiredData);
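Description:
The regular expression matches a comma only when the rest of the string, up to its end, consists of unquoted characters and complete '...' pairs. That can only be true when the comma itself sits outside a single-quoted section, so preg_split() splits on the unquoted commas only.
Then, within the foreach loop, I remove the single quotes from each element.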
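The look-ahead approach is not limited to a single quoted section. As a sketch, assuming a made-up input with two quoted groups and using trim() instead of str_replace() to drop the surrounding quotes:

```php
<?php
// Hypothetical input with two quoted sections.
$var = "A,'1,2',B,'3,4',C";

// Split on commas that are followed only by unquoted characters
// and complete '...' pairs up to the end of the string.
$parts = preg_split("/,(?=(?:[^']|'[^']*')*$)/", $var);

// Remove the enclosing quotes from each element.
$parts = array_map(function ($p) { return trim($p, "'"); }, $parts);

print_r($parts);
```

The pattern relies on the quotes being balanced; with a stray quote the look-ahead fails for every comma before it.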

Split a string on different substrings, but conserve those substrings

I'm trying to split the following string:
Hello how are you<br>Foo bar hello
Into
"Hello", " how", " are", " you", "<br>", " Foo", " bar", " Hello"
Is this possible?
Don't make things harder than you have to. Use preg_split() with the PREG_SPLIT_DELIM_CAPTURE flag, and capture the <br>:
$str = 'Hello how are you<br>Foo bar hello';
$array = preg_split( '/\s+|(<br>)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r( $array);
Output:
Array
(
[0] => Hello
[1] => how
[2] => are
[3] => you
[4] => <br>
[5] => Foo
[6] => bar
[7] => hello
)
Edit: To include the space in the following token, you can use an assertion:
$array = preg_split( '/(?:\s*(?=\s))|(<br>)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
So, the goal of preg_split() is to find a spot in the string to split. The regex we use consists of two parts, OR'd together with |:
(?:\s*(?=\s)). This starts off with a non-capturing group (?:), because when we match this part of the regex, we do not want it returned to us. Inside the non-capturing group is \s*(?=\s), which says "match zero or more whitespace characters, but assert that the next character is a whitespace character". Looking at our input string, this makes sense:
Hello how are you<br>Foo bar hello
     ^   ^
The regex works from left to right, finds "Hello{space}how", and decides how to split the string. It tries to match \s* with the restriction that if it consumes any space, there needs to be one space left. So, it breaks up the string right after "Hello". When it continues, it has " how are you<br>Foo bar hello" left. It starts the match again from where it left off, sees " how are", and does the same split as above. It continues until there are no matches left.
Capture <br>, with (<br>). It is captured because when we match this, we want to keep it in the output, so capturing it along with the PREG_SPLIT_DELIM_CAPTURE flag causes it to be returned to us when it is matched (instead of being completely consumed).
This results in:
array(8)
{
[0]=> string(5) "Hello"
[1]=> string(4) " how"
[2]=> string(4) " are"
[3]=> string(4) " you"
[4]=> string(4) "<br>"
[5]=> string(3) "Foo"
[6]=> string(4) " bar"
[7]=> string(6) " hello"
}
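The split-before-whitespace idea can also be tried in isolation. The sketch below does not use the pattern from the answer: it swaps in a bare look-ahead, (?=\s), which matches the empty position in front of each whitespace character, and adds PREG_SPLIT_NO_EMPTY as a precaution against empty pieces.

```php
<?php
$str = 'Hello how are you<br>Foo bar hello';

// (?=\s) is zero-width, so the whitespace stays attached to the next token.
// (<br>) is captured so that PREG_SPLIT_DELIM_CAPTURE returns it as a token.
$tokens = preg_split(
    '/(?=\s)|(<br>)/',
    $str,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

var_dump($tokens);
```

Because nothing is consumed at the whitespace positions, joining the tokens back together with implode('', $tokens) reproduces the original string.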
Not pretty, but simple enough:
$data = 'Hello how are you<br>Foo bar hello';
$split = array();
foreach (explode('<br>', $data) as $line) {
$split = array_merge($split, explode(' ', $line)); // assign the merged array; $split[] would nest it as one element
$split[] = '<br>';
}
array_pop($split);
print_r($split);
Or version 2:
$data = 'Hello how are you<br>Foo bar hello';
$data = preg_replace('#\s|(<br>)#', '**$1**', $data);
$split = array_filter(explode('**', $data));
print_r($split);
This is how I'd do it:
Explode the string with space as a delimiter
Loop through the parts
Use strpos and check if part contains the given tag -- <br> in this case
If it does, explode the string again with the tag as the delimiter
Push all the three items into the result array
If it doesn't, then push it into the result array
Code:
$str = 'Hello how are you<br>Foo bar hello';
$parts = explode(' ', $str);
$result = array();
foreach ($parts as $part) {
if(strpos($part, '<br>') !== FALSE) {
list($before, $after) = explode('<br>', $part, 2);
// Keep the tag between the two halves so the original order is preserved.
array_push($result, $before, '<br>', $after);
}
else {
$result[] = $part;
}
}
print_r($result);
Output:
Array
(
[0] => Hello
[1] => how
[2] => are
[3] => you
[4] => <br>
[5] => Foo
[6] => bar
[7] => hello
)
Here is a brief solution. Replace <br> by (space <br> space) and split using space:
<?php
$newStr=str_replace("<br>"," <br> ","Hello how are you<br>Foo bar hello");
$str= explode(' ',$newStr);
?>
Output of print_r($str):
(
[0] => Hello
[1] => how
[2] => are
[3] => you
[4] => <br>
[5] => Foo
[6] => bar
[7] => hello
)
Borrowing the preg_split pattern from @nickb's answer:
<?php
$string = 'Hello how are you<br>Foo bar hello';
$array = preg_split('/\s/',$string);
foreach($array as $key => $value) {
$a = preg_split( '/\s+|(<br>)/', $value, -1, PREG_SPLIT_DELIM_CAPTURE);
if(is_array($a)) {
foreach($a as $key2 => $value2) {
$result[] = $value2;
}
}
}
print_r($result);
?>
Output:
Array
(
[0] => Hello
[1] => how
[2] => are
[3] => you
[4] => <br>
[5] => Foo
[6] => bar
[7] => hello
)

Split a string on every third instance of character

How can I explode a string on every third semicolon (;)?
example data:
$string = 'piece1;piece2;piece3;piece4;piece5;piece6;piece7;piece8;';
Desired output:
$output[0] = 'piece1;piece2;piece3;'
$output[1] = 'piece4;piece5;piece6;'
$output[2] = 'piece7;piece8;'
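I am sure you can do something slick with regular expressions, but why not just explode on each semicolon and then add the pieces back three at a time?
$tmp = explode(";", rtrim($string, ";")); // drop the trailing ; so no empty piece is produced
$result = array();
foreach ($tmp as $i => $piece) {
$j = (int) ($i / 3); // group index: 0 for pieces 0-2, 1 for pieces 3-5, ...
if (!isset($result[$j])) $result[$j] = '';
$result[$j] .= $piece . ';'; // explode() removed the semicolons, so put them back
}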
Easiest solution I can think of is:
$chunks = array_chunk(explode(';', $input), 3);
// create_function() was removed in PHP 8; an anonymous function does the same job.
$output = array_map(function ($a) { return implode(';', $a); }, $chunks);
Essentially the same solution as the other ones that explode and join again...
$tmp = explode(";", $string);
while ($tmp) {
$output[] = implode(';', array_splice($tmp, 0, 3));
};
$string = "piece1;piece2;piece3;piece4;piece5;piece6;piece7;piece8;piece9;";
preg_match_all('/([A-Za-z0-9\.]*;[A-Za-z0-9\.]*;[A-Za-z0-9\.]*;)/',$string,$matches);
print_r($matches);
Array
(
[0] => Array
(
[0] => piece1;piece2;piece3;
[1] => piece4;piece5;piece6;
[2] => piece7;piece8;piece9;
)
[1] => Array
(
[0] => piece1;piece2;piece3;
[1] => piece4;piece5;piece6;
[2] => piece7;piece8;piece9;
)
)
Maybe approach it from a different angle. Explode() it all, then combine it back in triples. Like so...
$str = "1;2;3;4;5;6;7;8;9";
$pieces = explode(";", $str);
while (!empty($pieces))
{
$foo = array();
$foo[] = array_shift($pieces);
$foo[] = array_shift($pieces);
$foo[] = array_shift($pieces);
$bar[] = implode(";", $foo) . ";";
}
print_r($bar);
Array
(
[0] => 1;2;3;
[1] => 4;5;6;
[2] => 7;8;9;
)
Here's a regex approach, which I can't say is all too good looking.
$str='';
for ($i=1; $i<20; $i++) {
$str .= "$i;";
}
$split = preg_split('/((?:[^;]*;){3})/', $str, -1,
PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
Output:
Array
(
[0] => 1;2;3;
[1] => 4;5;6;
[2] => 7;8;9;
[3] => 10;11;12;
[4] => 13;14;15;
[5] => 16;17;18;
[6] => 19;
)
Another regex approach.
<?php
$string = 'piece1;piece2;piece3;piece4;piece5;piece6;piece7;piece8';
preg_match_all('/([^;]+;?){1,3}/', $string, $m, PREG_SET_ORDER);
print_r($m);
Results:
Array
(
[0] => Array
(
[0] => piece1;piece2;piece3;
[1] => piece3;
)
[1] => Array
(
[0] => piece4;piece5;piece6;
[1] => piece6;
)
[2] => Array
(
[0] => piece7;piece8
[1] => piece8
)
)
Regex Split
$test = ";2;3;4;5;6;7;8;9;10;;12;;14;15;16;17;18;19;20";
// match all groups that:
// (?<=^|;) follow the beginning of the string or a ;
// [^;]* have zero or more non ; characters
// ;? maybe a semi-colon (so we catch a single group)
// [^;]*;? again (catch second item)
// [^;]* without the trailing ; (to not capture the final ;)
preg_match_all("/(?<=^|;)[^;]*;?[^;]*;?[^;]*/", $test, $matches);
var_dump($matches[0]);
array(7) {
[0]=>
string(4) ";2;3"
[1]=>
string(5) "4;5;6"
[2]=>
string(5) "7;8;9"
[3]=>
string(6) "10;;12"
[4]=>
string(6) ";14;15"
[5]=>
string(8) "16;17;18"
[6]=>
string(5) "19;20"
}
<?php
$str = 'piece1;piece2;piece3;piece4;piece5;piece6;piece7;piece8;';
$arr = array_map(function ($arr) {
return implode(";", $arr);
}, array_chunk(explode(";", $str), 3));
var_dump($arr);
outputs
array(3) {
[0]=>
string(20) "piece1;piece2;piece3"
[1]=>
string(20) "piece4;piece5;piece6"
[2]=>
string(14) "piece7;piece8;"
}
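As the output above shows, the array_chunk() approach loses the semicolon at the end of each full group. One way to keep it is sketched below; chunk_by_delimiter() is a hypothetical helper name, and it assumes every piece in the input, including the last one, is terminated by the delimiter.

```php
<?php
// Hypothetical helper: regroup delimiter-terminated pieces, $size per chunk.
function chunk_by_delimiter(string $string, int $size, string $delimiter = ';'): array
{
    // Remove the final delimiter so explode() yields no empty last piece.
    $pieces = explode($delimiter, rtrim($string, $delimiter));

    return array_map(
        function (array $chunk) use ($delimiter) {
            // Re-join the pieces and restore the terminating delimiter.
            return implode($delimiter, $chunk) . $delimiter;
        },
        array_chunk($pieces, $size)
    );
}

$str = 'piece1;piece2;piece3;piece4;piece5;piece6;piece7;piece8;';
$output = chunk_by_delimiter($str, 3);
var_dump($output);
```

Note that rtrim() removes every trailing delimiter, so empty pieces at the very end of the input would be dropped by this sketch.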
Similar to @Sebastian's earlier answer, I recommend preg_split() with a repeated pattern. The difference is that by using a non-capturing group and appending \K to restart the fullstring match, you can spare writing the PREG_SPLIT_DELIM_CAPTURE flag.
Code: (Demo)
$string = 'piece1;piece2;piece3;piece4;piece5;piece6;piece7;piece8;';
var_export(preg_split('/(?:[^;]*;){3}\K/', $string, 0, PREG_SPLIT_NO_EMPTY));
A similar technique for splitting after every 2 things can be found here. That snippet actually writes the \K before the last space character so that the trailing space is consumed while splitting.
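To illustrate what \K contributes, this sketch runs the pattern with and without it on the question's string. The variable names are only illustrative.

```php
<?php
$string = 'piece1;piece2;piece3;piece4;piece5;piece6;piece7;piece8;';

// Without \K each group of three is itself the delimiter and is discarded;
// only the unmatched remainder survives.
$consumed = preg_split('/(?:[^;]*;){3}/', $string, -1, PREG_SPLIT_NO_EMPTY);

// With \K the reported match starts at the end of the group, so the split
// point is a zero-width position and the text before it is kept.
$kept = preg_split('/(?:[^;]*;){3}\K/', $string, -1, PREG_SPLIT_NO_EMPTY);

var_export($consumed);
var_export($kept);
```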
