Split paragraphs in an array in PHP

I have an array of this form:
$steps = array (0=> "the sentence one. the sentence two. the sentence three.",
1=> "the sentence for. the sentence 5");
and I want to have an array $steps like this:
$steps = array (0 => "the sentence one.",
1 => "the sentence two.",
.
.
4 =>"the sentence for."
);
I tried to use explode and implode but I did not succeed.

You can split each string in your existing array with the regex (?<=\.\s)(?=\w), then loop over the resulting pieces with foreach and append each one to a new array. Check this PHP code:
$steps = array (0=> "the sentence one. the sentence two. the sentence three.",
1=> "the sentence for. the sentence 5");
$arr = array();
foreach ($steps as $s) {
$mat = preg_split('/(?<=\.\s)(?=\w)/', $s);
foreach($mat as $m) {
array_push($arr,$m);
}
}
print_r($arr);
Prints,
Array
(
[0] => the sentence one.
[1] => the sentence two.
[2] => the sentence three.
[3] => the sentence for.
[4] => the sentence 5
)
This assumes, based on your current sample data, that a new sentence starts wherever a dot is followed by a space. If your real data contains dots in more complicated forms, please post some of those samples and, if need be, I can update the solution to handle them.

Let me know if this works for you: preg_split("/\. (?=[A-Z])/", join(" ", $steps));
Your source array:
$steps = array (
0 => "The sentence one. The sentence two. The sentence three.",
1 => "The sentence for. The sentence 5"
);
$steps_unified = preg_split("/\. (?=[A-Z])/", join(" ", $steps));
print_r ($steps_unified);
You will get:
Array (
[0] => The sentence one
[1] => The sentence two
[2] => The sentence three
[3] => The sentence for
[4] => The sentence 5
)
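This relies on proper grammar: each sentence ends with a '.' and the next one begins after a space with a capitalized word. Note that the period and the space are consumed by the split, so the resulting sentences no longer end with a dot.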
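Combining the two answers above, here is a minimal sketch that keeps the trailing period on each sentence and flattens the result with array_merge. It assumes, like both answers, that a period followed by whitespace always ends a sentence; the variable name $sentences is only illustrative.

```php
<?php
$steps = array(
    0 => "the sentence one. the sentence two. the sentence three.",
    1 => "the sentence for. the sentence 5",
);

$sentences = array();
foreach ($steps as $step) {
    // The lookbehind keeps the period in the sentence; only the whitespace is consumed.
    $parts = preg_split('/(?<=\.)\s+/', $step, -1, PREG_SPLIT_NO_EMPTY);
    // Append the pieces of this element to the flat result array.
    $sentences = array_merge($sentences, $parts);
}

print_r($sentences);
```

Because the pattern does not require a capital letter after the space, it also works on the lowercase sample from the question.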

Related

Explode a string where the explode condition is bunch of specific characters

I'm looking for a way to explode a string. For example, I have the following string: (we don't count the beginning - 0x)
0xa9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d00000000000000000000000000000000000000000000000000000000000054368
which is actually an ETH transaction input. I need to explode this string into 3 parts. Imagine 1 bunch of zeros is actually a single space and these spaces define the gates where the string should be exploded.
How can I do that?
preg_split()
This function uses a regular expression to split a string.
So in this example at two or more 0 in a row:
$arr = preg_split('/[0]{2,}/', $string);
print_r($arr);
echo PHP_EOL;
This will output the following:
Array
(
[0] => a9059xbb
[1] => fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d
[2] => 54368
)
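Be aware that this breaks whenever a value itself contains 00, or starts or ends with a 0 next to the padding. In the output above, the second element has already lost its trailing 0 (the original value ends in ...f4d0), because that zero was absorbed into the run of zeros used as the delimiter. This approach is only safe if the values can never contain 00 or border the padding with a 0.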
preg_match()
This is an example using regular expressions. You can split at arbitrary points.
$string = 'a9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d00000000000000000000000000000000000000000000000000000000000054368';
print_r($string);
echo PHP_EOL;
$res = preg_match('/(.{4})(.{32})(.{32})/', $string, $matches);
print_r($matches);
echo PHP_EOL;
This outputs:
a9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d00000000000000000000000000000000000000000000000000000000000054368
Array
(
[0] => a9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199a
[1] => a905
[2] => 9xbb000000000000000000000000fc7a
[3] => 5f48a1a1b3f48e7dcb1f23a1ea24199a
)
As you can see, /(.{4})(.{32})(.{32})/ matches 4 characters, then 32, and then 32 again. Capturing groups are created by putting () around what you want to extract. They appear in the $matches array (index 0 always holds the entire match).
In case you want to ignore certain parts you can express that as well:
/(.{4})9x(.{32}).{4}(.{32})/
This changes the found string:
Array
(
[0] => a9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d000
[1] => a905
[2] => bb000000000000000000000000fc7a5f
[3] => a1b3f48e7dcb1f23a1ea24199af4d000
)
Links
PHP documentation for the mentioned functions:
https://www.php.net/manual/en/function.preg-split.php
https://www.php.net/manual/en/book.pcre.php
Play around with the second regular expression using this demo:
https://regex101.com/r/pfZtH8/1
If you always split at the same points (4 bytes = 8 hexadecimal digits, then 32 bytes = 64 hexadecimal digits, then another 32 bytes = 64 hexadecimal digits), you could use substr().
$input = "0xa9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d00000000000000000000000000000000000000000000000000000000000054368";
$first = substr($input,2,8);
$second = substr($input,10,64);
$third = substr($input,74,64);
print_r($first);
print "<br>";
print_r($second);
print "<br>";
print_r($third);
print "<br>";
This outputs:
a9059xbb
000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d0
0000000000000000000000000000000000000000000000000000000000054368
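If the fixed-width layout holds, the same result can be reached without hard-coding every offset. The following is a minimal sketch, assuming the input always starts with "0x", followed by an 8-character selector and then whole 64-character words; the variable names $selector, $words and $trimmed are illustrative.

```php
<?php
$input = "0xa9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d00000000000000000000000000000000000000000000000000000000000054368";

// Skip the "0x" prefix and take the 8-character selector.
$selector = substr($input, 2, 8);

// Cut everything after the selector into 64-character chunks.
$words = str_split(substr($input, 10), 64);

// Optionally strip the zero padding. ltrim() only removes leading zeros,
// so a value that ends in 0 keeps its last digit.
$trimmed = array_map(function ($word) {
    return ltrim($word, '0');
}, $words);

print_r($selector);
echo PHP_EOL;
print_r($words);
print_r($trimmed);
```

Unlike splitting on runs of zeros, this keeps zeros that belong to a value, at the cost of depending on the exact field widths.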

PHP splitting a sentence

I'm trying to split a string of sentences by "." to get each sentence in an array. Like below:
$Text = "Hello, Mr. James. How are you today.";
$split = explode(".", $Text);
As you can see, $Text contains 2 sentences, therefore I should only have 2 elements in the array. The issue I'm having is that sometimes my $Text can contain words like "Mr." or any other word which contains a "." in the middle of a sentence. This will result in the sentences being split from the middle and placed separately in the array like below:
Array ( [0] => Hello, Mr [1] => James [2] => How are you today [3] => )
You can avoid a lot of exception handling and general misery, if you can ensure that all English sentences are properly spaced at the end of each sentence -- 2 consecutive spaces. This can be difficult when dealing with some digitized strings because sometimes multi-spacing gets condensed to a single space.
This is what I mean:
$Text = "Hello, Mr. James.  How are you today.";
$split = explode("  ", $Text);
var_export($split);
// array ( 0 => 'Hello, Mr. James.', 1 => 'How are you today.', )
Exploding on each space-space will give you a reliable result.
If you want good output, you'll need to use good input.
If you want to blacklist a few predictable substrings that should not be used to split the string, then you can use (*SKIP)(*FAIL) for that.
Code:
$text = "Hello, Mr. James. How are you today.";
var_export(
preg_split('~(?:Mrs?|Miss|Ms|Prof|Rev|Col|Dr)[.?!:](*SKIP)(*F)|[.?!:]+\K\s+~', $text, 0, PREG_SPLIT_NO_EMPTY)
);
Output:
array (
0 => 'Hello, Mr. James.',
1 => 'How are you today.',
)

PHP: from string to multiple arrays by means of placeholders

Good day,
I have what I think is a rather odd question, and I do not really know how to ask it.
I want to create a string variable that looks like this:
[car]Ford[/car]
[car]Dodge[/car]
[car]Chevrolet[/car]
[car]Corvette[/car]
[motorcycle]Yamaha[/motorcycle]
[motorcycle]Ducati[/motorcycle]
[motorcycle]Gilera[/motorcycle]
[motorcycle]Kawasaki[/motorcycle]
This should be processed and look like:
$variable = array(
'car' => array(
'Ford',
'Dodge',
'Chevrolet',
'Corvette'
),
'motorcycle' => array(
'Yamaha',
'Ducati',
'Gilera',
'Kawasaki'
)
);
Does anyone know how to do this?
And what is the thing I am trying to do called?
I want to explode the string into the two arrays. Whether they end up as sub-arrays of one array or as two individual arrays, I do not care; I can always combine the latter if I wish.
But going from the above-mentioned string to two arrays is what I want.
Solution by Dlporter98
<?php
///######## GET THE STRING FILE OR DIRECT INPUT
// $str = file_get_contents('file.txt');
$str = '[car]Ford[/car]
[car]Dodge[/car]
[car]Chevrolet[/car]
[car]Corvette[/car]
[motorcycle]Yamaha[/motorcycle]
[motorcycle]Ducati[/motorcycle]
[motorcycle]Gilera[/motorcycle]
[motorcycle]Kawasaki[/motorcycle]';
$str = explode(PHP_EOL, $str);
$finalArray = [];
foreach($str as $item){
//Use preg_match to capture the pieces of the string we want using a regular expression.
//The first capture will grab the text of the tag itself.
//The second capture will grab the text between the opening and closing tag.
//The resulting captures are placed into the matches array.
preg_match("/\[(.*?)\](.*?)\[/", $item, $matches);
//Build the final array structure.
$finalArray[$matches[1]][] = $matches[2];
}
print_r($finalArray);
?>
This gives me the following array:
Array
(
[car] => Array
(
[0] => Ford
[1] => Dodge
[2] => Chevrolet
[3] => Corvette
)
[motorcycle] => Array
(
[0] => Yamaha
[1] => Ducati
[2] => Gilera
[3] => Kawasaki
)
)
The small change I had to make was:
Change
$finalArray[$matches[1]] = $matches[2]
To:
$finalArray[$matches[1]][] = $matches[2];
Thanks a million!!
There are many ways to convert the information in this string to an associative array.
Split the string on the new line into an array using the explode function:
$str = "[car]Ford[/car]
[car]Dodge[/car]
[car]Chevrolet[/car]
[car]Corvette[/car]
[motorcycle]Yamaha[/motorcycle]
[motorcycle]Ducati[/motorcycle]
[motorcycle]Gilera[/motorcycle]
[motorcycle]Kawasaki[/motorcycle]";
$items = explode(PHP_EOL, $str);
At this point each delimited item is now an array entry.
Array
(
[0] => [car]Ford[/car]
[1] => [car]Dodge[/car]
[2] => [car]Chevrolet[/car]
[3] => [car]Corvette[/car]
[4] => [motorcycle]Yamaha[/motorcycle]
[5] => [motorcycle]Ducati[/motorcycle]
[6] => [motorcycle]Gilera[/motorcycle]
[7] => [motorcycle]Kawasaki[/motorcycle]
)
Next, loop over the array and pull out the appropriate pieces needed to build the final associative array using the preg_match function with a regular expression:
$finalArray = [];
foreach($items as $item)
{
//Use preg_match to capture the pieces of the string we want using a regular expression.
//The first capture will grab the text of the tag itself.
//The second capture will grab the text between the opening and closing tag.
//The resulting captures are placed into the matches array.
preg_match("/\[(.*?)\](.*?)\[/", $item, $matches);
//Build the final array structure.
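//Note: this plain assignment keeps only the last value for each tag; append with [] to collect them all.
$finalArray[$matches[1]] = $matches[2];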
}
The following is an example of what will be found in the matches array for a given iteration of the foreach loop.
Array
(
[0] => [motorcycle]Gilera[
[1] => motorcycle
[2] => Gilera
)
Please note that I use the PHP_EOL constant to explode the initial string. This may not work if the string comes from a different operating system than the one running this code. You may need to replace it with the actual end-of-line characters used in the string.
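One way to sidestep the line-ending question altogether, as a rough sketch: let preg_match_all scan the whole string and use a backreference so that each closing tag must repeat its opening tag. The pattern assumes that tag names consist of word characters only and that values never contain a "[" character. The sample string deliberately mixes "\n" and "\r\n" line endings.

```php
<?php
$str = "[car]Ford[/car]\n[car]Dodge[/car]\r\n[motorcycle]Yamaha[/motorcycle]\n[motorcycle]Ducati[/motorcycle]";

$finalArray = [];

// PREG_SET_ORDER yields one entry per match: [0] whole match, [1] tag, [2] value.
// \1 is a backreference: the closing tag must repeat the opening tag name.
if (preg_match_all('/\[(\w+)\]([^\[]*)\[\/\1\]/', $str, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        // Append with [] so every value for a tag is collected.
        $finalArray[$match[1]][] = $match[2];
    }
}

print_r($finalArray);
```

Since the pattern never looks at the characters between two tagged items, it does not matter which line endings the input uses.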
Why don't you create two separate arrays?
$cars = array("Ford", "Dodge", "Chevrolet", "Corvette");
$motorcycle = array("Yamaha", "Ducati", "Gilera", "Kawasaki");
You could also use an Associative array to do this.
$variable = array("Ford"=>"car", "Yamaha"=>"motorbike");

str_replace in array, and append text at the end

So I'm kind of stuck on this - I'm looking to replace text in an array (easily done via str_replace), but I would also like to append text onto the end of that specific array. For example, my original array is:
Array
(
[1] => DTSTART;VALUE=DATE:20130712
[2] => DTEND;VALUE=DATE:20130713
[3] => SUMMARY:Vern
[4] => UID:1fb5aa60-ff89-429e-80fd-ad157dc777b8
[5] => LAST-MODIFIED:20130711T010042Z
[6] => SEQUENCE:1374767972
)
I would like to search that array for ";VALUE=DATE" and replace it with nothing (""), but would also like to insert a text string 7 characters after each replace ("T000000"). So my resulting array would be:
Array
(
[1] => DTSTART:20130712T000000
[2] => DTEND:20130713T000000
[3] => SUMMARY:Vern
[4] => UID:1fb5aa60-ff89-429e-80fd-ad157dc777b8
[5] => LAST-MODIFIED:20130711T010042Z
[6] => SEQUENCE:1374767972
)
Is something like this possible using combinations of str_replace, substr_replace, etc? I'm fairly new to PHP and would love if someone could point me in the right direction! Thanks much
You can use preg_replace as a one-stop shop for this type of manipulation:
$array = preg_replace('/(.*);VALUE=DATE(.*)/', '$1$2T000000', $array);
The regular expression matches any string that contains ;VALUE=DATE and captures whatever precedes and follows it into capturing groups (referred to as $1 and $2 in the replacement pattern). It then replaces that string with $1 concatenated to $2 (effectively removing the search target) and appends "T000000" to the result.
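To see the effect on data shaped like the question's array, here is a small sketch with a stricter variant of the pattern. The stricter pattern is an assumption on top of the answer: it only rewrites elements where ;VALUE=DATE: is followed by exactly eight digits at the end of the string. The variable name $result is illustrative.

```php
<?php
$array = array(
    1 => 'DTSTART;VALUE=DATE:20130712',
    2 => 'DTEND;VALUE=DATE:20130713',
    3 => 'SUMMARY:Vern',
    6 => 'SEQUENCE:1374767972',
);

// Drop the ";VALUE=DATE" parameter and append the time part to the eight-digit date.
// ${1} refers to the captured date; elements that do not match are returned unchanged.
$result = preg_replace('/;VALUE=DATE:(\d{8})$/', ':${1}T000000', $array);

print_r($result);
```

When the subject is an array, preg_replace returns an array with the same keys, so the 1-based indexes from the question are preserved.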
The naive approach would be to loop over each element and check for ;VALUE=DATE. If it exists, remove it and append T000000.
foreach ($array as $key => $value) {
if (strpos($value, ';VALUE=DATE') !== false) {
$array[$key] = str_replace(";VALUE=DATE", "", $value) . "T000000";
}
}
You are correct, str_replace() is the function that you are looking for. In addition, you can use the concatenation operator . to append your string to the end of the new string. Is this what you are looking for?
$array[1] = str_replace(";VALUE=DATE", "", $array[1])."T000000";
$array[2] = str_replace(";VALUE=DATE", "", $array[2])."T000000";
foreach (array_keys($array) as $i) { // the keys start at 1, so iterate over the actual keys
if (strpos($array[$i], ";VALUE=DATE") !== false) { // look for the text in the string; !== false also covers a match at position 0
// Text found, let's replace and append
$array[$i] = str_replace(";VALUE=DATE", "", $array[$i]);
$array[$i] .= "T000000";
}
else {
// Text not found in that position, nothing is replaced
// Do something
}
}
If you just want to replace, without appending anything, pass the whole array to str_replace():
$array = str_replace(";VALUE=DATE", "", $array);
This replaces the text in every element of the array.

split string by any amount of whitespace in PHP

I know how to split a string into an array of the words between delimiters using explode() with " ".
But that only splits the string on a single whitespace character. How can I split on any amount of whitespace?
So an element in the array ends when whitespace is found, and the next element starts at the next non-whitespace character.
So something like "The quick brown fox" turns into an array in which The, quick, brown, and fox are the elements.
And "jumped over the lazy dog" also splits so each word is an individual element in the returned array.
Like this:
preg_split('#\s+#', $string, -1, PREG_SPLIT_NO_EMPTY);
$yourSplitArray=preg_split('/[\ \n\,]+/', $your_string);
Try this:
preg_split("/ +/", "hypertext language programming"); // for one or more spaces
See also the PHP explode() function:
<?php
$str = "Hello world. It's a beautiful day.";
print_r (explode(" ",$str));
?>
will return:
Array ( [0] => Hello [1] => world. [2] => It's [3] => a [4] => beautiful [5] => day. )
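In contrast to explode(" ", ...), which yields empty elements wherever two spaces meet, a whitespace-aware split could look like the following minimal sketch. The sample string, with its extra spaces, tab and newline, is made up for illustration.

```php
<?php
$string = "  The   quick\tbrown \n fox  ";

// \s+ matches any run of spaces, tabs or newlines.
// PREG_SPLIT_NO_EMPTY drops the empty strings caused by leading or trailing whitespace.
$words = preg_split('/\s+/', $string, -1, PREG_SPLIT_NO_EMPTY);

print_r($words);
```

Passing -1 as the limit means "no limit", which is needed in order to reach the flags argument.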
