ignoring upper case words with explode() in PHP - php

I'm new to PHP and I'm trying to explode data in a text file and put it into an array, then a table. The data in the text file looks like this:
THE MAN IN THE HIGH CASTLE by Philip K. Dick published 1965 born 1922
Assume that you cannot alter the original data. If I write:
$dataArray = explode(" ",$book);
that works for most of the data, but it splits every word of the book title into a different element. Is there a way I can tell it not to split upper case words?

Instead of explode, you may want to try using preg_split for this. It splits strings using a regular expression:
$book = 'THE MAN IN THE HIGH CASTLE by Philip K. Dick published 1965 born 1922';
// Split on all-lowercase words
print_r(preg_split('/\b\s*[a-z]+\s*\b/', $book));
Output:
Array
(
[0] => THE MAN IN THE HIGH CASTLE
[1] => Philip K. Dick
[2] => 1965
[3] => 1922
)
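Since the goal is a table, the four pieces returned by preg_split can be unpacked into named variables. This is a minimal sketch, assuming every line has exactly the four fields shown above; the variable names and the HTML row are illustrative and not part of the original answer:

```php
<?php
$book = 'THE MAN IN THE HIGH CASTLE by Philip K. Dick published 1965 born 1922';

// Split on the all-lowercase keywords ("by", "published", "born").
$parts = preg_split('/\b\s*[a-z]+\s*\b/', $book);

// Assumption: exactly four pieces come back, in this order.
[$title, $author, $published, $born] = $parts;

// Emit one table row; htmlspecialchars guards against markup in the data.
echo '<tr><td>' . implode('</td><td>', array_map('htmlspecialchars', $parts)) . "</td></tr>\n";
```

An author name containing an all-lowercase word (for example "de" or "van") would be split as well, so this only fits data shaped like the sample.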

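// Split once on the literal " by " that separates the title from the rest.
// The surrounding spaces and the limit of 2 keep a "by" inside a name from splitting further.
$input = explode(" by ", $book, 2);
$title = $input[0];
$stuff = $input[1];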
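If the keywords by, published and born always appear in that order, a single preg_match can pull out all four fields at once. This is a sketch under that assumption; the group names are made up for the example:

```php
<?php
$book = 'THE MAN IN THE HIGH CASTLE by Philip K. Dick published 1965 born 1922';

// Lazy title up to the first " by ", then the author, then the two four-digit years.
$pattern = '/^(?<title>.+?) by (?<author>.+) published (?<published>\d{4}) born (?<born>\d{4})$/';

if (preg_match($pattern, $book, $m)) {
    printf("%s | %s | %s | %s\n", $m['title'], $m['author'], $m['published'], $m['born']);
}
```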

Related

Find and Replace Multi-Row Phrases/text Within PHP Arrays without damaging the array

Let's say I have an array:
$myarray = (
[0] => 'Johnny likes to go to school',
[1] => 'but he only likes to go on Saturday',
[2] => 'because Saturdays are the best days',
[3] => 'unless you of course include Sundays',
[4] => 'which are also pretty good days too.',
[5] => 'Sometimes Johnny likes picking Strawberrys',
[6] => 'with his mother in the fields',
[7] => 'but sometimes he likes picking blueberries'
);
Keeping the structure of this array intact, I want to be able to replace phrases within it, even if they spill over to the next or previous string. Also I don't want punctuation or case to impact it.
Examples:
String to Find:
"Sundays which are also pretty good"
Replace with:
"Mondays which are also pretty great"
After replace:
$myarray = (
[0] => 'Johnny likes to go to school',
[1] => 'but he only likes to go on Saturday',
[2] => 'because Saturdays are the best days',
[3] => 'unless you of course include Mondays',
[4] => 'which are also pretty great days too.',
[5] => 'Sometimes Johnny likes picking Strawberrys',
[6] => 'with his mother in the fields',
[7] => 'but sometimes he likes picking blueberries'
);
I'm curious whether there is an ideal way of doing this. My original thought was to turn the array into a string, strip out punctuation and spaces, count the characters, and replace the phrase based on the character count. That is getting rather complex, but then it is a complex problem.
This is a job for regular expressions.
The method I used was to implode the array with some glue (i.e. '&'). Then I generated a regular expression by inserting a zero-or-one check for '&' in between each character in the find string.
I used the regular expression to replace occurrences of the string we were looking for with the replacement string. Then I exploded the string back into an array using the same delimiter as above ('&')
$myarray = [
0 => 'Johnny likes to go to school',
1 => 'but he only likes to go on Saturday',
2 => 'because Saturdays are the best days',
3 => 'unless you of course include Sundays',
4 => 'which are also pretty good days too.',
5 => 'Sometimes Johnny likes picking Strawberrys',
6 => 'with his mother in the fields',
7 => 'but sometimes he likes picking blueberries'
];
// preg_quote this because we're looking for the string, not for a pattern it might contain
$findstring = preg_quote("Sundays which are also pretty good");
// htmlentities this in case it contains a string of characters which happen to be an html entity
$replacestring = htmlentities("Mondays which are also pretty great");
// Combine array into one string
// We use htmlentities to escape the ampersand, so we can use it as a sentence delimiter
$mystring = implode("&", array_map('htmlentities', $myarray));
// Turns $findstring into:
// S\&?u\&?n\&?d\&?a\&?y\&?s\&? \&?w\&?h\&?i\&?c\&?h\&? \&?a\&?r\&?e\&?
// \&?a\&?l\&?s\&?o\&? \&?p\&?r\&?e\&?t\&?t\&?y\&? \&?g\&?o\&?o\&?d
$regexstring = implode("\&?", str_split($findstring));
// Johnny likes to go to school&but he only likes to go on Saturday&because Saturdays are the
// best days&unless you of course include Mondays&which are also pretty great days
// too.&Sometimes Johnny likes picking Strawberrys&with his mother in the fields&but sometimes
// he likes picking blueberries
$finalstring = preg_replace("/$regexstring/", $replacestring, $mystring);
// Break string back up into array, and return any html entities which might have existed
// at the beginning.
$replacedarray = array_map('html_entity_decode', explode("&", $finalstring));
var_dump($replacedarray);
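The question also asks that rows stay intact and that case and punctuation be ignored. One way to get that is to match word by word instead of using the glue-and-regex method above. The following is only a sketch of that different technique, with a hypothetical function name; it assumes the replacement has the same number of words as the phrase being searched for, so each word can stay in its original row (PHP 7.4+ for the arrow functions):

```php
<?php
function replacePhraseAcrossRows(array $rows, string $find, string $replace): array
{
    // Lowercase a word and drop everything that is not a letter or digit.
    $normalize = fn(string $w): string => strtolower(preg_replace('/[^\p{L}\p{N}]+/u', '', $w));

    $findWords = array_map($normalize, preg_split('/\s+/', trim($find)));
    $replaceWords = preg_split('/\s+/', trim($replace));
    if (count($findWords) !== count($replaceWords)) {
        throw new InvalidArgumentException('Sketch only handles equal word counts.');
    }

    // Flatten the rows into [rowKey, word] pairs so a phrase can span rows.
    $words = [];
    foreach ($rows as $key => $row) {
        foreach (preg_split('/\s+/', trim($row), -1, PREG_SPLIT_NO_EMPTY) as $word) {
            $words[] = [$key, $word];
        }
    }

    $n = count($findWords);
    for ($i = 0; $i + $n <= count($words); $i++) {
        $match = true;
        for ($j = 0; $j < $n; $j++) {
            if ($normalize($words[$i + $j][1]) !== $findWords[$j]) {
                $match = false;
                break;
            }
        }
        if ($match) {
            // Swap words one for one; each keeps the row key it already had.
            for ($j = 0; $j < $n; $j++) {
                $words[$i + $j][1] = $replaceWords[$j];
            }
            $i += $n - 1;
        }
    }

    // Rebuild the rows under their original keys.
    $result = array_fill_keys(array_keys($rows), []);
    foreach ($words as [$key, $word]) {
        $result[$key][] = $word;
    }
    return array_map(fn(array $ws): string => implode(' ', $ws), $result);
}

$rows = [
    'unless you of course include Sundays',
    'which are also pretty good days too.',
];
$result = replacePhraseAcrossRows($rows, 'Sundays which are also pretty good', 'Mondays which are also pretty great');
print_r($result);
```

Punctuation attached to a replaced word is dropped along with the word, and runs of whitespace inside a row are collapsed to single spaces.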

PHP preg_split is returning empty strings

I am trying to split a string of combined lowercase letters into separate words with each first letter of the word being capitalized. I am trying to use PHP's preg_split(), but I'm not sure that I'm using it correctly, because the words aren't delimiters. The options for words are:
1. Burger
2. Fries
3. Chicken
4. Pizza
5. Sandwich
6. Onionrings
7. Milkshake
8. Coke
The below code returns blank array elements:
<?php
$input = 'milkshakepizzachickenfriescokeburgerpizzasandwichmilkshakepizza';
$split = preg_split("/(burger|fries|chicken|pizza|sandwich|onionrings|milkshake|coke)/", $input);
var_dump($split);
All the var_dumps and the echos are for debugging purposes only. The expected output is to have one long string with space-separated menu items. For example:
Burger Coke Fries
preg_split() splits the string on the pattern you give it, just like most split()-style functions, and discards whatever the pattern matches. Your whole string is made of delimiters, so of course you get an array of blanks. If you split the string "-----" by the character -, for instance, then every character is counted as a delimiter and gets scooped out of the string.
What you want is preg_match_all().
preg_match_all — Perform a global regular expression match
Store the matches in some $matches variable as I do below...
$input = 'milkshakepizzachickenfriescokeburgerpizzasandwichmilkshakepizza';
$split = preg_match_all("/(burger|fries|chicken|pizza|sandwich|onionrings|milkshake|coke)/", $input, $matches);
print_r($matches);
Results:
[0] => Array
(
[0] => milkshake
[1] => pizza
[2] => chicken
[3] => fries
[4] => coke
[5] => burger
[6] => pizza
[7] => sandwich
[8] => milkshake
[9] => pizza
)
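To get from those matches to the space-separated, capitalized string the question asks for, the full matches in $matches[0] can be joined and passed through ucwords. A small sketch; the $menu variable name is an assumption:

```php
<?php
$input = 'milkshakepizzachickenfriescokeburgerpizzasandwichmilkshakepizza';
$pattern = '/burger|fries|chicken|pizza|sandwich|onionrings|milkshake|coke/';

preg_match_all($pattern, $input, $matches);

// $matches[0] holds the full matches in order of appearance.
$menu = ucwords(implode(' ', $matches[0]));
echo $menu, "\n";
```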
try this
<?php
$input ="burger|fries|chicken|pizza|sandwich|onionrings|milkshake|coke";
$pattern = "/[|\s:]/";
$split = preg_split($pattern,$input);
print_r ($split);
You can keep preg_split() if you capture the delimiters with PREG_SPLIT_DELIM_CAPTURE. The bits between the delimiters are empty strings, and PREG_SPLIT_NO_EMPTY discards them.
<?php
$input = 'milkshakepizzachickenfriescokeburgerpizzasandwichmilkshakepizza';
$split = preg_split("/(burger|fries|chicken|pizza|sandwich|onionrings|milkshake|coke)/", $input, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
print ucwords(implode(' ', $split));
Output:
Milkshake Pizza Chicken Fries Coke Burger Pizza Sandwich Milkshake Pizza

splitting string into php array

I have an array that outputs the following:
Array
(
[0] => #EXTM3U
[1] => #EXTINF:206,"Weird" Al Yankovic - Dare to be Stupid
[2] => E:\Dare to be Stupid.mp3
[3] => #EXTINF:156,1910 Fruitgum Company - Chewy, Chewy
[4] => E:\Chewy Chewy.mp3
[5] => #EXTINF:134,1910 Fruitgum Company - Goody Goody Gumdrops
[6] => E:\Goody Goody Gumdrops.mp3
[7] => #EXTINF:134,1910 Fruitgum Company - Simon Says
[8] => E:\Simon Says.mp3
[9] => #EXTINF:255,3 Doors Down - When I'm Gone
[10] => E:\When I'm Gone.mp3
[11] => #EXTINF:179,? And the Mysterians - 96 Tears
)
I need to split this array then loop through and save each value to the database, e.g:
"Weird" Al Yankovic - Dare to be Stupid
Fruitgum Company - Chewy, Chewy
Save each value above to database individually.
Thanks in advance!
Edit: Added from the comments
Let me try and explain in more detail. I start with a string that looks like this:
#EXTM3U #EXTINF:266,10cc - Dreadlock Holiday
D:\Music - Sorted\Various Artists\De Beste Pop Klassiekers Disc 1\10cc - Dreadlock Holiday.mp3
#EXTINF:263,1919 - Cry Wolf
D:\Music - Sorted\Various Artists\Gothic Rock, Vol. 3 Disc 2\1919 - Cry Wolf.mp3
#EXTINF:318,3 Doors Down - [Untitled Hidden Track]
D:\Music - Sorted\3 Doors Down\Away From The Sun\3 Doors Down - [Untitled Hidden Track].mp3
I'm then trying to strip everything else out of this and end up with just an array of track titles. This is a playlist file for online radio. What I am doing so far:
$finaloutput = $_POST['thestring'];
$finaloutput = str_replace('#EXTINF:','',$finaloutput);
$finaloutput = str_replace('#EXTM3U','',$finaloutput);
$finaloutput = preg_split("/\r\n|\r|\n/", $finaloutput);
foreach ($finaloutput as $value) {
echo $value; echo '<br>';
}
But I still have rows like the one below remaining. I need to find a way to do a str_replace on everything between a line break and the trailing .mp3:
D:\Music - Sorted\3 Doors Down\Away From The Sun\3 Doors Down - [Untitled Hidden Track].mp3
You can extract the relevant parts from source by use of preg_match_all with a regex like this.
$pattern = '/^#[^,\n]+,\K.*/m';
^ anchor matches start of line in multiline mode which is set with m flag.
#[^,\n]+, matches the part from # until , by use of a negated class.
\K is used to reset beginning of the reported match. We don't want the previous part.
.* the part to be extracted: Any amount of any character until end of the line.
// Only print when at least one title was found.
if (preg_match_all($pattern, $finaloutput, $out) > 0) {
    print_r($out[0]);
}
PHP demo at eval.in
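Put together as a runnable sketch, with a shortened playlist whose paths are invented for the example. The trim step is an addition, on the assumption that the posted text may use Windows line endings, in which case .* can leave a trailing carriage return on each title:

```php
<?php
// Hypothetical, abbreviated playlist with \r\n line endings.
$playlist = "#EXTM3U\r\n"
    . "#EXTINF:266,10cc - Dreadlock Holiday\r\n"
    . "D:\\Music\\10cc - Dreadlock Holiday.mp3\r\n"
    . "#EXTINF:263,1919 - Cry Wolf\r\n"
    . "D:\\Music\\1919 - Cry Wolf.mp3\r\n";

$pattern = '/^#[^,\n]+,\K.*/m';
preg_match_all($pattern, $playlist, $out);

// Strip any carriage return left at the end of a title.
$titles = array_map('trim', $out[0]);
print_r($titles);
```

The #EXTM3U header and the file path lines are skipped because they contain no "#...," prefix to match.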

Get all matches with pure regex?

I'm working in PHP and need to parse strings looking like this:
Rake (100) Pot (1000) Players (andy: 10, bob: 20, cindy: 70)
I need to get the rake, pot, and rake contribution per player with names. The number of players is variable. Order is irrelevant so long as I can match player name to rake contribution in a consistent way.
For example I'm looking to get something like this:
Array
(
[0] => Rake (100) Pot (1000) Players (andy: 10, bob: 20, cindy: 70)
[1] => 100
[2] => 1000
[3] => andy
[4] => 10
[5] => bob
[6] => 20
[7] => cindy
[8] => 70
)
I was able to come up with a regex which matches the string, but it only returns the last player-rake contribution pair:
^Rake \(([0-9]+)\) Pot \(([0-9]+)\) Players \((?:([a-z]*): ([0-9]*)(?:, )?)*\)$
Outputs:
Array
(
[0] => Rake (100) Pot (1000) Players (andy: 10, bob: 20, cindy: 70)
[1] => 100
[2] => 1000
[3] => cindy
[4] => 70
)
I've tried using preg_match_all and g modifiers but to no success. I know preg_match_all would be able to get me what I wanted if I ONLY wanted the player-rake contribution pairs but there is data before that I also require.
Obviously I can use explode and parse the data myself but before going down that route I need to know if/how this can be done with pure regex.
You could use the below regex,
(?:^Rake \(([0-9]+)\) Pot \(([0-9]+)\) Players \(|)(\w+):?\s*(\d+)(?=[^()]*\))
The | at the end of the first non-capturing group gives that group an empty alternative, so it is allowed to match nothing. This lets the regex engine match each remaining name and number in the string using the pattern which follows the non-capturing group.
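A sketch of how this pattern might be used with preg_match_all. The PREG_SET_ORDER flag and the variable names are choices made for the example, not something the answer specifies:

```php
<?php
$string = 'Rake (100) Pot (1000) Players (andy: 10, bob: 20, cindy: 70)';
$regex = '/(?:^Rake \(([0-9]+)\) Pot \(([0-9]+)\) Players \(|)(\w+):?\s*(\d+)(?=[^()]*\))/';

preg_match_all($regex, $string, $sets, PREG_SET_ORDER);

$rake = null;
$pot = null;
$players = [];
foreach ($sets as $set) {
    // Groups 1 and 2 are only filled in the match that includes the prefix.
    if ($set[1] !== '') {
        $rake = (int) $set[1];
        $pot = (int) $set[2];
    }
    // Groups 3 and 4 hold one player name and contribution per match.
    $players[$set[3]] = (int) $set[4];
}
print_r(['rake' => $rake, 'pot' => $pot, 'players' => $players]);
```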
I would use the following Regex to validate the input string:
^Rake \((?<Rake>\d+)\) Pot \((?<Pot>\d+)\) Players \(((?:\w*: \d*(?:, )?)+)\)$
And then just use the explode() function on the last capture group to split the players out:
preg_match($regex, $string, $matches);
// Named groups are also numbered: 1 = Rake, 2 = Pot, so the player list is group 3.
$players = explode(', ', $matches[3]);
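A fuller sketch of this validate-then-explode approach. Two things here are assumptions rather than part of the answer: the third group is given the name Players, and the player list is tightened to name: number pairs separated by ", " so that explode never sees an empty trailing item:

```php
<?php
$string = 'Rake (100) Pot (1000) Players (andy: 10, bob: 20, cindy: 70)';
$regex = '/^Rake \((?<Rake>\d+)\) Pot \((?<Pot>\d+)\) Players \((?<Players>\w+: \d+(?:, \w+: \d+)*)\)$/';

$contributions = [];
if (preg_match($regex, $string, $matches)) {
    // Each item looks like "andy: 10"; split it into name and amount.
    foreach (explode(', ', $matches['Players']) as $pair) {
        [$name, $amount] = explode(': ', $pair);
        $contributions[$name] = (int) $amount;
    }
}
print_r($contributions);
```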

regex to find year/month substring

Can someone help me with a regular expression to get the year and month from a text string?
Here is an example text string:
http://www.domain.com/files/images/2012/02/filename.jpg
I'd like the regex to return 2012/02.
This regex pattern would match what you need:
(?<=\/)\d{4}\/\d{2}(?=\/)
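In PHP the pattern needs delimiters. A minimal sketch using # as the delimiter, which is a choice made here so the slashes need no escaping:

```php
<?php
$url = 'http://www.domain.com/files/images/2012/02/filename.jpg';

// Four digits, a slash, two digits, with a slash on either side.
if (preg_match('#(?<=/)\d{4}/\d{2}(?=/)#', $url, $m)) {
    echo $m[0], "\n";
}
```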
Depending on your situation and how much your strings vary - you might be able to dodge a bullet by simply using PHP's handy explode() function.
A simple demonstration - Dim the lights please...
$str = 'http://www.domain.com/files/images/2012/02/filename.jpg';
print_r( explode("/",$str) );
Returns :
Array
(
[0] => http:
[1] =>
[2] => www.domain.com
[3] => files
[4] => images
[5] => 2012 // Jack
[6] => 02 // Pot!
[7] => filename.jpg
)
The explode() function splits a string according to a "delimiter" that you provide. In this example I have used the / (slash) character.
So you see, you can just grab the values at indexes 5 and 6 to get the year and month.
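Joining those two elements gives the 2012/02 string the question asks for. A short sketch, which assumes every URL has the same directory depth as the sample:

```php
<?php
$str = 'http://www.domain.com/files/images/2012/02/filename.jpg';
$parts = explode('/', $str);

// Assumption: the year and month always sit at indexes 5 and 6.
$yearMonth = $parts[5] . '/' . $parts[6];
echo $yearMonth, "\n";
```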
