PHP Regex, Only getting back partial Results - php

I have the following PHP regex:
#<tr[\s\S]*?<a class="b1"[\s\S]*?<em[^>]*>([^<]*)[\s\S]*?stars_small_([0-9].[0-9])#
Which I am using on this site:
Gamespy
I get back this data:
[1] => Array
(
[0] => AC/DC Live: Rock Band Track Pack
[1] => Ace Combat 6: Fires of Liberation
[2] => All-Pro Football 2K8
[3] => Alone in the Dark
[4] => Armored Core 4
[5] => Army of Two
[6] => Army of Two: The 40th Day
)
[2] => Array
(
[0] => 3.5
[1] => 2.5
[2] => 3.5
[3] => 3.5
[4] => 2.5
[5] => 3.5
[6] => 3.5
)
This is the kind of data I am looking for, but I am not getting all of it back. I should get the following titles with scores, yet for some reason only some of them are returned:
AC/DC Live: Rock Band Track Pack
Ace Combat 6: Fires of Liberation
Afro Samurai
Alan Wake
Aliens vs. Predator
All-Pro Football 2K8
Alone in the Dark
Amped 3
Armored Core 4
Army of Two
Army of Two: The 40th Day
Assassin's Creed
Assassin's Creed II
Assassin's Creed: Brotherhood
Avatar: The Game
I have tested my regex here:
http://www.solmetra.com/scripts/regex/index.php
Using this HTML:
http://justpaste.it/20u5
Any help explaining why I am only getting back some of the results would be greatly appreciated. Thanks

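Change the sub-pattern stars_small_([0-9].[0-9]) to stars_small_([0-9](?:\.[0-9])?), as some of the URLs in the src attribute of the img tag contain only one digit. The original sub-pattern requires a digit, any character (the . is unescaped), and a second digit, so single-digit ratings never match.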
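As a minimal sketch of the difference between the two sub-patterns, the snippet below runs both against a short string. The image file names are hypothetical stand-ins, not the markup of the real page.

```php
<?php
// Hypothetical image file names; the real page markup is not reproduced here.
$html = '<img src="stars_small_3.5.gif"> <img src="stars_small_4.gif">';

// Original sub-pattern: a digit, any one character, then another digit.
// A single-digit rating such as "4" is followed by ".g", so it is skipped.
preg_match_all('#stars_small_([0-9].[0-9])#', $html, $strict);

// Suggested sub-pattern: the decimal part is optional and the dot is escaped.
preg_match_all('#stars_small_([0-9](?:\.[0-9])?)#', $html, $loose);

print_r($strict[1]); // only the "3.5" rating
print_r($loose[1]);  // both "3.5" and "4"
```

Only the rating sub-pattern changes; the rest of the original expression can stay as it is.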

Related

Find semantics words between different phrases

I am trying to find a way in PHP to detect the words that related phrases have in common. For example, I have something like:
Array
(
[0] => Cats love Mouse Car Dog Fish
[1] => Pictures Cats Some Text Mouse
[2] => Game of thrones 2015 Series
[3] => Stark Series lannister John Game of thrones
[4] => Pop Rock David Bowie Music
[5] => David Great Lower Text Bowie
)
And I expect this output:
Array
(
[0] => Cats Mouse
[1] => Game of thrones Series
[2] => David Bowie
)
How can I proceed?

PHP extracting data from text

I have an old Windows 95 program that exports data without account numbers, seasonal accounts, or whether an account contains a sub-account.
I am, however, able to print the customer information and notes that contain this information to a PDF file and copy that text to Notepad, and that text is what I would like to extract the data from.
The order of the data: 1) page headers (I do not need this data):
Company Name
Customer Information and Notes
Computed Monday, August 10 2015 Page 1
2) standard titles and 3) the data after titles:
Ser Name: Block, Sunny Route: 1
Address: 3354 ASPEN RD. Frequency: Monthly
Address: ST PETE, GA 33333 Week/Day: First Monday
City State Zip: data Sched Time (HH:MM): 10:00A
Ser Phone: 555-1212 Service: BASIC SERVICE
Bill to: BLOCK,SUNNY Rate ($): 24.00
Company Name
Customer Information and Notes
Computed Monday, August 10 2015 Page 2
Address: 1123 Sligh Terms: CASH
Address: Apt B
notes: Sunny has a mean dog
Do not enter unless dog is put up
Then it loops to the next customer's data, and so on.
The main titles never change (Ser Name, Route, Address, notes, Ser Phone, and so on), and there is a set number of titles in a fixed order. However, the notes: title can take 1 to 16 lines, and the page header appears at random places throughout the data. Although the titles are in order, Address appears four times: lines 1 and 2 of the service address and lines 1 and 2 of the billing address.
I would like to assign these titles to variables and keep only what comes after them, doing the extraction in PHP. Is there any way to do this?
I don't think a perfect solution is possible, but FWIW, maybe this is good enough for you.
Without a known, reliable delimiter between clients, I can't think of a good way to get the notes without also picking up the header of the next page, unless you do something involving a big lookup table of all client names.
I do have an (ugly) regex that may reliably help with the other fields, though:
$content='[the contents of your file]';
// The lookahead requires the trailing colon so that, with the i flag, a value such as "BASIC SERVICE" is not cut short at the label word "Service".
preg_match_all('~(Ser Name|Route|Address|Frequency|Week/Day|City State Zip|Sched Time \(HH:MM\)|Ser Phone|Service|Bill to|Rate \(\$\)|Terms|notes):\s*((?:(?!(?:Ser Name|Route|Address|Frequency|Week/Day|City State Zip|Sched Time \(HH:MM\)|Ser Phone|Service|Bill to|Rate \(\$\)|Terms|notes):).)+)~is',$content,$matches);
This looks for a title (the "header") and puts it into the first captured group, then matches everything up to the next title and puts that into the second captured group.
Perhaps this is good enough for you; TBH I can't think of anything better unless you can get the export into a better format.
Your example data would output:
Array
(
[0] => Array
(
[0] => Ser Name: Block, Sunny
[1] => Route: 1
[2] => Address: 3354 ASPEN RD.
[3] => Frequency: Monthly
[4] => Address: ST PETE, GA 33333
[5] => Week/Day: First Monday
[6] => City State Zip: data
[7] => Sched Time (HH:MM): 10:00A
[8] => Ser Phone: 555-1212
[9] => Service: BASIC SERVICE
[10] => Bill to: BLOCK,SUNNY
[11] => Rate ($): 24.00
Company Name
Customer Information and Notes
Computed Monday, August 10 2015 Page 2
[12] => Address: 1123 Sligh
[13] => Terms: CASH
[14] => Address: Apt B
[15] => notes: Sunny has a mean dog
)
[1] => Array
(
[0] => Ser Name
[1] => Route
[2] => Address
[3] => Frequency
[4] => Address
[5] => Week/Day
[6] => City State Zip
[7] => Sched Time (HH:MM)
[8] => Ser Phone
[9] => Service
[10] => Bill to
[11] => Rate ($)
[12] => Address
[13] => Terms
[14] => Address
[15] => notes
)
[2] => Array
(
[0] => Block, Sunny
[1] => 1
[2] => 3354 ASPEN RD.
[3] => Monthly
[4] => ST PETE, GA 33333
[5] => First Monday
[6] => data
[7] => 10:00A
[8] => 555-1212
[9] => BASIC SERVICE
[10] => BLOCK,SUNNY
[11] => 24.00
Company Name
Customer Information and Notes
Computed Monday, August 10 2015 Page 2
[12] => 1123 Sligh
[13] => CASH
[14] => Apt B
[15] => Sunny has a mean dog
)
)
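If the page header really is the same three lines every time, one option is to strip it before matching, so it no longer leaks into the Rate ($) or notes values. The sketch below assumes exactly that: a header made of the literal lines "Company Name", "Customer Information and Notes" and "Computed ... Page N". The sample text is a shortened copy of the question's data, and $labels, $fields and the grouping by title are illustrative choices, not part of the original answer. It uses the same label list as the regex above, with the lookahead requiring the trailing colon so that a value like BASIC SERVICE is not cut at the word SERVICE.

```php
<?php
// Shortened sample based on the question's data (nowdoc, so "$" is literal).
$content = <<<'TXT'
Company Name
Customer Information and Notes
Computed Monday, August 10 2015 Page 1
Ser Name: Block, Sunny Route: 1
Ser Phone: 555-1212 Service: BASIC SERVICE
Bill to: BLOCK,SUNNY Rate ($): 24.00
Company Name
Customer Information and Notes
Computed Monday, August 10 2015 Page 2
Address: 1123 Sligh Terms: CASH
notes: Sunny has a mean dog
Do not enter unless dog is put up
TXT;

// Assumption: every page header consists of exactly these three lines.
$headerPattern = '~^Company Name\RCustomer Information and Notes\RComputed [^\r\n]*Page \d+\R?~m';
$clean = preg_replace($headerPattern, '', $content);

// Same titles as in the regex above.
$labels = 'Ser Name|Route|Address|Frequency|Week/Day|City State Zip'
    . '|Sched Time \(HH:MM\)|Ser Phone|Service|Bill to|Rate \(\$\)|Terms|notes';

// Group 1 is the title, group 2 is everything up to the next "title:".
$pattern = '~(' . $labels . '):\s*((?:(?!(?:' . $labels . '):).)+)~is';
preg_match_all($pattern, $clean, $matches, PREG_SET_ORDER);

// Titles such as Address repeat, so collect the values per title in order.
$fields = array();
foreach ($matches as $m) {
    $fields[$m[1]][] = trim($m[2]);
}

print_r($fields);
```

This still does not solve the missing delimiter between customers; it only keeps the page header out of the captured values.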

php sort multidimensional array on date column

I have an array $dataStoreForTableImport set up in the following way.
$dataStoreForTableImport['title']
$dataStoreForTableImport['content']
$dataStoreForTableImport['date']
$dataStoreForTableImport['link']
$dataStoreForTableImport['username']
$dataStoreForTableImport['website']
It contains data as below
Array
(
[0] => Array
(
[title] => Quote from Tony Blair
[content] => ... from beating it I'm afraid." (Tony Blair, Sky News) He had every opportunity to put religion in its ...
[articledate] => 28/09/2013
[link] => http://boards.fool.co.uk/message.asp?source=isesitlnk0000001&mid=12890951
[Username] => Michael Dray
[website] => The Motley Fool
)
[1] => Array
(
[title] => Re: The Tony Blair Show
[content] => ... I am irritated that he got such an easy ride; Why? Because he is not to your political liking? He was a witness; he was not on trial and he spoke under oath. What did you expect Jay to ask him? I had dealings with a QC a few years ago. He was as ...
[articledate] => 28/05/2012
[link] => http://boards.fool.co.uk/message.asp?source=isesitlnk0000001&mid=12564154
[Username] => Michael Dray
[website] => The Motley Fool
)
[2] => Array
(
[title] => Re: The Tony Blair Show
[content] => ... If your doubts about Jay's competence/bias were shared I'm sure it would have been debated ad nauseam on Radio 4. Eh - are you serious? I'm a Radio 4 fan - but thats despite its hatred of all things right of centre, not becuase of. ...
[articledate] => 28/05/2012
[link] => http://boards.fool.co.uk/message.asp?source=isesitlnk0000001&mid=12564167
[Username] => Michael Dray
[website] => The Motley Fool
)
[3] => Array
(
[title] => Re: The Tony Blair Show
[content] => ... Maybe Tony should have brought Cherie with him - remember Rupert Murdoch and the reaction of Wendi Deng to the custard pie incident. IMHO Cherie is every bit as intimidating:-) Wendi Deng did not eject the pie flinger, she intervened when he acted. Use of an angled bat to deflect criticism from ...
[articledate] => 30/05/2012
[link] => http://boards.fool.co.uk/message.asp?source=isesitlnk0000001&mid=12565346
[Username] => Michael Dray
[website] => The Motley Fool
)
[4] => Array
(
[title] => Re: The Tony Blair Show
[content] => ... What did surprise me was the fact that he had his own personal bodyguards in the hearing with him. Although, given the level of security that allowed that protester to break into the hearing, maybe he had a point! Eh? He is clearly at risk of terror ...
[articledate] => 28/05/2012
[link] => http://boards.fool.co.uk/message.asp?source=isesitlnk0000001&mid=12564500
[Username] => Michael Dray
[website] => The Motley Fool
)
I want to be able to remove rows from this array if the articledate is before a given date.
I have tried everything, but nothing seems to work. I am not even able to get the array to sort correctly by date.
The date originally comes in the format "February 10, 2007".
I have used
$sortDate = date('d/m/Y', strtotime($sortDate));
to format it to the format shown in the array above.
Can anyone please help?
Thanks
Mike
Do the filtering and sorting of rows in the database backend. This will improve the performance of your app. Use a WHERE clause to filter by date and an ORDER BY clause to order by date in your SQL query.
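If the rows do not come from a database, or the array has to be handled in PHP anyway, a rough sketch of the same filter and sort could look like the following. It assumes the dates are stored as d/m/Y strings under the articledate key, as in the dump above; the sample rows, the cutoff date and the helper parseArticleDate() are hypothetical. Comparing d/m/Y strings directly compares the day first, so the strings are parsed into DateTime objects before any comparison.

```php
<?php
// Hypothetical sample rows using the 'articledate' key shown in the dump.
$rows = array(
    array('title' => 'Quote from Tony Blair', 'articledate' => '28/09/2013'),
    array('title' => 'Re: The Tony Blair Show', 'articledate' => '28/05/2012'),
    array('title' => 'Re: The Tony Blair Show', 'articledate' => '30/05/2012'),
);

// Hypothetical helper: parse a d/m/Y string explicitly.
// The leading "!" resets the time fields so only the date is compared.
function parseArticleDate($value)
{
    return DateTime::createFromFormat('!d/m/Y', $value);
}

$cutoff = parseArticleDate('29/05/2012'); // hypothetical cutoff date

// Keep only rows dated on or after the cutoff.
$kept = array_filter($rows, function ($row) use ($cutoff) {
    $date = parseArticleDate($row['articledate']);
    return $date !== false && $date >= $cutoff;
});

// Sort the remaining rows by date, oldest first (usort also reindexes).
usort($kept, function ($a, $b) {
    $ta = parseArticleDate($a['articledate'])->getTimestamp();
    $tb = parseArticleDate($b['articledate'])->getTimestamp();
    if ($ta === $tb) {
        return 0;
    }
    return $ta < $tb ? -1 : 1;
});

print_r($kept);
```

Storing the dates as Y-m-d strings or timestamps in the first place would make both steps simpler, since those formats sort correctly as plain values.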

check if in_array php not working

I want to check whether a value is in an array. If it's not there, it gets pushed to a new array. If the value is already there, it should not be added again.
But with my code the check is not working.
It still adds the array even if the specific value is already present.
I have this code:
foreach($user_movie_info['data'] as $movie) {
similar_text($movie_facebook_page['genre'], $movie['genre'], $percent);
if ($percent > 30) {
echo $movie_facebook_page['genre']. "" ."</br>";
echo $movie['genre']. "" ."</br>";
echo $percent. "" ."</br>";
echo "match! </br></br>";
// add all movie information to matched array, only if its not already present
if (!in_array($movie_facebook_page['name'], $matched_movies_array)) {
array_push($matched_movies_array, $movie_facebook_page);
}
} // if
} // foreach
If I print out $matched_movies_array, one array appears in it twice:
Array
(
[0] => Array
(
[about] => In theaters January 4th 2013 www.TexasChainsaw3D.com
[can_post] => 1
[description] => Lionsgate’s TEXAS CHAINSAW 3D continues the legendary story of the homicidal Sawyer family, picking up where Tobe Hooper’s 1974 horror classic left off in Newt, Texas, where for decades people went missing without a trace. The townspeople long suspected the Sawyer family, owners of a local barbeque pit, were somehow responsible. Their suspicions were finally confirmed one hot summer day when a young woman escaped the Sawyer house following the brutal murders of her four friends. Word around the small town quickly spread, and a vigilante mob of enraged locals surrounded the Sawyer stronghold, burning it to the ground and killing every last member of the family – or so they thought.
Decades later and hundreds of miles away from the original massacre, a young woman named Heather learns that she has inherited a Texas estate from a grandmother she never knew she had. After embarking on a road trip with friends to uncover her roots, she finds she is the sole owner of a lavish, isolated Victorian mansion. But her newfound wealth comes at a price as she stumbles upon a horror that awaits her in the mansion’s dank cellars…
With gruesome surprises in store for a whole new generation, TEXAS CHAINSAW 3D stars Alexandra Daddario, Dan Yeager, Tremaine ‘Trey Songz’ Neverson, Scott Eastwood, Tania Raymonde, Shaun Sipos, Keram Malicki-Sanchez, James MacDonald, Thom Barry, Paul Rae and Richard Riehle, along with special appearances from four beloved cast members from previous installments of the franchise: Gunnar Hansen (the original Leatherface), Marilyn Burns, John Dugan and Bill Moseley. The film is directed by John Luessenhop (TAKERS), from a screenplay by Adam Marcus & Debra Sullivan and Kirsten Elms, based on a story by Stephen Susco and Adam Marcus & Debra Sullivan and based on characters created by Kim Henkel and Tobe Hooper, and produced by Carl Mazzocone. Lionsgate presents a production and Main Line Pictures production.
[directed_by] => John Luessenhop
[genre] => Horror
[is_published] => 1
[produced_by] => Millennium Films
[release_date] => January 4th 2012
[screenplay_by] => Adam Marcus & Debra Sullivan and Kirsten Elms
[starring] => Alexandra Daddario, Dan Yeager, Tremaine ‘Trey Songz’ Neverson, Scott Eastwood, Tania Raymonde, Shaun Sipos, Keram Malicki-Sanchez, James MacDonald, Thom Barry, Paul Rae and Richard Riehle
[studio] => Lionsgate
[talking_about_count] => 62964
[username] => TexasChainsaw3D
[website] => www.texaschainsaw3d.com, twitter.com/lionsgatehorror, http://pinterest.com/lionsgatemovies/texas-chainsaw-3d, https://plus.google.com/u/0/+LionsgateMovies, http://instagr.am/p/Qpm0JMPtDr/
[were_here_count] => 0
[written_by] => based on a story by Stephen Susco and Adam Marcus & Debra Sullivan
[category] => Movie
[id] => 323192834416509
[name] => Texas Chainsaw 3D
[link] => http://www.facebook.com/TexasChainsaw3D
[likes] => 367992
[cover] => Array
(
[cover_id] => 4.14284428641E+14
[source] => http://sphotos-c.ak.fbcdn.net/hphotos-ak-ash3/s720x720/530974_414284428640682_1806025466_n.png
[offset_y] => 0
)
)
[1] => Array
(
[about] => The official Facebook Page for The Shining | All work and no play makes Jack a dull boy.
[awards] => (1981) Saturn Award, Best Supporting Actor, Scatman Crothers
[can_post] => 1
[description] => Get The Shining at the WB Shop: http://bit.ly/shiningdvd
[directed_by] => Stanley Kubrick
[genre] => Horror, Suspense/Thriller
[is_published] => 1
[plot_outline] => All work and no play makes Academy Award-winner Jack Nicholson ("As Good As It Gets," "Batman"), the caretaker of an isolated resort, go way off the deep end, terrorizing his young son and wife Shelley Duvall ("Roxanne"). Master filmmaker Stanley Kubrick's ("Full Metal Jacket," "2001: A Space Odyssey") visually haunting chiller, based on the bestseller by master-of-suspense Stephen King ("The Stand," "Carrie," "The Shawshank Redemption"), is an undeniable contemporary classic. Newsweek Magazine calls this "the first epic horror film," full of indelible images, and a signature role for Nicholson whose character was recently selected by AFI for its' 50 Greatest Villains.
[produced_by] => Jan Harlan, Stanley Kubrick
[release_date] => 5/23/80
[screenplay_by] => Stephen King, Diane Johnson
[starring] => Jack Nicholson, Shelley Duvall, Danny Lloyd, Scatman Crothers
[studio] => Warner Bros.
[talking_about_count] => 5594
[username] => KubrickShining
[website] => http://bit.ly/shiningdvd
[were_here_count] => 0
[written_by] => Stephen King
[category] => Movie
[id] => 135347089926692
[name] => The Shining
[link] => http://www.facebook.com/KubrickShining
[likes] => 832526
[cover] => Array
(
[cover_id] => 2.24275514367E+14
[source] => http://sphotos-f.ak.fbcdn.net/hphotos-ak-ash4/320182_224275514367182_46004854_n.jpg
[offset_y] => 85
)
)
[2] => Array
(
[about] => The official Facebook Page for The Shining | All work and no play makes Jack a dull boy.
[awards] => (1981) Saturn Award, Best Supporting Actor, Scatman Crothers
[can_post] => 1
[description] => Get The Shining at the WB Shop: http://bit.ly/shiningdvd
[directed_by] => Stanley Kubrick
[genre] => Horror, Suspense/Thriller
[is_published] => 1
[plot_outline] => All work and no play makes Academy Award-winner Jack Nicholson ("As Good As It Gets," "Batman"), the caretaker of an isolated resort, go way off the deep end, terrorizing his young son and wife Shelley Duvall ("Roxanne"). Master filmmaker Stanley Kubrick's ("Full Metal Jacket," "2001: A Space Odyssey") visually haunting chiller, based on the bestseller by master-of-suspense Stephen King ("The Stand," "Carrie," "The Shawshank Redemption"), is an undeniable contemporary classic. Newsweek Magazine calls this "the first epic horror film," full of indelible images, and a signature role for Nicholson whose character was recently selected by AFI for its' 50 Greatest Villains.
[produced_by] => Jan Harlan, Stanley Kubrick
[release_date] => 5/23/80
[screenplay_by] => Stephen King, Diane Johnson
[starring] => Jack Nicholson, Shelley Duvall, Danny Lloyd, Scatman Crothers
[studio] => Warner Bros.
[talking_about_count] => 5594
[username] => KubrickShining
[website] => http://bit.ly/shiningdvd
[were_here_count] => 0
[written_by] => Stephen King
[category] => Movie
[id] => 135347089926692
[name] => The Shining
[link] => http://www.facebook.com/KubrickShining
[likes] => 832526
[cover] => Array
(
[cover_id] => 2.24275514367E+14
[source] => http://sphotos-f.ak.fbcdn.net/hphotos-ak-ash4/320182_224275514367182_46004854_n.jpg
[offset_y] => 85
)
)
)
I get this info from Facebook's Open Graph API.
$matched_movies_array does not contain the movie names as strings; it contains whole movie arrays. So in_array() with a name as the needle never finds a match.
Try something like:
$movieIds = array();
foreach($user_movie_info['data'] as $movie) {
similar_text($movie_facebook_page['genre'], $movie['genre'], $percent);
if ($percent > 30) {
echo $movie_facebook_page['genre']. "" ."</br>";
echo $movie['genre']. "" ."</br>";
echo $percent. "" ."</br>";
echo "match! </br></br>";
// add all movie information to matched array, only if its not already present
if (!in_array($movie_facebook_page['id'], $movieIds)) {
$movieIds[] = $movie_facebook_page['id'];
array_push($matched_movies_array, $movie_facebook_page);
}
} // if
} // foreach
Or maybe even better:
$id = $movie_facebook_page['id'];
if (!isset($matched_movies_array[$id])) {
$matched_movies_array[$id] = $movie_facebook_page;
}
The in_array() function doesn't work searching for a string in a multi-dimensional array. Take this for example:
$find = array("name" => "test");
$matches = array(array("name" => "test"), array("name" => "test2"));
echo (in_array($find['name'], $matches)) ? "found" : "not found";
echo "<br /><br />";
echo (in_array($find, $matches)) ? "found" : "not found";
The first echo is "not found", the second one is "found". You should use your entire $movie_facebook_page array as the needle.
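Another way to keep the name-based check is to flatten the stored records into a list of names first. The sketch below uses array_column(), which requires PHP 5.5 or later; the $pages data is a hypothetical, trimmed-down stand-in for the Facebook records, keeping only the id and name keys.

```php
<?php
// Hypothetical, trimmed-down page records; only 'id' and 'name' matter here.
$pages = array(
    array('id' => '323192834416509', 'name' => 'Texas Chainsaw 3D'),
    array('id' => '135347089926692', 'name' => 'The Shining'),
    array('id' => '135347089926692', 'name' => 'The Shining'),
);

$matched = array();
foreach ($pages as $page) {
    // array_column() pulls the 'name' value out of every stored record,
    // giving in_array() a flat list of strings to search.
    $names = array_column($matched, 'name');
    if (!in_array($page['name'], $names, true)) {
        $matched[] = $page;
    }
}

echo count($matched), "\n"; // the duplicate "The Shining" record is skipped
```

For large lists, keying the result array by id as shown earlier avoids rebuilding the name list on every iteration.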

RegEx Statement Issues - PHP

I am attempting to use RegEx to strip down the following data:
mlb_s_left1=Baltimore 3 ^NY Yankees 12 (FINAL)&mlb_s_right1_1=W: Hughes L: Britton&mlb_s_right1_count=1&mlb_s_url1=http://sports.espn.go.com/mlb/boxscore?gameId=320801110&mlb_s_left2=^Chicago Sox 3 Minnesota 2 (FINAL)&mlb_s_right2_1=W: Peavy L: Diamond S: Reed&mlb_s_right2_count=1&mlb_s_url2=http://sports.espn.go.com/mlb/boxscore?gameId=320801109
I am hoping to split it apart by home team (first city), home score (first number), away team (second city), away score (second number), and where in the game it is (in parentheses). This is the RegEx I have currently, but I have a feeling it is very wrong.
preg_match_all('/mlb_s_left[0-9]=(?P<hometeam>.*?) (?P<homescore>.*?) (?P<awayteam>.*?) (?P<awayscore>.*?)\((?P<time>.*?)\)/', $content, $matches);
I would appreciate any and all help in getting this working.
I have tested the following code snippet in PHP 5.4.5:
<?php
$foo = 'mlb_s_left1=Baltimore 3 ^NY Yankees 12 (FINAL)&mlb_s_right1_1=W: Hughes L: Britton&mlb_s_right1_count=1&mlb_s_url1=http://sports.espn.go.com/mlb/boxscore?gameId=320801110&mlb_s_left2=^Chicago Sox 3 Minnesota 2 (FINAL)&mlb_s_right2_1=W: Peavy L: Diamond S: Reed&mlb_s_right2_count=1&mlb_s_url2=http://sports.espn.go.com/mlb/boxscore?gameId=320801109';
preg_match_all('/mlb_s_left\d=\^?(?P<hometeam>[a-zA-Z]+(?:\s+[a-zA-Z]+)*)\s+(?P<homescore>\d+)\s+\^?(?P<awayteam>[a-zA-Z]+(?:\s+[a-zA-Z]+)*)\s+(?P<awayscore>\d+)\s+\((?P<time>\w+)\)/', $foo, $matches, PREG_SET_ORDER);
print_r($matches);
?>
output:
Array
(
[0] => Array
(
[0] => mlb_s_left1=Baltimore 3 ^NY Yankees 12 (FINAL)
[hometeam] => Baltimore
[1] => Baltimore
[homescore] => 3
[2] => 3
[awayteam] => NY Yankees
[3] => NY Yankees
[awayscore] => 12
[4] => 12
[time] => FINAL
[5] => FINAL
)
[1] => Array
(
[0] => mlb_s_left2=^Chicago Sox 3 Minnesota 2 (FINAL)
[hometeam] => Chicago Sox
[1] => Chicago Sox
[homescore] => 3
[2] => 3
[awayteam] => Minnesota
[3] => Minnesota
[awayscore] => 2
[4] => 2
[time] => FINAL
[5] => FINAL
)
)
Something like this should get you close.
preg_match_all('/mlb_s_left\d+=(?P<hometeam>\D+)\s+(?P<homescore>\d+)\s+(?P<awayteam>\D+)\s+(?P<awayscore>\d+)\s*\((?P<time>[^)]+)\)/',
$content, $matches);
Note that \d matches any digit, and \D matches anything that is not a digit.
[^)]+ matches one or more non-close parens characters; \s+ matches one or more whitespace chars, and \s* matches zero or more whitespace characters.
This wouldn't work very well if a city name has a number in it, and because \D also matches the ^ marker, that character ends up in the captured team name. If you have a huge string, it's possible the match could get hung up somewhere; you might consider splitting it up and matching a bit more piecemeal.
Generally speaking I would avoid .*? as a pattern match, as it basically matches almost anything. It's best for your regular expression to be as specific as possible, based on what you know about the data.
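A piecemeal approach along those lines might look like the sketch below: split the string into key=value pairs first, then match only the mlb_s_left values. The input is a shortened copy of the question's data (the URL fields are left out), and the $games structure is an illustrative choice. The pattern is a variation of the ones above that skips an optional ^ before each team name.

```php
<?php
// Shortened copy of the question's data; URL fields are omitted for brevity.
$content = 'mlb_s_left1=Baltimore 3 ^NY Yankees 12 (FINAL)'
    . '&mlb_s_right1_1=W: Hughes L: Britton&mlb_s_right1_count=1'
    . '&mlb_s_left2=^Chicago Sox 3 Minnesota 2 (FINAL)'
    . '&mlb_s_right2_1=W: Peavy L: Diamond S: Reed&mlb_s_right2_count=1';

// Team names are runs of non-digits; an optional "^" in front is not captured.
$pattern = '/^\^?(?P<hometeam>\D+?)\s+(?P<homescore>\d+)\s+'
    . '\^?(?P<awayteam>\D+?)\s+(?P<awayscore>\d+)\s*\((?P<time>[^)]+)\)$/';

$games = array();
foreach (explode('&', $content) as $pair) {
    // Split on the first "=" only, since values (e.g. URLs) may contain more.
    $parts = explode('=', $pair, 2);
    if (count($parts) !== 2 || !preg_match('/^mlb_s_left\d+$/', $parts[0])) {
        continue; // not a score line
    }
    if (preg_match($pattern, $parts[1], $m)) {
        $games[] = array(
            'hometeam'  => $m['hometeam'],
            'homescore' => $m['homescore'],
            'awayteam'  => $m['awayteam'],
            'awayscore' => $m['awayscore'],
            'time'      => $m['time'],
        );
    }
}

print_r($games);
```

Because each value is matched on its own and anchored at both ends, a malformed entry simply fails to match instead of dragging the regex across neighbouring fields.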
