I'm a bit of a noob when it comes to SOAP client requests.
I'm hoping someone can help: I'm trying to make a SOAP client request to a website. I can make the request, but the returned XML (which I'm turning into an array) seems to come back as a single string rather than separated into its XML elements.
The XML:
<ProductList>
<Product>
<ProductCode>00380</ProductCode>
<ProductName>Droopy Eye Specs</ProductName>
<BrochureDescription>Droopy Eye Specs, Black, with Metal Spring, on Display Card</BrochureDescription>
<WebDescription>Droopy Eye Specs, Black, with Metal Spring</WebDescription>
<WashingInstructions>Not Applicable</WashingInstructions>
<RRP>1.8900</RRP>
<StockQuantity>943</StockQuantity>
<VatRate>20.00</VatRate>
<Gender>UNISEX</Gender>
<PackType>on Display Card</PackType>
<PackQty>1</PackQty>
<Audience>ADULT</Audience>
<Colour>BLACK</Colour>
<ETA>2019-07-04 00:00:00.</ETA>
<CataloguePage>641</CataloguePage>
<BarCode>5020570003800</BarCode>
<Price1>0.91</Price1>
<Price2>0.00</Price2>
<Price3>0.00</Price3>
<Break1>1.00</Break1>
<Break2>0.00</Break2>
<Break3>0.00</Break3>
<unit_size>1</unit_size>
<warnings/>
<carton>120</carton>
<stdPrice1>0.91</stdPrice1>
<stdPrice2>0.00</stdPrice2>
<stdPrice3>0.00</stdPrice3>
<stdBreak1>12.00</stdBreak1>
<stdBreak2>0.00</stdBreak2>
<stdBreak3>0.00</stdBreak3>
<Photo>1</Photo>
<CatalogueCode>JN-01</CatalogueCode>
<CatalogueName>Jokes & Novelties_Assorted</CatalogueName>
<Catalogue/>
<acc_code1>32928</acc_code1>
<acc_code2>32929</acc_code2>
<alt_code1>20073</alt_code1>
<alt_code2>25202</alt_code2>
<alt_code3>29111</alt_code3>
<alt_code4>6155</alt_code4>
<alt_code5>98413</alt_code5>
<new_code/>
<art_cat/>
<ImageAvailability>No</ImageAvailability>
<Seasonal>No</Seasonal>
<p_list2/>
<Licence_Territory/>
<ThemeName>Funnyside Fancy Dress</ThemeName>
<GroupID>3</GroupID>
<GroupName>Adult Fancy Dress Costumes</GroupName>
<GroupID1>0</GroupID1>
<ThemeGroup1>Uncategorized</ThemeGroup1>
<GroupID2>0</GroupID2>
<ThemeGroup2>Uncategorized</ThemeGroup2>
<GroupID3>0</GroupID3>
<ThemeGroup3>Uncategorized</ThemeGroup3>
<EFPrice>0.9100</EFPrice>
<EFQty>1</EFQty>
<size>Not Applicable</size>
<Ext_Size>NOT APPLICABLE</Ext_Size>
<GenericCode>00380</GenericCode>
<HasImageRights>No</HasImageRights>
<Safety>Warning! Not suitable for children under 3 years due to small parts. Choking Hazard.</Safety>
<Composition/>
</Product>
<Product>
<ProductCode>00429</ProductCode>
<ProductName>Metal Handcuffs</ProductName>
<BrochureDescription>Metal Handcuffs, Silver, with Key, on Display Card</BrochureDescription>
<WebDescription>Metal Handcuffs, Silver, with Key</WebDescription>
<WashingInstructions>Not Applicable</WashingInstructions>
<RRP>3.0900</RRP>
<StockQuantity>4926</StockQuantity>
<VatRate>20.00</VatRate>
<Gender>UNISEX</Gender>
<PackType>on Display Card</PackType>
<PackQty>1</PackQty>
<Audience>ADULT</Audience>
<Colour>SILVER</Colour>
<ETA>2019-02-10 00:00:00.</ETA>
<CataloguePage>424</CataloguePage>
<BarCode>5020570004296</BarCode>
<Price1>1.50</Price1>
<Price2>0.00</Price2>
<Price3>0.00</Price3>
<Break1>1.00</Break1>
<Break2>0.00</Break2>
<Break3>0.00</Break3>
<unit_size>1</unit_size>
<warnings>FREIG, FREIG,</warnings>
<carton>96</carton>
<stdPrice1>1.50</stdPrice1>
<stdPrice2>0.00</stdPrice2>
<stdPrice3>0.00</stdPrice3>
<stdBreak1>3.00</stdBreak1>
<stdBreak2>0.00</stdBreak2>
<stdBreak3>0.00</stdBreak3>
<Photo>1</Photo>
<CatalogueCode>AC-30</CatalogueCode>
<CatalogueName>Accessories_Truncheons & Handcuffs</CatalogueName>
<Catalogue/>
<acc_code1>29535</acc_code1>
<acc_code2>33723</acc_code2>
<acc_code3>96318</acc_code3>
<alt_code1>23076</alt_code1>
<alt_code2>23918</alt_code2>
<alt_code3>30652</alt_code3>
<alt_code4>34757</alt_code4>
<alt_code5>374</alt_code5>
<new_code/>
<art_cat/>
<ImageAvailability>No</ImageAvailability>
<Seasonal>No</Seasonal>
<p_list2/>
<Licence_Territory/>
<ThemeName>Cops & Robbers Fancy Dress</ThemeName>
<GroupID>3</GroupID>
<GroupName>Adult Fancy Dress Costumes</GroupName>
<GroupID1>0</GroupID1>
<ThemeGroup1>Uncategorized</ThemeGroup1>
<GroupID2>0</GroupID2>
<ThemeGroup2>Uncategorized</ThemeGroup2>
<GroupID3>0</GroupID3>
<ThemeGroup3>Uncategorized</ThemeGroup3>
<EFPrice>1.5000</EFPrice>
<EFQty>1</EFQty>
<size>Not Applicable</size>
<Ext_Size>NOT APPLICABLE</Ext_Size>
<GenericCode>00429</GenericCode>
<HasImageRights>No</HasImageRights>
<Safety>Warning! Not suitable for children under 3 years due to small parts - Choking Hazard. Keep these details for reference. Warning! Do not over tighten as this may cause the safety catch to jam. INSTRUCTIONS: 1. LOCK Move stop bar to upper position, press cuff down on wrist and rotate the jaw until it engages ratchet. Jaw may be tightened as required. Do not over tighten. Move stop bar to down position, jaw is thus locked against travel in either direction. 2. UNLOCK Move stop bar to open</Safety>
<Composition/>
</Product>
My PHP:
$apiKey = '00000';
$clientID = 'MyID';
$LanguageCode = 'EN';
$wsdl = 'http://webservices.website.com/services/products.asmx?WSDL';
$params = array('apiKey' => $apiKey, 'clientID' => $clientID);
$soapclient = new SoapClient($wsdl);
$response = $soapclient->GetFullDataSet($params);
$array = json_decode(json_encode($response), true);
print_r($array);
The returned array:
Array ( [GetFullDataSetResult] => Array ( [any] => 00380Droopy Eye SpecsDroopy Eye Specs, Black, with Metal Spring, on Display CardDroopy Eye Specs, Black, with Metal SpringNot Applicable1.890094320.00UNISEXon Display Card1ADULTBLACK2019-07-04 00:00:00.64150205700038000.910.000.001.000.000.0011200.910.000.0012.000.000.001JN-01Jokes & Novelties_Assorted3292832929200732520229111615598413NoNoFunnyside Fancy Dress3Adult Fancy Dress Costumes0Uncategorized0Uncategorized0Uncategorized0.91001Not ApplicableNOT APPLICABLE00380NoWarning! Not suitable for children under 3 years due to small parts. Choking Hazard.00429Metal HandcuffsMetal Handcuffs, Silver, with Key, on Display CardMetal Handcuffs, Silver, with KeyNot Applicable3.0900492620.00UNISEXon Display Card1ADULTSILVER2019-02-10 00:00:00.42450205700042961.500.000.001.000.000.001FREIG, FREIG, 961.500.000.003.000.000.001AC-30Accessories_Truncheons & Handcuffs29535337239631823076239183065234757374NoNoCops & Robbers Fancy Dress3Adult Fancy Dress Costumes0Uncategorized0Uncategorized0Uncategorized1.50001Not ApplicableNOT APPLICABLE00429NoWarning! Not suitable for children under 3 years due to small parts - Choking Hazard. Keep these details for reference. Warning! Do not over tighten as this may cause the safety catch to jam. INSTRUCTIONS: 1. LOCK Move stop bar to upper position, press cuff down on wrist and rotate the jaw until it engages ratchet. Jaw may be tightened as required. Do not over tighten. Move stop bar to down position, jaw is thus locked against travel in either direction. 2. UNLOCK Move stop bar to open
How do I get the XML elements into the array as separate elements?
E.g. Product > ProductCode > ProductName > BrochureDescription > etc.
Let me know if there is any info I've missed out. Any help would be appreciated.
Many thanks.
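For reference, the `any` key typically holds the inner XML as one raw string; when the print_r() output is viewed in a browser, the tags are rendered as HTML and hidden, which makes it look like one long unseparated string. A minimal sketch of parsing that string with SimpleXML, assuming <ProductList> is its root element as shown above:

$soapclient = new SoapClient($wsdl);
$response = $soapclient->GetFullDataSet($params);

// Parse the inner XML string instead of json-encoding the response object.
$xml = simplexml_load_string($response->GetFullDataSetResult->any);

// Each <Product> is now its own node; the array keeps the element structure.
$array = json_decode(json_encode($xml), true);

foreach ($xml->Product as $product) {
    echo $product->ProductCode, ': ', $product->ProductName, "\n";
}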
Related
Given the Lyrics Wikia REST API XML response (http://api.wikia.com/wiki/LyricWiki_API/REST), how do I parse this? I wasn't able to find any examples of other people working with these nodes.
http://lyrics.wikia.com/api.php?func=getArtist&artist=Linkin_Park&fmt=xml
<getArtistResponse>
<artist>Linkin Park</artist>
<albums>
<album>Xero</album>
<year>1997</year>
<amazonLink>
http://www.amazon.com/exec/obidos/redirect?link_code=ur2&tag=wikia-20&camp=1789&creative=9325&path=external-search%3Fsearch-type=ss%26index=music%26keyword=Linkin%20Park%20Xero
</amazonLink>
<songs>
<item>Rhinestone</item>
<item>Reading My Eyes</item>
<item>Fuse</item>
<item>Stick N' Move</item>
</songs>
<album>Hybrid Theory</album>
<year>1999</year>
<amazonLink>
http://www.amazon.com/exec/obidos/redirect?link_code=ur2&tag=wikia-20&camp=1789&creative=9325&path=external-search%3Fsearch-type=ss%26index=music%26keyword=Linkin%20Park%20Hybrid%20Theory
</amazonLink>
<songs>
<item>Carousel</item>
<item>Technique</item>
<item>Step Up</item>
<item>And One</item>
<item>High Voltage (EP Version)</item>
<item>Part Of Me</item>
<item>High Voltage</item>
<item>Esaul (Underground EP)</item>
</songs>
<album>Hybrid Theory</album>
<year>2000</year>
<amazonLink>
http://www.amazon.com/exec/obidos/redirect?link_code=ur2&tag=wikia-20&camp=1789&creative=9325&path=external-search%3Fsearch-type=ss%26index=music%26keyword=Linkin%20Park%20Hybrid%20Theory
</amazonLink>
<songs>
<item>Papercut</item>
<item>One Step Closer</item>
<item>With You</item>
<item>Points Of Authority</item>
<item>Crawling</item>
<item>Runaway</item>
<item>By Myself</item>
<item>In The End</item>
<item>A Place For My Head</item>
<item>Forgotten</item>
<item>Cure For The Itch</item>
<item>Pushing Me Away</item>
<item>My December</item>
<item>High Voltage</item>
<item>Papercut</item>
<item>Papercut</item>
<item>Points Of Authority</item>
<item>A Place For My Head</item>
</songs>
I'd like these to look like the following, with links to each song node:
Xero 1997: Rhinestone Reading My Eyes Fuse Stick N' Move
Hybrid Theory 1999: Carousel Technique Step Up Etc..
I've been able to get the song->items with:
foreach ($xml->albums->songs as $entry) {
    $item0 = $entry->item[0];
    $item1 = $entry->item[1];
    $item2 = $entry->item[2];
    $item3 = $entry->item[3];
    $item4 = $entry->item[4];
    // etc.
}
But I have been unsuccessful in grabbing each album and year beyond the first.
Thank you for the help. :)
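For what it's worth, since <album>, <year>, <amazonLink> and <songs> are repeated siblings rather than nested per album, one approach is to index the sibling lists in parallel. A minimal sketch, assuming the feed URL above:

$xml = simplexml_load_file('http://lyrics.wikia.com/api.php?func=getArtist&artist=Linkin_Park&fmt=xml');

$albums = $xml->albums;
// The n-th <album>, <year> and <songs> siblings all describe the same release.
for ($i = 0; $i < count($albums->album); $i++) {
    echo $albums->album[$i], ' ', $albums->year[$i], ':';
    foreach ($albums->songs[$i]->item as $song) {
        echo ' ', $song;
    }
    echo "\n";
}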
I'm trying to audit a vast amount of company data from companycheck.co.uk. My current script appears to loop over only the first 10 results from the first page. At one point I had the script gathering more than 10 results, but that caused a fatal error after around 600 results (not a timeout, but a connection error of some sort). I need the script to be more reliable, as I'm fetching over 40,000 results.
My code so far:
<?php
set_time_limit(0);
ini_set('max_execution_time', 0);

require 'vendor/autoload.php';
require "Guzzle/guzzle.phar";

// Add this to allow your app to use Guzzle and the Cookie Plugin.
use Guzzle\Http\Client as GuzzleClient;
use Guzzle\Plugin\Cookie\Cookie;
use Guzzle\Plugin\Cookie\CookiePlugin;
use Guzzle\Plugin\Cookie\CookieJar\ArrayCookieJar;
use Guzzle\Plugin\Cookie\CookieJar\CookieJarInterface;

$Pagesurl = 'http://companycheck.co.uk/search/UpdateSearchCompany?searchTerm=cars&type=name';
$pagesData = json_decode(file_get_contents($Pagesurl), true);
$resultsFound = $pagesData["hits"]["found"];
$pages = ceil($resultsFound / 10);
//echo $pages;
echo "<br>";

for ($p = 0; $p < $pages; $p++) {
    $url = 'http://companycheck.co.uk/search/UpdateSearchCompany?searchTerm=cars&type=name&companyPage=' . $p;
    $data = json_decode(file_get_contents($url), true);

    // Each page holds at most 10 hits; the original bound of 11 overran the array.
    $hitCount = count($data["hits"]["hit"]);
    for ($i = 0; $i < $hitCount; $i++) {
        $id = $data["hits"]["hit"][$i]["id"];
        $TradingAddress = $data["hits"]["hit"][$i]["data"]["address"][0];
        $companyName = $data["hits"]["hit"][$i]["data"]["companyname"][0];
        $companyNumber = $data["hits"]["hit"][$i]["data"]["companynumber"][0];
        $finalURL = "http://companycheck.co.uk/company/" . $id;

        $httpClient = new GuzzleClient($finalURL);
        $httpClient->setSslVerification(FALSE);
        $cookieJar = new ArrayCookieJar();
        // Create a new cookie plugin
        $cookiePlugin = new CookiePlugin($cookieJar);
        // Add the cookie plugin to the client
        $httpClient->addSubscriber($cookiePlugin);
        $httpClient->setUserAgent("Opera/9.23 (Windows NT 5.1; U; en-US)");

        $request = $httpClient->get($finalURL);
        $response = $request->send();
        $body = $response->getBody(true);

        $matches = array();
        preg_match_all('/<table.*?>(.*?)<\/table>/si', $body, $table);
        preg_match('/<meta name=\"keywords\" content=\"(.*?)\"\/>/si', $body, $metaName);
        preg_match('/<p itemprop="streetAddress".*?>(.*?)<\/p>/si', $body, $regOffice);

        echo "<table><tbody>";
        echo "<tr><th>Company Name</th><td>";
        echo $companyName;
        echo "</td></tr>";
        echo "<tr><th>Company Number</th><td>";
        echo $companyNumber;
        echo "</td></tr>";
        echo "<tr><th>Registered Address</th><td>";
        echo str_replace("<br>", " ", $regOffice[0]);
        echo "</td></tr>
        <tr><th>Trading Address</th><td>";
        echo $TradingAddress;
        echo "</td></tr>
        <tr>
        <th>Director Name</th>
        <td>";
        $name = explode(',', $metaName[1]);
        echo $name[2];
        echo "</td>
        </tr></tbody></table>";
        echo $table[0][1];
        echo "<br><br><br>";
    }
}
To get each page, I use http://companycheck.co.uk/search/UpdateSearchCompany?searchTerm=cars&type=name&companyPage=1, which returns JSON for each page from http://companycheck.co.uk/search/results?SearchCompaniesForm[name]=cars&yt1= with some of the data, but not all of it.
With this I can get the ID of each company to navigate to each link and scrape some data from the frontend of the site.
For example the first result is:
"hits":{"found":42842,"start":0,"hit":[{"id":"08958547","data":{"address":["THE ALEXANDER SUITE SILK POINT, QUEENS AVENUE, MACCLESFIELD, SK10 2BB"],"assets":[],"assetsnegative":[],"cashatbank":[],"cashatbanknegative":[],"companyname":["CAR2CARS LIMITED"],"companynumber":["08958547"],"dissolved":["0"],"liabilities":[],"liabilitiesnegative":[],"networth":[],"networthnegative":[],"postcode":["SK10 2BB"],"siccode":[]}}
So the first link is: http://companycheck.co.uk/company/08958547
Then from this I can pull table data such as:
Registered Office
THE ALEXANDER SUITE SILK POINT
QUEENS AVENUE
MACCLESFIELD
SK10 2BB
And information from the meta tags such as:
<meta name="keywords" content="CAR2CARS LIMITED, 08958547,INCWISE COMPANY SECRETARIES LIMITED,MR ROBERT CARTER"/>
An example of one of the results returned:
Company Name CAR2CARS LIMITED
Company Number 08958547
Registered Address
THE ALEXANDER SUITE SILK POINT QUEENS AVENUE MACCLESFIELD SK10 2BB
Trading Address THE ALEXANDER SUITE SILK POINT, QUEENS AVENUE, MACCLESFIELD, SK10 2BB
Director Name INCWISE COMPANY SECRETARIES LIMITED
Telephone No telephone number available.
Email Address No email address available.
Contact Person No contact person available.
Business Activity No Business Activity on record.
Each JSON page contains 10 company IDs to put into the URL to find each company. From each of these companies I need to scrape data from the full URL, then move on to the next page, fetch the next 10, and repeat until the last page.
It is almost certainly blocking you deliberately due to an excessive number of requests. Try putting a pause in between requests - that might help you fly under their radar.
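As a minimal sketch of the pause idea, reusing the Guzzle 3 style from the question: politeGet() is a hypothetical helper, and the delay and retry counts are placeholders to tune.

// Hypothetical helper: fetch a URL with a polite delay and simple retry/backoff.
function politeGet(GuzzleClient $httpClient, $url, $delaySeconds = 2, $maxRetries = 3) {
    for ($attempt = 1; $attempt <= $maxRetries; $attempt++) {
        try {
            $body = $httpClient->get($url)->send()->getBody(true);
            sleep($delaySeconds); // pause so the requests don't hammer the server
            return $body;
        } catch (Exception $e) {
            sleep($delaySeconds * $attempt); // back off a little longer on each failure
        }
    }
    return null; // give up on this URL after $maxRetries attempts
}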
The website you are intending to scrape appears to be a private company that is reformatting and republishing data from Companies House, the official record of company information in the UK. This company offers an API which allows 10K requests per month, and this is either free or costs GBP200/month, depending on what data you need. Since you want 40K results immediately, it is no wonder they operate IP blocks.
The rights and wrongs of scraping are complicated, but there is an important point to understand: by copying someone else's data, you are attempting to avoid the costs of collating the data yourself. By taking them from someone else's server, you are also adding to their operational costs without reimbursing them, an economic phenomenon known as an externality.
There are some cases where I am sympathetic to passing on costs in this way, such as where the scrape target is engaged in potential market abuse (e.g. monopolistic practices) and scraping has an alleviating effect. I have heard that some airline companies operate anti-scraping devices because they don't want price scrapers to bring prices down. Since bringing prices down is in the interest of the consumer one could argue that the externality can be justified (on moral, if not legal grounds).
In your case, I would suggest obtaining this data directly from Companies House, where it might be available for a much lower cost. In any case, if you republish valuable data obtained from a scrape, having dodged technical attempts to block you, you may find yourself in legal trouble anyway. If in doubt (and if there is no moral or public interest defence such as I outlined earlier) get in touch with the site operator and ask if what you want to do is OK.
Stackoverflow: I need your help!
I've been tasked with turning some (fairly) complex work diagrams for railway staff extracted from a Word document into something more usable for further processing, such as into a PHP array.
Here is a sample of one of the work diagrams:
LTP BH 4000
( Link 5)
DVR Su
On 00.22 PASS Barnham 00+34 5H97
Off 08.03 Lham 00+42
Hrs 7:41 PPTC Lham (06+24) 5N08
Traction for the above Service is
Days Su class 377
From 18/05/2014 377 PC Lham 01+46 5S62 DOO
To 24/08/2014 (Via CET)
TC Lham O Sh 01+50
PNB
377 PC Lham O Sh 03+10 5W62 DOO
(Via CWM)
DTCS Lham 03+32
377 PP Lham Shed 04+10 5W00 DOO
(Via CWM)
DTCS Lham Shed 04+24
PPTC Lham Shed (07+39) 5E24
Traction for the above Service is
class 377
PPTC Lham (06+37) 5H92
Traction for the above service is
class 377
377 PP Lham Shed 05+45 5W01 DOO
(Via CET)
377 Lham O Sh 05+57 06+28 5W01 DOO
(Via CWM)
TC Lham Shed 06+42
PPTC Lham Shed (09+58) 5H67
Traction for the above Service is
class 377
PPTC Lham Shed (07+41) 5P29 RP MO
Traction for the above Service is
class 377
(Unit forms part of 22+17
attachment)
PASS Lham 07.54 2P31
(To Bognor Regis)
Barnham 08.02
Routes 919
I've managed to process some of the data using simple regular expressions, but where I'm struggling is the "middle" data, which actually shows the work to be done. I'm struggling because there is no real structure defining what each line should look like; you'll notice that many lines differ, and some even include free-text notes.
What I am looking to accomplish is to turn each row into an array that looks like the following:
$row = array("stock", "activity", "location", "departure_time", "arrival_time", "train_id", "notes");
The difficulty comes because not every line fits this format: some lines have every "column", others have one or more columns missing, and others consist of free text.
I am by no means a text processing expert, but I cannot seem to find a solution to this problem. I'm not after a complete solution, just some pointers would be gratefully received!
Update: Just for clarification, I'm not interested in the free-text rows. The data they contain is not important for what I am trying to accomplish.
I'll refine this answer more as soon as more data comes in, but in the meantime I'd go with what amounts to a state machine.
You read the text one line after the other. Initially you are in the "WAITING FOR DIAGRAM" state:
$status = array(
    'file'    => $fp,
    'manager' => 'waitForDiagram',
);
$chunk  = 0;
$lineno = 0;
$manage = $status['manager'];
while (!feof($fp)) {
    $line = fgets($fp, 1024); // is 1 Kb enough? Maybe not.
    $lineno++;
    $manage($status, $line);
    if ($status['manager'] != $manage) {
        $chunk = 0;
        if (!function_exists($status['manager'])) {
            trigger_error("{$manage}({$line}) -> {$status['manager']}: no such state");
        }
        $manage = $status['manager'];
    }
    // ALERT is a line-count threshold you define, e.g. define('ALERT', 100);
    if (++$chunk > ALERT) {
        trigger_error("Stuck in state {$manage} for {$chunk} lines!", E_USER_ERROR);
    }
}
Then you define a function for each state, beginning with the first:
function waitForDiagram(&$status, $line) {
    // Part common to most such state functions:
    $tokens = tokenise($line);
    // Quickly check whether anything needs doing.
    if (!in_array($tokens[0], [ "LTP" ])) {
        // if not, return.
        return;
    }
    $status['diagram'] = array(
        'diagram' => array(
            'title'    => $tokens[0],
            'whatever' => $tokens[1],
            'comments' => '',
        )
    );
    ...
    // In this case, all the information is in a single line, so we can
    // continue to the next state, which here is always waitForOnAndGetComments.
    $status['manager'] = 'waitForOnAndGetComments';
}

function waitForOnAndGetComments(&$status, $line) {
    $tokens = tokenise($line);
    // If we get "On" it's the line we want; otherwise it is still the comment.
    if (!in_array($tokens[0], [ "On" ])) {
        $status['diagram']['comments'] .= $line;
        return;
    }
    // Otherwise we have "On 00.22 PASS Barnham 00+34"
    // and always a next line.
    $offTok = tokenise(fgets($status['file'], 1024));
    if ($offTok[0] != "Off") {
        trigger_error("Found ON, but next row is not OFF, what gives?", E_USER_ERROR);
    }
    $status['diagram']['on'] = array(
        'time' => $tokens[1],
        ...
    );
    ...
    $status['diagram']['off'] = array(
        'time' => $offTok[1],
        'line' => $offTok[2],
        ...
    );
    $status['manager'] = 'waitForSomethingElse';
}
...and so on...
One important thing is how you tokenise the lines. If you have a clear delimiter (such as a tab) and can use explode, all well and good. Otherwise you can try preg_split('#\\s{2,}#'), using sequences of two or more whitespace characters to separate the "cells" in each "row".
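A minimal tokenise() along those lines; the trim() and the PREG_SPLIT_NO_EMPTY flag are my assumptions, added to keep stray blanks out of the token list:

function tokenise($line) {
    // Split on runs of two or more whitespace characters, dropping empty cells.
    return preg_split('#\s{2,}#', trim($line), -1, PREG_SPLIT_NO_EMPTY);
}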
I found what was causing me grief with this. I'm loading the Word document using a tool called "antiword". Antiword seems to strip special characters such as tabs; however, I found that by passing the "-w 0" switch these characters are preserved, and parsing the diagrams using simple regular expressions became trivial. Many thanks to @Iserni for taking the time to help me, nonetheless.
I need some sort of database or feed to access live scores (and possibly player stats) for the NFL. I want to be able to display the scores on my site for my pick'em league and show the users whether their pick is winning or not.
I'm not sure how to go about this. Can someone point me in the right direction?
Also, it needs to be free.
Disclaimer: I'm the author of the tools I'm about to promote.
Over the past year, I've written a couple Python libraries that will do what you want. The first is nflgame, which gathers game data (including play-by-play) from NFL.com's GameCenter JSON feed. This includes active games where data is updated roughly every 15 seconds. nflgame has a wiki with some tips on getting started.
I released nflgame last year, and used it throughout last season. I think it is reasonably stable.
Over this past summer, I've worked on its more mature brother, nfldb. nfldb provides access to the same kind of data nflgame does, except it keeps everything stored in a relational database. nfldb also has a wiki, although it isn't entirely complete yet.
For example, this will output all current games and their scores:
import nfldb
db = nfldb.connect()
phase, year, week = nfldb.current(db)
q = nfldb.Query(db).game(season_year=year, season_type=phase, week=week)
for g in q.as_games():
    print '%s (%d) at %s (%d)' % (g.home_team, g.home_score,
                                  g.away_team, g.away_score)
Since no games are being played, that outputs all games for next week with 0 scores. This is the output with week=1 of the 2013 season:
CLE (10) at MIA (23)
DET (34) at MIN (24)
NYJ (18) at TB (17)
BUF (21) at NE (23)
SD (28) at HOU (31)
STL (27) at ARI (24)
SF (34) at GB (28)
DAL (36) at NYG (31)
WAS (27) at PHI (33)
DEN (49) at BAL (27)
CHI (24) at CIN (21)
IND (21) at OAK (17)
JAC (2) at KC (28)
PIT (9) at TEN (16)
NO (23) at ATL (17)
CAR (7) at SEA (12)
Both are licensed under the WTFPL and are free to use for any purpose.
N.B. I realized you tagged this as PHP, but perhaps this will point you in the right direction. In particular, you could use nfldb to maintain a PostgreSQL database and query it with your PHP program.
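As a rough illustration of that last suggestion, here is a hypothetical sketch of querying an nfldb-backed PostgreSQL database from PHP with PDO. The game table and its columns follow nfldb's schema as I understand it, so verify them against your own installation:

$pdo = new PDO('pgsql:host=localhost;dbname=nfldb', 'nfldb', 'password');
$stmt = $pdo->prepare(
    'SELECT away_team, away_score, home_team, home_score
       FROM game
      WHERE season_year = :year AND season_type = :phase AND week = :week'
);
$stmt->execute(array(':year' => 2013, ':phase' => 'Regular', ':week' => 1));
foreach ($stmt as $g) {
    // Same layout as the Python example above, e.g. "CLE (10) at MIA (23)".
    echo "{$g['away_team']} ({$g['away_score']}) at {$g['home_team']} ({$g['home_score']})\n";
}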
So I found something that gives me MOST of what I was looking for. It has live game stats, but doesn't include current down, yards to go, and field position.
Regular Season:
http://www.nfl.com/liveupdate/scorestrip/ss.xml
Post Season:
http://www.nfl.com/liveupdate/scorestrip/postseason/ss.xml
I'd still like to find a live player stat feed to use to add Fantasy Football to my website, but I don't think a free one exists.
I know this is old, but this is what I use for scores only... maybe it will help someone some day. Note: there are some elements that you will not use, as they are specific to my site... but this would be a very good start for someone.
<?php
require('includes/application_top.php');

$week = (int)$_GET['week'];

//load source code, depending on the current week, of the website into a variable as a string
$url = "http://www.nfl.com/liveupdate/scorestrip/ss.xml"; //LIVE GAMES
if ($xmlData = file_get_contents($url)) {
    $xml = simplexml_load_string($xmlData);
    $json = json_encode($xml);
    $games = json_decode($json, true);
}

$teamCodes = array(
    'JAC' => 'JAX',
);

//build scores array, to group teams and scores together in games
$scores = array();
foreach ($games['gms']['g'] as $gameArray) {
    $game = $gameArray['#attributes'];
    //ONLY PULL SCORES FROM COMPLETED GAMES - F=FINAL, FO=FINAL OVERTIME
    if ($game['q'] == 'F' || $game['q'] == 'FO') {
        $overtime = (($game['q'] == 'FO') ? 1 : 0);
        $away_team = $game['v'];
        $home_team = $game['h'];
        foreach ($teamCodes as $espnCode => $nflpCode) {
            if ($away_team == $espnCode) $away_team = $nflpCode;
            if ($home_team == $espnCode) $home_team = $nflpCode;
        }
        $away_score = (int)$game['vs'];
        $home_score = (int)$game['hs'];
        $winner = ($away_score > $home_score) ? $away_team : $home_team;
        $gameID = getGameIDByTeamID($week, $home_team);
        if (is_numeric(strip_tags($home_score)) && is_numeric(strip_tags($away_score))) {
            $scores[] = array(
                'gameID'       => $gameID,
                'awayteam'     => $away_team,
                'visitorScore' => $away_score,
                'hometeam'     => $home_team,
                'homeScore'    => $home_score,
                'overtime'     => $overtime,
                'winner'       => $winner
            );
        }
    }
}

//see how the scores array looks
//echo '<pre>' . print_r($scores, true) . '</pre>';
echo json_encode($scores);

//game results and winning teams can now be accessed from the scores array
//e.g. $scores[0]['awayteam'] contains the name of the away team (['awayteam'] part) from the first game on the page ([0] part)
I've spent the last year or so working on a simple CLI tool to easily create your own NFL databases. It currently supports PostgreSQL and Mongo natively, and you can programmatically interact with the Engine if you'd like to extend it.
Want to create a different database (e.g. MySQL) using the Engine, or even use Postgres/Mongo with your own schema? Simply implement an interface and the Engine will do the work for you.
Running everything, including the database setup and updating with all the latest stats, can be done in a single command:
ffdb setup
I know this question is old, but I also realize that there's still a need out there for a functional and easy-to-use tool to do this. The entire reason I built this is to power my own football app in the near future, and hopefully this can help others.
Also, because the question is fairly old, a lot of the answers are not working at the current time, or reference projects that are no longer maintained.
Check out the github repo page for full details on how to download the program, the CLI commands, and other information:
FFDB Github Repository
$XML = "http://www.nfl.com/liveupdate/scorestrip/ss.xml";
$lineXML = file_get_contents($XML);
$subject = $lineXML;
//match and capture week then print
$week='/w="([0-9])/';
preg_match_all($week, $subject, $week);
echo "week ".$week[1][0]."<br/>";
$week2=$week[1][0];
echo $week2;
//capture team, scores in two dimensional array
$pattern = '/hnn="(.+)"\shs="([0-9]+)"\sv="[A-Z]+"\svnn="(.+)"\svs="([0-9]+)/';
preg_match_all($pattern, $subject, $matches);
//enumerate length of array (number games played)
$count= count($matches[0]);
//print array values
for ($x = 0; $x < $count ; $x++) {
echo"<br/>";
//print home team
echo $matches[1][$x]," ",
//print home score
$matches[2][$x]," ",
//print visitor team
$matches[3][$x]," ",
//print visitor score
$matches[4][$x];
echo "<br/>";
}
I was having problems finding a new source for the 2021 season. Well, I finally found one on ESPN.
http://site.api.espn.com/apis/site/v2/sports/football/nfl/scoreboard
Returns the results in JSON format.
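A minimal sketch of reading that feed from PHP; the events/name/status field names are assumptions based on the feed's publicly visible JSON shape, so check them against a live response:

$json = file_get_contents('http://site.api.espn.com/apis/site/v2/sports/football/nfl/scoreboard');
$data = json_decode($json, true);
foreach ($data['events'] as $event) {
    // e.g. "Kansas City Chiefs at Baltimore Ravens - Final"
    echo $event['name'], ' - ', $event['status']['type']['shortDetail'], "\n";
}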
I recommend registering at http://developer.espn.com to get access to their JSON API. It took me just 5 minutes, and they have documentation to make pretty much any call you need.
I think an example will be much better than a loooong description :)
Let's assume we have an array of arrays:
("Server1", "Server_1", "Main Server", "192.168.0.3")
("Server_1", "VIP Server", "Main Server")
("Server_2", "192.168.0.4")
("192.168.0.3", "192.168.0.5")
("Server_2", "Backup")
Each line contains strings which are synonyms. As a result of processing this array, I want to get this:
("Server1", "Server_1", "Main Server", "192.168.0.3", "VIP Server", "192.168.0.5")
("Server_2", "192.168.0.4", "Backup")
So I think I need a kind of recursive algorithm. The programming language actually doesn't matter; I only need a little help with the general idea. I'm going to use PHP or Python.
Thank you!
This problem can be reduced to a problem in graph theory where you find all groups of connected nodes in a graph.
An efficient way to solve it is a "flood fill" algorithm, which is essentially a breadth-first search. This Wikipedia entry describes the flood fill algorithm and how it applies to finding the connected regions of a graph.
To see how the original question can be made into a question on graphs: make each entry (e.g. "Server1", "Server_1", etc.) a node on a graph. Connect nodes with edges if and only if they are synonyms. A matrix data structure is particularly appropriate for keeping track of the edges, provided you have enough memory. Otherwise a sparse data structure like a map will work, especially since the number of synonyms will likely be limited.
Server1 is Node #0
Server_1 is Node #1
Server_2 is Node #2
Then edge[0][1] = edge[1][0] = 1, indicating that there is an edge between nodes #0 and #1 (which means they are synonyms), while edge[0][2] = edge[2][0] = 0, indicating that Server1 and Server_2 are not synonyms.
Complexity Analysis
Creating this data structure is pretty efficient because a single linear pass, with a lookup of the mapping of strings to node numbers, is enough to create it. If you store the mapping of strings to node numbers in a dictionary, then this is an O(n log n) step.
Doing the flood fill is O(n): you only visit each node in the graph once. So the algorithm in all is O(n log n).
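A minimal sketch of that approach in PHP, using the sparse adjacency-map option described above and an explicit queue for the flood fill (the variable names are mine):

$rows = array(
    array("Server1", "Server_1", "Main Server", "192.168.0.3"),
    array("Server_1", "VIP Server", "Main Server"),
    array("Server_2", "192.168.0.4"),
    array("192.168.0.3", "192.168.0.5"),
    array("Server_2", "Backup"),
);

// Build the adjacency map: every pair of words in a row is connected.
$adj = array();
foreach ($rows as $row) {
    foreach ($row as $word) {
        foreach ($row as $other) {
            if ($word !== $other) {
                $adj[$word][$other] = true;
            }
        }
    }
}

// Flood fill: a BFS from each unvisited word collects one connected group.
$visited = array();
$groups = array();
foreach (array_keys($adj) as $start) {
    if (isset($visited[$start])) {
        continue;
    }
    $group = array();
    $queue = array($start);
    $visited[$start] = true;
    while ($queue) {
        $word = array_shift($queue);
        $group[] = $word;
        foreach (array_keys($adj[$word]) as $next) {
            if (!isset($visited[$next])) {
                $visited[$next] = true;
                $queue[] = $next;
            }
        }
    }
    $groups[] = $group;
}
print_r($groups);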
Introduce an integer marking that indicates synonym groups. At the start, mark all words with different marks from 1 to N.
Then search through your collection, and whenever you find that two words with marks i and j are synonyms, re-mark all words carrying mark i or j with the lesser of the two numbers. After N iterations you get all groups of synonyms.
It is a somewhat dirty and not thoroughly efficient solution; I believe one can get more performance with union-find structures.
Edit: This is probably NOT the most efficient way of solving your problem. If you are interested in maximum performance (e.g., if you have millions of values), you might be interested in writing a more complex algorithm.
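A minimal union-find sketch in PHP along those lines (path compression only, no union by rank); it assumes $rows is the array of rows from the question, as set up in the earlier sketch:

function find(array &$parent, $x) {
    if (!isset($parent[$x])) {
        $parent[$x] = $x;
    }
    if ($parent[$x] !== $x) {
        $parent[$x] = find($parent, $parent[$x]); // path compression
    }
    return $parent[$x];
}

$parent = array();
foreach ($rows as $row) {
    // Union every word in the row with the row's first word.
    foreach ($row as $word) {
        $parent[find($parent, $word)] = find($parent, $row[0]);
    }
}

// Group words by their root to obtain the synonym sets.
$groups = array();
foreach (array_keys($parent) as $word) {
    $groups[find($parent, $word)][] = $word;
}
print_r(array_values($groups));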
PHP, seems to be working (at least with the data from the given example):

$data = array(
    array("Server1", "Server_1", "Main Server", "192.168.0.3"),
    array("Server_1", "VIP Server", "Main Server"),
    array("Server_2", "192.168.0.4"),
    array("192.168.0.3", "192.168.0.5"),
    array("Server_2", "Backup"),
);

do {
    $foundSynonyms = false;
    foreach ($data as $firstKey => $firstValue) {
        foreach ($data as $secondKey => $secondValue) {
            if ($firstKey === $secondKey) {
                continue;
            }
            if (array_intersect($firstValue, $secondValue)) {
                $data[$firstKey] = array_unique(array_merge($firstValue, $secondValue));
                unset($data[$secondKey]);
                $foundSynonyms = true;
                break 2; // outer foreach
            }
        }
    }
} while ($foundSynonyms);

print_r($data);
Output:
Array
(
[0] => Array
(
[0] => Server1
[1] => Server_1
[2] => Main Server
[3] => 192.168.0.3
[4] => VIP Server
[6] => 192.168.0.5
)
[2] => Array
(
[0] => Server_2
[1] => 192.168.0.4
[3] => Backup
)
)
This would yield lower complexity than the PHP example (Python 3):
a = [set(("Server1", "Server_1", "Main Server", "192.168.0.3")),
set(("Server_1", "VIP Server", "Main Server")),
set(("Server_2", "192.168.0.4")),
set(("192.168.0.3", "192.168.0.5")),
set(("Server_2", "Backup"))]
b = {}
c = set()
for s in a:
full_s = s.copy()
for d in s:
if b.get(d):
full_s.update(b[d])
for d in full_s:
b[d] = full_s
c.add(frozenset(full_s))
for k,v in b.items():
fsv = frozenset(v)
if fsv in c:
print(list(fsv))
c.remove(fsv)
I was looking for a solution in Python, so I came up with this. If you are willing to use Python data structures like sets, you can use this solution too. "It's so simple a cave man can use it."
Put simply, this is the logic behind it:
foreach set_of_values in value_collection:
    alreadyInSynonymSet = false
    foreach synonym_set in synonym_collection:
        if set_of_values intersects synonym_set:
            alreadyInSynonymSet = true
            synonym_set = synonym_set.union(set_of_values)
    if not alreadyInSynonymSet:
        synonym_collection.append(set(set_of_values))
vals = (
    ("Server1", "Server_1", "Main Server", "192.168.0.3"),
    ("Server_1", "VIP Server", "Main Server"),
    ("Server_2", "192.168.0.4"),
    ("192.168.0.3", "192.168.0.5"),
    ("Server_2", "Backup"),
)

value_sets = (set(value_tup) for value_tup in vals)
synonym_collection = []

for value_set in value_sets:
    isConnected = False  # If connected to a term in the graph
    print(f'\nCurrent Value Set: {value_set}')
    for synonyms in synonym_collection:
        # If two sets are disjoint, they don't have common elements
        if not set(synonyms).isdisjoint(value_set):
            isConnected = True
            synonyms |= value_set  # Appending elements of new value_set to synonymous set
            break
    # If it's not related to any other term, create a new set
    if not isConnected:
        print('Value set not in graph, adding to graph...')
        synonym_collection.append(value_set)

print('\nDone, Completed Graphing Synonyms')
print(synonym_collection)
This will have a result of
Current Value Set: {'Server1', 'Main Server', '192.168.0.3', 'Server_1'}
Value set not in graph, adding to graph...
Current Value Set: {'VIP Server', 'Main Server', 'Server_1'}
Current Value Set: {'192.168.0.4', 'Server_2'}
Value set not in graph, adding to graph...
Current Value Set: {'192.168.0.3', '192.168.0.5'}
Current Value Set: {'Server_2', 'Backup'}
Done, Completed Graphing Synonyms
[{'VIP Server', 'Main Server', '192.168.0.3', '192.168.0.5', 'Server1', 'Server_1'}, {'192.168.0.4', 'Server_2', 'Backup'}]