I need some sort of database or feed to access live scores(and possibly player stats) for the NFL. I want to be able to display the scores on my site for my pickem league and show the users if their pick is winning or not.
I'm not sure how to go about this. Can someone point me in the right direction?
Also, it needs to be free.
Disclaimer: I'm the author of the tools I'm about to promote.
Over the past year, I've written a couple Python libraries that will do what you want. The first is nflgame, which gathers game data (including play-by-play) from NFL.com's GameCenter JSON feed. This includes active games where data is updated roughly every 15 seconds. nflgame has a wiki with some tips on getting started.
I released nflgame last year, and used it throughout last season. I think it is reasonably stable.
Over this past summer, I've worked on its more mature brother, nfldb. nfldb provides access to the same kind of data nflgame does, except it keeps everything stored in a relational database. nfldb also has a wiki, although it isn't entirely complete yet.
For example, this will output all current games and their scores:
import nfldb
db = nfldb.connect()
phase, year, week = nfldb.current(db)
q = nfldb.Query(db).game(season_year=year, season_type=phase, week=week)
for g in q.as_games():
print '%s (%d) at %s (%d)' % (g.home_team, g.home_score,
g.away_team, g.away_score)
Since no games are being played, that outputs all games for next week with 0 scores. This is the output with week=1: (of the 2013 season)
CLE (10) at MIA (23)
DET (34) at MIN (24)
NYJ (18) at TB (17)
BUF (21) at NE (23)
SD (28) at HOU (31)
STL (27) at ARI (24)
SF (34) at GB (28)
DAL (36) at NYG (31)
WAS (27) at PHI (33)
DEN (49) at BAL (27)
CHI (24) at CIN (21)
IND (21) at OAK (17)
JAC (2) at KC (28)
PIT (9) at TEN (16)
NO (23) at ATL (17)
CAR (7) at SEA (12)
Both are licensed under the WTFPL and are free to use for any purpose.
N.B. I realized you tagged this as PHP, but perhaps this will point you in the right direction. In particular, you could use nfldb to maintain a PostgreSQL database and query it with your PHP program.
So I found something that gives me MOST of what I was looking for. It has live game stats, but doesn't include current down, yards to go, and field position.
Regular Season:
http://www.nfl.com/liveupdate/scorestrip/ss.xml
Post Season:
http://www.nfl.com/liveupdate/scorestrip/postseason/ss.xml
I'd still like to find a live player stat feed to use to add Fantasy Football to my website, but I don't think a free one exists.
I know this is old, but this is what I use for scores only... maybe it will help someone some day. Note: there are some elements that you will not use and are specific for my site... but this would be a very good start for someone.
<?php
require('includes/application_top.php');
$week = (int)$_GET['week'];
//load source code, depending on the current week, of the website into a variable as a string
$url = "http://www.nfl.com/liveupdate/scorestrip/ss.xml"; //LIVE GAMES
if ($xmlData = file_get_contents($url)) {
$xml = simplexml_load_string($xmlData);
$json = json_encode($xml);
$games = json_decode($json, true);
}
$teamCodes = array(
'JAC' => 'JAX',
);
//build scores array, to group teams and scores together in games
$scores = array();
foreach ($games['gms']['g'] as $gameArray) {
$game = $gameArray['#attributes'];
//ONLY PULL SCORES FROM COMPLETED GAMES - F=FINAL, FO=FINAL OVERTIME
if ($game['q'] == 'F' || $game['q'] == 'FO') {
$overtime = (($game['q'] == 'FO') ? 1 : 0);
$away_team = $game['v'];
$home_team = $game['h'];
foreach ($teamCodes as $espnCode => $nflpCode) {
if ($away_team == $espnCode) $away_team = $nflpCode;
if ($home_team == $espnCode) $home_team = $nflpCode;
}
$away_score = (int)$game['vs'];
$home_score = (int)$game['hs'];
$winner = ($away_score > $home_score) ? $away_team : $home_team;
$gameID = getGameIDByTeamID($week, $home_team);
if (is_numeric(strip_tags($home_score)) && is_numeric(strip_tags($away_score))) {
$scores[] = array(
'gameID' => $gameID,
'awayteam' => $away_team,
'visitorScore' => $away_score,
'hometeam' => $home_team,
'homeScore' => $home_score,
'overtime' => $overtime,
'winner' => $winner
);
}
}
}
//see how the scores array looks
//echo '<pre>' . print_r($scores, true) . '</pre>';
echo json_encode($scores);
//game results and winning teams can now be accessed from the scores array
//e.g. $scores[0]['awayteam'] contains the name of the away team (['awayteam'] part) from the first game on the page ([0] part)
I've spent the last year or so working on a simple CLI tool to easily create your own NFL databases. It currently supports PostgreSql and Mongo natively, and you can programmatically interact with the Engine if you'd like to extend it.
Want to create your own different database (eg MySql) using the Engine (or even use Postgres/Mongo but with your own schema)? Simply implement an interface and the Engine will do the work for you.
Running everything, including the database setup and updating with all the latest stats, can be done in a single command:
ffdb setup
I know this question is old, but I also realize that there's still a need out there for a functional and easy-to-use tool to do this. The entire reason I built this is to power my own football app in the near future, and hopefully this can help others.
Also, because the question is fairly old, a lot of the answers are not working at the current time, or reference projects that are no longer maintained.
Check out the github repo page for full details on how to download the program, the CLI commands, and other information:
FFDB Github Repository
$XML = "http://www.nfl.com/liveupdate/scorestrip/ss.xml";
$lineXML = file_get_contents($XML);
$subject = $lineXML;
//match and capture week then print
$week='/w="([0-9])/';
preg_match_all($week, $subject, $week);
echo "week ".$week[1][0]."<br/>";
$week2=$week[1][0];
echo $week2;
//capture team, scores in two dimensional array
$pattern = '/hnn="(.+)"\shs="([0-9]+)"\sv="[A-Z]+"\svnn="(.+)"\svs="([0-9]+)/';
preg_match_all($pattern, $subject, $matches);
//enumerate length of array (number games played)
$count= count($matches[0]);
//print array values
for ($x = 0; $x < $count ; $x++) {
echo"<br/>";
//print home team
echo $matches[1][$x]," ",
//print home score
$matches[2][$x]," ",
//print visitor team
$matches[3][$x]," ",
//print visitor score
$matches[4][$x];
echo "<br/>";
}
I was going through problems finding a new source for the 2021 season. Well I finally found one on ESPN.
http://site.api.espn.com/apis/site/v2/sports/football/nfl/scoreboard
Returns the results in JSON format.
I recommend registering at http://developer.espn.com and get access to their JSON API. It just took me 5 minutes and they have documentation to make pretty much any call you need.
Related
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
Improve this question
Below is a php script for this algorithmic problem:
The SHIELD is a secretive organization entrusted with the task of
guarding the world against any disaster. Their arch nemesis is the
organization called HYDRA. Unfortunately some members from HYDRA had
infiltrated into the SHIELD camp. SHIELD needed to find out all these
infiltrators to ensure that it was not compromised.
Nick Fury, the executive director and the prime SHIELD member figured out that every one in SHIELD could send a SOS signal to every
other SHIELD member he knew well. The HYDRA members could send bogus
SOS messages to others to confuse others, but they could never receive
a SOS message from a SHIELD member. Every SHIELD member would receive
a SOS message ateast one other SHIELD member, who in turn would have
received from another SHIELD member and so on till NickFury. SHIELD
had a sophisticated mechanism to capture who sent a SOS signal to
whom. Given this information, Nick needed someone to write a program
that could look into this data and figure out all HYDRA members.
Sample Input
Nick Fury : Tony Stark, Maria Hill, Norman Osborn
Hulk : Tony Stark, HawkEye, Rogers
Rogers : Thor,
Tony Stark: Pepper Potts, Nick Fury
Agent 13 : Agent-X, Nick Fury, Hitler
Thor: HawkEye, BlackWidow
BlackWidow:Hawkeye
Maria Hill : Hulk, Rogers, Nick Fury
Agent-X : Agent 13, Rogers
Norman Osborn: Tony Stark, Thor
Sample Output
Agent 13, Agent-X, Hitler
SCRIPT.php
<?php
//this is the input
$input="Nick Fury : Tony Stark, Maria Hill, Norman Osborn
Hulk : Tony Stark, HawkEye, Rogers
Rogers : Thor
Tony Stark: Pepper Potts, Nick Fury
Agent 13 : Agent-X, Nick Fury, Hitler
Thor: HawkEye, BlackWidow
BlackWidow:HawkEye
Maria Hill : Hulk, Rogers, Nick Fury
Agent-X : Agent 13, Rogers
Norman Osborn: Tony Stark, Thor";
/*this is an array for storing shield member names as
key-value pairs like Nick Fury=>Tony Stark, Maria Hill, Norman Osborn*/
$shield=array();
$keys=array(); //array for storing shield member names as keys
$values=array(); //array for storing shield member names as value
$shield_members=array(); //array which stores each valid shield member's name using below process
//$hydra=array();
$rows = explode("\n", $input); //creating array of given input row wise
$i=0;
//considering each row at once
while ($i<count($rows)) {
# code...
$temp=explode(":", $rows[$i]); //exploding row
$key=trim($temp[0]); //getting key
$value=trim($temp[1]); //getting value
$keys[]=$key; //storing each key in keys[]
$values[]=$value; //storing each value in values[]
$shield[$key]=$value; //storing these key-value pair as array in shield[]
$i++;
}
//loop to find all persons who are receiving messages from anyone
$i=0;
$recieiver_hydra=array(); //arrray to store all persons receiving messages
while($i<count($values)){
$temp = explode(",", $values[$i]); //exploding RHS values row wise
$j=0;
while ($j<count($temp)) {
# code...
if(!in_array($temp[$j], $recieiver_hydra)) //trying insert only unique values in recieiver_hydra[]
$recieiver_hydra[]=$temp[$j]; //========BUT NOT GETTING=======
$j++;
}
$i++;
}
$shield_members[]= "Nick Fury"; //starting from Nick Fury
$temp = explode(",",$shield["Nick Fury"]); //getting value from shield[] array for given key
$i=0;
while($i<count($temp)){
$shield_members[]= $temp[$i]; //adding shield members which are valid not the hydra people
$i++;
}
/* searching process for each valid shield member starting from nick fury upto last one found
and storing them in the sheil_members[] array */
for($i=1; $i<30; $i++){ //=============HERE DON'T GETTING WHAT SHOULD BE THE CONDITION TO STOP EXECUTING
if (array_key_exists(trim($shield_members[$i]),$shield))
$temp = explode(",",$shield[trim($shield_members[$i])]);
else continue;
$j=0;
while ($j<count($temp)) {
# code...
$shield_members[] = trim($temp[$j]);
$j++;
}
}
$i=0;
//printing all valid members added to shield_members[] including duplicate values
while ( $i<count($shield_members)) {
# code...
echo $shield_members[$i]."<br>";
$i++;
}
asort($shield_members); //sorting them
$res1 = array_diff($keys, $shield_members); //finding difference and getting HYDRA members from LHS
$res2= array_diff($recieiver_hydra, $shield_members); //finding difference and getting HYDRA members from RHS ======BUT NOT GETTING
/*
<form action=<?php echo $_SERVER["PHP_SELF"]?> method="POST">
**<i> Enter input in the textarea... </i>
<textarea name="input" rows="10" cols="90"></textarea><br>
<input type="submit" name="submit" value="Find HYDRA Members">
</form>
*/
?>
As per my coding and expectations, $res1 must be having all HYDRA members from LHS of inputs and $res2 must be having all HYDRA members from RHS of input.
But when printing them I am getting wrong results.
$res1 = Array ( [4] => Agent 13 [8] => Agent-X ) //THIS IS CORRECT
$res2 = Array ( [3] => HawkEye [4] => Rogers [7] => Nick Fury
[8] => Agent-X [9] => Hitler [11] => BlackWidow
[13] => Agent 13 [14] => Thor ) //WRONG RESULTS
Please help me knowing what am I doing wrong?
(I am not expert at algorithms. Whole this code I made using raw coding ideas and knowledge.)
This sounds like a exercise from some programming course. Normally i would say, that you should solve such a task by yourself (I think this will be the reason for the down votes of your question), but it is a funny problem, so here is my solution.
If i need to implement a new algorithm, i always try to split the problem into subproblems.
Doing so,
it is easier to find a good solution
as result you get a much more readable code
So problem number one is to transform the input into an array structure to work with:
function transformInput($input) {
foreach (explode("\n", $input) as $row) {
list($receiver, $senders) = explode(':', $row);
$memberTree[trim($receiver)] = array_map('trim', explode(',', $senders));
}
return $memberTree;
}
The second problem is to find all real SHIELD members:
function extractShieldMembers($memberTree, $startMember, $shieldMembers = array()) {
$shieldMembers[] = $startMember;
if (array_key_exists($startMember, $memberTree)) {
foreach ($memberTree[$startMember] as $member) {
if (!in_array($member, $shieldMembers)) {
$shieldMembers = array_merge(
$shieldMembers,
extractShieldMembers($memberTree, $member, $shieldMembers)
);
}
}
}
return $shieldMembers;
}
With this two functions in your library, it is very easy to get the list with all SHIELD members:
$memberTree = transformInput($input);
$shieldMembers = array_unique(extractShieldMembers($memberTree, 'Nick Fury'));
Now, after knowing who really belongs to SHIELD you can find all HYDRA members with this functon:
function getHydraMembers($memberTree, $shieldMembers){
$allMembers = array();
foreach ($memberTree as $key => $value) {
$allMembers[] = $key;
$allMembers = array_merge($allMembers, $value);
}
return array_unique(array_diff($allMembers, $shieldMembers));
}
$hydraMembers = getHydraMembers($memberTree, $shieldMembers);
Is this the best solution for this problem?
I don't think so, because it was just a quick approach.
But as you can see, if you try to split your code into small logic functions, you get more readable code, that you can better debug.
You also don't need as many temporary variables and counter.
I have integrated Yelp reviews into my directory site with each venue that has a Yelp ID returning the number of reviews and overall score.
Following a successful MySQL query for all venue details, I output the results of the database formatted for the user. The Yelp element is:
while ($searchresults = mysql_fetch_array($sql_result)) {
if ($yelpID = $searchresults['yelpID']) {
require('yelp.php');
if ( $numreviews > 0 ) {
$yelp = '<img src="'.$ratingimg.'" border="0" /> Read '.$numreviews.' reviews on <img src="graphics/yelp_logo_50x25.png" border="0" /><br />';
} else {
$yelp = '';
}
} //END if ($yelpID = $searchresults['yelpID']) {
} //END while ($searchresults = mysql_fetch_array($sql_result)) {
The yelp.php file returns:
$yrating = $result->rating;
$numreviews = $result->review_count;
$ratingimg = $result->rating_img_url;
$url = $result->url;
If a venue has a Yelp ID and one or more reviews then the output displays correctly, but if the venue has no Yelp ID or zero reviews then it displays the Yelp review number of the previous venue.
I've checked the $numreviews variable type and it's an integer.
So far I've tried multiple variations of the "if ( $numreviews > 0 )" statement such as testing it against >=1, !$numreviews etc., also converting the integer to a string and comparing it against other strings.
There are no errors and printing all of the variables returned gives the correct number of reviews for each property with venues having no ID or no reviews returning nothing (as opposed to zero). I've also compared it directly against $result->review_count with the same problem.
Is there a better way to make the comparison or better format of variable to work with to get the correct result?
EDIT:
The statement if ($yelpID = $searchresults['yelpID']) { is not operating as it should. It is identical to other statements in the file, validating row contents which work correctly for their given variable, e.g. $fbID = $searchresults['fbID'] etc.
When I changed require('yelp.php'); to require_once('yelp.php'); all of the venue outputs changed to showing only the first iterated result. Looking through the venues outputted, the error occurs on the first venue after a successful result which makes me think there is a pervasive piece of code in the yelp.php file, causing if ($yelpID = $searchresults['yelpID']) { to be ignored until a positive result is found (a yelpID in the db), i.e. each venue is correctly displayed with a yelp number of reviews until a blank venue is encountered. The preceding venues' number of reviews is then displayed and this continues for each blank venue until a venue is found with a yelpID when it shows the correct number again. The error reoccurs on the next venue output with no yelpID and so on.
Sample erroneous output: (line 1 is var_dump)
string(23) "bayview-hotel-bushmills"
Bayview Hotel
Read 3 reviews on yelp
Benedicts
Read 3 reviews on yelp (note no var_dump output, this link contains the url for the Bayview Hotel entry above)
string(31) "bushmills-inn-hotel-bushmills-2"
Bushmills Inn Hotel
Read 7 reviews on yelp
I suspect this would be a new question rather than clutter/confuse this one further?
END OF EDIT
Note: I'm aware of the need to upgrade to mysqli but I have thousands of lines of legacy code to update. For now I'm working on functionality before reviewing the code for best practice.
Since the yelp.php is sort of a blackbox; the best explanation for this behavior would be that it only set's those variables if it finds a match. Updating your code to this should fix that:
unset($yrating, $numreviews, $ratingimg, $url);
require('yelp.php');
I also noticed this peculiar if-statement, do you realize that's an assignment or is this a copy/paste error? If you want to test (that's what if is for)
if ($yelpID == $searchresults['yelpID']) {
I'm trying to audit a vast amount of company data from companycheck.co.uk my current script appears to be looping the first 10 results from only the first page. I had the script gather more than 10 results at one point, but this caused a fatal error after around 600 results (not a timeout error, but a connection error of some sort), I need the script to be more reliable as I'm fetching over 40,000 results.
My code so far:
<?php
set_time_limit(0);
ini_set('max_execution_time', 0);
require 'vendor/autoload.php';
require "Guzzle/guzzle.phar";
// Add this to allow your app to use Guzzle and the Cookie Plugin.
use Guzzle\Http\Client as GuzzleClient;
use Guzzle\Plugin\Cookie\Cookie;
use Guzzle\Plugin\Cookie\CookiePlugin;
use Guzzle\Plugin\Cookie\CookieJar\ArrayCookieJar;
use Guzzle\Plugin\Cookie\CookieJar\CookieJarInterface;
$Pagesurl = 'http://companycheck.co.uk/search/UpdateSearchCompany?searchTerm=cars&type=name';
$pagesData = json_decode(file_get_contents($Pagesurl), true);
$resultsFound = $pagesData["hits"]["found"];
$pages = ceil($resultsFound / 10);
//echo $pages;
echo "<br>";
for ($p = 0; $p < $pages; $p++) {
$url = 'http://companycheck.co.uk/search/UpdateSearchCompany?searchTerm=cars&type=name&companyPage=' . $p . '';
$data = json_decode(file_get_contents($url), true);
for ($i = 0; $i < 11; $i++) {
$id = $data["hits"]["hit"][$i]["id"];
$TradingAddress = $data["hits"]["hit"][$i]["data"]["address"][0];
$companyName = $data["hits"]["hit"][$i]["data"]["companyname"][0];
$companyNumber = $data["hits"]["hit"][$i]["data"]["companynumber"][0];
$finalURL = "http://companycheck.co.uk/company/" . $id . "";
$httpClient = new GuzzleClient($finalURL);
$httpClient->setSslVerification(FALSE);
$cookieJar = new ArrayCookieJar();
// Create a new cookie plugin
$cookiePlugin = new CookiePlugin($cookieJar);
// Add the cookie plugin to the client
$httpClient->addSubscriber($cookiePlugin);
$httpClient->setUserAgent("Opera/9.23 (Windows NT 5.1; U; en-US)");
$request = $httpClient->get($finalURL);
$response = $request->send();
$body = $response->getBody(true);
$matches = array();
preg_match_all('/<table.*?>(.*?)<\/table>/si', $body, $table);
preg_match('/<meta name=\"keywords\" content=\"(.*?)\"\/>/si', $body, $metaName);
preg_match('/<p itemprop="streetAddress".*?>(.*?)<\/p>/si', $body, $regOffice);
echo "<table><tbody>";
echo "<tr><th>Company Name</th><td>";
echo $companyName;
echo "</td></tr>";
echo "<tr><th>Company Number</th><td>";
echo $companyNumber;
echo "</td></tr>";
echo "<tr><th>Registar Address</th><td>";
echo str_replace("<br>", " ", $regOffice[0]);
echo "</td></tr>
<tr><th>Trading Address</th><td>";
echo $TradingAddress;
echo "</td></tr>
<tr>
<th>Director Name</th>
<td>";
$name = explode(',', $metaName[1]);
echo $name[2];
echo "</td>
</tr></tbody></table>";
echo $table[0][1];
echo "<br><br><br>";
}
}
To get each page, I use http://companycheck.co.uk/search/UpdateSearchCompany?searchTerm=cars&type=name&companyPage=1 which returns json for each page from http://companycheck.co.uk/search/results?SearchCompaniesForm[name]=cars&yt1= and some data, but not all.
With this I can get the ID of each company to navigate to each link and scrape some data from the frontend of the site.
For example the first result is:
"hits":{"found":42842,"start":0,"hit":[{"id":"08958547","data":{"address":["THE ALEXANDER SUITE SILK POINT, QUEENS AVENUE, MACCLESFIELD, SK10 2BB"],"assets":[],"assetsnegative":[],"cashatbank":[],"cashatbanknegative":[],"companyname":["CAR2CARS LIMITED"],"companynumber":["08958547"],"dissolved":["0"],"liabilities":[],"liabilitiesnegative":[],"networth":[],"networthnegative":[],"postcode":["SK10 2BB"],"siccode":[]}}
So the first link is: http://companycheck.co.uk/company/08958547
Then from this I can pull table data such as:
Registered Office
THE ALEXANDER SUITE SILK POINT
QUEENS AVENUE
MACCLESFIELD
SK10 2BB
And information from the meta tags such as:
<meta name="keywords" content="CAR2CARS LIMITED, 08958547,INCWISE COMPANY SECRETARIES LIMITED,MR ROBERT CARTER"/>
An example of one of the results returned:
Company Name CAR2CARS LIMITED
Company Number 08958547
Registar Address
THE ALEXANDER SUITE SILK POINT QUEENS AVENUE MACCLESFIELD SK10 2BB
Trading Address THE ALEXANDER SUITE SILK POINT, QUEENS AVENUE, MACCLESFIELD, SK10 2BB
Director Name INCWISE COMPANY SECRETARIES LIMITED
Telephone No telephone number available.
Email Address No email address available.
Contact Person No contact person available.
Business Activity No Business Activity on record.
Each json page contains 10 company IDs to put into the URL to find the company, from each of these companies I need to scrape data from the full URL, then after these 10 move onto the next page and get the next 10 and loop this up until the last page.
It is almost certainly blocking you deliberately due to an excessive number of requests. Try putting a pause in between requests - that might help you fly under their radar.
The website you are intending to scrape appears to be a private company that is reformatting and republishing data from Companies House, the official record of company information in the UK. This company offers an API which allows 10K requests per month, and this is either free or costs GBP200/month, depending on what data you need. Since you want 40K results immediately, it is no wonder they operate IP blocks.
The rights and wrongs of scraping are complicated, but there is an important point to understand: by copying someone else's data, you are attempting to avoid the costs of collating the data yourself. By taking them from someone else's server, you are also adding to their operational costs without reimbursing them, an economic phenomenon known as an externality.
There are some cases where I am sympathetic to passing on costs in this way, such as where the scrape target is engaged in potential market abuse (e.g. monopolistic practices) and scraping has an alleviating effect. I have heard that some airline companies operate anti-scraping devices because they don't want price scrapers to bring prices down. Since bringing prices down is in the interest of the consumer one could argue that the externality can be justified (on moral, if not legal grounds).
In your case, I would suggest obtaining this data directly from Companies House, where it might be available for a much lower cost. In any case, if you republish valuable data obtained from a scrape, having dodged technical attempts to block you, you may find yourself in legal trouble anyway. If in doubt (and if there is no moral or public interest defence such as I outlined earlier) get in touch with the site operator and ask if what you want to do is OK.
This is my problem. I'm trying to retrieve a value from an XML url (from last.fm api). For example the biography of an artist.
I already know that I can do it simply like this (e.g. for Adele):
<?php
$xml = simplexml_load_file("http://ws.audioscrobbler.com/2.0/?method=artist.getinfo&artist=adele&api_key=b25b959554ed76058ac220b7b2e0a026");
$info = $xml->artist->bio->summary;
echo $info;
?>
This returns:
Adele Laurie Blue Adkins, (born 5 May 1988), is a Grammy Award-Winning English singer-songwriter from Enfield, North London. Her debut album, <a title="Adele - 19" href="http://www.last.fm/music/Adele/19" class="bbcode_album">19</a>, was released in January 2008 and entered the UK album chart at #1. The album has since received four-times Platinum certification in the UK and has sold 5,500,000 copies worldwide. The album included the hugely popular song <a title="Adele – Chasing Pavements" href="http://www.last.fm/music/Adele/_/Chasing+Pavements" class="bbcode_track">Chasing Pavements</a>. 19 earned Adele two Grammy Awards in February 2009 for Best New Artist and Best Female Pop Vocal Performance.
I would now like to put that result in a text file and store it on a folder on my website, for example: "http://mysite.com/artistdescriptions/adele.txt".
I've already tried:
<?php
$file_path = $_SERVER['DOCUMENT_ROOT'].'/artistdescriptions/adele.txt';
$xml = simplexml_load_file("http://ws.audioscrobbler.com/2.0/?method=artist.getinfo&artist=adele&api_key=b25b959554ed76058ac220b7b2e0a026");
$info = $xml->artist->bio->summary;
file_put_contents($file_path, serialize($info));
$file_contents = file_get_contents($file_path);
if(!empty($file_contents)) {
$data = unserialize($file_contents);
echo $data;
}
?>
But this unfortunately didn't work. Any help would be greatly appreciated!
Echo call magic method __toString(), but serialize doesn't, and in SimpleXMLElement isn't serialize allowed, so it throw an Exception. But you can call __toString() magic method by yourself, and this solve you problem.
Replace your line with mine
$info = $xml->artist->bio->summary->__toString();
I think example will be much better than loooong description :)
Let's assume we have an array of arrays:
("Server1", "Server_1", "Main Server", "192.168.0.3")
("Server_1", "VIP Server", "Main Server")
("Server_2", "192.168.0.4")
("192.168.0.3", "192.168.0.5")
("Server_2", "Backup")
Each line contains strings which are synonyms. And as a result of processing of this array I want to get this:
("Server1", "Server_1", "Main Server", "192.168.0.3", "VIP Server", "192.168.0.5")
("Server_2", "192.168.0.4", "Backup")
So I think I need a kind of recursive algorithm. Programming language actually doesn't matter — I need only a little help with idea in general. I'm going to use php or python.
Thank you!
This problem can be reduced to a problem in graph theory where you find all groups of connected nodes in a graph.
An efficient way to solve this problem is doing a "flood fill" algorithm, which is essentially a recursive breath first search. This wikipedia entry describes the flood fill algorithm and how it applies to solving the problem of finding connected regions of a graph.
To see how the original question can be made into a question on graphs: make each entry (e.g. "Server1", "Server_1", etc.) a node on a graph. Connect nodes with edges if and only if they are synonyms. A matrix data structure is particularly appropriate for keeping track of the edges, provided you have enough memory. Otherwise a sparse data structure like a map will work, especially since the number of synonyms will likely be limited.
Server1 is Node #0
Server_1 is Node #1
Server_2 is Node #2
Then edge[0][1] = edge[1][0] = 1, indicated that there is an edge between nodes #0 and #1 ( which means that they are synonyms ). While edge[0][2] = edge[2][0] = 0, indicating that Server1 and Server_2 are not synonyms.
Complexity Analysis
Creating this data structure is pretty efficient because a single linear pass with a lookup of the mapping of strings to node numbers is enough to crate it. If you store the mapping of strings to node numbers in a dictionary then this would be a O(n log n) step.
Doing the flood fill is O(n), you only visit each node in the graph once. So, the algorithm in all is O(n log n).
Introduce integer marking, which indicates synonym groups. On start one marks all words with different marks from 1 to N.
Then search trough your collection and if you find two words with indexes i and j are synonym, then remark all of words with marking i and j with lesser number of both. After N iteration you get all groups of synonyms.
It is some dirty and not throughly efficient solution, I believe one can get more performance with union-find structures.
Edit: This probably is NOT the most efficient way of solving your problem. If you are interested in max performance (e.g., if you have millions of values), you might be interested in writing more complex algorithm.
PHP, seems to be working (at least with data from given example):
$data = array(
array("Server1", "Server_1", "Main Server", "192.168.0.3"),
array("Server_1", "VIP Server", "Main Server"),
array("Server_2", "192.168.0.4"),
array("192.168.0.3", "192.168.0.5"),
array("Server_2", "Backup"),
);
do {
$foundSynonyms = false;
foreach ( $data as $firstKey => $firstValue ) {
foreach ( $data as $secondKey => $secondValue ) {
if ( $firstKey === $secondKey ) {
continue;
}
if ( array_intersect($firstValue, $secondValue) ) {
$data[$firstKey] = array_unique(array_merge($firstValue, $secondValue));
unset($data[$secondKey]);
$foundSynonyms = true;
break 2; // outer foreach
}
}
}
} while ( $foundSynonyms );
print_r($data);
Output:
Array
(
[0] => Array
(
[0] => Server1
[1] => Server_1
[2] => Main Server
[3] => 192.168.0.3
[4] => VIP Server
[6] => 192.168.0.5
)
[2] => Array
(
[0] => Server_2
[1] => 192.168.0.4
[3] => Backup
)
)
This would yield lower complexity then the PHP example (Python 3):
a = [set(("Server1", "Server_1", "Main Server", "192.168.0.3")),
set(("Server_1", "VIP Server", "Main Server")),
set(("Server_2", "192.168.0.4")),
set(("192.168.0.3", "192.168.0.5")),
set(("Server_2", "Backup"))]
b = {}
c = set()
for s in a:
full_s = s.copy()
for d in s:
if b.get(d):
full_s.update(b[d])
for d in full_s:
b[d] = full_s
c.add(frozenset(full_s))
for k,v in b.items():
fsv = frozenset(v)
if fsv in c:
print(list(fsv))
c.remove(fsv)
I was looking for a solution in python, so I came up with this solution. If you are willing to use python data structures like sets
you can use this solution too. "It's so simple a cave man can use it."
Simply this is the logic behind it.
foreach set_of_values in value_collection:
alreadyInSynonymSet = false
foreach synonym_set in synonym_collection:
if set_of_values in synonym_set:
alreadyInSynonymSet = true
synonym_set = synonym_set.union(set_of_values)
if not alreadyInSynonymSet:
synonym_collection.append(set(set_of_values))
vals = (
("Server1", "Server_1", "Main Server", "192.168.0.3"),
("Server_1", "VIP Server", "Main Server"),
("Server_2", "192.168.0.4"),
("192.168.0.3", "192.168.0.5"),
("Server_2", "Backup"),
)
value_sets = (set(value_tup) for value_tup in vals)
synonym_collection = []
for value_set in value_sets:
isConnected = False # If connected to a term in the graph
print(f'\nCurrent Value Set: {value_set}')
for synonyms in synonym_collection:
# IF two sets are disjoint, they don't have common elements
if not set(synonyms).isdisjoint(value_set):
isConnected = True
synonyms |= value_set # Appending elements of new value_set to synonymous set
break
# If it's not related to any other term, create a new set
if not isConnected:
print ('Value set not in graph, adding to graph...')
synonym_collection.append(value_set)
print('\nDone, Completed Graphing Synonyms')
print(synonym_collection)
This will have a result of
Current Value Set: {'Server1', 'Main Server', '192.168.0.3', 'Server_1'}
Value set not in graph, adding to graph...
Current Value Set: {'VIP Server', 'Main Server', 'Server_1'}
Current Value Set: {'192.168.0.4', 'Server_2'}
Value set not in graph, adding to graph...
Current Value Set: {'192.168.0.3', '192.168.0.5'}
Current Value Set: {'Server_2', 'Backup'}
Done, Completed Graphing Synonyms
[{'VIP Server', 'Main Server', '192.168.0.3', '192.168.0.5', 'Server1', 'Server_1'}, {'192.168.0.4', 'Server_2', 'Backup'}]