PHP performance issue - php

I am trying to get the coin data of this website: http://www.tf2wh.com.
With this script:
$name = $_POST["item"];
$url = file_get_contents("http://www.tf2wh.com/allitems");
$dom = new DOMDocument();
@$dom->loadHTML($url);
$dom->saveHTML();
$code = "";
$xpath = new DOMXPath($dom);
foreach($xpath->query('//div[contains(attribute::class, "entry qual")]') as $e ) {
$code .= $e->nodeValue;
}
$code = substr($code,strpos($code,$name)-30,30);
$code = explode("(",$code);
$coins = "";
for($i = 0; $i < strlen($code[0]); $i++){
if(is_numeric($code[0][$i])){
$coins .= $code[0][$i];
}
}
echo $coins;
It works, but there are two problems. First, it is very slow: the time between request and response is around 15-30 seconds. Second, this error sometimes occurs:
Fatal error: Maximum execution time of 30 seconds exceeded in
C:\xampp\htdocs\steammarket\getCoins.php on line 6
How can I fix this performance problem?

The connection to the remote site is slow.
Start your PHP code with set_time_limit(0); or ini_set('max_execution_time', 300); // 300 seconds = 5 minutes
<?php
set_time_limit(0);
$name = $_POST["item"];
$url = file_get_contents("http://www.tf2wh.com/allitems");
$dom = new DOMDocument();
@$dom->loadHTML($url);
$dom->saveHTML();
$code = "";
$xpath = new DOMXPath($dom);
foreach($xpath->query('//div[contains(attribute::class, "entry qual")]') as $e ) {
$code .= $e->nodeValue;
}
$code = substr($code,strpos($code,$name)-30,30);
$code = explode("(",$code);
$coins = "";
for($i = 0; $i < strlen($code[0]); $i++){
if(is_numeric($code[0][$i])){
$coins .= $code[0][$i];
}
}
echo $coins;
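Raising the time limit keeps the script alive, but every request still waits for the remote download. One possible complement, sketched here, is to cache the downloaded page on disk and fetch it again only when the copy is older than a chosen lifetime. The helper name cachedFetch(), the cache path, and the 10-minute lifetime are assumptions for illustration and are not part of the original code.

```php
<?php
/**
 * Return the cached content if it is still fresh; otherwise call $fetch
 * and store its result. cachedFetch() is a hypothetical helper, not a
 * built-in PHP function.
 */
function cachedFetch(string $cacheFile, int $maxAgeSeconds, callable $fetch): string
{
    clearstatcache(true, $cacheFile); // make sure file metadata is current
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAgeSeconds) {
        return (string) file_get_contents($cacheFile);
    }

    $content = $fetch(); // the slow part: only runs when the cache is stale
    if ($content === false) {
        throw new RuntimeException('Download failed');
    }
    file_put_contents($cacheFile, $content, LOCK_EX);

    return $content;
}

// Usage sketch (assumed cache path and lifetime):
// $html = cachedFetch(sys_get_temp_dir() . '/tf2wh_allitems.html', 600, function () {
//     return file_get_contents('http://www.tf2wh.com/allitems');
// });
```

With this in place, only the first request in each 10-minute window pays for the download; later requests read the local copy.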

Related

PHP/DOMXpath/DOMDocument - Unable to parse specific links

Here is my code. It is complete, so you can copy and paste it to run a test:
<?php
$url = "http://www.sportsdirect.com/ladies/ladies-underwear";
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTMLFile($url);
$xpath = new DOMXpath($doc);
$n = $xpath->query('//div[@class="s-producttext-top-wrapper"]');
$l = $xpath->query('//div[@class="s-producttext-top-wrapper"]/a');
$p = $xpath->query('//div[@class="s-largered"]');
$nl = $xpath->query('//a[@class="swipeNextClick NextLink"]');
$NextLink = $nl->item(0)->getAttribute("data-dcp");
$item = 0;
foreach ($n as $entry) {
$Name = $entry->nodeValue;
$Link = $l->item($item)->getAttribute("href");
$Price = $p->item($item)->nodeValue;
$Find = array('£');
$Replace = array('');
$Price = str_replace($Find, $Replace, $Price);
echo "Name: $Name - Link: $Link - Price: $Price - $NextLink<br>";
$item++;
}
?>
This is parsing all the products from http://www.sportsdirect.com/ladies/ladies-underwear which are on the FIRST page.
Here is the link for the second page http://www.sportsdirect.com/ladies/ladies-underwear#dcp=2&dppp=100&OrderBy=rank
And when I execute this code to get all the products from the SECOND page:
<?php
$url = "http://www.sportsdirect.com/ladies/ladies-underwear#dcp=2&dppp=100&OrderBy=rank";
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTMLFile($url);
$xpath = new DOMXpath($doc);
$n = $xpath->query('//div[@class="s-producttext-top-wrapper"]');
$l = $xpath->query('//div[@class="s-producttext-top-wrapper"]/a');
$p = $xpath->query('//div[@class="s-largered"]');
$nl = $xpath->query('//a[@class="swipeNextClick NextLink"]');
$NextLink = $nl->item(0)->getAttribute("data-dcp");
$item = 0;
foreach ($n as $entry) {
$Name = $entry->nodeValue;
$Link = $l->item($item)->getAttribute("href");
$Price = $p->item($item)->nodeValue;
$Find = array('£');
$Replace = array('');
$Price = str_replace($Find, $Replace, $Price);
echo "Name: $Name - Link: $Link - Price: $Price - $NextLink<br>";
$item++;
}
?>
I still get the results for the products of the FIRST page. Why?
How can I parse all the products from page 2? Where is my mistake?
Can you please help me out?
Thanks in advance!
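One thing worth checking here: everything after # in a URL is a fragment, which the browser keeps to itself and does not send to the server in the HTTP request. A server-side fetch of the second URL therefore asks for the same resource as the first, which would explain the identical results. The sketch below only splits the URL to make that visible. How the site actually loads page 2 (most likely a separate JavaScript request) is not shown in the question and would have to be looked up in the browser's network tools.

```php
<?php
$url = "http://www.sportsdirect.com/ladies/ladies-underwear#dcp=2&dppp=100&OrderBy=rank";
$parts = parse_url($url);

// What the server is actually asked for: scheme, host and path only.
$requested = $parts['scheme'] . '://' . $parts['host'] . $parts['path'];

// The fragment never leaves the client; parse it to see the paging values.
parse_str($parts['fragment'], $fragmentParams);

echo "Requested from server: ", $requested, PHP_EOL;
echo "Page number held in the fragment: ", $fragmentParams['dcp'], PHP_EOL;
```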

PHP script is timing out

I am pulling data from a page, and I know this is a long process depending on the date being pulled. After 132 seconds of pulling data, the page times out.
I have set set_time_limit(0); and ignore_user_abort(true); and I am not sure what else to do to keep the script alive and pull all the data.
I have added the code below in case there is something I can do to speed it up.
set_time_limit(0);
ignore_user_abort(true);
error_reporting(-1);
ini_set('display_errors', 'On');
include "../include/class.php";
include "../include/db.php";
//the below will get the list of id's for each race that day
function curl($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch,CURLOPT_FOLLOWLOCATION,true);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
$url = "http://form.timeform.betfair.com/daypage?date=20150516"; //WILL NEED TO PULL TOMORROWS DATE AS DD-MM-YYY
$html = curl($url);
$dom = new DOMDocument();
@$dom->loadHTML($html);
$dom->preserveWhiteSpace = false;
$xpath = new DOMXPath($dom);
//pull the individual cards for the day
//li class="rac-cardsclass="ix ixc"
$getdropdown = '//div[contains(@data-location, "RACING_COUNTRY_GB_IE")]//div[contains(@class, "course")]';
$getdropdown2 = $xpath->query($getdropdown);
//loop through each individual card
foreach($getdropdown2 as $dropresults) {
//loop through and get all the a tags
$arr = $dropresults->getElementsByTagName("a");
foreach($arr as $item) {
//only grab the links which point to the results page
if(strpos($item->getAttribute('href'), 'raceresult') !== false) {
//grab the code
$code = explode("=", $item->getAttribute('href'));
$code = end($code);
$url = "http://form.timeform.betfair.com/raceresult?raceId=" . $code; //WILL NEED TO PULL TOMORROWS DATE AS DD-MM-YYY
$html = curl($url);
$dom = new DOMDocument();
@$dom->loadHTML($html);
$dom->preserveWhiteSpace = false;
$xpath = new DOMXPath($dom);
$spanTexts = array();
//get the place name
$getplacename = '//span[contains(@class, "locality")]';
$getplacename2 = $xpath->query($getplacename);
//loop through each individual card
foreach($getplacename2 as $getplacename22) {
echo "Venue: " . $venue = $getplacename22->textContent;
} //$getplacename2 as $getplacename22
$gettime = '//abbr [contains(@class, "dtstart")]';
//get the Date and the Time
$gettime2 = $xpath->query($gettime);
foreach($gettime2 as $gettime22) {
echo "Date : " . $Dateandtime = date(trim($gettime22->getAttribute('title')), strtotime('+5 hours'));
} //$gettime2 as $gettime22
//pull the data for the race e.g going money ect
$getdropdown22 = '//div[contains(@class, "content")]/p';
$getdropdown222 = $xpath->query($getdropdown22);
foreach($getdropdown222 as $dropresults2) {
$racename = trim($dropresults2->childNodes->item(0)->textContent);
//foreach ($dropresults2->childNodes as $node) { if(is_object($node)) { echo $node->nodeType; } else { echo $node; } }
foreach($dropresults2->childNodes as $node) {
if(is_object($node) && $node->nodeType === XML_ELEMENT_NODE && strtolower($node->tagName) === 'span') {
$spanTexts[] = (string) $node->textContent;
} //is_object($node) && $node->nodeType === XML_ELEMENT_NODE && strtolower($node->tagName) === 'span'
} //$dropresults2->childNodes as $node
if(count($spanTexts) < 6)
continue;
list($going, $distance, $age, $prizemoney, $runners, $racetype) = $spanTexts;
$going = str_replace(array(
'Â',
'Going:',
'|'
), '', $going);
$distance = miletofurlong($distance = trim(GetBetween($distance, ':', 'Â')));
$age = trim(GetBetween($age, ':', 'Â'));
$prizemoney = trim(GetBetween($prizemoney, '£', 'Â'));
$runners = trim(GetBetween($runners, ':', 'Â'));
$racetype = trim(GetBetween($racetype, ':', 'Â'));
} //$getdropdown222 as $dropresults2
//pull the individual horse data
$getdropdown = '//div[contains(@class, "table-container")]//tbody//tr';
$getdropdown2 = $xpath->query($getdropdown);
//loop through each individual card
foreach($getdropdown2 as $dropresults) {
$position = $dropresults->childNodes->item(0)->childNodes->item(1)->textContent;
$draw = str_replace(array('(',')'), '', $dropresults->childNodes->item(0)->childNodes->item(3)->textContent);
$losingdist = str_replace('Â', '', trim($dropresults->childNodes->item(2)->textContent));
if(strpos($losingdist, '¾') !== false) {
$losingdist = str_replace('¾', '.75', $losingdist);
} //strpos($losingdist, '¾') !== false
if(strpos($losingdist, '½') !== false) {
$losingdist = str_replace('½', '.5', $losingdist);
} //strpos($losingdist, '½') !== false
if(strpos($losingdist, '¼') !== false) {
$losingdist = str_replace('¼', '.25', $losingdist);
} //strpos($losingdist, '¼') !== false
$losingdist;
$horse = trim(preg_replace("/\([^\)]+\)/","",str_replace("'","",trim($dropresults->childNodes->item(4)->textContent))));
$horseage = trim($dropresults->childNodes->item(6)->textContent);
$weight = trim($dropresults->childNodes->item(8)->childNodes->item(1)->textContent);
$or = str_replace(array('(',')'), '', trim($dropresults->childNodes->item(8)->childNodes->item(3)->textContent));
str_replace('-', '', $eq = trim($dropresults->childNodes->item(10)->textContent));
$jockey = trim($dropresults->childNodes->item(12)->childNodes->item(1)->textContent);
$trainer = trim($dropresults->childNodes->item(12)->childNodes->item(4)->textContent);
$highandlowinrunning = trim($dropresults->childNodes->item(14)->childNodes->item(1)->textContent);
$highandlow = explode("/", $highandlowinrunning);
str_replace('-', '', $lowodds = trim($highandlow['1']));
str_replace('-', '', $highodds = trim($highandlow['0']));
$bfsp = trim($dropresults->childNodes->item(16)->childNodes->item(1)->textContent);
$isp = trim(str_replace('/', '', $dropresults->childNodes->item(16)->childNodes->item(3)->textContent));
$placeodds = trim($dropresults->childNodes->item(18)->textContent);
$venue = mysqli_real_escape_string($db, $venue);
$Dateandtime = mysqli_real_escape_string($db,$Dateandtime);
$going = mysqli_real_escape_string($db, $going);
$distance = mysqli_real_escape_string($db,$distance);
$age = mysqli_real_escape_string($db,$age);
$prizemoney = mysqli_real_escape_string($db,$prizemoney);
$runners = mysqli_real_escape_string($db,$runners );
$racetype = mysqli_real_escape_string($db,$racetype);
$position = mysqli_real_escape_string($db,$position );
$draw = mysqli_real_escape_string($db,$draw);
$losingdist = mysqli_real_escape_string($db,$losingdist);
$horse = mysqli_real_escape_string($db,$horse );
$age = mysqli_real_escape_string($db,$age);
$weight = mysqli_real_escape_string($db,$weight);
$or = mysqli_real_escape_string($db,$or );
$eq = mysqli_real_escape_string($db,$eq );
$jockey = mysqli_real_escape_string($db,$jockey);
$trainer = mysqli_real_escape_string($db,$trainer);
$lowodds = mysqli_real_escape_string($db,$lowodds);
$highodds = mysqli_real_escape_string($db,$highodds);
$bfsp = mysqli_real_escape_string($db,$bfsp);
$isp = mysqli_real_escape_string($db,$isp);
$placeodds = mysqli_real_escape_string($db,$placeodds);
$sql = "
INSERT INTO `Race_Records`
(
`Venue`,
`DateandTime`,
`Going`,
`Distance`,
`Age`,
`PrizeMoney`,
`Runners`,
`RaceType`,
`Position`,
`Draw`,
`LosingDist`,
`Horse`,
`HorseAge`,
`Weight`,
`OR`,
`EQ`,
`Jockey`,
`Trainer`,
`InRunningLow`,
`InRunningHigh`,
`BFSP`,
`ISP`,
`PlaceOdds`,
`RaceName`
)
VALUES
(
'$venue',
'$Dateandtime',
'$going',
'$distance',
'$age',
'$prizemoney',
'$runners',
'$racetype',
'$position',
'$draw',
'$losingdist',
'$horse',
'$age',
'$weight',
'$or',
'$eq',
'$jockey',
'$trainer',
'$lowodds',
'$highodds',
'$bfsp',
'$isp',
'$placeodds',
'$racename'
)
";
$res = mysqli_query($db, $sql);
if (!$res) {
echo PHP_EOL . "FAIL: $sql";
trigger_error(mysqli_error($db), E_USER_ERROR);
}
}
}
}
}
$id = date_create($id);
$theid2 = date_format($id,"d-m-Y");
$url = "www.sportinglife.com/racing/results/".$theid2; //WILL NEED TO PULL TOMORROWS DATE AS DD-MM-YYY
$html = curl($url);
$dom = new DOMDocument();
@$dom->loadHTML($html);
$dom->preserveWhiteSpace = false;
$xpath = new DOMXPath($dom);
$getdropdown = '//li[contains(@class, "rac-cards")]//div[contains(@class, "ix ixv")]';
$getdropdown2 = $xpath->query($getdropdown);
//loop through each individual card
foreach($getdropdown2 as $dropresults) {
//loop through and get all the a tags
$arr = $dropresults->getElementsByTagName("a");
foreach($arr as $item) {
//only grab the links which point to the results page
//grab the code
$getcomments = $item->getAttribute('href');
foreach ($listofcorses as $bad) {
if (strstr( strtolower($getcomments),strtolower($bad)) !== false) {
$url = "http://www.sportinglife.com/".$getcomments; //WILL NEED TO PULL TOMORROWS DATE AS DD-MM-YYY
$html = curl($url);
$dom = new DOMDocument();
@$dom->loadHTML($html);
$dom->preserveWhiteSpace = false;
$xpath = new DOMXPath($dom);
$spanTexts = array();
//get the place name
$getplacename = '//table';
$getplacename2 = $xpath->query($getplacename);
//loop through each individual card
$loopnumber = 0;
foreach($getplacename2 as $getplacename22) {
// get how many child nodes are in the loop
$count = 0;
foreach($getplacename22 ->childNodes->item(11)->childNodes as $node)
if(!($node instanceof \DomText))
$count++;
//loop through and get the horses name and the comment
for ($i = 0; $i < $count; $i++) {
if ($i % 2 == 0)
{
if ($getplacename22 ->childNodes->item(11)->childNodes->item($i)->childNodes->item(4) != null)
{
$horse = mysqli_real_escape_string($db,trim(preg_replace("/[^A-Za-z ]+/", "", preg_replace("/\([^\)]+\)/","",trim($getplacename22 ->childNodes->item(11)->childNodes->item($i)->childNodes->item(4)->textContent)))));
$check = "ok";
}
else
{
$check = "no";
}
}
else
{
if ($check == "ok") {
$comments = mysqli_real_escape_string($db,trim($getplacename22 ->childNodes->item(11)->childNodes->item($i)->textContent));
//update the database
$results = $db->query("UPDATE Race_Records SET comments= '$comments' WHERE Horse='$horse'");
}
}
}
}
}
}
}
}
?>
You could try setting curl's timeout:
curl_setopt($ch,CURLOPT_TIMEOUT,1000);
You might also want to check whether the services you are accessing in the loop are rate-limited. If so, put an appropriate sleep in the loop so that you aren't making too many requests to the service in consecutive cycles. It could well be that the code runs fine but then times out after a number of HTTP requests to the remote service.
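The pause between requests could be sketched roughly as follows. The helper name fetchAllThrottled(), its parameters, and the pause length are assumptions for illustration. The fetch callback stands in for the curl() function from the question, and the right delay depends on limits the remote service may or may not publish.

```php
<?php
/**
 * Fetch each URL in turn, sleeping between consecutive requests.
 * fetchAllThrottled() is a hypothetical helper for this sketch.
 *
 * @param string[] $urls         URLs to fetch, in order
 * @param callable $fetch        function(string $url) returning the response body
 * @param float    $pauseSeconds delay inserted between two requests
 * @return array   response bodies keyed by URL
 */
function fetchAllThrottled(array $urls, callable $fetch, float $pauseSeconds = 1.0): array
{
    $urls = array_values($urls);
    $last = count($urls) - 1;
    $results = [];

    foreach ($urls as $i => $url) {
        $results[$url] = $fetch($url);
        // No need to wait after the final request.
        if ($i < $last && $pauseSeconds > 0) {
            usleep((int) round($pauseSeconds * 1000000));
        }
    }

    return $results;
}

// Usage sketch, assuming $raceUrls holds the race result URLs
// and curl() is the function defined in the question:
// $pages = fetchAllThrottled($raceUrls, 'curl', 0.5);
```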
Set max execution time
// Begin your php code with this
ini_set('max_execution_time',300); // 60s*5=300s 5 minutes

Display most recent additions to XML file?

I'm trying to display the latest additions to this NVD XML file:
http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-recent.xml
I can get all of them to list using the following code, but I'm only interested in displaying the most recent ten (from 2013 for the time being) and the XML file lists them in chronological order (starting in 2011).
<?php
$file= 'http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-recent.xml';
$xml = file_get_contents($file);
$sxe = new SimpleXMLElement($xml);
$ns = $sxe->getNamespaces(true);
echo "<b>Latest Vulnerabilities:</b><p>";
foreach($sxe->entry as $entry)
{
$vuln = $entry->children($ns['vuln']);
$href = $vuln->references->reference->attributes()->href;
echo "" . $vuln->{'cve-id'} . "<br>";
}
?>
Since you cannot manipulate the XML arrays directly, something like this should work for your needs:
$file= 'http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-recent.xml';
$xml = file_get_contents($file);
$sxe = new SimpleXMLElement($xml);
$ns = $sxe->getNamespaces(true);
echo "<b>Latest Vulnerabilities:</b><p>";
$all = $sxe->entry;
$length = count($all);
$offset_start = $length - 10;
for($i = 0; $i < $length; $i++)
{
if($i >= $offset_start)
{
$entry = $all[$i];
$vuln = $entry->children($ns['vuln']);
$href = $vuln->references->reference->attributes()->href;
echo "" . $vuln->{'cve-id'} . "<br>";
}
}
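An alternative sketch of the same idea: copy the entries into a plain PHP array and take the tail with array_slice(), which avoids the index bookkeeping. The helper name lastEntries() is hypothetical, and the small inline document stands in for the real feed. With the NVD file, the vuln namespace handling from the code above would still be needed inside the loop.

```php
<?php
/**
 * Return the last $n <entry> children as a plain array.
 * lastEntries() is a hypothetical helper for this sketch.
 */
function lastEntries(SimpleXMLElement $sxe, int $n): array
{
    if ($n <= 0) {
        return [];
    }
    // false = do not preserve keys, since every child shares the key "entry"
    $entries = iterator_to_array($sxe->entry, false);

    return array_slice($entries, -$n);
}

// Small inline document standing in for the real feed (12 entries).
$xml = '<feed>';
for ($i = 1; $i <= 12; $i++) {
    $xml .= "<entry><id>CVE-$i</id></entry>";
}
$xml .= '</feed>';

foreach (lastEntries(new SimpleXMLElement($xml), 10) as $entry) {
    echo $entry->id, PHP_EOL;
}
```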

Working with large XML files in PHP

I have a problem using XMLReader and simplexml_import_dom. I'm trying to search four files of 2 MB each and one 27 MB file. The problem is not memory but execution time (around 50 s). How can I optimize the code?
public function searchInFeed()
{
$feed =& $this->getModel('feed','afiliereFeeduriModel');
$myfeeds = $feed->selectFeed();
foreach ($myfeeds as $f)
{
$x = new XMLReader();
$x->open($f->url);
$z = microtime(true);
$doc = new DOMDocument('1.0', 'UTF-8');
set_time_limit(0);
while ($x->read())
{
if ($x->nodeType === XMLReader::ELEMENT)
{
$nod = simplexml_import_dom($doc->importNode($x->expand(), true));
$data['text'] = 'Chicco termometru';
$data['titlu'] = 'title';
$data['nod'] = &$nod;
if ($this->searchInXML($data))
{
echo $nod->title."<br>";
}
$x->next();
}
}
}
echo microtime(true) - $z."<br>";
echo memory_get_usage()/1024/1024;
die();
}
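As written, the loop appears to expand the first element it meets, which is normally the document root, so the whole file is built as a single node before any searching happens. A common XMLReader pattern is to skip forward to the first record element and then expand one record at a time, moving between siblings with next(). A minimal sketch, assuming each record is a product element with a title child. Both names, the helper findTitles(), and the case-insensitive substring match are assumptions, since the feed layout and searchInXML() are not shown.

```php
<?php
/**
 * Collect the titles of all records whose title contains $needle.
 * findTitles() and the element names are hypothetical for this sketch.
 */
function findTitles(XMLReader $reader, string $element, string $needle): array
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $matches = [];

    // Skip forward until the cursor sits on the first record element.
    while ($reader->read() && $reader->name !== $element) {
        // nothing to do, just advance
    }

    // Expand one record at a time, then jump to the next sibling record.
    while ($reader->name === $element) {
        $node = simplexml_import_dom($doc->importNode($reader->expand(), true));
        $title = (string) $node->title;
        if (stripos($title, $needle) !== false) {
            $matches[] = $title;
        }
        $reader->next($element);
    }

    return $matches;
}

// Usage sketch with an inline document instead of a feed URL.
$reader = new XMLReader();
$reader->XML('<feed><product><title>Chicco termometru digital</title></product>'
    . '<product><title>Something else</title></product></feed>');
print_r(findTitles($reader, 'product', 'Chicco termometru'));
$reader->close();
```

For a real feed, $reader->open($f->url) would replace the inline XML() call. Download time for remote feeds is a separate cost that this sketch does not address.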

Get PHP script to output result when completed - browser continues running

I have a PHP script that writes data to files in batches of 5000 table rows. When all the rows have been written to file(s) it should output the time taken to run the script. Instead what is happening is the script appears to be continually running on the browser and the output never appears but the file(s) exist, meaning the script has run. This only happens with large amounts of data. Any suggestions?
$startTime = time();
$ID = '123';
$productBatchLimit = 5000;
$products = new Products();
$countProds = $products->countShopKeeperProducts();
//limit the amount of products
//if ($countProds > $productBatchLimit){$countProds = $productBatchLimit; }
$counter = 1;
for ($i = 0; $i < $countProds; $i += $productBatchLimit) {
$xml_file = 'xml/products/'. $ID . '_'. $counter .'.xml';
$xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
$xml .= "<products>\n";
//create new file
$fh = fopen($xml_file, 'a');
fwrite($fh, $xml);
$limit = $productBatchLimit*$counter;
$prodList = $products->getProducts($i, $limit);
foreach ($prodList as $prod ){
$xml = "<product>\n";
foreach ($prod as $key => $value){
$value = functions::xml_entities($value);
$xml .= "<{$key}>{$value}</{$key}>\n";
}
$xml .= "</product>\n";
fwrite($fh, $xml);
}
$counter++;
fwrite($fh, '</products>');
fclose($fh);
}
//check to see when XML is fully formed
$validxml = XMLReader::open($xml_file);
$validxml->setParserProperty(XMLReader::VALIDATE, true);
if ($validxml->isValid()==true){
$endTime = time();
echo "Total time to generate results: ".($endTime - $startTime)." seconds. \n";
} else {
echo "Problem saving Products XML.\n";
}
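One detail of the final check may be worth a second look. XMLReader::VALIDATE validates against a DTD, which these generated files do not declare, and isValid() reports on what has been parsed so far rather than on the whole file. If the goal is only to confirm that a file was written completely, a well-formedness check that reads the file to the end is one alternative. A minimal sketch, where isWellFormedXml() is a hypothetical helper name. It would need to be called once per generated file, since the code above only checks the last one.

```php
<?php
/**
 * Return true if the file at $path parses to the end without libxml errors.
 * isWellFormedXml() is a hypothetical helper for this sketch.
 */
function isWellFormedXml(string $path): bool
{
    // Collect parse errors in a buffer instead of printing warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $reader = new XMLReader();
    $ok = @$reader->open($path) !== false;
    if ($ok) {
        // Walk the whole document; a parse error ends the loop early.
        while (@$reader->read()) {
            // nothing to do, just advance
        }
        $ok = count(libxml_get_errors()) === 0;
        $reader->close();
    }

    // Restore the previous libxml error handling.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

// Usage sketch: $xml_file is the path built in the loop above.
// echo isWellFormedXml($xml_file) ? "OK\n" : "Problem saving Products XML.\n";
```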
