I have a PHP script that writes table rows to files in batches of 5000. Once all the rows have been written, it should output the time the script took to run. Instead, the browser shows the page loading indefinitely and the output never appears, although the file(s) exist, so the script has run. This only happens with large amounts of data. Any suggestions?
$startTime = time();
$ID = '123';
$productBatchLimit = 5000;
$products = new Products();
$countProds = $products->countShopKeeperProducts();
//limit the amount of products
//if ($countProds > $productBatchLimit){$countProds = $productBatchLimit; }
$counter = 1;
for ($i = 0; $i < $countProds; $i += $productBatchLimit) {
$xml_file = 'xml/products/'. $ID . '_'. $counter .'.xml';
$xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
$xml .= "<products>\n";
//create new file
$fh = fopen($xml_file, 'a');
fwrite($fh, $xml);
$limit = $productBatchLimit*$counter;
$prodList = $products->getProducts($i, $limit);
foreach ($prodList as $prod ){
$xml = "<product>\n";
foreach ($prod as $key => $value){
$value = functions::xml_entities($value);
$xml .= "<{$key}>{$value}</{$key}>\n";
}
$xml .= "</product>\n";
fwrite($fh, $xml);
}
$counter++;
fwrite($fh, '</products>');
fclose($fh);
}
//check to see when XML is fully formed
$validxml = XMLReader::open($xml_file);
$validxml->setParserProperty(XMLReader::VALIDATE, true);
if ($validxml->isValid()==true){
$endTime = time();
echo "Total time to generate results: ".($endTime - $startTime)." seconds. \n";
} else {
echo "Problem saving Products XML.\n";
}
I want to read the URLs from a CSV file, fetch the title tag for each URL, and write each URL with its corresponding title to a new CSV. When I write the new CSV, all the URLs are listed, but every row shows only the title of the last URL. I have tried several ways to fix this without success.
Here is my code:
<?php
ini_set('max_execution_time', '300'); //300 seconds = 5 minutes
ini_set('max_execution_time', '0');
include('simple_html_dom.php');
// if (isset($_POST['resurl'])) {
// $url = $_POST['resurl'];
if (($csv_file = fopen("old.csv", "r", 'a')) !== FALSE) {
$arraydata = array();
while (($read_data = fgetcsv($csv_file, 1000, ",")) !== FALSE) {
$column_count = count($read_data);
for ($c = 0; $c < $column_count; $c++) {
array_push($arraydata, $read_data[$c]);
}
}
fclose($csv_file);
}
$title = [];
foreach ($arraydata as $ad) {
$ard = [];
$ard = $ad;
$html = file_get_html($ard);
if ($html) {
$title = $html->find('title', 0)->plaintext;
// echo '<pre>';
// print_r($title);
}
}
$ncsv = fopen("updated.csv", "a");
$head = "Url,Title";
fwrite($ncsv, "\n" . $head);
foreach ($arraydata as $value) {
// $ar[]=$value;
$csvdata = "$value,$title";
fwrite($ncsv, "\n" . $csvdata);
}
fclose($ncsv);
I've changed the code so that it writes the CSV file as it reads the HTML pages. This saves having another loop and an extra array of titles.
I've also changed it to use fputcsv to write the data out, as it sorts out things like escaping values.
// Open file, using w to clear the old file down
$ncsv = fopen('updated.csv', 'w');
// Write the header row once
fputcsv($ncsv, ['Url', 'Title']);
foreach ($arraydata as $ad) {
$html = file_get_html($ad);
// Fetch title, or set to blank if html is not loaded
if ($html) {
$title = $html->find('title', 0)->plaintext;
} else {
$title = '';
}
// Write record out: the URL from this iteration plus its title
fputcsv($ncsv, [$ad, $title]);
}
fclose($ncsv);
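To illustrate the escaping point, here is a minimal sketch rather than part of the original answer. The sample rows are made up, and php://memory stands in for updated.csv so the example is self-contained. The delimiter, enclosure and escape arguments are passed explicitly, which newer PHP versions prefer.

```php
<?php
// Hypothetical sample data: the first title contains a comma and double quotes.
$rows = [
    ['http://example.com/a', 'Hello, "World"'],
    ['http://example.com/b', 'Plain title'],
];

// php://memory keeps the example self-contained; a real script would open updated.csv.
$fh = fopen('php://memory', 'w+');
fputcsv($fh, ['Url', 'Title'], ',', '"', '\\');
foreach ($rows as $row) {
    fputcsv($fh, $row, ',', '"', '\\');
}

// Read the data back to confirm the awkward title survives the round trip.
rewind($fh);
$parsed = [];
while (($line = fgetcsv($fh, 0, ',', '"', '\\')) !== false) {
    $parsed[] = $line;
}
fclose($fh);

print_r($parsed);
```

Building the line by hand as "$value,$title" would split the first title into two columns when the file is read back.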
I was able to solve it finally.
Here is the updated code:
<?php
ini_set('max_execution_time', '300'); //300 seconds = 5 minutes
ini_set('max_execution_time', '0');
include('simple_html_dom.php');
// if (isset($_POST['resurl'])) {
// $url = $_POST['resurl'];
if (($csv_file = fopen("ntsurl.csv", "r", 'a')) !== FALSE) {
$arraydata = array();
while (($read_data = fgetcsv($csv_file, 1000, ",")) !== FALSE) {
$column_count = count($read_data);
for ($c = 0; $c < $column_count; $c++) {
array_push($arraydata, $read_data[$c]);
}
}
fclose($csv_file);
}
// print_r($arraydata);
$title=[];
$ncsv=fopen("ntsnew.csv","a");
$head="Website Url,title";
fwrite($ncsv,"\n".$head);
foreach($arraydata as $ad)
{
$ard = [];
$ard = $ad;
$html = file_get_html($ard);
if ($html) {
$title = $html->find('title', 0)->plaintext;
echo '<pre>';
print_r($title);
$csvdata="$ard,$title ";
fwrite($ncsv,"\n".$csvdata);
}
}
// fclose($ncsv);
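One thing the code above does not handle is a page that loads but has no title element. As a rough sketch of one way to guard against that, here is a helper built on PHP's bundled DOMDocument instead of simple_html_dom. The name pageTitle() is hypothetical, and the HTML strings are made-up samples.

```php
<?php
/**
 * Return the text of the first <title> element, or '' if there is none.
 * pageTitle() is a hypothetical helper name, not part of any library.
 */
function pageTitle(string $html): string
{
    if (trim($html) === '') {
        return ''; // nothing to parse
    }
    // Collect parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $node = $doc->getElementsByTagName('title')->item(0);
    return $node === null ? '' : trim($node->textContent);
}

echo pageTitle('<html><head><title> Example page </title></head><body></body></html>'), "\n";
echo pageTitle('<html><body><p>No title here</p></body></html>'), "\n";
```

With a helper like this, every URL can get a row in the CSV, with an empty title where none was found.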
I am trying to get the coin data of this website: http://www.tf2wh.com.
With this script:
$name = $_POST["item"];
$url = file_get_contents("http://www.tf2wh.com/allitems");
$dom = new DOMDocument();
@$dom->loadHTML($url);
$dom->saveHTML();
$code = "";
$xpath = new DOMXPath($dom);
foreach($xpath->query('//div[contains(attribute::class, "entry qual")]') as $e ) {
$code .= $e->nodeValue;
}
$code = substr($code,strpos($code,$name)-30,30);
$code = explode("(",$code);
$coins = "";
for($i = 0; $i < strlen($code[0]); $i++){
if(is_numeric($code[0][$i])){
$coins .= $code[0][$i];
}
}
echo $coins;
It works fine, but there are two problems. First, it is very slow: the time between request and response is around 15-30 seconds. Second, this error sometimes occurs:
Fatal error: Maximum execution time of 30 seconds exceeded in
C:\xampp\htdocs\steammarket\getCoins.php on line 6
How can I fix this performance problem?
The site you are connecting to is slow to respond.
First, raise the time limit at the top of the PHP code with set_time_limit(0); or ini_set('max_execution_time', 300); //300 seconds = 5 minutes
<?php
set_time_limit(0);
$name = $_POST["item"];
$url = file_get_contents("http://www.tf2wh.com/allitems");
$dom = new DOMDocument();
@$dom->loadHTML($url);
$dom->saveHTML();
$code = "";
$xpath = new DOMXPath($dom);
foreach($xpath->query('//div[contains(attribute::class, "entry qual")]') as $e ) {
$code .= $e->nodeValue;
}
$code = substr($code,strpos($code,$name)-30,30);
$code = explode("(",$code);
$coins = "";
for($i = 0; $i < strlen($code[0]); $i++){
if(is_numeric($code[0][$i])){
$coins .= $code[0][$i];
}
}
echo $coins;
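Raising the time limit stops the fatal error but does not make the request faster. One possible complement, sketched here under assumptions, is to cache the downloaded page on disk so that most requests skip the slow download. The names cachedFetch(), $cacheFile and $ttl are hypothetical, and a counter-based closure stands in for the real file_get_contents() call so the example runs offline.

```php
<?php
/**
 * Return cached content if it is fresh enough, otherwise call $fetch and store the result.
 * cachedFetch(), $cacheFile and $ttl are hypothetical names for this sketch.
 */
function cachedFetch(string $cacheFile, int $ttl, callable $fetch): string
{
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $ttl) {
        return file_get_contents($cacheFile);
    }
    $content = $fetch();
    file_put_contents($cacheFile, $content, LOCK_EX);
    return $content;
}

// Stand-in for the slow download; a real script might pass a closure that
// returns file_get_contents('http://www.tf2wh.com/allitems') instead.
$calls = 0;
$slowFetch = function () use (&$calls): string {
    $calls++;
    return '<html>item list</html>';
};

$cacheFile = sys_get_temp_dir() . '/allitems_' . uniqid() . '.html';
$first  = cachedFetch($cacheFile, 300, $slowFetch); // runs the fetch and writes the cache
$second = cachedFetch($cacheFile, 300, $slowFetch); // served from the cache file
unlink($cacheFile);

echo $calls, "\n";
```

The 300-second lifetime is an arbitrary choice; how stale the prices may be is a decision for the script's author. A real fetch can also fail, which this sketch does not handle.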
So far my script is working fine: it gets all the .htm files and outputs the results. However, I'm using DOM to get the HTML title tag from each file, and that's where I'm not able to get the title for the randomly chosen entry. (Image and .htm files share the same base name: firstresult.htm has the picture firstresult.jpg.)
I hope the code I provide will be useful.
<?php
// loop through the images
$count = 0;
$filenamenoext = array();
foreach (glob("/mydirectory/*.htm") as $filename) {
$filenamenoext[$count] = basename($filename, ".htm");
$count++;
}
for ($i = 0; $i < 10; $i++) {
$random = mt_rand(1, $count - 1);
$cachefile = "$filename";
$contents = file($cachefile);
$string = implode($contents);
$doc = new DOMDocument();
@$doc->loadHTML($string);
$nodes = $doc->getElementsByTagName('title');
//get and display what you need:
$title = $nodes->item(0)->nodeValue;
echo '<img class="image" src="'.$filenamenoext[$random].'.jpg" " />"'.$title.'"<BR><BR>';
}
?>
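The $filename variable you use on the line $cachefile = "$filename"; is not the one you want there. It is the foreach loop variable, so after the loop it still holds the last file that glob() returned, and every iteration reads that same file regardless of $random.
You should change it to
$cachefile = '/mydirectory/' . $filenamenoext[$random] . '.htm';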
Also, it's better practice to use the array_push() and count() functions instead of keeping a counter and filling the array manually. At the very least the code looks cleaner and is more readable.
<?php
// collect the base names of the .htm files
$dir = '/mydirectory/';
$filenamenoext = array();
foreach (glob($dir . "*.htm") as $filename) {
array_push($filenamenoext, basename($filename, ".htm"));
}
for ($i = 0; $i < 10; $i++) {
// array indexes start at 0, so start the range at 0 to include the first file
$random = mt_rand(0, count($filenamenoext) - 1);
// basename() dropped the directory, so add it back
$cachefile = $dir . $filenamenoext[$random] . '.htm';
$contents = file($cachefile);
$string = implode($contents);
$doc = new DOMDocument();
@$doc->loadHTML($string);
$nodes = $doc->getElementsByTagName('title');
//get and display what you need:
$title = $nodes->item(0)->nodeValue;
echo '<img class="image" src="' . $filenamenoext[$random] . '.jpg" />"' . $title . '"<BR><BR>';
}
?>
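Note that calling mt_rand() in a loop can pick the same file more than once. If distinct picks are wanted, one option is to shuffle a copy of the list and take the first few entries. This is only a sketch: the base names below are made-up stand-ins for the glob() results.

```php
<?php
// Hypothetical base names standing in for the glob() results.
$filenamenoext = ['alpha', 'bravo', 'charlie', 'delta', 'echo', 'foxtrot'];

// Pick at most 4 distinct entries: shuffle a copy, then take the first few.
$wanted = 4;
$pool = $filenamenoext;
shuffle($pool);
$picked = array_slice($pool, 0, min($wanted, count($pool)));

foreach ($picked as $name) {
    echo $name, ".htm -> ", $name, ".jpg\n";
}
```

Because the slice is capped at the size of the pool, this also behaves sensibly when the directory holds fewer files than requested.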
I'm trying to display the latest additions to this NVD XML file:
http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-recent.xml
I can get all of them to list using the following code, but I'm only interested in displaying the most recent ten (from 2013 for the time being) and the XML file lists them in chronological order (starting in 2011).
<?php
$file= 'http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-recent.xml';
$xml = file_get_contents($file);
$sxe = new SimpleXMLElement($xml);
$ns = $sxe->getNamespaces(true);
echo "<b>Latest Vulnerabilities:</b><p>";
foreach($sxe->entry as $entry)
{
$vuln = $entry->children($ns['vuln']);
$href = $vuln->references->reference->attributes()->href;
echo "" . $vuln->{'cve-id'} . "<br>";
}
?>
Since $sxe->entry is a SimpleXMLElement rather than a real array, you cannot slice it directly, so something like this should work for your needs:
$file= 'http://static.nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-recent.xml';
$xml = file_get_contents($file);
$sxe = new SimpleXMLElement($xml);
$ns = $sxe->getNamespaces(true);
echo "<b>Latest Vulnerabilities:</b><p>";
$all = $sxe->entry;
$length = count($all);
$offset_start = $length - 10;
for($i = 0; $i < $length; $i++)
{
if($i >= $offset_start)
{
$entry = $all[$i];
$vuln = $entry->children($ns['vuln']);
$href = $vuln->references->reference->attributes()->href;
echo "" . $vuln->{'cve-id'} . "<br>";
}
}
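An alternative sketch, assuming the entries fit comfortably in memory: copy them into a plain array and let array_slice() take the last ten. The feed below is a small generated stand-in without namespaces, not the real NVD file, and the id element is made up for the example.

```php
<?php
// Build a small stand-in feed with 12 entries; the real NVD feed uses
// namespaces, which this sketch leaves out.
$xml = '<feed>';
for ($n = 1; $n <= 12; $n++) {
    $xml .= "<entry><id>CVE-TEST-$n</id></entry>";
}
$xml .= '</feed>';

$sxe = new SimpleXMLElement($xml);

// Copy the entries into a plain array so that array functions work on them.
$entries = [];
foreach ($sxe->entry as $entry) {
    $entries[] = $entry;
}

// A negative offset takes items from the end; shorter lists are returned whole.
$latest = array_slice($entries, -10);

foreach ($latest as $entry) {
    echo $entry->id, "\n";
}
```

Wrapping the result in array_reverse() would list the newest entry first, if that order is preferred.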
I made a script that reads data from a .xls file and converts it into a .csv. A second part takes the .csv and puts it in an array, and a third part runs a foreach loop and should echo out the final variable at the end. Instead it echoes nothing, just a blank page. The file is definitely written correctly, but I don't know whether the script reads the CSV, because if I put an echo after the read, it is blank as well.
Here is my code:
<?php
ini_set('memory_limit', '300M');
$username = 'test';
function convert($in) {
require_once 'Excel/reader.php';
$excel = new Spreadsheet_Excel_Reader();
$excel->setOutputEncoding('CP1251');
$excel->read($in);
$x=1;
$sep = ",";
ob_start();
while($x<=$excel->sheets[0]['numRows']) {
$y=1;
$row="";
while($y<=$excel->sheets[0]['numCols']) {
$cell = isset($excel->sheets[0]['cells'][$x][$y]) ? $excel->sheets[0]['cells'][$x][$y] : '';
$row.=($row=="")?"\"".$cell."\"":"".$sep."\"".$cell."\"";
$y++;
}
echo $row."\n";
$x++;
}
return ob_get_contents();
ob_end_clean();
}
$csv = convert('usage.xls');
$file = $username . '.csv';
$fh = fopen($file, 'w') or die("Can't open the file");
$stringData = $csv;
fwrite($fh, $stringData);
fclose($fh);
$maxlinelength = 1000;
$fh = fopen($file);
$firstline = fgetcsv($fh, $maxlinelength);
$cols = count($firstline);
$row = 0;
$inventory = array();
while (($nextline = fgetcsv($fh, $maxlinelength)) !== FALSE )
{
for ( $i = 0; $i < $cols; ++$i )
{
$inventory[$firstline[$i]][$row] = $nextline[$i];
}
++$row;
}
fclose($fh);
$arr = $inventory['Category'];
$texts = 0;
$num2 = 0;
foreach($inventory['Category'] as $key => $value) {
$val = $value;
if (is_object($value)) { echo 'true'; }
if ($value == 'Messages ') {
$texts++;
}
}
echo 'You have used ' . $texts . ' text messages';
?>
Once you return, you cannot do anything else in the function:
return ob_get_contents();
ob_end_clean();//THIS NEVER HAPPENS
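Therefore the output buffer is never cleaned or closed, and anything echoed afterwards keeps going into it. Use ob_get_clean() instead, which returns the buffer contents and closes the buffer in one call.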
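As a minimal sketch of that pattern, the helper below captures whatever a callback echoes and leaves the buffer nesting level unchanged. The name captureOutput() is hypothetical, and the two echoed lines are sample data.

```php
<?php
/**
 * Run $producer and return everything it echoed as a string.
 * captureOutput() is a hypothetical helper name.
 */
function captureOutput(callable $producer): string
{
    ob_start();
    $producer();
    // ob_get_clean() returns the buffer contents and closes the buffer in one call.
    return ob_get_clean();
}

$levelBefore = ob_get_level();
$csv = captureOutput(function () {
    echo "\"a\",\"b\"\n";
    echo "\"c\",\"d\"\n";
});
$levelAfter = ob_get_level();

echo $csv;
```

Comparing ob_get_level() before and after the call is a quick way to confirm that no buffer was left open.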
I see a lot of repetitive useless operations there. Why not simply build an array with the data you're pulling out of the Excel file? You can then write out that array with fputcsv(), instead of building the CSV string yourself.
You then write the csv out to a file, then read the file back in and process it back into an array. Which begs the question... why? You've already got the raw individual bits of data at the moment you read from the excel file, so why all the fancy-ish giftwrapping only to tear it all apart again?
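A rough sketch of that idea follows: keep the rows in an array, count straight from it, and write a CSV from the same array only if a file is still wanted. The rows here are hypothetical stand-ins for what the spreadsheet reader returns, and php://memory replaces the real output file.

```php
<?php
// Hypothetical rows as they might come out of the spreadsheet reader:
// the first row is the header, the rest are data.
$rows = [
    ['Date', 'Category', 'Amount'],
    ['2013-01-01', 'Messages ', '1'],
    ['2013-01-02', 'Calls', '3'],
    ['2013-01-03', 'Messages', '1'],
];

$header = array_shift($rows);
$categoryIndex = array_search('Category', $header, true);

$texts = 0;
if ($categoryIndex !== false) {
    foreach ($rows as $row) {
        // trim() so that 'Messages ' and 'Messages' both count
        if (trim($row[$categoryIndex] ?? '') === 'Messages') {
            $texts++;
        }
    }
}

echo 'You have used ' . $texts . ' text messages', "\n";

// If a CSV copy is still wanted, write it from the same array.
$fh = fopen('php://memory', 'w+');
fputcsv($fh, $header, ',', '"', '\\');
foreach ($rows as $row) {
    fputcsv($fh, $row, ',', '"', '\\');
}
fclose($fh);
```

This avoids the write-then-reparse round trip entirely, so a problem while reading the file back cannot blank the page.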