I am building a website that scrapes data from another website, stores it in a database, and displays it as a table. Everything works fine as long as the number of rows is small (around 100), but when the data set grows to 300 rows or more, the data still gets stored in the database (MySQL, managed through phpMyAdmin) but nothing shows on the screen and the site just keeps loading. Below is a section of the PHP script I am running:
<?php
// configuration
require("../includes/helpers.php");

// initialize the current page and the number of pages
$page = 0;
$pages = 1;

// scrape data from each page
while ($pages--)
{
    // next page
    $page++;

    // scrape data from shiksha.com
    $string = @file_get_contents("http://www.shiksha.com/b-tech/colleges/b-tech-colleges-".urlencode($_POST["city"])."-{$page}");
    if ($string === false)
        apologize("Please enter a valid city name");

    if ($page === 1)
    {
        // count the total number of pages
        preg_match_all('/class=" linkpagination">/', $string, $result);
        $pages = sizeof($result[0]);
    }

    // pass the string on for scraping and storing in the database
    get_college_info($string, $page);

    // delay for 2s
    sleep(2);
}

// query the infrastructure table for the facilities of all colleges
$infra = query("SELECT college_id, facilities FROM infrastructure");

// prepare the query and select the data from the college_info table
$result = query("SELECT * FROM college_info");

// render (output) the results
render("result.php", ["title" => "result", "infra" => $infra, "result" => $result]);
?>
Interestingly, if I already have the data stored in my database and I just retrieve and print it, everything works fine and all the data, however large it is, gets printed. I have no clue what the problem is.
PS: I have already tried set_time_limit().
You are creating an infinite loop. To fix the issue, change the condition of your while loop to the one below.
while ($page < $pages)
{
    // your same code here
}
I have to start by saying I am still fairly new to PHP. I'm slowly getting my head around the functions by RTFM... I'm using procedural-style PHP with MySQLi, as I struggle with PDO at present.
I have been stuck on the question below for a while, and I fear that if I keep changing and amending things I am digging myself deeper into a hole, so I am putting it out there in the hope of some assistance.
I have data from an EPOS system in a MySQL database. I populated this data with a number of cURL handlers inside foreach loops in PHP, as the EPOS system's API requires a different call for each piece of data, e.g. transaction header, transaction items, tender, customers, etc.
Now that I have this data, I need to gather the parent:child elements into a multi-dimensional array and, eventually, once I have these arrays correct, output the data to a pipe-delimited file for import into a third-party system. I was planning to run this PHP file to poll the database at a configured frequency via a Windows scheduled task; I already have several of those running successfully in a production environment for other processes.
I hoped I could process this data in a similar way to how I fetched it, using nested foreach loops in PHP to populate an array that I can then output to a file.
I can get the first array to populate with the transaction IDs that form the primary key for fetching parent records, but I then get unexpected results from the sub-query within the foreach loop.
Using the code below, I seem to be creating 12 empty arrays from the sub-query, and I am unsure how to go about adding child elements to these parent arrays as a multi-dimensional array.
The code I currently have is as follows:
<?php
// Error reporting in browser - DEV ONLY.
error_reporting(E_ALL);
ini_set('display_errors', 1);

// get config & includes
include 'config.php';
include 'functions.php';

// set the runtime var of the script
$rt = date('Y-m-d H:i:s');

// connect to the mysql database
$link = $con;
mysqli_set_charset($link, 'utf8');

$sql = "SELECT DISTINCT TransactionId FROM epos_transactions WHERE Processed = 0";
$tranresult = mysqli_query($link, $sql);
if (mysqli_affected_rows($link) >= 0) {
    http_response_code(200);
} elseif (mysqli_affected_rows($link) == -1) {
    $err = mysqli_errno($link) . " : " . mysqli_error($link);
    exceptions_logger($err);
}
$transac = $tranresult->fetch_all(MYSQLI_ASSOC);

// loop over each TransactionId & produce TH records
foreach ($transac as $row) {
    $TransactionId = $row['TransactionId'];
    unset($sql);
    $sql = "SELECT * FROM EPOS.TH WHERE Trans_ID = $TransactionId";
    $resulth = mysqli_query($link, $sql);
    if (mysqli_affected_rows($link) >= 0) {
        http_response_code(200);
    } elseif (mysqli_affected_rows($link) == -1) {
        $err = mysqli_errno($link) . " : " . mysqli_error($link);
        exceptions_logger($err);
    }
    $throw = $resulth->fetch_all(MYSQLI_ASSOC);
}
var_dump($throw); // Only returning one result instead of 12 outside of the loop?
?>
The code above selects the IDs of transactions eligible for export, then tries to select the data for the parent rows of an array. Afterwards, I will need to select the child elements for each parent row and store them in an array as well, so that everything can eventually be output to a file with each transaction listing its parent and child items.
I will tackle the file-output process later on, but I was really hoping for some tips on the looping process to help get an initial data set.
Thanks in advance.
Resolved this one after some further research on here. I needed to initialize the array outside of the foreach loop, which then allows the results to be read outside the loop once it has been populated.
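A minimal sketch of that fix, using the same connection and parent table as the question. The array name $transactions and the child-items table EPOS.TI (with its Trans_ID column) are my assumptions, included only to show where child elements would nest:

// Initialize the array BEFORE the loop so it survives every iteration.
$transactions = [];

foreach ($transac as $row) {
    $TransactionId = (int) $row['TransactionId']; // cast added for safety

    // parent rows, keyed by transaction ID
    $resulth = mysqli_query($link, "SELECT * FROM EPOS.TH WHERE Trans_ID = $TransactionId");
    $transactions[$TransactionId]['header'] = $resulth->fetch_all(MYSQLI_ASSOC);

    // child rows (hypothetical EPOS.TI table) nest under the same key,
    // giving the parent:child multi-dimensional structure described above
    $resulti = mysqli_query($link, "SELECT * FROM EPOS.TI WHERE Trans_ID = $TransactionId");
    $transactions[$TransactionId]['items'] = $resulti->fetch_all(MYSQLI_ASSOC);
}

var_dump($transactions); // now contains all 12 transactions, not just the last one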
The following script updates the pageview count if the visit is from a unique visitor. The page retrieves blog posts from the database and prints them on the screen. When a blog post is visited for the first time, the script should increment its pageviews field by 1. But the script is updating the pageviews on every page refresh rather than recording only unique views.
if($_SESSION[$isPostID] != $isPostId
{
    try
    {
        $updatePageViews = $db2->prepare("UPDATE articles SET pageviews = pageviews+1 WHERE id = :id");
        $updatePageViews->execute(array(':id' => $isPostID));
        if($updatePageViews->rowCount() != 1)
        {
            @createLog("Unable to update pageviews.","Unable to update pageviews!!! Title = [".$istitle."].");
        }
        else
        {
            $_SESSION[$isPostID] = $isPostID;
        }
    }
    catch(PDOException $updatePageViewsERR)
    {
        $subject = "Pageviews Updation--Update data into database. [PAGE= ".$istitle."]. Error Code: #15";
        $text = $updatePageViewsERR->getMessage();
        @createLog($subject,$text);
    }
}
$isPostID is the unique ID assigned to each blog post in the database table.
Note: the session is already started in the script.
You have two errors in the first line.
First: there is no closing parenthesis in the condition. I guess it's just a typo; otherwise it would throw a fatal error.
Second: you are comparing $_SESSION[$isPostID] to $isPostId, but the variable you set everywhere else is $isPostID. PHP variable names are case-sensitive, so these are two different variables: $isPostId is always null, which makes the condition true on every refresh. This may be exactly why it doesn't work.
See if that solves the problem.
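A sketch of the corrected condition; the isset() guard is my addition, since the session key will not exist on a visitor's first view:

// Corrected: balanced parentheses and a consistent variable name.
if (!isset($_SESSION[$isPostID]) || $_SESSION[$isPostID] != $isPostID)
{
    // ... run the pageviews UPDATE exactly as in the question, then mark the post as seen:
    $_SESSION[$isPostID] = $isPostID;
}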
I have two PHP scripts that load many variable resources from APIs, causing response times as long as 2.2 to 4 seconds. Any suggestions on how to decrease the response times and increase efficiency would be much appreciated.
FIRST SCRIPT
require('path/to/local/API_2');

// Check that the user has submitted a query and that it's not empty
if (isset($_GET['query']) && !empty($_GET['query'])) {
    // $query is the user input
    $query = str_replace(" ", "+", $_GET['query']);
    $query = addslashes($query);

    // HTTP request to API_1, based on $query
    // max is the number of results I want back in JSON format
    $varlist = file_get_contents("http://ADRESS_OF_API_1.com?$query&max=10");

    // Convert the JSON to an array
    $varlist = json_decode($varlist, true);

    // Initialize the connection to API_2
    $myAPIKey = 'KEY';
    $client = new APIClient($myAPIKey, 'http://ADRESS_OF_API_2.com');
    $Api = new API_FUNCTION($client);

    $queries = 7;
    $result = ''; // accumulates the HTML built in the loop below (initialized to avoid an undefined-variable notice)

    // Go through $varlist and get data for each element in the array, then use it in HTML
    // Process all 8 results from the $varlist array
    for ($i = 0; $i <= $queries; ++$i) {
        // Get info from the API based on the ID included in the first API's data
        // I don't use all the info, but I can't control what I get back.
        $ALL_INFO = $Api->GET_FUNCTION_1($varlist[$i]['id']);

        // Separate $ALL_INFO into the info I use
        $varlist[$i]['INFO_1'] = $ALL_INFO['PATH_TO_INFO_1'];
        $varlist[$i]['INFO_2'] = $ALL_INFO['PATH_TO_INFO_2'];

        // Check that the info exists
        if ($varlist[$i]['INFO_1']) {
            // Concatenate the information into HTML
            $result .= '
            <div class="result">
                <h3>'.$varlist[$i]['id'].'</h3>
                <p>'.$varlist[$i]['INFO_1'].'</p>
                <p>'.$varlist[$i]['INFO_2'].'</p>
            </div>';
        } else {
            // In case of no result for a specific ID, increase $queries
            // Allows for 3 empty responses
            ++$queries;
        }
    }
} else {
    // If the user didn't enter a query, send them back to the main page to enter one.
    header("Location: http://websitename.com");
    die();
}
NOTE: $result accumulates the HTML built on each pass through the loop.
NOTE: Almost all of the time is spent in the for ($i = 0; $i <= 7; ++$i) loop.
SECOND SCRIPT
//Same API as before
require('path/to/local/API_2');

// Check that the query is set and not empty
if (isset($_GET['query']) && !empty($_GET['query'])) {
    // $query is a specific $varlist[$i]['id'], used to get more information on that data
    $query['id'] = str_replace(" ", "+", $_GET['query']);
    $query['id'] = addslashes($query['id']);

    // Initialize the connection to the only API used in this script
    $myAPIKey = 'KEY';
    $client = new APIClient($myAPIKey, 'http://ADRESS_OF_API_2.com');
    $Api = new API_FUNCTION($client);

    // Each call fetches one block of info; fields are pulled from the block
    // just fetched (the original snippet referenced each $ALL_INFO_n before
    // it was assigned, which looks like a slip made while anonymizing it)
    $ALL_INFO_1 = $Api->GET_FUNCTION_1($query['id']);
    $query['INFO_ADRESS_1.1'] = $ALL_INFO_1['INFO_ADRESS_1'];
    $query['INFO_ADRESS_1.2'] = $ALL_INFO_1['INFO_ADRESS_2'];

    $ALL_INFO_2 = $Api->GET_FUNCTION_2($query['id']);
    $query['INFO_ADRESS_2.1'] = $ALL_INFO_2['INFO_ADRESS_3'];

    $ALL_INFO_3 = $Api->GET_FUNCTION_3($query['id']);
    $query['INFO_ADRESS_3.1'] = $ALL_INFO_3['INFO_ADRESS_4'];

    $ALL_INFO_4 = $Api->GET_FUNCTION_4($query['id']);
    $query['INFO_ADRESS_4.1'] = $ALL_INFO_4['INFO_ADRESS_5'];
    $query['INFO_ADRESS_4.2'] = $ALL_INFO_4['INFO_ADRESS_6'];

    $ALL_INFO_5 = $Api->GET_FUNCTION_5($query['id']);
    $query['INFO_ADRESS_5.1'] = $ALL_INFO_5['INFO_ADRESS_7'];

    // $result = all of the $query data from the API
} else {
    // If there is no query, send the user back to the first script's page to enter one.
    header("Location: http://websitename.com/search");
    die();
}
NOTE: Similarly to the first script, most of the time is spent getting info from the secondary API.
NOTE: In the second script, the first API is replaced by a single specific variable from the first script's page, so $varlist[$i]['id'] = $query['id'].
NOTE: Again, $result is the HTML data.
You could also move the API calls out of your normal page load: respond to the user with a generic page to show that something is happening, then make an AJAX request to query the APIs and respond with the data. There really is no way to speed up an individual external request, so your best bet is to:
1. Minimize the number of requests (even if it means requesting a little more data once and filtering it on your side, versus sending multiple requests for small subsets of data).
2. Cache any remaining requests and pull from the cache.
3. Respond with a small page to let the user know something is happening, and make separate AJAX requests for the queried data.
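As a minimal sketch of the caching suggestion, a hypothetical file-based helper could wrap the file_get_contents() call from the first script (the temp-directory cache location and the 5-minute TTL are arbitrary choices):

// Hypothetical cache wrapper: serve a recent copy of an API response from
// disk when one exists; otherwise fetch it and store it. $ttl is in seconds.
function cached_fetch($url, $ttl = 300)
{
    $file = sys_get_temp_dir() . '/api_cache_' . md5($url);
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);  // cache hit
    }
    $body = file_get_contents($url);      // cache miss: call the API
    if ($body !== false) {
        file_put_contents($file, $body);  // store for next time
    }
    return $body;
}

// Usage, replacing the direct call in the first script:
$varlist = json_decode(cached_fetch("http://ADRESS_OF_API_1.com?$query&max=10"), true);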
I am having an issue. I have a bunch of inputs that share the same name. This creates arrays which are placed into the database, and that all works fine, but I am having a major dilemma trying to keep blank rows of inputs from creating blank entries in the database table.
The code is as such:
foreach ($_POST['datetime'] as $dbrow => $startdate) {
    if (isset($_POST['def'])) {
        $start = $startdate;
        $abc = $_POST['abc'][$dbrow];
        $def = $_POST['def'][$dbrow];
        $ghi = $_POST['ghi'][$dbrow];
        $db_insert = "INSERT INTO tablename (start, abc, def, ghi) VALUES('$start', '$abc', '$def', '$ghi')";
    }
}
This is fine and dandy, but I can't get the if statement to work. The form that POSTs the user's information has 5 rows of inputs, each with four columns (start, abc, def, ghi). If I enter data in every row of the form and submit it, then all the data goes to the database (yay, success), but if I only enter data in rows 1-4, the database still gets 5 rows of data. What am I doing wrong?
---------------------EDIT---------------------
OK, so upon a deeper look, what appears to be happening is that the code keeps wanting to submit ALL of the rows of dynamically generated inputs, whether they contain content or not. So I devised the following code:
$db_tur = count(array_filter($start_date));
for ($db_total_used_rows = 0; $db_total_used_rows <= $db_tur; $db_total_used_rows++) {
    $db_start_date = $_POST['datetime'][$db_total_used_rows];
    $db_abc = $_POST['abc'][$db_total_used_rows];
    $db_def = $_POST['def'][$db_total_used_rows];
    $db_ghi = $_POST['ghi'][$db_total_used_rows];
}
$db_insert = .....;
In theory, $db_tur counts up all of the inputs actually used under the start_date variable, and I then use that count as the loop bound via $db_total_used_rows. However, this doesn't seem to be limiting the total number of rows being inserted into the database. So I guess it's back to the drawing board, unless someone else has a better idea of how to accomplish this. I'm so ready to give up right now.
Use this loop structure:
foreach ($_POST['datetime'] as $dbrow => $start) {
$abc = $_POST['abc'][$dbrow];
$def = $_POST['def'][$dbrow];
$ghi = $_POST['ghi'][$dbrow];
if (!empty($abc) && !empty($def) && !empty($ghi) && !empty($start)) {
$db_insert = ...;
...
}
}
You could change && to || if it's OK for the user to leave some of the fields blank -- it will only skip rows where all fields are blank.
Except for checkboxes, all form fields will POST even when empty. So if def is a text box, it will always be present in your POST. A better test would be:
if(!empty($_POST['def'])) {
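Putting the two answers together, a sketch of the whole loop might look like the following. The mysqli connection $link is my assumption (the question never shows its connection code), and a prepared statement replaces the interpolated SQL from the question:

// Assumes a mysqli connection in $link; table and column names follow the question.
$stmt = mysqli_prepare($link,
    "INSERT INTO tablename (start, abc, def, ghi) VALUES (?, ?, ?, ?)");

foreach ($_POST['datetime'] as $dbrow => $start) {
    $abc = $_POST['abc'][$dbrow];
    $def = $_POST['def'][$dbrow];
    $ghi = $_POST['ghi'][$dbrow];

    // Skip rows where every field is blank (the || variant described above)
    if (empty($start) && empty($abc) && empty($def) && empty($ghi)) {
        continue;
    }

    mysqli_stmt_bind_param($stmt, "ssss", $start, $abc, $def, $ghi);
    mysqli_stmt_execute($stmt);
}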
Reason: I was assigned to run a script that advances a website. It's a fantasy football site, and there are several instances of the site located on different domains. Some have more than 80k users, and each user is supposed to have a team consisting of 15 players; hence some tables have (number of users) x (number of players) rows.
However, sometimes the script fails and the result gets corrupted, so I must back up the 10 tables in question before I execute the script. I also need to back up the tables to keep a historical record of users' actions, because a football season may last 50+ game weeks.
Task: to duplicate database tables using a PHP script. When I started, I used to back up the tables using SQLyog. It works, but it's time-consuming, since I have to wait for each table to be duplicated. Besides, the SQLyog application crashes while duplicating large tables, which can be very annoying.
Current solution: I have created a simple application with an interface that does the job, and it works great. It consists of three files: one for the database connection, a second for database manipulation, and a third for the user interface and to use the second file's code.
The thing is, it sometimes gets stuck in the middle of the table-duplication process.
Objective: to create an application to be used by an admin to facilitate database backups using MySQL + PHP.
My question: how do I ensure that the duplication script will back up each table completely, without hanging the server or interrupting the script?
Below I will include my code for the duplicating function, but basically these are the two crucial lines where I think the problem is located:
// duplicate the table structure
$query = "CREATE TABLE $this->dbName.`$newTableName` LIKE $this->dbName.`$oldTable`";
// duplicate the table data
$query = "INSERT INTO $this->dbName.`$newTableName` SELECT * FROM $this->dbName.`$oldTable`";
The rest of the code is solely validation in case an error occurs. If you wish to take a look at the whole thing, be my guest. Here's the function:
private function duplicateTable($oldTable, $newTableName)
{
    if ($this->isExistingTable($oldTable))
    {
        $this->printLogger("Original table is valid -table exists- : $oldTable");
    }
    else
    {
        $this->printrR("Original table is invalid -table does not exist- : $oldTable");
        return false;
    }

    if (!$this->isExistingTable($newTableName)) // make sure the new table does not already exist
    {
        $this->printLogger("Destination table name is valid -no table with this name- : $newTableName");
        // duplicate the table structure
        $query = "CREATE TABLE $this->dbName.`$newTableName` LIKE $this->dbName.`$oldTable`";
        $result = mysql_query($query) or $this->printrR("Error in query. Query:\n $query\n Error: " . mysql_error());
    }
    else
    {
        $this->printrR("Destination table is invalid -table already exists- : $newTableName");
        $this->printr("Now checking whether the tables actually match: $oldTable => $newTableName \n");
        $varifyStatus = $this->varifyDuplicatedTables($oldTable, $newTableName);
        if ($varifyStatus >= 0)
        {
            $this->printrG("Tables match; it seems they were duplicated before: $oldTable => $newTableName");
        }
        else
        {
            $this->printrR("The duplicate table exists, yet it doesn't match the original! $oldTable => $newTableName");
        }
        return false;
    }

    if ($result)
    {
        $this->printLogger("Query executed 1/2");
    }
    else
    {
        $this->printrR("Something went wrong in duplicateTable\nQuery: $query\n\n\nMySql_Error: " . mysql_error());
        return false;
    }

    if (!$this->isExistingTable($newTableName)) // validate that the table has been created
    {
        $this->printrR("Attempt to duplicate the table structure failed: $newTableName was not found after creation!");
        return false;
    }
    else
    {
        $this->printLogger("Table created successfully: $newTableName");
        // now checking the table structure
        $this->printLogger("Now comparing indexes ... ");
        $autoInc = $this->checkAutoInc($oldTable, $newTableName);
        if ($autoInc == 1)
        {
            $this->printLogger("Auto-increment key seems OK");
        }
        elseif ($autoInc == 0)
        {
            $this->printLogger("No auto-increment key in either table. Continuing anyway");
        }
        elseif ($autoInc == -1)
        {
            $this->printLogger("Auto-increment keys do not match!");
        }

        $time = $oldTable == 'team_details' ? 5 : 2;
        $msg = $oldTable == 'team_details' ? "This may take a while for team_details. Please wait." : "Please wait.";
        $this->printLogger("Sleeping for $time ...\n");
        sleep($time);
        $this->printLogger("Preparing to copy the data ...\n");

        // duplicate the table data
        $query = "INSERT INTO $this->dbName.`$newTableName` SELECT * FROM $this->dbName.`$oldTable`";
        $this->printLogger("Processing the data-copying query. $msg...\n\n\n");
        $result = mysql_query($query) or $this->printrR("Error in query. Query:\n $query\n Error: " . mysql_error());
        // ERROR usually happens here with large tables
        sleep($time); // to let the db process the current request
        $this->printLogger("Query executed 2/2");
        sleep($time); // to let the db process the current request

        if ($result)
        {
            $this->printLogger("Table created ($newTableName) and the data has been copied!");
            $this->printLogger("Confirming the number of rows ... ");
            /////////////////////////////////
            // start checking the row count
            $numRows = $this->checkCountRows($oldTable, $newTableName);
            if ($numRows)
            {
                $this->printLogger("Table duplicated successfully");
                return true;
            }
            else
            {
                $this->printLogger("Table duplicated, but please check the number of rows in $newTableName");
                return -3;
            }
            // end of checking the row count
            /////////////////////////////////
        } // end of if ($result) for query 2/2
        else
        {
            $this->printrR("Something went wrong in duplicateTable\nINSERT INTO $oldTable -> $newTableName\n\n$query\nmysql_error(): " . mysql_error());
            return false;
        }
    }
}
As you noticed, the function only duplicates a single table; that's why there is another function that takes an array of table names from the user and passes them one by one to duplicateTable().
If any other function should be included for this question, please let me know.
One solution pops into my mind: would duplicating tables part by part add any improvement? I'm not sure how INSERT INTO ... SELECT works, but maybe if I could insert, let's say, 25% at a time, it might help?
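As a sketch of that "part by part" idea: the copy can be broken into fixed-size chunks keyed on an auto-increment primary key (assumed here to be id; the chunk size is arbitrary). It uses the same deprecated mysql_* API as the question's code. Shorter statements hold locks for less time and are less likely to hit a timeout, though the total work is the same:

// Copy rows in chunks instead of one huge INSERT ... SELECT.
// Assumes the table has an auto-increment primary key named `id`.
$res   = mysql_query("SELECT MAX(id) FROM `$oldTable`");
$maxId = (int) mysql_result($res, 0);
$chunk = 10000; // rows per statement; tune to taste

for ($from = 0; $from < $maxId; $from += $chunk) {
    $to = $from + $chunk;
    $query = "INSERT INTO `$newTableName`
              SELECT * FROM `$oldTable`
              WHERE id > $from AND id <= $to";
    mysql_query($query) or die("Chunk $from-$to failed: " . mysql_error());
}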
"Sometimes the script fails and the result gets corrupted, therefore I must back up the 10 tables in question before I execute the script."
You probably need a different solution here: transactions. Wrap all the queries the failing script runs in a transaction. If the transaction fails, the data remains exactly as it was at the beginning of the operation; if the queries execute correctly, you are OK.
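A minimal sketch of that suggestion, using PDO for illustration (the question's code uses the old mysql_* API). The connection details, column names, and queries are placeholders, and the tables must use a transactional engine such as InnoDB for rollback to work:

$pdo = new PDO('mysql:host=localhost;dbname=fantasy', $user, $pass, array(
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
));

try {
    $pdo->beginTransaction();
    // ... every query the game-week script normally runs, e.g. (hypothetical):
    $pdo->exec("UPDATE team_details SET locked = 1 WHERE game_week = 12");
    $pdo->exec("UPDATE users SET transfers_left = transfers_left - 1 WHERE made_transfer = 1");
    $pdo->commit();      // all changes become visible at once
} catch (Exception $e) {
    $pdo->rollBack();    // the data is exactly as it was before the script ran
    throw $e;
}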
Why are you duplicating the table every time?
CLUSTERS are a good option; they can maintain duplicate copies of your tables in a distributed manner and are much more reliable and secure.