mysql_fetch_array won't loop through all results - php

First off, I know we have to get off of the deprecated PHP mysql_* functions and move to mysqli or PDO. However, that transition won't be happening here for a few weeks and I need to get this working ASAP.
Basically, I have code that works fine on our old server (PHP 5.2.13), as well as for smaller queries on our new server (PHP 5.4.20), but for larger queries it will only return a partial record set and then just... die, I guess? Which record it dies on depends on the query, but it pretty much always dies somewhere in the range of record 10k to 15k. I suspect it is dying because of some php.ini setting that imposes a limit, but I have no idea which one it would be. I've streamlined the code to the essentials here:
$query = $my_query;
$result = mysql_query($query) or die(mysql_error());
$record_count = mysql_num_rows($result);
echo "Query has returned " . $record_count . " records.<br>";

$y = 0;
while ($row = mysql_fetch_array($result, MYSQL_ASSOC))
{
    echo "START";
    echo $y . " ";
    foreach ($row as $key => $value)
    {
        echo $value . " ";
    }
    $y = $y + 1;
    echo "END" . "<br>";
}
echo "GOT OUT OF THERE!";
So yeah, record_count will echo that the query returned about 250k records, but the loop only gets through somewhere between 10-15k of them, echoes the final "END", and then just plain stops. It never gets back to the next "START", nor does it ever reach "GOT OUT OF THERE!" And again, this exact same code works fine on our old server, as well as for smaller queries on our new server.
Anyone have any ideas what the issue is?

It's probably just timing out. You can override the server's default timeout settings for an individual script by adding this line:
set_time_limit(0);
This will allow the script to run forever. If you want to set a different time limit, the parameter is in seconds, so for instance, this will allow the script to run for 5 minutes:
set_time_limit(300);
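For what it's worth, a minimal sketch of where the call would go in the script from the question (the same $my_query and mysql_* connection are assumed):

<?php
// Lift the execution time limit before starting the long fetch loop.
// set_time_limit(0) overrides the max_execution_time setting for this script only.
set_time_limit(0);

$result = mysql_query($my_query) or die(mysql_error());
echo "Query has returned " . mysql_num_rows($result) . " records.<br>";

while ($row = mysql_fetch_array($result, MYSQL_ASSOC)) {
    // ... process $row exactly as before ...
}
echo "GOT OUT OF THERE!";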

Related

MySQL PHP Insert Limit

I'm trying to insert a bunch of data into a database, but it's getting hung up on the insert: it gets to 5,000 entries and then stops. The problem is that at least one table has 44,000 entries.
I'm using PHP to gather the info that goes into the database. I'd like to enter the data in a loop, submitting 5,000 entries at a time, but I'm not sure how to write that.
Is there a way, without editing the initial query, that I can loop through the results 5,000 at a time? It would have to stop inserting after 5,000 entries and then start up again at the same spot it left off.
$listings = $rets->SearchQuery("Property", "Listing", $query);
echo "Total Listings found: {$rets->TotalRecordsFound()}<br>\n";
if ($listings) {
    echo "'Listings' Success.<br><br />";
} else {
    echo "'Listings' - Error!<br><br />";
    print_r($rets->Error());
    exit;
}
while ($record = $rets->FetchRow($listings)) {
    $mysqli->query("INSERT INTO Property VALUES (
    ...
}
There is a limit on each server, and 5000 at a time is a lot. Use your PHP script to insert about 500-1000 rows at a time, say every 10 minutes or so. You can use cron jobs on your server to automate the script runs. It might take a day or so, but you won't run out of bandwidth.
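As a rough sketch of that batching idea, assuming the $rets->FetchRow() loop and $mysqli connection from the question (the column list, bind types, field names, and batch size below are placeholders, not the real schema):

<?php
// Sketch: insert rows in batches, committing after every $batch_size inserts
// so the work is broken into smaller chunks instead of one huge insert.
$batch_size = 1000; // hypothetical batch size; tune for your server

$stmt = $mysqli->prepare(
    "INSERT INTO Property (listing_id, price, city) VALUES (?, ?, ?)" // placeholder columns
);

$mysqli->autocommit(false); // group the inserts into explicit commits
$count = 0;

while ($record = $rets->FetchRow($listings)) {
    // The RETS field names here are assumptions for illustration.
    $stmt->bind_param("sds", $record['ListingID'], $record['ListPrice'], $record['City']);
    $stmt->execute();

    if (++$count % $batch_size === 0) {
        $mysqli->commit(); // flush this batch, then keep going
    }
}

$mysqli->commit(); // flush the final partial batch
$stmt->close();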

Data transfer between web/db server

I have a PHP script on webserver 1 located in country A. I have a DB server located in country B.
The PHP script queries a large table on the DB server, groups the results, and inserts them back into the DB server (into another table). This is done with a single query (INSERT INTO ... SELECT ...).
My question here is: does the data actually transfer between the web and DB servers? E.g. is this using GBs of bandwidth on both servers?
If you never deal with any retrieved data on the web server, then the query won't send any data to webserver 1. Basically, if you just run the query, the only data that's sent is the text of the query (e.g. INSERT INTO ... SELECT ...), which is probably just a few bytes, and then the response, which is just a success/fail value. That's it.
To say it another way, the data from the SELECT part of your INSERT INTO SELECT query is all dealt with on the DB server, it's never sent to webserver 1.
Even if you did run a SELECT query on a remote database, you wouldn't get all of the results back at once. You actually get a resource. This resource is a reference to a set of data still on the remote database. When you do something like fetch_row on that resource, it fetches the next row. At that point, the data is transferred.
You can test this by monitoring the memory usage of your PHP script at various points in its execution using memory_get_usage. Try:
echo "Memory before query: " . memory_get_usage() . "\n";
$result = $mysqli->query("your select query");
echo "Memory after query: " . memory_get_usage() . "\n";
$data = array();
$i=1;
while ($row = $result->fetch_row()) {
$data[] = $row;
echo "Memory after reading row " . $i++ . ": " . memory_get_usage() . "\n";
}
You should see a very small increase in used memory after your SELECT, and then a steady increase as you iterate over the results.
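To illustrate the first point, here is a minimal sketch (the hostname, credentials, and table/column names are made up) of running the whole INSERT ... SELECT on the DB server, where the only things crossing the wire are the query text and a small success response:

<?php
// Sketch: the grouping and re-insert happen entirely on the DB server.
// Hostname, credentials, and table/column names are hypothetical.
$mysqli = new mysqli('db.example.com', 'user', 'pass', 'mydb');

$sql = "INSERT INTO daily_totals (day, total)
        SELECT DATE(created_at), SUM(amount)
        FROM transactions
        GROUP BY DATE(created_at)";

if ($mysqli->query($sql)) {
    // Only the query text went out and a small OK packet came back;
    // the grouped rows themselves never left the DB server.
    echo "Inserted " . $mysqli->affected_rows . " summary rows\n";
} else {
    echo "Error: " . $mysqli->error . "\n";
}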

How do I optimize a slow query caused by a large dataset?

I have to pull back a lot of information, and as a result my page is taking about 22-24 seconds to load. Is there anything I can do to optimize my code?
Here is my code:
<?php
$result_rules = $db->query("SELECT source_id, destination_id FROM dbo.rules");
while ($row_rules = sqlsrv_fetch_array($result_rules)) {
    $result_destination = $db->query("SELECT pk_id, project FROM dbo.destination WHERE pk_id=" . $row_rules['destination_id'] . " ORDER by project ASC");
    while ($row_destination = sqlsrv_fetch_array($result_destination)) {
        echo "Destination project: ";
        echo "<span class='item'>" . $row_destination['project'] . "</span>";
        echo "ID: " . $row_rules['destination_id'] . "<br>";
        if ($row_rules['source_id'] == null) {
            echo "Source ID for Destination ID " . $row_rules['destination_id'] . " is NULL<br>";
        } else {
            $result_source = $db->query("SELECT pk_id, project FROM dbo.source WHERE pk_id=" . $row_rules['source_id'] . " ORDER by project ASC");
            while ($row_source = sqlsrv_fetch_array($result_source)) {
                echo "Source project: ";
                echo $row_source['project'];
                echo " ID: " . $row_rules['source_id'] . "<br>";
            }
        }
    }
}
?>
Here's what my tables look like:
Source table: pk_id:int, project:varchar(50), feature:varchar(50), milestone:varchar(50), reviewGroup:varchar(125), groupId:int
Rules table: pk_id:int, source_id:int, destination_id:int, login:varchar(50), status:varchar(50), batchId:int, srcPGroupId:int, dstPGroupId:int
Destination table: pk_id:int, project:varchar(50), feature:varchar(50), milestone:varchar(50), QAAssignedTo:varchar(50), ValidationAssignedTo:varchar(50), Priority:varchar(50), groupId:int
If you want help with optimizing queries then please provide details of the schema and the output of the explain plan.
Running queries inside nested loops like this is a recipe for VERY poor performance, since the number of queries multiplies at every level of nesting. Using '*' in SELECT is bad for performance too (particularly as you're only ever using a couple of columns).
You should start by optimizing your PHP and merging the queries:
$result_rules = $db->query(
    "SELECT rule.destination_id, [whatever fields you need from dbo.rules],
        dest.project AS dest_project,
        src.project AS src_project,
        src.pk_id AS src_id
    FROM dbo.rules rule
    INNER JOIN dbo.destination dest
        ON dest.pk_id = rule.destination_id
    LEFT JOIN dbo.source src
        ON src.pk_id = rule.source_id
    ORDER BY rule.destination_id, dest.project, src.project");

$last_dest = false;
$last_src = false;
while ($row = sqlsrv_fetch_array($result_rules)) {
    if ($row['destination_id'] !== $last_dest) {
        echo "Destination project: ";
        echo "<span class='item'>" . $row['dest_project'] . "</span>";
        echo "ID: " . $row['destination_id'] . "<br>";
        $last_dest = $row['destination_id'];
    }
    if (null === $row['src_id']) {
... I'll let you sort out the rest.
Add an index on (pk_id, project) so it includes all fields important for the query.
Make sure that pk_Id is indexed: http://www.w3schools.com/sql/sql_create_index.asp
Rather than using select *, return only the columns you need, unless you need all of them.
I'd also recommend moving your SQL code into a stored procedure on the server and calling that.
You could consider using LIMIT if your back end is mysql: http://php.about.com/od/mysqlcommands/g/Limit_sql.htm .
I'm assuming that the else clause is what's slowing your code down. I would suggest saving all the data you're going to need at the start and then accessing that array in the else clause. Basically, you don't need this to run every time:
$result_destination = $db->query("SELECT * FROM dbo.destination WHERE pk_id=" . $row_rules['destination_id'] . " ORDER by project ASC")
You could grab the data earlier and use PHP to iterate over it.
$result_destinations = $db->query("SELECT * FROM dbo.destination ORDER by project ASC")
And then later in your code use PHP to determine the correct destination. Depending on exactly what you're doing it should shave some amount of time off.
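A rough sketch of that idea, assuming the same $db/sqlsrv_fetch_array helpers as the question: load all the destinations once, index them by pk_id, and do an array lookup inside the loop instead of re-querying.

<?php
// Sketch: fetch every destination once, keyed by pk_id, so the inner
// lookup becomes an array access instead of another query per row.
$destinations = array();
$result_destinations = $db->query("SELECT pk_id, project FROM dbo.destination ORDER BY project ASC");
while ($row = sqlsrv_fetch_array($result_destinations)) {
    $destinations[$row['pk_id']] = $row;
}

$result_rules = $db->query("SELECT source_id, destination_id FROM dbo.rules");
while ($row_rules = sqlsrv_fetch_array($result_rules)) {
    $dest_id = $row_rules['destination_id'];
    if (isset($destinations[$dest_id])) {
        echo "Destination project: ";
        echo "<span class='item'>" . $destinations[$dest_id]['project'] . "</span>";
        echo "ID: " . $dest_id . "<br>";
    }
    // ... the same pre-loading trick could be applied to dbo.source ...
}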
Another consideration is the time it takes for your browser to render the html generated by your php code. The more data you are presenting, the longer it's going to take. Depending on the requirements of your audience, you might want to display only x records at a time.
There are jquery methods of increasing the number of records displayed without going back to the server.
For starters you would want to lower the number of queries run. Doing a query, looping through those results and running another query for each row, then looping through that result set and running more queries, is generally considered bad: the number of queries multiplies with every level of nesting.
For example, say the first query returns 100 rows and each sub-query returns 10 rows. You loop over those 100 rows and run a query for each one; you are now at 101 queries. Each of those returns 10 rows, and if you run yet another query for each of those 1,000 rows, you are now at 1,101 queries. Every query has to send data to the server (the query text), wait for a response, and get data back. That round-trip overhead is what takes so long.
Use a join to do a single query on all the tables and loop over the single result.

MySQL to Redis on a huge table, how to speed things up?

I have a bit of a problem when I try to move a huge amount of data from a MySQL table to a Redis database. Anyway, I'm getting the error "MySQL server has gone away" after a while and I have no idea why.
EDIT:
OR, when I use the commented-out code that breaks the loop, it just says "finished" when it isn't actually finished.
This is the PHP code I use (run via php-cli):
<?php
require 'Predis/Autoloader.php';
Predis\Autoloader::register();

mysql_connect('localhost', 'root', 'notcorrect') or die(mysql_error());
mysql_select_db('database_that_i_use') or die(mysql_error());

$redis = new Predis\Client();

// starting on 0 but had to edit this when it crashed :(
for ($i = 3410000; $i < 999999999999; $i += 50000) {
    echo "Query from $i to " . ($i + 50000) . ", please wait...\n";
    $query = mysql_unbuffered_query('SELECT * FROM table LIMIT ' . $i . ', 50000') or die(mysql_error());

    // This was code I used before, but for some reason it got valid when it wasn't supposed to.
    /*if(mysql_num_rows($query) == 0) {
        echo "Script finished!\n";
        break;
    }*/

    while ($r = mysql_fetch_assoc($query)) {
        $a = array('campaign_id' => $r['campaign_id'],
                   'criteria_id' => $r['criteria_id'],
                   'date_added'  => $r['date_added'],
        );
        $redis->hmset($r['user_id'], $a);
        unset($a);
        usleep(10);
    }
    echo "Query completed for 50000 rows..\n";
    sleep(2);
}
unset($redis);
?>
My question is how to do this better; I seriously have no idea why it crashes. My server is pretty old and slow and maybe can't handle this amount of data? This is just a test server before we switch to real production.
Worth noticing is that the script ran fine for maybe half an hour. Could it be the LIMIT clause that makes it very slow once the offset gets high? Is there an easier way to do this? I need to transfer all the data today! :)
Thanks in advance.
EDIT: running example:
Query from 3410000 to 3460000, please wait...
Query completed for 50000 rows..
Query from 3460000 to 3510000, please wait...
Query completed for 50000 rows..
Query from 3510000 to 3560000, please wait...
Query completed for 50000 rows..
Query from 3560000 to 3610000, please wait...
MySQL server has gone away
EDIT:
The table consists of ~5 million rows of data and is approx. 800 MB in size.
But I need to do similar things for even larger tables later on..
First, you may want to use another scripting language. Perl, Python, Ruby, anything is better than PHP for running this kind of script.
I cannot comment on why the MySQL connection is lost, but to get better performance you need to eliminate as many round trips as you can with the MySQL server and the Redis server.
It means:
you should not use unbuffered queries but buffered ones (provided LIMIT is used in the query)
OR
you should not iterate over the MySQL data using LIMIT, since you get quadratic complexity when it should only be linear. I don't know if it can be avoided in PHP though.
you should pipeline the commands you send to Redis
Here is an example of pipelining with Predis:
https://github.com/nrk/predis/blob/v0.7/examples/PipelineContext.php
Actually, if I really had to use PHP for this, I would export the MySQL data to a text file (using SELECT ... INTO OUTFILE, for instance), then read the file back and use pipelining to push the data to Redis.
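A rough sketch of the pipelining idea, assuming a Predis version that supports $redis->pipeline() with a callback (as in the linked example), the same row fields as the question, and an arbitrary batch size:

<?php
// Sketch: batch HMSET commands through a Predis pipeline so each network
// round trip to Redis carries many commands instead of just one.
// $query is assumed to be the result handle from the question's loop.
require 'Predis/Autoloader.php';
Predis\Autoloader::register();

$redis = new Predis\Client();

function flush_batch($redis, $rows) {
    $redis->pipeline(function ($pipe) use ($rows) {
        foreach ($rows as $r) {
            $pipe->hmset($r['user_id'], array(
                'campaign_id' => $r['campaign_id'],
                'criteria_id' => $r['criteria_id'],
                'date_added'  => $r['date_added'],
            ));
        }
    });
}

$batch = array();
while ($r = mysql_fetch_assoc($query)) {
    $batch[] = $r;
    if (count($batch) >= 1000) {   // arbitrary batch size
        flush_batch($redis, $batch);
        $batch = array();
    }
}
flush_batch($redis, $batch);       // push whatever is left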

PHP query to MySQL very slow. What could be causing it?

I have a PHP page that queries a MySQL database; it returns about 20000 rows. However, the browser takes over 20 minutes to present them. I have added an index on my database and the query does use it; the query time on the command line is about 1 second for the 20000 rows, but in the web application it takes far longer. Does anyone know what is causing this problem, and a better way to improve it? Below is my PHP code to retrieve the data:
$query1 = "select * from table where Date between '2010-01-01' and '2010-12-31'";
$result1 = mysql_query($query1) or die('Query failed: ' . mysql_error());
while ($line = mysql_fetch_assoc($result1)) {
    echo "\t\t<tr>\n";
    $Data['Date'] = $line['Date'];
    $Data['Time'] = $line['Time'];
    $Data['Serial_No'] = $line['Serial_No'];
    $Data['Department'] = $line['Department'];
    $Data['Team'] = $line['Team'];
    foreach ($Data as $col_value) {
        echo "\t\t\t<td>$col_value</td>\n";
    }
    echo "\t\t</tr>\n";
}
Try adding an index to your date column.
Also, it's a good idea to learn about the EXPLAIN command.
As mentioned in the comments above, 1 second is still pretty long for your results.
You might consider putting all your output into a single variable and then echoing the variable once the loop is complete.
Also, browsers wait for tables to be completely formed before showing them, so that will slow your results (at least slow the process of building the results in the browser). A list may work better - or better yet a paged view if possible (as recommended in other answers).
It's not PHP that's causing it to be slow, but the browser itself rendering a huge page. Why do you have to display all that data anyway? You should paginate the results instead.
Try constructing a static HTML page with 20,000 table elements. You'll see how slow it is.
You can also improve that code:
while ($line = mysql_fetch_assoc($result1)) {
    echo "\t\t<tr>\n";
    foreach ($line as $col_value) {
        echo "\t\t\t<td>$col_value</td>\n";
        flush(); // optional, but gives your program a sense of responsiveness
    }
    echo "\t\t</tr>\n";
}
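Since the real fix suggested above is pagination, here is a minimal sketch of a LIMIT/OFFSET paged view; the page size and the ?page query parameter are assumptions, not part of the original code:

<?php
// Sketch: show 100 rows per page instead of all 20,000 at once.
$per_page = 100;                                            // hypothetical page size
$page     = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
$offset   = ($page - 1) * $per_page;

$query1  = "select * from table
            where Date between '2010-01-01' and '2010-12-31'
            limit $offset, $per_page";
$result1 = mysql_query($query1) or die('Query failed: ' . mysql_error());

echo "<table>\n";
while ($line = mysql_fetch_assoc($result1)) {
    echo "\t<tr>\n";
    foreach ($line as $col_value) {
        echo "\t\t<td>" . htmlspecialchars($col_value) . "</td>\n";
    }
    echo "\t</tr>\n";
}
echo "</table>\n";
echo '<a href="?page=' . ($page + 1) . '">Next page</a>';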
You could time each step of the script by echoing the time before and after connecting to the database, running the query, and outputting the results (see the sketch below).
This will tell you how long the different steps take. You may find out that it is indeed the traffic causing the delay and not the query.
On the other hand, when you have a table with millions of records, retrieving 20000 of them can take a long time, even when it is indexed. 20 minutes is extreme, though...
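A rough sketch of that timing approach, assuming the mysql_* connection from the question (the credentials and database name are placeholders):

<?php
// Sketch: time each phase with microtime(true) to see where the 20 minutes go.
$t0 = microtime(true);
mysql_connect('localhost', 'user', 'pass') or die(mysql_error());
mysql_select_db('mydb') or die(mysql_error());
echo "Connect: " . round(microtime(true) - $t0, 3) . "s\n";

$t1 = microtime(true);
$result1 = mysql_query("select * from table where Date between '2010-01-01' and '2010-12-31'")
    or die('Query failed: ' . mysql_error());
echo "Query: " . round(microtime(true) - $t1, 3) . "s\n";

$t2 = microtime(true);
while ($line = mysql_fetch_assoc($result1)) {
    // ... build and echo the table row as before ...
}
echo "Fetch + output: " . round(microtime(true) - $t2, 3) . "s\n";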
