Optimizing for loop in php

I have been running a foreach loop 1000 times on a PHP page. The code inside the foreach loop looks like this:
$first = mysql_query("SELECT givenname FROM first_names order by rand() LIMIT 1");
$first_n = mysql_fetch_array($first);
$first_name = $first_n['givenname'];
$last = mysql_query("SELECT surname FROM last_name order by rand() LIMIT 1");
$last_n = mysql_fetch_array($last);
$last_name = $last_n['surname'];
$first_lastname = $first_name . " " . $last_name;
$add = mysql_query("SELECT streetaddress FROM user_addresss order by rand() LIMIT 1");
$addr = mysql_fetch_array($add);
$address = $addr['streetaddress'];
$unlisted = "unlisted";
$available = "available";
$arr = array(
    $first_lastname,
    $address,
    $unlisted,
    $available
);
Then I have been using array_rand function to get a randomized value each time the loop runs:
<td><?php echo $arr[array_rand($arr)] ?></td>
Loading the PHP page is taking a really long time. Is there a way I could optimize this code? I need a unique value each time the loop runs.

The problem is not your PHP foreach loop. If you order your MySQL table by RAND(), you are making a serious mistake. Let me explain to you what happens when you do this.
Every time you make a MySQL request, MySQL will attempt to map your search parameters (WHERE, ORDER BY) to indices to cut down on the data read. It will then load the relevant info in memory for processing. If the info is too large, it will default to writing it to disk and reading from disk to perform the comparison. You want to avoid disk reads at all costs as they are inefficient, slow, repetitive and can sometimes be flat-out wrong under specific circumstances.
When MySQL finds an index it can use, it will load the index table instead. An index table is a hash table mapping index values to memory locations. So, for instance, the index table for a primary key looks like this:
id    location
1     0 bytes in
2     17 bytes in
3     34 bytes in
This is extremely efficient as even very large index tables can fit in tiny amounts of memory.
Why am I talking about indices? Because by using RAND(), you are preventing MySQL from using them. ORDER BY RAND() forces MySQL to create a new random value for each row. This requires MySQL to copy all your table data in what is called a temporary table, and to add a new field with the RAND() value. This table will be too big to store in memory, so it will be stored to disk.
When you tell MySQL to ORDER BY RAND() and the temporary table has been created, MySQL then compares rows pair by pair (MySQL sorting uses quicksort). Since the rows are too big, you're looking at quite a few disk reads for this operation. When it is done, it returns and you get your data, at a huge cost.
There are plenty of ways to prevent this massive overhead SNAFU. One of them is to select a random ID between 1 and the maximum ID and LIMIT the result to 1. This does not require the creation of an extra field. There are plenty of similar Stack Overflow questions.
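For example, a sketch of that approach, assuming first_names has an auto-increment id column without large gaps (the join keeps RAND() evaluated only once):
-- Pick one random id in a derived table, then take the first row at or
-- above it; no per-row RAND() and no temporary table.
SELECT f.givenname
FROM first_names AS f
JOIN (SELECT FLOOR(RAND() * MAX(id)) AS rid FROM first_names) AS r
  ON f.id >= r.rid
ORDER BY f.id
LIMIT 1;
Rows that sit right after gaps in the id sequence are slightly more likely to be picked, which is usually acceptable for filler data like this.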

It has already been explained why ORDER BY RAND() should be avoided, so I simply provide a way to do it with some faster queries.
First, get a random number based on your table size:
SELECT FLOOR(RAND()*COUNT(*)) FROM first_names
Second, use the random number in a LIMIT clause:
SELECT * FROM first_names LIMIT $pos,1
Unfortunately I don't think there is any way to combine the two queries into one.
Also, you can do a SELECT COUNT(*) FROM first_names once, store the number, and generate a random $pos in PHP as many times as you like.
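For instance, a rough PHP sketch of that idea, using the same legacy mysql_* API as the question (first_names and givenname are the names from the question):
// Count once, then generate as many random offsets as needed; each
// LIMIT $pos,1 query is cheap compared to ORDER BY RAND().
$res   = mysql_query("SELECT COUNT(*) FROM first_names");
$count = (int) mysql_result($res, 0);

$pos   = mt_rand(0, $count - 1);          // random offset computed in PHP
$res   = mysql_query("SELECT givenname FROM first_names LIMIT $pos,1");
$row   = mysql_fetch_array($res);
$first_name = $row['givenname'];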

You should switch to using either mysqli or PDO if your host supports it, but something like this should work. You will have to decide what to do if there aren't enough records in one of the tables, though (array_pad, or wrap the indexes around and start over).
function getRandomNames($qty){
    $qty = (int)$qty;
    $fnames = array();
    $lnames = array();
    $address = array();

    $sel = mysql_query("SELECT givenname FROM first_names order by rand() LIMIT ".$qty);
    while ($rec = mysql_fetch_array($sel)){ $fnames[] = $rec[0]; }

    $sel = mysql_query("SELECT surname FROM last_name order by rand() LIMIT ".$qty);
    while ($rec = mysql_fetch_array($sel)){ $lnames[] = $rec[0]; }

    $sel = mysql_query("SELECT streetaddress FROM user_addresss order by rand() LIMIT ".$qty);
    while ($rec = mysql_fetch_array($sel)){ $address[] = $rec[0]; }

    // let's stitch the results together
    $results = array();
    for($x = 0; $x < $qty; $x++){
        $results[] = array(
            "given_name"    => $fnames[$x],
            "surname"       => $lnames[$x],
            "streetaddress" => $address[$x]
        );
    }
    return $results;
}
Hope this helps
UPDATE
Based on Sébastien Renauld's answer, a more complete solution may be to structure the queries more like this:
"SELECT givenname from first_names where id in (select id from first_names order by rand() limit ".$qty.")";

Related

Compare performance between loop(foreach) and query database mysql with 1000 rows in PHP?

I want to compare the performance of filtering rows in a PHP loop (foreach) versus selecting with the same conditions in the query, on 1000 rows or more.
Example
Solution 1
$query1 = "select * from table
where table.name LIKE 'like-name'
and table.age = 1
and table.address = 'USA' ";
...run the query against the database...
then return $results, which holds the 100 to 1000+ rows I want.
Solution 2
$query2 = "select * from table where table.name LIKE 'like-name' ";
...run the query against the database...
then I loop over $results with foreach to filter them, like this:
$results2 = [];
foreach ($results as $result) {
    if ($result['age'] == 1 && $result['address'] == 'USA') {
        $results2[] = $result;
    }
}
return $results2;
and then I get the results I want.
So, which solution performs better? :)
Mostly, the first solution will be faster simply because it reduces the number of rows transferred from the DB to PHP. Moreover, if there is an index on age and address it may be even faster (though that depends on the source data).
Another advantage of filtering in MySQL is the query cache: there is a chance that someone has just run the same query and MySQL can return the cached result, which obviously improves performance.
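If you go with the first solution, such an index could be added with something like this (a sketch only; `table`, age and address are the placeholder names from the question):
ALTER TABLE `table` ADD INDEX idx_age_address (age, address);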

how to fix error with mysql random

I have a project in PHP + MySQL (over 2,000,000 rows). Please look at this PHP code:
<?php
for ($i = 0; $i < 20; $i++)
{
    $start = rand(1, 19980);
    $select_images_url_q = "SELECT * FROM photo_gen WHERE folder='$folder' LIMIT $start,2 ";
    $result_select = (mysql_query($select_images_url_q));
    while ($row = mysql_fetch_array($result_select))
    {
        echo '<li class="col-lg-2 col-md-3 col-sm-3 col-xs-4" style="height:150px">
            <img class="img-responsive" src="http://static.gif.plus/'.$folder.'/'.$row['code'].'_s.gif">
        </li>';
    }
}
?>
This code runs very slowly around the $start = rand(1,19980); part. Please help: how can I make a SELECT query using MySQL's random function? Thank you.
Depending on what your code is doing with $folder, you may be vulnerable to SQL injection.
For better security, consider moving to PDO or MySQLi and using prepared statements. I wrote a library called EasyDB to make it easier for developers to adopt better security practices.
The fast, sane, and efficient way to select N distinct random elements from a database is as follows:
1. Get the number of rows that match your condition (i.e. WHERE folder = ?).
2. Generate a random number between 0 and this number.
3. Select a row with the given offset, like you did.
4. Store the ID of the previously returned row in an ever-growing exclusion list, and decrement the number of rows.
An example that uses EasyDB is as follows:
// Connect to the database here:
$db = \ParagonIE\EasyDB\Factory::create(
    'mysql:host=localhost;dbname=something',
    'username',
    'putastrongpasswordhere'
);

// Maintain an array of previous record IDs in $exclude
$exclude = array();
$count = $db->single('SELECT count(id) FROM photo_gen WHERE folder = ?', $folder);

// Select _up to_ 40 values. If we have less than 40 in the folder, stop
// when we've run out of photos to load:
$max = $count < 40 ? $count : 40;

// The loop:
for ($i = 0; $i < $max; ++$i) {
    // The maximum value will decrease each iteration, which makes
    // sense given that we are excluding one more result each time
    $r = mt_rand(0, ($count - $i - 1));

    // Dynamic query
    $qs = "SELECT * FROM photo_gen WHERE folder = ?";

    // We add AND id NOT IN (2,6,7,19, ...) to prevent duplicates:
    if ($i > 0) {
        $qs .= " AND id NOT IN (" . implode(', ', $exclude) . ")";
    }
    $qs .= " ORDER BY id ASC LIMIT ".$r.", 1";

    $row = $db->row($qs, $folder);

    /**
     * Now you can operate on $row here. Feel free to copy the
     * contents of your while($row=...) loop in place of this comment.
     */

    // Prevent duplicates
    $exclude []= (int) $row['id'];
}
Gordon's answer suggests using ORDER BY RAND(), which in general is a bad idea and can make your queries very slow. Furthermore, although he says that you shouldn't need to worry about there being less than 40 rows (presumably, because of the probability involved), this will fail in edge cases.
A quick note about mt_rand(): It's a biased and predictable random number generator with only 4 billion possible seeds. If you want better results, look into random_int() (PHP 7 only, but I'm working on a compatibility layer for PHP 5 projects. See the linked answer for more information.)
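If you adopt that suggestion, the offset line in the loop above could look roughly like this (a sketch; random_int() needs PHP 7 or a polyfill such as the compatibility layer mentioned above):
// Prefer the CSPRNG-backed random_int() when it exists, otherwise fall
// back to mt_rand() as in the original loop.
if (function_exists('random_int')) {
    $r = random_int(0, $count - $i - 1);
} else {
    $r = mt_rand(0, $count - $i - 1);
}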
Actually, even though the table has 2+ million rows, I'm guessing that a given folder has many fewer. Hence, this should be reasonable with an index on photo_gen(folder):
SELECT *
FROM photo_gen
WHERE folder = '$folder'
ORDER BY rand()
LIMIT 40;
If a folder can still have tens or hundreds of thousands of examples, I would suggest a slight variation:
SELECT pg.*
FROM photo_gen pg cross join
     (select count(*) cnt from photo_gen where folder = '$folder') as cnt
WHERE folder = '$folder' and
      rand() < 500 / cnt
ORDER BY rand()
LIMIT 40;
The WHERE expression should get about 500 rows (subject to the vagaries of sample variation). There is a really high probability that there will be at least 40 (you don't need to worry about it). The final sort should be fast.
There are definitely other methods, but they are complicated by the where clause. The index is probably the key thing you need for improved performance.
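For reference, the index mentioned above might be created with something like this (the index name is arbitrary):
ALTER TABLE photo_gen ADD INDEX idx_folder (folder);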
It's better to compose your SQL query (as a string in PHP) once and then execute it just once.
Or you could use this way to select values if it fits your case: Select n random rows from SQL Server table

PHP how to run sql query one part at a time?

I have a table with roughly 1 million rows. I'm writing a simple program that prints out one field from each row. However, when I started using mysql_pconnect and mysql_query, the query would take a long time; I am assuming the query needs to finish before I can print out even the first row. Is there a way to process the data a bit at a time?
--Edited--
I am not looking to retrieve a small set of the data; I'm looking for a way to process the data a chunk at a time (say, fetch 10 rows, print 10 rows, fetch 10 rows, print 10 rows, and so on) rather than wait for the query to retrieve 1 million rows (who knows how long that takes) and then start printing.
Printing one million fields will take some time. Retrieving one million records will take some time. Time adds up.
Have you profiled your code? I'm not sure using limit would make such a drastic difference in this case.
Doing something like this
while ($row = mysql_fetch_object($res)) {
    echo $row->field."\n";
}
outputs one record at a time. It does not wait for the whole resultset to be returned.
If you are dealing with a browser you will need something more.
Such as this
ob_start();
$i = 0;
while ($row = mysql_fetch_object($res)) {
    echo $row->field."\n";
    if (($i++ % 1000) == 0) {
        ob_flush();
    }
}
ob_end_flush();
Do you really want to print one million fields?
The customary solution is to use some kind of output pagination in your web application, showing only part of the result. On SELECT queries you can use the LIMIT keyword to return only part of the data. This is basic SQL stuff, really. Example:
SELECT * FROM table WHERE (some conditions) LIMIT 40,20
shows 20 entries, starting from the 40th (off-by-one mistakes on my part may be possible).
It may be necessary to use ORDER BY along with LIMIT to prevent the ordering from randomly changing under your feet between requests.
This is commonly needed for pagination. You can use the limit keyword in your select query. Search for limit here:
The LIMIT clause can be used to constrain the number of rows returned by the SELECT statement. LIMIT takes one or two numeric arguments, which must both be nonnegative integer constants (except when using prepared statements).
With two arguments, the first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1):
SELECT * FROM tbl LIMIT 5,10; # Retrieve rows 6-15
To retrieve all rows from a certain offset up to the end of the result set, you can use some large number for the second parameter. This statement retrieves all rows from the 96th row to the last:
SELECT * FROM tbl LIMIT 95,18446744073709551615;
With one argument, the value specifies the number of rows to return from the beginning of the result set:
SELECT * FROM tbl LIMIT 5; # Retrieve first 5 rows
In other words, LIMIT row_count is equivalent to LIMIT 0, row_count.
You might be able to use mysqli::use_result combined with a flush to output the data set to the browser. I know flush can be used to output data to the browser incrementally, as I have used it before to do just that; however, I am not sure whether mysqli::use_result is the correct function to retrieve incomplete result sets.
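A rough sketch of how that might look (a sketch only; the connection credentials, field and table names are placeholders):
// MYSQLI_USE_RESULT streams rows from the server instead of buffering the
// whole result set first; flush() pushes output toward the browser.
$mysqli = new mysqli('localhost', 'user', 'pass', 'dbname');
$result = $mysqli->query('SELECT `field` FROM `table`', MYSQLI_USE_RESULT);
while ($row = $result->fetch_object()) {
    echo $row->field . "\n";
    flush();   // in practice you would probably flush every N rows
}
$result->close();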
This is how I do something like that in Oracle. I'm not sure how it would cross over:
declare
    my_counter integer := 0;
begin
    for cur in (
        select id from table
    ) loop
        begin
            -- do whatever you're trying to do
            update table set name = 'steve' where id = cur.id;

            my_counter := my_counter + 1;
            if my_counter > 500 then
                my_counter := 0;
                commit;
            end if;
        end;
    end loop;
    commit;
end;
An example using the basic mysql driver.
define( 'CHUNK_SIZE', 500 );
$result = mysql_query( 'select count(*) as num from `table`' );
$row = mysql_fetch_assoc( $result );
$totalRecords = (int)$row['num'];
$offsets = ceil( $totalRecords / CHUNK_SIZE );
for ( $i = 0; $i < $offsets; $i++ )
{
    $result = mysql_query( "select * from `table` limit " . CHUNK_SIZE . " offset " . ( $i * CHUNK_SIZE ) );

    while ( $row = mysql_fetch_assoc( $result ) )
    {
        // your per-row operations here
    }

    unset( $result, $row );
}
This will iterate over your entire row volume, but do so only 500 rows at a time to keep memory usage down.
It sounds like you're hitting the limits of various buffer sizes within the MySQL server... Some things you could do are to specify only the field you want in the SQL statement to reduce this buffer size, or to play around with the various admin settings.
Alternatively, you can use a pagination-like method but have it output everything on one page...
(pseudocode)
function q($part) {
    $off  = $part * SIZE_OF_PARTITIONS;
    $size = SIZE_OF_PARTITIONS;
    return execute_and_return_sql("SELECT `field` FROM `table` LIMIT $off, $size");
}

$ii = 0;
while ($elements = q($ii)) {
    print_fields($elements);
    $ii++;
}
Use mysql_unbuffered_query(), or if using PDO make sure PDO::MYSQL_ATTR_USE_BUFFERED_QUERY is false.
Also see this similar question.
Edit: and as others have said, you may wish to combine this with flushing your output buffer after each batch of processing, depending on your circumstances.
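For the PDO route, a minimal sketch might look like this (connection details, field and table names are placeholders):
// With buffered queries disabled, rows can be processed as they arrive
// instead of waiting for the full million-row result set.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
$stmt = $pdo->query('SELECT `field` FROM `table`');
while ($row = $stmt->fetch(PDO::FETCH_OBJ)) {
    echo $row->field . "\n";
}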

generating unique combinations without running out of memory in php

I am writing an algorithm to generate combinations of items from a database. They need to be unique combinations, i.e. 145, 156 == 156, 145. The problem I am running into is how to keep track of previous combinations so that I do not end up with both 145, 156 and 156, 145.
Currently I am adding them to an array with an index of id1_id2... (sorted so IDs are always lowest to highest) and setting the value to 1 when a combo is generated, so that I can check whether $combos[$index] exists. If it does not exist, create it. (There are other criteria to weed out EVERY permutation, but they are irrelevant here.) Once these combinations are generated, they are stored in a table in MySQL.
The problem I am running into is that with the test items I'm using (about 85) I cannot generate combinations with more than 3 items (id1_id2_id3) without running out of memory, as the number of combinations is MASSIVE and the $combos array takes up more than the 64M I am allotted in PHP memory.
Is there a way I can do this (a) without keeping track of previous combos, or (b) by skipping the $combos array and only adding unique rows to MySQL, letting MySQL handle the duplicate checking?
Here is some pseudo code for reference:
$items = array(/* 85 items */);

foreach ($items as $item1){
    generate(array($item1));
    foreach ($items as $item2){
        generate(array($item1, $item2));
    }
}

function generate($items_array){
    global $combos;   // $combos is shared across calls
    $temp_array = array();
    foreach ($items_array as $item){
        $temp_array[] = $item['id'];
    }
    sort($temp_array);
    $index = implode("_", $temp_array);
    if (!isset($combos[$index])){
        $combos[$index] = 1;
        /* some code to generate query to store to db */
    }
}
The query ends up looking like this (the database is truncated at the beginning of the script):
INSERT INTO `combos` (combo_id, more_info) VALUES ('id1_id2', 'Item Name');
In the process of writing this question, I thought of a possible solution: making sure id3 > id2 > id1. Would this be a viable way to remove the need for $combos?
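Here is a minimal sketch of that idea for three-item combos, assuming $items is sorted by id as in the pseudocode above (the actual insert is left as a comment):
$n = count($items);
for ($i = 0; $i < $n; $i++) {
    for ($j = $i + 1; $j < $n; $j++) {
        for ($k = $j + 1; $k < $n; $k++) {
            // Indices are strictly increasing, so each combination is
            // produced exactly once and no $combos lookup is needed.
            $combo_id = $items[$i]['id'] . '_' . $items[$j]['id']
                      . '_' . $items[$k]['id'];
            // INSERT INTO combos (combo_id, more_info) VALUES ('$combo_id', ...)
        }
    }
}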
The reason I asked earlier about your data structure is that you could do something like this:
$sql = "SELECT id FROM test_a";
$result = mysql_query($sql);
while ($row = mysql_fetch_array($result)) {
$item1 = $row['id'];
$sql2 = "SELECT id FROM test_a";
$result2 = mysql_query($sql2);
while ($row2 = mysql_fetch_array($result2)) {
$item2 = $row2['id'];
$combo1 = $item1 . "_" . $item2;
$combo2 = $item2 . "_" . $item1;
$sql3 = "SELECT * FROM combos WHERE combo_id = '$combo1' OR combo_id = '$combo2'";
$result3 = mysql_query($sql3);
if (mysql_num_rows($result3) == 0) {
$sql4 = "INSERT INTO combos (combo_id, more_info) VALUES ('$combo1','Item Name')";
$result4 = mysql_query($sql4);
}
}
}
When table test_a has the values 1,2,3, and 4 this script inserts:
1_1
1_2
1_3
1_4
2_2
2_3
2_4
3_3
3_4
4_4
This shouldn't have any memory problems, although if you have a huge database you may run into an issue with PHP's time limit.
Here is the same concept as my other answer, but in an all-SQL format.
INSERT INTO combos (combo_id, more_info)
SELECT CONCAT_WS("_",t1.id,t2.id), "item_name"
FROM test_a t1, test_a t2
WHERE NOT EXISTS (SELECT * FROM combos WHERE combo_id = CONCAT_WS("_",t1.id,t2.id))
AND NOT EXISTS (SELECT * FROM combos WHERE combo_id = CONCAT_WS("_",t2.id,t1.id))
Assuming you can get item_name from the db somewhere, this will probably be your fastest and least memory intensive solution. I am running a test on around 1000 ids at the moment. I'll update this when it finishes.
Yes. You can store and use the lexicographical index of the combination to reconstruct/iterate them, or Gray codes if you need to iterate all of them.
Take a look at: "Algorithm 515: Generation of a Vector from the Lexicographical Index"; Buckles, B. P., and Lybanon, M. ACM Transactions on Mathematical Software, Vol. 3, No. 2, June 1977.
I've translated it into C here, and describe it more here.
If you don't need to enforce referential integrity automatically (which you're not doing if you use string concatenation), use one table for the 85 items, give each of them an index (0-84), and use a second table to represent a given set of items, using a numeric datatype where each bit position in the number represents one item (e.g. 000001101 represents items 0, 2, and 3).
For more than 64 items you may have to split the bits across more than one field, or use a BLOB or a string (gack!).
If you use this as a primary key field, you can enforce non-duplicates.
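A hedged PHP sketch of building such a bitmask, using the GMP extension because 85 bit positions exceed PHP's native integer width (the gmp_* functions are standard; the rest is illustrative):
// Represent the set {0, 2, 3} as the number 0b1101 = 13.
$set = gmp_init(0);
foreach (array(0, 2, 3) as $position) {
    gmp_setbit($set, $position);   // turn on the bit for this item
}
$key = gmp_strval($set);           // "13" -- store this as the primary key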
In T-SQL you can use a recursive CTE. Can't remember where I got it, but it's pretty sweet. Note that MySQL doesn't support the WITH clause, so this won't work in MySQL.
WITH Numbers(N) AS (
SELECT N
FROM ( VALUES(1), (2), (3), (4), (5), (6)) Numbers(N)),
Recur(N,Combination) AS (
SELECT N, CAST(N AS VARCHAR(20))
FROM Numbers
UNION ALL
SELECT n.N,CAST(r.Combination + ',' + CAST(n.N AS VARCHAR(10)) AS VARCHAR(20))
FROM Recur r
INNER JOIN Numbers n ON n.N > r.N)
select Combination
from RECUR
ORDER BY LEN(Combination),Combination;
To increase the memory limit, change
memory_limit = 512M in your php.ini
or
ini_set('memory_limit', '512M') in your php script
or
php_value memory_limit 512M in your .htaccess

saving a column in an array

I'm trying to fetch a number of random entries from a database by using
SELECT QNO FROM TABLE ORDER BY RAND() LIMIT 10
It returns a column from the database.
If I want to save all the entries in an array, which PHP function do I have to use to save the column?
Something along the lines of this?
$result = mysql_query("SELECT QNO FROM TABLE ORDER BY RAND() LIMIT 10");
$rows = array();
while ($row = mysql_fetch_row($result)) {
$rows[] = $row[0];
}
Updated to not use the $i variable as pointed out in the first post and the comment.
Look at some examples for how to run a query and get a result set.
http://www.php.net/mysqli
Once you have the result in a variable, do this:
$myarray = array();
while($row = mysqli_fetch_row($result))
$myarray[] = $row[0];
With PDO:
$qryStmt = $dbc->query('SELECT QNO FROM TABLE ORDER BY RAND() LIMIT 10');
$a = $qryStmt->fetchAll( PDO::FETCH_COLUMN );
BTW: If you just want to get one row at random, this is much faster, especially for large tables:
select * from table limit 12345,1;
where 12345 is just a random number calculated from the count() of rows.
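For the question's case (10 values), a hedged PDO sketch of that approach, reusing the $dbc connection from the PDO answer above (note it can pick the same row twice):
$count = (int) $dbc->query('SELECT COUNT(*) FROM TABLE')->fetchColumn();
$rows  = array();
for ($i = 0; $i < 10; $i++) {
    $pos    = mt_rand(0, $count - 1);   // random offset computed in PHP
    $rows[] = $dbc->query("SELECT QNO FROM TABLE LIMIT $pos,1")->fetchColumn();
}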
See here, which is more for Rails, but have a look at the comments too.
But be careful: with LIMIT 12345,2 the second row is not random, but just the next row after the random one. And be careful: if I remember right, on databases other than MySQL (e.g. SQL Server) rand() can be optimized so that it yields the same random number for all rows, which makes the result not random. This is important when your code should be database agnostic.
One last thing: do not mix up "random" with "hard to predict"; they are not the same. The ORDER BY example "select top 10 ... order by rand()" on SQL Server returns two different result sets when run twice, BUT if you look at the 10 records, they lie close to each other in the DB, which means they are not truly random.
