MySQL Prevent sum twice - php

I have the following query in my PHP script:
SELECT SUM(score) + 1200 AS rating
FROM users_ratings
WHERE user_fk = ?
The problem is that if the user has no rows in the table, the rating is returned as NULL. So I added an if case:
SELECT IF(SUM(score) IS NULL, 0, SUM(score)) + 1200 AS rating
FROM users_ratings
WHERE user_fk = ?
But now I'm wondering how to do the query, without repeating the SUM(score). I'm guessing if I have lot of rows, the sum would repeat twice, which would affect the performance of the application.

Here is a simpler method:
SELECT ( COALESCE(SUM(score), 0) + 1200 ) AS rating
FROM users_ratings
WHERE user_fk = ?;
COALESCE() is an ANSI standard function that returns the first non-NULL value from a list of expressions.
However, the additional overhead of calculating SUM(score) twice -- even if it happens -- should be very minimal compared to the rest of the query (finding the data, reading it in, summarizing it to one row).

Related

How to Limit MySQL query based on total records count percentage [duplicate]

Let's say I have a list of values, like this:
id value
----------
A 53
B 23
C 12
D 72
E 21
F 16
..
I need the top 10 percent of this list - I tried:
SELECT id, value
FROM list
ORDER BY value DESC
LIMIT COUNT(*) / 10
But this doesn't work. The problem is that I don't know the amount of records before I do the query. Any idea's?
Best answer I found:
SELECT*
FROM (
SELECT list.*, #counter := #counter +1 AS counter
FROM (select #counter:=0) AS initvar, list
ORDER BY value DESC
) AS X
where counter <= (10/100 * #counter);
ORDER BY value DESC
Change the 10 to get a different percentage.
In case you are doing this for an out of order, or random situation - I've started using the following style:
SELECT id, value FROM list HAVING RAND() > 0.9
If you need it to be random but controllable you can use a seed (example with PHP):
SELECT id, value FROM list HAVING RAND($seed) > 0.9
Lastly - if this is a sort of thing that you need full control over you can actually add a column that holds a random value whenever a row is inserted, and then query using that
SELECT id, value FROM list HAVING `rand_column` BETWEEN 0.8 AND 0.9
Since this does not require sorting, or ORDER BY - it is O(n) rather than O(n lg n)
You can also try with that:
SET #amount =(SELECT COUNT(*) FROM page) /10;
PREPARE STMT FROM 'SELECT * FROM page LIMIT ?';
EXECUTE STMT USING #amount;
This is MySQL bug described in here: http://bugs.mysql.com/bug.php?id=19795
Hope it'll help.
I realize this is VERY old, but it still pops up as the top result when you google SQL limit by percent so I'll try to save you some time. This is pretty simple to do these days. The following would give the OP the results they need:
SELECT TOP 10 PERCENT
id,
value
FROM list
ORDER BY value DESC
To get a quick and dirty random 10 percent of your table, the following would suffice:
SELECT TOP 10 PERCENT
id,
value
FROM list
ORDER BY NEWID()
I have an alternative which hasn't been mentionned in the other answers: if you access from any language where you have full access to the MySQL API (i.e. not the MySQL CLI), you can launch the query, ask how many rows there will be and then break the loop if it is time.
E.g. in Python:
...
maxnum = cursor.execute(query)
for num, row in enumerate(query)
if num > .1 * maxnum: # Here I break the loop if I got 10% of the rows.
break
do_stuff...
This works only with mysql_store_result(), not with mysql_use_result(), as the latter requires that you always accept all needed rows.
OTOH, the traffic for my solution might be too high - all rows have to be transferred.

Update Current Row in MySQL Loop

I have a MySQL table with over 16 million rows and there is no primary key. Whenever I try to add one, my connection crashes. I have tried adding one as an auto increment in PHPMyAdmin and in shell but the connection is always lost after about 10 minutes.
What I would like to do is loop through the table's rows in PHP so I can limit the number of results and with each returned row add an auto-incremented ID number. Since the number of impacted rows would be reduced by reducing the load on the MySQL query, I won't lose my connection.
I want to do something like
SELECT * FROM MYTABLE LIMIT 1000001, 2000000;
Then, in the loop, update the current row
UPDATE (current row) SET ID='$i++'
How do I do this?
Note: the original data was given to me as a txt file. I don't know if there are duplicates but I cannot eliminate any rows. Also, no rows will be added. This table is going to be used only for querying purposes. When I have added indexes, however, there were no problems.
I suspect you are trying to use phpmyadmin to add the index. As handy as it is, it is a PHP script and is limited to the same resources as any PHP script on your server, typically 30-60 seconds run time, and a limited amount of ram.
Suggest you get the mysql query you need to add the index, then use SSH to shell in, and use command line MySQL to add your indexes.
If you don't have duplicate rows then the following way might shed some light:
Suppose you want to update the auto incremented value for first 10000 rows.
UPDATE
MYTABLE
INNER JOIN
(SELECT
*,
#rn := #rn + 1 AS row_number
FROM MYTABLE,(SELECT #rn := 0) var
ORDER BY SOME_OF_YOUR_FIELD
LIMIT 0,10000 ) t
ON t.field1 = MYTABLE.field1 AND t.field2 = MYTABLE.field2 AND .... t.fieldN = MYTABLE.fieldN
SET MYTABLE.ID = t.row_number;
For next 10000 rows just need to change two things:
(SELECT #rn := 10000) var
LIMIT 10000,10000
Repeat..
Note: ORDER BY SOME_OF_YOUR_FIELD is important otherwise you would get results in random order. Better create a function which might take limit,offset as parameter and do this job. Since you need to repeat the process.
Explanation:
The idea is to create a temporary table(t) having N number of rows and assigning a unique row number to each of the row. Later make an inner join between your main table MYTABLE and this temporary table t ON matching all the fields and then update the ID field of the corresponding row(in MYTABLE) with the incremented value(in this case row_number).
Another IDEA:
You may use multithreading in PHP to do this job.
Create N threads.
Assign each thread a non overlapping region (1 to 10000, 10001 to
20000 etc) like the above query.
Caution: The query will get slower in higher offset.

MYSQL from the 20th row to the last row [duplicate]

I would like to construct a query that displays all the results in a table, but is offset by 5 from the start of the table. As far as I can tell, MySQL's LIMIT requires a limit as well as an offset. Is there any way to do this?
From the MySQL Manual on LIMIT:
To retrieve all rows from a certain
offset up to the end of the result
set, you can use some large number for
the second parameter. This statement
retrieves all rows from the 96th row
to the last:
SELECT * FROM tbl LIMIT 95, 18446744073709551615;
As you mentioned it LIMIT is required, so you need to use the biggest limit possible, which is 18446744073709551615 (maximum of unsigned BIGINT)
SELECT * FROM somewhere LIMIT 18446744073709551610 OFFSET 5
As noted in other answers, MySQL suggests using 18446744073709551615 as the number of records in the limit, but consider this: What would you do if you got 18,446,744,073,709,551,615 records back? In fact, what would you do if you got 1,000,000,000 records?
Maybe you do want more than one billion records, but my point is that there is some limit on the number you want, and it is less than 18 quintillion. For the sake of stability, optimization, and possibly usability, I would suggest putting some meaningful limit on the query. This would also reduce confusion for anyone who has never seen that magical looking number, and have the added benefit of communicating at least how many records you are willing to handle at once.
If you really must get all 18 quintillion records from your database, maybe what you really want is to grab them in increments of 100 million and loop 184 billion times.
Another approach would be to select an autoimcremented column and then filter it using HAVING.
SET #a := 0;
select #a:=#a + 1 AS counter, table.* FROM table
HAVING counter > 4
But I would probably stick with the high limit approach.
As others mentioned, from the MySQL manual. In order to achieve that, you can use the maximum value of an unsigned big int, that is this awful number (18446744073709551615). But to make it a little bit less messy you can the tilde "~" bitwise operator.
LIMIT 95, ~0
it works as a bitwise negation. The result of "~0" is 18446744073709551615.
You can use a MySQL statement with LIMIT:
START TRANSACTION;
SET #my_offset = 5;
SET #rows = (SELECT COUNT(*) FROM my_table);
PREPARE statement FROM 'SELECT * FROM my_table LIMIT ? OFFSET ?';
EXECUTE statement USING #rows, #my_offset;
COMMIT;
Tested in MySQL 5.5.44. Thus, we can avoid the insertion of the number 18446744073709551615.
note: the transaction makes sure that the variable #rows is in agreement to the table considered in the execution of statement.
I ran into a very similar issue when practicing LC#1321, in which I have to select all the dates but the first 6 dates are skipped.
I achieved this in MySQL with the help of ROW_NUMBER() window function and subquery. For example, the following query returns all the results with the first five rows skipped:
SELECT
fieldname1,
fieldname2
FROM(
SELECT
*,
ROW_NUMBER() OVER() row_num
FROM
mytable
) tmp
WHERE
row_num > 5;
You may need to add some more logics in the subquery, especially in OVER() to fit your need. In addition, RANK()/DENSE_RANK() window functions may be used instead of ROW_NUMBER() depending on your real offset logic.
Reference:
MySQL 8.0 Reference Manual - ROW_NUMBER()
Just today I was reading about the best way to get huge amounts of data (more than a million rows) from a mysql table. One way is, as suggested, using LIMIT x,y where x is the offset and y the last row you want returned. However, as I found out, it isn't the most efficient way to do so. If you have an autoincrement column, you can as easily use a SELECT statement with a WHERE clause saying from which record you'd like to start.
For example,
SELECT * FROM table_name WHERE id > x;
It seems that mysql gets all results when you use LIMIT and then only shows you the records that fit in the offset: not the best for performance.
Source: Answer to this question MySQL Forums. Just take note, the question is about 6 years old.
I know that this is old but I didnt see a similar response so this is the solution I would use.
First, I would execute a count query on the table to see how many records exist. This query is fast and normally the execution time is negligible. Something like:
SELECT COUNT(*) FROM table_name;
Then I would build my query using the result I got from count as my limit (since that is the maximum number of rows the table could possibly return). Something like:
SELECT * FROM table_name LIMIT count_result OFFSET desired_offset;
Or possibly something like:
SELECT * FROM table_name LIMIT desired_offset, count_result;
Of course, if necessary, you could subtract desired_offset from count_result to get an actual, accurate value to supply as the limit. Passing the "18446744073709551610" value just doesnt make sense if I can actually determine an appropriate limit to provide.
WHERE .... AND id > <YOUROFFSET>
id can be any autoincremented or unique numerical column you have...

Repeated Insert copies on ID

We have records with a count field on an unique id.
The columns are:
mainId = unique
mainIdCount = 1320 (this 'views' field gets a + 1 when the page is visited)
How can you insert all these mainIdCount's as seperate records in another table IN ANOTHER DBASE in one query?
Yes, I do mean 1320 times an insert with the same mainId! :-)
We actually have records that go over 10,000 times an id. It just has to be like this.
This is a weird one, but we do need the copies of all these (just) counts like this.
The most straightforward way to this is with a JOIN operation between your table, and another row source that provides a set of integers. We'd match each row from our original table to as many rows from the set of integer as needed to satisfy the desired result.
As a brief example of the pattern:
INSERT INTO newtable (mainId,n)
SELECT t.mainId
, r.n
FROM mytable t
JOIN ( SELECT 1 AS n
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
) r
WHERE r.n <= t.mainIdCount
If mytable contains row mainId=5 mainIdCount=4, we'd get back rows (5,1),(5,2),(5,3),(5,4)
Obviously, the rowsource r needs to be of sufficient size. The inline view I've demonstrated here would return a maximum of five rows. For larger sets, it would be beneficial to use a table rather than an inline view.
This leads to the followup question, "How do I generate a set of integers in MySQL",
e.g. Generating a range of numbers in MySQL
And getting that done is a bit tedious. We're looking forward to an eventual feature in MySQL that will make it much easier to return a bounded set of integer values; until then, having a pre-populated table is the most efficient approach.

Returning random rows from mysql database without using rand()

I would like to be able to pull back 15 or so records from a database. I've seen that using WHERE id = rand() can cause performance issues as my database gets larger. All solutions I've seen are geared towards selecting a single random record. I would like to get multiples.
Does anyone know of an efficient way to do this for large databases?
edit:
Further Edit and Testing:
I made a fairly simple table, on a new database using MyISAM. I gave this 3 fields: autokey (unsigned auto number key) bigdata (a large blob) and somemore (a medium int). I then applied random data to the table and ran a series of queries using Navicat. Here are the results:
Query 1: select * from test order by rand() limit 15
Query 2: select *
from
test
join
(select round(rand()*(select max(autokey) from test)) as val from test limit 15) as rnd
on
rnd.val=test.autokey;`
(I tried both select and select distinct and it made no discernible difference)
and:
Query 3 (I only ran this on the second test):
SELECT *
FROM (
SELECT #cnt := COUNT(*) + 1,
#lim := 10
FROM test
) vars
STRAIGHT_JOIN
(
SELECT r.*,
#lim := #lim - 1
FROM test r
WHERE (#cnt := #cnt - 1)
AND RAND(20090301) < #lim / #cnt
) i
ROWS: QUERY 1: QUERY 2: QUERY 3:
2,060,922 2.977s 0.002s N/A
3,043,406 5.334s 0.001s 1.260
I would like to do more rows so I can see how query 3 scales, but at the moment, it seems as though the clear winner is query 2.
Before I wrap up this testing and declare an answer, and while I have all this data and the test environment set up, can anyone recommend any further testing?
Try:
select * from table order by rand() limit 15
Another (and possibly more efficient way) would be to join against a set of random values. This should work, if there's some contiguous integer key in the table. Here is how I would do it in postgres (My MySQL is a bit rusty)
select * from table join
(select (random()*maxid)::integer as val from generate_series(1,15)) as rnd
on rand.val=table.id;
where maxid is the highest id in table. If id has an index, then this would mean only 15 index lookup, so its very fast.
UPDATE:
Looks like there no such thing as generate_series in MySQL. My fault. We don't need it actually:
select *
from
table
join
-- this just returns 15 random numbers.
-- I need `table` here only to produce rows for rand()
(select round(rand()*(select max(id) from table)) as val from table limit 15) as rnd
on
rnd.val=table.id;
P.S. If I don't want duplicates returned, I can use (select distinct [...]) in the random generator expression.
Update: Check out the accepted answer in this question. It's pure mySQL and even deals with even distribution.
The problem with id = rand() or anything comparable in PHP is that you can't be sure whether that particular ID still exists. Therefore, you need to work with LIMIT, and that can become slow for large amounts of data.
As an alternative to that, you could try using a loop in PHP.
What the loop does is
Create a random integer number using rand(), with a scope between 0 and the number of records in the database
Query the database whether a record with that ID exists
If it exists, add the number to an array
If it doesn't, go back to step 1
End the loop when the array of random numbers contains the desired number of elements
this method could cause a lot of queries in a fragmented table, but they should be pretty fast to execute. It may be faster than LIMIT rand() in certain situations.
The LIMIT method, as outlined by #Luther, is certainly the simplest code-wise.
You could do a query with all the results or however many limited, then use mysqli_fetch_all followed by:
shuffle($a);
$a = array_slice($a, 0, 15);
For a large dataset doing
select * from table order by rand() limit 15
can be quite time and memory consuming.
If your data records happen to be numbered you can put and index on the numbering colum and do a
select * from table where no >= rand() limit 15
Or even better do the random number generation in your application and do
select * from table where no >= $rand and no <= $rand+15
If your data doesn't change too often, it might be worth to add such a numbering a column to make the selection efficient.
Assuming MySQL supports nested queries and that operations on the primary key are fast, I'd try something like
select * from table where id in (select id from table order by rand() limit 15)

Categories