I am working on a project where a user can add comments to and also "hit" (like) any post.
Now I have to display the total number of comments, the total number of hits, and also show whether the user has already hit that post or not.
So basically I need three SQL queries for this action:
one counting comments,
one counting hits, and
one checking whether the user has hit the post or not.
I wanted to know if it's possible to reduce these three SQL queries to one and reduce the database load?
Any help is appreciated.
$checkifrated = mysql_query("select id from fk_views where (onid='$postid' and hit='hit' and email='$email')"); // counting hits
$checkiffollowing = mysql_query("select id from fk_views where (onid='$postid' and hit='hit' and email='$email')");
$hitcheck = mysql_num_rows($checkifrated); // checking if already hit or not
$checkifrated = mysql_query("select id from fk_views where (onid='$postid' and comment != '' and email='$email')"); // counting comments
This query returns the number of hits and number of nonempty comments.
select ifnull(sum(hit='hit'),0) as hits, ifnull(sum(comment !=''),0) as comments
from fk_views where onid='$postid' and email='$email'
Based on the queries you provided, I don't think you need a separate query to check whether the user has hit the post; just check in your code whether the number of hits is > 0.
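The conditional-sum trick above can be sanity-checked outside MySQL. Here is a minimal sketch using Python's stdlib sqlite3 (SQLite, like MySQL, evaluates a comparison to 0/1, so SUM over a comparison counts matching rows; the table and column names follow the question, and the sample rows are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fk_views (onid TEXT, email TEXT, hit TEXT, comment TEXT)")
rows = [
    ("42", "a@b.c", "hit", ""),        # a hit with no comment
    ("42", "a@b.c", "", "nice post"),  # a comment with no hit
    ("42", "a@b.c", "", ""),           # neither
]
conn.executemany("INSERT INTO fk_views VALUES (?,?,?,?)", rows)

# SUM(hit='hit') counts rows where the comparison is true, in a single pass.
hits, comments = conn.execute(
    "SELECT IFNULL(SUM(hit='hit'),0), IFNULL(SUM(comment!=''),0) "
    "FROM fk_views WHERE onid=? AND email=?", ("42", "a@b.c")
).fetchone()
print(hits, comments)  # 1 1
```

With hits available, the "has this user already hit the post" check is just `hits > 0` in application code, as the answer suggests.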
Yes, it may be possible to combine the three queries into a single query. That may (or may not) "reduce the database load". The key here is going to be an efficient execution plan, which is going to primarily depend on the availability of suitable indexes.
Combining three inefficient queries into one isn't going to magically make the query more efficient. The key is getting each of the queries to be as efficient as they can be.
If each of the queries is processing rows from the same table, then it may be possible to have a single SELECT statement process the entire set to obtain the specified result. But if each of the queries is referencing a different table, then it's likely most efficient to combine them with a UNION ALL set operator.
Absent the schema definition, the queries that you are currently using, and the EXPLAIN output of each query, it's not practical to attempt to provide you with usable advice.
UPDATE
Based on the update to the question, providing sample queries... we note that two of the queries appear to be identical.
It would be much more efficient to have a query return a COUNT() aggregate, than pulling back all of the individual rows to the client and counting them on the client, e.g.
SELECT COUNT(1) AS count_hits
FROM fk_views v
WHERE v.onid = '42'
AND v.hit = 'hit'
AND v.email = 'someone@email.address'
To combine processing of the three queries, we can use conditional expressions in the SELECT list. For example, we could keep the equality predicates on the onid and email columns in the WHERE clause, and do the check of the hit column with an expression...
For example:
SELECT SUM(IF(v.hit='hit',1,0)) AS count_hits
, SUM(1) AS count_all
FROM fk_views v
WHERE v.onid = '42'
AND v.email = 'someone@email.address'
The "trick" to getting three separate queries combined would be to use a common set of equality predicates (the parts of the WHERE clause that match in all three queries).
SELECT SUM(IF(v.hit='hit' ,1,0)) AS count_hits
, SUM(IF(v.comment!='',1,0)) AS count_comments
, SUM(1) AS count_all
FROM fk_views v
WHERE v.onid = '42'
AND v.email = 'someone@email.address'
If we are going to insist on using the deprecated mysql_* interface (rather than PDO or mysqli), it's important that we use the mysql_real_escape_string function to avoid SQL injection vulnerabilities:
$sql = "SELECT SUM(IF(v.hit='hit' ,1,0)) AS count_hits
             , SUM(IF(v.comment!='',1,0)) AS count_comments
             , SUM(1) AS count_all
          FROM fk_views v
         WHERE v.onid = '" . mysql_real_escape_string($postid) . "'
           AND v.email = '" . mysql_real_escape_string($email) . "'";

# for debugging
#echo $sql;

$result = mysql_query($sql);
if (!$result) die(mysql_error());
while ($row = mysql_fetch_assoc($result)) {
    echo $row['count_hits'];
    echo $row['count_comments'];
}
For performance, we'd likely want an index with leading columns of onid and email, e.g.
... ON fk_views (onid,email)
The output from EXPLAIN will show the execution plan.
SQL Queries /P1/
SELECT EXISTS(SELECT /p2/ FROM table WHERE id = 1)
SELECT /p2/ FROM table WHERE id = 1 LIMIT 1
SQL SELECT /P2/
COUNT(id)
id
PHP PDO Function /P3/
fetchColumn()
rowCount()
From the following 3 parts, what is the best method to check whether a row exists or not, with and without the ability to retrieve data?
Retrievable:
/Query/ SELECT id FROM table WHERE id = 1 LIMIT 1
/Function/ rowCount()
Irretrievable:
/Query/ SELECT EXISTS(SELECT COUNT(id) FROM table WHERE id = 1)
/Function/ fetchColumn()
In your opinion, what is the best way to do that?
By best I guess you mean consuming the least resources on both MySQL server and client.
That is this:
SELECT COUNT(*) count FROM table WHERE id=1
You get a one-row, one-column result set. If that column is zero, the row was not found. If the column is one, a row was found. If the column is greater than one, multiple rows were found.
This is a good solution for a few reasons.
COUNT(*) is decently efficient, especially if id is indexed.
It has a simple code path in your client software, because it always returns just one row. You don't have to sweat edge cases like no rows or multiple rows.
The SQL is as clear as it can be about what you're trying to do. That's helpful to the next person to work on your code.
Adding LIMIT 1 to this query will do nothing. It already returns a one-row result set, inherently. You can add it, but then you'll make the next person looking at your code wonder what you were trying to do, and whether you made some kind of mistake.
COUNT(*) counts all rows that match the WHERE clause. COUNT(id) is slightly slower because it counts all rows except those whose id value is null; it has to make that check. For that reason, people usually use COUNT(*) unless there's some chance they want to ignore null values. If you put COUNT(id) in your code, the next person to work on it will have to spend some time figuring out whether you meant anything special by counting id rather than *.
You can use either; they give the same result.
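To make the COUNT(*) vs COUNT(id) distinction (and the EXISTS form) concrete, here is a small sketch using Python's stdlib sqlite3; the table name t and the sample rows are made up for illustration, and the same semantics apply in MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (1,), (None,)])

count_star = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]   # counts every row
count_id   = conn.execute("SELECT COUNT(id) FROM t").fetchone()[0]  # skips NULL ids
exists     = conn.execute("SELECT EXISTS(SELECT 1 FROM t WHERE id = 1)").fetchone()[0]
print(count_star, count_id, exists)  # 3 2 1
```

EXISTS always yields 0 or 1 regardless of how many rows match, which is why it suits the "irretrievable" case in the question.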
Okay, so I have a table which currently has 40,000 rows and I need to SELECT them all. I use an index on the id and url columns, and if I SELECT a value by id or url it's instant, but a SELECT * is very slow. What I'm trying to do is search my database and output the matches, and I did this with:
$query = mysqli_query($link, "SELECT * FROM table");
while ($arr = mysqli_fetch_array($query)) {
    echo $arr['whatever_i_need'] . "<br>";
}
In the future I will have hundreds of millions of rows in the database so I would like to return the search results fast in 1 sec or something. If you can give me solutions I would really appreciate! Thanks!
EDIT:
I don't want to display all of the data but I want to loop through it quickly to find all the matches
If you want speed then you definitely don't want the query to return every row from the table, and then "loop through" every row returned by the query to identify the ones you are interested in returning. That approach might give acceptable performance with small tables, but it definitely doesn't scale.
For performance, you want the database to locate just the rows you want to return, filter out the ones you don't want, and return just the subset.
And that comes down to writing an appropriate SQL query; executing an appropriate SELECT statement.
SELECT t.col1
, t.col2
, t.col3
FROM mytable t
WHERE t.col3 LIKE '%foo%'
AND t.col2 >= '2016-03-15'
AND t.col2 < '2016-06-15'
ORDER BY t.col2 DESC, t.col1 DESC
LIMIT 200
Performance is about making sure appropriate indexes are available and that the query execution is making effective use of the available indexes.
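One way to check that a query actually uses an index is to inspect the execution plan. Here is a sketch using SQLite's EXPLAIN QUERY PLAN via Python's stdlib sqlite3 (MySQL's EXPLAIN output looks different but serves the same purpose; mytable, its columns, and the index name mytable_ix are hypothetical, following the example query above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.execute("CREATE INDEX mytable_ix ON mytable (col2)")

# The plan should show a SEARCH using mytable_ix for the range predicate on col2,
# rather than a full-table SCAN.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT col1 FROM mytable "
    "WHERE col2 >= '2016-03-15' AND col2 < '2016-06-15'"
).fetchall()
for row in plan:
    print(row[-1])
```

If the plan reports a full scan instead, that is the signal to add (or fix) the index before worrying about anything else.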
I have to select 4 rows randomly from a table.
Is it better to generate 4 random ids and perform 4 requests, 'select column from database where id = ...'?
Or to select all the rows in one request and choose afterwards?
If you are capable of generating random existing ids, I think the best approach is to use a clause like where id in (id1, id2, id3, id4). This will result in getting 4 records in one query, so no unnecessary queries or records are fetched.
As said before, where id in (id1, id2, id3, id4) is the fastest way from the MySQL perspective. However, you will need some logic in the application to generate those IDs: all 4 IDs must exist, be randomly distributed, and you want to avoid duplicates. In the worst case you will be retrieving a list of all existing IDs with a huge query, extracting 4 random values, and querying again.
With all that logic to be done, it can be wise to move selection into MySQL:
SELECT * FROM foobar
ORDER BY RAND()
LIMIT 4;
You must understand that this is slow in MySQL, but you gain speed in the application logic and can be sure to get random values spread equally over your table.
EDIT:
The comment asks whether PHP is faster at this task than MySQL. The answer is no.
It is not done just by "using rand". You need an array containing all those IDs in PHP. That means a huge query, lots of TCP traffic, and a huge array to be built by the Zend engine. Then, with the IDs, you must fire a second query to get the rows for those IDs.
Although the RAND() function may be slow, so far I have not had significant problems with speed. My strategy is to join the table back to a query of itself that returns a list of random IDs with a limit.
SELECT *
FROM table AS t1
JOIN (
SELECT rowID
FROM table
ORDER BY RAND()
LIMIT 4
) AS t2
WHERE t1.rowID = t2.rowID
There is also a more robust solution that exists; try checking out this question (asked in 2010).
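A runnable sketch of the ORDER BY RAND() approach, using Python's stdlib sqlite3 (SQLite spells the function RANDOM(); the table foobar and column rowID follow the examples above, and the 100 sample rows are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foobar (rowID INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO foobar (val) VALUES (?)",
                 [(str(i),) for i in range(100)])

# SQLite's equivalent of MySQL's ORDER BY RAND() LIMIT 4:
# four distinct rows, no duplicate-handling logic needed in the application.
picks = conn.execute(
    "SELECT rowID FROM foobar ORDER BY RANDOM() LIMIT 4").fetchall()
print(len(picks), len(set(picks)))  # 4 4
```

Because the database picks the rows, there is no risk of generating an id that does not exist, which is the main complication of the WHERE id IN (...) approach.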
I have more than 400,000 ids in a NOT IN statement. Will it execute or not?
$query = "
SELECT
*
FROM
table_name
WHERE
my_field_id NOT IN(
34535345,3453451234,234242345,3465465,12234234,23435465,122343,345435,3453454,
34535345,3453451234,234242345,3465465,12234234,23435465,122343,345435,3453454,
34535345,3453451234,234242345,3465465,12234234,23435465,122343,345435,3453454,
34535345,3453451234,234242345,3465465,12234234,23435465,122343,345435,3453454,
34535345,3453451234,234242345,3465465,12234234,23435465,122343,345435,3453454,
34535345,3453451234,234242345,3465465,12234234,23435465,122343,345435,3453454
)
";
Yes, it will (you could actually have tried that yourself without asking here). There is no unreasonably small limit on the SQL query string.
But you should keep in mind that the more ids you add, the slower the query will be.
PS: the only mysql setting you may be interested in is max_allowed_packet. From what I remember it is the only parameter that could bring some issues on extra-large queries
As far as I know, the IN clause in MySQL does not have a limit on the number of items; you may write as many ids as you want. I agree with zerkms about performance and the max_allowed_packet variable.
As a workaround, try to populate another table (maybe a temporary one) with your values under an indexed id, and then combine the two tables using a JOIN clause.
AFAIK the limit is 1024 characters for NOT IN. If you use a sub-select there is no limit. You could populate a memory table with the ids, get them with a sub-select in the query, and drop the memory table afterwards. This may even be faster.
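The temporary-table workaround suggested above amounts to an anti-join. Here is a sketch with Python's stdlib sqlite3 (table_name and my_field_id follow the question; excluded is a made-up name for the id table, and the tiny data sets are for illustration only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (my_field_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO table_name VALUES (?)", [(i,) for i in range(10)])

# Load the ids to exclude into an indexed temp table instead of a huge NOT IN list.
conn.execute("CREATE TEMP TABLE excluded (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO excluded VALUES (?)", [(i,) for i in range(0, 10, 2)])

# Anti-join: keep only the rows with no match in excluded.
kept = [r[0] for r in conn.execute(
    "SELECT t.my_field_id FROM table_name t "
    "LEFT JOIN excluded e ON e.id = t.my_field_id "
    "WHERE e.id IS NULL "
    "ORDER BY t.my_field_id")]
print(kept)  # [1, 3, 5, 7, 9]
```

The join can use the index on excluded.id, whereas a 400,000-item literal list has to be parsed and shipped inside every query.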
So I have the following query:
SELECT DISTINCT d.iID1 as 'id',
SUM(d.sum + d.count*r.lp)/sum(d.count) AS avgrat
FROM abcd i, abce r, abcf d
WHERE r.aID = 1 AND
d.iID1 <> r.rID AND d.iID2 = r.rID GROUP BY d.iID1
ORDER BY avgrat LIMIT 50;
The problem is that with millions of entries in the table, SUM() and GROUP BY freeze up the query. Is there a way to do exactly this that executes near-instantly using MySQL and/or PHP hacks (perhaps doing the summing in PHP... but how would I go about that)?
To answer the direct question: no, there is no way to do anything instantaneously.
If you have control over the table updates, or the application which adds the relevant records, then you could add logic which updates another table with the sum, count, and id with each update. Then a revised query targets the "sum table" and trivially calculates the averages.
One solution is to create a rollup table that holds your aggregate values,
using triggers on your source tables to keep it up to date.
You will need to decide whether the overhead of the triggers is less than that of the query.
Some important factors are:
The frequency of the source table updates
The run frequency of the aggregate query.
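A minimal sketch of the trigger-maintained rollup table, using Python's stdlib sqlite3 (ratings and ratings_rollup are made-up names; MySQL trigger syntax differs slightly, but the idea of maintaining sum and count on insert is the same):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ratings (item_id INTEGER, score REAL);
CREATE TABLE ratings_rollup (item_id INTEGER PRIMARY KEY, total REAL, cnt INTEGER);

-- Keep sum and count current on every insert into the source table.
CREATE TRIGGER ratings_ai AFTER INSERT ON ratings
BEGIN
  INSERT OR IGNORE INTO ratings_rollup VALUES (NEW.item_id, 0, 0);
  UPDATE ratings_rollup
     SET total = total + NEW.score, cnt = cnt + 1
   WHERE item_id = NEW.item_id;
END;
""")
conn.executemany("INSERT INTO ratings VALUES (?,?)", [(1, 4.0), (1, 2.0), (2, 5.0)])

# The average now comes from two tiny columns instead of a full aggregate scan.
avg = conn.execute(
    "SELECT total / cnt FROM ratings_rollup WHERE item_id = 1").fetchone()[0]
print(avg)  # 3.0
```

The trade-off is exactly the one described above: each insert pays a small constant cost so that the read-side average becomes a single-row lookup.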