mysql_affected_rows is to get number of affected rows in previous MySQL operation, but I want to get affected rows in previous MySQL operation.
For example:
update mytable set status=2 where column3="a_variable";
Before this operation, status of some rows is already 2, and I want to get affected rows in previous MySQL operation, you can not get it by issuing a query of
select * from mytable where status=2
So how to do this work?
It can be efficiently and simply achieved with the following:
select * from mytable where column3="a_variable" and status != 2;
update mytable set status=2 where column3="a_variable";
The result of the first query are the rows that are going to change, and the second query actually changes them.
If this is a high performance system, you may need to take special care with transactions to prevent a new row slipping in between those 2 queries.
The implementation and function you are asking about is from PHP, please tag your question with PHP.
In response to your question tho, in order to do what you are suggesting you would need to keep a copy of the SQL table prior to the change and compare it with the table after the change, the resulting difference would be your answer. You can do this in PHP by simply loading the table contents into an array and comparing the arrays before and after, any new/changed data is your answer (Please note: it is highly not recommended you do this as it can cause an increased load on your server depending on the amount of data stored in the tables.)
Related
What is the difference between(performance wise)
$var = mysql_query("select * from tbl where id='something'");
$count = mysql_num_rows($var);
if($count > 1){
do something
}
and
$var = mysql_query("select count(*) from tbl where id='somthing'");
P.S: I know mysql_* are deprecated.
The first version returns the entire result set. This can be a large data volume, if your table is wide. If there is an index on id, it still needs to read the original data to fill in the values.
The second version returns only the count. If there is an index on id, then it will use the index and not need to read in any data from the data pages.
If you only want the count and not the data, the second version is clearer and should perform better.
select * is asking mysql to fetch all data from that table (given the conditions) and give it to you, this is not a very optimizable operation and will result in a lot of data being organised and sent over the socket to PHP.
Since you then do nothing with this data, you have asked mysql to do a whole lot of data processing for nothing.
Instead, just asking mysql to count() the number of rows that fit the conditions will not result in it trying to send you all that data, and will result in a faster query, especially if the id field is indexed.
Overall though, if your php application is still simple, while still being good practice, this might be regarded as a micro-optimization.
I would use the second for 2 reasons :
As you stated, mysql_* are deprecated
if your table is huge, you're putting quite a big amount of data in $var only to count it.
SELECT * FROM tbl_university_master;
2785 row(s) returned
Execution Time : 0.071 sec
Transfer Time : 7.032 sec
Total Time : 8.004 sec
SELECT COUNT(*) FROM tbl_university;
1 row(s) returned
Execution Time : 0.038 sec
Transfer Time : 0 sec
Total Time : 0.039 sec
The first collects all data and counts the number of rows in the resultset, which is performance-intensive. The latter just does a quick count which is way faster.
However, if you need both the data and the count, it's more sensible to execute the first query and use mysql_num_rows (or something similar in PDO) than to execute yet another query to do the counting.
And indeed, mysql_* is to be deprecated. But the same applies when using MySQLi or PDO.
I think using
$var = mysql_query("select count(*) from tbl where id='somthing'");
Is more efficient because you aren't allocating memory based on the number of rows that gets returned from MySQL.
select * from tbl where id='something' selects all the data from table with ID condition.
The COUNT() function returns the number of rows that matches a specified criteria.
For more reading and practice and demonstration please visit =>>> w3schools
I have a simple query using server-side pagination. The issue is the WHERE Clause makes a call to an expensive function and the functions argument is the user input, eg. what the user is searching for.
SELECT
*
FROM
( SELECT /*+ FIRST_ROWS(numberOfRows) */
query.*,
ROWNUM rn FROM
(SELECT
myColumns
FROM
myTable
WHERE expensiveFunction(:userInput)=1
ORDER BY id ASC
) query
)
WHERE rn >= :startIndex
AND ROWNUM <= :numberOfRows
This works and is quick assuming numberOfRows is small. However I would also like to have the total row count of the query. Depending on the user input and database size the query can take up to minutes. My current approach is to cache this value but that still means the user needs to wait minutes to see first result.
The results should be displayed in the Jquery datatables plugin which greatly helps with things like serer-side paging. It however requires the server to return a value for the total records to correctly display paging controls.
What would be the best approach? (Note: PHP)
I thought if returning first page immediately with a fake (better would be estimated) row count. After the page is loaded do an ajax call to a method that determines total row count of the query (what happens if the user pages during that time?) and then update the faked/estimated total row count.
However I have no clue how to do an estimate. I tried count(*) * 1000 with SAMPLE (0.1) but for whatever reason that actually takes longer than the full count query. Also just returning a fake/random value seems a bit hacky too. It would need to be bigger than 1 page size so that the "Next" button is enabled.
Other ideas?
One way to do it is as I said in the comments, to use a 'countless' approach. Modify the client side script in such a way that the Next button is always enabled and fetch the rows until there are none, then disable the Next button. You can always add a notification message to say that there are no more rows so it will be more user friendly.
Considering that you are expecting a significant amount of records, I doubt that the user will paginate through all the results.
Another way is to schedule a cron job that will do the counting of the records in the background and store that result in a table called totals. The running intervals of the job should be set up based on the frequency of the inserts / deletetions.
Then in the frontend, just use the count previously stored in totals. It should make a decent aproximation of the amount.
Depends on your DB engine.
In mysql, solution looks like this :
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
-> WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();
Basically, you add another attribute on your select (SQL_CALC_FOUND_ROWS) which tells mysql to count the rows as if limit clause was not present, while executing the query, while FOUND_ROWS actually retrieves that number.
For oracle, see this article :
How can I perform this query in oracle
Other DBMS might have something similar, but I don't know.
I was wondering which of these would be faster (performance-wise) to query (on MySQL 5.x CentOS 5.x if this matters):
SELECT * FROM table_name WHERE id=1;
SELECT * FROM table_name WHERE id=2;
.
.
.
SELECT * FROM table_name WHERE id=50;
or...
SELECT * FROM table_name WHERE id IN (1,2,...,50);
I have around 50 ids to query for. I know usually DB connections are expensive, but I've seen the IN clause isn't so fast either [sometimes].
I'm pretty sure the second option gives you the best performance; one query, one result. You have to start looking for > 100 items before it may become an issue.
See also the accepted answer from here: MySQL "IN" operator performance on (large?) number of values
IMHO you should try it and measure response time: IN should give you better performances...
Anyway if your ids are sequential you could try
SELECT * FROM table_name WHERE id BETWEEN 1 AND 50
Here is another post where the discuss the performance of using OR vs IN. IN vs OR in the SQL WHERE Clause
You suggested using multiple queries, but using OR would also work.
2nd will be faster because resources are consumed when query gets interpreted and during php communication with mysql for sending query and waiting for result , if your data is sequential you can also do just
SELECT * FROM table_name WHERE id <= 50;
I was researching this after experimenting with 3000+ values in an IN clause. It turned out to be multitudes faster than individual SELECTs since the column referenced in the IN was not keyed. My guess is that in my case it only needed to build a temporary index for that column once instead of 3000 separate times.
For example, if I have to count the comments belonging to an article, it's obvious I don't need to cache the comments total.
But what if I want to paginate a gallery (WHERE status = 1) containing 1 million photos. Should I save that in a table called counts or SELECT count(id) as total every time is fine?
Are there other solutions?
Please advise. Thanks.
For MySQL, you don't need to store the counts, you can use SQL_CALC_FOUND_ROWS to avoid two queries.
E.g.,
SELECT SQL_CALC_FOUND_ROWS *
FROM Gallery
WHERE status = 1
LIMIT 10;
SELECT FOUND_ROWS();
From the manual:
In some cases, it is desirable to know how many rows the statement
would have returned without the LIMIT, but without running the
statement again. To obtain this row count, include a
SQL_CALC_FOUND_ROWS option in the SELECT statement, and then invoke
FOUND_ROWS() afterward.
Sample usage here.
It depends a bit on the amount of queries that are done on that table with 1 million records. Consider just taking care of good indexes, especially also multi-column indexes (because they are easily forgotton: here. That will do a lot. And, be sure the queries become cached also well on your server.
If you use this column very regular, consider saving it (if it can't be cached by MySQL), as things could become slow. But most of the times good indexing will take care of it.
Best try: setup some tests to find out if a query can still be fast and performance is not dropping when you execute it a lot of times in a row.
EXPLAIN [QUERY]
Use that command (in MySQL) to get information about the way the query is performed and if it can be improved.
Doing the count every time would be OK.
During paging, you can use SQL_CALC_FOUND_ROWS anyway
Note:
A denormalied count will become stale
No-one will page so many items
I have used MySQL a lot, but I always wondered exactly how does it work - when I get a positive result, where is the data stored exactly? For example, I write like this:
$sql = "SELECT * FROM TABLE";
$result = mysql_query($sql);
while ($row = mysql_fetch_object($result)) {
echo $row->column_name;
}
When a result is returned, I am assuming it's holding all the data results or does it return in a fragment and only returns where it is asked for, like $row->column_name?
Or does it really return every single row of data even if you only wanted one column in $result?
Also, if I paginate using LIMIT, does it hold THAT original (old) result even if the database is updated?
The details are implementation dependent but generally speaking, results are buffered. Executing a query against a database will return some result set. If it's sufficiently small all the results may be returned with the initial call or some might be and more results are returned as you iterate over the result object.
Think of the sequence this way:
You open a connection to the database;
There is possibly a second call to select a database or it might be done as part of (1);
That authentication and connection step is (at least) one round trip to the server (ignoring persistent connections);
You execute a query on the client;
That query is sent to the server;
The server has to determine how to execute the query;
If the server has previously executed the query the execution plan may still be in the query cache. If not a new plan must be created;
The server executes the query as given and returns a result to the client;
That result will contain some buffer of rows that is implementation dependent. It might be 100 rows or more or less. All columns are returned for each row;
As you fetch more rows eventually the client will ask the server for more rows. This may be when the client runs out or it may be done preemptively. Again this is implementation dependent.
The idea of all this is to minimize roundtrips to the server without sending back too much unnecessary data, which is why if you ask for a million rows you won't get them all back at once.
LIMIT clauses--or any clause in fact--will modify the result set.
Lastly, (7) is important because SELECT * FROM table WHERE a = 'foo' and SELECT * FROM table WHERE a = 'bar' are two different queries as far as the database optimizer is concerned so an execution plan must be determined for each separately. But a parameterized query (SELECT * FROM table WHERE a = :param) with different parameters is one query and only needs to be planned once (at least until it falls out of the query cache).
I think you are confusing the two types of variables you're dealing with, and neither answer really clarifies that so far.
$result is a MySQL result object. It does not "contain any rows." When you say $result = mysql_query($sql), MySQL executes the query, and knows what rows will match, but the data has not been transferred over to the PHP side. $result can be thought of as a pointer to a query that you asked MySQL to execute.
When you say $row = mysql_fetch_object($result), that's when PHP's MySQL interface retrieves a row for you. Only that row is put into $row (as a plain old PHP object, but you can use a different fetch function to ask for an associative array, or specific column(s) from each row.)
Rows may be buffered with the expectation that you will be retrieving all of the rows in a tight loop (which is usually the case), but in general, rows are retrieved when you ask for them with one of the mysql_fetch_* functions.
If you only want one column from the database, then you should SELECT that_column FROM .... Using a LIMIT clause is also a good idea whenever possible, because MySQL can usually perform significant optimizations if it knows that you only want a certain group of rows.
The first question can be answered by reading up on resources
Since you are SELECTing "*", every column is returned for each mysql_fetch_object call. Just look at print_r($row) to see.
In simple words the resource returned it like an ID that the MySQL library associate with other data. I think it is like the identification card in your wallet, it's just a number and some information but asociated with a lot of more information if you give it to the goverment, or your cell-phone company, etc.