I'm trying to increment a like counter for a user's post. A post in my MySQL table has a field likes which shows how many likes this specific post has. Now, what happens if multiple users like the same post at the same time? I think this could result in a conflict or the counter not incrementing correctly. How can I avoid this? Do I need to lock the row, or what other options do I have?
My query could look something like this:
UPDATE posts
SET likes = likes + 1
WHERE id = some_value
Also, when a user unlikes a post, this should decrement the counter: likes - 1.
Thanks for any help!
It's a simple enough query to run:
UPDATE mytable SET mycolumn = mycolumn + 1 WHERE id = some_value;
That way, even if multiple people like the post at the same time, each UPDATE reads and writes the column in one atomic statement, so concurrent increments are applied one after the other and you get the correct number at the end.
Queries such as these run in fractions of a second, so you don't need to worry about multiple users clicking them unless you've got millions of users, and then you'll have plenty of other query problems to deal with.
A lost update is really only a risk if you first read the value into your application and then write it back in a second query.
A single-statement increment like this is more than sufficient; otherwise the entirety of SQL would be rendered useless.
UPDATE posts
SET likes = likes + 1
WHERE id = some_value
You could always run this query twice programmatically and see what happens. I can guarantee that it will go from 0 to 2.
someTableAdapter.LikePost(postID);
someTableAdapter.LikePost(postID);
What you described is called a race condition (as in, a race to the finish; kind of like playing musical chairs with two people but only one chair). I believe that if you research transactions, you may be able to implement your code and handle concurrent likes correctly. Here's a link to the MySQL Manual.
MySQL 5.1 Manual: Transactional and Locking Statements
YouTube: MySQL Transactions
"MySQL uses row-level locking for InnoDB tables to support simultaneous write access by multiple sessions, making them suitable for multi-user, highly concurrent, and OLTP applications.
MySQL uses table-level locking for MyISAM, MEMORY, and MERGE tables, allowing only one session to update those tables at a time, making them more suitable for read-only, read-mostly, or single-user applications. "
MySQL Manual, 8.7.1: Internal Locking Methods
But if transactions are too much, at least make sure you are using the InnoDB storage engine; you should be all right as long as you are on an ACID-compliant engine with the proper isolation level.
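For example, a minimal sketch of wrapping the whole "like" action in a transaction, assuming a separate post_likes table records who liked what (that table and its columns are made up for illustration, they are not from the question):
START TRANSACTION;
-- record who liked the post (illustrative table)
INSERT INTO post_likes (post_id, user_id) VALUES (42, 7);
-- the increment itself is atomic; inside the transaction both changes
-- commit together or not at all
UPDATE posts SET likes = likes + 1 WHERE id = 42;
COMMIT;
Unliking would be the mirror image: DELETE the post_likes row and decrement the counter inside the same kind of transaction.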
There are lots of good references, but hopefully I've pointed you in the right direction.
Anthony
My PHP application sends a SELECT statement to MySQL with HTTPClient.
It takes about 20 seconds or more.
I thought MySQL couldn't return the result immediately, because MySQL Administrator showed the state as "Sending data" or "Copying to tmp table" while I was waiting for the result.
But when I send the same SELECT statement from another application like phpMyAdmin or JMeter, it takes 2 seconds or less. 10 times faster!
Does anyone know why MySQL performs so differently?
Like @symcbean already said, PHP's MySQL driver caches query results. This is also why you can do another mysql_query() while in a while ($row = mysql_fetch_array()) loop.
The reason MySQL Administrator or phpMyAdmin shows results so fast is that they append a LIMIT 10 to your query behind your back.
If you want to get your query results fast, I can offer some tips. They involve selecting only what you need, and only when you need it:
Select only the columns you need; don't throw SELECT * everywhere. An explicit column list might bite you later when you want another column but forget to add it to the SELECT statement, so do this where it matters (like tables with 100 columns or a million rows).
Don't throw a 20-by-1000 table in front of your user. She can't find what she's looking for in a giant table anyway. Offer sorting and filtering. As a bonus, find out what she generally looks for and offer a way to show those records with a single click.
With very big tables, select only the primary keys of the records you need, then retrieve the additional details in the while() loop (a sketch of this pattern follows below). This might look illogical because you make more queries, but when you deal with queries involving around ~10 tables, hundreds of concurrent users, locks and query caches, things don't always make sense at first :)
These are some tips I learned from my boss and from my own experience. As always, YMMV.
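A minimal sketch of the "keys first, details later" pattern from the third tip; the table and column names are invented for illustration:
-- first pass: grab only the primary keys of the rows you care about
SELECT order_id
FROM orders
WHERE status = 'open'
ORDER BY created_at DESC
LIMIT 50;
-- then, per key (or per small batch of keys), fetch only the details
-- you actually display
SELECT order_id, customer_name, total
FROM orders
WHERE order_id IN (101, 105, 113);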
Does anyone know why MySQL performs so differently?
Because MySQL caches query results, and the operating system caches disk I/O (see this link for a description of the process in Linux)
The query I'd like to speed up (or replace with another process):
UPDATE en_pages, keywords
SET en_pages.keyword = keywords.keyword
WHERE en_pages.keyword_id = keywords.id
Table en_pages has the proper structure but only has non-unique page_ids and keyword_ids in it. I'm trying to add the actual keywords (strings) to this table where they match keyword_ids. There are 25 million rows in table en_pages that need updating.
I'm adding the keywords so that this one table can be queried in real time and return keywords (the join is obviously too slow for "real time").
We apply this query (and some others) to sub-units of our larger dataset. We do this frequently to create custom interfaces for specific sub-units of our data for different user groups (sorry if that's confusing).
This all works fine if you give it an hour to run, but I'm trying to speed it up.
Is there a better way to do this that would be faster using PHP and/or MySQL?
I actually don't think you can speed up the process.
You can still throw raw power at your database by clustering new servers.
Maybe I'm wrong or misunderstood the question, but...
Couldn't you use TRIGGERS?
Like... when a new INSERT is detected on "en_pages", do an UPDATE afterwards on that same row?
(I don't know how frequent INSERTs are in that table.)
This is just an idea.
How often do "en_pages.keyword" and "en_pages.keyword_id" change after being inserted?
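A minimal sketch of that trigger idea, reusing the table and column names from the query in the question; this fills in the keyword string as each row is inserted, so the bulk UPDATE would no longer be needed:
DELIMITER //
CREATE TRIGGER en_pages_fill_keyword
BEFORE INSERT ON en_pages
FOR EACH ROW
BEGIN
  -- look up the keyword string for the incoming keyword_id
  SET NEW.keyword = (SELECT keyword FROM keywords WHERE id = NEW.keyword_id);
END//
DELIMITER ;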
I don't know about MySQL, but usually this sort of thing runs faster in SQL Server if you process the records in limited batches (say 1000 at a time) in a loop.
You might also consider a WHERE clause (I don't know what MySQL uses for "not equal to", so I used the SQL Server version):
WHERE en_pages.keyword <> keywords.keyword
That way you are only updating records that have a difference in the field you are updating, not all of them.
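A minimal sketch of the batching idea in MySQL, under the assumption that batching is done by walking ranges of keyword_id (MySQL does not allow LIMIT on a multi-table UPDATE, so a key range is the usual substitute); the range bounds would be advanced by hand or from a small PHP loop:
-- each statement touches only a bounded slice of the 25 million rows,
-- keeping locks short-lived
UPDATE en_pages
JOIN keywords ON en_pages.keyword_id = keywords.id
SET en_pages.keyword = keywords.keyword
WHERE en_pages.keyword_id BETWEEN 1 AND 1000
  AND (en_pages.keyword <> keywords.keyword OR en_pages.keyword IS NULL);
The extra IS NULL test matters because a freshly added keyword column is typically NULL, and NULL <> 'something' does not evaluate to true in SQL.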
I have a few tables which are accessed frequently by users. The same kind of queries run again and again, which causes extra load on the server.
The records are not inserted/updated frequently, so I was thinking of caching the IDs in memcached and then fetching the rows from the database by ID; this would reduce the burden of searching/sorting etc.
Here is an example
SELECT P.product_id
FROM products P, product_category C
WHERE P.cat_id = C.cat_id
  AND C.cat_name = 'Fashion'
  AND P.is_product_active = true
  AND C.is_cat_active = true
ORDER BY P.product_date DESC
The above query returns all the product IDs of a particular category; these will be stored in memcached, and the rest of the process (i.e., paging) will be simulated the same way we do with MySQL result sets.
The insert process will either expire the cache or prepend the new product ID to the cached array.
My question: is this a practical approach? How do people deal with searches, say when someone searches for a product and gets 10,000 results (which may not be practical); do they hit the tables every time? Is there any good example of memcached and MySQL which shows how these tasks can be done?
You may ask yourself if you really need to invalidate the cache upon insert/update of a product.
Usually a 5 minutes cache can be acceptable for a product list.
If your invalidation scheme is time-based only (new entries will only appear after 5 minutes), there is a quick & dirty trick that you can use with memcache: simply use an MD5 of your SQL query string as the memcache key, and tell memcache to keep the result of the SELECT for 5 minutes.
I hope this will help you
I am currently porting an application written in MySQL3 and PHP4 to MySQL5 and PHP5.
On analysis I found several SQL queries which use "select * from tablename" even if only one column (field) is processed in PHP. The table has almost 60 columns and it has a primary key. In most cases, the only column used is id, which is the primary key.
Will there be any performance boost if I use queries in which the column names are explicitly mentioned instead of *? (In this application there is only one method in which we need all the columns; all other methods return only a subset of the columns.)
It is generally considered good practice to fetch only what is needed. Especially if the database server is not on the same machine, fetching an entire row will result in slower queries, because there is more data to transport over the network to the consuming machine. So if a full row is, say, 100k of data and you only need the ID, which is much less, you will of course get faster results.
As a general tip for optimizing queries, use the EXPLAIN statement to see how costly a query will be.
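For instance, a quick illustration of EXPLAIN against a hypothetical table (the names are made up; only the shape of the statement matters):
-- shows which index MySQL plans to use and roughly how many rows
-- it expects to examine for this query
EXPLAIN SELECT id
FROM posts
WHERE user_id = 42
ORDER BY created_at DESC;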
"Premature optimization is root of the all evil". Donald Knuth.
Never ask a question like Will there be any performance boost?. But ask only a question like "I have certain bottleneck. How can I eliminate it?"
In 99% of our applications, this "improvement" would be irrlelvant. As many other improvements, based on the dreams, not on the profiling and real needs.
"Will there be any performance boost if I use queries in which the column names are explicitly mentioned instead of *?" - YES
Whether and how much you benefit depends on the case, but at least for the cases where you only need the id column, you should fix the SQL.
In addition to the reduced network traffic (of sending useless data), the database may be able to get to the few columns you do need just using indexes, without accessing the table at all. That would speed things up a lot.
The only possible downside is the increased number of distinct SQL statements that the server has to process (and more complex code on your end).
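A small sketch of that covering-index effect mentioned above, with made-up table and index names; InnoDB secondary indexes implicitly contain the primary key, so the first query below can be answered from the index alone:
-- assumed schema: users(id PRIMARY KEY, status, ... ~60 more columns)
CREATE INDEX idx_users_status ON users (status);
SELECT id FROM users WHERE status = 'active';  -- covered by the index
SELECT *  FROM users WHERE status = 'active';  -- has to read full rows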
No. There will be an impact on performance, but as long as there aren't BLOBs/CLOBs in the schema it will be negligible (unless you access your database over a 300 baud modem); most of the work done by the database is in identifying the rows matching the WHERE clause. However, it's (IMHO) bad programming practice to use SELECT *.
C.
Yes. Fetch only the columns you require. Not only can this improve performance, but it will prevent your code from inadvertently breaking. Consider this query:
SELECT *
FROM tabA JOIN tabB on ...
ORDER BY colX
The query works today when only tabA has colX, but if you change the schema and add colX to tabB, the query will break (the ORDER BY colX becomes ambiguous).
Of course, qualifying all fields with table aliases will also help prevent breakage.
-Krip
Yes. If you're fetching more data than you need, that has to be read from disk, transferred between MySQL and PHP, etc. which is probably going to take longer.
I am building a fairly large statistics system, which needs to allow users to request statistics for a given set of filters (e.g. a date range).
e.g. This is a simple query that returns 10 results, including the player_id and amount of kills each player has made:
SELECT player_id, SUM(kills) as kills
FROM `player_cache`
GROUP BY player_id
ORDER BY kills DESC
LIMIT 10
OFFSET 30
The above query offsets the results by 30 (i.e. the 3rd 'page' of results). When the user then selects the 'next' page, it will use OFFSET 40 instead of 30.
My problem is that nothing is cached: even though the LIMIT/OFFSET pair is applied to the same dataset, the SUM() is performed all over again, just to offset the results by 10 more.
The above example is a simplified version of a much bigger query which just returns more fields, and takes a very long time (20+ seconds, and will only get longer as the system grows).
So I am essentially looking for a solution to speed up the page load, by caching the state before the LIMIT/OFFSET is applied.
You can of course use caching, but I would recommend caching the result yourself rather than relying on the query cache in MySQL.
But first things first: make sure that a) you have the proper indexing on your data, and b) it is actually being used.
If this does not help (GROUP BY tends to be slow with large datasets), you need to put the summary data in a static table/file/database.
There are several techniques/libraries etc. that help you perform server-side caching of your data. PHP Caching to Speed up Dynamically Generated Sites offers a pretty simple but self-explanatory example of this.
Have you considered periodically running your long query and storing all the results in a summary table? The summary table can be quickly queried because there are no JOINs and no GROUPings. The downside is that the summary table is not up-to-the-minute current.
I realize this doesn't address the LIMIT/OFFSET issue, but it does fix the issue of running a difficult query multiple times.
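A minimal sketch of that summary-table idea, reusing the player_cache query from the question (the summary table name and the rebuild schedule are illustrative):
-- one-time setup: a pre-aggregated totals table, indexed for the sort
CREATE TABLE player_kill_totals (
  player_id INT NOT NULL PRIMARY KEY,
  kills     BIGINT NOT NULL,
  KEY idx_kills (kills)
);
-- rebuild it periodically (e.g. from cron or the MySQL event scheduler)
TRUNCATE player_kill_totals;
INSERT INTO player_kill_totals (player_id, kills)
SELECT player_id, SUM(kills)
FROM player_cache
GROUP BY player_id;
-- page requests then avoid the GROUP BY entirely
SELECT player_id, kills
FROM player_kill_totals
ORDER BY kills DESC
LIMIT 10 OFFSET 30;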
Depending on how often the data is updated, data-warehousing is a straightforward solution to this. Basically you:
Build a second database (the data warehouse) with a similar table structure
Optimise the data warehouse database for getting your data out in the shape you want it
Periodically (e.g. overnight each day) copy the data from your live database to the data warehouse
Make the page get its data from the data warehouse.
There are different optimisation techniques you can use, but it's worth looking into:
Removing fields which you don't need to report on
Adding extra indexes to existing tables
Adding new tables/views which summarise the data in the shape you need it.