I have a problem with a project I am currently working on, built in PHP & MySQL. The project itself is similar to an online bidding system. Users bid on a project, and they get a chance to win if they follow their bid by clicking and cliking again.
The problem is this: if 5 users for example, enter the game at the same time, I get a 8-10 seconds delay in the database - I update the database using the UNIX_TIMESTAMP(CURRENT_TIMESTAMP), which makes the whole system of the bids useless.
I want to mention too that the project is very database intensive (around 30-40 queries per page) and I was thinking maybe the queries get delayed, but I'm not sure if that's happening. If that's the case though, any suggestions how to avoid this type of problem?
Hope I've been at least clear with this issue. It's the first time it happened to me and I would appreciate your help!
You can decide on
Optimizing or minimizing required queries.
You can cache queries do not need to update on each visit.
You can use Summery tables
Update the queries only on changes.
You have to do this cleverly. You can follow this MySQLPerformanceBlog
I'm not clearly on what you're doing, but let me elaborate on what you said. If you're using UNIX_TIMESTAMP(CURRENT_TIMESTAMP()) in your MySQL query you have a serious problem.
The problem with your approach is that you are using MySQL functions to supply the timestamp record that will be stored in the database. This is an issue, because then you have to wait on MySQL to parse and execute your query before that timestamp is ever generated (and some MySQL engines like MyISAM use table-level locking). Other engines (like InnoDB) have slower writes due to row-level locking granularity. This means the time stored in the row will not necessarily reflect the time the request was generated to insert said row. Additionally, it can also mean that the time you're reading from the database is not necessarily the most current record (assuming you are updating records after they were inserted into the table).
What you need is for the PHP request that generates the SQL query to provide the TIMESTAMP directly in the SQL query. This means the timestamp reflects the time the request is received by PHP and not necessarily the time that the row is inserted/updated into the database.
You also have to be clear about which MySQL engine you're table is using. For example, engines like InnoDB use MVCC (Multi-Version Concurrency Control). This means while a row is being read it can be written to at the same time. If this happens the database engine uses something called a page table to store the existing value that will be read by the client while the new value is being updated. That way you have guaranteed row-level locking with faster and more stable reads, but potentially slower writes.
Related
Maybe this is an obvious question, but it's just something I'm unsure of. If I have two standalone PHP applications running on one LAMP server, and the two PHP applications share the same MySQL database, do I need to worry about data integrity during concurrent database transactions, or is this something that MySQL just takes care of "natively"?
What happens if the two PHP applications both try to update the same record at the same time? What happens if they try to update the same table at the same time? What happens if they both try to read data from the database at the same time? Or if one application tries to read a record at the same time as the other application is updating that record?
What happens if the two PHP applications both try to update the same record at the same time?
What happens if they try to update the same table at the same time?
What happens if they both try to read data from the database at the same time?
Or if one application tries to read a record at the same time as the other application is updating that record?
This depend from several factor ..
the db engine you are using
the locking policy / transaction you have setted for you envirement .. or for you query
https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html
https://dev.mysql.com/doc/refman/8.0/en/innodb-locks-set.html
the code you are using .. you could use a select for update for lock only the rows you want modify
https://dev.mysql.com/doc/refman/8.0/en/update.html
and how you manage transaction
https://dev.mysql.com/doc/refman/8.0/en/commit.html
this is just a brief suggestion
I am doing a booking site using PHP and MySql where i will get lots of data for insertion for a single insertion. Means if i get 1000 booking at a time i will be very slow. So what i am thinking to dump those data in MongoDb and run task to save in MySql. Also i am thing to use Redis for caching most viewed data.
Right now i am directly inserting in db.
Please suggest any one has any idea/suggestion about it.
In pure insert terms, it's REALLY hard to outrun MySQL... It's one of the fastest pure-append engines out there (that flushes consistently to disk).
1000 rows is nothing in MySQL insert performance. If you are falling at all behind, reduce the number of secondary indexes.
Here's a pretty useful benchmark: https://www.percona.com/blog/2012/05/16/benchmarking-single-row-insert-performance-on-amazon-ec2/, showing 10,000-25,000 inserts individual inserts per second.
Here is another comparing MySQL and MongoDB: DB with best inserts/sec performance?
I have a multiple devices (eleven to be specific) which sends information every second. This information in recieved in a apache server, parsed by a PHP script, stored in the database and finally displayed in a gui.
What I am doing right now is check if a row for teh current day exists, if it doesn't then create a new one, otherwise update it.
The reason I do it like that is because I need to poll the information from the database and display it in a c++ application to make it look sort of real-time; If I was to create a row every time a device would send information, processing and reading the data would take a significant ammount of time as well as system resources (Memory, CPU, etc..) making the displaying of data not quite real-time.
I wrote a report generation tool which takes the information for every day (from 00:00:00 to 23:59:59) and put it in an excel spreadsheet.
My questions are basically:
Is it posible to do the insertion/updating part directly in the database server or do I have to do the logic in the php script?
Is there a better (more efficient) way to store the information without a decrease in performance in the display device?
Regarding the report generation, if I want to sample intervals lets say starting from yesterday at 15:50:00 and ending today at 12:45:00 it cannot be done with my current data structure, so what do I need to consider in order to make a data structure which would allow me to create such queries.
The components I use:
- Apache 2.4.4
- PostgreSQL 9.2.3-2
- PHP 5.4.13
My recommendations - just store all the information, your devices are sending. With proper indexes and queries you can process and retrieve information from DB really fast.
For your questions:
Yes it is possible to build any logic you desire inside Postgres DB using SQL, PL/pgSQL, PL/PHP, PL/Java, PL/Py and many other languages built into Postgres.
As I said before - proper indexing can do magic.
If you cannot get desired query speed with full table - you can create a small table with 1 row for every device. And keep in this table last known values to show them in sort of real-time.
1) The technique is called upsert. In PG 9.1+ it can be done with wCTE (http://www.depesz.com/2011/03/16/waiting-for-9-1-writable-cte/)
2) If you really want it to be real-time you should be sending the data directly to the aplication, storing it in memory or plaintext file also will be faster if you only care about the last few values. But PG does have Listen/notify channels so probabably your lag will be just 100-200 mili and that shouldn't be much taken you're only displaying it.
I think you are overestimating the memory system requirements given the process you have described. Adding a row of data every second (or 11 per second) is not a hog of resources. In fact it is likely more time consuming to UPDATE vs ADD a new row. Also, if you add a TIMESTAMP to your table, sort operations are lightning fast. Just add some garbage collection handling as a CRON job (deletion of old data) once a day or so and you are golden.
However to answer your questions:
Is it posible to do the insertion/updating part directly in the database server or do I >have to do the logic in the php script?
Writing logic from with the Database engine is usually not very straight forward. To keep it simple stick with the logic in the php script. UPDATE (or) INSERT INTO table SET var1='assignment1', var2='assignment2' (WHERE id = 'checkedID')
Is there a better (more efficient) way to store the information without a decrease in >performance in the display device?
It's hard to answer because you haven't described the display device connectivity. There are more efficient ways to do the process however none that have locking mechanisms required for such frequent updating.
Regarding the report generation, if I want to sample intervals lets say starting from >yesterday at 15:50:00 and ending today at 12:45:00 it cannot be done with my current data >structure, so what do I need to consider in order to make a data structure which would >allow me to create such queries.
You could use the a TIMESTAMP variable type. This would include DATE and TIME of the UPDATE operation. Then it's just a simple WHERE clause using DATE functions within the database query.
I am making a webservice in PHP which does a series of calculations based on a select from a table and then updates the table afterwards with the new results.
However i want to prevent the case where another person is making a call to the same webservice while another person's session is still doing an update.
Is it the right thing here to lock that entire table and then unlock it again? If so, how do i lock and unlock a mysql table using PHP pdo?
Database Management Systems like MySQL are smart enough to prevent concurrency violations like these.
Look for database isolation levels (read uncommitted, read commited, repeatable read, serializable) and the possible problems (dirty read, non repeatable read ...) -> Wikipedia.
Personally, I would not recommend a table lock in your case. You better wrap your calculations and database operations in a transaction and rely on the DBMS to manage your stuff.
I posted a comment:
Not a direct answer, but I don't think this is any problem. The
calculations and fetching data from the database is done within a few
milliseconds. The chances or two people interacting at the same time
is soooo small that most people don't bother making a lock like this.
But if these calculations are critical you could prevent this problem by adding a new field and simply call it occupied, busy or something like that.
When you run your script, check if this field is set to for example 1, if it is, make the script sleep for 1-2-3 seconds and then retry. If this field is set to 0, update it to 1, do the calculations and set it back to 0 again.
This would prevent two people from accessing the same values at the same time.
Is it possible to do a simple count(*) query in a PHP script while another PHP script is doing insert...select... query?
The situation is that I need to create a table with ~1M or more rows from another table, and while inserting, I do not want the user feel the page is freezing, so I am trying to keep update the counting, but by using a select count(\*) from table when background in inserting, I got only 0 until the insert is completed.
So is there any way to ask MySQL returns partial result first? Or is there a fast way to do a series of insert with data fetched from a previous select query while having about the same performance as insert...select... query?
The environment is php4.3 and MySQL4.1.
Without reducing performance? Not likely. With a little performance loss, maybe...
But why are you regularily creating tables and inserting millions of row? If you do this only very seldom, can't you just warn the admin (presumably the only one allowed to do such a thing) that this takes a long time. If you're doing this all the time, are you really sure you're not doing it wrong?
I agree with Stein's comment that this is a red flag if you're copying 1 million rows at a time during a PHP request.
I believe that in a majority of cases where people are trying to micro-optimize SQL, they could get much greater performance and throughput by approaching the problem in a different way. SQL shouldn't be your bottleneck.
If you're doing a single INSERT...SELECT, then no, you won't be able to get intermediate results. In fact this would be a Bad Thing, as users should never see a database in an intermediate state showing only a partial result of a statement or transaction. For more information, read up on ACID compliance.
That said, the MyISAM engine may play fast and loose with this. I'm pretty sure I've seen MyISAM commit some but not all of the rows from an INSERT...SELECT when I've aborted it part of the way through. You haven't said which engine your table is using, though.
The other users can't see the insertion until it's committed. That's normally a good thing, since it makes sure they can't see half-done data. However, if you want them to see intermediate data, you could throw in an occassional call to "commit" while you're inserting.
By the way - don't let anybody tell you to turn autocommit on. That a HUGE time waster. I have a "delete and re-insert" job on my database that takes 1/3rd as long when I turn off autocommit.
Just to be clear, MySQL 4 isn't configured by default to use transactions. It uses the MyISAM table type which locks the entire table for each insert, if I remember correctly.
Your best bet would be to use one of the MySQL bulk insertion functions, such as LOAD DATA INFILE, as these are dramatically faster at inserting large amounts of data. As for the counting, well, you could break the inserts into N groups of 1000 (or Y) then divide your progress meter into N sections and just update it on each group's request.
Edit: Another thing to consider is, if this is static data for a template, then you could use a "select into" to create a new table with the same data. Not sure what your application is, or the intended functionality, but that could work as well.
If you can get to the console, you can ask various status questions that will give you the information you are looking for. There's a command that goes something like "SHOW processlist".