I have searched SO and read many questions but didn't find any that really answers mine. First, a little background info:
PHP script receives data (gaming site)
Several DB servers are available for redundancy and performance
I am well aware of MySQL Replication and Cluster but here are my problems with those solutions:
1) In Replication, if the master fails, the entire grid fails or long downtimes are suffered
2) In Cluster, I first thought that adding another node meant downtime, but re-reading the documentation I'm not so sure anymore
Q1: Can someone please clarify if the "rolling restart" actually means downtime for any application connecting to the grid?
Since I was under the impression that downtime was inevitable, it seemed to me that a 3rd application would solve this problem:
PHP connects to the 3rd app; the 3rd app inserts/updates/deletes into one database so it can quickly return last_insert_id; PHP continues its process while the 3rd app goes on inserting/updating/deleting on the other data nodes. In this scenario each DB is a standalone server (not replicated or clustered), and the 3rd app is a daemon.
Q2: Does anybody know of such an app?
In the above scenario selects from the PHP end would randomly choose a DB server (to load balance)
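For illustration, a minimal sketch of the read-side load balancing described here, assuming mysqli and placeholder host names, credentials and tables:

```php
<?php
// Standalone read servers (placeholder host names).
$readHosts = ['db1.example.com', 'db2.example.com', 'db3.example.com'];

// Pick one host at random for this request's SELECTs.
$host = $readHosts[array_rand($readHosts)];

$db = new mysqli($host, 'app_user', 'secret', 'game');
if ($db->connect_errno) {
    // In a real setup you would retry another host instead of giving up.
    die('Read server unavailable: ' . $db->connect_error);
}

$result = $db->query('SELECT id, score FROM players ORDER BY score DESC LIMIT 10');
while ($row = $result->fetch_assoc()) {
    echo $row['id'], ' ', $row['score'], PHP_EOL;
}
```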
Thank you for your time and wisdom
A rolling restart takes a series of nodes and restarts them one by one. It makes sure that no users are connected to a node before the restart, restarts it, then moves on to the next node, and so forth. So yes, each server will be restarted, but in sequence: with a cluster of n nodes, each node restarts one at a time, which either removes the downtime or limits it.
I would suggest integrating your PHP script with a NoSQL database; you can set up clusters for those and will have almost no latency. If you still want a synced MySQL database, you can also set up the NoSQL store as master and replicate to a MySQL slave; that too is possible.
Lots of questions here.
There is no implicit functionality within master-slave replication for promoting a slave in the event that the master fails. But scripting it yourself is trivial.
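For illustration, a naive promotion check might look roughly like this in PHP. Host names, credentials and the promotion steps are placeholders; a real setup also needs fencing, monitoring and a way to repoint clients:

```php
<?php
// Sketch only: detect a dead master and promote the slave.
mysqli_report(MYSQLI_REPORT_OFF);   // don't throw on a failed connection

$master = new mysqli('master.example.com', 'monitor', 'secret');

if ($master->connect_errno) {
    // Master unreachable: turn the slave into an independent master.
    $slave = new mysqli('slave.example.com', 'monitor', 'secret');

    $slave->query('STOP SLAVE');       // stop applying replication events
    $slave->query('RESET SLAVE ALL');  // forget the old master entirely

    // At this point you would repoint the application (config file,
    // DNS, virtual IP, ...) at slave.example.com.
}
```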
For master-master replication that is just not an issue. OTOH, running with lots of nodes, a failure can increase divergence in the datasets.
A lot of the functionality described by your 3rd app is implemented by MySQL Proxy, although there's nothing to stop you building the functionality into your own DB abstraction layer (you can hand off processing via an asynchronous message/HTTP call or in a shutdown function).
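As a rough sketch of the shutdown-function approach (host names, table and values are placeholders, and this is not a substitute for real replication): the primary write returns last_insert_id quickly, and the remaining standalone servers are updated after the script has finished its main work.

```php
<?php
mysqli_report(MYSQLI_REPORT_OFF);   // tolerate an unreachable secondary without throwing

// Primary write: this is what the caller waits for.
$primary = new mysqli('db-primary.example.com', 'app', 'secret', 'game');
$primary->query("INSERT INTO scores (player_id, score) VALUES (42, 1337)");
$newId = $primary->insert_id;   // available to the rest of the request immediately

// Replay the same write on the other standalone servers at shutdown.
// Assumes the id column accepts explicit values so all servers stay in step.
register_shutdown_function(function () use ($newId) {
    foreach (['db2.example.com', 'db3.example.com'] as $host) {
        $db = new mysqli($host, 'app', 'secret', 'game');
        if (!$db->connect_errno) {
            $db->query("INSERT INTO scores (id, player_id, score) VALUES ($newId, 42, 1337)");
        }
    }
});
```

Under PHP-FPM, calling fastcgi_finish_request() at the end of the main script flushes the response first, so the client does not wait for the secondary writes.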
This is my first nervous question on SO because all of my questions in the last decade have already had excellent answers.
I have searched all the terms that I can think of with no hits that appear to address the problem - either on SO or Google generally...
For the last 15 years we have used phpMyAdmin to administer a Linux MySQL manufacturing database of about 100 tables, some of which now hold 50 to 300 million records each. Ongoing development is constant, and manual lookups of various tables to correct erroneous data, or modifications to table indexes etc., are frequent as the data grows. All of this is internal to our fast network, i.e. accessed via our intranet. Most queries are short, and the database runs responsively at a low average load.
As may be understood, DBA mistakes happen. For example, to speed up a slow query, an additional index may be added to a large table without enough thought. At this point the re-indexing may take 30 minutes, and the manufacturing applications (written in PHP for Apache2, also on a Linux server) come to an immediate halt. This is not appreciated in the factory.
And here is the real problem. I cannot then, from my development PC, open a second instance of phpMyAdmin to kill the unwanted MySQL process while it is still busy, which is the very time I need it most :-) The browser just waits for the phpMyAdmin page to load until the long query is finished.
If I happen to have a second instance of phpMyAdmin open already, I can look up the process and kill it satisfactorily. Otherwise, my only resort is to restart Apache2 and/or MySQL on the server. This is too drastic and requires restarting many client machines as well in order to re-establish the necessary manufacturing connections to the database.
I have seen reference on SO that Apache will queue requests from the same IP address in the case of php programs using file-based session management, but it seems to me that I have no control over how phpMyAdmin uses its sessions.
I also read some time ago that if multiple CPU cores were brought into play on the database server, multiple simultaneous connections could be made despite one such query still being busy. I cannot now find any reference to this concept.
Does anyone know how to permit or force a second phpMyAdmin connection from the same PC to the same database server while the first instance of phpMyAdmin is still tied up with a previous slow query?
Many thanks, Jem Stanners
Try MySQL Workbench:
https://dev.mysql.com/downloads/workbench/
Try upgrading the server's RAM and processors
Consider cleaning up the tables and deleting rows where possible
Consider shifting to Oracle (cost is to be considered)
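If the immediate need is just to kill a runaway query without waiting for phpMyAdmin, a small standalone script run from the command line sidesteps the blocked browser session entirely. This is a sketch with placeholder host, credentials and threshold:

```php
<?php
// kill_long_queries.php (hypothetical name) - run with: php kill_long_queries.php
$db = new mysqli('db.example.com', 'admin', 'secret');

$res = $db->query('SHOW FULL PROCESSLIST');
while ($row = $res->fetch_assoc()) {
    // Kill any query that has been running for more than 10 minutes.
    if ($row['Command'] === 'Query' && (int)$row['Time'] > 600) {
        echo "Killing {$row['Id']}: {$row['Info']}\n";
        $db->query('KILL ' . (int)$row['Id']);
    }
}
```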
TLDR
Would moving a PHP application's logic to a C++ daemon that interacts with an OracleDB be a smart move?
I created a simple application for one of the teams at our company, basically they audit transactions made and mark any errors/incompleteness they find.
Initially it was PHP (Apache mod_php) + MySQL running on a virtual CentOS machine with 15 users. MySQL was hogging the CPU at times. Since then I've moved it to PHP-FPM and Oracle.
The queries have been optimized and indexes created correctly and where needed.
The application has about 16 simultaneous users, each expected to audit around 350 transactions a day. Write operations happen almost every second, with every transaction requiring an insert into about 4 tables. The DB structure is currently one large database (no partitioning, no caching). On a daily basis about 70K new transactions are added to the database, and as of today there are more than 1M transactions.
Users sometimes see delays since the PHP side needs to complete the DB write before returning.
I was thinking that this could be improved by:
Moving to a dedicated server
Optimizing the DB first (partitioning: log and tmp, monthly partitions)
Creating an archiving structure to move previous quarters' transactions and user operations
Maybe moving SQL to stored procedures?
Creating a C++ daemon that looks for audit details (this could be file-based, i.e. it watches for a file and loads the records appropriately)
-- it could then be wrapped as a PHP library
Having PHP use Gearman to send the details to the C++ daemon's input directory, or maybe using memcached.
This way the PHP frontend could return to the user as soon as it has passed the details to memcached/Gearman, which I hope would be faster than a DB write.
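For example, a minimal sketch of the Gearman hand-off, assuming the pecl gearman extension and a placeholder function name and payload:

```php
<?php
// Queue the audit details and return immediately; a separate worker
// process performs the actual INSERTs in the background.
$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);   // default gearmand port

$payload = json_encode([
    'transaction_id' => 12345,
    'auditor'        => 'jdoe',
    'errors'         => ['missing_amount'],
]);

$client->doBackground('store_audit', $payload);   // fire-and-forget
```

The worker side would register a store_audit handler via GearmanWorker::addFunction() and do the four inserts there.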
What other options are there? Am I overcomplicating the problem? And yes, the team leader has complained about 2-second delays ("premature optimization is the root of all evil" does not apply here).
This sounds like your application is write-bound at the moment.
One might ask whether the transaction processing really needs to be synchronous or could be dealt with asynchronously (think message queue).
Otherwise I'd see two reasonable paths of action:
Separate the persistence layer from the application layer, i.e. host the DB on a dedicated machine with better write throughput.
Review the actual persistence schema of your application; 2 s for four writes sounds like you are doing very expensive writes.
Regarding C++ and stored procedures, I don't see potential for high returns, as the problem seems closer to hardware limits or business rules than to the technology used.
I am writing a PHP application which uses MySQL in the backend. I am expecting about 800 users a second to be hitting our servers, with requests coming from an iOS app.
The application is spread out over about 8 different PHP scripts which do very simple SELECT queries (occasionally with 1 join) and simple INSERT queries where I'm only inserting one row at a time (with less than 10 KB of data per row on average). There's about a 50/50 split between SELECTs and INSERTs.
The plan is to use Amazon Web Services and host the application on EC2 to spread the CPU load, and RDS (with MySQL) to handle the database, but I'm aware RDS doesn't scale out, only up. So, before committing to an AWS solution, I need to benchmark my application on our development server (not a million miles off the medium RDS spec) to see roughly how many requests a second my application and MySQL can handle, for ballpark figures, before doing an actual benchmark on AWS itself.
I believe I only really need to performance-test the queries within the PHP, as EC2 should handle the CPU load, but I do need to see if and how RDS (MySQL) copes under that many users.
Any advice on how to handle this situation would be appreciated.
Thank-you in advance!
Have you considered using Apache Benchmark (ab)? It should do the job here. I've also heard good things about Siege but haven't tested it yet.
If you have 800 user hits per second, it could be a good idea to consider sharding from the start. Designing and implementing sharding right at the beginning will allow you to start with a small number of hosts and then scale out more easily later. If you design for only one server, even if it handles the load for now, pretty soon you will need to scale up, and it will be much more complex to switch to a sharding architecture once the application is already in production.
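To give a rough idea, shard selection can be as simple as hashing the user id to a host; the names below are placeholders, and consistent hashing would make later rebalancing easier:

```php
<?php
// Placeholder shard map; in practice this would live in configuration.
$shards = [
    0 => 'shard0.example.com',
    1 => 'shard1.example.com',
    2 => 'shard2.example.com',
    3 => 'shard3.example.com',
];

// Simple modulo sharding: the same user always lands on the same shard.
function shardFor(int $userId, array $shards): string
{
    return $shards[$userId % count($shards)];
}

$host = shardFor(123456, $shards);
$db   = new mysqli($host, 'app', 'secret', 'appdb');
```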
I am trying to write a client-server app.
Basically, there is a Master program that needs to maintain a MySQL database that keeps track of the processing done on the server-side,
and a Slave program that queries the database to see what to do to keep in sync with the Master. There can be many slaves at the same time.
All the programs must be able to run from anywhere in the world.
For now, I have set up a MySQL database on a shared hosting server to host the DB,
and made C++ programs for the master and slave that use the cURL library to make requests to a PHP file (e.g. www.myserver.com/check.php) located on my hosting server.
The master program calls the URL every second and some PHP code is executed to keep the database up to date. I did a test with a single slave program that also calls the URL every second and executes PHP code that queries the database.
With that setup, however, my web host suspended my account and told me that I was 'using too much CPU resources' and that I would need a dedicated server ($200 per month rather than $10), based on their analysis of the CPU resources needed. And that was with one master and only one slave, so no more than 5-6 MySQL queries per second. What would it be with 10 slaves, then?
Am I missing something?
Would there be a better setup than what I was planning to use in order to achieve the syncing mechanism that I need between two and more far apart programs?
I would use Google App Engine for storing the data. You can read about free quotas and pricing here.
I think the syncing approach you are taking is probably fine.
The more significant question you need to ask yourself is: what is the maximum acceptable time between syncs? If you truly need virtually realtime syncing between two databases on opposite sides of the world, then you will be using significant bandwidth and will unfortunately have to pay for it, as your host pointed out.
Figure out what is acceptable to you in terms of time. Is it okay for the databases to sync only once a minute? Once every 5 minutes?
Also, when running syncs like this in rapid succession, it is important to make sure you are not overlapping them: before a sync starts, test whether one is already in progress and has not finished yet. If a sync is still happening, don't start another; if not, go ahead. This prevents a lot of unnecessary overhead and syncs piling up on top of each other.
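A minimal sketch of that check using a lock file (the path and the run_sync() helper are placeholders):

```php
<?php
// Acquire an exclusive, non-blocking lock; if another sync holds it, skip.
$lock = fopen('/tmp/sync.lock', 'c');

if (!flock($lock, LOCK_EX | LOCK_NB)) {
    exit(0);   // a previous sync is still running
}

run_sync();    // hypothetical function doing the actual sync work

flock($lock, LOCK_UN);
fclose($lock);
```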
Are you using a shared web host? What you are doing sounds like excessive use for a shared (cPanel-type) host; use a VPS instead. You can get an unmanaged VPS with 512 MB for 10-20 USD per month depending on spec.
Edit: if your bottleneck is CPU rather than bandwidth, have you tried bundling up updates inside a transaction? Let's say you are getting 10 updates per second and you decide you are happy with a propagation delay of 2 seconds. Rather than opening a connection and a transaction for each of the 20 statements, bundle them together in a single transaction that executes every two seconds. That would substantially reduce your CPU usage.
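Roughly, with PDO (the DSN, table and the $pendingUpdates buffer are placeholders):

```php
<?php
// $pendingUpdates is assumed to hold ~2 seconds' worth of queued changes,
// e.g. [['state' => 'done', 'id' => 17], ...].
$pdo = new PDO('mysql:host=db.example.com;dbname=sync', 'app', 'secret');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $pdo->prepare('UPDATE jobs SET state = ? WHERE id = ?');

// One connection, one transaction, ~20 statements.
$pdo->beginTransaction();
foreach ($pendingUpdates as $u) {
    $stmt->execute([$u['state'], $u['id']]);
}
$pdo->commit();
```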
I have a PHP app that is running on an Apache server with MySQL databases.
Based on the subdomain that users access, I am connecting them to a database (sub1.domain.com connects to database_sub1 and sub2.domain.com connects to database_sub2). Right now there are 10 subdomain-database combos, but that number could potentially grow to well over 100.
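For context, the mapping described above is roughly the following (host, credentials and the whitelist are placeholders):

```php
<?php
// e.g. "sub1.domain.com" -> database_sub1
$sub = explode('.', $_SERVER['HTTP_HOST'])[0];

// Whitelist the subdomain rather than building the DB name blindly.
$allowed = ['sub1', 'sub2', 'sub3'];
if (!in_array($sub, $allowed, true)) {
    http_response_code(404);
    exit;
}

$db = new mysqli('localhost', 'app', 'secret', 'database_' . $sub);
```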
So, is this a bad thing?
Considering my situation, is mysql_pconnect the way to go?
Thanks, and please let me know if more info would be helpful.
Josh
Is this an app you have written?
If so, from a maintenance standpoint this may turn into a nightmare.
What happens when you alter the program and need to change the database?
Unless you have a sweet migration tool to help you make changes to all your databases to the new schema you may find yourself in a world of hurt.
I realize you may be too far into this project now, but if a little additional relational structure were added to the schema to differentiate between the domains (companies/users), you could run them all off one database with little additional overhead.
If performance really becomes a problem (Read this) you can implement Clustering or another elegant solution, but at least you won't have 100+ databases to maintain.
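For illustration, the single-database alternative could scope every query by a discriminator column; company_id, the tables and the subdomain lookup below are placeholders:

```php
<?php
$db  = new mysqli('localhost', 'app', 'secret', 'appdb');
$sub = explode('.', $_SERVER['HTTP_HOST'])[0];   // e.g. "sub1"

// Map the subdomain to a tenant row once, then filter everything by it.
$stmt = $db->prepare('SELECT id FROM companies WHERE subdomain = ?');
$stmt->bind_param('s', $sub);
$stmt->execute();
$companyId = $stmt->get_result()->fetch_row()[0];

$stmt = $db->prepare('SELECT id, title FROM orders WHERE company_id = ?');
$stmt->bind_param('i', $companyId);
$stmt->execute();
$orders = $stmt->get_result()->fetch_all(MYSQLI_ASSOC);
```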
It partly depends on the rest of your configuration, but as long as each transaction only involves one connection then the database client code should perform as you would expect - about the same as with a single database but with more possibilities to improve the performance of the database servers, up to the limit of the network bandwidth.
If more than one connection participates in a transaction, then you probably need an XA compliant transaction manager, and these usually carry a significant performance overhead.
No, it's not a bad thing.
It's rather a question of the total number of parallel connections. This is defined by max_connections in the MySQL settings (the default is 151 since MySQL 5.1.15) and is limited by the capabilities of your platform (e.g. around 2048 on Windows, more on Linux), hardware (RAM) and system settings (mainly the open-files limit). It can become a bottleneck if you have many parallel users; the number of databases is not important.
I made a script which connects to 400+ databases in one execution (one after another, not in parallel) and I found that MySQL + PHP handled it very well (no significant memory leaks, no big overhead). So I assume there will be no problem with your configuration.
And finally: mysql_pconnect is generally not a good thing in web development if there is no significant overhead in connecting to the database per se. You have to manage it really carefully to avoid problems with max_connections, locks, pending scripts etc. I think pconnect has limited use (e.g. a cron job run every second or something like that).
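For completeness, with the newer mysqli API the equivalent of a persistent connection is requested by prefixing the host with p: (credentials below are placeholders). Each Apache/PHP-FPM worker then keeps its connection open across requests, which is exactly why max_connections has to be sized for the whole worker pool:

```php
<?php
// Persistent: reused by this worker process across requests.
$persistent = new mysqli('p:localhost', 'app', 'secret', 'appdb');

// Regular: opened and closed per request.
$regular = new mysqli('localhost', 'app', 'secret', 'appdb');
```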