Closing & reopening MySQL connection after remote API replies - PHP

I am working on a backend application that requires some remote API requests, which take 1-3 seconds to respond depending on the load on the remote services.
This backend will get A LOT of requests per second, and I am trying to get the best performance I can out of my server.
Should I close the MySQL connection before calling the API and reopen it after receiving the reply, to free up some resources?
If I do so, what could go wrong?
What should I do if I can't connect again?
I MUST store and update the database after receiving the request.
I am using pure PHP with MySQL (MySQLi).

If your application has to perform some time-consuming task that is not related to the database then you should close the connection so that your PHP process doesn't block the resources on MySQL server.
You can open a new connection after the API call is over. The obvious downside to this is that you will lose the ability to perform transactions. A transaction cannot remain open across two separate MySQL sessions.
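A minimal sketch of the reconnect side of this pattern, covering the case where the new connection cannot be opened straight away. The helper name connectWithRetry, the attempt count and the delay are illustrative assumptions, not part of MySQLi. The connection is created by a callable so the retry logic stays independent of credentials.

```php
<?php
/**
 * Calls $connect until it succeeds or $maxAttempts is reached.
 * $connect is any callable that returns a connection or throws on failure.
 * (Hypothetical helper: the name and the back-off policy are assumptions.)
 */
function connectWithRetry(callable $connect, int $maxAttempts = 3, int $delayMs = 100)
{
    $lastError = null;
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            return $connect();
        } catch (Throwable $e) {
            $lastError = $e;
            if ($attempt < $maxAttempts) {
                // Linear back-off: wait a little longer after each failure.
                usleep($delayMs * 1000 * $attempt);
            }
        }
    }
    throw new RuntimeException("Could not reconnect after {$maxAttempts} attempts", 0, $lastError);
}

// Usage sketch (credentials and callRemoteApi() are placeholders):
// mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT); // make connect failures throw
// $db->close();                      // release the MySQL connection
// $reply = callRemoteApi($payload);  // the 1-3 second remote call
// $db = connectWithRetry(function () {
//     return new mysqli('localhost', 'user', 'pass', 'app');
// });
```

If every attempt fails, the API reply is still in memory, so the caller can decide whether to queue it somewhere else (a file, for example) rather than lose it.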
An even better option would be to detach the long-running process from your main process. There are a number of ways to achieve this. Running the process in the background is the best option if you can do it. Caching the results of long-running processes is also viable. If the user must wait for the whole job to complete, you can employ a 2- or 3-step process:
1. The user sends a request to the server. Your PHP code does some DB operations and redirects the user to the second step.
2. Another PHP process calls the API and does only API-related logic. No database connection is established. Once it is done, the application redirects the user to the last step.
3. Your PHP application performs another set of DB activities and returns the final response to the user.
Of course, with such complex operations, there is more possibility for your application to lose transactional correctness so you have to evaluate all advantages and disadvantages.
One last point. If your server is set up correctly, MySQL should allow at least as many connections as you have PHP processes. That way, even if your PHP processes are utilized at 100% and each one needs to perform some DB operations, your MySQL server will not be the bottleneck. In this case, why not make it simple and keep the database connection open while performing the long-running API call? Saves you a lot of trouble.

Related

Configure Apache and MySQL for parallel and cancelable service to clients

We have a client-server architecture with Angular on the client side and Apache2, PHP (PDO) and MySQL on the server side. The server side exposes an API that gives clients the data to show.
Some observations:
Some API calls can take very long to compute and return a response.
The server side seems to handle a single request per client at any given time (I'm seeing only one corresponding query being executed in MySQL). That limit comes from either Apache or MySQL, since the front end is definitely sending requests in parallel.
The front end cancels requests that are no longer relevant (the data being fetched will not be visible).
Requests canceled by the front end do not seem to be canceled on the server side and continue to run anyway. I think even queued requests will still run when their turn arrives (even though they were canceled on the client side).
I need help to understand:
What exactly prevents all requests (or at least X>1 requests) from running in parallel? Can it be changed?
What configuration should I change in either Apache or MySQL to overcome this?
Is there a way to make Apache drop canceled requests, at least those that are still queued and not yet started?
Thanks!
EDIT
Following @Markus AO's comment (thanks Markus!!!), this turned out to be related to session blocking... I wish I had known about that before!
OP has a number of tangled problems on the table. However I feel these are worthwhile concerns (having wrestled with them myself), so let's take this apart. For great justice; main screen turn on:
Solving Concurrent Request Problems
There are several possible problems and solutions with concurrent connections in a (L)AMP stack. Before looking at tuning Apache and MySQL, however, let me gloss a common "mystery" issue that creates concurrence problems; namely, a necessary evil called "PHP Session Locking".
PHP Session Blocking & Concurrent Requests
In a nutshell: When you use sessions in your application, after calling session_start(), PHP locks the session file stored at your session.save_path directory. This file lock will remain in place until the script ends, or session_write_close() is called. Result: Any subsequent calls by the same user will be queued, rather than concurrently processed, to ensure there's no session data corruption. (Imagine parallel scripts writing into the same $_SESSION!)
An easy way to demonstrate this is to create a long-running script; then call it in your browser; and then open a new tab, and call it again (or in fact, call any script sharing the same session cookie/ID). You'll see that the second call won't execute until the first one is concluded. This is a common cause of strange AJAX lags, especially with parallel AJAX requests from a single page. Processing will be consecutive instead of concurrent. Then, 10 calls at 0.3 sec each will take a total of 3 sec to conclude, and so on. We don't want that, do we!
You can remedy request blocking caused by PHP session lock by ensuring that:
1. Scripts using sessions should call session_write_close() once done storing session data. The session lock will be immediately released.
2. Scripts that don't require sessions shouldn't start sessions to begin with.
3. Scripts that need to only read session data: Using session_start() with ['read_and_close' => true] option will give you a read-only (non-persistent) $_SESSION variable without session locking. (Available since PHP 7.)
Options 1 and 3 will leave you with read access for the $_SESSION variable and release/avoid the session lock. Any changes made to $_SESSION after the session is closed will be silently discarded; no warnings/errors are displayed.
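The two locking-friendly patterns (write then release, and read-only access) could look roughly like this. The function names are made up for illustration; session_write_close() and the read_and_close option are the standard PHP features named above.

```php
<?php
/**
 * Writes one value to the session and releases the lock right away,
 * so parallel requests from the same user are not queued behind this script.
 */
function storeInSessionAndRelease(string $key, $value): void
{
    session_start();          // acquires the session lock
    $_SESSION[$key] = $value;
    session_write_close();    // persists the data and releases the lock
}

/**
 * Returns a snapshot of the session data. With 'read_and_close' the session
 * is closed as soon as it has been read, so the lock is not held while the
 * rest of the script runs. Requires PHP 7.0 or later.
 */
function readSessionSnapshot(): array
{
    session_start(['read_and_close' => true]);
    return $_SESSION ?? [];
}
```

Anything a script assigns to $_SESSION after either call returns is not saved, so any long-running work should come after these calls and must not rely on writing session data later.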
The session lock request blocking issue is only consequential for a single user (using the same session). It has no impact on multi-user concurrence. For further reading, please see:
SO: Session (Auto)-Start, Performance & Session Locking
SO: PHP & Sessions: Is there any way to disable PHP session locking?
In-Depth: PHP Session Locking: How To Prevent Sessions Blocking in PHP requests.
Apache & MySQL Concurrent Requests
Once upon a time, before realizing PHP was the culprit behind blocking/queuing my concurrent calls, I spent a small aeon in tweaking Apache and MySQL and wondering, what happen?
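Apache 2.4 is often quoted as supporting 150 concurrent requests by default; that figure matches the MaxRequestWorkers value shipped in some distribution configs, while the compiled-in defaults are 256 for the prefork MPM and 400 (16 x 25) for worker/event. Any further requests will queue up. There are several settings under the MPM/Multi-Processing Module that you can tune to support the desired level of concurrent connections. Please see: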
MPM Docs
Worker Docs
Overview at Oxpedia
MySQL has options for max_connections (default 151) and max_user_connections (default unlimited). If your application sends a lot of concurrent requests per user, you'll want to ensure the global max connections is high enough to ensure a handful of users don't hog the entire DBMS.
Obviously, you'll further want to tune these settings in light of your server CPU/RAM specs. (The calculations for which are beyond this answer.) Your concurrency issues probably aren't caused by too many open TCP sockets, but hey, you never know...
Canceling Requests to Apache/PHP/MySQL
We don't have much to go on as far as your application's specific wiring, but I understand from the comments that as it stands, a user can cancel a request at the front-end, but no back-end action is taken. (I.e., any back-end response is simply ignored/discarded.)
"Is there a way to make Apache drop cancelled requests?" I'm assuming that your front-end sends the requests directly and without delay to Apache; and onward to PHP > MySQL > PHP > Apache. In that case, no, you can't really have Apache cancel the request that it's already received; or you could hit "stop", but chances are PHP and MySQL are already munching it away...
Holding a "Cancel Window"
However, you could program a "cancel window" lag into your front-end, where requests are only passed on to Apache after e.g. a 0.5-second sleep waiting for a possible cancel. This may or may not have a negative impact on the UX; it may be worth implementing to save server resources if a significant portion of requests are canceled. This assumes a UI with JavaScript. If you're getting direct HTTP calls to the API, you could have a "sleepy proxy receiver" instead.
Using a "Cancel Controller"
How would one cancel PHP/MySQL processes? This is obviously only feasible/doable if calls to your API result in a processing time of any significant duration. If the back-end takes 0.28 sec to process, and the user cancels after 0.3 seconds, then there isn't much left to cancel, is there?
However, if you do have scripts that may run for longer, say into a couple of seconds, you could always find relevant break-points in your code, where you have a "not-canceled" check or a kill/rollback routine. Basically, you'd have the following flow:
1. Front-end sends request with unique ID to main script
2. PHP script begins the long march for building a response
3. On cancel: Front-end re-sends the ID to a light-weight cancel controller
4. Cancel controller logs ID to temporary file/database/wherever
5. PHP checks at break-points if there's a cancel request for current process
6. On cancel, PHP executes a kill/rollback routine instead of further processing
This sort of "cancel watch" will obviously create some overhead, and as such you may want to only incorporate this into heavier scripts, to ensure you actually save some processing time in the big picture. Further, you'd only want at most a couple of breakpoints at significant junctions. For read requests, you could just kill the process; but for write requests, you'd probably want to have a graceful rollback to ensure data integrity in your system.
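As a rough sketch of that flow, here is a file-based version of the cancel flag. Storing flags in the system temp directory, the function names, and modelling the heavy script as a list of step callables are all assumptions made for the example. A database row or cache key would do the same job.

```php
<?php
/** Path of the flag file for a request ID (temp-dir location is an assumption). */
function cancelFlagPath(string $requestId): string
{
    // Hash the ID so client-supplied input never becomes part of a path directly.
    return sys_get_temp_dir() . '/cancel_' . hash('sha256', $requestId);
}

/** Called by the light-weight cancel controller: records the cancel request. */
function requestCancel(string $requestId): void
{
    touch(cancelFlagPath($requestId));
}

/** Called by the main script at its break-points. */
function isCancelled(string $requestId): bool
{
    clearstatcache(true, cancelFlagPath($requestId));
    return file_exists(cancelFlagPath($requestId));
}

/**
 * Runs $steps in order, checking for a cancel request before each one.
 * Returns true if all steps ran, false if it stopped early and ran $rollback.
 */
function runWithCancelWatch(string $requestId, array $steps, callable $rollback): bool
{
    try {
        foreach ($steps as $step) {
            if (isCancelled($requestId)) {
                $rollback();
                return false;
            }
            $step();
        }
        return true;
    } finally {
        @unlink(cancelFlagPath($requestId)); // remove the flag either way
    }
}
```

The cancel controller itself then only needs to validate the incoming ID and call requestCancel(), which keeps it cheap enough to answer while the heavy script is still running.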
You can also cancel/kill a long-running MySQL thread, already initiated by PHP, with mysqli::kill. For this to make sense, you'd want to run it as MYSQLI_ASYNC, so PHP's around to pull the plug. PDO doesn't seem to have a native equivalent for either async queries or kill. Came across $pdo->query('KILL CONNECTION_ID()'); and PHP Asynchronous MySQL Query (see answer for PDO). Haven't tested these myself. Also see: Kill MySQL query on user abort
PHP Connection Handling
As an alternative to a controller that passes the cancel signal "from the side", you could look into PHP Connection Handling and poll for aborted connection status at your cancel check-points with connection_aborted(). (See "MySQL kill" link above for a code example.)
A CONNECTION_ABORTED state follows if a user clicks the "stop" button in their browser. PHP has an ignore_user_abort setting, "Off" by default, which means the script should be aborted on user abort. (In my experience though, if I have a rogue script and session lock is on, I can't do anything until it times out, even when I hit "stop" in the browser. Go figure.)
If you have "ignore user abort" on false, ie. the PHP script terminates on user abort, be aware that this will be a wholly uncontrolled termination, unless you have register_shutdown_function() implemented. Even so, you'd have to flag check-points in your code for your shutdown function to be able to "rewind the clock" from the termination point onward. Also note this caveat:
PHP will not detect that the user has aborted the connection until an attempt is made to send information to the client. Simply using an echo statement does not guarantee that information is sent, see flush(). ~ PHP Manual on ignore_user_abort
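Given that caveat, a check-point based on connection_aborted() has to push some output first. The sketch below is one way to arrange it. The function names and the single padding space are assumptions, and the padding byte has to be harmless for whatever response format is in use. ignore_user_abort(true) is set so that PHP keeps running long enough for the check-point to roll back in a controlled way.

```php
<?php
/**
 * Returns true if the client has disconnected.
 * PHP only notices an abort when it tries to send output, so one space
 * is written and flushed before the status is checked.
 */
function clientHasAborted(): bool
{
    echo ' ';
    if (ob_get_level() > 0) {
        ob_flush();   // push PHP's own output buffer first
    }
    flush();          // then ask the SAPI/web server to send it
    return connection_aborted() === 1;
}

/**
 * Processes $chunks one by one, checking for an aborted connection before
 * each chunk. Returns false (after calling $rollback) if the client is gone.
 * $aborted can be replaced, e.g. for testing; it defaults to clientHasAborted().
 */
function processWithAbortChecks(iterable $chunks, callable $work, callable $rollback, ?callable $aborted = null): bool
{
    ignore_user_abort(true); // don't let PHP terminate the script mid-step
    $aborted = $aborted ?? 'clientHasAborted';
    foreach ($chunks as $chunk) {
        if ($aborted()) {
            $rollback();
            return false;
        }
        $work($chunk);
    }
    return true;
}
```

Whether the flushed byte actually reaches the client promptly also depends on buffering in the web server or any proxy in front of it, so it is worth testing in the real deployment.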
I have no experience with implementing "user abort" over AJAX/JS. For a starting point, see: Abort AJAX Request and Cancel an HTTP fetch() request. Not sure how/if they register with PHP. If you decide to travel down this road, please return and update us with your code / research!

Is PDO::lastInsertId() in multithread single connection safe?

I read some threads here about PDO::lastInsertId() and its safety. It returns last inserted ID from current connection (so it's safe for multiuser app while there is only one connection per user/script run).
I'm just wondering if there is a possibility to get invalid ID if there is only one DB connection per one long script (lots of SQL requests) in multicore server system? The question is more likely to be theoretical.
I think PHP script run is linear but maybe I'm wrong.
PDO itself is not thread safe. You must provide your own thread safety if you use PDO connections from a threaded application.
The best, and in my opinion the only maintainable, way to do this is to make your connections thread-private.
If you try to use one connection from more than one thread, your MySQL server will probably throw Packet Out of Order errors.
The Last Insert ID functionality ensures multiple connections to MySQL get their own ID values even if multiple connections do insert operations to the same table.
For a typical PHP web application, using a multicore server allows it to handle more web-browser requests. A multicore server doesn't make the PHP programs multithreaded. Each PHP program, to handle each web request, allocates its own PDO connections. As you put it, each PHP script run is "linear". The multiple cores allow multiple scripts to run at the same time, but independently.
Last Insert ID is designed to be safe for that scenario.
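To see the per-connection behaviour in action, here is a small experiment. Note the swap: it uses PDO's SQLite driver instead of MySQL so that it runs without a database server (it assumes the pdo_sqlite extension is available). The table and function names are invented for the demo. The same two-connection experiment can be repeated against MySQL by changing the DSN and the CREATE TABLE syntax.

```php
<?php
/**
 * Opens two independent PDO connections to the same database file,
 * inserts through both, and returns what each connection reports
 * as its own last insert ID.
 */
function demoLastInsertIdPerConnection(string $dbFile): array
{
    $a = new PDO('sqlite:' . $dbFile);
    $b = new PDO('sqlite:' . $dbFile);
    $a->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $b->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $a->exec('CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)');

    $a->exec("INSERT INTO items (name) VALUES ('from A')");
    // Connection B inserts twice AFTER connection A's insert.
    $b->exec("INSERT INTO items (name) VALUES ('from B, first')");
    $b->exec("INSERT INTO items (name) VALUES ('from B, second')");

    // Each connection reports the ID of its own most recent insert.
    return ['a' => $a->lastInsertId(), 'b' => $b->lastInsertId()];
}
```

On a fresh database file, connection A still reports the ID of its own row even though connection B has inserted two more rows since, which is the property the question is asking about.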
Under some circumstances a PHP program may leave the MySQL connection open when it's done so another PHP program may use it. This is called a persistent connection or connection pooling. It helps performance when a web site has many users connecting to it. The generic term for a reusable connection is "serially reusable resource."
Some php programs may use threads. In this case the program must avoid allowing more than one thread to use the same connection at the same time, or get the dreaded Packet Out of Order errors.
(Virtually all machines have multiple cores.)

Online Testing Platform

Hi there, we are developing a website where students take online tests.
We are working with PHP and MySQL.
The questions for all the tests are stored in a table, with the test_id associating each question with its test.
Problem:
As the questions are loaded from the server, loading sometimes takes a while.
Because these online tests are TIMED, the test taker feels their time is being wasted.
The loading time may be a result of:
a slow internet connection
the database search
Question:
What is the best way to give the test taker a smooth experience irrespective of their internet speed and PC configuration?
From your wording, I'm assuming each individual question has its own time limit.
Eliminating a user's slow connection is impossible; if you measure the time on the client to try to avoid that, you open it up to cheating (the client can hack the JavaScript to present a false time).
However you can eliminate database query time: set up a websocket server, have the user connect to it when they start the test, load all of the relevant questions in advance on the server into a queue, and when the user requests a question, immediately record the current time and send the next question from the queue out via the websocket connection.
Also make sure that upon receiving the question, the client side JS displays it immediately and doesn't have to e.g. make further AJAX calls or requests before it can display it. If additional information is needed, that should be looked up by your websocket server and bundled in with the question.
By doing this you should be able to get the response time below 50ms if the user has a decent internet connection and is in the same country as your websocket server.
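Leaving the websocket transport itself aside, the server-side part of this idea (questions pre-loaded into a queue, send time stamped on the server) might be sketched as follows. The class name, the array shape of a question and the injectable clock are all hypothetical. A real websocket server would call next() when the client asks for a question and send the result down the socket.

```php
<?php
/**
 * Holds the pre-loaded questions for one test session and records, on the
 * server, when each question was handed out.
 */
final class QuestionQueue
{
    /** @var SplQueue */
    private $queue;
    /** @var array question ID => server timestamp when it was sent */
    private $sentAt = [];
    /** @var callable returns the current time in seconds */
    private $clock;

    public function __construct(array $questions, ?callable $clock = null)
    {
        $this->queue = new SplQueue();
        foreach ($questions as $question) {
            $this->queue->enqueue($question);
        }
        $this->clock = $clock ?? function () { return microtime(true); };
    }

    /** Returns the next question (or null when none are left) and stamps the send time. */
    public function next(): ?array
    {
        if ($this->queue->isEmpty()) {
            return null;
        }
        $question = $this->queue->dequeue();
        $this->sentAt[$question['id']] = ($this->clock)();
        return $question;
    }

    /** Seconds elapsed on the server since the question was sent, or null if it was never sent. */
    public function elapsed(int $questionId): ?float
    {
        if (!isset($this->sentAt[$questionId])) {
            return null;
        }
        return ($this->clock)() - $this->sentAt[$questionId];
    }
}
```

Because the timestamps never leave the server, the client cannot tamper with them. Only network latency remains unaccounted for.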
You cannot speed up loading independently of the user's internet connection. Of course, you can (and should) optimize all SQL queries and long-running tasks so they perform as well as possible.
To avoid issues with test time running out, I would recommend loading all questions before the time starts running. Then, all data can be stored in the client's local storage (refer this link for some more info), but please take into account that this will only work if the browser supports local storage.
Another possibility is to load/generate all data and have some server-side cache (like memcached, or a simple file cache). On every new action, that cache can be queried without having to query all data from the database. Of course, this will only speed up the process if the performance issues are in long-running queries, database speed etc., not if the user's internet connection is too slow.

Cross server MySQL connection and requests

I'm going to be using Node.js to process some CPU-intensive loop operations that send emails to registered users, as PHP was using too much CPU while it ran and froze the site.
One thing is that Node.js will be on a different server and will make its MySQL requests over an external connection.
I've heard that an external DB connection is bad for performance.
Is this true? And what are the pros and cons of doing this?
Keep in mind that when you run a CPU-intensive operation in Node, the whole application blocks, as it runs in a single thread. If you're going to run a CPU-intensive operation in Node, make sure you spawn it off into a child process whose only job is to run the calculation and then return to the primary application. This will ensure your Node app is able to continue responding to incoming requests as the data is being processed.
Now, onto your question. Having the database on a different server is extremely common and typically is a good practice to have. Where you can run into performance problems is if your database is in a different data center entirely. The further (physically) your database server is from your application server, the more latency there will be per request.
If these requests are seriously CPU intensive, you should consider looking into a queueing mechanism for a couple reasons. One, it ensures that even in the event of an application crash, you don't lose a request that is being processed. Two, you can monitor the queue, and scale the number of workers processing the queue in the event that the operations are piling to the point that a single application can't finish processing one before another comes in.

Pheanstalk (PHP client for beanstalk) - how do connections work?

I'd like some help understanding the use of pheanstalk (php beanstalk client). I have a PHP program that is executed on the server when form data is sent to it. The PHP program should then package the form data as a JSON structure and send it to a backend server process.
The thing I don't understand is the connection to the beanstalkd server. Should I create a new Pheanstalk() object each time the PHP program executes? In that case, am I incurring the cost of creating the connection every time? When is the connection closed (since there is no close() method in pheanstalk)?
If the connection is persistent, is it shared among all executions of the PHP program, in which case, what happens in the case of concurrent hits? Thanks for any help.
Yes, you will have to create a new connection with Pheanstalk (or any other library) each time you start the program, since PHP starts each one fresh. The overhead is tiny though.
The Beanstalkd process is optimised to easily handle a number of connections, and will act on them atomically - you won't get a duplicate job, unless you put two of the same in there (and even then, they would have different job-ID's).
Pheanstalk doesn't send anything to the daemon (it doesn't even open the connection) until the first command is sent. For this reason you can't tell whether the daemon is alive until you actively make a request (in my tests, I get the list of current tubes). If you keep reusing the instantiated class within the running program, it will of course keep reusing the same connection.
There's no formal close(), but unset($pheanstalk) would do the same thing by running the destructor. Again, the program is so transient, and the daemon can keep so many concurrent connections open if it's allowed to, that it's not an issue; the connection is shut down when the program itself exits.
In short, don't worry. The overhead of connecting and sending data into, or out of, Beanstalkd will probably be a tiny fraction of any work that is done by the worker, or producer, in generating the request/response.
