I'd like some help understanding the use of pheanstalk (php beanstalk client). I have a PHP program that is executed on the server when form data is sent to it. The PHP program should then package the form data as a JSON structure and send it to a backend server process.
The thing I don't understand is the connection to the beanstalkd server. Should I create a new Pheanstalk() object each time the PHP program executes - in which case, am I incurring the cost of creating the connection. When is the connection closed (since there is no close() method in pheanstalk)?
If the connection is persistent, is it shared among all executions of the PHP program, in which case, what happens in the case of concurrent hits? Thanks for any help.
Yes, you will have to create a new connection with Pheanstalk (or any other library) each time you start the program, since PHP starts each one fresh. The overhead is tiny though.
The Beanstalkd process is optimised to easily handle a number of connections, and will act on them atomically - you won't get a duplicate job, unless you put two of the same in there (and even then, they would have different job-ID's).
Pheanstalk doesn't even send data to the daemon any information (including opening the connection) until the first command is sent. It's for this reason that you can't tell if the daemon is even alive till you actively make a request (in my tests, I get the list of current tubes). If you kept re-using the instantiated class in the running program, then it would keep reusing it of course.
There's no formal close(), but unset($pheanstalk) would do the same thing, running the destructor. Again, the call is program so transient and the daemon can keep so many concurrent connections open if it's allowed to, that it's not an issue - and it will be shut down as the program itself does.
In short, don't worry. The overhead of connecting and sending data into, or out of, Beanstalkd will probably be a tiny fraction of any work that is done by the worker, or producer, in generating the request/response.
Related
First of all, i'm using pthreads. So the scenario is this: There are servers of a game that send logs over UDP to an ip and port you give them. I'm building an application that will receive those logs, process them and insert them in a mysql database. Since i'm using blocking sockets because the number of servers will never go over 20-30, I'm thinking that i will create a thread for each socket that will receive and process logs for that socket. All the mysql infromation that needs to be inserted in the database will be send to a redis queue where it will get processed by another php running. Is this ok, or better, is it reliable ?
Don't use php for long running processes (php script used for inserting in your graph). The language is designed for web requests (which die after a couple of ms or max seconds). You will run into memory problems all the time.
I run PHP via FCGI - that is my web server spawns several PHP processes and they keep running for like 10,000 requests until they get recycled.
My question is - if I've a $mysqli->connect at the top of my PHP script, do I need to call $mysqli->close in when I'm about to end running the script?
Since PHP processes are open for a long time, I'd image each $mysqli->connect would leak 1 connection, because the process keeps running and no one closes the connection.
Am I right in my thinking or not? Should I call $mysqli->close?
When PHP exits it closes the database connections gracefully.
The only reason to use the close method is when you want to terminate a database connection that you´ll not use anymore, and you have lots of things to do: Like processing and streaming the data, but if this is quick, you can forget about the close statement.
Putting it in the end of a script means redundancy, no performance or memory gain.
In a bit more detail, specifically about FastCGI:
FastCGI keeps PHP processing running between requests. FastCGI is good at reducing CPU usage by leveraging your server's available RAM to keep PHP scripts in memory instead of having to start up a separate PHP process for each and every PHP request.
FastCGI will start a master process and as many forks of that master process as you have defined and yes those forked processes might life for a long time. This means in effect that the process doesn't have to start-up the complete PHP process each time it needs to execute a script. But it's not like you think that your scripts are now running all the time. There is still a start-up and shutdown phase each time a script has to be executed. At this point things like global variables (e.g. $_POST and $_GET) are populated etc. You can execute functions each time your process shuts down via register_shutdown_function().
If you aren't using persistent database connections and aren't closing database connections, nothing bad will happen. As Colin Schoen explained, PHP will eventually close them during shutdown.
Still, I highly encourage you to close your connections because a correctly crafted program knows when the lifetime of an object is over and cleans itself up. It might give you exactly the milli- or nanosecond that you need to deliver something in time.
Simply always create self-contained objects that are also cleaning up after they are finished with whatever they did.
I've never trusted FCGI to close my database connections for me. One habit I learned in a beginners book many years ago is to always explicitly close my database connections.
Is not typing sixteen keystrokes worth the possible memory and connection leak? As far as I'm concerned its cheap insurance.
If you have long running FastCGI processes, via e.g. php-fpm, you can gain performance by reusing your database connection inside each process and avoiding the cost of opening one.
Since you are most likely opening a connection at some point in your code, you should read up on how to have mysqli open a persistent connection and return it to you on subsequent requests managed by the same process.
http://php.net/manual/en/mysqli.quickstart.connections.php
http://php.net/manual/en/mysqli.persistconns.php
In this case you don't want to close the connection, else you are defeating the purpose of keeping it open. Also, be aware that each PHP process will use a separate connection so your database should allow for at least that number of connections to be opened simultaneously.
You're right in your way of thinking. It is still important to close connection to prevent memory/data leaks and corruption.
You can also lower the amount of scipts recycled each cycle, to stop for a connection close.
For example: each 2500 script runs, stop and close and reopen connection.
Also recommended: back up data frequently.
Hope I helped. Phantom
I'm going to be using Nodejs to process some CPU intense loop operations with sending emails to registered users as PHP was using too much during the time it runs and freezes the site.
One thing is that Nodejs will be on different server and do a request using external connection in MySQL.
I've heard that external db connection is bad for performance.
Is this true? And are there any pros and cons of doing this?
Keep in mind, when running a CPU intensive operation in Node the whole application blocks as it runs in a single thread. If you're going to run a CPU intensive operation in Node, make sure you spawn it off into a child process who's only job is to run the calculation and then return to the primary application. This will ensure your Node app is able to continue responding to income requests as the data is being processed.
Now, onto your question. Having the database on a different server is extremely common and typically is a good practice to have. Where you can run into performance problems is if your database is in a different data center entirely. The further (physically) your database server is from your application server, the more latency there will be per request.
If these requests are seriously CPU intensive, you should consider looking into a queueing mechanism for a couple reasons. One, it ensures that even in the event of an application crash, you don't lose a request that is being processed. Two, you can monitor the queue, and scale the number of workers processing the queue in the event that the operations are piling to the point that a single application can't finish processing one before another comes in.
i read somewhere that php has threading ability called pcntl_fork(). but i still dont get what it means? whats its purpose?
software languages have multithreading abilities from what i understand, correct me if im wrong, that its the ability for a parent object to have children of somekind.
thanks
From wikipedia: "In computer science, a thread of execution is the smallest unit of processing that can be scheduled by an operating system. It generally results from a fork of a computer program into two or more concurrently running tasks."
Basically, having threads is having the ability to do multiple things within the same running application (or, process space as said by RC).
For example, if you're writing a chat server app in PHP, it would be really nice to "fork" certain tasks so if the server gets hung up processing something like a file transfer or very slow client it can spawn a thread to take care of the file transfer while the main application continues to transfer messages between clients without delay. Last time I'd used PHP, the threading was clunky/not very well supported.
Or, on the client end, while sending said file, it would be a good idea to thread the file transfer, otherwise you wouldn't be able to send messages to the server while sending the file.
It is not a very good metaphor. Think of a thread like a worker or helper that will do work or execute code for you in the background while your program could possibly be doing other tasks such as taking user input.
Threading means you can have more than one line of execution within the same process space. Note that this is different than a multi-process paradigm as processes will not share the same memory space.
This wiki link will do a good job getting you up to speed with threads. Having said that, the function pcntl_fork() in PHP, appears to create a child process which falls inline with the multi-process paradigm.
In layman's terms, since a thread gives you more than one line of execution within a program, it allows you to do more than one thing at the same time. Technically, you're not always doing these things simultaneously as in a single core processor, you're really just time-slicing so it appears you're doing more than one thing at a time.
A pretty straight-forward use of threads is how connections to a web server are handled. If you didn't have multiple threads, your application would listen for a connection on a socket, accept the connection when a client requested a connection, and then would process whatever page the client asked for. This seems well and good until you have a page that takes 5 seconds to load and you have 2 clients connecting at the same time. One of the clients will sit and wait for the server to accept their connection for ~5 seconds, because the 1st client is using the only line of execution to serve the page and it can't do that and accept the 2nd connection.
Now if you have multiple threads, you'll have one thread (i.e. the listener thread) that only accepts connections. As soon as the connection is accepted by the listener thread, he will pass the connection on to another thread. We'll call it the processor thread. Once the connection is passed on to the processor thread, the listener thread will immediately go back to waiting for a new connection. Meanwhile, the processor thread will use it's own execution line to serve the page that takes 5 seconds. In the scenario above, the 2nd client would have it's connection accepted immediately after the 1st client was handed to the 1st processor thread and an additional 2nd processor thread would be created to handle the request from the 2nd client. This would typically allow you to serve both clients the data in a little over 5 seconds while the single-threaded app would take ~10 seconds.
Hope this helps with your understanding of application threading.
Threading mean not to allow sequential behavior inside your code, whatever is your programming language..
My understanding is that PHP's p* connections is that it keeps a connection persistent between page loads to the service (be it memcache, or a socket etc). But are these connections thread safe? What happens when two pages try to access the same connection at the same time?
In the typical unix deployment, PHP is installed as a module that runs inside the apache web server, which in turn is configured to dispatch HTTP requests to one of a number of spawned children.
For the sake of efficiency, apache will often spawn these processes ahead of time (pre-forking them) and maintain them, so that they can dispatch more than one request, and save the overhead of starting up a process for every request that comes in.
PHP works on the principle of starting every request with a clean environment; no script variables persist between page loads. (Contrast this with mod_perl or python, where applications often manifest subtle bugs due to unexpected state hangovers).
This means that the typical resource allocated by a PHP script, be it an image handle for GD or a database connection, will be released at the end of a request.
Some resources, particularly Oracle database connections, have quite a high cost to establish, so it is desirable to somehow cache that connection between dispatched web requests.
Enter persistent resources.
The way these work is that any given apache child process may maintain a resource beyond the scope of a request by registering it in a "persistent list" of resources. The persistent list is not cleaned up at the end of the request (known as RSHUTDOWN internally). When you use a pconnect function, it will look up the persistent list entry for a given set of unique credentials and return that, if it exists, or establish a new connection with those credentials.
If you have configured apache to maintain 200 child processes, you should expect to see that many connections established from your web server to your database machine.
If you have many web servers and a single database machine, you may end loading your database machine much more than you anticipated.
With a threaded SAPI, the persistent list is maintained per thread, so it should be thread safe and have similar benefits, but the usual caveat about PHP not being recommended to run in threaded SAPI applies--while PHP is itself thread safe, so many libraries that it uses may have thread safety problems of their own and cause you a good number of headaches.
The manual's page Persistent Database Connections might get you a couple of informations about persistent connections.
It doesn't say anything specific about thread safety, still ; I've quite never seen anything about that anywhere, as far as I remember, so I suppose it "just works OK". My guess would be a connection is re-used only if not already used by another thread at the same time, but it's just some kind of (logical) wild guess...
Generally speaking, PHP will make one persistent connection per process or thread running on the webserver. Because of this, a process or thread will not access the connection of another process or thread.
Instead, when you make a database connection PHP will check to see if one is already open (in the process or thread that is handling the page request) and if it is then it will use it, otherwise it will just initialize a new one.
So to answer your question, they aren't necessarily thread safe but because of how they operate there isn't a situation where two threads or processes will access the same connection.
Generally speaking, when a PHP script requests a persistent connection, PHP will look for one in the connection pool with the same connection parameters.
If one is found that is NOT being used, it is given to the script, and returned to the pool at the end of the script.