PHP heavy task in background - php

I'm building a script to generate thousands of PDF pages but the memory consuming will affect the server's perfomance. As this is not a prioritary task (this generation can take hours, as long as it doesn't affect the webserver) what is the best approach for this problem?
I've seen some implementations of pthread but I would have to install also the ZTS. pthreadPHP
Don't really know if this is the right approach for my problem.
Thanks all

If you want to avoid installing pcntl or pthread, simply move the operation to a cron job (with file locking to prevent duplicate processes from running) or a never ending service to process them. Your main application would leave behind the meta data needed to generate the PD while the service runs separately from your main app and can be throttled.

Another option would be to use Gearman which can allow you to offload the PDF generation process to a completely separate server (or group of servers, gearmand itself is really just a job dispatcher). The worker scripts can be written in many different languages, including PHP (so if you're already using some PDF Generation library written in PHP, you can still use it). See also http://gearman.org/

Related

Multi-threaded queue consumer in PHP

I'm writing an image processing system in PHP that should spawn 3-5 workers that will monitor a location / queue for new images, and then process them (using locking so there's no race conditions).
The tricky part is I want to include it for generic CMS platforms. For that reason I can't assume that any particular extensions are enabled, since often it will be on shared hosting.
Is it possible to write a multi-threaded image processor that works with a locking queue without being sure that (for example) PHP's semaphore or pcntl are available? I haven't been able to come up with anything, but I'm hoping there's a way to work this in PHP using native PHP functions rather than having to install PEAR items or reconfigure PHP.
I could install items via Composer-- just not core PHP libs.
Thanks for any ideas!

Multithread This PHP Script? [duplicate]

I recently read about http://php.net/pcntl and was woundering how good that functions works and if it would be smart to use multithreading in PHP since it isn't a core function of PHP.
I would want to trigger events that don't require feedback through it like fireing a cronjob execution manually.
All of it is supposed to run in a web app written with Zend Framework
The pcntl package works quite fine - it just uses the according unix functions. The only shortage is that you can't use them if php is invoked from a web server context. i.e. you can use it in shell scripts, but not on web pages - at least not without using a hack like calling a forking script with exec or similar.
[edit]
I just found a page explaining why mod_php cannot fork. Basically it's a security issue.
[/edit]
This is not thread control, this is process control. The library for the threads is pthreads (POSIX threads) and it's not included in PHP, so there are no multi-threading functions in PHP.
As of multiprocessing, you cannot use that in mod_php, as that would be a giant security hole (spawned process would have all the web-server's privileges).
The only possible way to have php code executing in multiple threads is to run php as a module of a threaded web server, which is useless because threads are fully isolated and your code has no control over them. As far as i know, pcntl only manages processes, not threads.
If I needed to do manual crontab executions or the like from PHP, I'd probably use a queue. Have a database table that you append jobs to. Another process, either from a cron or running as a daemon, executes the jobs as they show up.
Another way to do it is to set up a separate script and do an HTTP GET to it. It's not quite threading, but it's one way of shelling to another command in PHP.
For example, if I wanted to run /usr/bin/somescript.sh on demand, I'd have a somescript.php that did a system call. This would be on a virtual host only accessible from localhost.
I'd do a socket call to the webserver and GET the script. The key is to not read on the socket so it doesn't block. If I wanted to check the return value of somescript.php, I'd do it later in my main script to prevent blocking.
If somescript.php takes a long time to execute (longer than the calling script), you'll have to do some magic to stop apache from killing the script when the socket is closed.
Multiplatform PHP Multithreading engine
http://anton.vedeshin.com/articles/lightweight-and-multiplatform-php-multithreading-engine
Examples of multithreading working in PHP (with excerpts from their project pages):
Cron Multi-Threaded.
As of October 25th, 2011, this module has reached "end of life" and is deprecated in favor of projects such as Elysia Cron. This module wasn't completely useless in that a core patch inspired by Cron MT was committed to D7.
Boost.
... provides static page caching for Drupal enabling a very significant performance and scalability boost for sites that receive mostly anonymous traffic. For shared hosting this is your best option in terms of improving performance. On dedicated servers, you may want to consider Varnish instead.

Working with Threads using Apache and PHP

Is there anyway to work with threads in PHP via Apache (using a browser) on Linux/Windows?
The mere fact that it is possible to do something, says nothing whatever about whether it is appropriate.
The facts are, that the threading model used by pthreads+PHP is 1:1, that is to say one user thread to one kernel thread.
To deploy this model at the frontend of a web application inside of apache doesn't really make sense; if a front end controller instructs the hardware to create even a small number of threads, for example 8, and 100 clients request the controller at the same time, you will be asking your hardware to execute 800 threads.
pthreads can be deployed inside of apache, but it shouldn't be. What you should do is attempt to isolate those parts of your application that require what threading provides and communicate with the isolated multi-threading subsystem via some sane form of RPC.
I wrote pthreads, please listen.
Highly discouraged.
The pcntl_fork function, if allowed at all in your setup, will fork the Apache worker itself, rather than the script, and most likely you won't be able to claim the child process after it's finished.
This leads to many zombie Apache processes.
I recommend using a background worker pool, properly running as a daemon/service or at least properly detached from a launching console (using screen for example), and your synchronous PHP/Apache script would push job requests to this pool, using a socket.
Does it help?
[Edit] I would have offered the above as a commment, but I did not have enough reputation to do so (I find it weird btw, to not be able to comment because you're too junior).
[Edit2] pthread seems a valid solution! (which I have not tried so I can't advise)
The idea of "thread safe" can be very broad. However PHP is on the very, furthest end of the spectrum. Yes, PHP is threadsafe, but the driving motivation and design goals focus on keeping the PHP VM safe to operate in threaded server environments, not providing thread safe features to PHP userspace. A huge amount of sites use PHP, one request at a time. Many of these sites change very slowly - for example statistics say more sites serve pages without Javascript still than sites that use Node.js on the server. A lot of us geeks like the idea of threads in PHP, but the consumers don't really care. Even with multiple packages that have proved it's entirely possible, it will likely be a long time before anything but very experimental threads exist in PHP.
Each of these examples (pthreads, pht, and now parallel) worked awesome and actually did what they were designed to do - as long as you use very vanilla PHP. Once you start using more dynamic features, and practically any other extensions, you find that PHP has a long way to go.

PHP flock function limitations and txt cache files

I have a fairly basic PHP script that caches data to a text file. I need to come up with a solution that prevents two running instances of the script from writing to the file at the same time. I've looked into the PHP flock function, however, the PHP manual (http://php.net/manual/en/function.flock.php) mentioned one big limitation:
On some operating systems flock() is implemented at the process level.
When using a multithreaded server API like ISAPI you may not be able
to rely on flock() to protect files against other PHP scripts running
in parallel threads of the same server instance!
I've got two questions regarding this warning that I'm hoping someone can answer. First, how can I check if my implementation of flock is done at the process level or not? Btw, I'm running CentOS, with cPanel.
Second, if my implementation is at the process level, does that mean that one running instance of my script will not be aware of a lock done by another running instance of the same script? Or do script instances run on separate threads and not separate processes? Any clarification about this is very much appreciated.
Thanks.
The only common case would be running Apache, with the some kind threaded (non-forking) npm. 99% of the cases you are not running PHP threaded.. it's a relatively safe assumption.
Aside from that, it may be worth trying to avoid locking..
The biggest issue you have is that 2 processes may be writing at the same time, or 1 process reads the cache when it's not fully generated. The easiest way to get around this, is to let the PHP script generate the cache in a different file in a temporary location. When the file is written, just move it into place (with rename()). File moves are guaranteed to be atomic when it's happening on the same mount.

Multithreading in PHP

I recently read about http://php.net/pcntl and was woundering how good that functions works and if it would be smart to use multithreading in PHP since it isn't a core function of PHP.
I would want to trigger events that don't require feedback through it like fireing a cronjob execution manually.
All of it is supposed to run in a web app written with Zend Framework
The pcntl package works quite fine - it just uses the according unix functions. The only shortage is that you can't use them if php is invoked from a web server context. i.e. you can use it in shell scripts, but not on web pages - at least not without using a hack like calling a forking script with exec or similar.
[edit]
I just found a page explaining why mod_php cannot fork. Basically it's a security issue.
[/edit]
This is not thread control, this is process control. The library for the threads is pthreads (POSIX threads) and it's not included in PHP, so there are no multi-threading functions in PHP.
As of multiprocessing, you cannot use that in mod_php, as that would be a giant security hole (spawned process would have all the web-server's privileges).
The only possible way to have php code executing in multiple threads is to run php as a module of a threaded web server, which is useless because threads are fully isolated and your code has no control over them. As far as i know, pcntl only manages processes, not threads.
If I needed to do manual crontab executions or the like from PHP, I'd probably use a queue. Have a database table that you append jobs to. Another process, either from a cron or running as a daemon, executes the jobs as they show up.
Another way to do it is to set up a separate script and do an HTTP GET to it. It's not quite threading, but it's one way of shelling to another command in PHP.
For example, if I wanted to run /usr/bin/somescript.sh on demand, I'd have a somescript.php that did a system call. This would be on a virtual host only accessible from localhost.
I'd do a socket call to the webserver and GET the script. The key is to not read on the socket so it doesn't block. If I wanted to check the return value of somescript.php, I'd do it later in my main script to prevent blocking.
If somescript.php takes a long time to execute (longer than the calling script), you'll have to do some magic to stop apache from killing the script when the socket is closed.
Multiplatform PHP Multithreading engine
http://anton.vedeshin.com/articles/lightweight-and-multiplatform-php-multithreading-engine
Examples of multithreading working in PHP (with excerpts from their project pages):
Cron Multi-Threaded.
As of October 25th, 2011, this module has reached "end of life" and is deprecated in favor of projects such as Elysia Cron. This module wasn't completely useless in that a core patch inspired by Cron MT was committed to D7.
Boost.
... provides static page caching for Drupal enabling a very significant performance and scalability boost for sites that receive mostly anonymous traffic. For shared hosting this is your best option in terms of improving performance. On dedicated servers, you may want to consider Varnish instead.

Categories