I have a mailing function and 100k email addresses. I want to call the function multiple times, say 10 times, with each call processing 10k emails. I want to fire off these calls without waiting for a response: call the function, then another, then another, never blocking on a result.
I tried pthreads for multithreading but couldn't get it to run successfully.
I am using a MySQL database.
You can use multiple PHP processes for that. For PHP there isn't much difference between processes and threads, because PHP has a shared-nothing architecture.
You probably want to wait for it to finish and to notice any errors, but don't want to wait for completion before launching another process. Amp is perfectly suited for such use cases. Its amphp/parallel package makes multi-processing easier than using PHP's native API.
You can find a usage example in the repository.
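For illustration, here is a minimal sketch of that pattern, assuming the amphp/parallel-functions wrapper (v1) on top of amphp/parallel; $allEmailAddresses and sendBatch() are hypothetical stand-ins for your own data and mailing code:

<?php
require 'vendor/autoload.php';

use Amp\Promise;
use function Amp\ParallelFunctions\parallelMap;

// Hypothetical: split the 100k addresses into 10 batches of 10k each.
$batches = array_chunk($allEmailAddresses, 10000);

// Each batch runs in its own worker process; all ten are launched
// without waiting for each other, and wait() collects the results
// (and surfaces any errors) once every batch has finished.
$results = Promise\wait(parallelMap($batches, function (array $batch) {
    return sendBatch($batch); // your mailing function
}));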
php name_of_script.php &>/dev/null &
This line will start your script in the background.
I suppose you do not want to control a critical process like sending mail from your browser? What if your connection breaks for a few seconds?
If so, still use the command-line approach, but launch it from PHP using exec():
$command = '/usr/bin/php path/to/script.php > /dev/null 2>&1 & echo $!';
$pid = exec($command); // echo $! captures the background process's PID
I am trying to crawl every page on my site (run by a cron) to update data. There are roughly 500 pages.
I have tried 2 options.
PHP Simple HTML DOM Parser
PHP get_headers
Using either of the above, each page takes roughly 1.402 seconds to load; in total this takes about 570 seconds.
Is there a more efficient way of doing this?
Request pages in parallel (i.e. concurrently). Then it won't matter how long each request takes, because many will fire at once.
There are many ways to achieve this, but here is one example:
curl www.website.com/page1 &
curl www.website.com/page2 &
curl www.website.com/page3 &
Use xargs or other tools to prevent flooding the server with too many concurrent connections, e.g. Bash script processing commands in parallel; a short example follows below.
It can be complicated to run commands in parallel inside a single PHP script; it is easier to use the command line, if possible.
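For instance, a rough one-liner (assuming GNU xargs and the sequential pageN URLs used above) that keeps at most 10 requests in flight at once:
seq 1 500 | xargs -P 10 -I {} curl -s -o /dev/null www.website.com/page{}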
Let's assume I have two PHP scripts, s1.php and s2.php. Let's also assume that s2.php takes about 30 minutes to run.
I would like to use s1.php to call s2.php asynchronously. When s2.php is called, it should run on its own without returning any value to s1.php. s1.php should not wait for s2.php to finish; it should continue with its next command while s2.php runs on its own.
So here is the pseudo code for s1.php
Do something
Call s2.php
Continue s1.php while s2.php is running (this step does not need to wait for s2.php to return; it starts immediately after s2.php starts).
How can I do that?
IMPORTANT NOTE: I am using a shared hosting environment
Out of the box, PHP does not support async processing or threading. What you're most likely after is Queueing and/or Messaging.
This can be as simple as s1.php storing a row in a database and s2.php running on a cron, as suggested in the comments: it reads that database, pulls out the new rows, and executes whatever logic you need.
It would be up to you to clean up, making sure you aren't reprocessing the same rows multiple times; a minimal sketch of the pattern is below.
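Here is a rough sketch of that queue pattern, assuming a hypothetical jobs table with id, payload and processed_at columns; the credentials, schema and doWork() are stand-ins for your own code:

<?php
// s1.php: enqueue a job instead of doing the slow work inline.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$pdo->prepare('INSERT INTO jobs (payload) VALUES (?)')
    ->execute([json_encode(['task' => 'long-running-thing'])]);

<?php
// s2.php: run by cron; pick up unprocessed rows, then mark them done.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$rows = $pdo->query(
    'SELECT id, payload FROM jobs WHERE processed_at IS NULL LIMIT 100'
)->fetchAll(PDO::FETCH_ASSOC);

foreach ($rows as $row) {
    doWork(json_decode($row['payload'], true)); // your 30-minute logic
    $pdo->prepare('UPDATE jobs SET processed_at = NOW() WHERE id = ?')
        ->execute([$row['id']]);
}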
Other solutions would be using something like RabbitMQ or IronMQ. IronMQ might be a good place to look because it's a cloud-based service, which would work well in your shared hosting environment, and they allow for a 'dev tier' account which is free and probably offers far more API calls than you'll ever need.
Another fun thing to look at is ReactPHP, as that does allow for non-blocking I/O in PHP.
afaik, there's no good way to do this :/
You can use proc_open / exec("nohup php5 s2.php &"), or fire the request through curl_multi, which returns without waiting for the response:
$chm = curl_multi_init();
$ch = curl_init("https://example.org/s2.php");
curl_multi_add_handle($chm, $ch);
curl_multi_exec($chm, $running);
(Or, if you don't have curl, substitute fopen... and if allow_url_fopen is false, you can even go as far as socket_create ~~ :/)
I am creating a small plugin that gets data from different websites. The data does not have to be up to date, and I do not want to use a cronjob for this.
Instead, on every visit to the website I want to check whether the DB needs updating. Updating the whole DB takes a while, and I do not want the user to wait for that.
Is there a way to fire the function in the background? The user would keep working as normal while the DB updates in the background.
You could also fork the process using pcntl_fork.
As you can see in the php.net example, you get two execution paths following the function call. The parent process can carry on as usual, while the child goes off to do its own thing.
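A minimal sketch of that idea, assuming the pcntl extension is available (it usually is not under mod_php; it is mostly for CLI scripts), with updateDatabase() as a hypothetical stand-in for your update:

<?php
$pid = pcntl_fork();

if ($pid === -1) {
    die('Could not fork');
} elseif ($pid === 0) {
    // Child: do the slow background work, then exit.
    updateDatabase(); // hypothetical long-running update
    exit(0);
}

// Parent: continues immediately and serves the visitor as normal.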
You'd want to use exec() with a command that is backgrounded (the trailing &) and has its output redirected to a file or /dev/null; otherwise PHP will wait for the command to complete before continuing with the script.
exec('/path/to/php /path/to/myscript.php > /dev/null 2>&1 &');
There are many solutions for executing PHP code asynchronously. The simplest is calling shell exec asynchronously (see Asynchronous shell exec in PHP). For more sophisticated, true parallel processing in PHP, try Gearman. Here is a basic example of how to use Gearman.
The idea behind Gearman is that you have a daemon that manages jobs for you by assigning tasks to workers. You write two PHP files (sketched below):
Worker: contains the code you want to run asynchronously.
Client: the code that calls your asynchronous function.
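A minimal sketch of the two files, assuming the PECL gearman extension and a gearmand server on localhost; the 'send_mails' function name and sendBatch() are hypothetical:

<?php
// worker.php: registers the function and waits for jobs.
$worker = new GearmanWorker();
$worker->addServer(); // defaults to 127.0.0.1:4730
$worker->addFunction('send_mails', function (GearmanJob $job) {
    $batch = json_decode($job->workload(), true);
    sendBatch($batch); // the code you want to run asynchronously
});
while ($worker->work());

<?php
// client.php: fires the job and returns immediately.
$client = new GearmanClient();
$client->addServer();
$client->doBackground('send_mails', json_encode($batchOfAddresses));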
I am trying to create a multi-threaded PHP application right now. I have read lots of papers that explain how to do multithreading. All of those examples are built on dividing the work among different worker PHP files. Actually, that is also what I am trying to do, but there is a problem :)
There are too many jobs to even divide within 30 seconds (which is the execution time limit).
We are using a multi-server environment on a local network to complete the processing, as the processes are not linked to each other and do not share the same memory. We just need to fire them up and let them work at an exact time. Each process runs for about 0.5 seconds, but it may run for up to 30 seconds.
Most of the examples fire up the PHP scripts and wait for the results. But unfortunately, in my situation I do not need a result from the thread; I just need it to execute the command and write the result to its own database.
How can I fire up the PHP scripts, 10,000 processes in all, without waiting for each to finish?
ADDITIONAL INFO:
I know that PHP neither has a multithreading feature nor is built for it. But we have to find a way to use it; for instance, we can send a request to http://server1/dothis.php?jobid=5, but standard methods make us wait for the result. If we can manage to send a request to this server without waiting for the result, I think it would solve our problem; otherwise we will need a completely different approach, such as a process divider in C++ or Qt.
As has been pointed out, PHP doesn't support multithreading. However, as tomaszsobczak mentioned, there is a library called Gearman which will let you create "threads", leave them running, and reconnect to them through other scripts to check their status and so on.
From the project homepage: "Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages. It can be used in a variety of applications, from high-availability web sites to the transport of database replication events. In other words, it is the nervous system for how distributed processing communicates."
Rasmus' blog has a great write-up about it: playing with gearman. For your case it might just be the solution, although I've not read any in-depth test cases... I'd be interested to know, though, so if you end up using this, please report back!
As the comments say, multi-threading is not possible in PHP. But based on your comment:
If we can manage to send request to this server without waiting for result it would solve our problem I think
You can start a PHP script to run in the background using exec(), redirecting the script's output somewhere else (e.g. /dev/null). I think that's the best you will get. From the manual:
Note: If a program is started with this function, in order for it to continue running in the background, the output of the program must be redirected to a file or another output stream. Failing to do so will cause PHP to hang until the execution of the program ends.
There are several notes and pointers in the User Contributed Comments, for example this snippet that allows background execution on both Windows and Linux platforms.
Of course, that PHP script will not share the state, or any data, of the PHP script you are running. You will have to initialize the background script as if you are making a completely new request.
Since PHP does not have the ability to support multithreading, I don't quite know how to advise you. Each time a new script is loaded, a new PHP instance is started, so if your server can handle X PHP instances at once, then you can do what you want.
Here is something however:
Code for background execution
<?php
// Runs a command without blocking, on both Windows and Linux.
function execInBackground($cmd) {
    if (substr(php_uname(), 0, 7) == "Windows") {
        // start /B launches the command without opening a new window.
        pclose(popen("start /B " . $cmd, "r"));
    } else {
        // Redirect output and background the process so exec() returns at once.
        exec($cmd . " > /dev/null &");
    }
}
?>
Found at http://php.net/manual/en/function.exec.php#86329
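For example, with a hypothetical script path:
execInBackground('/usr/bin/php /path/to/myscript.php');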
Do you really want to have multi-threading in PHP?
Or do you just want to execute a PHP script every second? For the latter case, a cronjob-like "execute this file every second" approach via Linux console tools should be enough.
If your task is to make a lot of HTTP requests, you can use curl_multi. There is a good library for doing it: http://code.google.com/p/rolling-curl/
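If a library is not an option, here is a bare-bones curl_multi sketch that fires a set of requests concurrently; the URL list is hypothetical:

<?php
$urls = ['http://server1/dothis.php?jobid=1', 'http://server1/dothis.php?jobid=2'];
$mh = curl_multi_init();
$handles = [];

foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

// Drive all transfers until none are still running.
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh); // wait for activity instead of busy-looping
} while ($running > 0);

foreach ($handles as $ch) {
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);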
As everyone's already mentioned, PHP doesn't natively support multi-threading and the workarounds are, well, workarounds...
That said, have you heard of the Facebook PHP Compiler? Basically it compiles your PHP to highly optimized C++ and uses g++ to compile it. This opens up a world of opportunities, including but not limited to multi-threading!
The project is open-source and it's on GitHub.
If you just want to post an HTTP request, do it using PHP's cURL library. It will solve your issue.
Greetings All!
I am having some trouble working out how to execute thousands upon thousands of requests to a web service (eBay). I have a limit of 5 million calls per day, so there are no problems on that end.
However, I'm trying to figure out how to process 1,000 - 10,000 requests every minute to every 5 minutes.
Basically the flow is:
1) Get list of items from database (1,000 to 10,000 items)
2) Make an API POST request for each item
3) Accept return data, process data, update database
Obviously a single PHP instance running this in a loop would be impossible.
I am aware that PHP is not a multithreaded language.
I tried the CURL solution, basically:
1) Get list of items from database
2) Initialize multi curl session
3) For each item add a curl session for the request
4) execute the multi curl session
So you can imagine 1,000-10,000 GET requests occurring...
This was OK; around 100-200 requests were occurring in about a minute or two. However, only 100-200 of the 1,000 items were actually processed, so I am thinking I'm hitting some sort of Apache or MySQL limit?
But this does add latency; it's almost like performing a DoS attack on myself.
I'm wondering how you would handle this problem? What if you had to make 10,000 web service requests and 10,000 MySQL updates from the data returned by the web service, and all of it within 5 minutes?
I am using PHP and MySQL with the Zend Framework.
Thanks!
I've had to do something similar, but with Facebook, updating 300,000+ profiles every hour. As suggested by grossvogel, you need to use many processes to speed things up, because the script spends most of its time waiting for a response.
You can do this with forking, if your PHP install has support for forking, or you can just execute another PHP script via the command line.
exec('nohup /path/to/script.php >> /tmp/logfile 2>&1 & echo $!', $processId);
You can pass parameters to the PHP script on the command line (via getopt) to tell it which "batch" to process. You can have the master script do a sleep/check cycle to see whether the scripts are still running by checking their process IDs. I've tested up to 100 scripts running at once in this manner, at which point the CPU load can get quite high.
Combine multiple processes with multi-curl, and you should easily be able to do what you need.
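For what it's worth, a rough sketch of that master/worker pattern; the batch count, script path, log file and processBatch() are hypothetical, and the PID check assumes the posix extension:

<?php
// master.php: launch one worker per batch, then poll until all exit.
$pids = [];
for ($batch = 0; $batch < 10; $batch++) {
    exec("nohup php worker.php --batch=$batch >> /tmp/worker.log 2>&1 & echo $!", $out);
    $pids[] = (int) end($out);
}

while ($pids) {
    sleep(5);
    // posix_kill() with signal 0 only checks that the process still exists.
    $pids = array_filter($pids, function ($pid) {
        return posix_kill($pid, 0);
    });
}

<?php
// worker.php: reads its batch number from the command line.
$opts = getopt('', ['batch:']);
processBatch((int) $opts['batch']); // your API calls + DB updates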
My two suggestions are (a) do some benchmarking to find out where your real bottlenecks are, and (b) use batching and caching wherever possible.
Mysqli allows multiple-statement queries, so you could definitely batch those database updates.
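For instance, a small sketch of batching those updates with mysqli's multi_query (table and column names hypothetical):

<?php
$mysqli = new mysqli('localhost', 'user', 'pass', 'app');
$sql = '';
foreach ($items as $item) {
    $sql .= sprintf(
        "UPDATE items SET status = '%s' WHERE id = %d;",
        $mysqli->real_escape_string($item['status']),
        (int) $item['id']
    );
}
// One round trip to the server instead of one per UPDATE.
if ($mysqli->multi_query($sql)) {
    while ($mysqli->next_result()); // flush the per-statement results
}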
The http requests to the web service are more likely the culprit, though. Check the API you're using to see if you can get more info from a single call, maybe? To break up the work, maybe you want a single master script to shell out to a bunch of individual processes, each of which makes an api call and stores the results in a file or memcached. The master can periodically read the results and update the db. (Careful to rotate the data store for safe reading and writing by multiple processes.)
To understand your requirements better: must you implement your solution only in PHP, or can you interface a PHP part with a part written in another language?
If you can't go for another language, try to perform this update as a PHP script that runs in the background rather than through Apache.
You can follow Brent Baisley advice for a simple use case.
If you want to build a robust solution, then you need to:
set up a representation of the actions in a database table that will be your process queue;
set up a script that pops from this queue and processes your actions;
set up a cron daemon that runs this script every X minutes.
This way you can have 1000 PHP scripts running, using your OS's parallelism capabilities and not hanging when eBay is taking too long to respond.
The real advantage of this system is that you can fully control the firepower you throw at your task by adjusting:
the number of requests one PHP script makes;
the order / number / type / priority of the actions in the queue;
the number of scripts the cron daemon runs.
Thanks everyone for the awesome and quick answers!
The advice from Brent Baisley and e-satis works nicely. Rather than executing the sub-processes using cURL like I did before, forking takes a massive load off; it also nicely gets around the issue of maxing out my Apache connection limit.
Thanks again!
It is true that PHP is not multithreaded, but it can certainly be set up with multiple processes.
I have created a system that resembles the one you are describing. It runs in a loop and is basically a background process. It uses up to 8 processes for batch processing and a single control process.
It is somewhat simplified because I do not need any communication between the processes. Everything resides in a database, so each process is spawned with its full context taken from the database.
Here is a basic description of the system.
1. Start control process
2. Check database for new jobs
3. Spawn child process with the job data as a parameter
4. Keep a table of the child processes to be able to control the number of simultaneous processes.
Unfortunately it does not appear to be a widespread idea to use PHP for this type of application, and I really had to write wrappers for the low-level functions.
The manual has a whole section on these functions, and it appears that there are methods for allowing IPC as well.
PCNTL has the functions to control forking/child processes, and Semaphore covers IPC.
The interesting part of this is that I'm able to fork off actual PHP code rather than executing other programs.
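To give a flavour of it, here is a stripped-down sketch of such a control loop, assuming the pcntl extension; fetchNewJobs() and runJob() are hypothetical stand-ins for the database logic described above:

<?php
$maxChildren = 8;
$children = [];

while (true) {
    // Reap any finished children without blocking.
    while (($pid = pcntl_waitpid(-1, $status, WNOHANG)) > 0) {
        unset($children[$pid]);
    }

    // Spawn new children up to the limit, one per pending job.
    foreach (fetchNewJobs($maxChildren - count($children)) as $job) {
        $pid = pcntl_fork();
        if ($pid === 0) {
            runJob($job); // child: full context comes from the database
            exit(0);
        }
        $children[$pid] = true; // parent: track the child PID
    }

    sleep(1);
}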