I want to run more than 800 PHP scripts in the background simultaneously on Linux. Each PHP script will execute forever, meaning it will not stop once it has started. Each script will send requests to and receive responses from a server. How much RAM do I need for that? Will it be possible to run more than 800 scripts? What kind of hardware do I need?
You're probably doing it wrong. Since your scripts are I/O bound rather than CPU bound, an event loop will help you. That way you only need as many workers as you have CPU cores.
This approach not only lowers the resources you need in terms of memory and CPU cycles, it also reduces the number of scripts you have to monitor.
There are various PHP implementations; here are the three most popular ones:
Amp
Icicle
React
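For illustration, here is a minimal sketch of the event-loop approach using React (the same idea applies to Amp or Icicle). It assumes the react/http and react/event-loop packages are installed via Composer, and the URL is a placeholder for your real server:

<?php
// Sketch only: one event loop drives many concurrent HTTP requests instead
// of 800 separate blocking PHP processes. Package names and the URL are
// assumptions, not part of the original question.
require __DIR__ . '/vendor/autoload.php';

use React\Http\Browser;

$browser = new Browser();

// Fire off many requests; none of them blocks the others.
for ($i = 0; $i < 800; $i++) {
    $browser->get('https://example.com/endpoint?worker=' . $i)->then(
        function (Psr\Http\Message\ResponseInterface $response) use ($i) {
            echo "Worker $i got " . $response->getStatusCode() . "\n";
        },
        function (Exception $e) use ($i) {
            echo "Worker $i failed: " . $e->getMessage() . "\n";
        }
    );
}
// With recent ReactPHP versions the event loop runs automatically when the
// script ends; older versions require an explicit $loop->run().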
Well, I'm sure the hardware you seek exists, but you will need a time machine to access it... do you have a time machine?
I'm going to assume you do not have access to, or plans to build, a time machine, and say that this is not sensible.
In case humour didn't do it for you: there is no hardware capable of executing that many processes concurrently, and setting out to create an architecture that requires more threads than any commonly available hardware can execute is clearly a bad idea.
If all you are doing is I/O, then you should use non-blocking, asynchronous I/O.
Working out how much RAM you will need is simple: the amount of data each script holds in memory during execution x 800.
You can improve memory usage by setting variables to null as soon as you are done with the data; even if you are re-using the variables later, I would highly recommend this. That way the execution will not turn into a memory leak that fills up RAM and crashes your server.
$myVariable = null; // releases the reference so the memory can be reclaimed
The second part of your question, "execute forever", is easy too: you simply need to tell PHP to allow the script to run for a long time. Personally, though, I would do the following:
Set up 800 crons of your script, all running every 1 hour.
I assume your script runs an infinite loop: note the time into a variable before the loop, and inside the loop check whether 1 hour has passed; if it has, end the loop and the process (a new one will replace it).
Doing the above ensures the process is cleaned up every hour; also, if for some reason a process gets killed by the server due to resource or security checks, it will spring back up within the hour.
Of course you could lower this to 30 minutes, 15 minutes, or 5 minutes depending on how heavy each loop is and how often you want to re-establish the processes.
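A minimal sketch of that self-terminating loop might look like the following (the one-second pause and the request/response placeholder are assumptions):

<?php
set_time_limit(0);      // allow the script to run longer than the default limit
$started = time();
$maxRuntime = 3600;     // one hour; lower to 1800, 900, 300 etc. as needed

while (true) {
    // ... send the request and handle the response here ...

    if (time() - $started >= $maxRuntime) {
        break;          // end this process; the hourly cron spawns a replacement
    }

    sleep(1);           // small pause so the loop does not spin at 100% CPU
}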
I have a large database with 1800 contact records. A PHP script is being used to perform a Word mail merge. When I do a mail merge with 500 records, it takes around 40 seconds, but for all 1800 records the execution time is around 300 seconds, where it should take no more than 180-200 seconds.
I have tweaked the php.ini and php-fpm configuration and increased some values, but there is no improvement. Is this normal for PHP when processing a large number of records?
Is it exactly 300 seconds (i.e. 5 minutes) every time? If so, I think you might be getting stuck on one of the records somehow and PHP is hitting its max_execution_time.
Try running your script with the latter 1300 of the 1800 (by adding LIMIT 500, 1300 to your query). Does it take 300s?
Try splitting the record set in half (by adding LIMIT 0, 900 in one attempt, then LIMIT 900, 900 in the next). Take the set that results in the 300s execution time and split that in half; continue until you find the record that is causing your trouble.
Investigate that record and see why it's causing an infinite loop or hang.
Also, do you have E_ALL error reporting on? You may have Notices or Warnings that may shed some light on the issue.
UPDATE:
In your script, where you have your query string defined, add an echo of the query concatenated with <br> after the assignment and before the query call itself. Then wrap that echo in ob_end_flush(); and ob_start();.
For example, if you had code like this:
$query = "SELECT * FROM {contacts};";
$result = mysqli_query($query);
change it to:
$query = "SELECT * FROM {contacts};";
ob_end_flush();
# CODE THAT NEEDS IMMEDIATE FLUSHING
echo $query . "<br>";
ob_start();
$result = mysqli_query($query);
This will make PHP echo the query string immediately, without waiting for the script execution to end or for the buffer to fill. Then you can watch the output while the script is still running and determine immediately what's actually happening (either the script is getting hung on a single, specific record, or the execution time is legitimately increasing as you iterate over an increasing number of records).
If you actually share your code, I'd be able to help a lot more.
1800 is a very small number of records and shouldn't take much time at all to process. Can you provide a little more detail on the process itself? Is it a webpage or a background process? Are you sending the same data to all of these users?
This type of thing should be handled by a worker / background process, so typically you shouldn't care how long it takes (within reason), provided the job completes. That said, because it is such a small range, you could troubleshoot the process by sending 250 records first and using that time as the baseline for what you expect a batch of 250 to take. Increase by 250 each time. If the average time per record increases each time you increment the batch size, you can be pretty sure you have an issue with the code, in which case post it here and we can try to resolve it for you. If the average increases only within a specific window (e.g. between 750 and 1000), then it is likely a data issue.
This is as simple an approach as I can suggest without seeing any code or having any extra detail.
I’m looking at ways to send out 1 email every 1 minute. I’ve looked at the example below, where the top answer is to use the PHP sleep() function.
However, I’ve also found suggestions that sleep() might slow down the server.
I’m not looking for exact answers but general approaches would be great.
However, I've also found suggestions that sleep might slow down the server.
Yes, and hitting the pause button on a movie playing on your computer will slow down the duration of the film based on the amount of time you pause the movie.
The purpose of sleep is to put a pause in your script. As described in the official PHP documentation:
Delays the program execution for the given number of seconds.
So yes, it slows down your server. But only on content or pages where sleep is active.
So if this is a frontend script with sleep in it, it slows down anyone's ability to view content served by the PHP script that uses sleep. Place it in the middle of a page where HTML is rendering, with a 1 second delay, and your page now takes 1 second longer to render.
If this is a backend process that only you really know about or trigger, it's no big deal; it's a background process anyway, so slowing things down there is expected.
That said, let’s look at your core question which is the first sentence of your post:
I’m looking at ways to send out 1 email every 1 min.
Then what you are looking for is a cron job which is a timed job on a Unix/Linux system. An entry for a cron job for something sending mails out every minute might be something like this:
* * * * * /full/path/to/php /full/path/to/your/php/script.php
But that is superficial: it basically just triggers script.php every minute. Within script.php you would then have to create the core logic that controls what happens each time it's triggered. If you are using a database, then maybe you could create a last_sent field where you store a timestamp of the last time a mail was sent to a recipient, and then act on that. But again, the logic depends on your core needs.
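As a rough, hedged sketch of what that script.php might contain (the recipients table, its columns, and the PDO credentials are assumptions for illustration only):

<?php
// Called by cron once per minute; sends at most one email per run.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// Pick one recipient who has not been mailed yet, or not in the last minute.
$stmt = $pdo->query(
    "SELECT id, email FROM recipients
     WHERE last_sent IS NULL OR last_sent < NOW() - INTERVAL 1 MINUTE
     ORDER BY last_sent ASC
     LIMIT 1"
);
$recipient = $stmt->fetch(PDO::FETCH_ASSOC);

if ($recipient) {
    mail($recipient['email'], 'Subject', 'Message body');

    $update = $pdo->prepare("UPDATE recipients SET last_sent = NOW() WHERE id = ?");
    $update->execute([$recipient['id']]);
}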
But at the end of the day, I am not too clear how sleep would factor into any of this. It might be worth taking a step back and better architecting your script to fit your needs, knowing what cron is, what sleep is, and what they are as well as are not.
It is generally done with a separate worker and a queue manager.
That is: you have a queue manager (e.g. RabbitMQ) to which an email-sending worker is bound.
Then, when you need to send 10 emails, you put all of them into the corresponding queue at once from the script that serves the HTTP response. This step is immediate.
Then the worker reads the emails one by one and sends them with the required delay. This step takes some time, but we don't care.
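A hedged sketch of that producer/worker split, assuming the php-amqplib package (any queue client looks similar; the queue name, credentials and the $emailsToSend array are placeholders, and both halves are shown in one file only for brevity):

<?php
require __DIR__ . '/vendor/autoload.php';

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel    = $connection->channel();
$channel->queue_declare('emails', false, true, false, false);

// Producer side (runs during the HTTP request): enqueue everything at once.
foreach ($emailsToSend as $email) {
    $channel->basic_publish(new AMQPMessage(json_encode($email)), '', 'emails');
}

// Worker side (a separate long-running process): send one email per minute.
$channel->basic_consume('emails', '', false, false, false, false, function ($msg) {
    $email = json_decode($msg->body, true);
    mail($email['to'], $email['subject'], $email['body']);
    $msg->ack();
    sleep(60);          // enforce the one-email-per-minute pacing
});

while ($channel->is_consuming()) {
    $channel->wait();
}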
The scenario:
TL;DR - I need a queue system for triggering jobs based on a future timestamp and NOT on the order it is inserted
I have a MySQL database of entries that detail particular events to be performed (mostly a series of arithmetic calculations and a database insert/update) in a precise sequence based on timestamps. The time an entry is inserted and the time its event will be "performed" have no correlation; the latter is determined by outside factors. The table also contains a second column of milliseconds, which increases the timing precision.
This table is part of a job "queue" which will contain entries set to execute anywhere from a few seconds to a few days in the future, and could potentially have up to thousands of entries added every second. The queue needs to be parsed constantly (every second?), perhaps by selecting all timestamps that have expired during the current second, sorting by the milliseconds, and then executing each event detailed by the entries.
The problem
Currently the backend is written entirely in PHP on an Apache server with MySQL (i.e. a standard LAMP architecture). Right now, the only way I can think of to achieve what I've specified is to write a custom PHP job-queue script that does the parsing and execution, looped every second using this method. There are no other job systems I'm aware of that can queue jobs according to a specified timestamp/millisecond rather than the entry time.
This method, however, sounds rather infeasible CPU-wise even on paper: I would have to perform a huge MySQL query every second and execute some sort of function for each row retrieved, with the possibility of the work running over a second, which would start introducing delays to the parsing and mess up the looping script.
I am of course trying to create a solution that scales should there be heavy traffic on the system, and this solution fails miserably at that, as it will keep falling further behind as the number of entries grows.
The questions
I'd prefer to stick to the standard LAMP architecture, but is there any other technology I can integrate nicely into the stack that is better equipped to deal with what I'm attempting to do here?
Is there another method entirely to accurately trigger events at a specified future date, without the messy fiddling about with constant queue checking?
If neither of the above options is suitable, is there a better way to loop the PHP script in the background? In the worst-case scenario I can accept the long execution times and split the task up between multiple 'workers'.
Update
RabbitMQ was a good suggestion, but unfortunately it doesn't execute a task as soon as it 'expires': the task has to go through a queue first and wait behind any tasks in front of it that have yet to expire. The expiry times range from a few seconds to a few days, and the queue would need to be re-sorted each time a new event is added so that the expiry times stay in order. As far as I'm aware this isn't possible in RabbitMQ, and it doesn't sound very efficient either. Is there an alternative or a programmatic fix?
Sometimes, making a square peg fit into a round hole takes too much effort. While using MySQL to create queues can be effective, it gets much trickier to scale. I would suggest that this might be an opportunity for RabbitMQ.
Basically, you would set up a message queue that you can put the events into. You would then have a "fanout" architecture with your workers processing each queue. Each worker would listen to the queue and check whether a particular event needs to be processed. I imagine that a combination of the "Work Queues" and "Routing" techniques available in Rabbit would achieve what you are looking for in a scalable and reliable way.
I would envision a system that works something like this:
spawn workers to listen to queues, using routing keys to prune down how many messages they get
each worker checks the messages to see if they are to be performed now
if the message is to be performed, perform it and acknowledge -- otherwise, re-dispatch the message for future processing. There are some simple techniques available for this.
As you need more scale, you add more workers. RabbitMQ is extremely robust and easy to cluster when you eventually max out your queue server. There are also other cloud-based queueing systems, such as Iron.IO and StormMQ.
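A rough sketch of the worker loop from the list above, again assuming php-amqplib; the run_at field and the performEvent() helper are hypothetical names used only for illustration:

<?php
require __DIR__ . '/vendor/autoload.php';

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel    = $connection->channel();
$channel->queue_declare('events', false, true, false, false);

$channel->basic_consume('events', '', false, false, false, false, function ($msg) use ($channel) {
    $event = json_decode($msg->body, true);

    if (microtime(true) >= $event['run_at']) {
        performEvent($event);   // hypothetical: the arithmetic + DB insert/update
    } else {
        // Not due yet: push it back onto the queue for a later pass.
        $channel->basic_publish(new AMQPMessage($msg->body), '', 'events');
    }
    $msg->ack();
});

while ($channel->is_consuming()) {
    $channel->wait();
}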
I've been learning to program in PHP and made an application which does several independent things. The problem is that it takes about 20-30 seconds to finish the task, because the code is executed sequentially.
I was reading and found out that there are no threads in PHP; is there any way to work around this?
Edit: added information:
Basically, my application fetches information such as news and weather (with file_get_contents($url)), but it performs these functions sequentially; in other words, it first fetches the news, then the weather information, and so on, instead of running them all at the same time.
Use some kind of job-queuing software like Gearman or RabbitMQ, then put those operations in the consumer.
Use curl_multi instead; it is much faster: http://php.net/manual/en/function.curl-multi-init.php
It will reduce the loading/processing time noticeably if you are reading numerous pages.
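For the news/weather scenario above, a minimal curl_multi sketch could look like this (the URLs are placeholders):

<?php
$urls = [
    'news'    => 'https://example.com/news',
    'weather' => 'https://example.com/weather',
];

$mh = curl_multi_init();
$handles = [];

foreach ($urls as $key => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[$key] = $ch;
}

// Run all transfers concurrently until every handle has finished.
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);     // wait for activity instead of busy-looping
} while ($running > 0);

$results = [];
foreach ($handles as $key => $ch) {
    $results[$key] = curl_multi_getcontent($ch);
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);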
You could also try to hack in some threading behaviour by launching different requests to your webserver at the same time. For instance, your index.php would serve a simple page containing a number of AJAX calls to, say, fetchNews.php and fetchWeather.php. These requests would then be run asynchronously, in parallel, by the browser, and you'd circumvent PHP's lack of threading by simply launching separate webserver requests.
You mention that you're doing a bunch of file_get_contents($url) calls. These are pretty slow. It would be a huge time-saver if, instead of pulling these files in every time you load the page, you cached them to local storage and read them from there: that would be almost instant. Of course, you'd need to keep in mind how fresh your information needs to be.
For instance, you could run a cron job that fetches these files every minute or so, and then have your website render the fetched information: the information would only be at most 1 minute (plus the time it takes to run that script) out of date.
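A hedged sketch of that cron-driven cache (file names, paths and URLs are assumptions): a crontab line such as * * * * * /usr/bin/php /path/to/fetch_cache.php would run the script below every minute, and the page itself only reads the local copies.

<?php
// fetch_cache.php: pull the remote sources and store them locally.
$sources = [
    'news.json'    => 'https://example.com/news',
    'weather.json' => 'https://example.com/weather',
];

foreach ($sources as $file => $url) {
    $data = file_get_contents($url);
    if ($data !== false) {
        file_put_contents('/var/cache/mysite/' . $file, $data);
    }
}

// The page then does something like this, which is nearly instant:
// $news = file_get_contents('/var/cache/mysite/news.json');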
I have a map. On this map I want to show live data collected from several tables, some of which have an astounding number of rows. Needless to say, fetching this information takes a long time. Pinging is also involved: depending on servers being offline or far away, collecting this data can take anywhere from 1 to 10 minutes.
I want the map to be snappy and responsive, so I've decided to add a new table to my database containing only the data the map needs. That means I need a background process to update the information in my new table continuously. Cron jobs are of course a possibility, but I want the refreshing of data to start as soon as the previous interval has completed. And what if the number of offline IP addresses suddenly spikes and the loop takes longer to run than the interval of the cron job?
My own solution is to create an infinite loop in PHP that runs from the command line. This loop would refresh the data for the map into MySQL, as well as record other useful data such as loop time and failed ping attempts, then restart after a short pause (a few seconds).
However - I'm being repeatedly told by people that a PHP script running for ever is BAD. After a while it will hog gigabytes of RAM (and other terrible things)
Partly I'm writing this question to confirm whether this is in fact the case, but tips and tricks on how I would go about writing a clean loop that doesn't leak memory (if that is possible) wouldn't go amiss. Opinions on the matter would also be appreciated.
The reply I feel sheds the most light on the issue I will mark as correct.
The loop should be in one script which activates/calls the actual worker script as a separate process, much like cron does.
That way, even if memory leaks and uncollected memory accumulates, it will (or at least should) be freed after each cycle.
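A minimal sketch of that wrapper, with the worker path as an assumption: the outer script only loops and launches the real worker as a separate process, so whatever memory the worker leaks is returned to the OS when that process exits.

<?php
while (true) {
    // Blocks until worker.php finishes, then starts the next cycle.
    passthru('/usr/bin/php /path/to/worker.php');

    sleep(5);   // short pause between cycles
}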
However - I'm being repeatedly told by people that a PHP script running for ever is BAD. After a while it will hog gigabytes of RAM (and other terrible things)
This used to be very true. Previous versions of PHP had horrible garbage collection, so long-running scripts could easily accidentally consume far more memory than they were actually using. PHP 5.3 introduced a new garbage collector that can understand and clean up circular references, the number one cause of "memory leaks." It's enabled by default. Check out that link for more info and pretty graphs.
As long as your code takes steps to allow variables to go out of scope at proper times and otherwise unset variables that will no longer be used, your script should not consume unnecessary amounts of memory just because it's PHP.
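For illustration, a hedged sketch of what "allowing variables to go out of scope" looks like in a long-running loop; fetchMapData() and updateMapTable() are hypothetical helpers standing in for the real work:

<?php
while (true) {
    $rows = fetchMapData();     // hypothetical: read the source tables
    updateMapTable($rows);      // hypothetical: write the summary table

    unset($rows);               // drop the large array as soon as it is no longer needed
    gc_collect_cycles();        // optionally force collection of circular references

    sleep(5);
}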
I don't think it's bad; as with anything that you want to run continuously, you just have to be more careful.
There are libraries out there to help you with the task. Have a look at System_Daemon, which released RC 1 just over a month ago and allows you to "Set options like max RAM usage".
Rather than running an infinite loop, I'd be tempted to go with the cron option you mention, in conjunction with a database table entry or flat file that you'd use to store a "currently active" status flag, to ensure that you don't have overlapping processes attempting to run at the same time.
Whilst I realise that this would mean a minor delay before you perform the next iteration, this is probably a better idea anyway as:
It'll let the RDBMS perform any pending low-priority updates, etc. that may well have been on hold due to the amount of activity that you've been carrying out.
Even if you neatly unset all the temporary variables you've been using, it's still possible that PHP will "leak" memory, although recent improvements (5.2 introduced a new memory management system and garbage collection was overhauled in 5.3) should hopefully mean that this is less of an issue.
In general, it'll also be easier to deal with other issues (if the DB connection temporarily goes down due to a config change and restart for example) if you use the cron approach, although in an ideal world you'd cater for such eventualities in your code anyway. (That said, the last time I checked, this was far from an ideal world.)
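A minimal sketch of the "currently active" flag mentioned above, using a flat lock file (the path is an assumption); flock() ensures only one cron-started copy runs at a time, and the lock is released automatically if the process dies:

<?php
$lock = fopen('/tmp/map_refresh.lock', 'c');

if (!flock($lock, LOCK_EX | LOCK_NB)) {
    exit("Previous run still active, skipping this iteration.\n");
}

// ... refresh the map data here ...

flock($lock, LOCK_UN);
fclose($lock);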
First, I fail to see why you need a daemon script in order to provide the functionality you describe.
Cron jobs are of course a possibility, but I want the refreshing of data to happen as soon as the previous interval has completed
So neither a cron job nor a daemon is the way to solve the problem (unless the daemon becomes the data sink for the scripts). I'd spawn a dissociated process when the data is available, using a locking strategy to avoid concurrency.
Long-running PHP scripts are not intrinsically bad, but the reference-counting garbage collector does not deal with all possible scenarios for cleaning up memory; more recent implementations have a more advanced collector (a circular reference checker) which should clean up a lot more.