I am creating a web application using Zend. I have built an interface from which user A can send email to one or more users, and it works well, but it slows execution so much that user A waits too long for the "acknowledged" response (which is shown only after the emails have been sent).
In Java there are threads, with which we can perform that task (sending emails) without slowing down the rest of the application.
Is there any technique in PHP/Zend, like threads in Java, by which we can split off tasks that could take a long time, e.g. sending emails?
EDIT (thanks #Efazati, there seems to be new development in this direction)
http://php.net/manual/en/book.pthreads.php
Caution: (from here on the bottom):
pthreads was, and is, an experiment with pretty good results. Any of its limitations or features may change at any time; [...]
/EDIT
No threads in PHP!
The workaround is to store jobs in a queue (say, rows in a table holding the emails) and have a cronjob call your PHP script at a given interval (say, every 2 minutes) to poll for jobs. When jobs are present, fetch a few (how many depends on your PHP install's timeout) and send the emails.
The main idea is to defer execution:
main script adds jobs in the queue
cron script sends them in tiny slices
Gotchas:
make sure you don't send an email without deleting it from the queue (in the worst case a user receives the same email every 2 minutes ...)
make sure you don't delete a job without executing it first ...
handle bouncing email using a score algorithm
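The queue-plus-cron idea and the first two gotchas above can be sketched roughly as follows. This is a minimal illustration in Python with SQLite rather than PHP, and every name in it (the `email_queue` table, `enqueue`, `process_slice`, the `send` callback) is an assumption made for the example, not something from the answer. The same structure should carry over to a PHP script and a MySQL table.

```python
import sqlite3

def init_queue(conn):
    # Hypothetical queue table: one row per email still to be sent.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS email_queue ("
        " id INTEGER PRIMARY KEY,"
        " recipient TEXT NOT NULL,"
        " body TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )

def enqueue(conn, recipient, body):
    """Called by the main (web) script: record the job and return at once."""
    conn.execute(
        "INSERT INTO email_queue (recipient, body) VALUES (?, ?)",
        (recipient, body),
    )
    conn.commit()

def process_slice(conn, send, batch_size=20):
    """Called by the cron script: send at most batch_size pending emails."""
    rows = conn.execute(
        "SELECT id, recipient, body FROM email_queue"
        " WHERE status = 'pending' ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    sent = 0
    for job_id, recipient, body in rows:
        if send(recipient, body):
            # Mark the job done right after a successful send, so the
            # next cron run cannot send the same email again.
            conn.execute(
                "UPDATE email_queue SET status = 'sent' WHERE id = ?",
                (job_id,),
            )
            conn.commit()
            sent += 1
        # A job whose send failed stays 'pending' and is retried next run.
    return sent
```

Marking a row as sent only after `send` succeeds covers the second gotcha, and committing immediately afterwards narrows the window for the first. Bounce scoring is not covered here.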
You could look into using multiple processes, such as with fork. The communication between them wouldn't be as simple as with threads (but then, it won't come with all of its pitfalls either), but if you're just sending emails, it might not be necessary to communicate much, if at all.
Watch out for forking from an Apache process. You may get behavior that you are not expecting. If you are looking to do any kind of asynchronous execution, it should go through some kind of queuing mechanism. Gearman is one. Zend Server Job Queue is another. I have some demo code at "Do you queue? Introduction to the Zend Server Job Queue". Cron can be used, but then you depend on your cron scheduler to run tasks, whereas asynchronous work often needs to run immediately. Using a queuing system allows you to do that without threading.
There is a Threading extension being developed based on PThreads that looks promising at https://github.com/krakjoe/pthreads
There is pcntl, which allows you to create sub-processes, but PHP doesn't work very well for this kind of architecture. You're probably better off creating a long-running script (a daemon) and spawning several of them.
PHP itself has no threads. However, you can have a look at this roundabout way:
http://www.alternateinterior.com/2007/05/multi-threading-strategies-in-php.html
You may want to use a queue system for your email sending and send the email from another system which supports threads. PHP is just a tool, and you should use the tool that is best fitted for the job.
PHP doesn't include threading as part of the language; there are some methods that can emulate it, but they aren't foolproof.
This Google search shows a few potential workarounds
Related
Would appreciate some help understanding typical best practices in carrying out a series of tasks using Gearman in conjunction with PHP (among other things).
Here is the basic scenario:
A user uploads a set of image files through a web-based interface. The php code responding to the POST request generates an entry in a database for each file, mostly with null entries in the columns, queues a job for each to do analysis using Gearman, generates a status page and exits.
The Gearman worker gets a job for a file and starts a relatively long-running analysis. The result of that analysis is a set of parameters that need to be inserted back into the database record for that file.
My question is, what is the generally accepted method of doing this? Should I use a callback that will ultimately kick off a different php script that is going to do the modification, or should the worker function itself do the database modification?
Everything is currently running on the same machine; I'm planning on using Gearman for background scheduling, rather than for scaling by farming out to different machines, but in any case any of the functions could connect to the database wherever it is.
Any thoughts appreciated; just looking for some insights on how this typically gets structured and what might be considered best practice.
Are you sure you want to use Gearman? I only ask because it was the de facto PHP job server about 15 years ago but hasn't been a reliable solution for quite some time. I am not sure if things have drastically improved in the last 12 months, but the last time I evaluated Gearman, it wasn't production capable.
Now, on to the questions.
what is the generally accepted method of doing this? Should I use a callback that will ultimately kick off a different php script that is going to do the modification, or should the worker function itself do the database modification?
You are going to follow this general pattern with any job queue:
Collect a unit of work. In your case, it will be 1 of the images and any information about who that image belongs to, user id, etc.
Submit the work to the job queue with this information.
The job queue's worker process picks up the work and starts processing it. This is where I would create records in the database, as you can opt to not create them on job failure.
The job queue is going to track which jobs have completed and, usually, the completion status. If you are using Gearman, this is the gearmand process. You also need something to pick up work and process it; I will refer to this as the job worker. The job worker is where the concurrency happens, which is what I think you were referring to when you said "kick off a different php script." You can just kick off a PHP script at an interval (with supervisord or a cronjob) for a kind of poll-and-fork approach. It's not the most efficient approach, but it doesn't sound like it will really matter for your application's use case. You could also use pcntl_fork or pthreads in PHP to get more control over your concurrent processes and implement a worker pool pattern, but that is much more complicated than just firing off a script. If you are interested in trying to implement some concurrency in PHP, I have a proof-of-concept job worker for beanstalkd available on GitHub that implements a worker pool with both fork and pthreads. I have also included a couple of other resources on the subject of concurrency.
Job Worker (pthreads)
Job Worker (fork)
PHP Daemon Example
PHP IPC Example
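To make the worker pool idea a little more concrete, here is a small sketch in Python using only the standard library. It is not the linked beanstalkd worker and not a Gearman API: `analyze`, `worker`, and `run_pool` are hypothetical names, and an in-memory queue stands in for the job server. It shows one way to read the pattern, with the worker itself storing the result of its analysis.

```python
import queue
import threading

def analyze(path):
    """Hypothetical stand-in for the long-running analysis of one file."""
    return {"path": path, "name_length": len(path)}

def worker(jobs, results, lock):
    while True:
        path = jobs.get()
        if path is None:
            # Sentinel value: no more work for this worker.
            break
        params = analyze(path)
        with lock:
            # The worker stores the result itself. Here that is a dict;
            # in the scenario above it would be the database update.
            results[path] = params

def run_pool(paths, workers=3):
    jobs = queue.Queue()
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(jobs, results, lock))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for p in paths:
        jobs.put(p)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

Keeping the write inside the worker means no second script or callback has to be coordinated, and a failed analysis simply leaves the record untouched.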
I'm writing a web app in PHP + Laravel + MySQL.
In the system, a user can schedule emails (and other API calls) at arbitrary times (much like how you schedule posts in WordPress). I can use cron to inspect the database every 5 minutes or so to find emails that should be sent, send them, and update their status.
However, this is a SaaS app, so the number of emails to be sent at a particular time can grow rapidly. I can create a "lock file" every time the cron script runs so that only one instance of it is running at a time. The lock file will be deleted after the script finishes execution.
But with potentially large data, I would want a way to process multiple messages simultaneously, potentially using multiple "workers." Is there any existing solution to manage such a queue?
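For reference, the lock-file approach described in the question can be sketched as below. This is only an illustration in Python; the function names and the lock path are made up for the example. The key detail is that the file is created atomically, so two cron runs starting at the same moment cannot both believe they hold the lock.

```python
import os

def acquire_lock(path):
    """Return True if this process created the lock file, False if it exists."""
    try:
        # O_CREAT | O_EXCL makes creation atomic: it fails if the file exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record the owner for debugging
    return True

def release_lock(path):
    os.remove(path)

def run_once(path, task):
    """Run task() unless another instance already holds the lock."""
    if not acquire_lock(path):
        return False
    try:
        task()
    finally:
        # Remove the lock even if the task raised an exception.
        release_lock(path)
    return True
```

A process that is killed outright never reaches the `finally` block, so a stale lock file can remain. That weakness, plus the single-worker limit, is part of why a real queue tends to be the better fit here.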
Yes! Task/message/job queues are what you are looking for! They allow you to put various tasks in queues from which you can retrieve and process them. This process can scale horizontally, as each worker can pull a new task once it's finished with the previous one.
You should have the cron job run maybe every minute or two and only push the tasks and what needs to be done onto the queue. This will make sure the cron job is very quick.
Take a look at Iron.io. Here is an extract from the website which gives a nice overview of these kinds of systems:
An easy-to-use scalable task queue that gives cloud developers a
simple way to offload front-end tasks, run scheduled jobs, and process
tasks in the background and at scale.
Gearman is also a great solution that you can host yourself, and it is very simple. You can send the message from one language and use a different language to process it, say PHP -> C, etc.
The Wikipedia link will tell you everything you need to know, here is a quick excerpt:
Message queues provide an asynchronous communications protocol,
meaning that the sender and receiver of the message do not need to
interact with the message queue at the same time. Messages placed onto
the queue are stored until the recipient retrieves them.
I have a Gearman worker in PHP that processes background tasks from a client. From time to time I am not able to process a job. I need a way to retry that job after a delay of 5 minutes. How can I do that?
What I do now is call exit(255), but this retries the job immediately. Also, I do not know how to get the number of failures of that specific job (in the worker).
Questions:
How can I do the above in Gearmand?
Is there any other messaging system that is capable of this?
You can't. At least not using built-in capabilities. This feature is only partly implemented in Gearmand and the PHP module does not expose this functionality. See this discussion on the feature.
People have tried different things, including:
Use node.js and its timeout capabilities
Use at utility
Use another queuing system - beanstalkd
When it comes to tracking failures - again, you can't, AFAIK. See my answer on handling retries in Gearman for a possible solution.
It's not built in, but you can use a bit of memcached + TTL for that.
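Whichever tool ends up holding the state, the bookkeeping needed for a delayed retry is small: a "not before" time and an attempt counter per job. The sketch below shows that bookkeeping in plain Python. It is not a Gearman or memcached API; `DelayedRetryQueue` and its methods are hypothetical names, and the caller passes the current time in explicitly so the behavior is easy to follow.

```python
import heapq
import itertools

class DelayedRetryQueue:
    """In-memory queue where failed jobs become visible again after a delay."""

    def __init__(self, retry_delay=300, max_attempts=3):
        self.retry_delay = retry_delay     # seconds to wait before a retry
        self.max_attempts = max_attempts   # give up after this many failures
        self._heap = []                    # (ready_at, seq, job, attempts)
        self._seq = itertools.count()      # tie-breaker for equal ready_at

    def put(self, job, now, attempts=0):
        heapq.heappush(self._heap, (now, next(self._seq), job, attempts))

    def run_ready(self, handler, now):
        """Run every job whose ready time has passed; reschedule failures.

        Returns the jobs that were abandoned after max_attempts failures.
        """
        gave_up = []
        while self._heap and self._heap[0][0] <= now:
            _, _, job, attempts = heapq.heappop(self._heap)
            if handler(job):
                continue
            attempts += 1
            if attempts >= self.max_attempts:
                gave_up.append(job)
            else:
                heapq.heappush(
                    self._heap,
                    (now + self.retry_delay, next(self._seq), job, attempts),
                )
        return gave_up

    def pending(self):
        return len(self._heap)
```

With `retry_delay=300`, a job that fails at time 0 is not offered to the handler again until time 300, and the stored attempt count gives the worker the failure number the question asks about.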
I am handling a project which uses a message queue. The project is currently written in PHP, and there are long delays in sending messages and mail. So I am considering developing the message queue as a Perl or Python script instead. Could you please suggest which is best: PHP, Perl, or Python?
A possible solution could be to use Gearman as a queue :
Your PHP project would send messages to Gearman, as background jobs ; and finish
Gearman would dispatch those messages to workers
Workers will deal with the jobs -- doing the stuff that might take time
One additional advantage : the day you need several servers to handle a larger amount of jobs, you'll already have what's needed : Gearman will deal with load-balancing for you.
PHP is perfectly adequate to implement a simple message queue. So, if your current code is causing delays then it is because of your design, not because of some limitation with PHP. Switching to a different language isn't going to help you. Bad code is bad code regardless of language.
The best thing you can probably do is go with an existing message queue. Pascal recommended Gearman. I have worked with (and quite liked) Beanstalkd. If you need a metric ton of features, have a look at Apache ActiveMQ or RabbitMQ.
That said, if you insist on implementing your own message queue, I would suggest sticking with PHP. That way you can re-use code from your existing application (e.g. your models and database API).
Here are two alternatives to Gearman:
a. Beanstalkd
b. MemcacheQ
MemcacheQ
http://memcachedb.org/memcacheq/
Adding to and fetching from the queue has to be done manually in your code.
It is not a case of sending a job to the queue and having MemcacheQ execute jobs one by one.
But it is very, very fast.
Beanstalkd
http://kr.github.com/beanstalkd/download.html
It supports many languages.
I have developed a web application where students across the country come and register for some academic purpose. The users are expected to be around 100k within next year.
I need to send all of these people periodic mails. The web app is developed using CodeIgniter. The PHP script can run for 3000 seconds, but the app is still unable to send mails to more than 100 users.
The machine I run it on is in the cloud and has 256MB of RAM. I used the free -m command to check memory usage, but that doesn't seem to be the problem. Everything works fine for 10-20 mails.
What would be the best solutions? Is there any way I can transfer this job to some other app/program/shell script ?
If you cannot use some external service for your emails, I would just set up a cronjob that sends a couple of emails every n seconds. It's pretty cumbersome to send a lot of emails with PHP, as you have discovered. But the cronjob solution works every time, as far as I know.
So you have a list of emails/addresses and a cronjob that iterates that list and sends the emails.
Sure you can send the emails yourself from a server, but that is only half the battle.
If you are sending bulk emails, as opposed to the transactional type, it's best to use a third party service that is already whitelisted on mail servers. The primary reason being, you might get blacklisted by the major mail servers as a spammer. If this happens, you will have to work with them individually to get removed from the blacklists.
Also if you are operating in the United States you should be familiar with CAN-SPAM: http://business.ftc.gov/documents/bus61-can-spam-act-Compliance-Guide-for-Business
MailChimp is a viable candidate for this. Serving mail is time-consuming, and sending it to up to 100k email addresses will be an arduous task for your server.
They provide a very capable PHP API.
https://developer.mailchimp.com/
It is very appropriate to get this out of your web server threads and into something that runs standalone. Typically for stuff like this, I have tables in the DB where the appropriate information is written to from the web site, so when I am ready to e-mail, something on the backend can assemble the e-mails and send them out. If you are sending out 100,000 e-mails, you are going to want something multithreaded.
It might be good in this case to use one of the many off-the-shelf tools for this, rather than reinventing the wheel. We use an old version of Campaign Enterprise here, and I am able to throw queries at it which I can use to pull data from my web DB directly, over ODBC. That may or may not work well for you, considering you are in the cloud.
Edit: You can also write a PHP script to do this and call PHP from the shell. Perhaps you can get around your timeout limit this way? (This is assuming you are referring to some service-level timeout. If you are talking about the regular PHP timeout, this can be worked around with set_time_limit().)
You might be able to do this using pcntl_fork or by creating a daemon process.
Fork: By forking you could split the emails into batches and send them out. Each batch could run in its own forked child process.
Daemon: By using a daemon you could create a batch of emails and hand it to the daemon to be processed. A daemon could run multiple batches at once.
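The fork variant could look roughly like the following. It is written in Python with `os.fork` (POSIX only) as a stand-in for PHP's pcntl_fork, and `chunk`, `send_batch`, and `send_in_children` are hypothetical names. `send_batch` is a placeholder that only checks the addresses, so real delivery code would need to replace it.

```python
import os

def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def send_batch(batch):
    """Hypothetical stand-in for real delivery; returns True on success."""
    return all("@" in address for address in batch)

def send_in_children(addresses, batch_size):
    """Fork one child per batch and return each child's exit code."""
    pids = []
    for batch in chunk(addresses, batch_size):
        pid = os.fork()
        if pid == 0:
            # Child process: handle exactly one batch, then exit without
            # falling through into the parent's loop.
            try:
                ok = send_batch(batch)
            except Exception:
                ok = False
            os._exit(0 if ok else 1)
        pids.append(pid)
    codes = []
    for pid in pids:
        # Parent: wait for every child so none is left as a zombie.
        _, status = os.waitpid(pid, 0)
        codes.append(os.WEXITSTATUS(status))
    return codes
```

This version forks one child per batch with no upper bound, which is fine for a handful of batches. For 100k addresses a real script would cap the number of children running at once.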