We are currently developing a mobile app for iOS and Android. For this, we need a stable web service.
Requirements: based on PHP and MySQL; must be blazing fast; must be scalable.
I've created a simple custom-coded web service with multiple endpoints to allow passing data from the app to our database, and vice versa.
My Question:
Our average response time with my custom-coded solution is below 100ms (measured using New Relic) for normal requests (say, updating a DB field, or performing an INSERT INTO). This is without any load, however (below 100 users daily).

When we make outbound requests (specifically, sending e-mail using the SendGrid PHP framework) we see a response time of > 1000ms. It appears that the request is "waiting" for a response from SendGrid. Is it possible to tell the script not to "wait for a response"? This is not really ideal.

My idea was to store all "pending" requests in a separate table, then use a cron job to run through all "pending" requests and mark them as "completed". Is this a viable solution? And will one cron run each minute be enough for processing requests (a possible delay of 1 min for each e-mail)?
As always, any replies or suggestions are very appreciated. Thanks in advance!
To answer the first part of your question: yes, you can make asynchronous requests with PHP, and even ignore the service's response. However, as you correctly say, it's not a great solution.
Asynchronous Requests
This excellent blog post on PHP Asynchronous Requests by Segment.io comes to several conclusions:
You can open a socket and write to it, as described by this Stack Overflow topic (see the sketch after this list) - However, it seems that this is actually blocking and fairly slow (300ms in their tests).
You can write to a log file and then process it in another way (essentially a queue, like you describe) - However, this requires another process to read the log and process it. Using the file system can be slow, and shared files can cause all sorts of problems.
You can fork a cURL request - However, this means you aren't waiting for a response, so if SendGrid (or some other service) responds with an error, you can't catch it and react.
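For illustration, here's a minimal sketch of the open-a-socket approach from the first bullet (host, path and payload are placeholders); note that the connect and write still block briefly, which is why it measured around 300ms:

    <?php
    // Fire-and-forget HTTP request: open a socket, write the request,
    // then close without ever reading the response.
    $fp = fsockopen('api.example.com', 80, $errno, $errstr, 1); // 1s connect timeout
    if ($fp) {
        $body = http_build_query(array('to' => 'nick@example.com'));
        $out  = "POST /send HTTP/1.1\r\n";
        $out .= "Host: api.example.com\r\n";
        $out .= "Content-Type: application/x-www-form-urlencoded\r\n";
        $out .= "Content-Length: " . strlen($body) . "\r\n";
        $out .= "Connection: Close\r\n\r\n";
        $out .= $body;
        fwrite($fp, $out);
        fclose($fp); // we never learn whether the service errored
    }

The trade-off is exactly the one described above: any error response from the service is lost.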
Opinion Land
We're now entering semi-opinion land, but queues like the one you describe (a MySQL table with a cron job, a text file, or something else) tend to be very scalable, as you can throw workers at the queue if you need it to process faster. These can live outside your user-facing system (and therefore not share resources).
Queues
With a queue, you'd have a separate service responsible for sending an email with SendGrid (for example). It would pull tasks off a queue (e.g. "send an email to Nick") and then execute them.
There are several ways to implement a queue that you can process.
You can write your own - As you seem to want to stay on PHP/MySQL, if you do this you'll need to take into account a bunch of queueing problems and weird edge cases. However, you'll have absolute control, and for a simple application this may work; see the sketch after this list.
You can run a self-hosted task queue - Celery is meant to be a distributed task queue; ØMQ (ZeroMQ) and RabbitMQ can also be used as task queues. These are meant to be fast and distributed and have had a lot of thought put into them. You'd need to benchmark them in your system to see if they speed it up. It'd also mean you have to host additional pieces yourself. This, however, is likely to be the fastest solution from a communication standpoint.
You can hand things off to a hosted task queue - IronMQ and Amazon SQS are both solid hosted solutions, which means you wouldn't need to dedicate resources to them; additionally, with IronWorkers (for example) you could have the worker service taken care of as well. However, since you're trying to optimize a request to an external service, this probably isn't the solution in this scenario.
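To make the write-your-own option concrete, a minimal sketch of the MySQL-backed variant (table and column names are illustrative):

    -- Pending jobs wait here until a worker processes them
    CREATE TABLE email_queue (
        id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        recipient  VARCHAR(255) NOT NULL,
        subject    VARCHAR(255) NOT NULL,
        body       TEXT NOT NULL,
        status     ENUM('pending','sending','sent','failed') NOT NULL DEFAULT 'pending',
        created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
    );

    <?php
    // In the web request: enqueue and return immediately, instead of
    // waiting ~1000ms for SendGrid to answer.
    $pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
    $stmt = $pdo->prepare('INSERT INTO email_queue (recipient, subject, body) VALUES (?, ?, ?)');
    $stmt->execute(array('nick@example.com', 'Welcome!', 'Hello Nick...'));

A cron-driven worker then drains the table and makes the actual SendGrid calls.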
Queueing Emails
On the topic of queueing emails specifically, this is common practice among email senders. As with everything else, it buys you reliability (because if a service down the line fails, you can keep the message in the queue and retry).
With email, however, there are some specific services out there for queueing messages: SMTP servers. Theoretically you can set up a server like sendmail, set SendGrid as your "smarthost" (relay), and have the server forward mail to SendGrid. It then queues, deals with service interruptions, and sends mail with little additional code. However, SMTP servers are a pain to deal with, even if they're just forwarding messages. Additionally, SMTP is even slower than HTTP to establish a connection, and therefore probably not what you want, but it's good to know.
Another possible solution, if you control your own server environment, that will speed up both your email sending and your application is to install a mail server such as Postfix locally. You then configure Postfix to use your SendGrid credentials, so any email sent goes from your server to SendGrid.
This is not a PHP solution, but it removes the need to write your own custom solution. If you set Postfix as the default mail server, you can then just use the PHP mail() function to send email. The configuration looks roughly like the sketch below.
https://sendgrid.com/docs/Integrate/Mail_Servers/postfix.html
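For reference, the relay setup from that doc boils down to a few lines in Postfix's main.cf (check the linked page for the current, authoritative values):

    # /etc/postfix/main.cf -- relay all outgoing mail through SendGrid
    relayhost = [smtp.sendgrid.net]:587
    smtp_sasl_auth_enable = yes
    smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
    smtp_sasl_security_options = noanonymous
    smtp_tls_security_level = encrypt

    # /etc/postfix/sasl_passwd -- your SendGrid credentials (run postmap on it afterwards)
    [smtp.sendgrid.net]:587 yourusername:yourpassword

After that, PHP's mail() hands messages to the local Postfix queue, which retries on its own if SendGrid is briefly unreachable.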
Related
So a friend and I are building a web-based AJAX chat application with a jQuery and PHP core. Up to now, we've been using the standard procedure of polling the server every two seconds or so looking for updates. However, I've come to dislike this method, as it's neither fast nor "cost effective": there are tons of requests going back and forth to the server, even when no data is returned.
One of our project supporters recommended we look into a technique known as Comet, or more specifically, long polling. However, after reading about it in various articles and blog posts, I've found that it isn't all that practical when used with Apache servers. Most people just say "it isn't a good idea" without giving specifics on how many requests Apache can handle at one time.
The whole purpose of PureChat is to provide people with a chat that looks great, goes fast, and works on most servers. As such, I'm assuming that about 96% of our users will be using Apache, not Lighttpd or Nginx, which are supposedly better suited for long polling.
Getting to the Point:
In your opinion, is it better to continue using setInterval and repeatedly request new data? Or is it better to go with long polling, despite the fact that most users will be using Apache? Also, is it possible to get a more specific rundown of approximately how many people can be using the chat before an Apache server rolls over and dies?
As Andrew stated, a socket connection is the ultimate solution for asynchronous communication with a server, although only the most cutting-edge browsers support WebSockets at this point. socket.io is an open-source API you can use which will initiate a WebSocket connection if the browser supports it, but will fall back to a Flash alternative if it does not. This is transparent to the coder using the API, however.
Socket connections basically keep open communication between the browser and the server so that each can send messages to each other at any time. The socket server daemon would keep a list of connected subscribers, and when it receives a message from one of the subscribers, it can immediately send this message back out to all of the subscribers.
For socket connections, however, you need a socket server daemon running full time on your server. While this can be done with command-line PHP (no Apache needed), it is better suited to something like node.js, a non-blocking server-side JavaScript API.
node.js would also be better for what you are talking about, long polling. Basically, node.js is event-driven and single-threaded. This means you can keep many connections open without having to open as many threads, which would eat up tons of memory (Apache's problem). This allows for high availability.

What you have to keep in mind, however, is that even if you were using a non-blocking web server like Nginx, PHP has many blocking network calls. Since it runs on a single thread, each MySQL call, for instance, would basically halt the server until a response for that call is returned. Nothing else would get done while this is happening, making your non-blocking server useless. If, however, you used a non-blocking language like JavaScript (node.js) for your network calls, this would not be an issue. Instead of waiting for a response from MySQL, it would set a handler function to handle the response whenever it becomes available, allowing the server to handle other requests while it waits.
For long polling, you would basically send a request, and the server would wait up to 50 seconds before responding. It will respond sooner than 50 seconds if it has anything to report; otherwise it waits. If there is nothing to report after 50 seconds, it sends a response anyway so that the browser does not time out. The response triggers the browser to send another request, and the process starts over. This allows for fewer requests and snappier responses, but again, it's not as good as a socket connection.
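In PHP terms, a long-polling endpoint is roughly the following sketch (fetch_messages_since() is a hypothetical stand-in for your DB lookup):

    <?php
    // Long-polling endpoint: hold the request open for up to 50 seconds,
    // responding early if new chat messages arrive.
    set_time_limit(60);
    $lastId = isset($_GET['last_id']) ? (int) $_GET['last_id'] : 0;
    $start  = time();
    do {
        $messages = fetch_messages_since($lastId); // hypothetical DB check
        if (!empty($messages)) {
            break; // something to report: respond immediately
        }
        usleep(500000); // re-check every 0.5s instead of hammering the DB
    } while (time() - $start < 50);
    header('Content-Type: application/json');
    echo json_encode($messages); // empty array if the 50s elapsed quietly

Keep in mind that each held request occupies an Apache process or thread for those 50 seconds, which is precisely why Apache struggles with this pattern.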
I'm running an Amazon EC2 'Large' instance - Ubuntu Natty x64 with PHP 5 and MySQL. We execute a PHP script via cron - this sends an email list (2000-4000 emails) using SMTP/PHPMailer.
The server runs very slowly (several of these cron jobs run in parallel) and the CPU sits at 100%. Memory usage is low (only ~600 MB of 8 GB used) and each cron job takes a significant chunk of CPU, for example 20-30% each with 4-5 running in parallel.
Trying to pinpoint the issue, I enabled the slow query log in MySQL, but nothing caught my attention. How should I go about narrowing down the cause of this CPU usage? Is SMTP/email just that CPU intensive, or is it a sign of a programming or server issue? Thanks!
EDIT: The issue is resolved. There was a trivial (of course) bug that caused emails to 'grow' (some of the previous email's content was being injected into the next email), so the email pre-processing got more and more ridiculous with each subscriber. The resulting emails had hundreds or thousands of tracking images which all hit our server simultaneously when opened (i.e. 'display images' in Gmail). After fending off the self-inflicted DDoS attack and two days of no sleep, I am going to enjoy a bottle of Captain Morgan while contemplating various choices I've made in life.
Things that can cause this (non-exhaustive list):
Non-blocking I/O with the SMTP server (busy-waiting on the socket).
The implementation of the SMTP library used in PHP: long string manipulation and large files re-encoded on every loop iteration (remember: the protocol must be correctly formatted, and this is checked/encoded by many other methods every time you call the send method).
One (or more) queries per mail.
Try measuring the time spent on each operation performed inside the loop.
You can use a simple $start = microtime(true); and then printf(__FILE__ . ':' . __LINE__ . ": here after %0.8f seconds\n", microtime(true) - $start); written to a debug file, or use another profiling tool; see the sketch after this list.
Try to reduce protocol formatting/encoding time.
Don't allow more instances of PHP scripts running simultaneously than the number of cores on your machine.
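As a sketch of that measurement inside the send loop (build_body() stands in for whatever templating your script does):

    <?php
    // Time each phase of the loop separately to see where the CPU goes.
    $log = fopen('/tmp/mail_profile.log', 'a');
    foreach ($recipients as $recipient) {
        $start = microtime(true);
        $mail->Body = build_body($recipient); // hypothetical template step
        fprintf($log, "build: %0.8f\n", microtime(true) - $start);

        $start = microtime(true);
        $mail->send();                        // the PHPMailer send call
        fprintf($log, "send:  %0.8f\n", microtime(true) - $start);
    }
    fclose($log);

In this particular case, it would have exposed the ever-growing message bodies immediately.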
First, establish exactly which programs are taking 100% CPU.
If it's the PHP interpreter then there has to be something wrong in your code - an SMTP client should never reach 100% utilisation, because a lot of the time its throughput will be limited by the SMTP server.
It may not be limited just to PHP... Is the SMTP server you are connecting to on the local box? Are you running out of sockets? Are the requests blocking each other?
For things like what you are doing, a queue-based approach is usually best.
Have you thought about using a third-party service for sending mail, where all you do is send an HTTP API request? There are several benefits to this: most of these services have relationships set up with mail servers so that your emails actually reach inboxes, and your SMTP server does not get blacklisted as spammy. Amazon has a service that can do this, and so do others like Postmark.
Right now, we have a single server with a crontab that sends out daily emails. We would like to scale that server. The application is a standard Zend Framework application deployed on a CentOS server in the Amazon cloud.
We have already taken care of load balancing, content management and deployment. However, the cron job is still an issue for us, as we need to guarantee that some jobs are performed only once.
For example, the daily-emails cron job must be executed only once, by a single server. I'm looking for the best method to guarantee that only one server will execute it, exactly once.
I'm thinking about two solutions, but I was wondering if someone else has had the same issue.
Make one of the servers the "master", which alone sends out the daily emails. That will be an issue if the server malfunctions, and generally we don't want to have a "special" server. It would also mean we would need to keep track of which server is master.
Have a queue of scheduled tasks to be performed. Each server opens that queue and sees which tasks need to be performed. The first server to "grab" a task performs it and marks it as done. I was looking at Amazon Simple Queue Service as a solution for the queue.
Both these solutions have advantages and disadvantages, and I was wondering if someone has thought about something else that might help us here.
When you need to scale out cron jobs, you are better off using a job manager like Gearman.
Beanstalkd could also be an option for you.
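If you end up on SQS or even a plain MySQL table, the crucial bit is an atomic "grab" so that only one server runs the job. A sketch of the MySQL variant (scheduled_tasks and send_daily_emails() are illustrative names; it assumes a 'pending' row for today was inserted beforehand):

    <?php
    // Every server runs this at the scheduled time; only the one whose
    // UPDATE actually claims the row proceeds to send the emails.
    $pdo = new PDO('mysql:host=db;dbname=app', 'user', 'pass');
    $claimed = $pdo->exec(
        "UPDATE scheduled_tasks
            SET status = 'running'
          WHERE task = 'daily_emails'
            AND status = 'pending'
            AND run_date = CURDATE()"
    );
    if ($claimed === 1) {    // exactly one server sees affected-rows = 1
        send_daily_emails(); // hypothetical job body
        $pdo->exec("UPDATE scheduled_tasks
                       SET status = 'done'
                     WHERE task = 'daily_emails' AND run_date = CURDATE()");
    }

With SQS, the same guarantee comes from the message visibility timeout: the first server to receive the message hides it from the others while it works.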
I had the same problem. What I did was dead simple.
I spun up the cheapest EC2 instance on AWS.
I created the cronjob(s) only on this server.
The cron jobs just make a simple request to my endpoint / API (i.e. api.mydomain.com).
On my API, I just have a route watching for these special requests that runs the job I want. So basically, instead of running the task directly from a cron job, I'm running the task via an HTTP request.
I hope that makes sense! Now it doesn't matter how many servers you have; it will just scale. Also, your cron job server's only function is to run dead-simple jobs that send a request, nothing more. See the crontab sketch below.
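In crontab terms, the whole job server boils down to entries like this (URL and schedule are illustrative; you'd also want a shared secret on the route so outsiders can't trigger it):

    # Fire the daily-email job at 06:00; the load-balanced app servers do the real work.
    0 6 * * * curl -s https://api.mydomain.com/jobs/daily-emails > /dev/null 2>&1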
I have developed a web application where students across the country come and register for some academic purpose. The users are expected to be around 100k within the next year.
I need to send all of these people periodic mails. The web app is developed using CodeIgniter. The PHP script can run for 3000 seconds, but the app is still unable to send mails to more than 100 users.
The machine I run is in the cloud and has 256 MB of RAM. I used the free -m command to check memory usage, but that doesn't seem to be the problem. Everything works fine for 10-20 mails.
What would be the best solution? Is there any way I can transfer this job to some other app/program/shell script?
If you cannot use an external service for your emails, I would just set up a cron job that sends a couple of emails every n seconds. It's pretty cumbersome to send a lot of emails with PHP, as you have discovered, but the cron job solution works every time, as far as I know.
So you have a list of emails/addresses, and a cron job that iterates over that list and sends the emails.
Sure, you can send the emails yourself from a server, but that is only half the battle.
If you are sending bulk emails, as opposed to the transactional type, it's best to use a third-party service that is already whitelisted on mail servers. The primary reason: you might get blacklisted by the major mail providers as a spammer. If this happens, you will have to work with them individually to get removed from the blacklists.
Also if you are operating in the United States you should be familiar with CAN-SPAM: http://business.ftc.gov/documents/bus61-can-spam-act-Compliance-Guide-for-Business
MailChimp is a viable candidate for this. Sending mail is a time-consuming task, and sending to up to 100k email addresses will be an arduous task for your server.
They provide a very capable PHP API.
https://developer.mailchimp.com/
It is very appropriate to get this out of your web server threads and into something that runs standalone. Typically for stuff like this, I have tables in the DB where the appropriate information is written from the web site, so when I am ready to e-mail, something on the backend can assemble the e-mails and send them out. If you are sending 100,000 e-mails, you are going to want something multithreaded.
It might be good in this case to use one of the many off-the-shelf tools for this, rather than reinventing the wheel. We use an old version of Campaign Enterprise here, and I am able to throw queries at it which I can use to pull data from my web DB directly, over ODBC. That may or may not work well for you, considering you are in the cloud.
Edit: You can also write a PHP script to do this and call PHP from the shell. Perhaps you can get around your timeout limit this way? (This is assuming you are referring to some service-level timeout. If you are talking about the regular PHP timeout, it can be worked around with set_time_limit().)
You might be able to manage this using pcntl_fork or by creating a daemon process.
Fork: Using fork, you could batch the emails into groups and send them out; each batch could be in its own forked child process (see the sketch below).
Daemon: Using a daemon, you could create a batch of emails and hand them to the daemon for processing. A daemon could run multiple batches at once.
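A rough sketch of the fork approach (assumes the pcntl extension and CLI PHP; send_batch() is a hypothetical helper):

    <?php
    // Split the recipient list into batches and fork one child per batch.
    $batches = array_chunk($recipients, 500);
    $pids    = array();
    foreach ($batches as $batch) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            die("fork failed\n");
        } elseif ($pid === 0) {
            send_batch($batch); // hypothetical: child sends its 500 mails
            exit(0);            // child must exit, or it keeps looping and forks again
        }
        $pids[] = $pid;
    }
    // Parent reaps the children so none are left as zombies.
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $status);
    }

One classic gotcha: forked children inherit the parent's database and SMTP connections, so each child should open its own.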
I am creating a web application using Zend. Here I've built an interface from which user A can send email to one or more users. It works excellently, but it slows down execution, so user A waits too long for the acknowledgment response (which is shown after the emails have been sent).
In Java there are "threads", with which we can perform such a task (sending emails) without slowing the rest of the application.
Is there any technique in PHP/Zend, like threads in Java, by which we can split off tasks that could take a long time, e.g. sending emails?
EDIT (thanks @Efazati, there seems to be new development in this direction):
http://php.net/manual/en/book.pthreads.php
Caution (from the bottom of the linked page):
pthreads was, and is, an experiment with pretty good results. Any of its limitations or features may change at any time; [...]
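For reference, a minimal sketch of what the pthreads API looks like (assuming the extension is compiled in; experimental, per the caution above):

    <?php
    // Each Thread subclass runs its run() method on a separate thread.
    class MailJob extends Thread
    {
        private $recipient;

        public function __construct($recipient)
        {
            $this->recipient = $recipient;
        }

        public function run()
        {
            // Blocking work happens off the main thread.
            mail($this->recipient, 'Subject', 'Body...');
        }
    }

    $jobs = array();
    foreach ($recipients as $r) {
        $jobs[] = $job = new MailJob($r);
        $job->start();
    }
    foreach ($jobs as $job) {
        $job->join(); // wait for all sends to finish
    }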
/EDIT
No threads in PHP!
The workaround is to store jobs in a queue (say, rows in a table with the emails) and have a cron job call your PHP script at a given interval (say, 2 minutes) to poll for jobs. When jobs are present, fetch a few (depending on your PHP install's timeout) and send the emails.
The main idea is to defer execution:
the main script adds jobs to the queue
a cron script sends them in tiny slices
Gotchas (see the sketch after this list):
make sure you don't send an email without deleting it from the queue (the worst case would be a user receiving some spam at 2-minute intervals...)
make sure you don't delete a job without executing it first ...
handle bouncing emails using a score algorithm
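Putting those gotchas together, the cron script's loop might look like this sketch (a status column instead of immediate deletion, so a crash mid-send is detectable; email_queue and send_email() are illustrative names):

    <?php
    // Claim first, send second, mark done last: a job is never sent twice
    // and never removed before it has actually gone out.
    $pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
    $rows = $pdo->query(
        "SELECT id, recipient, subject, body
           FROM email_queue
          WHERE status = 'pending'
          LIMIT 20" // a slice small enough to finish within the interval
    )->fetchAll(PDO::FETCH_ASSOC);

    foreach ($rows as $row) {
        // Atomic claim: if another run grabbed this row first, skip it.
        $claim = $pdo->prepare(
            "UPDATE email_queue SET status = 'sending'
              WHERE id = ? AND status = 'pending'"
        );
        $claim->execute(array($row['id']));
        if ($claim->rowCount() !== 1) {
            continue;
        }

        $ok   = send_email($row['recipient'], $row['subject'], $row['body']); // hypothetical mailer wrapper
        $done = $pdo->prepare("UPDATE email_queue SET status = ? WHERE id = ?");
        $done->execute(array($ok ? 'sent' : 'failed', $row['id']));
    }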
You could look into using multiple processes, such as with fork. Communication between them wouldn't be as simple as with threads (but then, it won't come with all of threads' pitfalls either), and if you're just sending emails, it might not be necessary to communicate much, if at all.
Watch out for forking inside an Apache process; you may get behaviors you are not expecting. If you are looking to do any kind of asynchronous execution, it should be via some kind of queuing mechanism. Gearman is one; Zend Server Job Queue is another. I have some demo code at "Do you queue? Introduction to the Zend Server Job Queue". Cron can be used, but you'll have the problem of depending on your cron scheduler to run tasks, whereas asynchronous computing often needs to run immediately. Using a queuing system lets you do that without threading.
There is a Threading extension being developed based on PThreads that looks promising at https://github.com/krakjoe/pthreads
There is pcntl, which allows you to create sub-processes, but PHP doesn't work very well for this kind of architecture. You're probably better off creating a long-running script (a daemon) and spawning multiple instances of it.
As of now, PHP has no threads. However, you can have a look at this roundabout way:
http://www.alternateinterior.com/2007/05/multi-threading-strategies-in-php.html
You may want to use a queue system for your email sending and send the emails from another system which supports threads. PHP is just a tool, and you should use the tool best fitted for the job.
PHP doesn't include threading as part of the language; there are some methods that can emulate it, but they aren't foolproof.
This Google search shows a few potential workarounds.