When I set up a beanstalkd instance for a production environment, should I use a separate server instance (DigitalOcean) just as a queue server? That is, is it better to separate this service from the rest of the system running on a droplet?
Pay attention to the memory and expected throughput of the queue.
If you are above 10k ops/second, then you need to put it on a large dedicated instance; otherwise it's fine to run it on the same server.
Any time in the future you want to move it, you simply pause the system, move the binlog to the new server, and resume your service, and it will work.
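The move itself is just a matter of stopping the daemon, copying the binlog directory over, and starting beanstalkd against it on the new box. A rough sketch (the paths, host, and service name are assumptions):

# on the old server
sudo service beanstalkd stop
rsync -a /var/lib/beanstalkd/ newhost:/var/lib/beanstalkd/

# on the new server, replaying the binlog on startup
beanstalkd -l 0.0.0.0 -p 11300 -b /var/lib/beanstalkd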
I currently have a couple of sites set up on a server, and they all use beanstalkd for their queues. Some of these sites are staging sites. While I know it would be ideal to have the staging sites on another server, it doesn't make financial sense to spin up another server for them in this situation.
I recently ran into a very confusing issue while deploying a staging site, which included a reseeding of the database. I have an observer set up on some model saves that triggers a job to be queued, which usually ends up sending out an email. The staging site does not actually have any queue workers set up to run the jobs, but the production site (on the same server) does have queue workers running.
What appeared to be happening is that the staging site was generating the queue jobs, and the production site was running them! This caused random users of mine to be spammed with email: the job would serialize a model from staging, and when the worker unserialized it to run the job, it matched up with an actual production user.
It seems like it would be very common to have multiple sites on a server running queues, so I'm curious if there is a way to avoid this issue. Elasticsearch has the concept of a 'cluster', so you can run multiple search 'clusters' on one server. I'm curious if beanstalkd or redis or any other queue provider have this ability, so we don't have crosstalk between completely separate websites.
Thanks!
Beanstalkd has the concept of tubes:
Tubes are job queues.
A common use case of tubes would be to have completely different sets of producers and consumers running through a single beanstalk instance such that a given consumer will not know what to do with jobs produced by some of the producers. Producer1 can enqueue jobs into Tube1 and Consumer1 can pick up the jobs from there completely independently of what Producer2 and Consumer2 are doing with Tube2, for example.
For example, if you're using pheanstalk, the producer will call useTube():
$pheanstalk = new Pheanstalk('127.0.0.1');  // host of your beanstalkd instance
$pheanstalk->useTube('foo')->put(...);      // jobs go into the 'foo' tube
And the worker will call watch():
$pheanstalk = new Pheanstalk('127.0.0.1');
$pheanstalk->watch('foo')->ignore('default')->reserve();  // only pick up jobs from 'foo'
This is an old question, but have you tried running multiple beanstalkd daemons? Simply bind each one to a different port.
Example:
beanstalkd -p 11301 &
Use & to fork into background.
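Each site then points its client at its own daemon. With pheanstalk, for instance, the port is the second constructor argument (hosts and ports here just match the example above):

$staging    = new Pheanstalk('127.0.0.1', 11301);  // staging daemon on the extra port
$production = new Pheanstalk('127.0.0.1', 11300);  // production daemon on the default port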
We use nginx and php-fpm as our game server.
We want to make sure requests from one player are processed one by one, so that multi-threading bugs are greatly reduced in our game.
We do not know how to configure nginx that way.
Thank you.
As far as I know, the web server itself cannot be run in a single-threaded mode.
I think there is a solution to this problem. You need a queue to process a player's requests.
There are two options to create a thread-safe queue.
One is to write a PHP interface to a thread-safe queue application that resides in the server's memory. PHP can simply add requests to this thread-safe app, and the app can then run them in order.
Or you can simply store requests in a database (databases support simultaneous insertion) and then run a program that reads the requests from the DB and executes them in order.
However, this will add overhead to the execution process.
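A minimal sketch of the second option, assuming a player_requests table with an auto-increment id, a payload column, and a processed flag (all names made up; handleRequest() stands in for your game logic). A single worker reading in id order serializes all requests; running one worker per player, or sharding by player_id, would serialize per player without a global bottleneck:

<?php
$pdo = new PDO('mysql:host=127.0.0.1;dbname=game', 'user', 'pass');

while (true) {
    // oldest unprocessed request first; ordering by id serializes them
    $row = $pdo->query(
        "SELECT id, player_id, payload FROM player_requests
         WHERE processed = 0 ORDER BY id LIMIT 1"
    )->fetch(PDO::FETCH_ASSOC);

    if (!$row) { usleep(100000); continue; }  // nothing queued; wait 100 ms

    handleRequest($row['player_id'], $row['payload']);  // your game logic here

    $mark = $pdo->prepare('UPDATE player_requests SET processed = 1 WHERE id = ?');
    $mark->execute([$row['id']]);
}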
I'm looking to build a distributed video-encoding cluster of a few dozen machines. I've never worked with a message queue before, but the two I started playing around with were Gearman and Beanstalkd.
Beanstalk seems to be a lot simpler and easier to use than Gearman, but it's not as feature-rich.
One thing I don't understand is... how do you spawn new workers on all the servers? I plan to use PHP. Is it as simple as running worker.php from the CLI with "&" and just having it sit there waiting for work?
I noticed Gearman doesn't actually kill the process after a job is done, but Beanstalk does, so I have to restart the script after every job, on every server.
Currently I'm more inclined to use Beanstalk; the general flow I planned was:
Run a cron job every minute on each server that checks whether the predefined number of workers is running. If fewer are running than there should be, spawn new worker processes. Each job will take roughly 2-30 minutes.
Maybe I have a flaw in my logic here? Let me know what would be a "better" or "proper" way of doing this?
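Concretely, I imagine the cron entry calling something like this every minute (the worker path, log path, and target count are made up):

#!/bin/sh
# top up "php worker.php" processes to the target count
WANTED=4
RUNNING=$(pgrep -cf 'php worker.php')
while [ "$RUNNING" -lt "$WANTED" ]; do
    nohup php worker.php >> /var/log/worker.log 2>&1 &
    RUNNING=$((RUNNING + 1))
done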
Terminology I will use just to try and be clear...
There is the concept of a producer and a consumer. The producer generates jobs that are put on a queue (i.e. the beanstalk service) that is then read by a consumer.
There are multiple ways to write a consumer. You can either run the task every X time frame via a cron job, or just have a consumer running in a while(1) loop via PHP (or what have you).
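A minimal while(1) consumer, assuming the pheanstalk client; the tube name and doWork() are made up:

<?php
require 'vendor/autoload.php';
use Pheanstalk\Pheanstalk;

$pheanstalk = new Pheanstalk('127.0.0.1');
$pheanstalk->watch('encode')->ignore('default');

while (true) {
    $job = $pheanstalk->reserve();   // blocks until a job is available
    try {
        doWork($job->getData());     // your actual task
        $pheanstalk->delete($job);   // remove the job once it succeeds
    } catch (Exception $e) {
        $pheanstalk->bury($job);     // park failed jobs for later inspection
    }
}

This also addresses the "Beanstalk kills the process" concern above: the worker never exits, it just reserves the next job, so there is nothing to restart after each job.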
Where to install the service really depends on what you are going after. I normally install the service either on one of the consumers or on its own separate box (the latter is sometimes overkill, depending on your needs).
If you want durability on the queue side, then you should use Beanstalk's binlog parameter (-b <directory>). If something happens to your beanstalk service, this will allow you to restart with minimal loss of data in the queues (if not none at all). Durability on the producer side can come from having multiple queues to try against.
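Enabling it is just a startup flag pointing at a directory (the path is an assumption):

beanstalkd -b /var/lib/beanstalkd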
What is the best way to set up a web application to check free RAM on the server and keep users in a waiting queue until sufficient RAM is available again?
I think fetching the free RAM on the server would only be possible using exec(), right?
I want to enforce this in my web application, as it uses a lot of RAM, and during high traffic I don't want my server to grind to a halt.
Thanks.
You should separate the part of your code that handles web requests from the part that does the resource-intensive work. When you get the web request, put the job into a queue that separate processes pull jobs off of to do the work. You can have the user's page poll your server every X seconds with AJAX until their job has been processed, then update the page.
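A sketch of the enqueue side, assuming the pheanstalk client; the tube name, payload shape, and the /status route the page would poll are all made up for illustration:

<?php
require 'vendor/autoload.php';
use Pheanstalk\Pheanstalk;

// web request handler: hand the heavy work to the queue instead of doing it inline
$pheanstalk = new Pheanstalk('127.0.0.1');
$jobId = uniqid('job-', true);
$pheanstalk->useTube('heavy-work')->put(json_encode([
    'id'    => $jobId,
    'input' => $_POST['input'],
]));

// return the id so the page can poll /status?job=<id> via AJAX;
// the worker writes its result somewhere (DB, cache) that /status reads
echo json_encode(['job' => $jobId]);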
Right now we have a single server with a crontab that sends out daily emails. We would like to scale that server. The application is a standard Zend Framework application deployed on a CentOS server in the Amazon cloud.
We have already taken care of load balancing, content management, and deployment. However, the cron jobs are still an issue for us, as we need to guarantee that some jobs are performed only once.
For example, the daily email cron job must be executed exactly once, by a single server. I'm looking for the best method to guarantee that only one server will execute it, and only once.
I'm thinking about two solutions, but I was wondering if someone else has run into the same issue.
Make one of the servers the "master", which alone sends out the daily emails. That will be an issue if that server malfunctions, and generally we don't want to have a "special" server. It would also mean we need to keep track of which server is the master.
Have a queue of scheduled tasks to be performed. Each server opens that queue and sees which tasks need to be performed. The first server to "grab" a task performs it and marks it as done. I was looking at Amazon Simple Queue Service as a solution for the queue.
Both of these solutions have advantages and disadvantages, and I was wondering if someone has thought of something else that might help us here.
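For the second option, I imagine the "grab" being an atomic update, something like this (table and column names made up):

<?php
// claim today's daily-email task; only one server's UPDATE will match the row
$pdo = new PDO('mysql:host=db;dbname=app', 'user', 'pass');
$stmt = $pdo->prepare(
    "UPDATE scheduled_tasks SET status = 'running', claimed_by = ?
     WHERE task = 'daily_emails' AND run_date = CURDATE() AND status = 'pending'"
);
$stmt->execute([gethostname()]);

if ($stmt->rowCount() === 1) {
    sendDailyEmails();  // we won the claim; the other servers matched zero rows
}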
When you need to scale out cron jobs, you are better off using a job manager like Gearman.
Beanstalkd could also be an option for you.
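Either way, the property you want comes from the queue itself: a job can only be reserved by one worker at a time, so every server can run an identical worker and each job still executes exactly once. A sketch with pheanstalk (the tube name is made up, and you still have to make sure the job is enqueued only once, e.g. with a claim like the one sketched in the question):

$job = $pheanstalk->watch('daily-emails')->ignore('default')->reserve();
sendDailyEmails(json_decode($job->getData(), true));
$pheanstalk->delete($job);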
I had the same problem. What I did was dead simple.
I spun up the cheapest EC2 instance on AWS.
I created the cronjob(s) only on this server.
The cron jobs don't do any work themselves; each one just makes a simple request to my endpoint/API (i.e. api.mydomain.com).
On my API, I have a route watching for these special requests that runs the job I want. So basically, instead of running the task directly from a cron job, I'm triggering the task via an HTTP request.
I hope that makes sense! Now it doesn't matter how many servers you have; it will just scale. Also, your cron job server's only function is to run dead-simple jobs that send a request, nothing more.
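On the cron box, the crontab entries then reduce to plain HTTP calls; the route path and token are made up here, and api.mydomain.com is from above:

# crontab on the single cheap instance: 8:00 every day
0 8 * * * curl -fsS "https://api.mydomain.com/cron/daily-emails?token=SECRET" > /dev/null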