Running Fedora, PHP/Gearman/MySQL/Drizzle.
Built Gearman/Drizzle from source, and have the process running on a linux/fedora box. I created the mysql test table, and can see that the Gearman Daemon instance can access/interface with the mysql service. I'm running the Gearman and mysql processes on the same box, using TCP.
When Gearman is started, and points to the MySQL account, I can see the initial select statements in the DEBUG information that's displayed as the Gearman process runs.
However, I'm not sure what I need to do to actually test that a job from the Client is stored in the mysql Table.
I created a test client that replicates the Gearman Client/Worker "Review" test (which normally works when the worker is running) and ran the client without the worker. I see in the DEBUG output that the client connects to the Gearman daemon, but when I examine the mysql table, nothing is in it.
So my question really boils down to determining what I need to do to actually be able to see/ensure that jobs/data is really written to the actual mysql table.
Is there a given flag, method to call somewhere to establish that data is to be stored in the mysql table if not processed by the worker? Shouldn't a job be stored in the table, and then removed once it's processed? Or am I missing something in the flow?
http://gearman.org/manual/job_server/#persistent_queues
I guess what you are looking for is:
gearmand --queue-type=mysql --mysql-host=hostname
Note that only background jobs are written to the persistent queue; foreground jobs are never stored. So make sure your test client submits the job as a background job (SUBMIT_JOB_BG, e.g. GearmanClient::doBackground() in PHP), which is likely why your "Review" client left the table empty.
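A rough sketch of a full invocation and a quick way to verify persistence. Option names vary between gearmand versions and builds, so treat these flags as assumptions and check `gearmand --help` on your build:

```shell
# Start gearmand with the MySQL persistent queue (flag names assumed).
gearmand --verbose DEBUG \
         --queue-type=MySQL \
         --mysql-host=127.0.0.1 --mysql-port=3306 \
         --mysql-user=gearman --mysql-password=secret \
         --mysql-db=gearman --mysql-table=gearman_queue

# Submit a *background* job with no worker running, then inspect the table:
gearman -b -f reverse "test payload"
mysql -u gearman -p gearman -e 'SELECT unique_key, function_name FROM gearman_queue;'
```

The row should disappear once a worker connects and completes the job.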
Related
First of all, I'm using pthreads. The scenario is this: there are game servers that send logs over UDP to an IP and port you give them. I'm building an application that will receive those logs, process them, and insert them into a MySQL database. Since the number of servers will never go over 20-30, I'm using blocking sockets, and I'm thinking I will create a thread for each socket that will receive and process logs for that socket. All the MySQL information that needs to be inserted into the database will be sent to a Redis queue, where it will be processed by another PHP process. Is this OK? Better yet, is it reliable?
Don't use php for long-running processes (such as the script that does your database inserts). The language is designed for web requests, which die after a couple of milliseconds or at most seconds. You will run into memory problems all the time.
I need to run automated tasks every 15 minutes. The task is for my server (call it Server A, Ubuntu 10.04 LAMP) to query another server (Server B) for updates.
I have multiple users to query for, possibly 14 (or more) users. As of now, the scripts are written in PHP. They do the following:
request Server B for updates on users
if Server B says there are updates, Server A retrieves the updates
in Server A, update the DB with the new data for those users
run calculations on Server A
send prompts to the users whose data was updated
I know cron jobs might be the way to go, but there could be a scenario where I might have a cron job for every user. Is that reasonable? Or should I force it to be one cron job querying data for all my users?
Also, the server I'm querying has a java api that I can use to query it. Which means I could develop a java servlet to do the same. I've had trouble with this approach but I'm looking for feedback if this is the way to go. I'm not familiar with Tomcat and I don't fully understand it yet.
Summary: I need my server to run tasks automatically every 15 minutes; the tasks request data from another server, update its DB, and then send prompts to users. What are the recommended approaches?
Thanks for your help!
Create a single script, triggered by cron, that loops through each user and performs all the steps for each one. Pseudo code:
query for list of users from local DB;
foreach (users as user) {
    check for updates;
    if (updates) {
        update db;
        email user;
    }
}
If you have a lot of users or the API is slow, you'll either want to set a long script timeout (via ini_set), or you could add a TIMESTAMP column "LastUpdateCheck" to the DB and run the cron more often (every 30 seconds?), limiting the update/API query to one or two users per run (those with the oldest check times).
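A minimal sketch of that "LastUpdateCheck" idea. It uses an in-memory SQLite table in place of the real MySQL one, and the table/column names are only illustrative:

```php
<?php
// Stand-in for the real MySQL users table.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, LastUpdateCheck INTEGER)');
$db->exec("INSERT INTO users (name, LastUpdateCheck) VALUES
           ('alice', 100), ('bob', 50), ('carol', 75)");

// Each cron run only touches the one or two users checked longest ago.
$stale = $db->query('SELECT id, name FROM users ORDER BY LastUpdateCheck ASC LIMIT 2')
            ->fetchAll(PDO::FETCH_ASSOC);

$mark = $db->prepare('UPDATE users SET LastUpdateCheck = :now WHERE id = :id');
foreach ($stale as $user) {
    // ... query Server B for this user, update local data, email the user ...
    $mark->execute([':now' => time(), ':id' => $user['id']]);
}
```

With the seed data above, 'bob' and 'carol' are picked first; the next run picks up 'alice'.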
I'm looking for better solution to handling our cron tasks in a load balanced environment.
Currently have:
PHP application running on 3 CentOS servers behind a load balancer.
Tasks that need to be run periodically but only on a single machine at a time.
Good old cron set up to run those tasks on the first server.
Problems if the first server is out of play for whatever reason.
Looking for:
Something more robust and de-centralized.
Load balancing the tasks, so each task runs only once, but on random/different servers to spread the load.
Ensuring the tasks still run when the first server goes down.
Being able to manage tasks and see aggregate reports ideally using a web interface.
Notifications if anything goes wrong.
The solution doesn't need to be implemented in PHP but it would be nice as it would allow us to easily tweak it if needed.
I have found two projects that look promising: GNUBatch and Job Scheduler. I will most likely test both further, but I wonder if someone has a better solution for the above.
Thanks.
You can use this small library that uses redis to create a temporary timed lock:
https://github.com/AlexDisler/MutexLock
The servers should be identical and have the same cron configuration. The server that will be first to create the lock will also execute the task. The other servers will see the lock and exit without executing anything.
For example, in the php file that executes the scheduled task:
MutexLock\Lock::init([
    'host' => $redisHost,
    'port' => $redisPort
]);

// check if a lock was already created;
// if it was, it means that another server is already executing this task
if (!MutexLock\Lock::set($lockKeyName, $lockTimeInSeconds)) {
    return;
}

// if no lock was created, execute the scheduled task
scheduledTaskThatRunsOnlyOnce();
To run the tasks in a de-centralized way and spread the load, take a look at: https://github.com/chrisboulton/php-resque
It's a php port of the Ruby version of Resque, and it stores the data in the exact same format, so you can use https://github.com/resque/resque-web or http://resqueboard.kamisama.me/ to monitor the workers and see reports.
Assuming you have a database available not hosted on one of those 3 servers;
Write a "wrapper" script that goes in cron and takes the program you're running as its argument. The very first thing it does is connect to the remote database and check when the last entry was inserted into a table (created for this wrapper). If the last insertion is older than the job's scheduled interval, it inserts a new record into the table with the current time and executes the wrapper's argument (your cron job).
Cron up the wrapper on each server, each set X minutes behind the other (server A runs at the top of the hour, server B runs at 5 minutes, C at 10 minutes, etc).
The first server will always execute the cron first, so the other two servers never will. If the first server goes down, the second server will see that the job hasn't run, and will run it.
If you also record in the table which server it was that executed the job, you'll have a log of when/where the script was executed.
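A sketch of that wrapper logic in PHP, with an in-memory SQLite table standing in for the shared remote database (the cron_log table name and the 15-minute interval are assumptions):

```php
<?php
// Shared "remote" DB simulated with in-memory SQLite for this sketch.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE cron_log (job TEXT, server TEXT, ran_at INTEGER)');

// Returns true if this server claimed the job, false if another already ran it.
function runIfDue(PDO $db, string $job, string $server, int $intervalSec): bool {
    $stmt = $db->prepare('SELECT MAX(ran_at) FROM cron_log WHERE job = :job');
    $stmt->execute([':job' => $job]);
    $last = $stmt->fetchColumn();
    if ($last !== null && time() - (int)$last < $intervalSec) {
        return false; // a staggered cron on another server already ran it
    }
    // Record when/where the job ran, then execute the wrapped command.
    $db->prepare('INSERT INTO cron_log (job, server, ran_at) VALUES (:j, :s, :t)')
       ->execute([':j' => $job, ':s' => $server, ':t' => time()]);
    // ... here the real wrapper would exec() its command-line argument ...
    return true;
}
```

Server A's cron fires first and claims the job; when B and C fire 5 and 10 minutes later, the fresh row makes them skip it.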
Wouldn't this be an ideal situation for using a message / task queue?
I ran into the same problem but came up with this little repository:
https://github.com/incapption/LoadBalancedCronTask
I know there are Comet server technologies that do this but I want to write something simple and home-grown.
When a record is inserted into a MySQL table, I want it to somehow communicate this data to a series of long-polled Apache connections using PHP (or whatever). So multiple people are "listening" through their browser and the second the MySQL INSERT happens, it is sent to their browser and executed.
The easy way is to have the PHP script poll the MySQL database, but that isn't really pushing from the server, and it introduces an unacceptable number of unnecessary database queries. I want to get the data from MySQL to the long-polling connections essentially without the listeners querying at all.
Any ideas on how to implement this?
I have been trying all kinds of ideas for a solution to this as well, and the only way I found to take the polling off the SQL server is to poll a flag file instead. If the file contains 0, keep looping; if it contains 1, run the SQL queries, send the results to the users, and reset the flag. It does add another level of complexity, but it means less work for MySQL (though the same for Apache or whatever looping daemon you use). You could also send the command to a daemon "comet"-style, but from what I have seen of how sockets work, it is going to fork and loop on each request as well, so hopefully someone will find a better solution to this.
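The flag-file idea sketched in PHP (the file name and the query step are placeholders, and the loop is bounded here only so the sketch terminates):

```php
<?php
$flag = sys_get_temp_dir() . '/insert_flag';
file_put_contents($flag, '0'); // the INSERTing code flips this to '1'

$payload = null;
for ($i = 0; $i < 3; $i++) {
    if (trim(file_get_contents($flag)) === '1') {
        // ... only now run the SQL query and push results to listeners ...
        $payload = 'new rows';
        file_put_contents($flag, '0'); // reset the flag
        break;
    }
    if ($i === 0) {
        file_put_contents($flag, '1'); // simulate the writer doing an INSERT
    }
    usleep(100000); // check a file every 100 ms instead of querying MySQL
}
unlink($flag);
```
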
This is something I have been looking for as well for many years. I have not found any functionality where the SQL server pushes out a message on INSERTS, DELETE and UPDATES.
TRIGGERS can run SQL on these events, but that is of no use here.
I guess you have to construct your own system. You can easily broadcast a UDP from PHP (Example in first comment), the problem is that PHP is running on the server side and the clients are static.
My guess would be that you could do a Java Applet running on the client, listening for the UDP message and then trigger an update of the page.
This was only some thoughts in the moment of writing...
MySQL probably isn't the right tool for this problem. Regilero suggested switching your DB, but an easier solution might be to use something like redis which has a pub/sub feature.
http://redis.io/topics/pubsub
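For example, the code path that does the MySQL INSERT also publishes a message, and every long-poll worker sits blocked on SUBSCRIBE (the channel name is illustrative, and this requires a running Redis server):

```shell
# Each listener process blocks here until something is published:
redis-cli SUBSCRIBE table_inserts

# Right after the MySQL INSERT, the writer publishes the new row:
redis-cli PUBLISH table_inserts '{"id": 42, "action": "insert"}'
```
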
I have a simple messaging queue set up and running using the Zend_Queue object hierarchy. I'm using a Zend_Queue_Adapter_Db back-end. I'm interested in using this as a job queue, to schedule things for processing at a later time. They're jobs that don't need to happen immediately, but should happen sooner rather than later.
Is there a best-practices/standard way to set up your infrastructure to run jobs? I understand the code for receiving a message from the queue, but what's not so clear to me is how to run the program that does the receiving. A cron that receives n messages on the command line, run once a minute? A cron that fires off multiple web requests, each web request running the receiver script? Something else?
Tangential bonus question. If I'm running other queries with Zend_Db, will the message queue queries be considered part of that transaction?
You can do it like a thread pool. Create a command line php script to handle the receiving. It should be started by a shell script that automatically restarts the process if it dies. The shell script should not start the process if it is already running (use a $pid.running file or similar). Have cron run several of these every 1-10 minutes. That should handle the receiving nicely.
I wouldn't have the cron fire a web request unless your cron is on another server for some strange reason.
Another way to use this would be to have some background process creating data, and web user(s) consume it as they naturally browse the site. A report generator might work this way. Company-wide reports are available to all users, but you don't want them all generating this db/time-intensive report. So you create a queue and process one at a time, possibly removing duplicates. All users can view the report(s) when ready.
According to the docs it doesn't look like Zend_Queue's DB adapter even uses the same connection as your other Zend_Db queries, so the queue operations would not be part of your transaction. But of course the best way to find out is to make a simple test.
EDIT
The multiple lines in the cron are for concurrency; each line represents a worker in the pool. I was not clear earlier: you don't want the pid as the identifier, you want to pass a process name as a parameter.
/home/byron/run_queue.sh Process1
/home/byron/run_queue.sh Process2
/home/byron/run_queue.sh Process3
The bash script would check for the $process.running file; if it finds it, exit.
Otherwise:
Create the $process.running file.
Start the php process. Block/wait until it finishes.
Delete the $process.running file.
This allows the php script to die without causing the pool to lose a worker.
If the queue is empty the php script exits immediately and is started again by the next invocation of cron.
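A sketch of that wrapper script (file locations are illustrative, and the `sleep 0` stands in for the real `php run_queue.php` call):

```shell
#!/bin/sh
# Usage: run_queue.sh Process1
process="${1:-Process1}"
lockfile="/tmp/${process}.running"

# Another copy of this worker is still running; leave it alone.
if [ -f "$lockfile" ]; then
    exit 0
fi

touch "$lockfile"
# Block until the receiver exits (really: php run_queue.php "$process").
sleep 0
rm -f "$lockfile"
```

Note the test-then-touch pair is not atomic; using `mkdir "$lockfile"` as the lock is a common, more robust variant.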