I am using zeromq to solve a problem in which several hundred (potentially thousands of) clients request tasks to be carried out. Each client would request a specific task, and the result(s), when completed, would be returned to the client that issued the request.
These are the actors that I have identified so far, in the pattern I have come up with:
Client: this is the actor that requests a unit of work (or 'job') to be carried out
Controller: this is the actor that load-balances the 'jobs' across available engines
Engine: this is the actor that receives a job request from the controller and publishes the result back to the client.
I haven't yet worked out how the engine gets the message back to the client. I am guessing that one way to implement this using zeromq would be:
Client:
PUSH job messages to the Controller on one socket
SUBscribe, on another socket, to completed results PUBlished by the Engine
Controller:
PULL job messages from clients on one socket
PUBlish job messages to engines on another socket (clearly, this will be a forwarding device)
Engine:
SUBscribe to job messages on one socket
PUBlish results to another socket
It would be most helpful if someone could provide a skeleton/snippet showing the outline of how this pattern may be implemented using the zeromq framework.
The code snippet can be in C, C++, PHP, Python or C#
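For what it's worth, here is a minimal pyzmq sketch of the PUSH/PULL + PUB/SUB layout described above, collapsed into one process with threads for brevity. The port numbers, the "square" task, and the sleep-based slow-joiner guards are all illustrative assumptions, not a definitive implementation:

```python
import threading
import time

import zmq

ctx = zmq.Context.instance()

def controller():
    # PULL jobs from clients on one socket, PUBlish them to engines on another
    pull = ctx.socket(zmq.PULL)
    pull.bind("tcp://127.0.0.1:5551")
    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://127.0.0.1:5552")
    while True:
        pub.send_json(pull.recv_json())

def engine():
    # SUBscribe to jobs, PUBlish results; clients filter on their own id
    sub = ctx.socket(zmq.SUB)
    sub.connect("tcp://127.0.0.1:5552")
    sub.setsockopt_string(zmq.SUBSCRIBE, "")
    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://127.0.0.1:5553")
    while True:
        job = sub.recv_json()
        pub.send_json({"client": job["client"], "result": job["arg"] ** 2})

def client(client_id, arg):
    push = ctx.socket(zmq.PUSH)
    push.connect("tcp://127.0.0.1:5551")
    sub = ctx.socket(zmq.SUB)
    sub.connect("tcp://127.0.0.1:5553")
    sub.setsockopt_string(zmq.SUBSCRIBE, "")
    time.sleep(0.3)   # crude slow-joiner guard: let subscriptions settle
    push.send_json({"client": client_id, "task": "square", "arg": arg})
    while True:
        reply = sub.recv_json()
        if reply["client"] == client_id:   # discard results meant for others
            return reply["result"]

threading.Thread(target=controller, daemon=True).start()
threading.Thread(target=engine, daemon=True).start()
time.sleep(0.2)       # let the forwarding sockets bind before clients connect
result = client("c1", 7)
```

Note the weakness this exposes: every client SUBscribes to every engine's results and must filter by id, which is why a ROUTER/DEALER topology (as in the answer below) scales better.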
[[Edit]]
After reading up on Task Farms (as suggested by akappa), I think this problem can indeed be modelled as a Task Farm. I have modified my original actors accordingly (and changed the title too).
It would still be very useful if someone familiar with zeromq could sketch out a skeleton showing how I can use the core components to build such a framework.
There are a variety of approaches to this, and IPython.parallel includes two such implementations with ZeroMQ - one simple and pure-zmq, and another that is more elaborate, with the Controller implemented in Python.
We split the Controller into two actors:
Hub - an out-of-the-way process that sees all traffic, keeps track of the state of the cluster, pushes results to a database, notifies clients about engine connect/disconnect, and so on.
Scheduler - at its core, a simple ROUTER-DEALER device that forwards requests from the client(s) to the engines, and the replies back up.
Looking at just the task-farming part of our topology:
Scheduler is a 0MQ Queue device, with a ROUTER and DEALER socket, both of which bind.
Clients have DEALER sockets, connected to the Scheduler's ROUTER
Engines have ROUTER sockets connected to the Scheduler's DEALER
Which makes use of these two properties:
DEALERs LRU load-balance requests across peers
ROUTERs use identity prefixes to send replies back to the peer that made a particular request.
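These two properties can be seen in a compact pyzmq sketch of the queue device (ports, names, and the doubling workload are assumptions; the engines are shown with REP rather than ROUTER for brevity, which keeps the same envelope flow):

```python
import json
import threading
import time

import zmq

ctx = zmq.Context.instance()
FRONTEND = "tcp://127.0.0.1:5601"   # clients connect here (ROUTER)
BACKEND = "tcp://127.0.0.1:5602"    # engines connect here (DEALER)

def scheduler():
    # The queue device: ROUTER faces clients, DEALER faces engines; both bind
    router = ctx.socket(zmq.ROUTER)
    router.bind(FRONTEND)
    dealer = ctx.socket(zmq.DEALER)
    dealer.bind(BACKEND)
    zmq.proxy(router, dealer)   # forwards both ways, preserving identity envelopes

def engine(name):
    # REP preserves the routing envelope automatically, so replies find
    # their way back to the requesting client through the ROUTER
    rep = ctx.socket(zmq.REP)
    rep.connect(BACKEND)
    while True:
        req = json.loads(rep.recv())
        rep.send_string(json.dumps({"engine": name, "answer": req["x"] * 2}))

def request(x):
    dealer = ctx.socket(zmq.DEALER)
    dealer.connect(FRONTEND)
    # DEALER must add the empty delimiter frame that REQ would add for us
    dealer.send_multipart([b"", json.dumps({"x": x}).encode()])
    _empty, body = dealer.recv_multipart()
    return json.loads(body)["answer"]

threading.Thread(target=scheduler, daemon=True).start()
for name in ("e0", "e1"):
    threading.Thread(target=engine, args=(name,), daemon=True).start()
time.sleep(0.2)   # let engines connect before sending requests
answers = [request(x) for x in range(5)]
```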
A toy load-balanced task farm with pyzmq, which routes replies back up to the requesting client: https://gist.github.com/1358832
An alternative, where the results go somewhere, but not back up to the requesting client, is the Ventilator-Sink pattern in the 0MQ Guide.
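A minimal Ventilator-Sink sketch in pyzmq, loosely following that Guide pattern (ports and the squaring workload are made up for illustration); note that results flow to the sink rather than back to the requester:

```python
import threading
import time

import zmq

ctx = zmq.Context.instance()
VENT = "tcp://127.0.0.1:5701"   # ventilator PUSHes tasks here
SINK = "tcp://127.0.0.1:5702"   # workers PUSH results here

def worker():
    pull = ctx.socket(zmq.PULL)
    pull.connect(VENT)
    push = ctx.socket(zmq.PUSH)
    push.connect(SINK)
    while True:
        n = int(pull.recv_string())
        push.send_string(str(n * n))    # result goes to the sink, not the requester

def run():
    vent = ctx.socket(zmq.PUSH)
    vent.bind(VENT)
    sink = ctx.socket(zmq.PULL)
    sink.bind(SINK)
    for _ in range(2):
        threading.Thread(target=worker, daemon=True).start()
    time.sleep(0.3)                     # let both workers connect for a fair spread
    for n in range(10):
        vent.send_string(str(n))
    # collect all results at the sink; order depends on scheduling, so sort
    return sorted(int(sink.recv_string()) for _ in range(10))

results = run()
```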
This is a classical master/slave parallel pattern (also known as "Farm" or "Task Farm").
There are a billion ways to implement it.
Here is a way to implement it using MPI; maybe it can be inspirational for implementing it in zeromq.
Asking for architectural advice on how to design a system with a public API for user-client communication.
I'm working on a project where two clients must be able to communicate in real time (or close to it) with each other in the simplest way possible. Let's introduce a resource that has to be accessed by two separate clients. The workflow is the following:
1. Client #1 connects to the server and creates the resource
2. Client #2 connects to the server and accesses the resource
3. Client #1 changes the resource
4. Client #2 changes the resource
5. Repeat steps 3 and 4 until done.
A client cannot act until the opposing client has acted; request order must be preserved.
Clients should be able to access the resource via REST API (GET, POST, PUT, DELETE). Each client must wait until the opposing client performs an action. Time for the client to respond and perform an action is about 1-2 seconds (can slightly differ).
Please note that system should be able to handle a high load of concurrent requests (multiple clients communicating at the same time).
The global goal of the application is to provide an API through which clients programmed in multiple different languages can communicate in real time without any polling implementation on the user-client side. User clients must be as simple as possible.
Pseudo user-client example
response = init();
while (response->pending) {
    response = get();
}
while (response->action_required) {
    response = act();
    if (response->error || response->timeout) {
        response = get();
    }
}

function init() {
    // POST resource.example.com
}

function act() {
    // PUT resource.example.com
}

function get() {
    // GET resource.example.com
}
The problem statement
Since each client must wait for the opposing client to act, there is a need to introduce a sleep() in the code to delay the response until the resource is affected/changed by the opposing client.
The request polling must be omitted from the user-client and implemented in server side.
Current thoughts and proposal
The initial thought was to implement only the PHP backend and perform the response delay inside the API function; however, this implementation seems to cause severe performance issues, so I'm thinking about a more sophisticated solution. Or maybe I am wrong, and the response delay can successfully be implemented with sleep() inside the PHP backend?
Proposed system architecture
Node WebSocket server (socket.io to receive/return events)
PHP backend with REST API (access/change the resource, fire events to WebSocket)
Node JS application with public API for the end-user client (response delay functionality until the event received)
Please note that PHP backend cannot be replaced in this architecture, however, WebSocket and Node JS application are flexible units for the implementation.
Would this kind of architecture be implementable without severe server performance issues? Is there a better, more feasible way to design this kind of system? Is a Node JS application able to handle multiple concurrent requests with a response delay, or would another kind of web application (Python/Ruby/...) serve better? Is a socket a must-have for this system in order to achieve somewhat real-time behaviour?
Please, share any ideas/insights/suggestions/... what could help to design this system in a sophisticated and well-performing manner.
Thank you in advance!
Some notes:
Avoid Sleep at all costs.
Your use case tends to lend itself to a pub/sub micro-services pattern.
As you need to preserve message-processing order, you need a common queue. Each of your REST API nodes acts as a publisher onto a distributed message queue (RabbitMQ, Kafka, or similar tech), so for high throughput you now have a farm of machines handling the enqueue. They return immediately with a 202 Accepted, but need to mark the message with some kind of client identifier so you can route update messages back over the web socket (if you aren't going to poll for status updates by resource id).
You need subscribers to this queue to do the actual processing. Same thing: have these as separate applications, and now you can scale out the dequeue and processing. However, the tech you choose for the pub/sub bus needs to be able to invalidate subsequent messages for a resource and, for each invalidated message, provide feedback to your application so that it can send the required message over the web socket.
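A broker-free sketch of that flow, with a stdlib queue and a plain dict standing in for RabbitMQ/Kafka and the web-socket channel (all names here are hypothetical): the REST handler returns immediately, while a worker processes messages in arrival order and routes each result back by client id.

```python
import queue
import threading

jobs = queue.Queue()    # stands in for the RabbitMQ/Kafka queue
notifications = {}      # client_id -> updates; stands in for the web-socket channel

def handle_put(client_id, resource_id, payload):
    # The REST node only enqueues and returns immediately (HTTP 202 Accepted),
    # tagging the message with the client id so the reply can be routed back.
    jobs.put({"client_id": client_id, "resource_id": resource_id, "payload": payload})
    return 202

def worker():
    # The subscriber dequeues in arrival order (preserving request order),
    # applies the change, then notifies the originating client.
    while True:
        msg = jobs.get()
        # ... apply msg["payload"] to the resource msg["resource_id"] here ...
        notifications.setdefault(msg["client_id"], []).append(msg["payload"])
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()
status = handle_put("alice", "game-42", {"move": "e4"})
handle_put("bob", "game-42", {"move": "e5"})
jobs.join()   # in production the worker runs forever; join is just for the demo
```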
Hope this helps.
What approach and mechanisms (and probably code) should one apply to fully implement Model-to-Views data updates (transfer) on a Model-State-Change event with pure PHP?
If I'm not mistaken, the MVC pattern states an implicit requirement for data to be sent from the Model layer to all active Views, specifying that the "View is updated on Model change". (Otherwise it doesn't make much sense: users working with the same source would see stale data, disconnected from reality.)
But PHP is a scripting language, so it's limited to "connection threads" via processes, and its lifetime is limited to the request-response cycle (as tereško kindly noted).
Thus, one has to solve a couple of issues:
Client must have a live tunnel connection to server (Server Sent Events),
Server must be able to push data to client (flush(), ob_flush()),
Model-State-Change event must be raised & related data packed for transfer,
(?) Data must be sent to all active clients (connected to the same resource/URL) together, not just the one currently working with its own process and instance of the ModelClass.php file...
UPDATE 1: So, it seems that "simultaneous" interaction with multiple users in PHP involves implementing a web server over sockets of some sort, independent of NGINX and others: making its core non-blocking I/O, storing connections, and "simply" looping over the connections, serving data...
Thus, if I'm not mistaken, the easiest way is still to go and get some ready solution like Ratchet, be it a 'concurrency framework' or a web server on sockets...
Too much overhead for a couple of messages a day, though...
AJAX short polling seems to be quite a solution for this dilemma....
Is simultaneously updating multiple clients easier with some backend other than PHP, I wonder? Look at C#: it's event-based, not limited to "connection threads" and the request-reply life cycle, if I remember correctly... But it's still the web (over the same HTTP?)...
There is a long-running process (Excel report creation) in my web app that needs to be executed in the background.
Some details about the app and environment.
The app consists of many instances, where each client has a separate one (with customized business logic), while everything is hosted on our server. The functionality that produces the Excel report is the same.
I'm planning to have one RabbitMQ server installed. One part of the app (the publisher) will take all report options from the user and put them into a message. A background job (the consumer) will consume it, produce the report, and send it via email.
However, there is a flaw in such a design: users from one instance may queue lots of complicated reports (each worth ~10 minutes of work) while a user from another instance queues an easy one (1-2 minutes), and he will have to wait until the others finish.
There could be separate queues for each app instance, but in that case I would need to create one consumer per instance. Given that there are 100+ instances at the moment, that doesn't look like a viable approach.
I was thinking of a script that checks all available queues (and consumers) and creates a new consumer for any queue that doesn't have one. There are no limitations on the language for the consumer or such a script.
Does that sound like a feasible approach? If not, please give a suggestion.
Thanks
If I understood the topic correctly, everything lies on one server: RabbitMQ, the web application, the different per-client instances, and the message consumers. In that case I would rather use different topics per message (https://www.rabbitmq.com/tutorials/tutorial-five-python.html) and introduce consumer priorities (https://www.rabbitmq.com/consumer-priority.html). Based on those options, during publishing I would create a combination of topic and priority for the message: the publisher knows the number of already-sent reports per client and the selected options, and will decide whether the priority is high, normal, or low.
The logic to pull messages based on that data will be in the consumer, so the consumer will not take heavy topics when, for example, 3 are already in process.
Based on the total number of messages in the queue (it's not 100% accurate) and the previous topics and priorities, you can implement a kind of leaky-bucket strategy to keep control of resources, e.g. a maximum of 100 reports generated simultaneously.
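A stdlib sketch of that admission logic (the class name, the heavy/light split, and the limits are illustrative; in production the queue itself would be RabbitMQ, with "heavy"/"light" derived from the topic):

```python
import heapq
import itertools

class ReportScheduler:
    # In-process sketch of the consumer's gating logic: prefer high-priority
    # jobs, but never run more than max_heavy heavy reports at once.
    def __init__(self, max_heavy=3, max_total=100):
        self.max_heavy = max_heavy      # leaky-bucket style admission limits
        self.max_total = max_total
        self.heavy_in_flight = 0
        self.total_in_flight = 0
        self._seq = itertools.count()   # preserves FIFO order within a priority
        self._pending = []              # heap of (priority, seq, job); 0 = highest

    def publish(self, job, priority):
        heapq.heappush(self._pending, (priority, next(self._seq), job))

    def next_job(self):
        # Take the best-priority job the admission limits allow, skipping
        # heavy jobs while max_heavy of them are already running.
        if self.total_in_flight >= self.max_total:
            return None
        skipped, taken = [], None
        while self._pending:
            prio, seq, job = heapq.heappop(self._pending)
            if job["heavy"] and self.heavy_in_flight >= self.max_heavy:
                skipped.append((prio, seq, job))
                continue
            taken = job
            self.total_in_flight += 1
            self.heavy_in_flight += job["heavy"]
            break
        for item in skipped:            # put the deferred heavy jobs back
            heapq.heappush(self._pending, item)
        return taken

    def done(self, job):
        self.total_in_flight -= 1
        self.heavy_in_flight -= job["heavy"]

sched = ReportScheduler(max_heavy=3)
for i in range(4):
    sched.publish({"id": f"h{i}", "heavy": True}, priority=0)
sched.publish({"id": "l0", "heavy": False}, priority=1)
# With 3 heavy reports running, the 4th pull skips h3 and takes the light job
order = [sched.next_job()["id"] for _ in range(4)]
```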
You can also consider using ZeroMQ (http://zeromq.org) for your case; it may be more suitable than RabbitMQ because it is simpler and is a brokerless solution.
I want to keep a persistent connection between my server and clients (an Android app) so I can push data from the server to my clients. After some searching I found that the best way to do this is WebSockets. But there are two scenarios here:
First, I need to send some data (a command) as a broadcast to some of my clients (not all) and then listen for their replies. Second, I need to send a notification to some of the clients.
It's like a chat room: there is a general room where messages can be seen by everybody, and some private rooms where messages can be seen only by the two users participating in the chat.
I saw some example code but I couldn't understand the difference between those two scenarios in the code. I also need some information about ZeroMQ, and whether it is worth using for this project or not.
Just some links of references would be appreciated.
EDIT
I saw in some code that people define an infinite loop to check for some event, but my idea is to create a virtual client in the server that can be called by other functions, so I don't need to change anything in the DB and then check for events in my loop. The event can call this virtual client, which can send my command as a broadcast. Is that a proper way to do it?
Use Ratchet for this. It is a PHP library providing developers with tools to create real-time, bi-directional applications between clients and servers over WebSockets.
You can use its subscribe/unsubscribe (Topics) feature, which will fulfill your need to send push notifications both in general and to specific users. Create different topics for them.
e.g. One topic would be a general one, to which every user gets registered on page load, and another would be for specific users. You can create as many topics as you need.
I'd like to create an application where, when a super user clicks a link, the users get a notification, or rather content such as a PDF, for them to access on the screen.
Use case: when a teacher wants to share a PDF with his students, he should be able to notify them that the PDF is available for download, and a link has to be provided for it.
There are several ways you can accomplish this. The most supported way is through a technique called Comet or long-polling. Basically, the client sends a request to the server and the server doesn't send a response until some event happens. This gives the illusion that the server is pushing to the client.
There are other methods and technologies that actually allow pushing to the client instead of just simulating it (i.e. Web Sockets), but many browsers don't support them.
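The long-polling core can be sketched with a blocking wait; here is a minimal stdlib version (class, field names, and the PDF path are made up for illustration), where the HTTP handler would call poll() and block until publish() fires or the timeout expires:

```python
import threading

class LongPollChannel:
    # Minimal long-polling (Comet) core: poll() blocks until publish()
    # fires or the timeout expires.
    def __init__(self):
        self._event = threading.Event()
        self._message = None

    def publish(self, message):
        self._message = message
        self._event.set()           # wake any request blocked in poll()

    def poll(self, timeout=30.0):
        # The HTTP handler calls this instead of sleep()-ing in a loop
        if self._event.wait(timeout):
            self._event.clear()
            return {"status": "update", "data": self._message}
        return {"status": "timeout"}   # client immediately issues a new poll

channel = LongPollChannel()
# Simulate the teacher's click arriving 0.1 s into a student's pending request
threading.Timer(0.1, channel.publish, args=({"pdf": "/files/lecture1.pdf"},)).start()
reply = channel.poll(timeout=2.0)
```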
As you want to implement this in CakePHP (so I assume it's a web-based application), the user will have to have an 'active' page open in order to receive the push messages.
It's worth looking at the first two answers to this, but also just think about how other sites might achieve this. Sites like Facebook, BBC, Stackoverflow all use techniques to keep pages up to date.
I suspect Facebook just uses some AJAX that runs in a loop/timer to periodically pull updates in a way that makes it look like push. If the update request is frequent enough (a short time period), it will look almost real-time. If it's a long time period, it will look like a pull. Finding the right balance between up-to-dateness and browser/processor/network thrashing is the key.
The actual request shouldn't thrash the system, but the reply in some applications may be much bigger. In your case, the data in each direction is tiny, so you could make the request loop quite short.
Experiment!
The standard HTTP protocol doesn't allow pushing from server to client. You can emulate this by using, for example, AJAX requests at a short interval.
Have a look at php-amqplib and RabbitMQ. Together they can help you implement AMQP (Advanced Message Queuing Protocol). Essentially your web page can be made to update by pushing a message to it.
[EDIT] I recently came across Pusher which I have implemented for a project. It is a HTML5 WebSocket powered realtime messaging service. It works really well and has a free bottom tier plan. It's also extremely simple to implement.
Check out node.js in combination with socket.io and express. Great starting point here