This question is a continuation to questions I have asked previously with regard to printing documents via Word on Windows from Laravel.
My issue was that I did not want to launch the necessary printing tasks within a POST request as this would show no feedback of the task, and would only respond once the task completed.
For example, if I called the POST /pledge/submit route, I would not want to call the necessary tasks for printing within that same request for the route.
Now, I see that Laravel 4 has a facility called Queues, which (I assume) would allow me to background process these tasks, and postpone them until a later time.
Having read through the documentation, I see that it supports four different drivers, one of which is sync.
Question: Can I use this driver to create new print jobs in the queue, and have them executed by an external application (such as one created in Delphi)? The app would periodically check to see if there are items in the queue, and then execute them (and, of course, remove them).
I am simply trying to find the best way to publish documents without the end-user having to wait for the page to respond whilst printing is underway. Further, I am new to queues in PHP, and am not familiar with how they work (in so far as a detailed process flow). If someone could also explain this, I would appreciate it very much.
The queue system wouldnt work for your Delphi program out of the box - you would need to make some modifications.
Instead - the easiest way would be to make your own 'table' in your database, called 'pending_print_jobs'.
When the user wants to print job 'x' - you get PHP to save the print job in the 'pending_print_jobs' table with all the details you need (such as file to be printed, the user who did it etc etc).
Then you would get your external application (i.e. your Delphi program) to periodically check the 'pending_print_jobs' table in your database. If it finds any records - it can action them - and print the file.
Related
I am currently working on a fairly large purely dynamic site that contains various information that needs to be expired. Previously i did not have to worry about expiration because it was handled on user login ( various checks would run to expire the logged in user data if needed ) but with our increase member base and inactivity of users the data within the db is getting old. Normally this would not be a problem but the old data affects the rest of the sites features/functionality ( point based system features implemented, team building features, etc. ) All data stored in the database has an expiration timer so all i have to do is soft-delete the data using a php script but i don't want to trigger this on page load ( i want to avoid slowing down the user page load )
What alternatives are available aside from cronjobs. I want to be able to setup and manage the background services through php so i don't have to edit/create crons every time i need something new added, etc.
Ideally i am looking for or trying to implement a system that will allow me to insert a db row with specific instructions ( queue a specific update ) and it will be handled on the backend. I want/need to have the data updated as soon as possible to avoid the issues we are running into now. This background processor will eventually handle larger more complex tasks like auto scheduling an on site event ( tournaments ), or auto generating brackets for these tournaments. All help is appreciated!
You could try the MySQL event scheduler, so you could initiate a SQL script which would re-run every x days:
CREATE EVENT `delete_old_data`
ON SCHEDULE schedule
ON COMPLETION NOT PRESERVE
ENABLE
DO BEGIN
-- event body, something like delete all rows older than 5 days
END;
To be honest to do more complex events like generating 'site events' it looks like you should be using a cronjob. Is there any reason you cannot use this? I am sure if you explain to your hosting provider that you need a cronjob and show them the code that you will be using it for they will enable this option on your account. (most hosts should have this enabled already)
There are several approaches, but none of them is perfect.
CRONJOBS
You've pointed that you don't want to use them. But if your main concern about cronjobs is crontab management, you may just setup one every-minute PHP cron which then may trigger different tasks with different resolutions. You can fully manage background services via PHP that way. The main disadvantage is that you cannot perform your background tasks parallely.
GEARMAN WORKERS
You may also use 3rd party software, like Gearman. Gearman allow you to write so called workers (your background services), in any language you like (PHP is supported, as well as c++, java and many others). These workers behave like ansychronous functions which may be called from anywhere of your code using gearman api. You don't have to carry about the result, simply call the function and forget. It will be done in background. Also task scheduling is a built-in gearman feature.
Another software you may use is Rabbit MQ or any other message bus system.
NATIVE PHP ASYNC FUNCTIONS
New PHP7 will bring native php asynchronous programming. You will be able to separate your PHP script and http request handling, a bit like in node.js. Your script will be able to work all the time, doing some background stuff in any way you like and handling http requests like events in another process. This is probably the best option for you, if you may wait till the release date.
My answer covers only solutions I was using personally. Possibly there are also another ways to reach your goal, so keep searching!
I am a programmer at an internet marketing company that primaraly makes tools. These tools have certian requirements:
They run in a browser and must work in all of them.
The user either uploads something (.csv) to process or they provide a URL and API calls are made to retrieve information about it.
They are moving around THOUSANDS of lines of data (think large databases). These tools literally run for hours, usually over night.
The user must be able to watch live as their information is processed and is presented to them.
Currently we are writing in PHP, MySQL and Ajax.
My question is how do I process LARGE quantities of data and provide a user experience as the tool is running. Currently I use a custom queue system that sends ajax calls and inserts rows into tables or data into divs.
This method is a huge pain in the ass and couldnt possibly be the correct method. Should I be using a templating system or is there a better way to refresh chunks of the page with A LOT of data. And I really mean a lot of data because we come close to maxing out PHP memory and is something we are always on the look for.
Also I would love to make it so these tools could run on the server by themselves. I mean upload a .csv and close the browser window and then have an email sent to the user when the tool is done.
Does anyone have any methods (programming standards) for me that are better than using .ajax calls? Thank you.
I wanted to update with some notes incase anyone has the same question. I am looking into the following to see which is the best solution:
SlickGrid / DataTables
GearMan
Web Socket
Ratchet
Node.js
These are in no particular order and the one I choose will be based on what works for my issue and what can be used by the rest of my department. I will update when I pick the golden framework.
First of all, you cannot handle big data via Ajax. To make users able to watch the processes live you can do this using web sockets. As you are experienced in PHP, I can suggest you Ratchet which is quite new.
On the other hand, to make calculations and store big data I would use NoSQL instead of MySQL
Since you're kind of pinched for time already, migrating to Node.js may not be time sensitive. It'll also help with the question of notifying users of when the results are ready as it can do browser notification push without polling. As it makes use of Javascript you might find some of your client-side code is reusable.
I think you can run what you need in the background with some kind of Queue manager. I use something similar with CakePHP and it lets me run time intensive processes in the background asynchronously, so the browser does not need to be open.
Another plus side for this is that it's scalable, as it's easy to increase the number of queue workers running.
Basically with PHP, you just need a cron job that runs every once in a while that starts a worker that checks a Queue database for pending tasks. If none are found it keeps running in a loop until one shows up.
I have a process users must go through on my site which can take quite a bit of time (upwards of an hour in certain cases).
I'd like to be able to have the user start the process, then be told that it is running in the background and they can leave the page and will be emailed when the process is complete. This would help avoid cases when the user gets impatient and closes the window before the process has finished.
An example of how it would ideally look is how Mailchimp handles importing contacts. You upload a CSV file of your contacts, and they then say that the contacts are currently uploading, but it can take a while so feel free to leave the page.
What would be the best way to accomplish this? I looked into Gearman, however it seems like that tool is more useful for scaling large amounts of tasks to happen quickly, not running processes in the background.
Thanks for your help.
Even it doesn't seem to be what you'd use at the first look, I think I would use Gearman, for that :
You can push tasks to it when the user does his action
It'll deal with both :
balancing tasks to several servers, if you have more than one
queuing, so no more than X tasks are executed in parallel.
No need to re-invent the wheel ;-)
You might want to take a look at creating a daemon. I'd suggestion writing the daemon in a language other than PHP (node.js maybe?), but if you already have a large(ish) code base in PHP this mightn't be desirable. Try taking a look at How to design a daemon with a MySQL DB connection.
I've been working on a library call LooPHP in PHP to allow event driven programming for PHP (often desirable for daemons). The library allows for timed events, multi-threaded listeners (when you want one event queue to be feed from >1 type of source).
If you could give us some more information on what exactly this background process does, it might be helpful.
Write out a file using the user's ID as the filename. Spawn a new process to perform whatever it is you want it to do (if what you want is to have it execute some more PHP, then you can just call PHP with the script you want to run). When that process is done, have it delete that file. If the user visits the page again, have the script check for existence of the file (since the filename is predictable based on user ID). If it exists, then you're still processing, so tell them to continue waiting. Maybe have some upper bound to wait, where if they come back and the file exists, but it's been, say, 5 hours, delete the file and let them try again.
After searching the web for a good Comet and also and asking you guys what my best option is, I've chose to go with Orbited. The problem is if you need a good documentation about Comet you won't find. I've installed Orbited and It seems It works just fine.
Basically, I want to constantly check a database and see if there is a new data. If there is, I want to push it to my clients and update their home page but I can't find any good and clear doc explaining how constantly check the database and push the new info to Orbited and then to the clients. Have you guys implemented that?
Also, how many users can Orbited handle?
Any ideas?
You could add a database trigger that sends messages to your message queue when the database got changed. This is also suggested here. Or, if it is only your app talking to the database, you could handle this from within the app via a Subject/Observer pattern, notifying the queue whenever someone called an action changing something in the DB.
I don't know how good or bad Orbited scales.
Have a reference table that keeps track of the last updated time of the source table. Create a update/delete/insert trigger for the source table that updates the time in the reference table.
Your comet script should keep checking the reference table for any change in the time. If the change is noticed, you can read the updated source table and push the data to your client's home page. Checking the reference table in a loop is faster because the MySQL will serve the results from its cache if nothing has changed.
And sorry, I don't know much about Orbited.
I would use the STOMP protocol with Orbited to communicate and push data to clients. Just find a good STOMP client with PHP and get started.
Here is an example of a use case with STOMP, although the server side is written in Ruby:
http://fuglyatblogging.wordpress.com/2008/10/
I don't know if PHP with Apache (if that's what you are using) is the best suite for monitoring database changes. Read this article, under the section title "Orbited Server", for an explanation: http://thingsilearned.com/2009/06/09/starting-out-with-comet-orbited-part-1/
EDIT: If you want to go the route with PHP through a web server, you need to make one, and one only, request to a script that starts the monitoring and pushes out changes. And if that script times out or fails, you need to start a new one. A bit fugly :) A nicer, cleaner way would be, for example, to use twisted with python to start a monitoring process, completely separated from the web-server.
First,
the set up:
I have a script that executes several tasks after a user hits the "upload" button that sends the script the data it need. Now, this part is currently mandatory, we don't have the option at this point to cut out the upload and draw from a live source.
This section intentionally long-winded to make a point. Skip ahead if you hate that
Right now the data is parsed from a really funky source using regex, then broken down into an array. It then checks the DB for any data already in the uploaded data's date range. If the data date ranges don't already exist in the DB, it inserts the data and outputs success to the user (there is also some security checks, data source validation, and basic upload validation)... If the data does exist, the script then gets the data already in the DB, finds the differences between the two sets, deletes the old data that doesn't match, adds the new data, and then sends an email to each person affected by these changes (one email per person with all relevant changes in said email, which is a whole other step). The email addresses are pulled by means of an LDAP search as our DB has their work email but the LDAP has their personal email which ensures they get the email before they come in the next day and get caught unaware. Finally, the data-uploader is told "Changes have been made, emails have been sent." which is really all they care about.
Now I may be adding a Google Calendar API that posts the data (when it's scheduling data) to the user's Google Calendar. I would do it via their work calendar, but I thought I'd get my toes wet with Google's API before dealing with setting up a WebDav system for Exchange.
</backstory>
Now!
The practical question
At this point, pre-Google integration, the script takes at most a second and a half to run. It's pretty impressive, at least I think so (the server, not my coding). But the Google bit, in tests, is SLOOOOW. We can probably fix that, but it raises the bigger question...
What is the best way to off-load some of the work after the user has gotten confirmation that the DB has been updated? This is the part he's most concerned with and the part most critical. Email notifications and Google Calendar updates are only there for the benefit of those affected by the upload, and if there is a problem with these notifications, he'll hear about it (and then I'll hear about it) regardless of the script telling him first.
So is there a way, for example, to run a cronjob that's triggered by a script's last execution? Can PHP create cronjobs with exec() ability? Is there some normalized way of handling post-execution work that needs getting done?
Any advice on this is really appreciated. I feel like the scripts bloated-ness reflects my stage of development and the need for me to finally know how to do division-of-labor in web apps.
But I also get worried that this is not done, as user's need to know when all tasks are completed, etc. So this brings up:
The best-practices/more-subjective question
Basically, is there an idea that progress bars, real-time offloading, and other ways of keeping the user tethered to the script are --when combined with optimization of the code, of course-- the better, more-preferred method then simply saying "We're done with your part, if you need us, we'll be notifying users" etc etc.
Are there any BIG things to avoid (other than obviously not giving the user any feedback at all)?
Thanks for reading. The coding part is crucial, so don't feel obliged to cover the second part or forget to cover the coding part!
A cron job is good for this. If all you want to do when a user uploads data is say "Hey user, thanks for the data!" then this will be fine.
If you prefer a more immediate approach, then you can use exec() to start a background process. In a Linux environment it would look something like this:
exec("php /path/to/your/worker/script.php >/dev/null &");
The & part says "run me in the backgound." The >/dev/null part redirects output to a black hole. As far as handling all errors and notifying appropriate parties--this is all down to the design of your worker script.
For a more flexible cross-platform approach, check out this PHP Manual post
There are a number of ways to go about this. You could exec(), like the above says, but you could potentially run into a DoS situation if there are too many submit clicks. the pcntl extension is arguably better at managing processes like this. Check out this post to see a discussion (there are 3 parts).
You could use Javascript to send a second, ajax post that runs the appropriate worker script afterwards. By using ignore_user_abort() and sending a Content-Length, the browser can disconnect early, but your apache process will continue to run and process your data. Upside is no forkbomb potential, Downside is it will open more apache processes.
Yet another option is to use a cron in the background that looks at a process-queue table for things to do 'later' - you stick items into this table on the front end, remove them on the backend while processing (see Zend_Queue).
Yet another is to use a more distributed job framework like gearmand - which can process items on other machines.
It all depends on your overall capabilities and requirements.