Getting a large amount of data from a very slow external data source - php

I need to receive a large amount of data from an external source. The problem is that the external source sends data very slowly. The workflow is like this:
The user initiates some process from the app interface (commonly it is fetching data from a local XML file). This is quite a fast process.
After that we need to load information connected with the fetched data from the external source (basically it is external statistics for the data from the XML). And it is very slow. But the user needs this additional information to continue working. For example, he may perform filtering according to the external data or something else.
So, we need to do it asynchronously. The main idea is to show external data as it becomes available. The question is how could we organise this async process? Maybe some queues or something else? We're using PHP + MySQL as the backend and jQuery at the front end.
Thanks a lot!

Your two possible strategies are:
Do the streaming on the backend, using a PHP script that curls the large external resource into a database or memcache, and responds to periodic requests for new data by flushing that db row or cache into the response.
Do the streaming on the frontend, using a cross-browser JavaScript technique explained in this answer. In Gecko and WebKit, the XMLHttpRequest.onreadystatechange event fires every time new data is received, making it possible to stream data slowly into the JavaScript runtime. In IE, you need to use an iframe workaround, also explained in the Ajax Patterns article linked in the above SO post.
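The frontend strategy depends on reading only the part of the response that arrived since the previous event. Here is a minimal sketch of that bookkeeping. It assumes the external source sends newline-delimited records, which the question does not state, and `createChunkReader` is a hypothetical helper name. In a browser you would call `read(xhr.responseText)` from the `onreadystatechange` handler while `readyState` is 3 or 4.

```javascript
// Hypothetical helper: remembers how much of a growing response was consumed.
// Assumes newline-delimited records, which the original question does not specify.
function createChunkReader() {
  var offset = 0; // index of the first character not yet handed out

  // Pass the full text received so far; get back only the complete new lines.
  return function read(textSoFar) {
    var lastNewline = textSoFar.lastIndexOf('\n');
    if (lastNewline < offset) {
      return []; // no complete new line has arrived yet
    }
    var fresh = textSoFar.slice(offset, lastNewline);
    offset = lastNewline + 1;
    return fresh.split('\n');
  };
}

// Simulate three events where responseText keeps growing.
var read = createChunkReader();
console.log(read('row1\nro'));            // [ 'row1' ]
console.log(read('row1\nrow2\nrow3'));    // [ 'row2' ]
console.log(read('row1\nrow2\nrow3\n'));  // [ 'row3' ]
```

A partial last line is held back until its newline arrives, so the page never renders half a record.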

One possible solution would be to make the cURL call using system() with the output being redirected to a file. Thus PHP would not hang until the call is finished. From the PHP manual for system():
If a program is started with this function, in order for it to continue running in the background, the output of the program must be redirected to a file or another output stream. Failing to do so will cause PHP to hang until the execution of the program ends.
This would split the data gathering from the user interface. You could then work with the gathered local data by several means, for example:
employ an iframe in the GUI that refreshes itself at intervals and fetches data from the locally stored file (and possibly stores it in the database or whatever),
use jQuery to make AJAX calls to get the data and manipulate it,
use some CGI script that would run in the background and handle the database writes too and display the data using one of the above from the DB directly,
dozens more I can't think of now...

Related

PHP / JavaScript: Calling a PHP Class via AJAX. Multiple Instances of the PHP Class

Maybe it's a stupid question. But anyway here is my problem. I have multiple classes in my project.
At the beginning the constructor of the class Calculate($param1, $param2...) is called.
This Calculate class is called multiple times via jQuery events (click, change, ...) depending on which new form field is filled. The prices and values are calculated in the background by PHP and are shown on the website via AJAX (live while typing).
The connection between the AJAX and the Calculate class is a single file (jsonDataHandler). This file receives the POST values from the AJAX call and returns a JSON string for the website output. So every time I call this jsonDataHandler, a new Calculate object is being created with the updated values, but never the first created object. I am now experiencing multiple problems, as you can imagine.
How can I always access the same object, without creating a new one?
EDIT: because of technical reasons, I cannot use sessions..
Here is the php application lifetime:
The browser sends an http request to the web-server
Web-server (for example Apache), accepts the request and launches your php application (in this case your jsonDataHandler file)
Your php application handles the request and generates the output
Your php application terminates
Web-server sends the response generated by php application to the browser
So the application "dies" at the end of each request; you cannot create an object that persists between requests.
Possible workarounds:
Persist the data on the server - use sessions or the database (as you said this is not an option for you)
Persist the data on the client - you still create your object for each request, but you keep additional information client-side to be able to restore the state of your object (see more on this below)
Use something like ReactPHP to have your application running persistently (this may also not be an option because you will need to use a different environment). A variant of this option: switch to another technology which doesn't re-launch the server-side application each time (node.js, python+flask, etc).
So, if you can't persist the data on the server, the relatively simple option is to persist the data on the client.
But this will only work if you need to keep the state of your calculator for each individual client (as opposed to keeping the same state for all clients, in which case you do need to persist data on the server).
The flow with client-side state can be this:
Client sends the first calculation request, for example param1=10
Your script responds with value=100
Client-side code stores both param1=10 and param1_value=100 into cookies or browser local storage
Client sends the next calculation, for example param2=20, this time the client-side code finds previous results and sends everything together (param1=10&param1_value=100&param2=20)
On the server you now can re-create the whole sequence of calculation, so you can get the same result as if you would have a persistent Calculate object
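The client-side flow above can be sketched roughly as follows. This is an illustration only: `savedState`, `buildRequestBody` and `rememberResult` are hypothetical names, and the plain object stands in for cookies or browser local storage.

```javascript
// In-memory stand-in for cookies or localStorage (hypothetical, for illustration).
var savedState = {};

// Merge the new input with everything saved so far and build the POST body.
function buildRequestBody(newParams) {
  var all = Object.assign({}, savedState, newParams);
  return new URLSearchParams(all).toString();
}

// Remember the inputs together with the value the server returned for them.
function rememberResult(params, resultName, resultValue) {
  Object.assign(savedState, params);
  savedState[resultName] = resultValue;
}

// First request: only param1 is known.
console.log(buildRequestBody({ param1: 10 }));
// The server answered value=100, so keep both the input and the result.
rememberResult({ param1: 10 }, 'param1_value', 100);
// The second request carries the earlier state along with the new input.
console.log(buildRequestBody({ param2: 20 }));
// prints "param1=10&param1_value=100&param2=20"
```

Because the state travels with every request, the server should treat it as untrusted input and validate or recompute it rather than rely on the stored results.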
Maybe you should try to save the values of the parameters of the Calculate object in the database, and every time you make an AJAX call, take the latest values from the DB.

do the files opened using $.post and $.get register as open files on linux server?

I have recently updated my site to use AJAX calls to improve the end-user experience. Some calls are set to poll the db repeatedly, others are called to alter the database upon user interaction, i.e. completing a task or cancelling a cart item.
Now I am getting server errors resulting from reaching my servers open file limit.
Here is an example of the sort of code I am using: (credit goes to every tutorial found on google...)
function checkForNewData() {
    $.get('checkForNewData.php',false,function(data){
        if(data.length){
            $('#newData').html(data);
        }
    });
}
$(function(){
    checkForNewData();
    setInterval('checkForNewData()',10000);
});
I realize that using "setInterval('checkForNewData()',10000);" means the file is loaded every 10000 ms for every user who has this page open.
Here are my questions regarding my ignorance of ajax:
Does a unix server record each ajax call (of this manner) as a page load or open file?
If the page loads behind the scenes, do I have to close it?
Is there a better way to keep a site up-to-date than repeatedly polling my db?
Thanks for your time and assistance.
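Does a unix server record each ajax call (of this manner) as a page load or open file?
Each AJAX call is an ordinary HTTP request, so the server handles and logs it like any other page load. While the request is being served it occupies a connection and any files the script opens, and these are released when the script finishes. Every time a PHP file is run, it is logged, which is why you can see errors in your error log if anything goes wrong during AJAX calls.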
If the page loads behind the scenes, do I have to close it?
Which page? "checkForNewData.php"? No, you don't. The AJAX call waits for the script to execute and finish, and then gets the response.
Is there a better way to keep a site up-to-date than the repetitiously polling of my db?
Yes, there is. I would:
On the server
Use cache (maybe APC cache)
Run a DB check once every minute/ two minutes/ five minutes only
Store/ Update the results in an XML file
On the client
Get the timestamp of the most recent update on client-side page load
Get AJAX to check the timestamp (stored in the XML) of the last update
If timestamp of the AJAX response differs from the first-load timestamp, get new HTML from the XML file
Use AJAX headers or AJAX post-data to request a specific function (like asking for timestamp update vs getting HTML data).
Remember to use the correct flags for json_encode.
echo json_encode($html, JSON_HEX_QUOT | JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS);
Also remember to compress the output.
ob_start('ob_gzhandler');
It is best practice to make as few DB calls as possible.
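The client-side timestamp check described above might look roughly like this. It is a sketch under assumptions: `lastSeen`, `handleTimestamp` and the timestamp values are made up for illustration, and in a real page `fetchHtml` would be a jQuery call that loads the cached output.

```javascript
// Timestamp the page was rendered with (hypothetical value for illustration).
var lastSeen = 1700000000;

// Decide what to do with the server's cheap "timestamp only" answer.
function handleTimestamp(serverTimestamp, fetchHtml) {
  if (serverTimestamp === lastSeen) {
    return false; // nothing changed, skip the heavier request
  }
  lastSeen = serverTimestamp;
  fetchHtml(); // only now ask for the cached HTML
  return true;
}

// Demo: count how often the heavier request would actually be made.
var fetches = 0;
function fakeFetchHtml() { fetches += 1; }

console.log(handleTimestamp(1700000000, fakeFetchHtml)); // false
console.log(handleTimestamp(1700000060, fakeFetchHtml)); // true
console.log(handleTimestamp(1700000060, fakeFetchHtml)); // false
console.log(fetches);                                    // 1
```

Most polls then cost one tiny timestamp response, and the larger payload is transferred only when something has changed.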

The best way to access data from a database every X seconds (asynchronously)

Ok, I didn't really know how to formulate this question, and especially not the title. But I'll give it a try and hope I'm being specific enough while trying to keep it relevant to others.
If you want to run a PHP script in the background (via AJAX) every X seconds that returns data from a database, how do you do this the best way without using too much of the server's resources?
My solution looks like this:
A user visits a webpage, and every X seconds that page runs some JavaScript. The JavaScript calls a PHP script/file that queries the database, retrieves the data and returns it to the JavaScript. The JavaScript then prints the data to the page. My fear is that this way of solving it will put a lot of pressure on the server if there are a lot of (10 000) simultaneous visitors on the page. Is there another way to do this?
That sounds like the best way, given the spec/requirement you set out.
Another way is to have an intermediary step. If you are going to have a huge amount of traffic (otherwise this does not introduce any benefit and may, on the contrary, overcomplicate or slow the process), add another table that records the last time a dataset was pulled, and a hard file (say, XML) which is regenerated from a new query if the 'last time' is deemed too long ago. This XML then feeds the result returned to the user.
So:
1. JavaScript calls PHP script (AJAX)
2. PHP pings DB table which contains the last time data was fully output
3. If the time is too great, the 'main' query is rerun and the XML file is regenerated from the output; otherwise skip to 4
4. Fetch the XML file and output as appropriate for the returned AJAX
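The decision in steps 2 to 4 can be sketched as below. The answer describes this on the PHP side; JavaScript is used here only to show the logic. The `cache` object stands in for both the "last pulled" table and the XML file, and `MAX_AGE_SECONDS` is an assumed freshness window.

```javascript
// Hypothetical in-memory stand-ins for the "last pulled" table and the XML file.
var cache = { generatedAt: 0, body: null };
var MAX_AGE_SECONDS = 60; // assumed freshness window

// Steps 2-4: rerun the main query only when the cached copy is too old.
function getData(nowSeconds, runMainQuery) {
  if (cache.body === null || nowSeconds - cache.generatedAt > MAX_AGE_SECONDS) {
    cache.body = runMainQuery(); // step 3: the expensive query
    cache.generatedAt = nowSeconds;
  }
  return cache.body; // step 4: serve the cached copy
}

// Demo: three requests, of which only two hit the "database".
var queries = 0;
function fakeQuery() {
  queries += 1;
  return '<data run="' + queries + '"/>';
}

getData(1000, fakeQuery); // cache empty, query runs
getData(1030, fakeQuery); // 30 s old, served from cache
getData(1100, fakeQuery); // 100 s old, query runs again
console.log(queries);     // 2
```

With this in place the database load depends on the freshness window, not on the number of visitors polling.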
You can do it the other way, contacting the client just when you need it and wasting less resources.
Comet is the way to go for this option:
Comet is a programming technique that enables web servers to send data to the client without having any need for the client to request it. This technique will produce more responsive applications than classic AJAX. In classic AJAX applications, web browser (client) cannot be notified in real time that the server data model has changed. The user must create a request (for example by clicking on a link) or a periodic AJAX request must happen in order to get new data from the server.

PHP display progress messages on the fly

I am working on a tool in PHP that processes a lot of data and takes a while to finish. I would like to keep the user updated with what is going on and the current task being processed.
What is in your opinion the best way to do it? I've got some ideas but can't decide for the most effective one:
The old way: execute a small part of the script and display a page to the user with a Meta Redirect or a JavaScript timer to send a request to continue the script (like /script.php?step=2).
Sending AJAX requests constantly to read a server file that PHP keeps updating through fwrite().
Same as above but PHP updates a field in the database instead of saving a file.
Do any of those sound good? Any ideas?
Thanks!
Rather than writing to a static file you fetch with AJAX or to an extra database field, why not have another PHP script that simply returns a completion percentage for the specified task. Your page can then update the progress via a very lightweight AJAX request to said PHP script.
As for implementing this "progress" script, I could offer more advice if I had more insight as to what you mean by "processes a lot of data". If you are writing to a file, your "progress" script could simply check the file size and return the percentage complete. For more complex tasks, you might assign benchmarks to particular processes and return an estimated percentage complete based on which process has completed last or is currently running.
UPDATE
This is one suggested method to "check the progress" of an active script which is simply waiting for a response from a request. I have a data mining application that I use a similar method for.
In your script that makes the request you're waiting for (the script you want to check the progress of), you can store a progress variable for the process, either in a file or in a database. I use a database, as I have hundreds of processes running at any time which all need to track their progress, and I have another script that allows me to monitor the progress of these processes. When the process begins, set this to 1. You can easily select an arbitrary number of 'checkpoints' the script will pass and calculate the percentage given the current checkpoint.
For a large request, however, you might be more interested in knowing the approximate percentage of the request that has completed. One possible solution would be to know the size of the returned content and set your status variable according to the percentage received at any moment. I.e. if you receive the request data in a loop, you could update the status on each iteration. Or if you are downloading to a flat file, you could poll the size of the file. This could be done less accurately with time (rather than file size) if you know the approximate time the request should take to complete and simply compare it against the script's current execution time. Obviously neither of these is a perfect solution, but I hope they'll give you some insight into your options.
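All three estimates mentioned above (checkpoints, bytes received, elapsed time) reduce to the same arithmetic. A small sketch, with `toPercent` as a hypothetical helper name and made-up numbers:

```javascript
// Convert "done out of total" into a whole percentage between 0 and 100.
function toPercent(done, total) {
  if (!(total > 0)) {
    return 0; // unknown or zero total: report 0 rather than guess
  }
  return Math.max(0, Math.min(100, Math.floor((done / total) * 100)));
}

console.log(toPercent(3, 8));            // checkpoints passed: 37
console.log(toPercent(512000, 2048000)); // bytes received: 25
console.log(toPercent(45, 30));          // 45 s elapsed of an estimated 30 s: 100
```

The clamp matters mostly for the time-based estimate, where a request can easily run longer than expected.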
I suggest using the AJAX method, but not using a file or a database. You could probably use session values or something like that, that way you don't have to create a connection or open a file to do anything.
In the past, I've just written messages out to the page and used flush() to flush the output buffer. Very simple, but it may not work correctly on every web server or with every web browser (as they may do their own internal buffering).
Personally, I like your second option the best. Should be reliable and fairly simple to implement.
I like option 2 - using AJAX to read a status file that PHP writes to periodically. This opens up a lot of different presentation options. If you write a JSON object to the file, you can easily parse it and display things like a progress bar, status messages, etc...
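As a sketch of what the client could do with such a JSON status file: the field names `percent` and `message` are assumptions for this example, and `renderStatus` is a hypothetical helper. In the page, its input would be the text returned by the AJAX request.

```javascript
// Turn the JSON written by the PHP script into a one-line text progress bar.
// The field names "percent" and "message" are assumptions for this sketch.
function renderStatus(jsonText) {
  var status = JSON.parse(jsonText);
  var percent = Math.max(0, Math.min(100, status.percent)); // keep the bar in range
  var filled = Math.round(percent / 10);
  var bar = '#'.repeat(filled) + '-'.repeat(10 - filled);
  return '[' + bar + '] ' + percent + '% ' + status.message;
}

console.log(renderStatus('{"percent": 40, "message": "Importing rows"}'));
// prints "[####------] 40% Importing rows"
```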
A 'dirty' but quick-and-easy approach is to just echo out the status as the script runs along. So long as you don't have output buffering on, the browser will render the HTML as it receives it from the server (I know WordPress uses this technique for its auto-upgrade).
But yes, a 'better' approach would be AJAX, though I wouldn't say there's anything wrong with 'breaking it up' using redirects.
Why not incorporate 1 & 2, where AJAX sends a request to script.php?step=1, checks response, writes to the browser, then goes back for more at script.php?step=2 and so on?
If you can do without IE, then use server-sent events. It's the ideal solution.
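For reference, a server-sent event is plain text: an optional `event:` line, one `data:` line per line of payload, and a blank line to end the message, sent with the `text/event-stream` content type. The sketch below only builds that text. `formatSseMessage` is a hypothetical helper, and a PHP script would echo the same string and flush it.

```javascript
// Build one server-sent event in the text/event-stream wire format.
function formatSseMessage(data, eventName) {
  var out = '';
  if (eventName) {
    out += 'event: ' + eventName + '\n';
  }
  // Each line of the payload needs its own "data:" prefix.
  String(data).split('\n').forEach(function (line) {
    out += 'data: ' + line + '\n';
  });
  return out + '\n'; // the blank line terminates the message
}

console.log(JSON.stringify(formatSseMessage('50% done', 'progress')));
// prints "event: progress\ndata: 50% done\n\n"
```

On the page, an `EventSource` pointed at the script's URL would receive this through a listener for the `progress` event.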

PHP and AJAX, sending new PHP info to page for AJAX to receive, is this possible?

I'm searching on how to do this but my searches aren't turning up things that are talking about what I'm trying to do so maybe I'm not searching with the right terms or this isn't possible, but figured I would ask here for help.. this is what I am trying to do..
I have PHP scripts that are called asynchronously: a script is called and it just runs, and the calling PHP doesn't wait for a response, so it can go on to do other stuff and free things up so another async PHP process can be run.
I would still like to get back a result from these "zombie" scripts, or whatever you want to call them. However, the only way I can think of that I know for sure will work is to make the "zombie" script save its final output to a database and then have my AJAX UI make periodic requests to this database to check whether the needed value exists in the place it is supposed to be, which would allow it to get the output from the zombie PHP script.
I am thinking it would be better if this zombie script could somehow do a sort of page refresh to the AJAX UI, with the AJAX UI intercepting it and just taking the received data from PHP to use as needed (such as displaying it in a DIV for the user to see). Basically, I'm wondering if you can make PHP force this kind of thing, rather than needing to involve a database and making AJAX do repeated requests to check for a specific value.
Thanks for any advice
No, a background script has no way to influence the client's front-end because it has no connection to it.
Starting a background script, having the script write status data into a shared space - be it a database or a memcache or a similar solution - and polling the status through Ajax is usually indeed the best way to go.
One alternative may be Comet. It's a technique where a connection is kept open over a long time, and updated actively from the server side (instead of frequent client-side Ajax polling). I have no practical experience with this but I imagine it most probably needs server side tweaking to be doable in PHP - it's not the best platform for long-running stuff. See this question for some approaches.
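The polling approach recommended above could be sketched like this. `pollUntilDone` is a hypothetical helper, the `done` and `result` fields are assumptions, and `request` stands in for a jQuery call to whatever script reads the shared status. The `schedule` parameter exists only so the demo can run without real delays.

```javascript
// Poll a status source until the background job reports that it has finished.
// request(callback) stands in for an AJAX call to a status script (assumed).
function pollUntilDone(request, onUpdate, schedule) {
  // Default: wait two seconds between polls (an arbitrary interval).
  schedule = schedule || function (fn) { setTimeout(fn, 2000); };

  function tick() {
    request(function (status) {
      onUpdate(status);
      if (!status.done) {
        schedule(tick); // ask again later; stop once the job is done
      }
    });
  }
  tick();
}

// Demo with a fake request that finishes on the third poll.
var fakeAnswers = [
  { done: false, result: null },
  { done: false, result: null },
  { done: true, result: 'final output' }
];
var seen = [];
pollUntilDone(
  function (callback) { callback(fakeAnswers.shift()); },
  function (status) { seen.push(status.done); },
  function (fn) { fn(); } // run the next poll immediately in the demo
);
console.log(seen); // [ false, false, true ]
```

Scheduling the next poll only after the previous response arrives avoids piling up requests when the server is slow, which a fixed `setInterval` would not.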
