Read results of a PHP array, in real time with Javascript

Read results of a PHP array, in real time with Javascript - php

I certainly can't solve this problem by myself after a few many days already trying. This is the problem:
We need to display information on the screen (HTML) that is being generated in real time inside a PHP file.
The PHP is performing a very active crawling, returning huge arrays of URLs, each URL need to be displayed in real time in HTML, as soon as the PHP captures it, that's why we are using Ob_flush() and flush methods to echo and print the arrays as soon as we got them.
Meanwhile we need to display this information somehow so the users can see it while it works (since it could take more than one hour until it finishes).
It's not possible to be done, as far as I understand, with AJAX, since we need to make only 1 request and read the information inside the array. I'm not either totally sure if comet can do something like this, since it would interrupt the connection as soon as it gets new information, and the array is really rapidly increasing it's size.
Additionally and just to make the things more complex, there's no real need to print or echo the information (URLs) inside the array, since the HTML file is being included as the User Interface of the same file that is processing and generating the array that we need to display.
Long story short; we need to place here:
<ul>
<li></li>
<li></li>
<li></li>
<li></li>
<li></li>
...
</ul>
A never ending and real time updated list of URLS being generated and pushed inside an array, 1,000 lines below, in a PHP loop.
Any help would be really more than appreciated.
Thanks in advance!

Try web-sockets.
They offer real-time communication between client and server and using socket.io provide cross-browser compatibility. It's basically giving you the same results as long-polling / comet, but there is less overhead between requests so it's faster.
In this case you would use web sockets to send updates to the client about the current status of the processing (or whatever it was doing).
See this Using PHP with Socket.io

Suppose you used a scheme where PHP was writing to a Memcached server..
each key you write as rec1, rec2, rec3
You also store a current_min and a current_max
You have the user constantly polling with ajax. For each request they include the last key they saw, call this k. The server then returns all the records from k to max.
If no records are immediately available, the server goes into a wait loop for a max of, say 3 seconds, checking if there are new records every 100ms
If records become available, they are immediately sent.
Whenever the client receives updates or the connection is terminated, they immediately start a new request...
Writing a new record is just a matter of inserting max+1 and incrementing min and max where max-min is the number of records you want to keep available...

An alternative to web sockets is COMET
I wrote an article about this, along with a followup describing my experiences.
COMET in my experience is fast. Web sockets are definitely the future, but if you're in a situation where you just need to get it done, you can have COMET up and running in under an hour.
Definitely some sort of shared memory structure is needed here - perhaps an in-memory temp table in your database, or Memcached as stephen already suggested.

I think the best way to do this would be to have the first PHP script save each record to a database (MySQL or SQLite perhaps), and then have a second PHP script which reads from the database and outputs the newest records. Then use AJAX to call this script every so often and add the records it sends to your table. You will have to find a way of triggering the first script.
The javascript should record the id of the last url it already has, and send it in the AJAX request, then PHP can select all rows with ids greater than that.
If the number of URLs is so huge that you can't store a database that large on your server (one might ask how a browser is going to cope with a table as large as that!) then you could always have the PHP script which outputs the most recent records delete them from the database as well.
Edit: When doing a lot of MySQL inserts there are several things you can do to speed it up. There is an excellent answer here detailing them. In short use MyISAM, and enter as many rows as you can in a single query (have a buffer array in PHP, which you add URLs to, and when it is full insert the whole buffer in one query).

If I were you , I try to solve this with two way .
First of all I encode the output part array with json and with the setTimeout function with javascript I'll decode it and append with <ul id="appendHere"></ul> so
when list is updated , it will automatically update itself . Like a cronjob with js .
The second way , if u say that I couldn't take an output while proccessing , so
using data insertion to mysql is meaningless I think , use MongoDb or etc to increase speed .
By The way You'll reach what u need with your key and never duplicate the inserted value .

Related

Persist mongodb cursor between page requests in php

I have a very large dataset that i am exporting using a batch process to keep the page from timing out. The whole process can take over an hour, and i'm using drupal batch which basically reloads the page with a status on how far the process has completed. Each page request essentially runs the query again which includes a sort which takes a while. Then it exports the data to a temp file. The next page load runs the full mongo query, sorts, skips the entries already exported, and exports more to the temp file. The problem is that each page load makes mongo rerun the entire query and sort. I'd like to be able to have the next batch page just pick up the same cursor where it left off and continue to pull the next set of results.

The MongoDB Manual entry for cursor.skip() gives some advice:
Consider using range-based pagination for these kinds of tasks. That is, query for a range of objects, using logic within the application to determine the pagination rather than the database itself. This approach features better index utilization, if you do not need to easily jump to a specific page.
E.g If your nightly batch process runs over the data accumulated in the last 24hrs, perhaps you can run date-range based queries (maybe one per hour of the day) and process your data that way. I'm assuming that your data contains some sort of usable time stamp per document, but you get the idea.

Although cursors live on the server and only timeout after roughly 10minutes of no-activity, the PHP driver does not support persisting cursors between requests.
At the end of each request the driver will kill all cursors created during that request that have not been exhausted.
This also happens when all references to the MongoCursor object are removed (eg $cursor = null).
This is done as its unfortunately fairly common for applications not to iterate over the entire cursor, and we don't want to leave unused cursors around on the server as it could cause performance implications.
For your specific case, the best way to work around this problem is to improve your indexes so loading the cursor is faster.
You may also want to only select some subset of the data so you have a fixed point you can request data between.
Say, for reports, your first request may ask for all data from 1am to 2am.
Then your next request asks for all data from 2am to 3am and so on and on, like Saftschleck explains.
You may also want to look into the aggregation framework, which is designed to do "online reporting": http://docs.mongodb.org/manual/aggregation/

Getting all data once for future use

Well this is kind of a question of how to design a website which uses less resources than normal websites. Mobile optimized as well.
Here it goes: I was about to display a specific overview of e.g. 5 posts (from e.g. a blog). Then if I'd click for example on the first post, I'd load this post in a new window. But instead of connecting to the Database again and getting this specific post with the specific id, I'd just look up that post (in PHP) in my array of 5 posts, that I've created earlier, when I fetched the website for the first time.
Would it save data to download? Because PHP works server-side as well, so that's why I'm not sure.
Ok, I'll explain again:
Method 1:
User connects to my website
5 Posts become displayed & saved to an array (with all its data)
User clicks on the first Post and expects more Information about this post.
My program looks up the post in my array and displays it.
Method 2:
User connects to my website
5 Posts become displayed
User clicks on the first Post and expects more Information about this post.
My program connects to MySQL again and fetches the post from the server.

First off, this sounds like a case of premature optimization. I would not start caching anything outside of the database until measurements prove that it's a wise thing to do. Caching takes your focus away from the core task at hand, and introduces complexity.
If you do want to keep DB results in memory, just using an array allocated in a PHP-processed HTTP request will not be sufficient. Once the page is processed, memory allocated at that scope is no longer available.
You could certainly put the results in SESSION scope. The advantage of saving some DB results in the SESSION is that you avoid DB round trips. Disadvantages include the increased complexity to program the solution, use of memory in the web server for data that may never be accessed, and increased initial load in the DB to retrieve the extra pages that may or may not every be requested by the user.
If DB performance, after measurement, really is causing you to miss your performance objectives you can use a well-proven caching system such as memcached to keep frequently accessed data in the web server's (or dedicated cache server's) memory.
Final note: You say
PHP works server-side as well
That's not accurate. PHP works server-side only.

Have you think in saving the posts in divs, and only make it visible when the user click somewhere? Here how to do that.

Put some sort of cache between your code and the database.
So your code will look like
if(isPostInCache()) {
loadPostFromCache();
} else {
loadPostFromDatabase();
}
Go for some caching system, the web is full of them. You can use memcached or a static caching you can made by yourself (i.e. save post in txt files on the server)

To me, this is a little more inefficient than making a 2nd call to the database and here is why.
The first query should only be pulling the fields you want like: title, author, date. The content of the post maybe a heavy query, so I'd exclude that (you can pull a teaser if you'd like).
Then if the user wants the details of the post, i would then query for the content with an indexed key column.
That way you're not pulling content for 5 posts that may never been seen.

If your PHP code is constantly re-connecting to the database you've configured it wrong and aren't using connection pooling properly. The execution time of a query should be a few milliseconds at most if you've got your stack properly tuned. Do not cache unless you absolutely have to.
What you're advocating here is side-stepping a serious problem. Database queries should be effortless provided your database is properly configured. Fix that issue and you won't need to go down the caching road.
Saving data from one request to the other is a broken design and if not done perfectly could lead to embarrassing data bleed situations where one user is seeing content intended for another. This is why caching is an option usually pursued after all other avenues have been exhausted.

PHP Array efficiency vs mySQL query

I have a MySQL table with about 9.5K rows, these won't change much but I may slowly add to them.
I have a process where if someone scans a barcode I have to check if that barcode matches a value in this table. What would be the fastest way to accomplish this? I must mention there is no pattern to these values
Here Are Some Thoughts
Ajax call to PHP file to query MySQL table ( my thoughts would this would be slowest )
Load this MySQL table into an array on log in. Then when scanning Ajax call to PHP file to check the array
Load this table into an array on log in. When viewing the scanning page somehow load that array into a JavaScript array and check with JavaScript. (this seems to me to be the fastest because it eliminates Ajax call and MySQL Query. Would it be efficient to split into smaller arrays so I don't lag the server & browser?)

Honestly, I'd never load the entire table for anything. All I'd do is make an AJAX request back to a PHP gateway that then queries the database, and returns the result (or nothing). It can be very fast (as it only depends on the latency) and you can cache that result heavily (via memcached, or something like it).
There's really no reason to ever load the entire array for "validation"...

Much faster to used a well indexed MySQL table, then to look through an array for something.
But in the end it all depends on what you really want to do with the data.

As you mentions your table contain around 9.5K of data. There is no logic to load data on login or scanning page.
Better to index your table and do a ajax call whenever required.
Best of Luck!!

While 9.5 K rows are not that much, the related amount of data would need some time to transfer.
Therefore - and in general - I'd propose to run validation of values on the server side. AJAX is the right technology to do this quite easily.
Loading all 9.5 K rows only to find one specific row, is definitely a waste of resources. Run a SELECT-query for the single value.
Exposing PHP-functionality at the client-side / AJAX
Have a look at the xajax project, which allows to expose whole PHP classes or single methods as AJAX method at the client side. Moreover, xajax helps during the exchange of parameters between client and server.
Indexing to be searched attributes
Please ensure, that the column, which holds the barcode value, is indexed. In case the verification process tends to be slow, look out for MySQL table scans.
Avoiding table scans
To avoid table scans and keep your queries run fast, do use fixed sized fields. E.g. VARCHAR() besides other types makes queries slower, since rows no longer have a fixed size. No fixed-sized tables effectively prevent the database to easily predict the location of the next row of the result set. Therefore, you e.g. CHAR(20) instead of VARCHAR().
Finally: Security!
Don't forget, that any data transferred to the client side may expose sensitive data. While your 9.5 K rows may not get rendered by client's browser, the rows do exist in the generated HTML-page. Using Show source any user would be able to figure out all valid numbers.
Exposing valid barcode values may or may not be a security problem in your project context.
PS: While not related to your question, I'd propose to use PHPexcel for reading or writing spreadsheet data. Beside other solutions, e.g. a PEAR-based framework, PHPExcel depends on nothing.

What are the number of ways in which my approach to a news-feed is wrong?

This question has been asked a THOUSAND times... so it's not unfair if you decide to skip reading/answering it, but I still thought people would like to see and comment on my approach...
I'm building a site which requires an activity feed, like FourSquare.
But my site has this feature for the eye-candy's sake, and doesn't need the stuff to be saved forever.
So, I write the event_type and user_id to a MySQL table. Before writing new events to the table, I delete all the older, unnecessary rows (by counting the total number of rows, getting the event_id lesser than which everything is redundant, and deleting those rows). I prune the table, and write a new row every time an event happens. There's another user_text column which is NULL if there is no user-generated text...
In the front-end, I have jQuery that checks with a PHP file via GET every x seconds the user has the site open. The jQuery sends a request with the last update "id" it received. The <div> tags generated by my backend have the "id" attribute set as the MySQL row id. This way, I don't have to save the last_received_id in memory, though I guess there's absolutely no performance impact from storing one variable with a very small int value in memory...
I have a function that generates an "update text" depending on the event_type and user_id I pass it from the jQuery, and whether the user_text column is empty. The update text is passed back to jQuery, which appends the freshly received event <div> to the feed with some effects, while simultaneously getting rid of the "tail end" event <div> with an effect.
If I (more importantly, the client) want to, I can have an "event archive" table in my database (or a different one) that saves up all those redundant rows before deleting. This way, event information will be saved forever, while not impacting the performance of the live site...
I'm using CodeIgniter, so there's no question of repeated code anywhere. All the pertinent functions go into a LiveUpdates class in the library and model respectively.
I'm rather happy with the way I'm doing it because it solves the problem at hand while sticking to the KISS ideology... but still, can anyone please point me to some resources, that show a better way to do it? A Google search on this subject reveals too many articles/SO questions, and I would like to benefit from the experience any other developer that has already trawled through them and found out the best approach...

If you use proper indexes there's no reason you couldn't keep all the events in one table without affecting performance.
If you craft your polling correctly to return nothing when there is nothing new you can minimize the load each client has on the server. If you also look into push notification (the hybrid delayed-connection-closing method) this will further help you scale big successfully.
Finally, it is completely unnecessary to worry about variable storage in the client. This is premature optimization. The performance issues are going to be in the avalanche of connections to the web server from many users, and in the DB, tables without proper indexes.
About indexes: An index is "proper" when the most common query against a table can be performed with a seek and a minimal number of reads (like 1-5). In your case, this could be an incrementing id or a date (if it has enough precision). If you design it right, the operation to find the most recent update_id should be a single read. Then when your client submits its ajax request to see if there is updated content, first do a query to see if the value submitted (id or time) is less than the current value. If so, respond immediately with the new content via a second query. Keeping the "ping" action as lightweight as possible is your goal, even if this incurs a slightly greater cost for when there is new content.
Using a push would be far better, though, so please explore Comet.
If you don't know how many reads are going on with your queries then I encourage you to explore this aspect of the database so you can find it out and assess it properly.
Update: offering the idea of clients getting a "yes there's new content" answer and then actually requesting the content was perhaps not the best. Please see Why the Fat Pings Win for some very interesting related material.

Best way to display 60,000 records on a php webpage

I am looking to display 60,000 records on a webpage with php pulling the records from a mysql database on localhost. These 60,000 records may change depending on the data input.
The records have 5 text fields and due to the sheer number of records, a significant time is taken to send the data from the mysql server to the web browser. Even on a localhost, the time taking is around 15 seconds. During this time, the page is empty.
I would like to seek professional opinion on how to either to
1. display the data in an alternative method, (which I'm not sure what method) or
2. hasten the sending of data from mysql server to the web browser using caching technology like memcache.
In the end i will be deploying the application on the internet where the lag would be immensely unacceptable (i.e. > 15 seconds).
Thank you and Best Regards!

I would suggest trying AJAX pagination. No user will be able to see and analyze 60k records at one time. You can have the php display the first x (however many fit on the average screen or two) records to fill 2-3 pages, and have JavaScript listen for a scroll change. If a user starts scrolling down, have it automatically query the next y records, and add them to the display list. Possibly also removing the records from the top of the list.
Also, adding some quick-jump links or a search feature could help, as you wouldn't want to scroll down 60k records to make changes.
This will significantly lighten the server and client load, as it would only have to serve up a couple hundred records at a time.

DataTable
You should have a look at YUI's DataTable. You should hook the datatable up to autocomplete. There is also an example how they did it in YUI2(help) but YUI3 is a lot faster.
Caching
Caching is also important. You say you could use memcached so that is very good. I am a big fan of redis(But both will work, but the nice thing is that redis is I think better suited for autocomplete). There is even a free plan of Redis To go.

Another important tip is to make sure you are getting your data as you want it displayed from the database. In other words if there is any calculation or processing that you have to do, avoid doing it in PHP code during loops. Use SQL functions to process data, name fields, etc. Databases are good at that sort of thing. Of course this may or may not apply to exactly what you're doing.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.