Handling a very large dataset in HTML tables with PHP

So on a website, let's say a user can click a button to load a report. The problem is that some of these reports easily return 3000+ records from the database. While the call itself is relatively quick (15 seconds for 20,000 records), it takes much longer to add a table containing that data to the DOM.
Is there a way to speed this up at all, in PHP or JavaScript? I've already simplified the loop in PHP as much as possible.
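One thing that sometimes helps (a rough sketch only, not a drop-in fix): build and flush the table in chunks from PHP so the browser can start rendering before the loop finishes. Here fetch_report_rows() and the column names are placeholders for whatever the report actually returns.

```php
<?php
// Rough sketch: stream the table to the browser in chunks instead of
// building one giant string. fetch_report_rows() and the column names
// are placeholders for the real report data.
$rows = fetch_report_rows();

echo '<table id="report"><thead><tr><th>Name</th><th>Date</th><th>Total</th></tr></thead><tbody>';

$buffer = [];
foreach ($rows as $row) {
    $buffer[] = '<tr><td>' . htmlspecialchars($row['name'])
              . '</td><td>' . htmlspecialchars($row['date'])
              . '</td><td>' . htmlspecialchars($row['total']) . '</td></tr>';

    // Send a chunk every 500 rows so the browser can start rendering early.
    if (count($buffer) === 500) {
        echo implode('', $buffer);
        $buffer = [];
        flush();
    }
}

echo implode('', $buffer); // remaining rows
echo '</tbody></table>';
```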

Related

Is using div tags faster than using table, tr and td tags for tabular data of huge size?

I have a huge table loading from a remote AWS EC2 server through PHP scripts that access a Postgres database.
Currently the table has 2000 rows with 20 columns and takes about 2 minutes to load, even after all query optimizations and after switching to a single query that stores the data for all members in a multidimensional PHP array, instead of querying the database separately for each of the 2000 users.
That brought the loading time down from 5 minutes to 2 minutes.
My question is: does the UI also need to be optimized? Is the traditional table tag outdated? What is the best way to go about this?
Suppose I have all the data ready. If I need to display it instantly, in huge numbers, what is the best way to do it?
I'm not very well versed in HTML, CSS or even UI testing.
In this situation, UI and HTML optimization will not make a real difference in terms of loading time.
I would recommend measuring the time your webpage takes to load the complete table, but step by step. Put echo statements (or timers) in your code to divide the full process into small tasks; then you will see where to focus your effort.
How much time does my program take to...
...connect to my remote server?
...receive the raw data from my remote server?
...process the data for the presentation in the next step?
...present the processed data on the webpage?
Then you will see that the last step, the UI-related one, is not that relevant (consider that phpMyAdmin, for example, can show you thousands of rows in seconds).
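To make that step-by-step measurement concrete, here is a minimal timing sketch; the pg_* connection string, table name and query are placeholders, and only the microtime() bookkeeping is the point.

```php
<?php
// Minimal timing sketch: print how long each stage took.
// The pg_* connection string, table and query are placeholders;
// only the microtime() bookkeeping is the point here.
function mark(string $label, float &$t): void {
    $now = microtime(true);
    printf("%s: %.3f s<br>\n", $label, $now - $t);
    $t = $now;
}

$t = microtime(true);

$conn = pg_connect('host=example.com dbname=app user=app password=secret');
mark('connect to remote server', $t);

$result = pg_query($conn, 'SELECT * FROM members LIMIT 2000');
mark('receive raw data', $t);

$rows = pg_fetch_all($result);   // build the multidimensional PHP array
mark('process data for presentation', $t);

echo '<table>';
foreach ($rows as $row) {
    echo '<tr><td>' . implode('</td><td>', array_map('htmlspecialchars', $row)) . '</td></tr>';
}
echo '</table>';
mark('present data on the webpage', $t);
```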
In any case, HTML tables will be the fastest way as they are more "simple" than DIVs. Try to preconfigurate all the CSS you can so your browser will not need to calculate it!
HTML tables aren't outdated at all! :) They're the fastest and most semantic way to display tabular data.
If you need to micro-optimize, I speculate that it might be fastest to use fixed column widths (avoiding dynamic resizing). Other ideas: stream the data onto the page with JavaScript (avoiding re-rendering everything at once), use PHP's output buffering functions, or maybe look into exporting/importing CSV for managing huge datasets? ;)
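On the CSV idea, a minimal export sketch, assuming a PDO connection and a report_rows table that stand in for the real data source:

```php
<?php
// Minimal CSV export sketch for huge result sets: stream straight to the
// client with fputcsv() instead of building an HTML table. The PDO
// connection and the report_rows table are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=reports', 'user', 'secret');

header('Content-Type: text/csv');
header('Content-Disposition: attachment; filename="report.csv"');

$out  = fopen('php://output', 'w');
$stmt = $pdo->query('SELECT id, name, total FROM report_rows');

fputcsv($out, ['id', 'name', 'total']);          // header row
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {  // one row at a time, no big array in memory
    fputcsv($out, $row);
}
fclose($out);
```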

Best way to deliver real time information?

I currently have three screens showing a web page with information taken from a database. This information gets updated every 10 seconds. Each screen makes an AJAX call to a PHP script that runs a MySQL query and prints JSON-encoded results, which is what the page uses to render the data. The query is a pretty resource-intensive one.
My problem has to do with scalability: if ten people were to open that page, they would all cause the PHP script to run at the same time, every 10 seconds, overloading the server.
I'm considering running this query in the background, writing the results to a text file, and then retrieving that text file via AJAX, but I feel like there must be a better way to do this.
So the question is: how do I deal with repetitive and very slow SQL queries while allowing access for multiple users?
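For what it's worth, the text-file idea can be sketched roughly like this: the first request after the cache expires runs the heavy query and rewrites the file, and every other viewer just reads the file. The cache path, TTL and query here are assumptions.

```php
<?php
// Rough sketch of the text-file idea: the first request after the cache
// expires runs the heavy query and rewrites the file; every other viewer
// just reads the file. Cache path, TTL and query are assumptions.
$cacheFile = __DIR__ . '/cache/dashboard.json';
$ttl       = 10; // seconds, matching the 10-second refresh

if (file_exists($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
    header('Content-Type: application/json');
    readfile($cacheFile);
    exit;
}

$pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
$rows = $pdo->query('SELECT 1 AS placeholder /* the heavy query goes here */')
            ->fetchAll(PDO::FETCH_ASSOC);

$json = json_encode($rows);
file_put_contents($cacheFile, $json, LOCK_EX); // LOCK_EX so concurrent writers don't clobber each other

header('Content-Type: application/json');
echo $json;
```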

Caching in PHP for speeding up

I am running an application (built on PHP & MySQL) on a VPS. I have an article table with millions of records in it. Whenever a user logs in, I display the last 50 records for each section.
So every time a user logs in or refreshes the page, an SQL query is executed to get those records. There are now a lot of users on the website, and because of that my page speed has dropped significantly.
I did some research on caching and found that I can read the MySQL data per section and number of articles, e.g. (section = 1 and number of articles = 50), and store it in a disk file, cache/md5(section no.).
Then, for future requests for that section, I just get the data from cache/md5(section no).
The above solution looks great, but before I go ahead I would really like to clarify a few doubts with the experts:
Will it really speed up my application? (I know disk I/O is faster than a MySQL query, but I don't know by how much.)
I am currently using pagination on my page: display the first 5 articles, and when the user clicks "display more", display the next 5 articles, and so on. This is easy to do in a MySQL query, but I have no idea how I should do it if I store all 50 records in a cache file. If someone could share some info, that would be great.
Any alternative solution if you believe the above will not work?
Any open-source (PHP) application that you know of?
Thank you in advance
Regards,
Raj
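For reference, a minimal sketch of the cache/md5(section) scheme described in the question, including "display more" pagination over the cached 50 rows with array_slice(); the table layout, TTL and helper name are assumptions, not a drop-in implementation.

```php
<?php
// Rough sketch of the cache/md5(section) scheme from the question, plus
// "display more" pagination over the cached rows with array_slice().
// Table layout, TTL and helper name are assumptions.
function latest_articles(PDO $pdo, int $section, int $offset = 0, int $perPage = 5): array
{
    $cacheFile = __DIR__ . '/cache/' . md5('section-' . $section);
    $ttl       = 300; // refresh the cached 50 rows every 5 minutes

    if (file_exists($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        $articles = unserialize(file_get_contents($cacheFile));
    } else {
        $stmt = $pdo->prepare(
            'SELECT id, title, body FROM article
             WHERE section = ? ORDER BY id DESC LIMIT 50'
        );
        $stmt->execute([$section]);
        $articles = $stmt->fetchAll(PDO::FETCH_ASSOC);
        file_put_contents($cacheFile, serialize($articles), LOCK_EX);
    }

    // Pagination happens in PHP, over the cached array.
    return array_slice($articles, $offset, $perPage);
}

// First page of section 1, then the next 5 when "display more" is clicked:
// $page1 = latest_articles($pdo, 1, 0);
// $page2 = latest_articles($pdo, 1, 5);
```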
I ran into the same issue where every page load results in 2+ queries being run. Thankfully they're very similar queries being run over and over so caching (like your situation) is very helpful.
You have a couple options:
offload the database to a separate VPS on the same network to scale it up and down as needed
cache the data from each query and try to retrieve from the cache before hitting the database
In the end we chose both, installing Memcached and its PHP extension for query-caching purposes. Memcached is a key-value store (much like a PHP associative array) with a set expiration time, measured in seconds, for each value stored. Since it stores everything in RAM, the trade-off for volatile cache data is extremely fast read/write times, much better than the filesystem.
Our implementation was basically to run every query through a filter; if it's a SELECT statement, cache it by setting the Memcached key to "namespace_[md5 of query]" and the value to a serialized version of an array with all resulting rows. Caching for 120 seconds (2 minutes) should be more than enough to help with the server load.
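A minimal sketch of that SELECT filter using the Memcached extension; the key prefix, the $pdo handle and the server details are assumptions.

```php
<?php
// Minimal sketch of the SELECT filter described above, using the
// Memcached extension. Key prefix, $pdo and server details are assumptions.
function cached_select(PDO $pdo, Memcached $mc, string $sql, array $params = []): array
{
    $key = 'myapp_' . md5($sql . serialize($params));

    $rows = $mc->get($key);
    if ($rows !== false) {
        return $rows;                    // cache hit
    }

    $stmt = $pdo->prepare($sql);
    $stmt->execute($params);
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

    $mc->set($key, $rows, 120);          // cache for 120 seconds
    return $rows;
}

// $mc = new Memcached();
// $mc->addServer('127.0.0.1', 11211);
// $articles = cached_select($pdo, $mc, 'SELECT * FROM article WHERE section = ? ORDER BY id DESC LIMIT 50', [1]);
```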
If Memcached isn't a viable solution, store all 50 articles for each section as an RSS feed. You can pull all the articles at once, grabbing the content of each article with SimpleXML and wrapping it in your site's article template HTML, as per the site design. Once the data is there, use CSS styling to only display X articles, using JavaScript for pagination.
Since two processes modifying the same file at the same time would be a bad idea, have the addition of a new story to a section trigger an event that adds the story to a message queue. That message queue would be processed by a worker which does two consecutive things, also using SimpleXML:
Remove the oldest story at the end of the XML file
Add a newer story given from the message queue to the top of the XML file
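A rough sketch of such a worker; SimpleXML can only append, so the prepend step drops down to the underlying DOM functions. The feed path, the 50-item cap and the shape of the $story array coming off the queue are assumptions.

```php
<?php
// Rough sketch of the queue worker described above. The feed path, the
// 50-item cap and the shape of $story (off the message queue) are assumptions.
function update_section_feed(string $feedFile, array $story): void
{
    $xml     = simplexml_load_file($feedFile);
    $channel = dom_import_simplexml($xml->channel);
    $items   = $channel->getElementsByTagName('item');

    // 1. Remove the oldest story at the end of the feed once it is full.
    if ($items->length >= 50) {
        $channel->removeChild($items->item($items->length - 1));
    }

    // 2. Add the new story from the message queue to the top of the feed.
    $doc  = $channel->ownerDocument;
    $item = $doc->createElement('item');
    $item->appendChild($doc->createElement('title', htmlspecialchars($story['title'])));
    $item->appendChild($doc->createElement('description', htmlspecialchars($story['body'])));
    $item->appendChild($doc->createElement('pubDate', date(DATE_RSS)));
    $channel->insertBefore($item, $items->length ? $items->item(0) : null);

    $xml->asXML($feedFile);  // changes made through DOM are visible to SimpleXML
}
```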
If you'd like, RSS feeds according to section can be a publicly facing feature.

Best way to speed up search result display

We've been prototyping a search results system for a MySQL database with about 2 million names and addresses and 3 million associated subscription and conference attendance records.
At the moment the search is executed and all results are returned; for each result I then execute a second query to look up subscriptions/conferences for the person's unique ID. I've got indexes on all the important columns and the individual queries execute quite quickly in phpMyAdmin (0.0xxx seconds), but feed this into a webpage to display (PHP, paged using DataTables) and the page takes seconds to render. We've tried porting the data to a Lucene index and it's like LIGHTNING, but the bottleneck still seems to be displaying the results rather than retrieving them.
I guess this is due to the overhead of building, serving and rendering the page in the browser. I think I can remove the subquery mentioned above by using GROUP_CONCAT to get the subscription codes in the original query, but how can I speed up the display of the results page?
I'm thinking little-and-often querying with AJAX / server-side paging might be the way to go here (maybe fetch 50 results at a time: the query is smaller, the page is smaller and can be served quicker), but I welcome any suggestions you might have.
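For the GROUP_CONCAT idea, a rough sketch of what the collapsed query might look like; the table and column names here are guesses, not the real schema.

```php
<?php
// Rough sketch of the collapsed query using GROUP_CONCAT; table and
// column names are guesses, not the real schema.
$pdo        = new PDO('mysql:host=localhost;dbname=crm', 'user', 'secret');
$searchTerm = $_GET['q'] ?? '';

$sql = "SELECT p.id,
               p.name,
               p.address,
               GROUP_CONCAT(DISTINCT s.subscription_code) AS subscriptions,
               GROUP_CONCAT(DISTINCT c.conference_code)   AS conferences
        FROM   people p
        LEFT   JOIN subscriptions s ON s.person_id = p.id
        LEFT   JOIN conferences   c ON c.person_id = p.id
        WHERE  p.surname LIKE :surname
        GROUP  BY p.id, p.name, p.address
        LIMIT  50";

$stmt = $pdo->prepare($sql);
$stmt->execute([':surname' => $searchTerm . '%']);
$results = $stmt->fetchAll(PDO::FETCH_ASSOC);  // one row per person, codes comma-separated
```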
Even if you are using pagination with DataTables, all the results are loaded into the page source up front unless you use its server-side processing feature.
Loading 2 million rows at once will always render slowly. You have to go for server-side pagination; it can be done via AJAX or a normal PHP script.
You can also consider using a cache system to speed up the loading of data from the server and avoid calling the database when it is not needed. If your data can change at any time, you can always use a function that checks whether the data has changed since it was last cached and, if so, updates the cached data.
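A minimal sketch of a server-side paging endpoint in the shape DataTables' serverSide mode expects (draw/start/length in, data plus counts out); the connection details and the people table are assumptions.

```php
<?php
// Minimal server-side paging endpoint for DataTables' serverSide mode
// (draw/start/length in, data plus counts out). Connection details and
// the people table are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=crm', 'user', 'secret',
               [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);

$draw   = (int)($_GET['draw']   ?? 1);
$start  = (int)($_GET['start']  ?? 0);   // offset
$length = (int)($_GET['length'] ?? 50);  // page size

$total = (int)$pdo->query('SELECT COUNT(*) FROM people')->fetchColumn();

$stmt = $pdo->prepare('SELECT id, name, address FROM people ORDER BY name LIMIT :len OFFSET :off');
$stmt->bindValue(':len', $length, PDO::PARAM_INT);
$stmt->bindValue(':off', $start,  PDO::PARAM_INT);
$stmt->execute();

header('Content-Type: application/json');
echo json_encode([
    'draw'            => $draw,
    'recordsTotal'    => $total,
    'recordsFiltered' => $total,  // no search filter applied in this sketch
    'data'            => $stmt->fetchAll(PDO::FETCH_ASSOC),
]);
```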

MySQL queries at a certain time

I'm supposed to run queries against a MySQL database once a day and display the data on the page... this sounds like a cron job. I've never done this before and I'd like your opinion.
If I run the query once a day, I have to save the data in a file, let's say an XML file, and every time the page reloads it has to parse the data from that file.
From my point of view, it would be faster and more user-friendly to run the query every time the page loads, as the data would be fresh...
Any help please ....
Thanks for your answers, I'll update my question... I don't think the queries would be extensive: something like finding the most popular categories among the articles, the most popular cities the authors are from... three queries of that kind. So the data pulled out of the database will rely on only two tables, max three, and only one of them will have dynamic data; the others will be small.
I didn't ask yet why ... because it is not available at the moment ...
It all depends on the load on the server. If users are requesting this data only a few times a day, then pulling the data on each request should be fine (KISS first). However, if they are hammering the server many times and the request is slow on top of that, then you should store the data somewhere. I would suggest storing it in a table and clearing that table each night on a successful reload.
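A rough sketch of that approach, assuming a hypothetical daily_stats table and a nightly cron entry; the queries are placeholders for the real aggregations.

```php
<?php
// refresh_stats.php -- rough sketch, run nightly from cron, e.g.:
//   0 3 * * * /usr/bin/php /var/www/scripts/refresh_stats.php
// The daily_stats table and the aggregation query are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=site', 'user', 'secret',
               [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);

$pdo->beginTransaction();
$pdo->exec('DELETE FROM daily_stats');   // clear yesterday's numbers
$pdo->exec("INSERT INTO daily_stats (stat, label, total)
            SELECT 'popular_category', category, COUNT(*)
            FROM   articles
            GROUP  BY category
            ORDER  BY COUNT(*) DESC
            LIMIT  10");
$pdo->commit();

// The page then only runs:
// SELECT label, total FROM daily_stats WHERE stat = 'popular_category'
```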
If this is a normal query that doesn't take long to execute, there is no reason to cache the result in a file. MySQL also has caching built in, which may be closer to what you want.
That would depend on the complexity of the query. If the "query" is actually going through a lot of work to build a dataset, or querying a dozen different database servers, i can see only doing it once per day.
For example, if you own a chain of stores across 30 states and 5 countries, each with their own stock-levels, and you want to display local stock levels on your website, i can see only going through the trouble of doing that once per day...
If efficiency is the only concern, it should be pretty easy to estimate which is better:
Time to run query + (time to load XML × estimated visits)
versus
Time to run query × estimated visits
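For example, with made-up numbers: if the query takes 3 seconds, loading the XML takes 0.05 seconds, and you expect 1,000 visits a day, the cached approach costs roughly 3 + (0.05 × 1000) = 53 seconds of work per day, versus 3 × 1000 = 3000 seconds for running the query on every visit.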
