We've been prototyping a search results system for a MySQL database with about 2 million names and addresses and 3 million associated subscription and conference attendance records.
At the moment the search is executed and all results are returned; for each result I then execute a second query to look up subscriptions / conferences for the person's unique ID. I've got indexes on all the important columns, and the individual queries execute quite quickly in phpMyAdmin (0.0xxx seconds), but feed this into a webpage to display (PHP, paged using DataTables) and the page takes seconds to render. We've tried porting the data to a Lucene database and it's like LIGHTNING, but the bottleneck still seems to be displaying the results rather than retrieving them.
I guess this is due to the overhead of building, serving and rendering the page in the browser. I think I can remove the per-result query I mention above by using GROUP_CONCAT to pull the subscription codes in the original query, but how can I speed up the display of the results page?
I'm thinking that querying little and often with AJAX / server-side paging might be the way to go here (maybe fetch 50 results at a time: the query is smaller, the page is smaller and can be served more quickly), but I welcome any suggestions you might have.
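For reference, a minimal sketch of the GROUP_CONCAT idea mentioned above; the people / subscriptions table and column names are assumptions for illustration, not the real schema:

    SELECT p.id,
           p.name,
           p.address,
           GROUP_CONCAT(DISTINCT s.code ORDER BY s.code SEPARATOR ', ') AS subscription_codes
    FROM people p
    LEFT JOIN subscriptions s ON s.person_id = p.id
    WHERE p.name LIKE 'smith%'
    GROUP BY p.id, p.name, p.address
    LIMIT 50;

The LEFT JOIN keeps people with no subscriptions in the result, and each person comes back as one row with the codes already rolled up, so the per-result lookup query disappears.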
Even if you are using pagination with DataTables, all of the results are loaded into the page source up front unless you use its server-side processing feature.
Loading 2 million rows at once will always render slowly. You have to go for server-side pagination, whether via AJAX or a normal PHP script.
You can also consider using a caching system to speed up loading the data and to avoid calling the database when it isn't needed. If your data can change at unpredictable times, you can use a function to check whether the data has changed since you last cached it and, if so, update the cached copy.
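A minimal sketch of such a server-side processing endpoint, assuming DataTables 1.10+ parameter names (draw, start, length, search[value]) and the illustrative people / subscriptions schema from above; adjust to your real tables:

    <?php
    // search.php - hypothetical server-side processing endpoint for DataTables
    $pdo = new PDO('mysql:host=localhost;dbname=subscribers;charset=utf8', 'user', 'pass');

    $draw   = isset($_GET['draw'])   ? (int) $_GET['draw']   : 1;  // echoed back to DataTables
    $start  = isset($_GET['start'])  ? (int) $_GET['start']  : 0;  // offset of first row
    $length = isset($_GET['length']) ? (int) $_GET['length'] : 50; // page size
    $search = isset($_GET['search']['value']) ? $_GET['search']['value'] : '';

    // Unfiltered and filtered row counts for the paging controls
    $total = (int) $pdo->query('SELECT COUNT(*) FROM people')->fetchColumn();
    $countStmt = $pdo->prepare('SELECT COUNT(*) FROM people WHERE name LIKE :search');
    $countStmt->execute(array(':search' => $search . '%'));
    $filtered = (int) $countStmt->fetchColumn();

    // Only the current page of results, with subscription codes rolled up in one query
    $stmt = $pdo->prepare(
        "SELECT p.id, p.name, p.address,
                GROUP_CONCAT(s.code SEPARATOR ', ') AS subscription_codes
         FROM people p
         LEFT JOIN subscriptions s ON s.person_id = p.id
         WHERE p.name LIKE :search
         GROUP BY p.id, p.name, p.address
         LIMIT $start, $length"   // both values are already cast to int above
    );
    $stmt->execute(array(':search' => $search . '%'));

    header('Content-Type: application/json');
    echo json_encode(array(
        'draw'            => $draw,
        'recordsTotal'    => $total,
        'recordsFiltered' => $filtered,
        'data'            => $stmt->fetchAll(PDO::FETCH_ASSOC),
    ));

The browser then only ever receives one page of rows, which is what keeps the render time down.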
Related
I'm developing an application using the CakePHP framework, with MySQL as the database.
The maximum number of records to fetch from a table is not more than 2000.
To show them on an index page, I fetch all 2000.
For pagination I use the JavaScript DataTables plugin.
The user base of the application is no more than 10 users.
I found that the delay in fetching the roughly 2000 records is a negligible part of the page load. And since all the records are on the client side, searching via DataTables happens without any delay.
Modern web browsers and internet speeds seem to handle this amount of data without blinking.
Discussing this with some colleagues, they told me that I should paginate the fetching, 10 records (for example) at a time, making a query to the server for every 10 rows.
My take on this is that fetching 10 records at a time (I'm not using Ajax) could hurt the user experience, since it implies a delay on every click and on every search. It also seems to me that server-side performance and efficiency are not really a serious concern in this case.
Taking all of these considerations into account, which of these two options is the correct one?
I posted a question on meta about the scope of this question. I can edit it to be narrower if needed.
I am working on a project where I need to run a large number of SQL queries on a single page.
My question is: will I have problems in the future if my site gets heavy traffic?
I do not want my site to slow down.
Please suggest ways to keep the number of queries from affecting my site's performance.
I am working in PHP.
A query may look like this:
$selectcomments=mysql_query("select `comment`,`email`,`Date` from `fk_views` where (`onid`='$idselect_forcomments' and comment !='') order by Date asc");
Of course, if your site gets bigger, you will have problems putting everything on one page. That's just how it is, and you can't change it.
Different solutions:
Pagination: you could create a pagination system (plenty of tutorials out there... http://net.tutsplus.com/tutorials/php/how-to-paginate-data-with-php/)
If it's possible, divide your pages. Don't put all the comments on one and only one page. Try to have different pages, with different types of data, so the load is divided.
It's obvious that if your database gets too big, it'll be impossible to simply dump all the data on one page. Even the quickest browsers would crash.
One thing you can do is use Memcached. It stores those results in a cache, so the next visitor who clicks on the same page reads the cached objects instead of hitting SQL again.
Another trick: the ORDER BY Date ASC can really slow the query down on a huge result set if MySQL has to do a full table scan; in that case it can be better and faster to sort on the PHP side.
Also, as Yannik said: pagination (this is basic, of course) and dividing your pages.
You can reduce the delay from pagination by pre-executing SQL with AJAX, for example fetching the total result count needed for the pagination up front.
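A minimal sketch of the Memcached idea, using PHP's Memcached extension and sticking with the mysql_* functions from the question (it assumes an existing mysql_connect() connection and a numeric onid; the key name and 10-minute TTL are arbitrary choices):

    <?php
    // Hypothetical cache wrapper around the comments query
    $memcached = new Memcached();
    $memcached->addServer('127.0.0.1', 11211);

    $cacheKey = 'comments_onid_' . (int) $idselect_forcomments;
    $comments = $memcached->get($cacheKey);

    if ($comments === false) {
        // Cache miss: run the query once, then keep the rows for 10 minutes
        $result = mysql_query("SELECT `comment`, `email`, `Date` FROM `fk_views`
                               WHERE `onid` = '" . (int) $idselect_forcomments . "' AND `comment` != ''
                               ORDER BY `Date` ASC");
        $comments = array();
        while ($row = mysql_fetch_assoc($result)) {
            $comments[] = $row;
        }
        $memcached->set($cacheKey, $comments, 600);
    }

    // $comments now holds the rows, whether they came from the cache or the database

Every visitor after the first within those 10 minutes skips the database entirely.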
Yes, obviously if you have a lot of queries on a single page, then even moderate traffic can flood your database with queries.
A few tips:
1) Work on your database structure: how you create your tables, which table stores what, normalization, etc. Try to optimise storage and retrieval so that a single query fetches as much information as possible. This reduces the number of calls to the database.
2) Never store and fetch redundant information (like age, which you can calculate from the date of birth).
3) Pagination (as pointed out earlier; see the sketch after this list).
4) Caching.
5) If you are only updating small portions of your page at a time, then instead of reloading the entire page, use AJAX to update just those portions. This also improves interactivity.
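A minimal sketch of tip 3 applied to the comments query from the question, using PDO with a prepared statement; the page size of 20 and the page URL parameter are illustrative assumptions:

    <?php
    // Hypothetical paginated version of the comments query
    $pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8', 'user', 'pass');

    $perPage = 20;
    $page    = isset($_GET['page']) ? max(1, (int) $_GET['page']) : 1;
    $offset  = ($page - 1) * $perPage;

    $stmt = $pdo->prepare(
        "SELECT `comment`, `email`, `Date`
         FROM `fk_views`
         WHERE `onid` = :onid AND `comment` != ''
         ORDER BY `Date` ASC
         LIMIT $offset, $perPage"   // both values are plain integers computed above
    );
    $stmt->execute(array(':onid' => $idselect_forcomments));
    $comments = $stmt->fetchAll(PDO::FETCH_ASSOC);

Only 20 rows travel from MySQL to PHP per request, and the bound :onid parameter also removes the injection risk of interpolating the variable directly into the query string.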
I'm supposed to query a MySQL database once a day and display the data on a page, and this sounds like a cron job. I have never done this before and I'd like your opinion.
If I run the query once a day, I have to save the data in a file, let's say an XML file, and every time the page loads it has to parse the data from that file.
From my point of view, it would be faster and more user-friendly to run the query every time the page loads, as the data would always be fresh.
Any help please.
Thanks for your answers, I'll update my question. I don't think the queries would be extensive: something like finding the most popular categories of articles and the most popular cities the authors are from; three queries of that kind. So the data pulled from the database will rely on only two tables, at most three, and only one of them will have dynamic data; the others will be small.
I haven't asked why yet ... because it is not available at the moment ...
It all depends on the load on the server. If users request this data only a few times a day, then pulling the data on each request should be fine (KISS first). However, if they are slamming the server many times and the request is slow on top of that, then you should store the data off. I would suggest storing it in a table and just clearing the table each night on a successful reload.
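A minimal sketch of that nightly-refresh idea; the popular_categories summary table, the articles query, and the cron schedule are all made up for the example:

    <?php
    // refresh_stats.php - hypothetical script run once a day from cron, e.g.:
    //   0 3 * * * /usr/bin/php /path/to/refresh_stats.php
    $pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8', 'user', 'pass');

    $pdo->beginTransaction();
    try {
        // Throw away yesterday's summary and rebuild it from the live tables
        $pdo->exec("DELETE FROM popular_categories");
        $pdo->exec(
            "INSERT INTO popular_categories (category, article_count)
             SELECT category, COUNT(*)
             FROM articles
             GROUP BY category
             ORDER BY COUNT(*) DESC
             LIMIT 10"
        );
        $pdo->commit();
    } catch (Exception $e) {
        $pdo->rollBack();
        // Log the failure so stale data gets noticed
    }

The index page then just reads SELECT * FROM popular_categories, which stays fast no matter how heavy the underlying query is.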
If this is a normal query that doesn't take long to execute, there is no reason to cache the result in a file. MySQL also has caching built in, which may be closer to what you want.
That would depend on the complexity of the query. If the "query" actually does a lot of work to build a dataset, or queries a dozen different database servers, I can see running it only once per day.
For example, if you own a chain of stores across 30 states and 5 countries, each with their own stock levels, and you want to display local stock levels on your website, I can see only going through the trouble of doing that once per day...
If efficiency is the only concern, it should be pretty easy to estimate which is better:
Time to run query + (time to load XML × estimated visits)
versus
Time to run query × estimated visits
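With made-up numbers, say the query takes 2 seconds, loading the cached XML takes 0.05 seconds, and you expect 1,000 visits a day: the cached approach costs roughly 2 + (0.05 × 1,000) = 52 seconds of work per day, while querying on every visit costs 2 × 1,000 = 2,000 seconds. If instead the query only takes 0.01 seconds, querying every time costs just 10 seconds a day and the cache buys you very little.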
I am looking to display 60,000 records on a webpage, with PHP pulling the records from a MySQL database on localhost. These 60,000 records may change depending on the data input.
The records have 5 text fields, and due to the sheer number of records a significant amount of time is taken to send the data from the MySQL server to the web browser. Even on localhost, the time taken is around 15 seconds. During this time, the page is empty.
I would like to seek professional opinion on how to either
1. display the data with an alternative method (which I'm not sure what that would be), or
2. speed up the sending of data from the MySQL server to the web browser using a caching technology like memcache.
In the end I will be deploying the application on the internet, where a lag like this (i.e. > 15 seconds) would be completely unacceptable.
Thank you and Best Regards!
I would suggest trying AJAX pagination. No user will be able to see and analyze 60k records at one time. You can have PHP display the first x records (however many fit on the average screen or two) to fill 2-3 pages, and have JavaScript listen for scrolling. If a user starts scrolling down, have it automatically query the next y records and add them to the display list, possibly also removing records from the top of the list.
Also, adding some quick-jump links or a search feature could help, as you wouldn't want to scroll through 60k records to make changes.
This will significantly lighten the server and client load, as it only has to serve up a couple of hundred records at a time.
DataTable
You should have a look at YUI's DataTable. You should hook the DataTable up to autocomplete. There is also an example of how they did it in YUI2, but YUI3 is a lot faster.
Caching
Caching is also important. You say you could use memcached, which is very good. I am a big fan of Redis (both will work, but I think Redis is better suited for autocomplete). There is even a free plan at Redis To Go.
Another important tip is to make sure you are getting your data from the database already in the shape you want to display it. In other words, if there is any calculation or processing you have to do, avoid doing it in PHP code inside loops. Use SQL functions to process data, alias fields, etc. Databases are good at that sort of thing. Of course, this may or may not apply to exactly what you're doing.
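A small illustration of that last point, with an invented members table; the per-row formatting happens in MySQL instead of in a PHP loop:

    -- Hypothetical example: let the database do the per-row work
    SELECT CONCAT(first_name, ' ', last_name)   AS display_name,
           DATE_FORMAT(joined_at, '%d %b %Y')   AS joined_pretty,
           TIMESTAMPDIFF(YEAR, dob, CURDATE())  AS age
    FROM members;

The PHP side then just echoes display_name, joined_pretty and age straight into the page, with no string-building or date maths in the loop.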
I have an area that gets populated with about 100 records from a PHP MySQL query.
I need to add AJAX pagination, meaning load 10 records first, then on a user click on, let's say, a "+" character it populates the next 20.
So instead of displaying everything at once, the area will display 20, then on click the next 20, then the next... and so on, with no refresh.
Should I dump the 100 records into a session variable?
Should I run the MySQL query every time the user clicks the next-page trigger?
I'm unsure what the best approach would be, or how to go about this.
My concern is that the database will be growing from 100 to 10,000 records.
Any help or direction is greatly appreciated.
If you have a large record set that will be viewed often (but not often updated), look at APC to cache the data and share it among sessions. You can also create a static file that is rewritten when the data changes.
If the data needs to be sorted/manipulated on the page, you will want to limit the number of records loaded to keep the JavaScript from running too long. ExtJS has some nice widgets that do this; just provide it with JSON data (use PHP's json_encode on your record set). We made one talk to Oracle and do the 20-record paging fairly easily.
If your large record set is frequently updated, and the data must be "real time" accurate, you have some significant challenges ahead. I would look at comet, ape, or web workers for polling/push solution and build your API to only deal in updates to the "core" data--again, probably cached on the server rather than pulled from the DB every time.
Your AJAX call should request a page that pulls only the exact number of rows it needs from the database. So select the first 20 rows of the query for the first page, and so on. Your AJAX call can take a parameter called pagenum, and that determines which records you actually pull from the database. There is no need for session variables.
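A minimal sketch of that pagenum idea; the items table, its columns, and the page size of 20 are assumptions for illustration:

    <?php
    // page.php - hypothetical endpoint called via AJAX as page.php?pagenum=3
    $pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8', 'user', 'pass');

    $perPage = 20;
    $pagenum = isset($_GET['pagenum']) ? max(1, (int) $_GET['pagenum']) : 1;
    $offset  = ($pagenum - 1) * $perPage;

    // Pull only the rows for this page; $offset and $perPage are plain integers
    $stmt = $pdo->query(
        "SELECT id, title, created
         FROM items
         ORDER BY created DESC
         LIMIT $offset, $perPage"
    );

    header('Content-Type: application/json');
    echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));

The JavaScript side appends whatever comes back, so neither the session nor the page ever has to hold all 10,000 rows at once.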