Based on this tutorial I have built a page which functions correctly. I added a couple of dropdown boxes to the page and, based on this snippet, have been able to filter the results accordingly. So, in practice, everything is working as it should. However, my question is regarding the efficiency of the procedure. Right now, the process looks something like this:
1.) User visits the page.
2.) The body onload() handler is called.
3.) JavaScript calls a PHP script, which queries the database (based on criteria passed along via the URL) and exports the query results to an XML file.
4.) The XML file is then parsed via JavaScript on the user's local machine.
For any one search there could be several thousand results (and thus, several thousand markers to place on the map). As you might have guessed, it takes a long time to place all of the markers. I have some ideas to speed it up, but wanted to touch base with experienced users to verify that my logic is sound. I'm open to any suggestions!
Idea #1: Is there a way (and would it speed things up?) to run the query once, generating an XML file via PHP containing all possible results, store the XML file locally, and then do the filtering via JavaScript?
Idea #2: Create a cron job on the server to export the XML file to a known location. Instead of using GDownloadUrl("phpfile.php", ...), I would use GDownloadUrl("xmlfile.xml", ...), thus eliminating the need to run a new query every time the user changes the value of a drop-down box.
Idea #3: Instead of passing criteria back to the PHP file (via the URL), should I just be filtering the results via JavaScript before placing the markers on the map?
I have seen a lot of webpages that place tons and tons of markers on a Google map and it doesn't take nearly as long as my application. What's the standard practice in a situation like this?
Thanks!
Edit: There may be a flaw in my logic: if I were to export all results to an XML file, how (other than JavaScript) could I then filter those results?
Your logic is sound; however, I probably wouldn't do the filtering in JavaScript. If the user's computer is not very fast, performance will be adversely affected. It is better to perform the filtering server side, based on a cached resource (XML in your case).
The database is probably the biggest bottleneck in this operation, so caching the result would most likely speed your application up significantly. You might also check that you have set up your indexes correctly to make your query as fast as possible.
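A minimal sketch of that server-side approach, assuming a hypothetical markers table and a category value passed in the URL by the drop-down boxes; the query result is cached to an XML file regenerated at most once an hour, and every request filters the cached file instead of hitting the database:

```php
<?php
// Hedged sketch: cache the full marker set as XML, filter it per request.
// Table, column, and credential names below are assumptions.
$cacheFile = __DIR__ . '/markers.xml';
$maxAge    = 3600; // regenerate at most once per hour

if (!file_exists($cacheFile) || time() - filemtime($cacheFile) > $maxAge) {
    $db  = new mysqli('localhost', 'user', 'pass', 'mapdb');
    $res = $db->query('SELECT name, lat, lng, category FROM markers');

    $xml = new SimpleXMLElement('<markers/>');
    while ($row = $res->fetch_assoc()) {
        $m = $xml->addChild('marker');
        foreach ($row as $k => $v) {
            $m->addAttribute($k, (string) $v);
        }
    }
    $xml->asXML($cacheFile); // one DB hit per hour instead of one per visitor
}

// Filter the cached XML by the drop-down criteria and return the subset.
$category = isset($_GET['category']) ? $_GET['category'] : '';
$doc = simplexml_load_file($cacheFile);
$out = new SimpleXMLElement('<markers/>');
foreach ($doc->marker as $marker) {
    if ($category === '' || (string) $marker['category'] === $category) {
        $m = $out->addChild('marker');
        foreach ($marker->attributes() as $k => $v) {
            $m->addAttribute($k, (string) $v);
        }
    }
}
header('Content-Type: text/xml');
echo $out->asXML();
```

Your JavaScript keeps calling GDownloadUrl on this script exactly as before; only the cost behind it changes.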
My stack is PHP and MySQL.
I am trying to design a page to display details of a mutual fund.
Data for a single fund is distributed over 15-20 different tables.
Currently, my front end is a brute-force PHP page that queries/joins these tables using 8 different queries for a single scheme. It's messy and performs poorly.
I am considering alternatives. Good thing is that the data changes only once a day, so I can do some preprocessing.
An option that I am considering is to run these queries for every fund (about 2000 funds), create a complex JSON object for each of them, store it in MySQL indexed by fund code, retrieve the JSON at run time, and show the data. I am thinking of using MySQL's JSON_OBJECT() function to create the JSON, and json_decode() in PHP to get the values for display. Is this a good approach?
I was tempted to store them in a separate MongoDB store - would that be overkill for this?
Any other suggestion?
Thanks much!
To meet your objective of quick pageviews, your overnight-run approach is very good. You could generate JSON objects with your distilled data, or even prerendered HTML pages, and store them.
You can certainly store JSON objects in MySQL columns. If you don't need the database server to search the objects, simply use TEXT (or LONGTEXT) data types to store them.
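As a rough sketch of that pattern (the table and column names here are made up), the overnight job writes one row per fund and the page does a single primary-key lookup:

```php
<?php
// Hedged sketch: a precomputed-JSON cache table. All names are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=funds', 'user', 'pass');

// One-time setup: LONGTEXT is enough if MySQL never searches inside the JSON.
$pdo->exec('CREATE TABLE IF NOT EXISTS fund_json (
                fund_code VARCHAR(20) PRIMARY KEY,
                payload   LONGTEXT NOT NULL,
                built_at  DATETIME NOT NULL
            )');

// Overnight job: run your 8 queries per fund, merge the rows, store one blob.
function rebuildFund(PDO $pdo, string $code): void
{
    $data = []; // ...fill from the 15-20 source tables...
    $stmt = $pdo->prepare('REPLACE INTO fund_json (fund_code, payload, built_at)
                           VALUES (?, ?, NOW())');
    $stmt->execute([$code, json_encode($data)]);
}

// Page view: one indexed lookup instead of 8 joins.
$stmt = $pdo->prepare('SELECT payload FROM fund_json WHERE fund_code = ?');
$stmt->execute([isset($_GET['fund']) ? $_GET['fund'] : '']);
$payload = $stmt->fetchColumn();
$fund    = $payload !== false ? json_decode($payload, true) : null;
```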
To my way of thinking, adding a new type of server (MongoDB) to your operations just to store a few thousand JSON objects does not seem worth the trouble. If you find it necessary to search the contents of your JSON objects, however, another type of server might be useful.
Other things to consider:
Optimize your SQL queries. Read up: https://use-the-index-luke.com and other sources of good info. Consider your queries one-by-one starting with the slowest one. Use the EXPLAIN or even the EXPLAIN ANALYZE command to get your MySQL server to tell you how it plans each query. And judiciously add indexes. Using the query-optimization tag here on StackOverflow, you can get help. Many queries can be optimized by adding indexes to MySQL without changing anything in your php code or your data. So this can be an ongoing project rather than a big new software release.
Consider measuring your query times. You can do this with MySQL's slow query log. The point of this is to identify your "dirty dozen" slowest queries in a particular time period. Then, see step one.
Make your pages fill up progressively, to keep your users busy reading while you get the data they need. Put the top-level stuff (fund name, etc.) in server-side HTML so search engines can see it. Use some sort of front-end tech (React, maybe, or DataTables fetching data via AJAX) to render your pages client-side, and provide REST endpoints on your server to get the data, in JSON format, for each data block in the page (see the sketch after this list).
In your overnight run create a sitemap file along with your JSON data rows. That lets you control exactly how you want search engines to present your data.
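Such an endpoint can stay tiny because the JSON is already built; a hedged sketch, reusing the hypothetical fund_json table from above:

```php
<?php
// Hedged sketch of one REST endpoint for progressive page fill,
// e.g. GET /fund-data.php?fund=ABC123. Names are assumptions.
$pdo  = new PDO('mysql:host=localhost;dbname=funds', 'user', 'pass');
$stmt = $pdo->prepare('SELECT payload FROM fund_json WHERE fund_code = ?');
$stmt->execute([isset($_GET['fund']) ? $_GET['fund'] : '']);

$payload = $stmt->fetchColumn();
if ($payload === false) {
    http_response_code(404);
    exit;
}
header('Content-Type: application/json');
echo $payload; // already JSON; no decode/encode round trip needed
```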
Instead of eval(), I am investigating the pros and cons of creating .php files on the fly using PHP code.
Mainly because the generated code should be available to other visitors and for a long period of time, not only for the current session. The generated .php files are created using functions dedicated to that and only that, under highly controlled conditions (no user input will ever reach those code files).
So, performance-wise, how much load is put on the web server when creating .php files for later execution via include(), compared to updating a database record and querying the database on every visit?
The generated files will be updated (overwritten) quite often, but not nearly as often as they will be executed.
What are the other pros/cons? Does the possibility of one user overwriting the code files at the same time as others are executing them call for complicated concurrency handling? A mutex? Is it next to impossible to overwrite the files if visitors are constantly "viewing" (executing) them?
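For the overwrite-while-executing worry, one approach I am considering (just a sketch; the paths are examples) is to write the new source to a temporary file on the same filesystem and rename() it into place. As I understand it, rename() is atomic on POSIX filesystems, so a request that has already include()d the old file keeps running it, and the next include() sees the new one:

```php
<?php
// Hedged sketch: atomically replace a generated code file.
function writeGeneratedFile(string $target, string $phpSource): void
{
    $tmp = tempnam(dirname($target), 'gen_'); // same filesystem as $target
    file_put_contents($tmp, $phpSource);
    rename($tmp, $target); // atomic: readers see old or new, never a mix

    // If an opcode cache is in use, drop the stale compiled entry too.
    if (function_exists('opcache_invalidate')) {
        opcache_invalidate($target, true);
    }
}

writeGeneratedFile(
    __DIR__ . '/generated/object_42.php',            // example path
    "<?php\nreturn ['state' => 'closed'];\n"
);
```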
PS. I am not interested in alternative methods/solutions for reaching "the same" goal, like:
Cached and/or saved output buffers, as an alternative, are out of the question, mainly because the output from the generated PHP code is highly dynamic and context-sensitive.
Storing the code as variables in a database and creating dynamic PHP code that can do what is requested based on the stored data, mainly because I don't want to use a database as the backend for this feature. I never need to search the data, or query it for aggregation, ranking, or any other data collection or manipulation.
Memcached, APC etcetera. It's not a caching feature I want
Stand-alone (not PHP) server with a custom compiled binary running in memory. Not what I am looking for here, although this alternative has crossed my mind.
EDIT:
Got many questions about what "type" of code is generated. Without getting into details I can say: it's very context-sensitive code. The code is not based on direct user input, but on input in terms of choices, positions and flags - like "closed" objects in relation to other objects. Most code parts are related to each other in many different, but very controlled, ways (similar to linked lists, genetic cells in AI code, etc.), so querying a database is out of the question. One code file will include one or more others, and so on.
I do the same thing in an application. It generates static PHP code from data in a MySQL database. I store the code in memcached and use eval() to execute it. Only when something changes in the MySQL database do I regenerate the PHP. It saves an awful lot of MySQL reads.
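Stripped down, that pattern might look like this (a sketch; the key name and the code generator are placeholders):

```php
<?php
// Hedged sketch: cache generated PHP source in Memcached, eval() it,
// and rebuild only on a cache miss. Key/function names are placeholders.
function generateCodeFromDb(int $id): string
{
    // Placeholder: build PHP source from MySQL rows for object $id.
    return "echo 'object {$id}';";
}

$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$key  = 'gencode:object:42';
$code = $mc->get($key);

if ($code === false) {              // miss: regenerate from MySQL
    $code = generateCodeFromDb(42);
    $mc->set($key, $code);
}
eval($code);                        // note: $code must not start with '<?php'

// When the underlying rows change, delete the key so the code is rebuilt:
// $mc->delete('gencode:object:42');
```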
I am writing an application where it is necessary to fetch data from a third-party website. Unfortunately, a specific piece of info needed (a hotel name) can only be obtained by cURLing the webpage and then parsing it (I'm using XPath) looking for an <h1> DOM element.
Since I'm going to run this script many times within the day, and I'll probably have to fetch the same hotel names again and again, I thought that a caching mechanism would be good: checking whether the hotel has been parsed in the past, and then deciding whether to make the webpage request or not.
However, I have two concerns. First, is this better implemented in a DB (since there will be an ID-to-hotel-name mapping) or in a file? Second, is this "optimization" worth the whole trouble? Will I gain some significant speed-up?
Go with the DB, because it will give you more flexibility and functionality for data manipulation (filtering, sorting, etc.) by default.
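As a rough sketch of the DB route (the table, columns, and URL below are made up), the lookup-before-cURL could be as simple as:

```php
<?php
// Hedged sketch: consult a cache table before cURLing. Names are assumptions.
$hotelId = 12345; // example ID

$pdo  = new PDO('mysql:host=localhost;dbname=cache', 'user', 'pass');
$stmt = $pdo->prepare('SELECT name FROM hotel_names WHERE hotel_id = ?');
$stmt->execute([$hotelId]);

$name = $stmt->fetchColumn();
if ($name === false) {                          // not cached yet: fetch and parse
    $ch = curl_init("https://example.com/hotels/{$hotelId}");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $html = curl_exec($ch);
    curl_close($ch);

    $doc = new DOMDocument();
    @$doc->loadHTML($html);                     // silence warnings from messy HTML
    $h1   = (new DOMXPath($doc))->query('//h1');
    $name = $h1->length ? trim($h1->item(0)->textContent) : '';

    $pdo->prepare('INSERT INTO hotel_names (hotel_id, name) VALUES (?, ?)')
        ->execute([$hotelId, $name]);
}
echo $name;
```

Whether it is worth it depends on the hit rate, but a primary-key lookup is orders of magnitude cheaper than an HTTP round trip plus HTML parsing.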
I have a large dataset of around 600,000 values that need to be compared, swapped, etc. on the fly for a web app. The entire dataset must be loaded, since some calculations will require skipping values, comparing out of order, and so on.
However, each value is only 1 byte.
I considered loading it as a giant JSON array, but this page makes me think that might not work dependably: http://www.ziggytech.net/technology/web-development/how-big-is-too-big-for-json/
At the same time, forcing the server to load it all for every request seems like a waste of server resources, since the clients can do the number crunching just as easily.
So I guess my question is this:
1) Is this possible to do reliably in jQuery/JavaScript, and if so, how?
2) If jQuery/JavaScript is not the better option, what would be the best way to do this in PHP (reading in files vs. a giant array via include?)
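To make option 2 concrete, this is roughly what I am picturing on the PHP side (just a sketch; the file name is made up). Since each value is one byte, a plain string can serve as the whole array:

```php
<?php
// Sketch: a 600,000-byte string behaves like a byte array with O(1) access.
$data = file_get_contents(__DIR__ . '/values.bin'); // ~600 KB in memory

$i = 123456;
$j = 42;
$value = ord($data[$i]); // read the value at position $i

// Compare and swap out of order, e.g. swap positions $i and $j:
if (ord($data[$i]) > ord($data[$j])) {
    $tmp      = $data[$i];
    $data[$i] = $data[$j];
    $data[$j] = $tmp;
}
```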
Thanks!
I know Apache Cordova can make SQL queries.
http://docs.phonegap.com/en/2.7.0/cordova_storage_storage.md.html#Storage
I know it's PhoneGap, but it works on desktop browsers (at least all the ones I've used for phone app development).
So my suggestion:
Mirror your database in each user's local Cordova database, then run all the SQL queries you want!
Some tips:
-Transfer data from your server to the webapp via JSON
-Break the data requests down into a few parts. That way you can easily provide a progress bar instead of waiting for the entire database to download
-Create a table with one entry that keeps the current version of your database, and check this table before you send all that data. Change it each time you want to 'force' an update. This keeps the user's database up to date and lowers bandwidth
If you need a push in the right direction I have done this before.
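The server side of that sync can stay very small; here is a hedged PHP sketch (the endpoint, table names, and version scheme are assumptions) implementing the version check and chunked transfer from the tips above:

```php
<?php
// Hedged sketch: JSON sync endpoint with a version check. Names are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$clientVersion = isset($_GET['version']) ? (int) $_GET['version'] : 0;
$serverVersion = (int) $pdo->query('SELECT version FROM db_version')->fetchColumn();

header('Content-Type: application/json');
if ($clientVersion >= $serverVersion) {
    echo json_encode(['upToDate' => true]); // nothing to transfer
    exit;
}

// Ship one chunk at a time so the app can show a progress bar.
$offset = isset($_GET['offset']) ? (int) $_GET['offset'] : 0;
$stmt   = $pdo->prepare('SELECT * FROM items LIMIT 500 OFFSET ?');
$stmt->bindValue(1, $offset, PDO::PARAM_INT);
$stmt->execute();

echo json_encode([
    'version' => $serverVersion,
    'rows'    => $stmt->fetchAll(PDO::FETCH_ASSOC),
]);
```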
I'm going to add a simple live search to a website (tips shown while entering text in an input box).
Main task:
39k plain-text lines to search in (~500 characters per line, ~4 MB total size)
1k online users may be typing in the input box simultaneously
In some cases 2k-3k results can match a user's request
I'm worried about the following questions:
Database vs. text file?
Are there any general rules or best practices for my task aimed at decreasing DB/server memory load? (caching/indexing/etc.)
Are Sphinx/Solr appropriate for such a task?
Any links/advice will be extremely helpful.
Thanks
P.S. Maybe this is the best solution? PHP to search within a txt file and echo the whole line.
Put your data in a database (SQLite should do just fine, but you can also use a more heavy-duty RDBMS like MySQL or Postgres), and put an index on the column or columns that will be searched.
Only do the absolute minimum, which means that you should not use a framework, an ORM, etc. They will just slow down your code.
Create a PHP file, grab the search text and do a SELECT query using a native PHP driver, such as SQLite, MySQLi, PDO or similar.
Also, think about how the search box will work. You can prevent many requests if you e.g. put a minimum character limit (it does not make sense to search only for one or two characters), put a short delay between sending requests (so that you do not send requests that are never used), and so on.
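A bare-bones version of that endpoint might look like this (a sketch; the SQLite file, table, and column names are assumptions, and the minimum-length check mirrors the advice above):

```php
<?php
// Hedged sketch: minimal live-search endpoint, no framework, no ORM.
$q = isset($_GET['q']) ? trim($_GET['q']) : '';
if (strlen($q) < 3) {   // don't bother searching for one or two characters
    exit('[]');
}

$pdo = new PDO('sqlite:' . __DIR__ . '/lines.sqlite'); // assumed DB file
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Prefix match (no leading %) keeps the query index-friendly.
$stmt = $pdo->prepare('SELECT line FROM lines WHERE line LIKE ? LIMIT 20');
$stmt->execute([$q . '%']);

header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_COLUMN));
```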
Whether or not to use an extension such as Solr depends on your circumstances. If you have a lot of data, and a lot of requests, then maybe you should look into it. But if the problem can be solved using a simple solution then you should probably try it out before making it more complicated.
I have implemented 'live search' many times, always using AJAX to query the database (MySQL), and haven't had/observed any speed or large-load issues yet.
Anyway, I have seen implementations using Solr but cannot say whether they were quicker or consumed fewer resources.
It completely depends on the hardware the server will run on, IMO. As I wrote somewhere, I have seen a server with a very slow filesystem, so implementing live search by reading and parsing txt files (or using Solr) could be slower than querying the database. On the other hand, you can be hosted on poor shared webhosting with a slow DB connection (one that gets even slower with more concurrent connections), so that won't be the best solution either.
My suggestion: use MySQL with AJAX (look at this jQuery plugin or this article), set proper INDEXes on the searched columns, and if that turns out to be slow, you can still move to a txt file.
In the past, I have used Zend Search Lucene with great success.
It is a general-purpose text search engine written entirely in PHP 5. It manages the indexing of your sources and is quite fast (in my experience). It supports many query types, search fields, and search ranking.
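For reference, indexing and querying with it looks roughly like this (a sketch based on the Zend Framework 1 API; the index path and field names are examples):

```php
<?php
// Hedged sketch of Zend_Search_Lucene (ZF1); path and fields are examples.
require_once 'Zend/Search/Lucene.php';

// Build (or rebuild) the index when your sources change.
$index = Zend_Search_Lucene::create('/tmp/my_index');

$doc = new Zend_Search_Lucene_Document();
$doc->addField(Zend_Search_Lucene_Field::Text('title', 'Hotel Astoria'));
$doc->addField(Zend_Search_Lucene_Field::UnIndexed('id', '42'));
$index->addDocument($doc);
$index->commit();

// Query it at search time; hits come back ranked by score.
$index = Zend_Search_Lucene::open('/tmp/my_index');
foreach ($index->find('title:astoria') as $hit) {
    echo $hit->score, ' ', $hit->title, PHP_EOL;
}
```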