I have a MySQL table of postcodes with 200,000 records in it. I want to select just one field, post_code, and fetch the postcodes for auto-suggestion in a textbox. I will do this using jQuery, but I want to know which method I should use in my PHP code to select the records faster.
Without specifics it's hard to give an answer more specific than "Write a faster query".
I'm going to make a huge assumption here and assume you're working on a web application that's populating an autocomplete form.
You don't need all the post-codes! Not even close.
Defer populating the autocomplete until the user starts typing into the postcode field. When that happens, do an AJAX load to get the postcodes from the database.
Why is doing it this way better than just fetching all the post codes?
Because you now know what letter the user's post code starts with.
Knowing the first letter of the post code means you can eliminate any post code in your dataset that doesn't start with the same letter. You'll go from having to fetch 200,000 postcodes to fetching fewer than 20,000, improving performance by an order of magnitude. If you wait until the user has entered two letters, you can cut the amount of data to be fetched down even further.
Also, I'm assuming you're using an index on your postcode column, right?
I would think you simply want to create a MySQL index on "post_code"?
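A minimal sketch of that index, assuming the table is called postcodes:

```sql
-- Table name `postcodes` is an assumption; adjust to your schema.
ALTER TABLE postcodes ADD INDEX idx_post_code (post_code);
```

Note that a prefix search (LIKE '60%') can use this index, but a leading-wildcard search (LIKE '%60') cannot.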
Q: How many postal codes are you planning on fetching at a time (and passing back to the browser)? One? Five? 200,000?
You can use the Sphinx PHP API to deal with a big database, but you will first need to build an index file for your search before you can use it this way.
Or you can go for Zend_Search_Lucene.
You probably want to filter the records that match the current input of the user, so your query should be as follows:
select post_code from table_name where post_code like '<userinput>%' LIMIT 50 -- e.g. '60%'
Then send this result set back to your jQuery callback function. Remember to set a limit on the postcodes retrieved, as you don't want to send back too much data, which not only hampers performance but doesn't help the user in any direct way.
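A minimal sketch of the PHP side of that AJAX endpoint, using PDO with a prepared statement (the DSN, credentials and table/column names are assumptions):

```php
<?php
// suggest.php — returns up to 50 matching postcodes as JSON for the
// jQuery autocomplete. DSN, credentials and table/column names are assumptions.

// Escape LIKE wildcards in the user's input, then append % for a prefix match.
function like_prefix($input) {
    return addcslashes($input, '%_\\') . '%';
}

if (PHP_SAPI !== 'cli') {
    $pdo  = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8mb4', 'user', 'pass');
    $stmt = $pdo->prepare('SELECT post_code FROM postcodes WHERE post_code LIKE ? LIMIT 50');
    $stmt->execute([like_prefix(isset($_GET['term']) ? $_GET['term'] : '')]);

    header('Content-Type: application/json');
    echo json_encode($stmt->fetchAll(PDO::FETCH_COLUMN));
}
```

The prepared statement keeps the user's input out of the SQL itself, and the prefix pattern lets MySQL use an index on post_code.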
Related
I am trying to write a predictive search system for a website I am making.
The finished functionality will be a lot like this:
I am not sure of the best way to do this, but here is what I have so far:
Searches Table:
id - term - count
Every time a search is made it is inserted into the searches table.
When a user enters a character into the search input, the following occurs:
The page makes an AJAX request to a search PHP file
The PHP file connects to MySQL database and executes a query: SELECT * FROM searches WHERE term LIKE 'x%' AND count >= 10 ORDER BY count DESC LIMIT 10 (x = text in search input)
The 10 top results based on past search criteria are then listed on the page
This solution is far from perfect. If any random person searches for the same term 10 times, it will then show up as a recommended search (if somebody were to search for a term starting with the same characters). By this I mean: if somebody searched "poo poo" 10 times and then someone on the site searched for "po" looking for potatoes, they would see "poo poo" as a popular search. This is not cool.
A few ideas to get around this do come to my head. For example, I could limit each insert query into the searches table to the user's IP address. This doesn't fully solve the problem though, if the user has a dynamic IP address they could just restart their modem and perform the search 10 times on each IP address. Sure, the amount of times it has to be entered could remain a secret so it is a little more secure.
I suppose another solution would be to add a blacklist to remove words like "poo poo" from showing up.
My question is, is there a better way of doing this or am I moving along the right lines? I would like to write code that is going to allow this to scale up.
Thanks
You are on the right track.
What I would do:
Store every query uniquely. Add a table tracking each IP for each search term, and only update your count once per IP.
If a certain new/unique keyword gets upcounted more than X times in a given period of time, have your system mail you/your admin so you have the opportunity to blacklist the keyword manually. This has to be manual, because some hot topic might also show this behavior.
This is the most interesting one: once the query is complete, check the number of results. It is pointless to suggest keywords that give no results, so only suggest queries that will give at least X results. Queries like "poo poo" will give no results, so they won't show up in your suggestion list.
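The once-per-IP tracking from the first point can be sketched like this (table and column names are hypothetical):

```sql
-- Hypothetical schema: one row per (term, ip). The primary key makes
-- INSERT IGNORE a no-op when this IP has already searched this term.
CREATE TABLE search_ips (
    term VARCHAR(190) NOT NULL,
    ip   VARBINARY(16) NOT NULL,   -- fits IPv4 and IPv6 via INET6_ATON()
    PRIMARY KEY (term, ip)
);

INSERT IGNORE INTO search_ips (term, ip)
VALUES ('potatoes', INET6_ATON('203.0.113.7'));

-- In PHP, check the affected-row count: only if it is 1 do you run
-- UPDATE searches SET `count` = `count` + 1 WHERE term = 'potatoes';
```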
I hope this helps. Talk to me further in chat if you have questions :)
For example, you could add a new boolean column called validate and avoid using a blacklist. If validate is false, the query does not appear in the recommended list.
This field can be adjusted manually by an administrator (via a query or a backoffice tool). You could add another column called audit, which stores the timestamp of the query. If the difference between the maximum and minimum timestamps exceeds some value, the validate field could be false by default.
This solution is easy and fast for developing your idea.
Regards and good luck.
So I have jQuery DataTables set up and running fine.
My eventual goal, is to allow the user to use google places autocomplete, to update their location, and then to have a 'sortable' distance column added to my DataTables table, upon refresh.
The plan is to have a button next to the Google Autocomplete text input that says "use this address". Once the user searches for and finds an address using autocomplete and then pushes the button, I want to (probably) send them to a loading page for a few seconds whilst we:
1) read the text string and format it for use with the Google Geocoding API, i.e.
"Cranbourne, Victoria, Australia" ----> "Cranbourne+Victoria+Australia"
2) insert it into a geocode request
3) extract the lat and lon values from the resulting JSON, and assign them to vars
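Steps 1-3 above can be sketched in PHP like this ('YOUR_API_KEY' is a placeholder; error handling is omitted):

```php
<?php
// Sketch of steps 1-3. 'YOUR_API_KEY' is a placeholder; error handling omitted.

// Step 1: urlencode() takes care of the address formatting; the geocoder
// accepts percent-encoded spaces/commas just like '+'.
function geocode_url($address, $key) {
    return 'https://maps.googleapis.com/maps/api/geocode/json?address='
         . urlencode($address) . '&key=' . urlencode($key);
}

if (PHP_SAPI !== 'cli') {
    // Steps 2-3: send the geocode request and pull lat/lon out of the JSON.
    $url  = geocode_url($_GET['address'], 'YOUR_API_KEY');
    $json = json_decode(file_get_contents($url), true);
    $lat  = $json['results'][0]['geometry']['location']['lat'];
    $lon  = $json['results'][0]['geometry']['location']['lng'];
}
```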
Now comes the hard part. I was thinking of using some PHP to calculate the distance between the user's (lat, long) and every entry in the DataTables table, which was populated with SQL data using a foreign key (all SQL rows have lat/lon data).
Then I was thinking of sending those entries to an XML file, only this time every entry will have an extra value, "DISTANCE".
Of course, with some clever code this whole thing won't take a second. The new DataTables table is then populated from the XML, only this time it contains a distance column with distance values for every row, which of course will be sortable without any DataTables hacks whatsoever.
So am I stupid, or is this somewhat of a good idea, is there an easier way?
I would really like some answers to this thank you.
That actually could be difficult (depending on your math expertise), because of the fact that it requires using spherical geometry. You might consider using an API:
https://developers.google.com/maps/documentation/distancematrix/
If you want to try doing it yourself, you might want to use a class written for that, such as:
http://www.phpclasses.org/browse/file/36294.html
The second option is probably better, because Google has limits on its API. Once you have a way of calculating distance, it's a simple matter to query the database and create the XML. I don't know why you want to use XML instead of JSON, but the PHP function json_encode() might make it simpler, and it's supported by DataTables.
I know this is a really old question, but I just came across it and wanted to recommend using the Haversine formula to calculate the distances between lat/long values.
I've used this implementation and found it worked quite wonderfully.
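For reference, a self-contained PHP version of the Haversine formula, using only built-in trig functions (returns kilometres):

```php
<?php
// Haversine distance between two lat/long points, in kilometres.
function haversine($lat1, $lon1, $lat2, $lon2) {
    $r    = 6371; // mean Earth radius in km
    $dLat = deg2rad($lat2 - $lat1);
    $dLon = deg2rad($lon2 - $lon1);
    $a    = sin($dLat / 2) ** 2
          + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * sin($dLon / 2) ** 2;
    return 2 * $r * asin(sqrt($a));
}
```

If your lat/lon values live in MySQL anyway, the same formula can also be written directly into the SELECT, so the DISTANCE column is computed and sortable server-side.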
I'm programming a search engine for my website in PHP, SQL and jQuery. I have experience adding autocomplete based on existing data in the database (i.e. searching article titles). But what if I want to use the most common search queries that users type, similar to what Google has, without having enough users to contribute to the creation of that data (the most common queries)? Is there some kind of open-source SQL table with autocomplete data in it, or something similar?
For now, use the static data that you have for autocomplete.
Create another table in your database to store the actual user queries. The schema of the table can be <queryID, query, count>, where count is incremented each time the same query is submitted by another user (a kind of rank). N-gram-index the queries (so that you can also autocomplete something like "Manchester United" when a person types just "United", i.e. not only matches against the starting string), and simply return the top N after sorting by count.
The above table will gradually keep improving as your user base grows.
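A sketch of that table and its two core queries (names are hypothetical; shown with a simple prefix match rather than a full n-gram index):

```sql
-- Hypothetical table for logged user queries.
CREATE TABLE user_queries (
    queryID INT AUTO_INCREMENT PRIMARY KEY,
    query   VARCHAR(190) NOT NULL UNIQUE,
    `count` INT NOT NULL DEFAULT 1
);

-- Record a search: insert it, or bump the count if it already exists.
INSERT INTO user_queries (query) VALUES ('manchester united')
ON DUPLICATE KEY UPDATE `count` = `count` + 1;

-- Top-10 suggestions for the text typed so far (prefix match only;
-- an n-gram index would also match 'united' in the middle of a query).
SELECT query FROM user_queries
WHERE query LIKE 'united%'
ORDER BY `count` DESC
LIMIT 10;
```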
One more thing: the algorithm for accomplishing your task is pretty simple. The real challenge lies in returning the data to be displayed in a fraction of a second. So when your query database/store grows in size, you can use a search engine like Solr/Sphinx, which will be pretty fast at returning the results to be rendered.
You can use the Lucene search engine for this functionality. Refer to this link,
or you may also have a look at Lucene Solr Autocomplete...
Google has (and keeps adding) thousands of entries, arranged according to day, time, geolocation, language, etc., and the set grows with users' entries. Whenever a user types a word, the system checks the table of "mostly used words belonging to that location+day+time", and then (if there is no answer) "general words". So you should categorize every word entered by users, or make a general word-relation table in your database, where the most suitable search answer will be referenced.
Yesterday I stumbled on something that answered my question. Google draws autocomplete suggestions from this XML feed, so it is wise to use it if you have too few users to create your own database of keywords:
http://google.com/complete/search?q=[keyword]&output=toolbar
Just replacing [keyword] with some word will give suggestions for that word; then the task is just to parse the returned XML and format the output to suit your needs.
I have my search results being displayed just fine, but I have several categories. I want to be able to select a column and use ORDER BY. Here's what I have to try and do this:
<a href='searchresult.php?db=members&table=people&sql_query=SELECT+%2A+FROM+%60people%60+ORDER+BY+%60lastname%60.%60level%60++DESC+%60AGAINST+$search&token=5e1a18b6cccb5db7a37bb3fce055801a'>Last Name</a>
the $search is what I search for. and when I look at the link it shows what I searched for in its place. So I figured this would work, but of course it did not. What would be the right way to set this up for each one of my columns so when I click the link it will sort my results in the order of that column?
Thanks!
The first thing that should SCREAM at you is to NEVER run any SQL query that contains input that could possibly come from a client without first sanitizing it. Running an entire query from (potentially) user input will allow them to run a command like DROP DATABASE or TRUNCATE TABLE, or worse: they could get sensitive information out of it.
So you should hard-code your SQL query and only take user input for the specific values you are querying for. Note that escaping functions like mysql_real_escape_string() only protect quoted string values; a column name in ORDER BY cannot be quoted as a string, so whitelist it instead:
$allowed = array('lastname', 'firstname', 'level'); // sortable columns
$orderby = in_array($_GET['orderby'], $allowed, true) ? $_GET['orderby'] : 'lastname';
$query = "SELECT * FROM `people` ORDER BY `$orderby` DESC";
Now, on to your actual question...after the page has been loaded, you have two options:
You could re-query using AJAX.
You could sort the table using Javascript.
Which is better really depends on how many rows you are expecting.
If you are fetching a lot of rows, sorting via JavaScript starts getting pretty slow, but if it's just a few rows (say, fewer than 100), then JavaScript is probably the way to go. I don't know much about libraries like jQuery, so I'm not sure if they have a better solution. For a lot of rows, using AJAX to just re-query the database is probably faster.
However, if your page consists of just this table by itself, there's not really any point in using AJAX as opposed to just refreshing the page with a different query.
If you do decide to sort the table using Javascript, a library like this one might help. Or just google "javascript sort table".
Google unfortunately didn't seem to have the answers I wanted. I currently own a small search engine website for specific content using PHP GET.
I want to add a latest searches page, meaning to have each search recorded, saved, and then displayed on another page, with the "most searched" at the top, or even the "latest search" at the top.
In short: Store my latest searches in a MySQL database (or anything that'll work), and display them on a page afterwards.
I'm guessing this would best be accomplished with MySQL, and then I'd like to output it in to PHP.
Any help is greatly appreciated.
Recent searches could be abused easily. All somebody has to do is go onto your site and search for "your site sucks" (or worse) and they've essentially defaced your site. I'd think twice before adding that feature.
In terms of building the most popular searches and scaling it nicely I'd recommend:
Log queries somewhere. Could be a MySQL db table but a logfile would be more sensible as it's a log.
Run a script/job periodically to extract/group data from the log
Have that periodic script job populate some table with the most popular searches
I like this approach because:
A backend script does all of the hard work - there's no GROUP BY, etc made by user requests
You can introduce filtering or any other logic in the backend script and it doesn't affect user requests
You don't ever need to put big volumes of data into the database
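The grouping step of that periodic script can be sketched as a small PHP function (the log path and target table below are assumptions):

```php
<?php
// Periodic aggregation sketch: the log file has one search term per line.
// File and table names are assumptions; run this from cron, not per request.
function top_searches(array $lines, $n) {
    $counts = array_count_values(array_map('trim', $lines));
    arsort($counts);                          // highest count first
    return array_slice($counts, 0, $n, true); // keep term => count pairs
}

// Typical cron usage (hypothetical paths/tables):
// $top = top_searches(file('/var/log/site/searches.log'), 20);
// ...then write $top into a popular_searches table in one go.
```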
Create a database and a table (for example recent_searches) with fields such as query (the query searched) and timestamp (the Unix timestamp at which the query was made). Then your MySQL query will be something like:
SELECT * FROM `recent_searches` ORDER BY `timestamp` DESC LIMIT 0, 5
This should return the 5 most recent searches, with the most recent one appearing first.
Create a table (named something like latest_searches) with fields query, searched_count and results_count.
Then after each search (if results_count > 0), check whether this search query already exists in that table, and update the existing row or insert a new one.
And on some page you can just use data from this table.
It's pretty simple.
OK, your question is not yet clear, but I'm guessing that you mean you want to READ the latest results first.
To achieve this, follow these steps:
When storing the results, use an extra field to hold a DATETIME. Note that When (and Table) are reserved words in MySQL, so they have to be quoted with backticks. Your insert query will look like this:
INSERT INTO `Table` (SearchItem, `When`) VALUES ($strSearchItem, NOW())
When retrieving, make sure you include an ORDER BY, like this:
SELECT * FROM `Table` ORDER BY `When` DESC
I hope this is what you meant to do :)
You simply store the link and the name of the link/search in MySQL, and add a timestamp to record when somebody searched for them. Then you pull them out of the DB ordered by the timestamp and display them on the website with PHP.
Create a table with three columns: search, link, timestamp.
Then write a PHP script to insert rows when needed (this is done when the user actually searches)
Your main page where you want stuff to be displayed simply gets the data back out and puts them into a link container $nameOfWebsite
It's probably best to use a for/while loop to do step 3
You could additionally add something like a counter to track which searches are the most popular; this would be another field in MySQL that you keep updating (increasing it by one, limited per IP).
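The display loop from step 3 can be sketched like this, assuming a PDO connection and a searches table with search/link columns:

```php
<?php
// Build one escaped link; htmlspecialchars() guards against the defacement
// problem mentioned in another answer.
function search_link($url, $label) {
    return sprintf('<a href="%s">%s</a>', htmlspecialchars($url), htmlspecialchars($label));
}

if (PHP_SAPI !== 'cli') {
    // The DSN, credentials and table/column names are assumptions.
    $pdo  = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8mb4', 'user', 'pass');
    $rows = $pdo->query('SELECT search, link FROM searches ORDER BY timestamp DESC LIMIT 10');
    foreach ($rows as $row) {
        echo search_link($row['link'], $row['search']) . "<br>\n";
    }
}
```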