I use http://www.google.com/complete/search?output=toolbar&oe=utf8&hl=fr&q=test and I want to know what num_queries means for each keyword. Is it the number of queries per day, per month, or per year?
Does anyone know?
I verified that it's the total number of results returned for the search. You can see for yourself by plotting the autosuggest num_queries against the total number of results listed when you search Google with that term. You'll find an extremely linear relationship.
Related
I have a mysql db of clients and crawled a website retrieving all the reviews for the past few years. Now I am trying to match those reviews up with the clients so I can email them. The problem is that the review site allowed them to enter anything they wanted for the name, so in some cases I have full first name and last initial, and in some cases first initial and last full name. It also gives an approximate time it was posted such as "1 week ago", "6 months ago" and so on which we already have converted to an approximate date.
Now I need to try matching those up to the clients. Seems the best way would be to do a fuzzy search on the names, and then once I find all John B% I look for the one with a job completion date nearest the posting of the review naturally eliminating anything that was posted before jobs were completed.
I put together a small sample dataset where table1 is the clients, table2 is the review to match on here:
http://sqlfiddle.com/#!9/23928c/6/0
I was initially thinking of doing a date_diff, but then I need to sort by the lowest number. Before I tackle this on my own, I thought I would ask if anyone has any tricks they want to share.
I am using PHP / Laravel to query MySQL.
You can use DATEDIFF with absolute values:
ORDER BY ABS(DATEDIFF(`date`, $calculatedDate)) ASC
This puts the records closest to your estimated date first, whether they fall before or after it.
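To make the matching rule from the question concrete, here is a minimal Python sketch of "nearest completion date, ignoring jobs completed after the review". The field names (`name`, `completed`) and the sample rows are made up for illustration and are not taken from the fiddle; the same logic should port to a PHP loop or a SQL WHERE plus ORDER BY.

```python
from datetime import date

def best_client_match(review_date, candidates):
    """Pick the candidate whose job finished closest to, but not after, the review date."""
    # A review cannot refer to a job that was not finished yet.
    eligible = [c for c in candidates if c["completed"] <= review_date]
    if not eligible:
        return None
    # Smallest gap in days between job completion and review wins.
    return min(eligible, key=lambda c: (review_date - c["completed"]).days)

# Hypothetical clients that all matched the fuzzy name search "John B%".
clients = [
    {"name": "John Baker", "completed": date(2017, 1, 10)},
    {"name": "John Brown", "completed": date(2017, 3, 2)},
    {"name": "John Bell", "completed": date(2017, 4, 20)},
]

match = best_client_match(date(2017, 3, 15), clients)
print(match["name"])
```

Because the review dates are only approximate ("6 months ago"), you may want to allow some tolerance instead of a strict cut-off.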
I need a way to get the highest-scoring players within a salary limit, e.g. 50,000.
There is a similar question here: Algorithm to select Player with max points but with a given cost.
Basically I have to select the optimal 9-player line-up.
I googled a lot and found that this can be achieved using linear programming, but I don't know how to use LP in PHP.
Any idea how I can achieve this, or is there another way to do it?
If you store the information in arrays, I believe you could achieve the result using array_multisort, which gives a result similar to SQL ORDER BY. For example, order by points DESC, salary ASC. This returns the array with the top-scoring players first, and if any of them have the same number of points, the one with the lowest salary comes first.
The answer to this question shows how to use array_multisort.
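As a rough illustration of that ordering (points DESC, salary ASC), here is a small Python sketch with invented players. It only shows the sort; on its own it does not enforce the salary limit or guarantee an optimal 9-player line-up.

```python
# Hypothetical player data, for illustration only.
players = [
    {"name": "A", "points": 30, "salary": 9000},
    {"name": "B", "points": 42, "salary": 11000},
    {"name": "C", "points": 42, "salary": 9500},
    {"name": "D", "points": 25, "salary": 4000},
]

# Negating points sorts them descending; salary breaks ties in ascending order.
ranked = sorted(players, key=lambda p: (-p["points"], p["salary"]))
names = [p["name"] for p in ranked]
print(names)
```

B and C tie on points, so C comes first because of the lower salary.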
I want to search for a category of places from the supported types, for example getting the total number of museums in the United States.
After reading the docs, I found that this search should be done like this:
https://maps.googleapis.com/maps/api/place/nearbysearch/output?parameters
But that returns the entire JSON.
I tried something like this:
https://maps.googleapis.com/maps/api/place/details/json?placeid=ChIJN1t_tDeuEmsRUsoyG83frY4&key=MYAPI
But how can I get only the total number of museums in the United States?
There is no way to query the Places API for the total number of places that fall within a given radius in the Google database. When you query the Places API, the maximum number of results that will be returned is 20 (twenty) as described in this question/answers: What is the proper way to use the radius parameter in the Google Places API?.
In an app I'm building, one of the features I'd like is for users to be able to discover people around them easily. I'm using the GPS to get the latitude and longitude, then storing that information in the MySQL db along with the other user information, in one column for latitude and another for longitude. What would the best practice be for a query that takes the location of the person making the query, finds the closest latitude and longitude, starts there and limits it to 24 results? Furthermore, how would I remember where it stopped, so that another query can start where it left off and return more results, getting further and further away?
Basically, to generalize: how can I do a MySQL query that starts as close as it can to the latitude and longitude I supply and then grabs the next 24 rows sorted from closest to furthest?
I feel like it's going to be hard to do because it's based on 2 columns. Is there a way I should/could combine the GPS values into 1 column so it will be easy to find relative distance?
Maybe I could somehow get the zip code (but then that might cause non US problems). I'm not sure. I'm stuck.
Just search for "Haversine Formula" here on Stackoverflow and you will find several related questions.
As @cdonner mentioned, there are a number of resources for the Haversine formula, which you use to turn pairs of latitude and longitude into a distance. You pass in a distance limit in whatever unit your formula is set up for (usually miles) and run your query starting at the smallest radius. Using a PHP loop, you can simply increase the distance and re-run the query until you get the desired number of results. Also check out the Google link from @Maleck13, which is very helpful.
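For reference, here is a minimal Python sketch of the Haversine formula plus "closest first, 24 at a time" paging. Note that it pages with an offset over the sorted list rather than using the widening-radius loop described above; in MySQL that would roughly correspond to ORDER BY distance with LIMIT and OFFSET. The Earth radius constant, the `nearest_page` helper and the sample users are assumptions for illustration.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_MILES = 3959.0  # mean radius; an approximation

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def nearest_page(origin, users, page=0, page_size=24):
    """Return one page of users sorted from closest to furthest (hypothetical helper)."""
    ranked = sorted(
        users,
        key=lambda u: haversine_miles(origin[0], origin[1], u["lat"], u["lon"]),
    )
    start = page * page_size
    return ranked[start:start + page_size]

# Made-up users; page_size is 2 here only to keep the example short.
users = [
    {"id": 1, "lat": 40.0, "lon": -75.0},
    {"id": 2, "lat": 40.5, "lon": -75.0},
    {"id": 3, "lat": 40.1, "lon": -75.0},
]
origin = (40.0, -75.0)
first = [u["id"] for u in nearest_page(origin, users, page=0, page_size=2)]
second = [u["id"] for u in nearest_page(origin, users, page=1, page_size=2)]
print(first, second)
```

"Remembering where it stopped" then just means keeping the page number (or offset) on the client and sending it with the next request.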
I have a system which will return all users from the database and order the results by lowest distance from a reference zip code.
For example: User will come on the site, enter zip code and it will return him all other users who are nearest to his zip (ascending order)
How am I doing this now, and why is it a problem?
The system contains more than 30 million users and their zip codes. I am retrieving all the users in a particular state and city (which narrows the dataset down to about 10,000).
This is where the problem actually happens. All 10,000 rows that MySQL sends to PHP are passed to a zip code calculator library, which calculates the distance between the base zip code and each user's zip code, 10,000 times. The results are then ordered by nearest zip code.
As you can see, this is very badly optimized code, and the 10,000 records are looped through twice. Not to mention the amount of RAM each httpd process takes just transferring data to and from MySQL.
What I would like to ask the gurus here is whether there is any way to optimize this.
I have a few ideas of my own, but i'm not sure how efficient they are.
Try to do all the zip code calculation and ordering in MySQL itself and return only the paginated rows.
For this, I will need to move the zip code distance calculation logic to a stored procedure. This way I avoid processing 10,000 records in PHP. However, there is still a problem: I should not need to recalculate the distance for zip codes that have already been calculated (for 2 users having the same zip code).
Secondly, how do I order rows in MySQL using a stored procedure?
What do you guys think? Is this a good way? Can I expect a performance boost using this?
Do you have any other suggestions ?
I know this question is huge, and i really appreciate the time you have taken to read till the end. I would really like to hear your thoughts about this.
As I'm not overly familiar with PHP or MySQL, I can only give some basic tips but they should help. This also assumes you have no direct way of interfacing with the zip library from MySQL.
First, as it's doubtful that you have 10k zip codes in a city, take your existing query and do something like
SELECT DISTINCT ZipCode FROM Users WHERE ...
This will probably return a few dozen zip codes max, and no duplicates. Run this through your zip code library. That library itself is probably a source of slowness, as it has to look up the zip codes, and do a bunch of fancy trig to get actual distance. Take the results of this, and insert it into a temp table with just the zip code and the distance.
Once done with that list, have another query that gets the rest of the user data you want, and JOIN it to the temp table on zip code to get your distance.
This should give you quite a large speedup. You can do whatever paging you need in the second query after the results have been calculated. And no more looping through 10k rows.
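Here is a small Python sketch of that idea: compute the distance once per distinct zip code, then look it up for every user. A dictionary plays the role of the temp table, and `fake_distance` is only a stand-in for the real zip code library (it is not a geographic distance); all names and rows are hypothetical.

```python
def attach_distances(users, base_zip, distance_fn):
    """Compute distance once per distinct zip, then 'join' it back onto the users."""
    distinct_zips = {u["zip"] for u in users}  # like SELECT DISTINCT ZipCode
    distance_by_zip = {z: distance_fn(base_zip, z) for z in distinct_zips}
    enriched = [dict(u, distance=distance_by_zip[u["zip"]]) for u in users]
    return sorted(enriched, key=lambda u: u["distance"])

calls = []

def fake_distance(zip_a, zip_b):
    # Stand-in for the zip code library; records each call so we can count them.
    calls.append(zip_b)
    return abs(int(zip_a) - int(zip_b))

users = [
    {"name": "ann", "zip": "10003"},
    {"name": "bob", "zip": "10001"},
    {"name": "cat", "zip": "10003"},
    {"name": "dan", "zip": "10010"},
]
result = attach_distances(users, "10001", fake_distance)
print([u["name"] for u in result], len(calls))
```

Four users share three zip codes, so the expensive function runs three times instead of four; with 10k users in a few dozen zip codes the saving is much larger.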
I suggest that you narrow the latitude and longitude ranges before you compute the accurate distance for filtering and sorting purposes.
What I mean is if you do a full table scan and compute distances for all zip codes in the database relative to your reference point, it will be very slow.
Instead, filter zip codes by proximity. For example, if your reference point is at latitude 10 and longitude 20, first compute the maximum angular range for the proximity you want. Let's say you want a proximity range of 10 miles. That may translate into 0.15 degrees. So you first filter your zip codes to latitude between 10-0.15 and 10+0.15 and longitude between 20-0.15 and 20+0.15.
Only after that do you include the accurate distance clause in your SQL query condition. That will be much faster because you no longer do a full scan, and you can use range indexes on the longitude and latitude fields.
To translate miles into degrees and find the narrow range, keep in mind that the Earth has a circumference of approximately 25,000 miles. Dividing 25,000 by 360 degrees gives about 70 miles per degree. If you want a range of 10 miles, your range in degrees will be at most 0.15 degrees. This figure applies to latitude; a degree of longitude covers fewer miles away from the equator (roughly 70 miles times cos(latitude)), so divide the longitude range by cos(latitude) to keep the box wide enough.
Keep in mind that these calculations are not accurate (the Earth is not a perfect sphere), but that is not important. What is important is that you find a degree range value that is higher than the really accurate value.
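A minimal Python sketch of that pre-filter, assuming about 69 miles per degree (slightly under 25,000 / 360, so the box errs on the wide side) and widening the longitude range by 1 / cos(latitude). The function name and constant are illustrative, and it is not meant for points very close to the poles.

```python
from math import cos, radians

MILES_PER_DEGREE = 69.0  # assumption: a bit below 25000 / 360 so the box is never too small

def bounding_box(lat, lon, radius_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the given radius."""
    dlat = radius_miles / MILES_PER_DEGREE
    # A degree of longitude shrinks with cos(latitude), so the range must grow.
    dlon = radius_miles / (MILES_PER_DEGREE * cos(radians(lat)))
    return (lat - dlat, lat + dlat, lon - dlon, lon + dlon)

min_lat, max_lat, min_lon, max_lon = bounding_box(10.0, 20.0, 10.0)
print(min_lat, max_lat, min_lon, max_lon)
```

The four values would go into a `latitude BETWEEN ... AND ...` and `longitude BETWEEN ... AND ...` condition, with the accurate distance formula applied only to the rows that survive.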
If you can get the latitude and longitude for all zip codes into MySQL, or have an easy way of fetching the lat/long for your base zip code and feeding it into your MySQL query, then you can order your 10k users by distance inside MySQL. There is a very similar question and answer here which gives you the correct math for the distance function. You may also want to investigate the MySQL spatial extensions, which would let you insert and index your lat/longs as 2D POINT data.