inRandomOrder() query with AJAX returns the same elements - PHP

I have a query in Laravel:
$query = Work::inRandomOrder()->where('orientation', $request->get('orientation'))->paginate(11);
I also have a "load more" button that loads 11 more works with each click.
I would like to display my data in random order, but with this query several items come back more than once: in my list of works the same entry shows up 2 or 3 times, which is obviously not good. Is there a way with inRandomOrder() to avoid these duplicates?
Thank you

I believe the data is randomized each time the script is loaded, including each time you dynamically load 11 more records. One way to get a stable order is to randomize once, store the data in a session variable for the user, and then fetch the next elements from it. However, this might be very heavy.
I would instead go with some easily predictable mathematics, which will still look 'random' to the end user. For example, you can take every N-th entry of the query, i.e. the 1st, 5th, 9th, 13th ... if N=4, and store just the number N in the user's session.
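In newer Laravel versions there is also a built-in way to keep the "random" order stable across the "load more" requests: inRandomOrder() accepts a seed, which MySQL passes to RAND(seed). A minimal sketch, assuming the seed is kept in the session (the session key name is just an example):
// Pick a seed once per visitor and reuse it on every request,
// so ORDER BY RAND(seed) returns the rows in the same order each time.
$seed = $request->session()->get('works_seed');
if ($seed === null) {
    $seed = mt_rand();
    $request->session()->put('works_seed', $seed);
}
$query = Work::inRandomOrder($seed)
    ->where('orientation', $request->get('orientation'))
    ->paginate(11);
The paginator then pages through one fixed shuffle, so nothing is repeated between clicks.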

Related

MySQL Performance for Online Games Highscore Lists

I have a question about making "Highscore-Lists".
Let's say I have an online game with 1,000,000 active users. Each user has points from 0 to X. Now I want to show a ranking list. It would be insane to show all million entries on one page, so it is divided into Y pages (100 entries per page => 10,000 pages).
I am not really sure how to solve it.
1. The easiest way to do this would be to load all 1m entries in one SELECT, take the result, find the current user with a for loop and show that specific page (but the other 999,900 entries would sit in RAM even though they are never shown). For a page change I could just reuse the result data with no second database call. (So I don't care about point changes during that time.)
SELECT UserName, UserID, Points FROM UserAccount ORDER BY Points;
2. My second idea was to load each page individually, but then I do not know
2.1 whether that actually gives better performance
2.2 how to get the right starting page, because I only have the user's points but not his actual rank
So how could I solve this problem? I don't really know what MySQL can handle. Are many small calls better than one huge call?
Can I even hold on to such a huge result set?
The second solution would pick up all changed points on each page change, but I care more about performance than an always up-to-date list.
Thank you for your help!
Markus
Use pagination. In SQL it's a "limit" clause:
SELECT UserName, UserID, Points FROM UserAccount ORDER BY Points LIMIT 0, 20;
The above query will return only the first 20 rows of the original selection.
You can pass the page parameter via GET, like this: highscore.php?page=1 or ?page=2, and so on.
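A minimal sketch of wiring that page parameter to the LIMIT clause ($pdo is assumed to be an existing PDO connection; the table follows the question):
$perPage = 100;
$page    = max(1, isset($_GET['page']) ? (int) $_GET['page'] : 1);
$offset  = ($page - 1) * $perPage;
// $offset and $perPage are already integers, so they are safe to interpolate.
$sql = sprintf(
    'SELECT UserName, UserID, Points FROM UserAccount ORDER BY Points DESC LIMIT %d, %d',
    $offset,
    $perPage
);
$rows = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
To land on the page containing the current user (question 2.2), counting how many users have more points gives the user's rank, and dividing that rank by the page size gives the page number.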

Yii DataProvider remove rows

I am in the position where I need to remove records from a data provider. Long story short, I have encrypted data in my DB that is decrypted using the afterFind() method. I need to keep only rows where the decrypted string is 20 characters long and show the result in a CGridView.
I have tried the visible option, but this hides the data on a per-column basis. I have tried rowCssClassExpression, which successfully hides the rows on screen using display:none; however, it still shows "1 of 1200 results" even though only 10 match, and the pagination is off: page 1 has one result, page 2 none, page 3 two results, and so on.
Currently I am able to get this working by doing a CDbCommand queryAll() and then looping through and checking each record like so:
foreach ($data as $key => $d)
{
    // afterFind() has already decrypted the value; checkLength reports whether it is 20 characters
    $lengthCheck = Data::model()->findByPk($d['id'])->checkLength;
    if ($lengthCheck !== true)
    {
        unset($data[$key]);
    }
}
// etc.
I can then use the resulting array in a CArrayDataProvider, so effectively I can get the information I need with this method; however, it is taking over 2 seconds per record, which works out to over 3 minutes for 100 records.
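For reference, the CArrayDataProvider step mentioned above looks roughly like this (a minimal sketch; the key field and page size are assumptions):
// Wrap the already-filtered array so CGridView can paginate it.
$dataProvider = new CArrayDataProvider($data, array(
    'keyField'   => 'id',
    'pagination' => array('pageSize' => 10),
));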
Does anyone have an idea of how I could do this in a smarter/faster way?

Save additional information to MYSQL Database and use a simple query, or use complex query?

I have a Drupal site, and am trying to use PHP to grab some data from my database. What I need to do is display, in a user's profile, how many times they were the first person to review a venue (exactly like Yelp's "First" tally). I'm looking at two options and trying to decide which is the better approach.
First Option: The first time a venue is reviewed, save the value of the reviewer's user ID into a table in the database. This table will be dedicated to storing the UID of the first user to review each venue. Then, use a simple query to display a count in the user's profile of the number of times their UID appears in this table.
Second Option: Use a set of several more complex queries to display the count in the user's profile, without storing any extra data in the database. This will rely on several queries which will have to do something along the lines of:
Find the ID for each review the user has created
Check the ID of the venue contained in each review
Find the first review for each venue, based on the venue ID stored in the review
Get the User ID of the author for the first review
Check which, if any, of these Author UIDs match the current user's UID
I'm assuming that this would involve creating an array of the IDs in step one, and then somehow executing each step for each item in the array. There would also be 3 or 4 different tables involved in the query.
I'm relatively new to writing SQL queries, so I'm wondering if it would be better to perform the set of potentially longer queries, or to take the small database hit and use a much much smaller count query instead. Is there any way to compare the advantages of either, or is it like comparing apples and oranges?
The volume of extra data stored will be negligible; the simplification of the processing will be significant. The data won't change (the first person to review a venue won't change), so there is a negligible update burden. Go with the extra data and the simpler query.
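A minimal sketch of the first option's lookup, assuming a hypothetical dedicated table venue_first_review(venue_id, uid) and Drupal 7's db_query():
// venue_first_review is written once per venue, the first time that venue is reviewed.
// Profile page: how many times was this user the first reviewer?
// $account is the profile's user object.
$firsts = db_query(
  'SELECT COUNT(*) FROM {venue_first_review} WHERE uid = :uid',
  array(':uid' => $account->uid)
)->fetchField();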

Tracking a total count of items over a series of paged results

What is the ideal way to keep track of the total count of items when dealing with paged results?
This seems like a simple question at first but it is slightly more complicated (to me... just bail now if you find this too stupid for words) when I actually start thinking about how to do it efficiently.
I need to get a count of items from the database. This is simple enough. I can then store this count in some variable (a $_SESSION variable, for instance). I can check to see if this variable is set and, if it isn't, get the count again. The tricky part is deciding the best way to determine when I need to get a new count. It seems I would need a new count if I have added/deleted items, or if I am reloading or revisiting the grid.
So, how would I decide when to clear this $_SESSION variable? I can see clearing it and getting a new count after an update/delete (or even adding or subtracting to it to avoid the potentially expensive database hit) but (here comes the part I find tricky) what about when someone navigates away from the page or waits a variable amount of time before going to the next page of results or reloads the page?
Since we may be dealing with tens or hundreds of thousands of results, getting a count of them from the database could be quite expensive (right? Or is my assumption incorrect?). Since I need the total count to handle the total number of pages in the paged results... what's the most efficient way to handle this sort of situation and to persist it for... as long as might be needed?
BTW, I would get the count with an SQL query like:
SELECT COUNT(id) FROM foo;
I never use a session variable to store the total found by a query. I ask for the count as part of the regular query and then retrieve the count itself with a second query:
// first query
SELECT SQL_CALC_FOUND_ROWS * FROM table LIMIT 0, 20;
// I don't actually use * but just select the columns I need...
// second query
SELECT FOUND_ROWS();
I've never noticed any performance degradation because of the second query, but I guess you will have to measure that if you want to be sure.
By the way, I use this in PDO; I haven't tried it in plain MySQL.
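A rough PDO sketch of that pattern ($pdo is assumed to be an existing connection; the column names are illustrative):
// First query: fetch one page and ask MySQL to also compute the full count.
$rows = $pdo->query('SELECT SQL_CALC_FOUND_ROWS id, name FROM foo LIMIT 0, 20')
            ->fetchAll(PDO::FETCH_ASSOC);
// Second query: read back the total the first query would have returned without LIMIT.
$total = (int) $pdo->query('SELECT FOUND_ROWS()')->fetchColumn();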
Why store it in a session variable? Will the result change per user? I'd rather store it in a user cache like APC or memcached, choose the cache key wisely, and then clear it when inserting or deleting a record related to the query.
A good way to do this would be to use an ORM that does it for you, like Doctrine, which has a result cache.
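A minimal sketch of the memcached idea above (the cache key, server, and expiry are arbitrary examples; $pdo is an assumed PDO connection):
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);
$total = $cache->get('foo_total_count');
if ($total === false) {
    // Cache miss: ask the database once and keep the answer for a while.
    $total = (int) $pdo->query('SELECT COUNT(id) FROM foo')->fetchColumn();
    $cache->set('foo_total_count', $total, 300);
}
// In the code that inserts or deletes rows in foo:
// $cache->delete('foo_total_count');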
To get the count, I know that using COUNT(*) is not worse than using COUNT(id). (question: Is it even better?)
EDIT: interesting article about this on the MySQL performance blog
Most likely foo has a PRIMARY KEY index defined on the id column. Indexed COUNT() queries are usually quite easy on the DB.
However, if you want to go the extra mile, another option would be to put a special hook into the code that inserts and deletes rows in foo. Have it write the total number of records into a protected file after each insert/delete, and read the count from there. As long as every successful insert/delete is accounted for, the number in the protected file is always up to date.
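A rough sketch of that hook, assuming a simple counter file outside the web root (the path and function names are made up for illustration):
define('FOO_COUNT_FILE', '/var/data/foo_count.txt');
// Call this from every code path that inserts or deletes rows in foo.
function refresh_foo_count(PDO $pdo)
{
    $total = (int) $pdo->query('SELECT COUNT(id) FROM foo')->fetchColumn();
    file_put_contents(FOO_COUNT_FILE, (string) $total, LOCK_EX);
}
// The paginator reads the file instead of hitting the database.
function cached_foo_count()
{
    return is_file(FOO_COUNT_FILE) ? (int) file_get_contents(FOO_COUNT_FILE) : 0;
}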

Database structure for saving search results

I currently work for a social networking website.
My boss recently had the idea to show search results in random order instead of the normal order (registration date). The problem with that is simple and obvious: if you go from one page to another, you get different results each time, because the list is re-randomized on every page load.
My idea was to store the results in the database plus a cookie, something like this:
Cookie containing a serialized version of the $_POST request (needed if we want to do a re-sort)
A table which would provide the search ID => searches (id, user_id, creation_date)
A table which would store the results and their order => searches_results (search_id, order, user_id)
The flow would look something like this:
After each search I store the "where" clause in a cookie or session
Then I erase the previous search in "searches"
Then I delete previous results in "searches_results"
Then I insert a row into "searches" for the key
Then I insert each user row into "searches_results"
And finally I redirect the user to something like ?search_id=[search_key]
There is a big flaw here: performance... it is definitely possible to bring the system down, or at least make it very slow.
Any idea what would be the best way to structure this?
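For reference, the two tables described in the question could look roughly like this (names are taken from the question; the column types are assumptions, and `order` needs quoting because it is a reserved word in MySQL):
CREATE TABLE searches (
    id            INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    user_id       INT UNSIGNED NOT NULL,
    creation_date DATETIME     NOT NULL
);
CREATE TABLE searches_results (
    search_id INT UNSIGNED NOT NULL,
    `order`   INT UNSIGNED NOT NULL,
    user_id   INT UNSIGNED NOT NULL,
    PRIMARY KEY (search_id, `order`)
);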
What if instead of ordering randomly, you ordered by some function where the order is known and repeatable, just non-obvious? You could seed such a function with some data from the search query to make it be even less obvious that it repeats. This way, you can page back and forth through your results and always get what you expect. Music players use this sort of function for their shuffle feature (so that if you click back, you get the previous song, and if you click next again, you're back where you started). I'm sure you can divine some function to accomplish this... bitwise XORing ID values with some constant (from the query) and then sorting by the resulting number might be sufficient. I chose XOR arbitrarily because it's a trivially simple function that will get you repeatable and non-obvious results.
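A rough sketch of that seeded-XOR ordering ($pdo is an assumed PDO connection; the table, columns, and the way the seed is derived are all illustrative):
// Derive a seed once per search and reuse it on every page,
// so the 'shuffled' order is stable and paging never repeats rows.
$seed   = crc32(serialize($_POST)) & 0x7FFFFFFF;
$page   = max(1, isset($_GET['page']) ? (int) $_GET['page'] : 1);
$offset = ($page - 1) * 20;
$sql  = sprintf('SELECT id, username FROM users ORDER BY (id ^ %d) LIMIT %d, 20', $seed, $offset);
$rows = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);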
Hmm, maybe, but doesn't the XOR operator just tell you whether it is an exclusive OR? I mean, there is no mathematical operation here, as far as I know.
Sorry, I know this doesn't help, but I don't understand why your boss would want this?
I know that if I search for a person on a social network, then I want the results to be ordered by relevance and relevance only. I think that randomized results would just be frustrating for the user, but maybe that's just me.
For example, if I search for "John Smith", then the first batch of results had better be people named "John Smith". Then show me similar names near the end of the results. I don't want to search for "John Smith" and get "Jon Smithers" as my second result.
Well, I'm with Matt in asking "Why?"
I think rmeador has a good suggestion as well. You could sort by a different field or some non-obvious algorithm each time, just using the permutations of DESC/ASC on last-updated or some other result field.
Another option would be to do the full search only the first time and return just the matching IDs, store that ID list in the database, and make each subsequent page a lookup against those IDs.
My two cents.
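A rough sketch of the ID-list idea from the answer above ($pdo and the session are assumed; table and column names are illustrative):
// First request: run the expensive randomized search once, keep only the IDs.
$_SESSION['search_ids'] = $pdo->query('SELECT id FROM users ORDER BY RAND()')
                               ->fetchAll(PDO::FETCH_COLUMN);
// Subsequent pages: slice the stored list and fetch only that page's rows.
$page    = max(1, isset($_GET['page']) ? (int) $_GET['page'] : 1);
$pageIds = array_slice($_SESSION['search_ids'], ($page - 1) * 20, 20);
if ($pageIds) {
    $in   = implode(',', array_map('intval', $pageIds));
    // FIELD() keeps the rows in the stored order.
    $rows = $pdo->query("SELECT * FROM users WHERE id IN ($in) ORDER BY FIELD(id, $in)")
                ->fetchAll(PDO::FETCH_ASSOC);
}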
I can see a scenario where a randomized result set is useful but not for searching but for browsing profiles or artists or local events. It offers more exposure to those that wouldn't show up in a traditionally directed search.
