How can I store and retrieve a data structure like this in Redis in the most efficient way? The record will be accessed by a username value, which is unique, and we need to sort the data by date and counter values.
Also, is Redis the best choice for this scenario? (Performance takes precedence over everything here.)
Sorted sets sort their members automatically by a score, which is a double-precision floating-point number (passed to Redis as a string). You can also use the SORT command, which does on-the-fly sorting for lists and sets.
Going back to sorted sets, you can sort by date by using Unix timestamps as scores:
ZADD uniqueName 1373365448 stringData1 1376252036 stringData2
Or you can sort by counter:
ZADD uniqueName 100 stringData1 101 stringData2
That's one way to do it.
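To tie this back to the question: a minimal sketch of per-user storage and retrieval, assuming one sorted set per username for each ordering (the key names and members here are invented for illustration):

ZADD user:alice:by_date 1373365448 record1 1376252036 record2
ZADD user:alice:by_counter 100 record1 101 record2
ZRANGEBYSCORE user:alice:by_date -inf +inf
ZREVRANGEBYSCORE user:alice:by_counter +inf -inf

The first read returns the user's records in date order, the second by counter, highest first. Because the username is unique, it works well as part of the key name.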
I am having trouble sorting the job numbers in my project.
My client's requirement is to store the job numbers in the following format:
current year-1, current year-2, current year-3, and so on...
For example:
The current year is 2021, so the job numbers will look like this:
21-1, 21-2, 21-3, 21-4, and so on...
When the year changes, the numbering should start again from 1 for the next year.
For example:
22-1, 22-2, 22-3, 22-4, and so on...
I have stored the job numbers in this format successfully, but I am unable to sort them in the way the client requires.
Currently the data sorts in this way:
21-1, 21-10, 21-100 to 21-109, 21-11 to 21-19, 21-2 and so on...
but the actual sort should be like this:
21-1, 21-2, 21-3, 21-4...21-10, 21-11 to 21-99, 21-100 to 21-199 and so on...
And if the year changes, then:
22-1, 22-2, 22-3, 22-4...22-10, 22-11 to 22-99, 22-100 to 22-199 and so on...
I hope I have explained my problem clearly. Please help me sort the job numbers.
I assume you need the sorting to occur in your database because you're paging or otherwise not holding all results in application memory. In that case, use the following SQL ORDER BY clause:
select my_column
from my_table
order by left(my_column, 2),
         len(my_column),
         right(my_column, len(my_column) - 2)
Explanation:
left(my_column, 2) sorts on the first two characters, so years are grouped together.
len(my_column) groups the sequence numbers by magnitude (i.e., xx-100 appears after xx-2 because it is longer).
right(my_column, len(my_column) - 2) sorts the sequence numbers themselves; within each length group the string comparison matches numeric order.
Hint: This assumes your year-code is always exactly two digits. I could have found the index of the dash instead, but that feels even more presumptive.
If you require an application-side (PHP) solution, you can use natsort. From W3Schools:
Definition and Usage
The natsort() function sorts an array by using a "natural order" algorithm. The values keep their original keys.
In a natural-order algorithm, the number 2 is less than the number 10. In computer (byte-by-byte) sorting, "10" is less than "2", because the first character of "10" is less than "2".
Syntax
natsort(array)
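Applied to job numbers like the ones above, a quick sketch (the sample data is invented):

$jobs = ['21-1', '21-10', '21-100', '21-11', '21-2', '22-1'];
natsort($jobs);
print_r(array_values($jobs));
// Prints: 21-1, 21-2, 21-10, 21-11, 21-100, 22-1

natsort() keeps the original keys, so array_values() is used here just to reindex the result.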
21-1, 21-10, 21-100, 1, 10, 100
Suppose these are your IDs from the database. First, declare an empty array, then loop through the records. Push every record into the array with two new keys, array["date_serial"] and array["serial"]. Then sort the new array by the "serial" key, in ascending or descending order, and display the result. A sketch of this idea follows.
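A minimal sketch of that split-and-sort idea, assuming the IDs come in as plain strings (the input data here is invented):

$ids = ['21-1', '21-10', '21-100', '21-2', '22-3', '21-11'];
$jobs = [];
foreach ($ids as $id) {
    [$year, $serial] = explode('-', $id, 2);
    $jobs[] = [
        'id'          => $id,
        'date_serial' => (int) $year,   // year part, e.g. 21
        'serial'      => (int) $serial, // sequence part, e.g. 100
    ];
}
// Sort by year first, then by sequence number, both numerically.
usort($jobs, fn($a, $b) =>
    [$a['date_serial'], $a['serial']] <=> [$b['date_serial'], $b['serial']]);
foreach ($jobs as $job) {
    echo $job['id'], PHP_EOL; // 21-1, 21-2, 21-10, 21-11, 21-100, 22-3
}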
I am developing an event organization website. When a user registers for an event, they are given a unique random number (10 digits), which we use to generate a barcode and mail to them. Now,
I want to make the number unique for each registered event.
And it should also auto-increment.
One solution is to grab all the existing numbers into an array, generate candidate auto-increment numbers in Laravel (of the form 0000000001 to 9999999999), and loop through, checking each candidate against the array. Take the first value that doesn't equal any value in the array and add it to the database.
But I am thinking that there might be a better solution to this. Any suggestion?
Select the maximum number stored in your DB and add 1 to it, like:
SELECT (MAX(Column_Name)+1) AS Max_val FROM Table_Name;
I suggest a simple timestamp-based solution using the Carbon class to produce a unique number. It's fairly simple to generate a basic unique stamp from the current timestamp.
You can use it as given below:
use Carbon\Carbon;
$current_timestamp = Carbon::now()->timestamp; // Produces something like this 1552296328
You can use it as a unique identifier. If you want the next numbers, just add 1. But keep in mind that you then have to manage the next batch of numbers in a timely manner (i.e., if you have generated 500 numbers by incrementing, you should not generate another number for the next 500 seconds; otherwise it will repeat a number).
A solution with the rand() function may not work here, because it can reproduce a number that already exists in the database and you will get a unique-constraint violation (i.e., if the column is unique in the DB).
No matter what approach you use, the result will never be truly random; it will come from a PRNG. For your case, I think auto-increment with zero-fill should be enough.
But if you are set on using a random number, then PHP's rand() function should be enough. Ten digits means 10,000,000,000 possible numbers; unless your project has millions of events, it should realistically be no problem. You can also check, after generating any random number, whether that number is already present (there is something like a 0.000001% chance), and if it is, generate a new random number and try again.
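A minimal sketch of that generate-and-check loop, assuming a PDO connection and a registrations table with a unique barcode_number column (both names are invented):

// Hedged sketch: table and column names are invented for illustration.
function generateUniqueNumber(PDO $pdo): string {
    do {
        // random_int() is less predictable than rand()
        $candidate = str_pad((string) random_int(0, 9999999999), 10, '0', STR_PAD_LEFT);
        $stmt = $pdo->prepare('SELECT 1 FROM registrations WHERE barcode_number = ?');
        $stmt->execute([$candidate]);
    } while ($stmt->fetchColumn() !== false); // retry while the number already exists
    return $candidate;
}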
But if your project gets very successful (i.e., millions of events), then exhaustion problems similar to Y2K might creep up.
A MySQL UUID would give you something truly unique: Store UUID v4 in MySQL.
You don't need to worry about auto-incrementing.
Imagine you were to search an array of N elements, performing Y searches on the array values to find the corresponding keys; you can either do Y array_search() calls, or do one array_flip() and Y direct lookups. Why is the first method a lot slower than the second? Is there a scenario where the first method becomes faster than the second?
You can assume that keys and values are unique
Array keys are hashed, so looking one up just requires calling the hash function and indexing into the hash table. array_flip() is O(N) and looking up an array key is O(1), so Y lookups after a flip cost O(N) + O(Y) in total.
Array values are not hashed, so searching them requires a linear search. This is O(N), so Y searches are O(N*Y).
Assuming values being searched for are evenly distributed through the array, the average case of linear search has to compare N/2 elements. So array_flip() should take about the time of 2 array_search() calls, since it has to examine N elements.
There's some extra overhead in creating the hash table. However, PHP uses copy-on-write, so it doesn't have to copy the keys or values during array_flip(), so it's not too bad. For a small number of lookups, the first method may be faster. You'd have to benchmark it to find the break-even point.
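A rough micro-benchmark sketch of that break-even comparison (the array size, key scheme, and search count are arbitrary choices, not from the question):

// Illustrative micro-benchmark; sizes are arbitrary.
$n = 100000;
$array = [];
for ($i = 0; $i < $n; $i++) {
    $array['key' . $i] = 'value' . $i; // unique keys and values, as assumed above
}
$searches = 1000;

// Method 1: repeated array_search() over the values -- O(N*Y)
$start = microtime(true);
for ($i = 0; $i < $searches; $i++) {
    $key = array_search('value' . rand(0, $n - 1), $array);
}
$t1 = microtime(true) - $start;

// Method 2: one array_flip(), then O(1) key lookups -- O(N) + O(Y)
$start = microtime(true);
$flipped = array_flip($array);
for ($i = 0; $i < $searches; $i++) {
    $key = $flipped['value' . rand(0, $n - 1)];
}
$t2 = microtime(true) - $start;

printf("array_search: %.4fs, array_flip + lookups: %.4fs\n", $t1, $t2);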
I iterate over an array of arrays and access each inner array's values through associative keys; below is a code snippet. Note: I never iterate over the whole array, only over a window of 10.
// extract array from a db table (not real code)
$array = $query->executeAndFetchAssociative();
$window_start = 0;
for ($i = $window_start; $i < count($array) && $i < $window_start + 10; $i++) {
    echo $array[$i]["db_field"];
}
This is a sort of paginator for a web interface. I receive the window_start value and display the next 10 values.
A conceptual execution:
Receive the window_start number
Start the cycle, entering the window_start-th inner array of the outer array
Display the value of a field of the inner array via its associative index
Move to window_start+1
The inner arrays have about 40 fields. The outer array can grow a lot, as it represents a database table.
Now I see that as the outer array gets bigger, executing over the window of 10 takes more and more time.
I need some "performance theory" on my code:
If I access the values of the inner arrays via a numeric key, can I get better performance? In general, is accessing array values with a numeric index quicker than with an associative (string) index?
What does it cost to access a random entry ($array[random_num]) of an array of length N? O(N), O(N/2), just for example.
Finally, does the speed of iterating over an array depend on the array length? I always iterate over just 10 elements of the array, but how does the array length impact my fixed-length iteration?
Thanks
Alberto
If I access the values of the inner arrays via a numeric key, can I get better performance? In general, is accessing array values with a numeric index quicker than with an associative (string) index?
There might be a theoretical speed difference for integer-based vs string-based access (it depends on what the hash function for integer values does vs the one for string values, I have not read the PHP source to get a definite answer), but it's certainly going to be negligible.
What does it cost to access a random entry ($array[random_num]) of an array of length N? O(N), O(N/2), just for example.
Arrays in PHP are implemented as hash tables, which means that lookup is O(1) and insertion is amortized O(1) -- almost all insertions are O(1), but a few may be O(n). By the way, O(n) and O(n/2) are the same thing; you might want to revisit a text on algorithmic complexity.
Finally, does the speed of iterating over an array depend on the array length? I always iterate over just 10 elements of the array, but how does the array length impact my fixed-length iteration?
No, array length is not a factor.
The performance drops not because of how you access your array, but because you seem to be loading all of the records from your database just to process 10 of them.
You should move the paging logic to the database itself by including an offset and a limit in your SQL query.
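For example, a sketch of such a query (the table and column names are invented):

SELECT db_field
FROM my_table
ORDER BY id
LIMIT 10 OFFSET 20;  -- fetches only rows 21-30 instead of the whole table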
Premature optimization is the root of all evil. Additionally, numeric and associative arrays have very different semantic meanings and are therefore usually not interchangeable. And last but not least: no. Arrays in PHP are implemented as hash maps, and accessing them by key is always O(1).
In your case (pagination) it's much more useful to fetch only the items you want to display instead of fetching everything and slicing it later. SQL has the LIMIT 10 OFFSET 20 syntax for that.
I have a relatively large database (130,000+ rows) of weather data, which is accumulating very fast (a new row is added every 5 minutes). On my website I publish min/max data per day, and for the entire existence of my weather station (which is around 1 year).
Now I would like to know if I would benefit from creating additional tables where these min/max data would be stored, rather than letting PHP run a MySQL query searching for the daily min/max and the min/max over the entire existence of my weather station. Would a query for MAX(), MIN() or SUM() (I need SUM() to total rain accumulation for months) take that much longer than a simple query against a table that already holds those min, max and sum values?
That depends on whether your columns are indexed or not. In the case of MIN() and MAX(), the MySQL manual says the following:
MySQL uses indexes for these operations:
To find the MIN() or MAX() value for a specific indexed column key_col. This is optimized by a preprocessor that checks whether you are using WHERE key_part_N = constant on all key parts that occur before key_col in the index. In this case, MySQL does a single key lookup for each MIN() or MAX() expression and replaces it with a constant.
In other words, if your columns are indexed, you are unlikely to gain much performance benefit from denormalization. If they are NOT, you will definitely gain performance.
As for SUM(), it is likely to be faster on an indexed column, but I'm not really confident about the performance gains here.
Please note that you should not be tempted to index all your columns after reading this post. If you add indexes, your update queries will slow down!
Yes, denormalization should help performance a lot in this case.
There is nothing wrong with storing calculations for historical data that will not change in order to gain performance benefits.
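A hedged sketch of what such a summary table and its daily refresh might look like (all table and column names here are invented):

-- Hypothetical summary table; names are invented for illustration.
CREATE TABLE daily_weather_summary (
    day        DATE PRIMARY KEY,
    temp_min   DECIMAL(5,2),
    temp_max   DECIMAL(5,2),
    rain_total DECIMAL(6,2)
);

-- Run once per day (e.g., from cron) to summarize yesterday's readings.
INSERT INTO daily_weather_summary (day, temp_min, temp_max, rain_total)
SELECT DATE(recorded_at), MIN(temperature), MAX(temperature), SUM(rainfall)
FROM weather_readings
WHERE DATE(recorded_at) = CURDATE() - INTERVAL 1 DAY
GROUP BY DATE(recorded_at);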
While I agree with RedFilter that there is nothing wrong with storing historical data, I don't agree about the performance boost you will get. Your database is not what I would consider a heavy-use database.
One of the major advantages of databases is indexes. They use advanced data structures to make data access lightning fast. Just think: every primary key you have is an index. You shouldn't be afraid of them. Of course, it would probably be counterproductive to index all your fields, but that should never really be necessary. I would suggest researching indexes more to find the right balance.
As for the work done when a change happens, it is not that bad. An index is a tree-like representation of your field data. This is done to reduce a search down to a small number of near-binary decisions.
For example, think of finding a number between 1 and 100. Normally you would randomly stab at numbers, or you would just start at 1 and count up. This is slow. Instead, it is much faster if you set things up so that you can ask whether you are over or under each time you choose a number. Then you would start at 50 and ask if you are over or under. If under, you choose 75, and so on until you find the number. Instead of possibly going through 100 numbers, you only have to go through around 7 to find the correct one.
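That guessing strategy is just binary search; a minimal PHP sketch of the game above (the target value is arbitrary):

// Guess a number between $low and $high, halving the range each time.
function guessNumber(int $target, int $low = 1, int $high = 100): int {
    $steps = 0;
    while ($low <= $high) {
        $steps++;
        $mid = intdiv($low + $high, 2);
        if ($mid === $target) {
            return $steps;          // found it
        } elseif ($mid < $target) {
            $low = $mid + 1;        // "under" -- search the upper half
        } else {
            $high = $mid - 1;       // "over" -- search the lower half
        }
    }
    return -1; // not in range
}

echo guessNumber(83); // finds 83 in 7 steps instead of up to 100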
The problem comes when you add 50 numbers and make the range 1 to 150. If you start at 50 again, your search is less optimized, as there are now 100 numbers above you. Your binary search is out of balance. So what you do is rebalance your search by starting at the new mid-point, namely 75.
So the work a database does is just an adjustment to rebalance the mid-point of its index. It isn't actually a lot of work. If you were working on a database that is large and takes many changes a second, you would definitely need a strong strategy for your indexes. In a small database that gets very few changes, like yours, it's not a problem.