How can I access the 10 most recent values from Cassandra? I need to get the most recent values from a particular super column. I am getting results, but they are not sorted properly. I need to sort the result in descending order (latest first, based on the column name/timestamp).
Without knowing the precise application it is hard to give a definite answer. In general, if I understand you correctly: a SuperColumn carries no timestamp data, so I believe you would need to key each of your SuperColumns with a timestamp or another numerical key, and then define CompareWith="LongType" (or similar; check the documentation) in the storage config XML.
This would result in each of your SuperColumns within the scope being sorted by ascending key. To retrieve the most recent, then, you would need to set the reversed attribute on your SliceRange to true (how exactly this is done will depend on the language/library you're using). See http://wiki.apache.org/cassandra/API#SliceRange
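To make the effect of the reversed flag concrete, here is a minimal Python sketch. It does not call Cassandra at all: `slice_super_columns` is a hypothetical stand-in that only mimics the ordering behaviour described above (numeric keys sorted ascending, optionally read from the end), and the keys are made-up timestamps.

```python
from typing import Dict, List, Tuple


def slice_super_columns(
    super_columns: Dict[int, dict], count: int = 10, reversed_: bool = False
) -> List[Tuple[int, dict]]:
    """Hypothetical stand-in for a slice query; NOT a Cassandra client call."""
    # With a LongType-style comparator, super columns are ordered by numeric key.
    ordered = sorted(super_columns.items(), key=lambda item: item[0])
    if reversed_:
        # Reading in reverse starts from the largest (most recent) key.
        ordered.reverse()
    return ordered[:count]


# 25 super columns keyed by made-up timestamps 1000..1024.
row = {1_000 + i: {"value": f"reading-{i}"} for i in range(25)}

latest = slice_super_columns(row, count=10, reversed_=True)
latest_keys = [key for key, _ in latest]
print(latest_keys)
```

With a real client, the equivalent would be a slice with the reversed attribute set and a count of 10; how that is spelled depends on the library.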
Hopefully this points you in the right direction :).
James
I am using the commonly known reddit 'hot' algorithm on my table 'posts'. Now this hot column is a decimal number like this: 'XXXXX,XXXXXXXX'
I want this column to be an index, because when I order by 'hot', I want the query to be as fast as possible. However, I am kind of new to indexes. Does an index need to be unique?
If it has to be unique, would this work and be efficient?
$table->unique(['id', 'hot']);
If it does not have to be unique, would this be the right approach?
$table->index('hot');
Last question: would the following query be taking advantage of the index?
Post::orderBy('hot', 'desc')->get()
If not, how should I modify it?
Thank you very much!
Do not make it UNIQUE unless you need the constraint that you cannot insert duplicates.
Phrased differently, a UNIQUE key is two things: an INDEX (for speedy searching) and a "constraint" (to give an error when trying to insert a dup).
ORDER BY hot DESC can use INDEX(hot) (or UNIQUE(hot)). I say "can", not "will", because there are other issues where the Optimizer may decide not to use the index. (Without seeing the actual query and knowing more about the dataset, I can't be more specific.)
If id is the PRIMARY KEY, then neither of these is of any use: INDEX(id, hot); UNIQUE(id, hot). Swapping the order of the columns makes sense. Or simply INDEX(hot).
A caveat: EXPLAIN does not say whether the index is used for ORDER BY, only for WHERE. On the other hand, EXPLAIN FORMAT=JSON does give more details. Try that.
(Yes, DECIMAL columns can be indexed.)
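As a rough illustration of a plain (non-unique) index serving an ORDER BY, the sketch below uses Python's built-in sqlite3 module. SQLite is a swapped-in engine here, not MySQL, so the planner output and its decisions will differ from what MySQL's EXPLAIN reports; the table layout and the index name `idx_posts_hot` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, hot REAL NOT NULL)")
conn.executemany(
    "INSERT INTO posts (hot) VALUES (?)",
    [(float(i % 97) + i / 1000.0,) for i in range(1000)],
)


def plan(sql: str) -> str:
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last field of each row is the human-readable detail string.
    return " | ".join(row[-1] for row in rows)


query = "SELECT id, hot FROM posts ORDER BY hot DESC"

# Without an index the engine has to sort the rows itself.
plan_without_index = plan(query)

# A non-unique index is enough; duplicate `hot` values remain allowed.
conn.execute("CREATE INDEX idx_posts_hot ON posts (hot)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

On MySQL, the equivalent check would be to run EXPLAIN FORMAT=JSON on the query before and after adding the index.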
I use the sorted set type in Redis.
For each user I create a separate key and store that user's data in it.
Example keys:
FEED:USER:1, FEED:USER:2, FEED:USER:3
I want to select data from Redis for the keys of users 1, 2, and 3, each sorted by score (timestamp).
Put simply, I need to select data from each key over a time range and then combine all the results, sorted by score.
There are a couple of ways to do this but the right one depends on what you're trying to do. For example:
1. You can use ZRANGEBYSCORE (or ZREVRANGEBYSCORE) in your code for each FEED:USER:n key and "merge" the replies in the client.
2. You can do a ZUNIONSTORE on the relevant keys and then do the ZRANGEBYSCORE on the result from the client.
However, if your "feeds" are large, #2's flow should be reversed - first range and then union.
You could also do similar types of processing entirely server-side with some Lua scripting.
EDIT: further clarifications
Re. 1 - Merging could be done client-side on the results that you get from ZRANGEBYSCORE, or you could use server-side Lua scripts to do that. Use the WITHSCORES option to get the timestamps and merge/sort on them. Regardless of where you choose to run this code (I'd probably use Lua for data locality), the implementation is up to you - let me know if you need help with that :)
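A minimal sketch of the client-side merge, in Python. It assumes each per-key reply has already been fetched (for example with ZREVRANGEBYSCORE ... WITHSCORES) and parsed into a list of (member, score) pairs sorted by descending score; the sample members and scores are made up, and no Redis client is used here.

```python
import heapq
from itertools import islice
from typing import Iterable, List, Tuple

# (member, score): the shape of a WITHSCORES reply after parsing.
Entry = Tuple[str, float]


def merge_feeds(replies: Iterable[List[Entry]], limit: int) -> List[Entry]:
    """Merge replies that are each already sorted by descending score."""
    # heapq.merge streams the inputs lazily, so nothing is re-sorted.
    merged = heapq.merge(*replies, key=lambda entry: entry[1], reverse=True)
    return list(islice(merged, limit))


# Made-up replies for FEED:USER:1, FEED:USER:2 and FEED:USER:3.
feed_user_1 = [("post:17", 1700000300.0), ("post:11", 1700000100.0)]
feed_user_2 = [("post:21", 1700000400.0), ("post:12", 1700000200.0)]
feed_user_3 = [("post:09", 1700000050.0)]

combined = merge_feeds([feed_user_1, feed_user_2, feed_user_3], limit=4)
print(combined)
```

Because each input is already ordered, the merge only does as much work as the number of entries actually returned, which suits the "range first, then combine" flow for large feeds.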
I have run into a predicament on a system I have been working on. In the table "class" there are multiple rows of classes. The column I am focusing on is "dates". On the admin dashboard, I need to be able to list the classes from the closest date to the farthest away. I tried using this, but it does not work the way I want it to.
mysqli_query("SELECT * FROM class ORDER BY dates ASC")
My problem is that the column "dates" is actually a serialized array of the start and end dates. Because of this I can't use strtotime() to make the above code work 100% correctly.
My overall question is, is there any way I can sort a query by a serialized date string?
I know opinions of different ways to do it will arise, but that's not what I'm shooting for. If it is possible or you have any idea, please post an answer.
mysql order by serialized data?
No, it is not possible. Serialized data is acceptable only when you never need to search or order by that data. In all other cases, store your data in separate fields.
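A minimal sketch of the "separate fields" approach, using Python's built-in sqlite3 module as a stand-in for MySQL. The column names `start_date` and `end_date` and the sample rows are assumptions for illustration; in MySQL these would normally be DATE columns, while here they are ISO-8601 strings, which sort chronologically as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Start and end dates live in their own columns instead of one serialized array.
conn.execute(
    "CREATE TABLE class ("
    "id INTEGER PRIMARY KEY, name TEXT, start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO class (name, start_date, end_date) VALUES (?, ?, ?)",
    [
        ("Pottery", "2024-06-10", "2024-06-12"),
        ("Welding", "2024-03-01", "2024-03-05"),
        ("Baking", "2024-04-15", "2024-04-16"),
    ],
)

# The database can now order by the start date directly.
rows = conn.execute(
    "SELECT name, start_date FROM class ORDER BY start_date ASC"
).fetchall()
names_in_order = [name for name, _ in rows]
print(names_in_order)
```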
From someone with more experience than myself, would it be a better idea to simply count the number of items in a table (such as counting the number of topics in a category) or to keep a variable that holds that value and just increment and call it (an extra field in the category table)?
Is there a significant difference between the two or is it just very slight, and even if it is slight, would one method still be better than the other? It's not for any one particular project, so please answer generally (if that makes sense) rather than based on something like the number of users.
Thank you.
To get the number of items (rows in a table), you'd use standard SQL and do it on demand:
SELECT COUNT(*) FROM MyTable
Note, in case I've missed something, each item (row) in the table has some unique identifier, whether it's a part number, some code, or an auto-increment. So adding a new row could trigger the "auto-increment" of a column.
This is unrelated to "counting rows". Because of DELETEs or ROLLBACK, numbers may not be contiguous.
Trying to maintain row counts separately will end in tears and/or disaster. Trying to use COUNT(*)+1 or MAX(id)+1 to generate a new row identifier is even worse.
I think there is some confusion about your question. My interpretation is that you are asking whether to run SELECT COUNT(*) or to maintain a column that tracks the count.
I would not add such a column unless you have a reason to do so. This is premature optimization, and it complicates your software design.
Also, you want to avoid having the same information stored in different places. Counting is a trivial task, so you would actually be duplicating information, which is a bad idea.
I'd go with just counting. If you notice a performance issue, you can consider other options, but as soon as you keep a value that's separate, you have to do some work to make sure it's always correct. Using COUNT() you always get the actual number "straight from the horse's mouth" so to speak.
Basically, don't start optimizing until you have to. If everything works fine and fast using COUNT(), then do that. Otherwise, store the count somewhere, but rather than adding/subtracting to update the stored value, run COUNT() when needed to get the new number of items.
In my forum I count the sub-threads in a forum like this:
SELECT COUNT(forumid) AS count FROM forumtable
As long as you use a consistent identifier to specify the forum and/or sub-section, and that column is indexed, it's very fast. So there's no reason to add more columns than you need.
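To illustrate counting on demand against an indexed column, here is a small sketch using Python's built-in sqlite3 module rather than MySQL. The `topics` table, its `category_id` column, and the index name are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE topics (id INTEGER PRIMARY KEY, category_id INTEGER NOT NULL)"
)
# Index the column used to pick out one category.
conn.execute("CREATE INDEX idx_topics_category ON topics (category_id)")
conn.executemany(
    "INSERT INTO topics (category_id) VALUES (?)",
    [(1,), (1,), (1,), (2,), (2,)],
)


def topic_count(category_id: int) -> int:
    """Count the topics in one category at the moment of the call."""
    row = conn.execute(
        "SELECT COUNT(*) FROM topics WHERE category_id = ?", (category_id,)
    ).fetchone()
    return row[0]


before = topic_count(1)
conn.execute("DELETE FROM topics WHERE id = 2")  # no counter to keep in sync
after = topic_count(1)
print(before, after)
```

The count follows the delete without any extra bookkeeping, which is the main argument against a separately maintained counter column.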
ETA: I'm thinking that my question needs some clarification. I don't want to sort my arrays. I want to be sure that a given array that is in order by a particular criterion, is also in order by another criterion. I made a graphic to illustrate. Each row is an array ordered by number. If the letters are also in order, the array passes the test.
Original Question
I have a parent class PhysicalCount with 2 properties: date, count. I also have subclasses of PhysicalCount: ClutchCount, FryCount and MatCount. When I have a mixed array of PhysicalCounts and subclasses I need to verify (not set!) that the order matches these criteria:
objects are in order by date
0 to 1 objects of each child class may exist
0 to many objects of PhysicalCount may exist
if a ClutchCount is present it must have an earlier date than a FryCount or MatCount if those exist
if a FryCount is present it must have an earlier date than a MatCount if that exists
Boiled down, the question is something like:
Given a list sorted by one criterion ($o->date in my case), what is the most efficient way to ascertain that sorting that same list by another criterion (get_class($o) in my case) will result in the same order?
I'd prefer a solution in PHP, but I'm thinking this is a fairly common problem that has a standard solution that I just don't know the name of. (Here is me regretting my degree choice [not CS]).
Based on our discussion in the comments, what you need is a sort that is somewhat "stable" (though not actually a stable sort by definition). Use usort to sort the objects in the array in a user-defined way. You can encode the criteria you mentioned in your question in your $cmp_function, so that the sorted array fits your needs.
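If the goal is only to verify the order rather than to produce it, a single pass over the list may be enough. The sketch below is in Python rather than PHP and is one possible reading of the criteria in the question, not the approach from the answer above: the class names mirror the question, while `STAGE_RANK` and `is_valid_sequence` are hypothetical names introduced here.

```python
import datetime
from dataclasses import dataclass
from typing import List


@dataclass
class PhysicalCount:
    date: datetime.date
    count: int


class ClutchCount(PhysicalCount):
    """Subclass from the question; adds no fields."""


class FryCount(PhysicalCount):
    """Subclass from the question; adds no fields."""


class MatCount(PhysicalCount):
    """Subclass from the question; adds no fields."""


# Required order of the child classes (assumed from the question's rules).
STAGE_RANK = {ClutchCount: 0, FryCount: 1, MatCount: 2}


def is_valid_sequence(counts: List[PhysicalCount]) -> bool:
    """Check date order and child-class order in one pass, without sorting."""
    previous_date = None
    last_rank = -1
    last_stage_date = None
    for item in counts:
        # The list as a whole must be in date order.
        if previous_date is not None and item.date < previous_date:
            return False
        previous_date = item.date
        rank = STAGE_RANK.get(type(item))
        if rank is None:
            continue  # plain PhysicalCount: any number, anywhere
        # A repeated or out-of-order child class fails the check.
        if rank <= last_rank:
            return False
        # A later stage must have a strictly later date.
        if last_stage_date is not None and item.date <= last_stage_date:
            return False
        last_rank, last_stage_date = rank, item.date
    return True


d = datetime.date
good = [
    PhysicalCount(d(2024, 1, 1), 5),
    ClutchCount(d(2024, 1, 2), 40),
    PhysicalCount(d(2024, 1, 3), 6),
    FryCount(d(2024, 1, 10), 30),
    MatCount(d(2024, 2, 1), 12),
]
bad = [FryCount(d(2024, 1, 2), 30), ClutchCount(d(2024, 1, 5), 40)]
print(is_valid_sequence(good), is_valid_sequence(bad))
```

The same loop translates directly to PHP with get_class($o) as the lookup key; it runs in linear time, whereas sorting and comparing would cost O(n log n).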