What is the default order of a query when no ORDER BY is used?
There is no such order present. Taken from http://forums.mysql.com/read.php?21,239471,239688#msg-239688
Do not depend on order when ORDER BY is missing.
Always specify ORDER BY if you want a particular order -- in some situations the engine can eliminate the ORDER BY because of how it
does some other step.
GROUP BY forces ORDER BY. (This is a violation of the standard. It can be avoided by using ORDER BY NULL.)
SELECT * FROM tbl -- this will do a "table scan". If the table has
never had any DELETEs/REPLACEs/UPDATEs, the records will happen to be
in the insertion order, hence what you observed.
If you had done the same statement with an InnoDB table, they would
have been delivered in PRIMARY KEY order, not INSERT order. Again,
this is an artifact of the underlying implementation, not something to
depend on.
There's none. Depending on what you query and how your query was optimised, you can get any order. There's even no guarantee that two queries which look the same will return results in the same order: if you don't specify it, you cannot rely on it.
I've found SQL Server to be almost random in its default order (depending on age and complexity of the data), which is good as it forces you to specify all ordering.
(I vaguely remember Oracle being similar to SQL Server in this respect.)
MySQL by default seems to order by the record structure on disk, (which can include out-of-sequence entries due to deletions and optimisations) but it often initially fools developers into not bother using order-by clauses because the data appears to default to primary-key ordering, which is not the case!
I was surprised to discovere today, that MySQL 5.6 and 4.1 implicitly sub-order records which have been sorted on a column with a limited resolution in the opposite direction. Some of my results have identical sort-values and the overall order is unpredictable. e.g. in my case it was a sorted DESC by a datetime column and some of the entries were in the same second so they couldn't be explicitly ordered. On MySQL 5.6 they select in one order (the order of insertion), but in 4.1 they select backwards! This led to a very annoying deployment bug.
I have't found documentation on this change, but found notes on on implicit group order in MySQL:
By default, MySQL sorts all GROUP BY col1, col2, ... queries as if you specified ORDER BY col1, col2, ... in the query as well.
However:
Relying on implicit GROUP BY sorting in MySQL 5.5 is deprecated. To achieve a specific sort order of grouped results, it is preferable to use an explicit ORDER BY clause.
So in agreement with the other answers - never rely on default or implicit ordering in any database.
The default ordering will depend on indexes used in the query and in what order they are used. It can change as the data/statistics change and the optimizer chooses different plans.
If you want the data in a specific order, use ORDER BY
Related
What is the default order of a query when no ORDER BY is used?
There is no such order present. Taken from http://forums.mysql.com/read.php?21,239471,239688#msg-239688
Do not depend on order when ORDER BY is missing.
Always specify ORDER BY if you want a particular order -- in some situations the engine can eliminate the ORDER BY because of how it
does some other step.
GROUP BY forces ORDER BY. (This is a violation of the standard. It can be avoided by using ORDER BY NULL.)
SELECT * FROM tbl -- this will do a "table scan". If the table has
never had any DELETEs/REPLACEs/UPDATEs, the records will happen to be
in the insertion order, hence what you observed.
If you had done the same statement with an InnoDB table, they would
have been delivered in PRIMARY KEY order, not INSERT order. Again,
this is an artifact of the underlying implementation, not something to
depend on.
There's none. Depending on what you query and how your query was optimised, you can get any order. There's even no guarantee that two queries which look the same will return results in the same order: if you don't specify it, you cannot rely on it.
I've found SQL Server to be almost random in its default order (depending on age and complexity of the data), which is good as it forces you to specify all ordering.
(I vaguely remember Oracle being similar to SQL Server in this respect.)
MySQL by default seems to order by the record structure on disk, (which can include out-of-sequence entries due to deletions and optimisations) but it often initially fools developers into not bother using order-by clauses because the data appears to default to primary-key ordering, which is not the case!
I was surprised to discovere today, that MySQL 5.6 and 4.1 implicitly sub-order records which have been sorted on a column with a limited resolution in the opposite direction. Some of my results have identical sort-values and the overall order is unpredictable. e.g. in my case it was a sorted DESC by a datetime column and some of the entries were in the same second so they couldn't be explicitly ordered. On MySQL 5.6 they select in one order (the order of insertion), but in 4.1 they select backwards! This led to a very annoying deployment bug.
I have't found documentation on this change, but found notes on on implicit group order in MySQL:
By default, MySQL sorts all GROUP BY col1, col2, ... queries as if you specified ORDER BY col1, col2, ... in the query as well.
However:
Relying on implicit GROUP BY sorting in MySQL 5.5 is deprecated. To achieve a specific sort order of grouped results, it is preferable to use an explicit ORDER BY clause.
So in agreement with the other answers - never rely on default or implicit ordering in any database.
The default ordering will depend on indexes used in the query and in what order they are used. It can change as the data/statistics change and the optimizer chooses different plans.
If you want the data in a specific order, use ORDER BY
I want to get two rows from my table in a MySQL databse. these two rows must be the first one and last one after I ordered them. To achieve this i made two querys, these two:
SELECT dateBegin, dateTimeBegin FROM worktime ORDER BY dateTimeBegin ASC LIMIT 1;";
SELECT dateBegin, dateTimeBegin FROM worktime ORDER BY dateTimeBegin DESC LIMIT 1;";
I decided to not get the entire set and pick the first and last in PHP to avoid possibly very large arrays. My problem is, that I have two querys and I do not really know how efficient this is. I wanted to combine them for example with UNION, but then I would still have to order an unsorted list twice which I also want to avoid, because the second sorting does exactly the same as the first
I would like to order once and then select the first and last value of this ordered list, but I do not know a more efficient way then the one with two querys. I know the perfomance benefit will not be gigantic, but nevertheless I know that the lists are growing and as they get bigger and bigger and I execute this part for some tables I need the most efficient way to do this.
I found a couple of similar topics, but none of them adressed this particular perfomance question.
Any help is highly appreciated.
(This is both an "answer" and a rebuttal to errors in some of the comments.)
INDEX(dateTimeBegin)
will facilitate SELECT ... ORDER BY dateTimeBegin ASC LIMIT 1 and the corresponding row from the other end, using DESC.
MAX(dateTimeBegin) will find only the max value for that column; it will not directly find the rest of the columns in that row. That would require a subquery or JOIN.
INDEX(... DESC) -- The DESC is ignored by MySQL. This is almost never a drawback, since the optimizer is willing to go either direction through an index. The case where it does matter is ORDER BY x ASC, y DESC cannot use INDEX(x, y), nor INDEX(x ASC, y DESC). This is a MySQL deficiency. (Other than that, I agree with Gordon's 'answer'.)
( SELECT ... ASC )
UNION ALL
( SELECT ... DESC )
won't provide much, if any, performance advantage over two separate selects. Pick the technique that keeps your code simpler.
You are almost always better off having a single DATETIME (or TIMESTAMP) field than splitting out the DATE and/or TIME. SELECT DATE(dateTimeBegin), dateTimeBegin ... works simply, and "fast enough". See also the function DATE_FORMAT(). I recommend dropping the dateBegin column and adjusting the code accordingly. Note that shrinking the table may actually speed up the processing more than the cost of DATE(). (The diff will be infinitesimal.)
Without an index starting with dateTimeBegin, any of the techniques would be slow, and get slower as the table grows in size. (I'm pretty sure it can find both the MIN() and MAX() in only one full pass, and do it without sorting. The pair of ORDER BYs would take two full passes, plus two sorts; 5.6 may have an optimization that almost eliminates the sorts.)
If there are two rows with exactly the same min dateTimeBegin, which one you get will be unpredictable.
Your queries are fine. What you want is an index on worktime(dateTimeBegin). MySQL should be smart enough to use this index for both the ASC and DESC sorts. If you test it out, and it is not, then you'll want two indexes: worktime(dateTimeBegin asc) and worktime(dateTimeBegin desc).
Whether you run one query or two is up to you. One query (connected by UNION ALL) is slightly more efficient, because you have only one round-trip to the database. However, two might fit more easily into your code, and the difference in performance is unimportant for most purposes.
I am using CodeIgniter 2.1.3 and PHP 5.4.8, and two PostgreSQL servers (P1 and P2).
I am having a problem with list_fields() function in CodeIgniter. When I retrieve fields from P1 server, fields are in the order that I originally created the table with. However, if I use the exact same codes to retrieve fields from P2 server, fields are in reverse order.
If fields from P1 is array('id', 'name', 'age'),
fields from P2 becomes array('age', 'name', 'id')
I don't think this is CodeIgniter specific problem, but rather general database configuration or PHP problem, because codes are identical.
This is the code that I get fields with.
$fields = $this->db->list_fields("clients");
I have to clarify something. #muistooshort claims in a comment above:
Technically there is no defined order to the columns in a table.
#mu may be thinking of the order or rows, which is arbitrary without ORDER BY.
It is completely incorrect for the order of columns, which is well defined and stored in the column pg_attribute.attnum. It's used in many places, like INSERT without column definition list or SELECT *. It is preserved through a dump / restore cycle and has significant bearing on storage size and performance.
You cannot simply change the order of columns in PostgreSQL, because it has not been implemented, yet. It's deeply wired into the system and hard to change. There is a Postgres Wiki page and it's on the TODO list of the project:
Allow column display reordering by recording a display, storage, and
permanent id for every column?
Find out for your table:
SELECT attname, attnum
FROM pg_attribute
WHERE attrelid = 'myschema.mytable'::regclass
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0 -- no system columns
ORDER BY attnum;
It is unwise to use SELECT * in some contexts, where the columns of the underlying table may change and break your code. It is explicitly wise to use SELECT * in other contexts, where you need all columns (in default order).
As to the primary question
This should not occur. SELECT * returns columns in a well defined order in PostgreSQL. Some middleware must be messing with you.
I suspect you are used to MySQL which allows you to reorder columns post columns post-table creation. PostgreSQL does not let you do this, so when you:
ALTER TABLE foo ADD bar int;
It puts this on the end of the table always and there is no way to change the order.
On PostgreSQL you should not assume that the order of the columns is meaningful because these can differ from server to server based on the order in which the columns were defined.
However the fact that these are reversed is odd to me.
If you want to see the expected order on the db, use:
\d foo
from psql
If these are reversed then the issue is in the db creation (this is my first impression). That's the first thing to look at. If that doesn't show the problem then there is something really odd going on with CodeIgniter.
I realized, that the response to a MySQL query becomes much faster, when creating an index for the column you use for "ORDER BY", e.g.
SELECT username FROM table ORDER BY registration_date DESC
Now I'm wondering which indices I should create to optimize the request time.
For example I frequently use the following queries:
SELECT username FROM table WHERE
registration_date > ".(time() - 10000)."
SELECT username FROM table WHERE
registration_date > ".(time() - 10000)."
&& status='active'
SELECT username FROM table WHERE
status='active'
SELECT username FROM table ORDER BY registration_date DESC
SELECT username FROM table WHERE
registration_date > ".(time() - 10000)."
&& status='active'
ORDER BY birth_date DESC
Question 1:
Should I set up separate indices for the first three request types? (i.e. one index for the column "registration_date", one index for the column "status", and another column for the combination of both?)
Question 2:
Are different indices independently used for "WHERE" and for "ORDER BY"? Say, I have a combined index for the columns "status" and "registration_date", and another index only for the column "birth_date". Should I setup another combined index for the three columns ("status", "registration_date" and "birth_date")?
There are no hard-and-fast rules for indices or query optimization. Each case needs to be considered and examined.
Generally speaking, however, you can and should add indices to columns that you frequently sort by or use in WHERE statements. (Answer to Question 2 -- No, the same indices are potentially used for ORDER BY and WHERE) Whether to do a multi-column index or a single-column one depends on the frequency of queries. Also, you should note that single-column indices may be combined by mySQL using the Index Merge Optimization:
The Index Merge method is used to retrieve rows with several range
scans and to merge their results into one. The merge can produce
unions, intersections, or unions-of-intersections of its underlying
scans. This access method merges index scans from a single table; it
does not merge scans across multiple tables.
(more reading: http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html)
Multi-column indices also require that you take care to structure your queries in such a way that your use of indexed columns matches the column order in the index:
MySQL cannot use an index if the columns do not form a leftmost
prefix of the index. Suppose that you have the SELECT statements shown
here:
SELECT * FROM tbl_name WHERE col1=val1; SELECT * FROM tbl_name WHERE
col1=val1 AND col2=val2;
SELECT * FROM tbl_name WHERE col2=val2; SELECT * FROM tbl_name WHERE
col2=val2 AND col3=val3;
If an index exists on (col1, col2, col3), only the first two queries
use the index. The third and fourth queries do involve indexed
columns, but (col2) and (col2, col3) are not leftmost prefixes of
(col1, col2, col3).
Bear in mind that indices DO have a performance consideration of their own -- it is possible to "over-index" a table. Each time a record is inserted or an indexed column is modified, the index/indices will have to be rebuilt. This does demand resources, and depending on the size and structure of your table, it may cause a decrease in responsiveness while the index building operations are active.
Use EXPLAIN to find out exactly what is happening in your queries. Analyze, experiment, and don't over-do it. The shotgun approach is not appropriate for database optimization.
Documentation
MySQL EXPLAIN - http://dev.mysql.com/doc/refman/5.0/en/explain.html
How MySQL uses indices - http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
Index Merge Optimization - http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html
To quote this page:
[Indices] will slow down your updates and inserts.
That's the tradeoff you have to calculate. To optimize your table, you should put indices only in the columns you are most likely to apply conditions to - the more indices you have, the slower your data-changing operations become. In that sense, I personally don't see much merit in creating combined indices - if you create all 7 possible permutations of indices for 3 columns, you are most definitely putting more drag on your updates and inserts than just using 3 indices for 3 columns (and even that can be debatable). On the other hand, if the data is being edited much, much less than it is being SELECTed, then indices can really help you speed things up.
Something else to take into consideration (again quoting the above page):
If your table is very small [...] it's worse to use an index than to leave it out and just let it do a table scan. Indexes really only come in handy with tables that have a lot of rows.
Yes, it is a good idea to have indexes on your column you often use, both for order by and in your where clauses.
But be aware: UPDATES, INSERTS and DELETE slow down if you have indexes.
That is because after such an operation, the index must be updated too.
So, as a rule-of-thumb: If your application is read-intensive, use the indexes where you think they help.
If your application is often updating the data, be careful, because that may get slow because of the indexes.
When in doubt, you must simply get dirty hands, and study the results of EXPLAIN.
http://dev.mysql.com/doc/refman/5.6/en/explain.html
As for the first two examples, you can satisfy them with one index: {registration_date, status}. Such an index can support filters on the first item (registration_date) or on both.
It does not work for status alone, however. The question on status is how selective is the status. That is, what proportion of records have status = "active". If this is a high proportion (so, on average, every database page would have an active record), then an index may not help very much.
The order by's are trickier. I don't know if mysql uses indexes for this purpose. Often, using an index for sorting entire records is less efficient than just sorting the records. Using the index causes a random access pattern to the records in the pages, which can cause major performance problems for tables larger than the page cache.
Use the explain function on your select statements to determine where your joins are slowing down (the more rows that are referenced, the slower it will be). Then apply your indices to those columns.
EXPLAIN SELECT * FROM table JOIN table 2 ON a = b WHERE conditions;
I'm trying to implement multilingual indexes for the web application I'm developing. At the moment, records exist in a few languages, English, Malay & Arabic (but they are not separated into different columns). Only English stemmer is currently enabled.
Only two indexes are built, for the stemmed and the non-stemmed indexes. I'm having the problem with the stemmed index, as the result set returned is not consistent, depending on the sort column.
These two queries (from the stemmed index), each returns a different number of total results, although the difference between them is only the sort order.
SELECT * FROM test1stemmed WHERE MATCH('#institution universiti') GROUP BY art_id ORDER BY art_title_ord ASC;
SELECT * FROM test1stemmed WHERE MATCH('#institution universiti') GROUP BY art_id ORDER BY art_title_ord DESC;
However, if the same queries were run on the non-stemmed index, the numbers of results are equal.
I'm also having the same problem with Sphinx PHP API:
$sp = new SphinxClient();
$sp->SetServer('localhost', 9312);
$sp->SetMatchMode(SPH_MATCH_EXTENDED);
$sp->SetGroupBy('art_id', SPH_GROUPBY_ATTR, "$sp_sort_column $sort");
$sp->SetLimits($offset, $rows_per_page, 1000);
$sp->Query("$q", 'test1stemmed');
What am I missing?
Something that I missed from the documentation here http://sphinxsearch.com/docs/2.0.2/clustering.html
WARNING: grouping is done in fixed memory and thus its results are only approximate; so there might be more groups reported in total_found than actually present. #count might also be underestimated. To reduce inaccuracy, one should raise max_matches. If max_matches allows to store all found groups, results will be 100% correct.
So I can workaround this by increasing the value in max_matches, but since putting a very large value is absolutely undesirable, I would fix the query instead.