I've got a database table mytable with a column name in Varchar format, and column date with Datetime values. I'd like to count names with certain parameters grouped by date. Here is what I do:
SELECT
CAST(t.date AS DATE) AS 'date',
COUNT(*) AS total,
SUM(LENGTH(LTRIM(RTRIM(t.name))) > 4
AND (LOWER(t.name) LIKE '%[a-z]%')) AS 'n'
FROM
mytable t
GROUP BY
CAST(t.date AS DATE)
It seems that there's something wrong with the range syntax here: if I just do LIKE 'a%', it properly counts all the fields starting with 'a'. However, the query above returns 0 for n, although it should count all the fields containing at least one letter.
You write:
It seems that there's something wrong with range syntax here
Indeed so. MySQL's LIKE operator (and SQL generally) does not support range notation, merely simple wildcards.
Try MySQL's nonstandard RLIKE (a.k.a. REGEXP), for fuller-featured pattern matching.
I believe LIKE is just for searching for parts of a string, but it sounds like you want to implement a regular expression to search for a range.
In that case, use REGEXP instead. For example (simplified):
SELECT * FROM mytable WHERE name REGEXP "[a-z]"
Your current query is looking for a string of literally "[a-z]".
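To make the difference concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption: SQLite's LIKE also supports only the % and _ wildcards, and its REGEXP operator has to be registered by hand). The sample names are made up for illustration.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")

def regexp(pattern, value):
    # SQLite evaluates `X REGEXP Y` by calling regexp(Y, X): pattern first.
    return int(value is not None and re.search(pattern, value) is not None)

conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE mytable (name TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("alice",), ("12345",), ("x[a-z]y",)],  # hypothetical sample data
)

# LIKE treats "[a-z]" as five literal characters.
like_rows = [r[0] for r in conn.execute(
    "SELECT name FROM mytable WHERE name LIKE '%[a-z]%' ORDER BY name")]

# REGEXP treats "[a-z]" as a character range.
regexp_rows = [r[0] for r in conn.execute(
    "SELECT name FROM mytable WHERE name REGEXP '[a-z]' ORDER BY name")]

print(like_rows)
print(regexp_rows)
```

Only the row that literally contains "[a-z]" satisfies the LIKE version, while the REGEXP version picks up every name with at least one lowercase letter.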
Updated:
SELECT
CAST(t.date AS DATE) AS 'date',
COUNT(*) AS total,
SUM(LENGTH(LTRIM(RTRIM(t.name))) > 4
AND (LOWER(t.name) REGEXP '[a-z]')) AS 'n'
FROM
mytable t
GROUP BY
CAST(t.date AS DATE)
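A rough way to sanity-check the conditional count is to run the same shape of query against an in-memory SQLite database from Python. This is only a sketch under assumptions: SQLite is not MySQL, so date() stands in for CAST(... AS DATE), REGEXP is registered manually, and the rows are invented.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.create_function(
    "REGEXP", 2,
    lambda pattern, value: int(value is not None
                               and re.search(pattern, value) is not None),
)
conn.execute("CREATE TABLE mytable (name TEXT, date TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [
    ("alice1", "2011-02-01 10:00:00"),      # long enough, has letters
    ("12345678", "2011-02-01 11:00:00"),    # long enough, no letters
    ("bob", "2011-02-01 12:00:00"),         # too short
    ("  charlie ", "2011-02-02 09:00:00"),  # padded, 7 chars once trimmed
])

# A comparison evaluates to 0 or 1, so SUM() counts the rows where it holds.
counts = conn.execute("""
    SELECT date(t.date) AS day,
           COUNT(*) AS total,
           SUM((LENGTH(TRIM(t.name)) > 4)
               AND (LOWER(t.name) REGEXP '[a-z]')) AS n
    FROM mytable t
    GROUP BY date(t.date)
    ORDER BY day
""").fetchall()

print(counts)
```

Note that the regular expression has no % signs: in a regex they would be matched as literal percent characters.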
I believe you want to use REGEXP '[a-z]' instead of LIKE. (Anchoring the pattern as '^[a-z]$' would match only names consisting of exactly one letter.)
You have regex in your LIKE statement, which doesn't work. You need to use RLIKE or REGEXP.
SELECT CAST(t.date AS DATE) AS date,
COUNT(*) AS total
FROM mytable AS t
WHERE t.name REGEXP '[a-zA-Z]'
AND LENGTH(TRIM(t.name)) > 4
GROUP BY CAST(t.date AS DATE)
Also, just FYI: MySQL is terrible with strings, so you really should trim before you insert into the database. That way you don't pay the trimming overhead every time you select.
Related
Let me describe the problem based on the example below.
Let's say there is a string "abc12345" (it could be any string) and there is a table mytable with a column mycolumn of type varchar(100).
There are some rows that end with the last character 5.
There are some rows that end with the last characters 45.
There are some rows that end with the last characters 345.
There are no rows that end with the last characters 2345.
In this case these rows should be selected:
SELECT * FROM mytable WHERE mycolumn LIKE "%345"
That's because "345" is the longest right substring of "abc12345" that occurs at least once as the right substring of at least one string in the mycolumn column.
Any ideas how to write it in one query?
Thank you.
This is a brute force method:
select t.*
from (select t.*,
dense_rank() over (order by (case when mycolumn like '%abc12345' then 1
when mycolumn like '%bc12345' then 2
when mycolumn like '%c12345' then 3
when mycolumn like '%12345' then 4
when mycolumn like '%2345' then 5
when mycolumn like '%345' then 6
when mycolumn like '%45' then 7
when mycolumn like '%5' then 8
end)
) as seqnum
from t
where mycolumn like '%5' -- ensure at least one match
) t
where seqnum = 1;
This then inspires something like this:
select t.*
from (select t.*, s.i, max(s.i) over () as maxi
from t join
(select str, generate_series(1, length(str)) as i
from (select 'abc12345' as str) s
) s
on right(t.mycolumn, s.i) = right(s.str, s.i)
) t
where i = maxi;
Interesting puzzle :)
The hardest problem here is finding what is the length of the target suffix matching your suffix pattern.
In MySQL you probably need to use either generating series or a UDF. Others proposed these already.
In PostgreSQL and other systems that provide regexp-based substring, you can use the following trick:
select v,
reverse(
substring(
reverse(v) || '#' || reverse('abcdefg')
from '^(.*).*#\1.*'
)) res
from table;
What it does is:
constructs a single string combining your string and suffix. Note, we reverse them.
we put # between the strings. That's important: you need a character that doesn't exist in either string.
we extract a match from a regular expression, using substring, such that
it starts at the beginning of the string ^
matches any number of characters (.*)
can have some remaining characters .*
now we find #
now, we want the same string we matched with (.*) to be present right after #. So we use \1
and there can be some tail characters .*
we reverse the extracted string
Once you have the longest suffix, finding maximum length, and then finding all strings having the suffix of that length is trivial.
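The overall logic (find the longest suffix shared with any row, then keep the rows that reach that length) can be sketched outside SQL. The following is a plain Python illustration, not any of the answers' implementations; the function names and sample rows are hypothetical.

```python
def common_suffix_len(a: str, b: str) -> int:
    """Length of the longest suffix shared by a and b."""
    n = 0
    while n < len(a) and n < len(b) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def rows_with_longest_suffix(rows, search):
    """Rows whose shared suffix with `search` is the longest one found."""
    lengths = [common_suffix_len(r, search) for r in rows]
    best = max(lengths, default=0)
    if best == 0:
        return []  # no row shares even the last character
    return [r for r, n in zip(rows, lengths) if n == best]

rows = ["xx345", "y45", "z5", "abc", "q345"]  # hypothetical column values
matches = rows_with_longest_suffix(rows, "abc12345")
print(matches)
```

This scans every row once, which is the behaviour the SQL versions try to express declaratively.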
If you cannot restructure the table I would approach the problem this way:
Write an aggregate UDF LONGEST_SUFFIX_MATCH(col, str) in C (see an example in sql/udf_example.c in the MySQL source, search for avgcost)
SELECT @longest_match := LONGEST_SUFFIX_MATCH(mycol, "abcd12345") FROM mytbl;
SELECT * FROM mytbl WHERE mycol LIKE CONCAT('%', SUBSTR("abcd12345", -@longest_match));
If you could restructure the table, I do not have a complete solution yet, but the first thing I would do is add a special column mycol_rev obtained by reversing the string (via the REVERSE() function) and create a key on it, then use that key for lookups. Will post a full solution when I have a moment.
Update:
If you can add a reversed column with a key on it:
Use a query of the form SELECT myrevcol FROM mytbl WHERE myrevcol LIKE CONCAT(SUBSTR(REVERSE('$search_string'), 1, $n), '%') LIMIT 1, performing a binary search on $n over the range from 1 to the length of $search_string to find the largest value of $n for which the query returns a row. The binary search is valid because any row that matches a suffix of length $n also matches every shorter suffix.
SELECT * FROM mytbl WHERE myrevcol LIKE CONCAT(SUBSTR(REVERSE('$search_string'), 1, $found_n), '%')
This solution should be very fast as long as you do not have too many rows coming back. We will have a total of O(log(L)) queries, where L is the length of the search string, each of them being a B-tree search that reads just one row, followed by one more B-tree search that reads only the needed rows from the index.
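The binary search over the reversed column can be sketched as follows, again using Python's sqlite3 as a stand-in for MySQL. Assumptions: the table and index are created just for the demo, the data is invented, and the search string contains no LIKE wildcards (% or _), which would otherwise need escaping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytbl (mycol TEXT, myrevcol TEXT)")
conn.execute("CREATE INDEX idx_myrevcol ON mytbl (myrevcol)")
conn.executemany(
    "INSERT INTO mytbl VALUES (?, ?)",
    [(s, s[::-1]) for s in ["xx345", "y45", "z5", "abc", "q345"]],
)

def longest_suffix_rows(conn, search):
    rev = search[::-1]

    def has_match(n):
        # Does any row end with the last n characters of `search`?
        row = conn.execute(
            "SELECT myrevcol FROM mytbl WHERE myrevcol LIKE ? || '%' LIMIT 1",
            (rev[:n],),
        ).fetchone()
        return row is not None

    # Largest n in [0, len(search)] with a match; the predicate is
    # monotone, so binary search applies.
    lo, hi = 0, len(search)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if has_match(mid):
            lo = mid
        else:
            hi = mid - 1
    if lo == 0:
        return []
    rows = conn.execute(
        "SELECT mycol FROM mytbl WHERE myrevcol LIKE ? || '%' ORDER BY mycol",
        (rev[:lo],),
    )
    return [r[0] for r in rows]

found = longest_suffix_rows(conn, "abc12345")
print(found)
```

Whether the prefix LIKE actually uses the index depends on the database and its collation settings, so that part is worth checking with EXPLAIN on the real server.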
I have a table column that contains strings separated by commas, like so:
Algebraic topology,Riemannian geometries
Classical differential geometry,Noncommutative geometry,Integral transforms
Dark Matter
Spectral methods,Dark Energy,Noncommutative geometry
Energy,Functional analytical methods
I am trying to find the MySQL rows that contain a given string between commas. For example, if I search for Noncommutative geometry, I want to select these two rows:
Classical differential geometry,Noncommutative geometry,Integral transforms
Spectral methods,Dark Energy,Noncommutative geometry
This is what I tried:
SELECT * FROM `mytable` WHERE `col` LIKE '%Noncommutative geometry%'
which works fine, but the problem is that if I search for Energy, I want to select only the row
Energy,Functional analytical methods
but my code gives the two rows
Energy,Functional analytical methods
Spectral methods,Dark Energy,Noncommutative geometry
which is not what I am looking for. Is there a way to fix this so that it only finds the rows that have the string between commas?
Give these a try, using the REGEXP operator:
SELECT * FROM `mytable`
WHERE `col` REGEXP '(^|.*,)Noncommutative geometry(,.*|$)'
SELECT * FROM `mytable`
WHERE `col` REGEXP '(^|.*,)Energy(,.*|$)'
The expression being used ('(^|.*,)$searchTerm(,.*|$)') requires the search term to be either preceded by a comma or the beginning of the string, and followed by either a comma or the end of the string.
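The same pattern can be tried out in Python's re module to see which rows it accepts. This is just an illustration on the sample rows from the question; has_item is a hypothetical helper, and re.escape is added so that a search term containing regex metacharacters is matched literally.

```python
import re

def has_item(col: str, term: str) -> bool:
    # Term must be preceded by start-of-string or a comma,
    # and followed by a comma or end-of-string.
    pattern = r"(^|.*,)" + re.escape(term) + r"(,.*|$)"
    return re.search(pattern, col) is not None

rows = [
    "Algebraic topology,Riemannian geometries",
    "Classical differential geometry,Noncommutative geometry,Integral transforms",
    "Dark Matter",
    "Spectral methods,Dark Energy,Noncommutative geometry",
    "Energy,Functional analytical methods",
]

energy_rows = [r for r in rows if has_item(r, "Energy")]
geometry_rows = [r for r in rows if has_item(r, "Noncommutative geometry")]
print(energy_rows)
print(geometry_rows)
```

"Dark Energy" is rejected because the character before "Energy" is a space rather than a comma.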
You can do it like this:
SELECT * FROM `mytable` WHERE `col` LIKE '%,$yourString,%'
or `col` LIKE '$yourString,%'
or `col` LIKE '%,$yourString'
or `col` = '$yourString'
I have a MySQL table with over 200 values. One of the columns on my table is 'date'. Out of all 200 values there are only 5 unique dates.
How can I list the unique values of the dates and echo them with PHP, i.e. get back just the 5 dates rather than 200 instances?
Use DISTINCT
SELECT DISTINCT `date` FROM `tablename`....
SELECT myDate FROM myTable GROUP BY myDate
Or...
SELECT DISTINCT myDate FROM myTable
DISTINCT is a nice shorthand, but if you ever want to reuse the query for other purposes, it often constrains you a bit too much. So I prefer the GROUP BY version.
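As a small illustration of that trade-off, the sketch below builds 200 rows with 5 distinct dates in an in-memory SQLite database (a stand-in for MySQL; the table contents are invented) and runs both forms. The GROUP BY form extends naturally to per-date aggregates such as COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (myDate TEXT)")

# 200 rows spread evenly over 5 hypothetical dates.
dates = [f"2011-02-0{d}" for d in range(1, 6)]
conn.executemany(
    "INSERT INTO myTable VALUES (?)",
    [(dates[i % 5],) for i in range(200)],
)

distinct_dates = [r[0] for r in conn.execute(
    "SELECT DISTINCT myDate FROM myTable ORDER BY myDate")]

# Same unique dates, plus how many rows fall on each one.
grouped = conn.execute(
    "SELECT myDate, COUNT(*) FROM myTable GROUP BY myDate ORDER BY myDate"
).fetchall()

print(distinct_dates)
print(grouped)
```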
I have a field with this kind of info: "web-1/1.,web-2/2.,web-3/3.,web-4/4.,web-5/5.". Other records could have different values, like "web-1/4.,web-2/5.,web-3/1.,web-4/2.,web-5/3."
I want to select all records that contain, say, web-2 and order them by the number after the slash: web-2/1, web-2/2, web-2/3, and so on.
I want to create a featured-properties script for different websites in which each property specifies its feature number: different properties, different websites, different order.
I would suggest that you look at the MySQL String Functions and more specifically the SUBSTRING_INDEX function. The reason I suggest this one over SUBSTRING is because the number before or after the slash might be more than a single number which would make the length of the first and/or second parts vary.
Example:
SELECT `info`,
SUBSTRING_INDEX(`info`, '/', 1) AS `first_part`,
SUBSTRING_INDEX(`info`, '/', -1) AS `second_part`
FROM `table`
ORDER BY `first_part` ASC,
`second_part` ASC;
Additional Example
In this example, I'm using CAST to convert the second part into an unsigned integer just in case it contains additional characters such as symbols or letters. In other words, the second part of "web-4/15." would be "15" and the second part of "web-4/15****" would also be "15".
SELECT `info`,
SUBSTRING_INDEX(`info`, '/', 1) AS `first_part`,
CAST(SUBSTRING_INDEX(`info`, '/', -1) AS UNSIGNED) `second_part`
FROM `table`
ORDER BY `first_part` ASC,
`second_part` ASC;
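When the field holds several comma-separated entries, as in the question, the same split-then-cast idea can be sketched in Python: pull out the number that follows a given site key and sort on it numerically. This is an illustration only; feature_rank is a hypothetical helper and the third record is made up to show that 10 sorts after 5.

```python
import re

def feature_rank(info: str, site: str):
    """Number after '<site>/' in a comma-separated field, or None."""
    m = re.search(r"(?:^|,)" + re.escape(site) + r"/(\d+)", info)
    return int(m.group(1)) if m else None

records = [
    "web-1/4.,web-2/5.,web-3/1.,web-4/2.,web-5/3.",
    "web-1/1.,web-2/2.,web-3/3.,web-4/4.,web-5/5.",
    "web-1/2.,web-2/10.,web-3/2.",  # hypothetical extra record
    "web-1/7.,web-3/9.",            # no web-2 entry, so it is skipped
]

ranked = sorted(
    (r for r in records if feature_rank(r, "web-2") is not None),
    key=lambda r: feature_rank(r, "web-2"),
)
order = [feature_rank(r, "web-2") for r in ranked]
print(order)
```

Sorting on the integer rather than the text is what keeps 10 from landing between 1 and 2.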
If the strings will always match the pattern you described, we can assume that the first value you want to sort on is at index position 5 in the string (5 characters from the left). The second value is at index 7. With that in mind, you can do the following, assuming that the string 'web-2/1' is in a field named field:
SELECT
`field`
FROM
`table`
ORDER BY
substr(`field`, 5, 1) ASC,
substr(`field`, 7, 1) ASC;
The substr() function takes the field as the first argument, the index we mentioned above as the second, and an optional third argument, which is the number of characters to include starting from that index.
You can tweak this as necessary if the string is slightly off, the main thing being the second argument to the substr() function.
Just add: ...ORDER BY SUBSTRING(username, 2) ASC
I would do this
SELECT info
FROM table
WHERE info LIKE 'web-2%'
ORDER BY info ASC
I have a question about constructing a MySQL query. I have a table with one column containing values, and another column containing timestamps. What I'd like to do is get the number of distinct (unique) values for a field from a specific epoch up until various points in time so that I can plot the number of unique values over time. For example, I'd like the query result to look like the following:
Date, COUNT( DISTINCT col1)
2011-02-01, 10
2011-02-02, 16
2011-02-03, 24
etc.
Note that these values are the number of distinct values starting from the same point in time. Currently, to accomplish this, I'm using a loop in PHP to run a separate query for each date, and it takes forever since I have a large DB. To give a better picture, the inefficient code I'd like to replace looks like the following:
for ($i = 0; $i < count($dates); $i++) {
$qry = "SELECT COUNT(DISTINCT `col1`) FROM `db`.`table` WHERE `Date` BETWEEN '".$EPOCH."' AND '".$dates[$i]."';";
}
Any help would be appreciated.
Thanks
If I understood your question correctly, you can use a GROUP BY clause:
SELECT StampCol, COUNT(DISTINCT DataCol) FROM MyTable GROUP BY StampCol
SELECT DATE_FORMAT(date_column, "%Y-%m-%d") AS date_column, COUNT(visitors) AS visitors FROM table GROUP BY DATE_FORMAT(date_column, "%Y-%m-%d") ORDER BY date_column DESC
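Grouping by day gives the count within each day. If what is wanted is the running number of distinct values since the epoch, one possible approach is to record the first day each value appears and accumulate those counts. The sketch below shows that logic in plain Python on invented rows; it assumes dates are ISO-formatted strings, so that they sort chronologically.

```python
def cumulative_distinct(rows):
    """rows: iterable of (day, value). Returns [(day, distinct values so far)]."""
    rows = list(rows)

    # Earliest day on which each value was seen.
    first_seen = {}
    for day, value in rows:
        if value not in first_seen or day < first_seen[value]:
            first_seen[value] = day

    # Number of values that appear for the first time on each day.
    new_per_day = {}
    for day in first_seen.values():
        new_per_day[day] = new_per_day.get(day, 0) + 1

    # Running total over all days in chronological order.
    result, running = [], 0
    for day in sorted({d for d, _ in rows}):
        running += new_per_day.get(day, 0)
        result.append((day, running))
    return result

rows = [  # hypothetical (Date, col1) pairs
    ("2011-02-01", "a"), ("2011-02-01", "b"),
    ("2011-02-02", "a"), ("2011-02-02", "c"),
    ("2011-02-03", "d"),
    ("2011-02-04", "a"),
]
series = cumulative_distinct(rows)
print(series)
```

This makes a single pass over the data instead of one query per date, which is the main cost in the PHP loop from the question.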