SELECT users from MySQL database by privileges bitmask? - php

I have users table and want to SELECT some rows by bitmask criteria. I'll try to explain my problem with small example.
Structure of table users
user_id int [primary key, auto_increment]
user_email varchar(200)
user_privileges int
Note: It has more fields but they are irrelevant for this question.
Filled table may look like this
+---------+--------------------+-----------------+
| user_id | user_email         | user_privileges |  << binary
+---------+--------------------+-----------------+
|       1 | john@example.com   |             165 |  10100101
|       2 | max@example.com    |              13 |  00001101
|       3 | trevor@example.com |              33 |  00100001
|       4 | paul@example.com   |               8 |  00001000
|       5 | rashid@example.com |               5 |  00000101
+---------+--------------------+-----------------+
Now I want to SELECT users by specific privileges bitmask (by user_privileges column).
For example:
bitmask=1 [00000001] would select user-ids 1, 2, 3 and 5
bitmask=9 [00001001] would select user-id 2 only
bitmask=5 [00000101] would select user-ids 1, 2 and 5
bitmask=130 [10000010] would select none
My question: can this be done in the query itself, or do I have to go through all users one by one and check the value in PHP code? Also, is it possible if the user_privileges field is text containing hexadecimal numbers instead of integers? I need a working MySQL query example.
Note: This is just a simple example with an 8-bit privilege set. In a real environment it may have larger sets (greater integers, more bytes). Creating a separate column for each privilege state works fine, but that's not a possible solution here. I'd rather work with hex values, but integers are fine too; something is better than nothing.
Thanks in advance.

SELECT *
FROM users
WHERE (user_privileges & <level>) = <level>
<level> being the access level you want to search on (e.g. 1, 5, 9, 130, etc.)

[...] want to SELECT some fields
Wrong. You want to select some Rows. Columns are usually called fields.
You are supposed to read the Documentation: Bit Functions are documented for mysql.
So you can try:
Select * from users WHERE (user_privileges & 1) >0
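As a quick check of the "all bits in the mask must be set" condition, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere; the & operator behaves the same way for these values), and the helper name users_with_mask is hypothetical.

```python
import sqlite3

# Sample data from the question: (user_id, user_privileges).
USERS = [(1, 165), (2, 13), (3, 33), (4, 8), (5, 5)]


def users_with_mask(conn, mask):
    """Return ids of users whose privileges contain every bit set in mask."""
    rows = conn.execute(
        "SELECT user_id FROM users "
        "WHERE (user_privileges & ?) = ? ORDER BY user_id",
        (mask, mask),
    )
    return [user_id for (user_id,) in rows]


# In-memory SQLite database standing in for the MySQL users table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_privileges INTEGER)"
)
conn.executemany("INSERT INTO users VALUES (?, ?)", USERS)

for mask in (1, 9, 5, 130):
    print(mask, users_with_mask(conn, mask))
```

Note that comparing with "> 0" instead of "= mask" matches users who have any one of the bits, which gives the same rows only for single-bit masks such as 1. For a hexadecimal text column, one option in MySQL would be to convert the value with CONV(user_privileges, 16, 10) before applying &; that variant is not tested here.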

Related

How to: MySQL order by user_id (RAND) with pagination

Looking for a solution to keep a random order of a user table in the database when clicking the next page button.
Actually I have a database with 1000 users and I want to display 10 users each page (in a memberlist), my query looks like this:
$sql = "SELECT * FROM users ORDER BY user_id LIMIT 0,10";
Now I would like to ORDER BY RAND() and it works, except of course when clicking the next page, then it is shuffled again and it happens sometimes that the same users will be there again.
So my question is about a solution to keep the random order I had on the first page, also on the next pages.
I thought about setting a $_SESSION variable when someone visits the member list for the first time, containing the numbers from 1 to 1000 in shuffled order, and then ordering the members by the position of their user_id in that variable.
I don't know whether this is possible, but I imagine a solution like:
$numbers = range(1, 1000);
shuffle($numbers); // shuffle() works in place and returns a bool, not the array
$sort = $_SESSION['random_user_sort'] = $numbers;
So I would have a MySQL query like this when clicking page two (next page):
$sql = "SELECT * FROM users ORDER BY $sort LIMIT 10,10";
Any solution to let it work this way or even better ideas?
The RAND() function does not really generate random numbers but what's called pseudo random numbers: numbers are calculated with a deterministic formula and they're just intended to look random. To calculate a new number, you take the previous one and apply the formula to it, and that's how we get different output with a deterministic function: by using different input.
The initial number we use is known as seed. If you have a look at the manual you'll see that RAND() has an optional argument:
RAND(), RAND(N)
Returns a random floating-point value v in the range 0 <= v < 1.0. If
a constant integer argument N is specified, it is used as the seed
value, which produces a repeatable sequence of column values
You've probably figured out by now where I want to go:
mysql> SELECT language_id, name FROM language ORDER BY RAND(33);
+-------------+----------+
| language_id | name |
+-------------+----------+
| 3 | Japanese |
| 1 | English |
| 4 | Mandarin |
| 6 | German |
| 5 | French |
| 2 | Italian |
+-------------+----------+
6 rows in set (0.00 sec)
mysql> SELECT language_id, name FROM language ORDER BY RAND(33);
+-------------+----------+
| language_id | name |
+-------------+----------+
| 3 | Japanese |
| 1 | English |
| 4 | Mandarin |
| 6 | German |
| 5 | French |
| 2 | Italian |
+-------------+----------+
6 rows in set (0.00 sec)
P.S. The manual is not explicit about the seed range (it just says integer), you might need some extra research (or just some quick testing).
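The same idea, a fixed seed giving a repeatable order, can be sketched outside the database. The following Python snippet is only an illustration: the function name page_of_ids and the choice to keep the seed per visitor (for example in the session) are assumptions, not part of the answer above. On the MySQL side the equivalent would be ORDER BY RAND(seed) combined with LIMIT.

```python
import random


def page_of_ids(user_ids, seed, page, per_page=10):
    """Return one page of user_ids in an order that is fixed by seed."""
    order = sorted(user_ids)            # start from a known order
    random.Random(seed).shuffle(order)  # same seed -> same permutation
    start = (page - 1) * per_page
    return order[start:start + per_page]


ids = range(1, 1001)
seed = 33  # would be generated once per visitor and then reused

first = page_of_ids(ids, seed, page=1)
second = page_of_ids(ids, seed, page=2)
print(first)
print(second)
```

Because every page is cut from the same permutation, no user appears on two pages, and reloading a page shows the same users as long as the seed is unchanged.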

WHERE vs HAVING in generated queries

I know that this title is overused, but it seems that my kind of question is not answered yet.
So, the problem is like this:
I have a table structure made of four tables (tables, rows, cols, values) that I use to recreate the behavior of the information_schema (in a way).
In php I am generating queries to retrieve the data, and the result would still look like a normal table:
SELECT
(SELECT value FROM `values` WHERE `col` = "3" and row = rows.id) as "col1",
(SELECT value FROM `values` WHERE `col` = "4" and row = rows.id) as "col2"
FROM rows WHERE `table` = (SELECT id FROM tables WHERE name = 'table1')
HAVING (col2 LIKE "%4%")
OR
SELECT * FROM
(SELECT
(SELECT value FROM `values` WHERE `col` = "3" and row = rows.id) as "col1",
(SELECT value FROM `values` WHERE `col` = "4" and row = rows.id) as "col2"
FROM rows WHERE `table` = (SELECT id FROM tables WHERE name = 'table1')) d
WHERE col2 LIKE "%4%"
Note that the part where I define the columns of the result is generated by a PHP script. Why I am doing this is less important; what matters is that I want to extend the algorithm that generates the queries for broader use.
Here is the core problem: I have to decide whether to generate a WHERE or a HAVING part for the query. I know when to use each, but my algorithm doesn't, and I would have to add a few extra checks for this. The two queries above are equivalent: I can always put any query in a subquery, give it an alias, and use WHERE on the new derived table. But I wonder whether I will have performance problems, or whether this will come back to bite me in an unexpected way.
I know how they both work, and that WHERE is supposed to be faster, which is why I came here to ask. Hopefully I made myself understood; please excuse my English and the long, winding turns of phrase.
EDIT 1
I already know the difference between the two and all that it implies. My only dilemma is this: using custom columns from other tables, variable in number and size, while trying to achieve the same result as with a normally created table, means that I must use HAVING to filter the derived table's columns. Alternatively I can wrap the query in a subquery and use WHERE normally, which will probably create a temporary table that is filtered afterwards. Will this affect performance for a large database? Unfortunately I cannot test this right now, as I cannot afford to fill the database with over 1 billion entries (something like this: 1 billion in the rows table, 5 billion in the values table, as every row has 5 columns, plus 5 rows in the cols table and 1 row in the tables table = 6,000,000,006 entries in total).
right now my database looks like this:
+----+--------+-----------+------+
| id | name | title | dets |
+----+--------+-----------+------+
| 1 | table1 | Table One | |
+----+--------+-----------+------+
+----+-------+------+
| id | table | name |
+----+-------+------+
| 3 | 1 | col1 |
| 4 | 1 | col2 |
+----+-------+------+
where `table` is a foreign key from table `tables`
+----+-------+-------+
| id | table | extra |
+----+-------+-------+
| 1 | 1 | |
| 2 | 1 | |
+----+-------+-------+
where `table` is a foreign key from table `tables`
+----+-----+-----+----------+
| id | row | col | value |
+----+-----+-----+----------+
| 1 | 1 | 3 | 13 |
| 2 | 1 | 4 | 14 |
| 6 | 2 | 4 | 24 |
| 9 | 2 | 3 | asdfghjk |
+----+-----+-----+----------+
where `row` is a foreign key from table `rows`
where `col` is a foreign key from table `cols`
EDIT 2
The conditions are there just for demonstration purposes!
EDIT 3
For only two rows there seems to be a difference between the two: the query using HAVING takes 0.0008 and the one using WHERE takes 0.0014-0.0019. I wonder if this will affect performance for large numbers of rows and columns.
EDIT 4
The result of the two queries is identical, and that is:
+----------+------+
| col1 | col2 |
+----------+------+
| 13 | 14 |
| asdfghjk | 24 |
+----------+------+
HAVING is specifically for GROUP BY, WHERE is to provide conditional parameters. See also WHERE vs HAVING
I believe the having clause would be faster in this case, as you're defining specific values, as opposed to reading through the values and looking for a match.
See: http://database-programmer.blogspot.com/2008/04/group-by-having-sum-avg-and-count.html
Basically, WHERE filters out rows before passing them to an aggregate function, but HAVING filters the aggregate function's results.
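To make the before/after-aggregation distinction concrete, here is a small sketch in Python using the standard-library sqlite3 module in place of MySQL. The orders table and its values are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 5), ("ann", 50), ("bob", 20), ("bob", 30), ("cy", 5)],
)

# WHERE drops individual rows first, so only amounts >= 10 are summed.
where_rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "WHERE amount >= 10 GROUP BY customer ORDER BY customer"
).fetchall()

# HAVING is applied to the grouped totals, after SUM has been computed.
having_rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer HAVING SUM(amount) >= 10 ORDER BY customer"
).fetchall()

print(where_rows)
print(having_rows)
```

With WHERE, ann's small order never reaches the sum; with HAVING, it is included in her total and only the group whose total is below 10 disappears.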
you could do it like that
WHERE col2 In (14,24)
Your condition WHERE col2 LIKE "%4%" is a bad idea: a value such as col2 = 34 would also be selected.

How do I track changes and store calculated content in Nermalization?

I'm trying to create a table like this:
lives_with_owner_no  from  until  under_the_name
1                    1998  2002   1
3                    2002  NULL   1
2                    1997  NULL   2
3                    1850  NULL   3
3                    1999  NULL   4
2                    2002  2002   4
3                    2002  NULL   5
It's the Nermalization example, which I guess is pretty popular.
Anyway, I think I am just supposed to set up a dependency within MySQL for the from pending a change to the lives_with table or the cat_name table, and then set up a dependency between the until and from column. I figure the owner might want to come and update the cat's info, though, and override the 'from' column, so I have to use PHP? Is there any special way I should do the time stamp on the override (for example, $date = date("Y-m-d H:i:s");)? How do I set up the dependency within MySQL?
I also have a column that can be generated by adding other columns together. I guess using the cat example, it would look like:
combined_family_age family_name
75 Alley
230 Koneko
132 Furrdenand
1,004 Whiskers
Should I add via PHP and then input the values with a query, or should I use MySQL to manage the addition? Should I use a special engine for this, like MemoryAll?
I disagree with the nermalization example on two counts.
There is no cat entity in the end. Instead, there is a relation (cat_name_no, cat_name), which in your example has the immediate consequence that you can't tell how many cats named Lara exist. This is an anomaly that can easily be avoided.
The table crams two relations, lives_with_owner and under_the_name into one table. That's not a good idea, especially if the data is temporal, as it creates all kinds of nasty anomalies. Instead, you should use a table for each.
I would design this database as follows:
create table owner (id integer not null primary key, name varchar(255));
create table cat (id integer not null primary key, current_name varchar(255));
create table cat_lives_with (
cat_id integer references cat(id),
owner_id integer references owner(id),
valid_from date,
valid_to date);
create table cat_has_name (
cat_id integer references cat(id),
name varchar(255),
valid_from date,
valid_to date);
So you would have data like:
id | name
1 | Andrea
2 | Sarah
3 | Louise
id | current_name
1 | Ada
2 | Shelley
cat_id | owner_id | valid_from | valid_to
1 | 1 | 1998-02-15 | 2002-08-11
1 | 3 | 2002-08-12 | 9999-12-31
2 | 2 | 2002-01-08 | 2002-10-23
2 | 3 | 2002-10-24 | 9999-12-31
cat_id | name | valid_from | valid_to
1 | Ada | 1998-02-15 | 9999-12-31
2 | Shelley | 2002-01-08 | 2002-10-23
2 | Callisto | 2002-10-24 | 9999-12-31
I would use a finer-grained date type than just year (in the Nermalization example, having 2002-2002 as a range can lead to really messy query syntax), so that you can run queries like select cat_id from cat_lives_with where '2000-06-02' between valid_from and valid_to.
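A point-in-time lookup against such a table might look like the following sketch. It uses Python's standard-library sqlite3 module rather than MySQL, stores dates as ISO text (which compares correctly as strings), and assumes the second cat's first interval ends on 2002-10-23; the helper name owner_on is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cat_lives_with (
    cat_id INTEGER, owner_id INTEGER, valid_from TEXT, valid_to TEXT);
INSERT INTO cat_lives_with VALUES
    (1, 1, '1998-02-15', '2002-08-11'),
    (1, 3, '2002-08-12', '9999-12-31'),
    (2, 2, '2002-01-08', '2002-10-23'),
    (2, 3, '2002-10-24', '9999-12-31');
""")


def owner_on(conn, cat_id, day):
    """Return the owner_id the cat lived with on ISO date day, or None."""
    row = conn.execute(
        "SELECT owner_id FROM cat_lives_with "
        "WHERE cat_id = ? AND ? BETWEEN valid_from AND valid_to",
        (cat_id, day),
    ).fetchone()
    return row[0] if row else None


print(owner_on(conn, 1, "2000-06-02"))
print(owner_on(conn, 2, "2001-01-01"))
```

Using 9999-12-31 as the open-ended valid_to keeps the BETWEEN test simple, since no NULL handling is needed for the current row.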
As for the question of how to deal with temporal data in the general case: there's an excellent book on the subject, "Developing Time-Oriented Database Applications in SQL" by Richard Snodgrass. The author distributes a free full-text PDF, so I believe it can be downloaded legally; Google will help you find it.
Your other question: you can compute combined_family_age either in an SQL query whenever you need it or, if that column is needed often, with a view. You shouldn't manage the content manually, though; let the database calculate it for you.
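A view along those lines could be sketched as follows, again with Python's sqlite3 module standing in for MySQL. The cat table layout, the view name family_age and the ages are hypothetical sample data, not taken from the Nermalization example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cat (id INTEGER PRIMARY KEY, family_name TEXT, age INTEGER);
INSERT INTO cat (family_name, age) VALUES
    ('Alley', 3), ('Alley', 7), ('Koneko', 12);
-- The view recomputes the sum on every read; nothing is stored.
CREATE VIEW family_age AS
    SELECT family_name, SUM(age) AS combined_family_age
    FROM cat GROUP BY family_name;
""")

QUERY = (
    "SELECT family_name, combined_family_age "
    "FROM family_age ORDER BY family_name"
)
before = conn.execute(QUERY).fetchall()

# Adding a cat changes the total without any manual update of the sum.
conn.execute("INSERT INTO cat (family_name, age) VALUES ('Koneko', 2)")
after = conn.execute(QUERY).fetchall()

print(before)
print(after)
```

Since the total is derived at query time, it cannot drift out of sync with the underlying rows, which is the main argument against storing it by hand.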

How do I select times from MySQL in order based on precedence?

I know that question doesn't make much sense, but here goes:
Times Table
Authority | Time
-------------------------------------
animuson#forums | 45.6758
132075829385895 | 49.7869
qykumsoy#forums | 44.45
439854390263565 | 50.761
user#forums | 44.9
another#auth | 46.123
bingo#nameo | 47.4392
So let me explain this. By default, if you have not linked your account to the authority you use, it just stores times as the authority, but if you link your account, it stores your ID number instead. I want the people with ID numbers to have precedence, so they'll appear over someone who is not linked, but still in order. So for this sample of data, when choosing the top 5, it would output these results:
Authority | Time
-------------------------------------
qykumsoy#forums | 44.45
user#forums | 44.9
animuson#forums | 45.6758
132075829385895 | 49.7869
439854390263565 | 50.761
-------------------------------------
Ignoring These:
another#auth | 46.123
bingo#nameo | 47.4392
Even though those two users had better times, they got knocked off because they're not linked, the linked accounts got pushed up, but the top 5 still remained in order of their times. It is safe to assume that an '#' symbol being present within the Authority means that it is an unlinked account. It will always appear in an unlinked authority value and a linked account will always be pure numbers. Any ideas on how to do this in one query?
The current query I use which simply selects the top 5 without thinking:
SELECT * FROM `tronner_times` WHERE `mid` = '{$map['mid']}' ORDER BY `time` + 0 LIMIT 5
This is the first solution that comes to mind. I'm not sure if it can be optimized further, but you may want to try the following:
SELECT dt.authority, dt.time
FROM (
SELECT authority, time
FROM tronner_times
ORDER BY INSTR(authority, '#') > 0, time
LIMIT 5
) dt
ORDER BY dt.time;
Test case:
CREATE TABLE tronner_times (authority varchar(90), time decimal(8, 4));
INSERT INTO tronner_times VALUES ('animuson#forums', 45.6758);
INSERT INTO tronner_times VALUES ('132075829385895', 49.7869);
INSERT INTO tronner_times VALUES ('qykumsoy#forums', 44.45);
INSERT INTO tronner_times VALUES ('439854390263565', 50.761);
INSERT INTO tronner_times VALUES ('user#forums', 44.9);
INSERT INTO tronner_times VALUES ('another#auth', 46.123);
INSERT INTO tronner_times VALUES ('bingo#nameo', 47.4392);
Result:
+-----------------+---------+
| authority       | time    |
+-----------------+---------+
| qykumsoy#forums | 44.4500 |
| user#forums     | 44.9000 |
| animuson#forums | 45.6758 |
| 132075829385895 | 49.7869 |
| 439854390263565 | 50.7610 |
+-----------------+---------+
5 rows in set (0.00 sec)
We are ordering twice, because the derived table returns the rows without the # sign at the very top. The expression INSTR(authority, '#') > 0 returns 1 if the # is present in the authority string, or 0 if it is not. Therefore the result set is first ordered by this expression, and then by the time field, giving rows without the # a priority (since 0 is sorted before 1). We therefore order the 5 rows from the derived table by the time field to produce the expected final result.
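The two-step ordering can also be traced in plain Python, which may help when checking the expected output by hand. This is only an illustrative sketch using the test data above; the function name top_times is hypothetical.

```python
TIMES = [
    ("animuson#forums", 45.6758),
    ("132075829385895", 49.7869),
    ("qykumsoy#forums", 44.45),
    ("439854390263565", 50.761),
    ("user#forums", 44.9),
    ("another#auth", 46.123),
    ("bingo#nameo", 47.4392),
]


def top_times(rows, limit=5):
    """Pick `limit` rows, linked accounts first, then list them by time."""
    # Step 1: same as ORDER BY INSTR(authority, '#') > 0, time LIMIT 5.
    # False (no '#', linked) sorts before True (unlinked).
    chosen = sorted(rows, key=lambda r: ("#" in r[0], r[1]))[:limit]
    # Step 2: same as the outer ORDER BY time.
    return sorted(chosen, key=lambda r: r[1])


for authority, time in top_times(TIMES):
    print(authority, time)
```

The first sort decides who makes the cut; the second only rearranges the five survivors for display.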
My idea is to filter on whether the authority is numeric, since you say it is certain that pure numbers mean a linked account. I also noticed that those with #forums are included, so that part should be easy with LIKE '%#forums'. The links below show examples of checking for numbers using SQL Server's PATINDEX, which MySQL does not have, so you will need to adapt them a bit; in MySQL a REGEXP test does the same job. The second link seems easier to me.
SELECT * FROM `tronner_times` WHERE authority REGEXP '^[0-9]+$' OR authority LIKE '%#forums' ORDER BY `time` + 0 LIMIT 5
http://www.tek-tips.com/faqs.cfm?fid=6423
http://www.sqlservercurry.com/2008/04/how-to-check-if-string-contains-numbers.html

MySQL design with dynamic number of fields

My experience with MySQL is very basic. The simple stuff is easy enough, but I ran into something that is going to require a little more knowledge. I have a need for a table that stores a small list of words. The number of words stored could be anywhere between 1 to 15. Later, I plan on searching through the table by these words. I have thought about a few different methods:
A.) I could create the database with 15 fields, and just fill the fields with null values whenever the data is smaller than 15. I don't really like this. It seems really inefficient.
B.) Another option is to use just a single field, and store the data as a comma separated list. Whenever I come back to search, I would just run a regular expression on the field. Again, this seems really inefficient.
I would hope there is a good alternative to those two options. Any advice would be very appreciated.
-Thanks
C) use a normal form; use multiple rows with appropriate keys. an example:
mysql> SELECT * FROM blah;
+----+-----+-----------+
| K | grp | name |
+----+-----+-----------+
| 1 | 1 | foo |
| 2 | 1 | bar |
| 3 | 2 | hydrogen |
| 4 | 4 | dasher |
| 5 | 2 | helium |
| 6 | 2 | lithium |
| 7 | 4 | dancer |
| 8 | 3 | winken |
| 9 | 4 | prancer |
| 10 | 2 | beryllium |
| 11 | 1 | baz |
| 12 | 3 | blinken |
| 13 | 4 | vixen |
| 14 | 1 | quux |
| 15 | 4 | comet |
| 16 | 2 | boron |
| 17 | 4 | cupid |
| 18 | 4 | donner |
| 19 | 4 | blitzen |
| 20 | 3 | nod |
| 21 | 4 | rudolph |
+----+-----+-----------+
21 rows in set (0.00 sec)
This is the table I posted in this other question about group_concat. You'll note that there is a unique key K for every row. There is another key grp which represents each category. The remaining field represents a category member, and there can be variable numbers of these per category.
What other data is associated with these words?
One typical way to handle this kind of problem is best described by example. Let's assume your table captures certain words found in certain documents. One typical way is to assign each document an identifier. Let's pretend, for the moment, that each document is a web URL, so you'd have a table something like this:
CREATE TABLE WebPage (
ID INTEGER NOT NULL,
URL VARCHAR(...) NOT NULL
)
Your Words table might look something like this:
CREATE TABLE Words (
Word VARCHAR(...) NOT NULL,
DocumentID INTEGER NOT NULL
)
Then, for each word, you create a new row in the table. To find all words in a particular document, select by the document's ID:
SELECT Words.Word FROM Words, WebPage
WHERE Words.DocumentID = WebPage.ID
AND WebPage.URL = 'http://whatever/web/page/'
To find all documents with a particular word, select by word:
SELECT WebPage.URL FROM WebPage, Words
WHERE Words.Word = 'hello' AND Words.DocumentID = WebPage.ID
Or some such.
Hurpe, is the scenario you are describing that you will have a database table with a column that can contain up to 15 keywords, and that later you will use these keywords to search the table, which will presumably have other columns as well?
Then isn't the answer to have a separate table for the keywords? You will also need to have a many-to-many relationship between the keywords and the main table.
So using cars as an example, the WORD table that will store the 15 or so keywords would have the following structure:
ID int
Word varchar(100)
The CAR table would have a structure something like:
ID int
Name varchar(100)
Then finally you need a CAR_WORD table to hold the many-to-many relationships:
ID int
CAR_ID int
WORD_ID int
And sample data to go with this for the WORD table:
ID Word
001 Family
002 Sportscar
003 Sedan
004 Hatchback
005 Station-wagon
006 Two-door
007 Four-door
008 Diesel
009 Petrol
together with sample data for the CAR table
ID Name
001 Audi TT
002 Audi A3
003 Audi A4
then the intersection CAR_WORD table sample data could be:
ID CAR_ID WORD_ID
001 001 002
002 001 006
003 001 009
which give the Audi TT the correct characteristics.
and finally the SQL to search would be something like:
SELECT c.name
FROM CAR c
INNER JOIN CAR_WORD x
ON c.id = x.car_id
INNER JOIN WORD w
ON x.word_id = w.id
WHERE w.word IN('Petrol', 'Two-door')
Phew! Didn't intend to set out to write quite so much, it looks complicated but it is where I always seem to end up however hard I try to simplify things.
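For anyone who wants to try the three-table layout, here is a minimal sketch using Python's standard-library sqlite3 module instead of MySQL. It joins CAR_WORD to CAR on car_id and to WORD on word_id. The Audi A3 / Diesel link and the helper name cars_with_any are additions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE car (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE word (id INTEGER PRIMARY KEY, word TEXT);
CREATE TABLE car_word (
    id INTEGER PRIMARY KEY, car_id INTEGER, word_id INTEGER);
INSERT INTO car VALUES (1, 'Audi TT'), (2, 'Audi A3'), (3, 'Audi A4');
INSERT INTO word VALUES
    (2, 'Sportscar'), (6, 'Two-door'), (8, 'Diesel'), (9, 'Petrol');
-- Rows 1-3 come from the sample data; row 4 is a hypothetical extra link.
INSERT INTO car_word VALUES (1, 1, 2), (2, 1, 6), (3, 1, 9), (4, 2, 8);
""")


def cars_with_any(conn, words):
    """Return names of cars tagged with at least one of the given words."""
    marks = ",".join("?" * len(words))  # one placeholder per word
    rows = conn.execute(
        "SELECT DISTINCT c.name FROM car c "
        "INNER JOIN car_word x ON c.id = x.car_id "
        "INNER JOIN word w ON x.word_id = w.id "
        f"WHERE w.word IN ({marks}) ORDER BY c.name",
        list(words),
    )
    return [name for (name,) in rows]


print(cars_with_any(conn, ["Petrol", "Two-door"]))
```

DISTINCT matters here because a car that matches two of the words would otherwise be returned twice.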
I would create a table with an ID and one field, then store your results as multiple records. This offers many benefits. For example, you can then programmatically enforce your 15-word limit instead of doing it in your design, so if you ever change your mind it should be rather easy. Your queries to search on the data will also be much faster to run; regular expressions take a lot of time to run (comparatively). Plus, using a varchar for the field will allow you to compress your table much better. And indexing on the table should be much easier (more efficient) with this design.
Do the extra work and store the 15 words as 15 rows in the table, i.e. normalize the data. It may require you to re-think your strategy a bit, but trust me when the client comes along and says "Can you change that 15 limit to 20...", you'll be glad you did.
Depending on exactly what you want to accomplish:
Use a full-text index on your string table
Three tables: one for the original string, one for unique words (after word-rooting?), and a join table. This would also let you do more complicated searches, like "return all strings containing at least three of the following five words" or "return all strings where 'fox' occurs after 'dog'".
CREATE TABLE string (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  string TEXT NOT NULL
);
CREATE TABLE word (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  word VARCHAR(14) NOT NULL,
  UNIQUE INDEX (word ASC)
);
CREATE TABLE word_string (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  string_id INT NOT NULL,
  word_id INT NOT NULL,
  word_order INT NOT NULL,
  FOREIGN KEY (string_id) REFERENCES string (id),
  FOREIGN KEY (word_id) REFERENCES word (id),
  INDEX (word_id ASC)
);
-- Sample data (AUTO_INCREMENT ids start at 1)
INSERT INTO string (string) VALUES
  ('This is a test string'),
  ('The quick red fox jumped over the lazy brown dog');
INSERT INTO word (word) VALUES
  ('this'),
  ('test'),
  ('string'),
  ('quick'),
  ('red'),
  ('fox'),
  ('jump'),
  ('over'),
  ('lazy'),
  ('brown'),
  ('dog');
INSERT INTO word_string ( string_id, word_id, word_order ) VALUES
  ( 1, 1, 0 ),
  ( 1, 2, 3 ),
  ( 1, 3, 4 ),
  ( 2, 4, 1 ),
  ( 2, 5, 2 ),
  ( 2, 6, 3 ),
  ( 2, 7, 4 ),
  ( 2, 8, 5 ),
  ( 2, 9, 7 ),
  ( 2, 10, 8 ),
  ( 2, 11, 9 );
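-- Sample query: find all strings containing both 'fox' and 'quick'.
-- word_string is joined once per word; a single join could never match
-- both words on the same row.
SELECT DISTINCT
  string.id, string.string
FROM
  string
  INNER JOIN word_string AS ws_fox ON ws_fox.string_id = string.id
  INNER JOIN word AS fox ON fox.word = 'fox' AND ws_fox.word_id = fox.id
  INNER JOIN word_string AS ws_quick ON ws_quick.string_id = string.id
  INNER JOIN word AS quick ON quick.word = 'quick' AND ws_quick.word_id = quick.id;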
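The "at least N of these words" style of search mentioned above can be sketched with GROUP BY and HAVING COUNT(DISTINCT ...). The example below uses Python's standard-library sqlite3 module in place of MySQL, stores only the rooted keywords for each string as a simplification, and the helper name strings_with_at_least is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE string (id INTEGER PRIMARY KEY, string TEXT NOT NULL);
CREATE TABLE word (id INTEGER PRIMARY KEY, word TEXT NOT NULL UNIQUE);
CREATE TABLE word_string (
    string_id INTEGER, word_id INTEGER, word_order INTEGER);
""")

# Keyword lists per string id, mirroring the sample data above.
SENTENCES = {
    1: ["this", "test", "string"],
    2: ["quick", "red", "fox", "jump", "over", "lazy", "brown", "dog"],
}
for string_id, words in SENTENCES.items():
    conn.execute(
        "INSERT INTO string VALUES (?, ?)", (string_id, " ".join(words))
    )
    for order, text in enumerate(words):
        conn.execute("INSERT OR IGNORE INTO word (word) VALUES (?)", (text,))
        (word_id,) = conn.execute(
            "SELECT id FROM word WHERE word = ?", (text,)
        ).fetchone()
        conn.execute(
            "INSERT INTO word_string VALUES (?, ?, ?)",
            (string_id, word_id, order),
        )


def strings_with_at_least(conn, words, minimum):
    """Return ids of strings containing at least `minimum` of `words`."""
    marks = ",".join("?" * len(words))  # one placeholder per word
    rows = conn.execute(
        "SELECT s.id FROM string AS s "
        "JOIN word_string AS ws ON ws.string_id = s.id "
        "JOIN word AS w ON w.id = ws.word_id "
        f"WHERE w.word IN ({marks}) "
        "GROUP BY s.id HAVING COUNT(DISTINCT w.word) >= ? ORDER BY s.id",
        [*words, minimum],
    )
    return [string_id for (string_id,) in rows]


print(strings_with_at_least(conn, ["fox", "quick"], 2))
```

Setting minimum to the number of words gives an "all of these words" search, so this form covers the fox-and-quick case without adding one join per word.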
You are correct that A is no good. B is also no good, as it fails to adhere to First Normal Form (each field must be atomic). There's nothing in your example that suggests you would gain by avoiding 1NF.
You want a table for your list of words with each word in its own row.
