Say I have three bit flags in a status, stored in MySQL as an integer:
Approved Has Result Finished
0|1 0|1 0|1
Now I want to find the rows with status: Finished = 1, Has Result = 1 and Approved = 0.
Then the data:
0 "000"
1 "001"
3 "011"
7 "111"
Should produce
false
false
true
false
Can I do something like this (in MySQL)?
status & "011" AND ~((bool) status & "100")
Can't quite figure out how to query "Approved = 0".
Or should I completely drop using bit flags, and split these into separate columns?
The reasoning for using bit flags is, in part, for mysql performance.
Use ints instead of binary text. Instead of 011, use 3.
To get approved rows:
SELECT
*
FROM
`foo`
WHERE
(`status` & 4)
or approved and finished rows:
SELECT
*
FROM
`foo`
WHERE
(`status` & 5) = 5
or finished but not approved:
SELECT
*
FROM
`foo`
WHERE
(`status` & 1)
AND
NOT (`status` & 4)
"Finished = 1, Has Result = 1 and Approved = 0" could be as simple as status = 3.
Something I liked to do when I began programming was using powers of 2 as flags given a lack of boolean or bit types:
const FINISHED = 1;
const HAS_RESULT = 2;
const APPROVED = 4;
Then you can check like this:
$status = 5; // 101
if ($status & FINISHED) {
/*...*/
}
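As a minimal sketch of how the constants above combine for the asker's condition (Finished = 1, Has Result = 1, Approved = 0), here is the same check in Python. The function name is_wanted is made up for illustration; in SQL the equivalent predicate would be (status & 3) = 3 AND (status & 4) = 0.

```python
FINISHED = 1
HAS_RESULT = 2
APPROVED = 4


def is_wanted(status: int) -> bool:
    """True when FINISHED and HAS_RESULT are both set and APPROVED is clear."""
    required = FINISHED | HAS_RESULT
    return (status & required) == required and (status & APPROVED) == 0


# The four sample rows from the question: 000, 001, 011, 111.
results = [is_wanted(status) for status in (0, 1, 3, 7)]
print(results)
```

Only status 3 (011) passes, which matches the expected output listed in the question.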
EDIT:
Let me expand on this:
Can't quite figure out how to query "Approved = 0".
Or should I completely drop using bit flags, and split these into separate columns?
The reasoning for using bit flags is, in part, for mysql performance.
The issue is, you are not using bitwise flags. You are using a string which "emulates" a bitwise flag and sort of makes it hard to actually do proper flag checking. You'd have to convert it to its bitwise representation and then do the checking.
Store the flag value as an integer and declare the flag identifiers that you will then use to do the checking. A TINYINT should give you 7 possible flags (minus the most significant bit used for sign) and an unsigned TINYINT 8 possible flags.
Related
It's been a while since I worked with MySQL, and I was surprised to see that the following statement is valid:
UPDATE table_A
SET col_a = value
WHERE id;
It looks like MySQL updates all rows in the table. I have a good MSSQL background and I am trying to understand why this is valid in MySQL.
Also, I ran a query with a similar where clause on a varchar column:
SELECT distinct description
FROM table_A
where NOT description;
I thought it would return only null or empty values. Instead, it returns lots of rows with non-null values.
Any ideas why?
Thanks,
WHERE x will match any rows for which x evaluates to a truthy value. Since BOOLEAN is actually a number type TINYINT(1), this works by converting to a number and then comparing to zero - any nonzero number is truthy. (NULL is falsy.)
If you write WHERE id = 123, then only for the row where id is 123, the expression id = 123 will evaluate to TRUE (which is the same as 1) and it will match, otherwise it will evaluate to FALSE (0).
But if you write WHERE id, the requirement is that id evaluates to a truthy value. If id is a number, only IDs 0 and NULL will be falsy.
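However, in the case of description, you have a string, and the string is first converted to a number. WHERE description would match any string that starts with a nonzero number, such as 01234hello (which converts to 1234, which is nonzero). Your query uses NOT description, so it matches the opposite set: every string that converts to 0, which includes any non-null text that does not start with a number. That is why you got many rows with non-null values. Check what CONVERT(description, SIGNED) gives: if it is zero, then the row matches NOT description.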
This is why, when building AND or OR queries in code, you can avoid handling the case of zero conditions specially by starting with TRUE or 1 (in the AND case) or FALSE or 0 (in the OR case), since WHERE TRUE/WHERE 1 is valid (matches everything), as is WHERE FALSE/WHERE 0 (matches nothing). So you build a query by starting with WHERE 1 and adding to it: WHERE 1, WHERE 1 AND id = 123, WHERE 1 AND id = 123 AND type = 'xy', etc.
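The WHERE 1 trick can be sketched in a few lines of Python. The helper build_where is hypothetical, and it assumes the condition strings are already safe SQL fragments (for example, ones that use bound parameters).

```python
def build_where(conditions, joiner="AND"):
    """Join SQL condition fragments, starting from the neutral element.

    "1" is neutral for AND (matches everything), "0" is neutral for OR
    (matches nothing), so an empty condition list needs no special case.
    """
    neutral = "1" if joiner == "AND" else "0"
    return "WHERE " + f" {joiner} ".join([neutral, *conditions])


print(build_where([]))
print(build_where(["id = 123", "type = 'xy'"]))
print(build_where(["a = 1", "b = 2"], joiner="OR"))
```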
I have 2 fields on db. Minor and Major:
Minor, Major
0,0
1,0
2,0
3,0
4,0
5,0
7,0
8,0
...
65536,0
0,1
1,1
2,1
3,1
4,1
...
65536,1
0,2
What is the best way to compute the next pair? I am doing this in Bookshelf.js, but PHP or Ruby is also welcome. I need to check the current state, take the greatest major, and set minor to minor + 1 if it is not 65536; otherwise minor becomes 0 and major becomes major + 1.
Thanks in advance.
EDIT:
I have to save major and minor to respective fields. They increment for every user registered.
eg.
Users
id, username,minor,major
1, john , 0, 0
2, mike, 1, 0
....
65537, jeff, 65536,0
Now Tom's major increments because the last minor in the table is 65536.
65538, tom, 0 , 1
I don't know how to explain more.
I'm not at all sure I understand the problem, but here are some ideas about limiting the range of an integer value:
Like many languages, MySQL has a SMALLINT UNSIGNED data type that holds 2-byte values, that is, from 0 to 65535 (not 65536!).
Most programming languages have a "modulus" operator (% in both PHP and MySQL) that gives you the remainder of an integer division. For example, ... % 65536 will return a value between 0 and 65535 inclusive. If you really need a value between 0 and 65536 inclusive, you will write ... % 65537 instead.
You could use a mask operator ("bitwise and", & in both PHP and MySQL). For example, ... & 0xFFFF will only keep the two least significant bytes of a number, which is equivalent to a "modulo 65536" operation (with a result between 0 and 65535 inclusive).
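A quick way to convince yourself that the mask and the modulus agree is to compare them on a few values. This is only an illustrative Python sketch; the sample values are arbitrary non-negative integers.

```python
LIMIT = 65536  # 2 ** 16, so the matching mask is 0xFFFF

samples = [0, 1, 65535, 65536, 65537, 145324]

# For non-negative integers, "% 65536" and "& 0xFFFF" give the same result.
for value in samples:
    assert value % LIMIT == value & 0xFFFF

wrapped = [value % LIMIT for value in samples]
print(wrapped)
```

The equivalence only holds because 65536 is a power of two; there is no mask equivalent for % 65537.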
$magicNumber = 65536;
$sql = "
SELECT
MAX(userIndex) userIndex
FROM (
SELECT
(Minor + (Major * ".$magicNumber.")) AS userIndex
FROM TableName
) AS innerSelect
";
Running the SQL gives you the current highest userIndex; let's say it is 145323.
Now increment this by one, and you have $newIndex = 145324.
This is the index for the new row. Now the fields can be calculated like this:
$major = (int)($newIndex / $magicNumber);
$minor = $newIndex % $magicNumber;
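The same arithmetic, sketched in Python with divmod. The function name next_pair and the minor_limit parameter are assumptions for illustration. The default of 65536 follows the snippet above (minor runs 0 to 65535); if 65536 itself must be a valid minor, as the question's table suggests, the limit would be 65537 instead.

```python
def next_pair(major: int, minor: int, minor_limit: int = 65536):
    """Return the (major, minor) pair that follows the given one.

    minor_limit is the number of distinct minor values, so minor
    ranges from 0 to minor_limit - 1.
    """
    index = major * minor_limit + minor + 1
    return divmod(index, minor_limit)


print(next_pair(0, 0))
print(next_pair(0, 65535))
print(next_pair(0, 65536, minor_limit=65537))
```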
I am using php mysql_query() to select rows from my SQL Database but for some reason it is selecting rows that do not match the query. For example:
mysql_query("SELECT * FROM Table WHERE ID='153'")
This will return the row that has an ID of 153
but so will:
mysql_query("SELECT * FROM Table WHERE ID='153c'")
Am I doing something wrong?
As noted, the issue is comparing a string value to an integer value. Your expressions are:
WHERE ID = '153'
and
WHERE ID = '153c'
You can imagine the MySQL engine describing what it does as: "id is an integer column. So, I need to compare it to an integer value. Oh, the right side is a string, so I will convert the right hand side to an integer."
The way that MySQL converts values to a number from a string could be called "silent conversion". It converts the longest leading number that it finds, and then stops. If there is no leading number (say 'a123'), then the value is 0. There is no error produced.
If you really want to confuse yourself, consider the following:
select (case when 123 = '123e' then 1 else 0 end),
(case when 123 = '123e3' then 1 else 0 end),
(case when 123 = '123a' then 1 else 0 end),
(case when 123 = '123a3' then 1 else 0 end)
This returns: true, false, true, and true. Why is the second one false, but the others true? Well, '123e3' is interpreted as scientific notation, so the value becomes 123,000. For all the others, the conversion stops at the first alphabetic character.
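To make the rule concrete, here is a rough Python approximation of the "longest leading number" conversion. It is not MySQL's actual parser; the regular expression and the name mysql_like_number are assumptions made for this sketch.

```python
import re

# Optional whitespace and sign, digits with an optional fraction, and an
# optional exponent that only counts when digits follow the "e".
_LEADING_NUMBER = re.compile(r"\s*[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def mysql_like_number(text: str) -> float:
    """Convert the longest leading numeric prefix; 0.0 if there is none."""
    match = _LEADING_NUMBER.match(text)
    return float(match.group(0)) if match else 0.0


for literal in ("123e", "123e3", "123a", "123a3", "a123", "153c"):
    print(literal, mysql_like_number(literal))
```

Under this approximation '123e3' becomes 123000 while the other '123…' literals become 123, which lines up with the CASE results above.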
As mentioned in the other answers, the obvious fix is to drop the single quotes on the constant.
Why are you passing the ID value as a String? This would most likely be your issue. Try passing the following query instead:
mysql_query("SELECT * FROM Table WHERE ID= 153")
This way, the ID value being searched for is an int (or any other primitive the ID column is set to). This should restrict the IDs properly. In addition, if you are creating IDs with numbers, it's best practice to use distinct ints only and not strings.
OK, let's suppose we have a members table with a field called, say, about_member. Every member has a string like 1-1-2-1-2 in it. Suppose member_1 has the string 1-1-2-2-1 and wants to find members whose strings are identical or as similar as possible. For example, if member_2 has the string 1-1-2-2-1, it is a 100% match, but if member_3 has 2-1-1-2-1, it is a 60% match. The results have to be ordered by match percentage. What is the best way to do this with MySQL and PHP? It's hard to explain what I mean, so ask me if anything is unclear. Thanks.
Edit: Please give me ideas that do not use the Levenshtein method. That answer will get the bounty. Thanks. (The bounty will be announced as soon as I am able to do that.)
convert your number sequences to bit masks and use BIT_COUNT(column ^ search) as similarity function, ranged from 0 (= 100% match, strings are equal) to [bit length] (=0%, strings are completely different). To convert this similarity function to the percent value use
100 * (bit_length - similarity) / bit_length
For example, "1-1-2-2-1" becomes "00110" (assuming you have only two states), 2-1-1-2-1 is "10010", bit_count(00110 ^ 10010) = 2, bit-length = 5, and 100 * (5 - 2) / 5 = 60%.
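The worked example can be checked with a short Python sketch. The helpers to_mask and similarity_percent are hypothetical names; the sketch assumes only two states, with "1" mapped to bit 0 and "2" mapped to bit 1, and it counts bits with bin(...).count("1") in place of MySQL's BIT_COUNT.

```python
def to_mask(answers: str) -> int:
    """Turn a string like "1-1-2-2-1" into an integer bit mask."""
    bits = "".join("0" if part == "1" else "1" for part in answers.split("-"))
    return int(bits, 2)


def similarity_percent(a: str, b: str) -> float:
    """Percentage of positions where the two answer strings agree."""
    bit_length = len(a.split("-"))
    differing = bin(to_mask(a) ^ to_mask(b)).count("1")
    return 100 * (bit_length - differing) / bit_length


print(to_mask("1-1-2-2-1"), to_mask("2-1-1-2-1"))
print(similarity_percent("1-1-2-2-1", "2-1-1-2-1"))
```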
Jawa posted this idea originally; here is my attempt.
^ is the XOR function. It compares 2 binary numbers bit-by-bit and returns 0 if both bits are the same, and 1 otherwise.
0 1 0 0 0 1 0 1 0 1 1 1 (number 1)
^ 0 1 1 1 0 1 0 1 1 0 1 1 (number 2)
= 0 0 1 1 0 0 0 0 1 1 0 0 (result)
How this applies to your problem:
// In binary...
1111 ^ 0111 = 1000 // (1 bit out of 4 didn't match: 75% match)
1111 ^ 0000 = 1111 // (4 bits out of 4 didn't match: 0% match)
// The same examples, except now in decimal...
15 ^ 7 = 8 (1000 in binary) // (1 bit out of 4 didn't match: 75% match)
15 ^ 0 = 15 (1111 in binary) // (4 bits out of 4 didn't match: 0% match)
How we can count these bits in MySQL:
BIT_COUNT(b'0111') = 3 // Bit count of binary '0111'
BIT_COUNT(7) = 3 // Bit count of decimal 7 (= 0111 in binary)
BIT_COUNT(b'1111' ^ b'0111') = 1 // (1 bit out of 4 didn't match: 75% match)
So to get the similarity...
// First we focus on calculating mismatch.
(BIT_COUNT(b'1111' ^ b'0111') / YOUR_TOTAL_BITS) = 0.25 (25% mismatch)
(BIT_COUNT(b'1111' ^ b'1111') / YOUR_TOTAL_BITS) = 0 (0% mismatch; 100% match)
// Now, getting the proportion of matched bits is easy
1 - (BIT_COUNT(b'1111' ^ b'0111') / YOUR_TOTAL_BITS) = 0.75 (75% match)
1 - (BIT_COUNT(b'1111' ^ b'1111') / YOUR_TOTAL_BITS) = 1.00 (100% match)
If we could just make your about_member field store data as bits (and be represented by an integer), we could do all of this easily! Instead of 1-2-1-1-1, use 0-1-0-0-0, but without the dashes.
Here's how PHP can help us:
bindec('01000') == 8;
bindec('00001') == 1;
decbin(8) == '1000'; // decbin() drops leading zeros
str_pad(decbin(1), 5, '0', STR_PAD_LEFT) == '00001'; // pad back to a fixed width
And finally, here's the implementation:
// Setting a member's about_member property...
$about_member = '01100101';
$about_member_int = bindec($about_member);
// $name must be escaped (or bound as a parameter) before it is placed in the query
$query = "INSERT INTO members (name,about_member) VALUES ('$name',$about_member_int)";
// Getting matches...
$total_bits = 8; // The maximum length the about_member field can be (8 in this example)
$my_about_member = '00101100';
$my_about_member_int = bindec($my_about_member);
// "match" is a reserved word in MySQL, so the alias is named match_ratio
$query = "
SELECT
*,
(1 - (BIT_COUNT(about_member ^ $my_about_member_int) / $total_bits)) AS match_ratio
FROM members
ORDER BY match_ratio DESC
LIMIT 10";
This last query will have selected the 10 members most similar to me!
Now, to recap, in layman's terms,
We use binary because it makes things easier; the binary number is like a long line of light switches. We want to save our "light switch configuration" as well as find members that have the most similar configurations.
The ^ operator, given 2 light switch configurations, does a comparison for us. The result is again a series of switches; a switch will be ON if the 2 original switches were in different positions, and OFF if they were in the same position.
BIT_COUNT tells us how many switches are ON--giving us a count of how many switches were different. YOUR_TOTAL_BITS is the total number of switches.
But binary numbers are still just numbers... and so a string of 1's and 0's really just represents a number like 133 or 94. But it's a lot harder to visualize our "light switch configuration" if we use decimal numbers. That's where PHP's decbin and bindec come in.
Learn more about the binary numeral system.
Hope this helps!
The obvious solution is to look at the Levenshtein distance (there isn't an implementation built into MySQL, but other implementations are accessible, e.g. this one in PL/SQL and some extensions). However, as usual, the right way to solve the problem would be to have normalised the data properly in the first place.
One way to do this is to calculate the Levenshtein distance between your search string and the about_member fields for each member. Here's an implementation of the function as a MySQL stored function.
With that you can do:
SELECT name, LEVENSHTEIN(about_member, '1-1-2-1-2') AS diff
FROM members
ORDER BY diff ASC
The % of similarity is related to diff; if diff=0 then it's 100%, if diff is the size of the string (minus the amount of dashes), it's 0%.
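For readers who have not met it before, here is a minimal Python sketch of the standard dynamic-programming Levenshtein distance. It is not the stored function referred to above, only an illustration of what such a function computes.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("1-1-2-2-1", "2-1-1-2-1"))
```

For the two strings from the question the distance is 2, which corresponds to the two differing answers out of five.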
Having read the clarification comments on the original question, the Levenshtein distance is not the answer you are looking for.
You are not trying to compute the smallest number of edits to change one string into another.
You are trying to compare one set of numbers with another set of numbers. What you are looking for is the minimum (weighted) sum of the differences between the two sets of numbers.
Place each answer in a separate column (Ans1, Ans2, Ans3, Ans4, .... )
Assume you are searching for similarities to 1-2-1-2.
SELECT UserName, Abs( Ans1 - 1 ) + Abs( Ans2 - 2 ) + Abs( Ans3 - 1 ) + Abs( Ans4 - 2) as Difference FROM members ORDER BY Difference ASC
Will list users by similarity to answers 1-2-1-2, assuming all questions are weighted evenly.
If you want to make certain answers more important, just multiply each of the terms by a weighting factor.
If the questions will always be yes/no and the number of answers is small enough that all the answers can be fitted into a single integer and all answers are equally weighted, then you could encode all the answers in a single column and use BIT_COUNT as suggested. This would be a faster and more space-efficient implementation.
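The weighted sum of differences can be sketched in Python as follows. The function name, the sample users and the weights are all made up for illustration; they are not data from the question.

```python
def weighted_difference(answers, target, weights=None):
    """Sum of |answer - target| per question, each scaled by its weight."""
    if weights is None:
        weights = [1] * len(target)
    return sum(w * abs(a - t) for a, t, w in zip(answers, target, weights))


target = [1, 2, 1, 2]
users = {              # hypothetical sample data
    "alice": [1, 2, 1, 2],
    "bob": [2, 2, 1, 1],
    "carol": [2, 1, 2, 1],
}

# Smaller difference means more similar, so sort ascending.
ranking = sorted(users, key=lambda name: weighted_difference(users[name], target))
print(ranking)
```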
I would go with the similar_text() PHP built-in. It seems to be exactly what you want:
$percent = 0;
similar_text($string1, $string2, $percent);
echo $percent;
It works as the question expects.
I would go with the Levenshtein distance approach, you can use it within MySQL or PHP.
If you don't have too many fields, you could create an index on the integer representation of about_member. Then you can find the 100% by an exact match on the about_member field, followed by the 80% matches by changing 1 bit, the 60% matches by changing 2 bits, and so on.
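One way to enumerate the candidate values for such exact-match lookups is to flip every combination of k bits. The Python sketch below uses a hypothetical helper, masks_at_distance, and assumes five answers stored in the five lowest bits.

```python
from itertools import combinations


def masks_at_distance(mask: int, bit_length: int, distance: int):
    """All integers that differ from mask in exactly `distance` bit positions."""
    result = []
    for positions in combinations(range(bit_length), distance):
        flipped = mask
        for position in positions:
            flipped ^= 1 << position
        result.append(flipped)
    return result


mine = 6  # 00110, i.e. "1-1-2-2-1" with 1 -> 0 and 2 -> 1
print(masks_at_distance(mine, 5, 0))  # candidates for a 100% match
print(masks_at_distance(mine, 5, 1))  # candidates for an 80% match
```

Each list could then be used in an indexed about_member IN (...) lookup. The number of candidates grows quickly with the bit length, so this only pays off for short answer strings.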
If you represent your answer patterns as bit sequences you can use the formula (100 * (bit_length - similarity) / bit_length).
Following the mentioned example, when we convert "1"s to bit off and "2"s to bit on "1-1-2-2-1" becomes 6 (as base-10, 00110 in binary) and "2-1-1-2-1" becomes 18 (10010b) etc.
Also, I think you should store the answers' bits to the least significant bits, but it doesn't matter as long as you are consistent that the answers of different members align.
Here's a sample script to be run against MySQL.
DROP TABLE IF EXISTS `members`;
CREATE TABLE `members` (
`id` VARCHAR(16) NOT NULL ,
`about_member` INT NOT NULL
) ENGINE = InnoDB;
INSERT INTO `members`
(`id`, `about_member`)
VALUES
('member_1', '6'),
('member_2', '18');
SELECT 100 * ( 5 - BIT_COUNT( about_member ^ (
SELECT about_member
FROM members
WHERE id = 'member_1' ) ) ) / 5
FROM members;
The magical 5 in the script is the number of answers (bit_length in the formula above). You should change it according to your situation, regardless of how many bits there are in the actual data type used, as BIT_COUNT doesn't know how many bytes you are using.
BIT_COUNT returns the number of bits set and is explained in MySQL manual. ^ is the binary XOR operator in MySQL.
Here member_1's answers are compared with everybody's, including their own, which naturally results in a 100% match.
I have the following piece of code, executing a pretty simple MySQL query:
$netnestquery = 'SELECT (`nested`+1) AS `nest` FROM `ipspace6` WHERE `id`<='.$adaddr.' AND `subnet`<='.$postmask.' AND `type`="net" AND `addr` NOT IN(SELECT `id` FROM `ipspace6` WHERE `addr`<'.$adaddr.' AND `type`="broadcast") ORDER BY `id`,`subnet` DESC LIMIT 1';
$netnestresults = mysql_query($netnestquery);
$netnestrow = mysql_fetch_array($netnestresults);
$nestlvl = $netnestrow['nest'];
echo '<br> NESTQ: '.$netnestquery;
Now, when I execute this in PHP, I get no results; an empty query. However, when I copy and paste the query echoed by my code (for debug purposes) into the mysql command line, I get a valid result:
mysql> SELECT (`nested` + 1) AS `nest` FROM `ipspace6` WHERE `id`<=50552019054038629283648959286463168512 AND `subnet`<=36 AND `type`='net' AND `addr` NOT IN (SELECT `id` FROM `ipspace6` WHERE `addr`<50552019054038629283648959286463168512 AND `type`='broadcast') ORDER BY `id`,`subnet` DESC LIMIT 1;
+------+
| nest |
+------+
| 1 |
+------+
1 row in set (0.00 sec)
Can anybody tell me what I'm doing wrong? I can't put quotes around my variables, as then MySQL will try to evaluate the variable as a string, when it is, in fact, a very large decimal. I think I might just be making a stupid mistake somewhere, but I can't tell where.
Can you modify the line to say $netnestresults = mysql_query($netnestquery) or die(mysql_error());
It may be giving you an unknown error, such as a bad connection, missing DB, etc.
Do an echo $netnestquery
before calling mysql_query.
Also add a die(mysql_error()) there.
WHERE `id`<=50552019054038629283648959286463168512
That's a pretty big number there.
PHP has issues with big numbers. The maximum size of an integer depends on how PHP was compiled, and if it's on a 64-bit system.
Have you checked that the variable containing that number hasn't been capped to a 32-bit or 64-bit integer? If it has been capped, you're going to need to take steps to make sure it's only being stored as a string in PHP. MySQL accepts strings that are entirely numeric as numbers without complaining.
(That being said, MySQL's largest integer column is BIGINT, which is 64 bits, so a value this large will not fit in an integer column. There's also DECIMAL (NUMERIC), which is an exact fixed-point type supporting up to 65 digits, so it can hold a number like this without floating-point rounding.)
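The "keep it as a string" advice can be sketched as follows, in Python rather than PHP. The helper id_filter is hypothetical; it only illustrates validating that the value consists of decimal digits before it is placed in the query text, so it never passes through a native integer or float.

```python
def id_filter(raw_id: str) -> str:
    """Build the `id` comparison from a digits-only string, untouched."""
    if not (raw_id.isascii() and raw_id.isdigit()):
        raise ValueError("id must contain decimal digits only")
    return "`id` <= " + raw_id


big = "50552019054038629283648959286463168512"

# The value needs far more than 64 bits, so a 64-bit integer cannot hold it.
exceeds_64_bits = int(big) > 2**64 - 1
fragment = id_filter(big)
print(exceeds_64_bits, fragment)
```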