Search HTML using LIKE Operator in Database MySQL - php

I have stored some html in my database, for example:
ID | Data
1 | <a href=\"link\" class=\"someclass\" id=\"id_10923074\"><h3 class=\"class1 class2\"><\/h3><br \/><div class=\"clearfix\"><\/div><\/a>
2 | <a href=\"lin2\" class=\"someclass\" id=\"id_10923075\"><h3 class=\"class1 class2\">some text<\/h3><br \/><div class=\"clearfix\"><\/div><\/a>
Now, I would like to query for the invalid records, i.e. those that don't contain any text in the h3, which here is row 1.
I have tried many queries; some are below:
SELECT `mytable`.* FROM `mytable` WHERE (Data LIKE '%<h3 class=\"class1 class2\"><\/h3>%')
SELECT `mytable`.* FROM `mytable` WHERE (Data LIKE '%h3 class="class1 class2"></h3%')
SELECT `mytable`.* FROM `mytable` WHERE (Data LIKE '"%class1 class2%"')
SELECT `mytable`.* FROM `mytable` WHERE (Data LIKE '%<h3 class=\"class1 class2\">%')
What am I missing here? I have been checking many questions here but cannot find any solution.
Thank you.

Your code is working as expected. See this sqlfiddle:
SELECT * FROM `mytable` WHERE (Data LIKE '%<h3 class=\"class1 class2\"><\/h3>%')
Your issue most likely lies with your quoting. Ensure you are escaping backslashes (\) and quotes (") properly in your PHP code.

Found the solution here:
So the trick is to double escape ONLY the backslash, for string escapes only a single escape is needed.
For example
The single quote ' only needs escaping once LIKE '%\'%'
But to query backslash \ you need to double escape to LIKE '%\\\\%'
If you wanted to query backslash+singlequote \' then LIKE '%\\\\\'%' (with 5 backslashes)
Explanation Source excerpt:
Because MySQL uses C escape syntax in strings (for example, “\n” to
represent a newline character), you must double any “\” that you use
in LIKE strings. For example, to search for “\n”, specify it as “\\n”.
To search for “\”, specify it as “\\\\”; this is because the backslashes
are stripped once by the parser and again when the pattern match is
made, leaving a single backslash to be matched against.
Correct Query:
SELECT * FROM `mytable` WHERE (Data LIKE '%<h3 class=\\\\"class1 class2\\\\"><\\\\/h3>%')
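The two rounds of backslash stripping described above can be traced with a small model. This is only a sketch in Python, not MySQL itself: the helper names (parse_string_literal, like_to_regex, mysql_like) are made up for illustration, quote doubling and collation rules are ignored, and it assumes the stored HTML really contains literal backslashes as shown in the question.

```python
import re

# Stage 1 (assumed model): what the SQL parser does to the text between the quotes.
_SPECIALS = {"0": "\0", "n": "\n", "r": "\r", "t": "\t", "b": "\b", "Z": "\x1a"}

def parse_string_literal(body: str) -> str:
    out, i = [], 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt in "%_":
                out.append("\\" + nxt)            # \% and \_ are kept for LIKE
            else:
                out.append(_SPECIALS.get(nxt, nxt))  # e.g. \\ -> \ and \" -> "
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)

# Stage 2 (assumed model): how LIKE reads the pattern it receives from stage 1.
def like_to_regex(pattern: str) -> re.Pattern:
    parts, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            parts.append(re.escape(pattern[i + 1]))  # escaped char matches itself
            i += 2
        elif ch == "%":
            parts.append(".*")                       # any run of characters
            i += 1
        elif ch == "_":
            parts.append(".")                        # exactly one character
            i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts), re.DOTALL)

def mysql_like(value: str, literal_body: str) -> bool:
    return like_to_regex(parse_string_literal(literal_body)).fullmatch(value) is not None

# Row 1 from the question, shortened, with its literal backslashes.
stored = r'<a href=\"link\"><h3 class=\"class1 class2\"><\/h3><\/a>'
print(mysql_like(stored, r'%<h3 class=\"class1 class2\"><\/h3>%'))
print(mysql_like(stored, r'%<h3 class=\\\\"class1 class2\\\\"><\\\\/h3>%'))
```

In this model the first pattern loses all its backslashes in stage 1 and so cannot match the stored text, while the four-backslash version survives both stages with exactly one backslash left to match.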

To find rows where Data has at least one <h3> (with or without any attributes such as class=) with no text before </h3>:
WHERE Data REGEXP '<h3[^>]*></h3>'

Related

SQL Searching for String in DB with Special Character/s

I'm trying to do a search in the database with special characters, specifically string with apostrophe.
For example, I want to search for the string: "Sandy's dog", but I just entered "sandys dog" leaving out the apostrophe. Even though "Sandy's dog" exists in the database, it doesn't seem to show it in the results.
Here's my query:
SELECT * FROM `Table` WHERE `Title` LIKE '%sandys dog%'
I have searched everywhere and I can't seem to find a solution that works.
EDIT
Limitations: the string is user generated
Notes:
- If a user searches for sandy's dog with the apostrophe, it works fine as expected.
- Ultimately I would like to get all possible results, if the table contains both strings with and without apostrophe.
In SQL Server, you can use REPLACE:
SELECT *
FROM Table
WHERE REPLACE(Title, '''', '') LIKE '%sandys dog%'
The doubled apostrophe inside the string literal is an escaped apostrophe, so '''' is a string containing a single '. REPLACE therefore finds any apostrophes in Title and replaces them with empty strings before the LIKE comparison is made.
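As a rough illustration of the REPLACE approach, here is a sketch that runs the same kind of query against an in-memory SQLite database from Python rather than SQL Server or MySQL. The table name items and the helper search are hypothetical. The sketch also strips apostrophes from the search term, so that a search with or without the apostrophe returns both spellings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (Title TEXT)")  # hypothetical table for the demo
conn.executemany(
    "INSERT INTO items (Title) VALUES (?)",
    [("Sandy's dog",), ("Sandys dog",), ("Randy's cat",)],
)

def search(term):
    # Remove apostrophes from the user's term, then compare it against
    # the column with its apostrophes removed as well.
    needle = "%" + term.replace("'", "") + "%"
    rows = conn.execute(
        "SELECT Title FROM items "
        "WHERE REPLACE(Title, '''', '') LIKE ? ORDER BY Title",
        (needle,),
    ).fetchall()
    return [row[0] for row in rows]

print(search("sandys dog"))
print(search("sandy's dog"))
```

Note that any % or _ typed by the user would still act as a LIKE wildcard in this sketch, and that SQLite's LIKE is case-insensitive for ASCII by default, which may differ from the collation in use on another database.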
Please try using escape sequences.
http://dev.mysql.com/doc/refman/5.7/en/string-literals.html
Something like, SELECT * FROM Table WHERE Title LIKE '%sandy\'s dog%'
How about this?
SELECT * FROM Table WHERE Title LIKE '%sandy''s dog%'
or
SELECT * FROM Table WHERE Title LIKE '%sandy_s dog%'
The underscore is a "single character" wildcard.

mysqli query issue. mysqli_real_escape_string error

Having trouble with a mysqli query - specifically the WHERE stoname= clause.
This doesn't work:
$result = @mysqli_query($dbc, "SELECT * FROM thedb WHERE coname='{$_SESSION['user']}' AND stoname='{$_SESSION['store']}' ");
If I echo $_SESSION['store'] then it prints as o\'store which matches what's in the database. Yet this doesn't work.
However, if I echo mysqli_real_escape_string($dbc,($_SESSION['store'])) then it prints o\\\'store which is NOT what's in the database. Yet it works.
$result = @mysqli_query($dbc, "SELECT * FROM thedb WHERE user='{$_SESSION['user']}' AND stoname='".mysqli_real_escape_string($dbc,($_SESSION['store']))."' ");
I accept that I have working code, but I'm confused as to why this is the case. Can anyone explain what I've done / am doing wrong? Thanks
If your table literally contains o\'store then mysqli_real_escape_string() has done its job correctly. The function's purpose is to escape a string for safe inclusion inside a SQL statement, not to be an exact literal match for what is actually already in your table.
The table value contains literally \'. When you use that value directly in the SQL statement, the backslash is misinterpreted as an escape character to the ' rather than as a literal \ as appears in your table. So the query produces no results because the executed SQL statement is:
# MySQL sees only an escaped ' and no \
SELECT * FROM thedb WHERE coname='something' AND stoname='o\'store'
...meaning the value of stoname actually compared is just o'store without \, because \ has been discarded by MySQL as an escape character.
So mysqli_real_escape_string() produces a value with two changes.
First, the literal ' in your original string is backslash-escaped as \' for use in the SQL.
Then, the literal \ already in your string is itself backslash-escaped so that it can be understood as a literal character by MySQL rather than an escape character. That results in \\. Combined with the escaped \', you now have \\\'.
MySQL receives that string \\\' and is able to correctly interpret it as one literal \ followed by one literal ' after discarding the extra \ escape character before each. The condition matches the column's actual value and your query is successful.
# MySQL sees an escaped \ followed by an escaped '
SELECT * FROM thedb WHERE coname='something' AND stoname='o\\\'store'
About storage...
We don't know much about how your table originally received its value, but I have a hunch it was stored in an escaped form. If the string o\'store was originally o'store without the \, it suggests that an escaped value was inserted in the table. That is not usually done, and is undesirable. Correct use of mysqli_real_escape_string() at the time of data insertion should store the original string rather than an escaped string. Escaping is only done when constructing SQL statements.
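The round trip described in this answer can be mimicked in a few lines. This is a simplified model written in Python, not the real mysqli function or the real MySQL parser: escape_for_sql and parse_string_literal are hypothetical helpers that cover only the characters mysqli_real_escape_string() is documented to escape.

```python
# Characters escaped by the (modelled) escaping function and their escaped forms.
ESCAPES = {
    "\0": "\\0", "\n": "\\n", "\r": "\\r", "\\": "\\\\",
    "'": "\\'", '"': '\\"', "\x1a": "\\Z",
}

def escape_for_sql(value: str) -> str:
    """Model of escaping a value before placing it inside quotes in SQL."""
    return "".join(ESCAPES.get(ch, ch) for ch in value)

# Escape sequences the (modelled) SQL parser turns back into single characters.
UNESCAPES = {"0": "\0", "n": "\n", "r": "\r", "t": "\t", "b": "\b", "Z": "\x1a"}

def parse_string_literal(body: str) -> str:
    """Model of what the SQL parser makes of the text between the quotes."""
    out, i = [], 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # \% and \_ keep their backslash; everything else drops it.
            out.append("\\" + nxt if nxt in "%_" else UNESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)

stored = "o\\'store"                      # the column holds: o\'store
unescaped_seen = parse_string_literal(stored)
escaped = escape_for_sql(stored)
escaped_seen = parse_string_literal(escaped)

print(unescaped_seen)          # value compared when the string is used directly
print(escaped)                 # text placed in the SQL after escaping
print(escaped_seen == stored)  # whether the compared value equals the column
```

Used directly, the value loses its backslash before the comparison; escaped first, it arrives at the comparison unchanged, which is why only the second query in the question finds the row.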

What does it mean to escape a string?

I was reading Does $_SESSION['username'] need to be escaped before getting into an SQL query? and it said "You need to escape every string you pass to the sql query, regardless of its origin". Now I know something like this is really basic. A Google search turned up over 20,000 results. Stack Overflow alone had 20 pages of results, but no one actually explains what escaping a string is or how to do it. It is just assumed. Can you help me? I want to learn because, as always, I am making a web app in PHP.
I have looked at:
Inserting Escape Characters,
What are all the escape characters in Java?,
Cant escape a string with addcslashes(),
Escape character,
what does mysql_real_escape_string() really do?,
How can i escape double quotes from a string in php?,
MySQL_real_escape_string not adding slashes?,
remove escape sequences from string in php
I could go on but I am sure you get the point. This is not laziness.
Escaping a string means to reduce ambiguity in quotes (and other characters) used in that string. For instance, when you're defining a string, you typically surround it in either double quotes or single quotes:
"Hello World."
But what if my string had double quotes within it?
"Hello "World.""
Now I have ambiguity - the interpreter doesn't know where my string ends. If I want to keep my double quotes, I have a couple options. I could use single quotes around my string:
'Hello "World."'
Or I can escape my quotes:
"Hello \"World.\""
Any quote that is preceded by a backslash is escaped, and understood to be part of the value of the string.
When it comes to queries, MySQL has certain keywords it watches for that we cannot use in our queries without causing some confusion. Suppose we had a table of values where a column was named "Select", and we wanted to select that:
SELECT select FROM myTable
We've now introduced some ambiguity into our query. Within our query, we can reduce that ambiguity by using back-ticks:
SELECT `select` FROM myTable
This removes the confusion we've introduced by using poor judgment in selecting field names.
A lot of this can be handled for you by simply passing your values through mysql_real_escape_string(). In the example below you can see that we're passing user-submitted data through this function to ensure it won't cause any problems for our query:
// Query
$query = sprintf("SELECT * FROM users WHERE user='%s' AND password='%s'",
mysql_real_escape_string($user),
mysql_real_escape_string($password));
Other functions exist for escaping strings, such as addslashes, addcslashes, and quotemeta, though when the goal is to run a safe query, developers by and large prefer mysql_real_escape_string, or pg_escape_string in the context of PostgreSQL. (The mysql_* functions were removed in PHP 7; mysqli_real_escape_string is the mysqli equivalent.)
Some characters have special meaning to the SQL database you are using. When these characters are being used in a query they can cause unexpected and/or unintended behavior including allowing an attacker to compromise your database. To prevent these characters from affecting a query in this way they need to be escaped, or to say it a different way, the database needs to be told to not treat them as special characters in this query.
In the case of mysql_real_escape_string(), it escapes \x00, \n, \r, \, ', " and \x1a, because these, when not escaped, can cause the previously mentioned problems, including SQL injection against a MySQL database.
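Another way to tell the database not to treat these characters as special is to pass the values separately from the SQL text through placeholders, so that no manual escaping is needed. The sketch below shows the idea with Python's sqlite3 module and an in-memory database, not with PHP or MySQL. The table layout and the helper find_user are made up for the example and mirror the users query shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, password TEXT)")  # hypothetical table
conn.execute("INSERT INTO users VALUES (?, ?)", ("o'brien", "s3cret"))

def find_user(user, password):
    # The ? placeholders keep the values out of the SQL text entirely,
    # so quotes and backslashes in them are treated as plain data.
    return conn.execute(
        "SELECT user FROM users WHERE user = ? AND password = ?",
        (user, password),
    ).fetchall()

print(find_user("o'brien", "s3cret"))            # apostrophe is just data
print(find_user("' OR '1'='1", "' OR '1'='1"))   # injection attempt is just data
```

The first call finds the row even though the name contains an apostrophe, and the second returns nothing because the injected text is compared as an ordinary string.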
For simplicity, you could basically imagine the backslash "\" to be a command to the interpreter during runtime.
For example, while interpreting this statement:
$txt = "Hello world!";
during the lexical analysis phase ( or when splitting up the statement into individual tokens) these would be the tokens identified
$, txt, =, ", Hello world!, ", and ;
However, a backslash within the string causes an extra set of tokens and is interpreted as a command to do something with the character that immediately follows it. For example:
$txt = "this \" is escaped";
results in the following tokens:
$, txt, =, ", this, \, ", is escaped, ", and ;
The interpreter already knows (or has preset routes it can take) what to do based on the character that follows the \ token. So in the case of " it proceeds to treat it as a character and not as the end-of-string command.

Searching MySQL for data that contains backslashes

In a database, I have some text stored in a field called Description. The value of the string saved in my database is Me\You "R'S'" % and that's how it appears when querying the database from the command line.
Now, on a web page i have a function which searches this field as such:
WHERE Description LIKE '%$searchstring%'
So when $searchstring has been cleaned, if I was searching for Me\You, the backslash gets escaped and my query reads:
WHERE Description LIKE '%Me\\You%'
However it doesn't return anything.
Strange part of this, is that when i search Me\\You or Me\\\You (So two or three backslashes, but no less or no more) it will return the result i expect with one backslash.
When querying for the result command-line, it does not return a result for:
WHERE Description LIKE '%Me\You%'
or when i use two or three backslashes.
However it will return the result if i use 4 - 7 backslashes, for example:
WHERE Description LIKE '%Me\\\\\\\You%'
will return the string which is Me\You "R'S'" %
Anyone have a reason to this happening? Thanks
Note
Because MySQL uses C escape syntax in strings (for example, “\n” to represent a newline character), you must double any “\” that you use in LIKE strings. For example, to search for “\n”, specify it as “\\n”. To search for “\”, specify it as “\\\\”; this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against.
Source: http://dev.mysql.com/doc/refman/5.1/en/string-comparison-functions.html#operator_like
Read this Need to select only data that contains backslashes in MySQL to see how to use double backslash escaping. You could also run MySQL in NO_BACKSLASH_ESCAPES mode (http://dev.mysql.com/doc/refman/5.0/en/server-sql-mode.html#sqlmode_no_backslash_escapes)
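Although this is an old post, you can work around this limitation by using the REPLACE function in the WHERE clause to change the backslash to another character. The backslash still has to be written as '\\' inside the string literal, and the first argument must be the column name itself, not a quoted string. Example:
WHERE REPLACE(Description, '\\', '-') LIKE '%Me-You%'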

Mysql, escaped string fields couldn't be searched with "LIKE" keyword

Consider that one of the fields (sample_field) in a MySQL table has the value "your\'s data". When I query the table with
SELECT * FROM sample_table WHERE sample_field = "your\'s data"
and also like as
SELECT * FROM sample_table WHERE sample_field = "your's data"
both of the above query returns 0 rows, even though the sample_field has the value as "your\'s data"
After a long search I came to know that, my search would be
SELECT * FROM sample_table WHERE sample_field = "your\\\'s data"
that is working very fine. but
SELECT * FROM sample_table WHERE sample_field LIKE "your\\\'s data"
is not working. So if I want to search any product or categories with quotes( like "your's data" ), in site search I must use LIKE keyword with wildcard patterns('%').
Now I have found the answer that if I try like this
SELECT * FROM sample_table WHERE sample_field LIKE "your\\\\'s data"
as given in http://dev.mysql.com/doc/refman/5.0/en/string-comparison-functions.html#operator_like, it is working fine.
But please let me know why it needs this amount of 4 back slashes to achieve this ?
Thanks in advance
You'll have to escape the escaping backslash, if you want it to be a literal backslash:
WHERE sample_field = "your\\'s data"
But really, having a backslash in your database in the first place is the mistake, fix that instead.
Because MySQL uses C escape syntax in strings (for example, “\n” to represent a newline character), you must double any “\” that you use in LIKE strings. For example, to search for “\n”, specify it as “\\n”. To search for “\”, specify it as “\\\\”; this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against.
Try it like this:
... WHERE sample_field = 'your\'s data'
