MYSQL Fulltext Search Issue With Special Characters - php

I am doing a fulltext search in PHP in BOOLEAN mode, which is working 95% of the time. However when I enter a special character like ", the query fails.
The table in question holds around 6,000 records, with a large quantity being commercial grade fasteners.
Example Data:
7/16" UNF x 2 1/4" Hex
Before the query is performed, the input is cleaned using the following function
function clean($db, $str) {
$str = trim($str);
if(get_magic_quotes_gpc()) {
$str = stripslashes($str);
}
$str = htmlspecialchars($str, ENT_NOQUOTES);
$str = preg_replace("/[[:blank:]]+/"," ",$str);
return mysqli_real_escape_string($db, $str);
}
When searching for 7/16" and nothing else, I should get a few hundred results, but I'm getting nothing. I believe the cause is that the input is being escaped incorrectly.
Once the input is cleaned I add + to the beginning of each word to ensure it is included in the results. This is done using the following:
$symbol = '+';
$string = $symbol . str_replace(' ', " $symbol", $term);
Example Query:
SELECT * FROM products WHERE MATCH (code, desc) AGAINST ('+7\/16\\\"' IN BOOLEAN MODE) ORDER BY desc ASC
As you can see the original input of 7/16" has been changed to +7/16\\". Here is where I suspect the problem lies.
I'm really not sure which part of the clean function is causing this. I thought adding ENT_NOQUOTES to htmlspecialchars would resolve it, but it hasn't.
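One thing worth checking is that in BOOLEAN MODE the double quote is itself an operator (it delimits phrases), so a stray " in the term can change how the search is parsed regardless of SQL escaping. The sketch below is only an illustration of stripping operator characters before adding the + prefix; buildBooleanTerm is a hypothetical helper, not part of the original code, and it assumes that dropping those characters is acceptable for this data.

```php
<?php
// Hypothetical helper (not from the original post): turn raw user input into
// a BOOLEAN MODE term string in which every word is required.
function buildBooleanTerm(string $input): string
{
    // Replace characters that boolean mode treats as operators, including
    // the double quote, with a space so they cannot alter the search.
    $stripped = preg_replace('/[-+<>()~*"@]+/', ' ', $input);

    // Split on any run of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', $stripped, -1, PREG_SPLIT_NO_EMPTY);

    // Prefix each remaining word with "+".
    return implode(' ', array_map(fn ($word) => '+' . $word, $words));
}

echo buildBooleanTerm('7/16" UNF x 2 1/4" Hex'), "\n"; // +7/16 +UNF +x +2 +1/4 +Hex
```

The resulting string could then be bound to a placeholder in a prepared statement, for example AGAINST (? IN BOOLEAN MODE), instead of being escaped by hand. Whether short tokens such as 7 or 16 are indexed at all depends on the server's full-text minimum word length settings, so this may not be the whole story.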

Related

How to replace matching characters in two sets of strings with bolded characters?

I am currently working on the following function:
public function boldText($searchSuggestions)
{
$search = $this->getRequestParameter('search');
$pattern = "/".$search."/u";
$searchSuggestions = preg_replace($pattern, '<b>'.$search.'</b>', $searchSuggestions);
echo $searchSuggestions;
}
Let's say $searchSuggestions = hello
While the user types in the search box (the input is held in $search), a dropdown menu of possible suggestions is displayed. If a user types 'hello', then results like 'helloworld' or 'hello2' pop up, and the typed word, in this case 'hello', should be bold in every suggestion. So far it works, but uppercase characters are being replaced with lowercase characters and vice versa in the displayed suggestions. I have a feeling that the underlying problem is in this function, but I am not entirely sure. If anyone has suggestions or tips on where to look, that would be great!
If I should provide more info, please let me know and I will edit the question immediately.
Thank you!
Example output currently -
User types in search bar - 'hello'
result shown should be - 'Hello'
result actually being shown - 'hello'
P.S. The results come from an SQL query. As the user types, a query fetches the data related to the typed words. For instance: SELECT * FROM test WHERE example LIKE '%hello%'
The database contains the word Hello. Note the uppercase H.
I tried the following code:
public function boldText($searchSuggestions)
{
$search = $this->getRequestParameter('search');
$pattern = "/".$search."/u";
$searchSuggestions = preg_replace($pattern, '<b>'.$search.'</b>', $searchSuggestions);
echo $searchSuggestions;
}
Brief Explanation
$searchSuggestions
Hello
$search(being typed by the user)
h
The output should be
<b>H</b>ello
The output I am currently getting is
<b>h</b>ello
I tried to look further on how to properly solve this and it was a bit simpler than I expected. Did not have to use preg_replace for this. The following code did the trick for me:
public function boldText($searchSuggestions)
{
$search = $this->getRequestParameter('search');
$lastPosition = strlen($search);
$firstPosition = stripos($searchSuggestions, $search);
$replacement = substr($searchSuggestions, $firstPosition, $lastPosition);
$searchSuggestions = substr_replace($searchSuggestions, '<b><u>'.$replacement.'</u></b>', $firstPosition, $lastPosition);
echo $searchSuggestions;
}
$lastPosition - the length of the search input, i.e. how many characters to replace
$firstPosition - the position of the first case-insensitive match of $search inside $searchSuggestions
$replacement - the matching substring copied from $searchSuggestions (so it keeps its original case), located with $firstPosition and $lastPosition
At the end, $searchSuggestions is rebuilt with the match wrapped in tags. I hope that was clear enough; if not, please let me know. (Answered my own question lol!)
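A standalone version of the stripos approach might look like the sketch below. boldFirstMatch is a hypothetical name, and the guards for an empty search or a missing match are additions that the code above does not have (without them, a false result from stripos would be treated as position 0).

```php
<?php
// Hypothetical standalone variant of the stripos/substr_replace approach.
// It wraps only the first case-insensitive match and keeps the original case.
function boldFirstMatch(string $suggestion, string $search): string
{
    if ($search === '') {
        return $suggestion; // nothing to highlight
    }

    $position = stripos($suggestion, $search);
    if ($position === false) {
        return $suggestion; // no match: leave the suggestion untouched
    }

    $length  = strlen($search);
    $matched = substr($suggestion, $position, $length); // text as stored, e.g. "H"

    return substr_replace($suggestion, '<b>' . $matched . '</b>', $position, $length);
}

echo boldFirstMatch('Hello', 'h'), "\n"; // <b>H</b>ello
```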
You need to capture the found text so that you can use it in the replacement. Your current implementation inserts the user's input, which may or may not match the case of the original string. So you should:
use the i modifier so your search is case insensitive
use word boundaries so partial matching doesn't occur
use capture group to capture the match
use preg_quote so any special regex characters don't affect regex operations
public function boldText($searchSuggestions)
{
$search = $this->getRequestParameter('search');
$pattern = "/\b(" . preg_quote($search, '/') . ")\b/iu";
$searchSuggestions = preg_replace($pattern, '<b>$1</b>', $searchSuggestions);
echo $searchSuggestions;
}
If partial word matching is intended you can remove word boundaries:
public function boldText($searchSuggestions)
{
$search = $this->getRequestParameter('search');
$pattern = "/(" . preg_quote($search, '/') . ")/iu";
$searchSuggestions = preg_replace($pattern, '<b>$1</b>', $searchSuggestions);
echo $searchSuggestions;
}
Also consider https://en.wikipedia.org/wiki/Scunthorpe_problem#Origin_and_history, though.
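The capture-group idea can be tried outside the class with a sketch like the one below. boldMatches is a hypothetical standalone function, and the empty-search guard is an assumption about the desired behaviour. The second argument to preg_quote matters because the pattern uses / as its delimiter.

```php
<?php
// Hypothetical standalone version of the capture-group approach.
// Every case-insensitive occurrence is wrapped, keeping its original case.
function boldMatches(string $suggestion, string $search): string
{
    if ($search === '') {
        return $suggestion; // an empty pattern would match between every character
    }

    // Escape regex metacharacters AND the "/" delimiter in the user input.
    $pattern = '/(' . preg_quote($search, '/') . ')/iu';

    // $1 is the text actually found in the suggestion, not the typed input.
    // preg_replace returns null on error (e.g. invalid UTF-8), so fall back.
    return preg_replace($pattern, '<b>$1</b>', $suggestion) ?? $suggestion;
}

echo boldMatches('Hello', 'h'), "\n"; // <b>H</b>ello
```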

Preg replace callback validation

So I need to rewrite some old code that I found in a library.
$text = preg_replace("/(<\/?)(\w+)([^>]*>)/e",
"'\\1'.strtolower('\\2').'\\3'", $text);
$text = preg_replace("/<br[ \/]*>\s*/","\n",$text);
$text = preg_replace("/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/", "\n",
$text);
And for the first one I have tried like this:
$text = preg_replace_callback(
"/(<\/?)(\w+)([^>]*>)/",
function($subs) {
return strtolower($subs[0]);
},
$text);
I'm a bit confused because I don't understand this part: "'\\1'.strtolower('\\2').'\\3'", so I'm not sure what I should replace it with.
As far as I understand, the first line looks for tags and makes them lowercase in case I have data like
<B>FOO</B>
Can you help me out with a clarification, and tell me whether my code is correct?
The $subs is an array that contains the whole value in the first item and captured texts in the subsequent items. So, Group 1 is in $subs[1], Group 2 value is in $subs[2], etc. The $subs[0] contains the whole match value, and you applied strtolower to it, but the original code left the Group 3 value (captured with ([^>]*>) that may also contain uppercase letters) intact.
Use
$text = preg_replace_callback("~(</?)(\w+)([^>]*>)~", function($subs) {
return $subs[1] . strtolower($subs[2]) . $subs[3];
}, $text);
See the PHP demo.
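To see the difference between lowercasing $subs[0] and lowercasing only Group 2, the callback can be wrapped in a small function and run on a tag with attributes. lowercaseTagNames is a hypothetical name used only for this sketch.

```php
<?php
// Hypothetical wrapper around the callback above: only the tag name
// (Group 2) is lowercased, attributes and text content stay as they are.
function lowercaseTagNames(string $html): string
{
    return preg_replace_callback(
        '~(</?)(\w+)([^>]*>)~',
        function (array $subs): string {
            // $subs[1] = "<" or "</", $subs[2] = tag name, $subs[3] = rest of the tag
            return $subs[1] . strtolower($subs[2]) . $subs[3];
        },
        $html
    ) ?? $html; // preg_replace_callback returns null on error
}

echo lowercaseTagNames('<B CLASS="Big">FOO</B>'), "\n"; // <b CLASS="Big">FOO</b>
```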

php replace with regex and remove specified pattern with regex

|affffc100|Hitem:bb:101:1:1:1:1:48:-30:47:18:5:2:6:6:0:0:0:0:0:0:0:0|h[Subject Name]|h|r
The line above is what my variable usually looks like when printed.
|cffffc700|Hitem:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x|h[SUBJECT_NAME]|h|r
The line above is my pattern.
Each x can be a letter (a-z, A-Z) or a digit (0-9).
In one column I have many variables like that (up to 8),
and they are all mixed with plain strings like this:
|affffc100|Hitem:bb:101:1:1:1:1:48:-30:47:18:5:2:6:6:0:0:0:0:0:0:0:0|h[Gold]|h|r NEW SOLD |affffc451|Hitem:bb:101:1:1:1:1:25:-33:12:42:5a:2f:6w:6:0:0:0:0f:0:0a:0b:0|h[Copper]|h|r maximum price 15k|affffx312|Hitem:bb:101:1:1:1:1:25:-33:12:42:5a:2f:6w:6:0:0:0:0f:0:0a:0b:0|h[Silver]|h|r
In each variable I want to strip these unnecessary patterns and keep only the subject name from the brackets [].
So:
|cffffc700|Hitem:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x:x|h[SUBJECT NAME]|h|r
needs to leave only SUBJECT NAME in my variable.
Just to remind you, every variable always contains more than one of these patterns (up to 8).
I've searched everywhere but couldn't find any reasonable answers or good patterns. I tried to write it myself; I guess I need to collect all these patterns into an array, clean them, and leave only the subject names, but I don't know exactly how to do it.
How do I convert this string:
|affffc100|Hitem:bb:101:1:1:1:1:48:-30:47:18:5:2:6:6:0:0:0:0:0:0:0:0|h[Gold]|h|r NEW SOLD |affffc451|Hitem:bb:101:1:1:1:1:25:-33:12:42:5a:2f:6w:6:0:0:0:0f:0:0a:0b:0|h[Copper]|h|r maximum price 15k|affffx312|Hitem:bb:101:1:1:1:1:25:-33:12:42:5a:2f:6w:6:0:0:0:0f:0:0a:0b:0|h[Silver]|h|r
into this:
Gold NEW SOLD Copper maximum price 15k Silver
What should I use, preg_replace?
One more thing: when I have a string without my special pattern, I get an empty result from the function, e.g.:
$str = "15KKK sold, 20KK updated";
expected result:
"15KKK sold, 20KK updated" // same without any pattern
but that one returns an EMPTY result.
another string:
$str = "|affffc100|Hitem:bb:101:1:1:1:1:48:-30:47:18:5:2:6:6:0:0:0:0:0:0:0:0|h[Uranium]|h|r 155kk |affffc451|Hitem:bb:101:1:1:1:1:25:-33:12:42:5a:2f:6w:6:0:0:0:0f:0:0a:0b:0|h[Metal]|h|r is sold";
expected result:
"Uranium 155kk Metal is sold"
If I use that function on a string without the pattern, it returns an empty result; that's my problem now.
Thank you very much.
I'd do:
$str = '|affffc100|Hitem:bb:101:1:1:1:1:48:-30:47:18:5:2:6:6:0:0:0:0:0:0:0:0|h[Gold]|h|r NEW SOLD |affffc451|Hitem:bb:101:1:1:1:1:25:-33:12:42:5a:2f:6w:6:0:0:0:0f:0:0a:0b:0|h[Copper]|h|r maximum price 15k|affffx312|Hitem:bb:101:1:1:1:1:25:-33:12:42:5a:2f:6w:6:0:0:0:0f:0:0a:0b:0|h[Silver]|h|r';
preg_match_all('/h(\[.+?\])\|h\|r([^|]*)/', $str, $m);
$res = ''; // initialise before appending to it
for($i=0; $i<count($m[0]); $i++) {
$res .= $m[1][$i] . ' ' . $m[2][$i] . ' ';
}
echo $res,"\n";
Output:
[Gold] NEW SOLD [Copper] maximum price 15k [Silver]
If you want to keep the strings that don't match, test the result of preg_match_all:
$res = ''; // initialise before appending to it
if (preg_match_all('/h(\[.+?\])\|h\|r([^|]*)/', $str, $m)) {
for($i=0; $i<count($m[0]); $i++) {
$res .= $m[1][$i] . ' ' . $m[2][$i] . ' ';
}
} else {
$res = $str;
}
echo $res,"\n";
try this regex:
\|\w{9}\|Hitem(?::-?\w+)+\|h\[(?<SUBJECTNAME>\w+)\]\|h\|r
It will match each variable sequence and capture the subject name in the named group. Note that \w+ only matches names without spaces.
see the demo here
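Since the question asks about preg_replace, a replace-based sketch may be the shortest route: it replaces each item link with the bracketed name and leaves strings without the pattern untouched. stripItemLinks and its pattern are assumptions derived from the sample strings above, not a tested rule for every possible link.

```php
<?php
// Hypothetical helper: replace every "|xxxxxxxxx|Hitem:...|h[Name]|h|r"
// sequence with just "Name". Strings without the pattern pass through.
function stripItemLinks(string $text): string
{
    // Group 1 captures the text between the square brackets.
    $pattern = '/\|\w+\|Hitem:[^|]*\|h\[([^\]]+)\]\|h\|r/';

    // Pad the name with spaces so that "15k|aff..." becomes "15k Silver".
    $replaced = preg_replace($pattern, ' $1 ', $text) ?? $text;

    // Collapse the extra whitespace introduced by the padding.
    return trim(preg_replace('/\s+/', ' ', $replaced) ?? $replaced);
}

echo stripItemLinks('15KKK sold, 20KK updated'), "\n"; // 15KKK sold, 20KK updated
```

Because the whitespace is normalised at the end, runs of spaces in the surrounding text are also reduced to a single space; drop that step if the original spacing matters.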

PHP trouble with preg_match

I thought I had this working; however after further evaluation it seems it's not working as I would have hoped it was.
I have a query pulling back a string. The string is a comma separated list just as you see here:
(1,145,154,155,158,304)
Nothing has been added or removed.
I have a function that I thought I could use preg_match to determine if the user's id was contained within the string. However, it appears that my code is looking for any part.
preg_match('/'.$_SESSION['MyUserID'].'/',$datafs['OptFilter_1']))
With the sample data, I would think the call looks like this:
preg_match('/1/', '(1,145,154,155,158,304)')
After testing, if my user id is 4 the current code returns true and it shouldn't. What am I doing wrong? As you can see, the id length can change.
It's better to put all your IDs in an array and then check whether the desired ID exists:
<?php
$str = "(1,145,154,155,158,304)";
$str = str_replace(array("(", ")"), "", $str);
$arr = explode(',', $str);
if(in_array($_SESSION['MyUserID'], $arr))
{
// ID exists
}
As for your string: regular expressions are not recommended here, but the regex below will match your ID if it's there:
preg_match("#[,(]$ID[,)]#", $str)
Explanations:
[,(] # a comma , or opening-parenthesis ( character
$ID # your ID
[,)] # a comma , or closing-parenthesis ) character
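The array approach can be packaged as a small function to confirm that an ID of 4 no longer matches 145 or 304. idInList is a hypothetical name; the sketch assumes the list always has the "(a,b,c)" shape shown in the question.

```php
<?php
// Hypothetical helper: check whether $id appears as a whole entry in a
// string such as "(1,145,154,155,158,304)".
function idInList(string $list, $id): bool
{
    // Drop the surrounding parentheses and whitespace, then split on commas.
    $ids = array_map('trim', explode(',', trim($list, "() \t\n")));

    // Compare as strings with strict matching, so "4" never equals "304".
    return in_array((string) $id, $ids, true);
}

var_dump(idInList('(1,145,154,155,158,304)', 4));   // bool(false)
var_dump(idInList('(1,145,154,155,158,304)', 145)); // bool(true)
```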

Restrict text input to only allow five (5) words in PHP

I want to know how I can allow only five (5) words on text input using PHP.
I know that I can use the strlen function for character count, but I was wondering how I can do it for words.
You can try it like this:
$string = "this has way more than 5 words so we want to deny it ";
//edit: make sure only one space separates words if we want to get really robust:
//(found this regex through a search and havent tested it)
$string = preg_replace("/\\s+/", " ", $string);
//trim off beginning and end spaces;
$string = trim($string);
//get an array of the words
$wordArray = explode(" ", $string);
//get the word count
$wordCount = sizeof($wordArray);
//see if its too big
if($wordCount > 5) echo "Please make a shorter string";
should work :-)
You could count the spaces with
substr_count($_POST['your text box'], ' ');
and limit the result to 4, since five words are separated by four spaces.
If $input is your input string,
$wordArray = explode(' ', $input);
if (count($wordArray) > 5)
//do something; too many words
Although I honestly don't know why you'd want to do your input validation with PHP. If you just use JavaScript, you can give the user a chance to correct the input before the form submits.
Aside from all these nice solutions using explode() or substr_count(), why not simply use PHP's built-in function to count the number of words in a string. I know the function name isn't particularly intuitive for this purpose, but:
$wordCount = str_word_count($string);
would be my suggestion.
Note, this isn't necessarily quite as effective when using multibyte character sets. In that case, something like:
define("WORD_COUNT_MASK", "/\p{L}[\p{L}\p{Mn}\p{Pd}'\x{2019}]*/u");
function str_word_count_utf8($str)
{
return preg_match_all(WORD_COUNT_MASK, $str, $matches);
}
is suggested on the str_word_count() manual page
You will have to do it twice, once using JavaScript at the client-side and then using PHP at the server-side.
In PHP, use the explode function to split the input by spaces, so you get the words in an array. Then check the length of the array.
$mytextboxcontent=$_GET["txtContent"];
$words = explode(" ", $mytextboxcontent);
$numberOfWords=count($words);
if($numberOfWords>5)
{
echo "Only 5 words allowed";
}
else
{
//do whatever you want....
}
I didn't test this. Hope this works. I don't have a PHP environment set up on my machine now.
You could count the number of spaces...
$wordCount = substr_count($input, ' ');
I think you want to do it first with JavaScript, so the user can only enter 5 words, and then validate it with PHP afterwards to guard against bad intentions.
In JS you need to track the characters as they are typed and keep a count of the words, incrementing the counter on each space.
Take a look at this code: http://javascript.internet.com/forms/limit-characters-and-words-entered.html
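For the server-side check, the explode and space-counting suggestions above miscount when the input has leading, trailing, or repeated spaces. A sketch that splits on any run of whitespace avoids that; countWords and hasAtMostWords are hypothetical names, and "word" here simply means a run of non-whitespace characters.

```php
<?php
// Hypothetical helpers for a server-side word limit.
function countWords(string $input): int
{
    // Split on runs of whitespace and ignore empty pieces, so extra
    // spaces at the ends or between words do not inflate the count.
    $words = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);

    return count($words);
}

function hasAtMostWords(string $input, int $limit): bool
{
    return countWords($input) <= $limit;
}

if (!hasAtMostWords('this has way more than 5 words', 5)) {
    echo "Only 5 words allowed\n";
}
```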
