I have a very long list of names and I am using preg_replace to check whether a name from the list appears anywhere in the string. If I test it with a few names in the regex it works fine, but since I have over 5000 names it gives me the error "preg_replace(): Compilation failed: regular expression is too large".
Somehow I cannot figure out how to split the regex into pieces so it becomes smaller (if even possible).
The list with names is created dynamically from a database. Here is my code.
$query_gdpr_names = "select name FROM gdpr_names";
$result_gdpr_names = mysqli_query($connect, $query_gdpr_names);
while ($row_gdpr_names = mysqli_fetch_assoc($result_gdpr_names))
{
$AllNames .= '"/'.$row_gdpr_names['name'].'\b/ui",';
}
$AllNames = rtrim($AllNames, ',');
$AllNames = "[$AllNames]";
$search = preg_replace($AllNames, '****', $search);
The created $AllNames str looks like this (in the example 3 names only)
$AllNames = ["/Lola/ui", "/Monica\b/ui", "/Chris\b/ui"];
And the test string
$search = "I am Lola and my friend name is Chris";
Any help is very appreciated.
Since it appears that you can't easily handle the replacement from PHP using a single regex alternation, one alternative would be to just iterate each name in the result set one by one and make a replacement:
while ($row_gdpr_names = mysqli_fetch_assoc($result_gdpr_names)) {
    // Escape regex metacharacters in the name, including the "/" delimiter
    $name = preg_quote($row_gdpr_names['name'], '/');
    $regex = "/\b" . $name . "\b/ui";
    $search = preg_replace($regex, '----', $search);
}
$search = preg_replace("/----/", '****', $search);
This is not the most efficient pattern for doing this. Perhaps there is some way you can limit your result set to avoid a too long single alternation.
Ok, I was debugging a lot. Even isolating everything else but this part of code
$search = "Lola and Chris";
$query_gdpr_names = "select * FROM gdpr_names";
$result_gdpr_names = mysqli_query($connect, $query_gdpr_names);
while ($row_gdpr_names = mysqli_fetch_assoc($result_gdpr_names)) {
$name = $row_gdpr_names['name'];
$regex = "/\b" . $name . "\b/ui";
$search = preg_replace($regex, '****', $search);
}
echo $search;
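Still, it prints inside the loop but not outside it.
The problem was actually in the database records: one of them contained a slash. Because / is also the regex delimiter, that name ended the pattern early, preg_replace returned null, and $search was emptied.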
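For anyone hitting both problems (a slash in a name and "regular expression is too large"), here is a minimal sketch that escapes each name with preg_quote and splits the list into several smaller alternations with array_chunk. The function name maskNames and the chunk size of 500 are my own assumptions, not something from the thread; tune the chunk size to whatever your PCRE build accepts.

```php
<?php
// Hypothetical helper: replaces every listed name in $text with ****.
// $chunkSize is an assumed value; lower it if the pattern is still too large.
function maskNames(string $text, array $names, int $chunkSize = 500): string
{
    // Drop empty names, then escape regex metacharacters and the "/" delimiter.
    $names = array_filter($names, function ($n) { return $n !== ''; });
    $quoted = array_map(function ($n) { return preg_quote($n, '/'); }, $names);

    // One alternation per chunk keeps each compiled pattern small.
    foreach (array_chunk($quoted, $chunkSize) as $chunk) {
        $pattern = '/\b(?:' . implode('|', $chunk) . ')\b/ui';
        $result = preg_replace($pattern, '****', $text);
        if ($result === null) {
            // preg_replace returns null on failure; do not overwrite $text with it.
            throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
        }
        $text = $result;
    }
    return $text;
}

echo maskNames("I am Lola and my friend name is Chris", ["Lola", "Monica", "Chris", "a/b"]);
```

Checking the return value for null is what would have exposed the slash problem right away, instead of silently ending up with an empty $search.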
I have a SQL query for my search form.
$term = $request->get('term');
$queries = Article::where('title', 'LIKE', '%' . $term . '%')->published()->get();
My search is working. If I have an article called "my great article is awesome" and I type "great article" into my search form, it works.
But if I type "article awesome", where the words are not adjacent, it does not work.
How do I get my query to match on individual keywords?
Thank you
You can do something like the following:
$term = $request->get('term');
$keywords = explode(" ", $term);
$article = Article::query();
foreach($keywords as $word){
$article->orWhere('title', 'LIKE', '%'.$word.'%');
}
$articles = $article->published()->get();
If you want only results that contain all the words in the query just replace the orWhere with where.
If you want to filter out certain words you could add something like:
$filtered = ["a", "an", "the"];
$filteredKeywords = array_diff($keywords, $filtered);
Alternatively you can pass a closure if you want to be more dynamic:
$filteredKeywords = array_filter($keywords, function($word) {
return strlen($word) > 2;
});
Why don't you try something like this:
$search = "article awesome";
$search = preg_replace("#\s+#", '%', $search);
Replacing the spaces with '%' will handle the case you mentioned.
If you want the search to ignore the word order, this should work:
$search = trim(" article awesome great ");
$search = preg_split("#\s+#", $search);
$where = "WHERE column like '%" . implode( "%' AND column like '%", $search ) . "%'";
However, it will take more execution time and resources on the server.
Also remember to escape the input (or use a prepared statement) to avoid SQL injection and syntax errors.
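As a rough sketch of the escaping point, the same order-independent WHERE clause can be built with ? placeholders so the keywords are bound by a prepared statement instead of being concatenated into the SQL. The function name buildKeywordWhere is made up for illustration, and the column name is assumed to be a trusted, hard-coded value.

```php
<?php
// Hypothetical helper: returns [where clause with placeholders, values to bind].
// $column must be a trusted identifier, never user input.
function buildKeywordWhere(string $column, string $term): array
{
    // Split on any run of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($term), -1, PREG_SPLIT_NO_EMPTY);

    $clauses = [];
    $params = [];
    foreach ($words as $word) {
        $clauses[] = "$column LIKE ?";
        $params[] = '%' . $word . '%';
    }

    // No keywords means no WHERE clause at all.
    $where = $clauses ? 'WHERE ' . implode(' AND ', $clauses) : '';
    return [$where, $params];
}

list($where, $params) = buildKeywordWhere('title', ' article   awesome great ');
echo $where, "\n";
print_r($params);
```

The returned values can then be passed to whatever prepared-statement API you use, for example as the argument to PDOStatement::execute.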
I have Chinese php search queries.
I want to split up any query up into individual characters.
ex: 你好 (ni hao, hello) split into 你 and 好
my query is set like:
$q = $_REQUEST["q"];
the results I want to split is set up like:
$results4 = $db->query( "SELECT CHS, PIN, DEF FROM FOUR
WHERE CHS LIKE '%".$q."%' OR PIN LIKE '%".$q."%'");
while ($row4 = $results4->fetchArray()) {
How can I split up the keyword and look up all the components?
If you want it all in one query you will have to generate the whole query. If you were looking for an exact match you could use something similar to the in_array() function, but with LIKE it doesn't work.
You could, however, loop through the array of characters (called $qtwo here) and put together the WHERE part programmatically.
Like this:
$where = array();
foreach ( $qtwo as $word ) {
$where[] = "CHS LIKE '%" . $word . "%'";
}
$where = implode(' OR ', $where);
Use this $where variable in your query
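You can use str_split to convert a string into an array of chars, but note that it splits by byte, so it breaks multibyte UTF-8 text such as Chinese. For that kind of input use mb_str_split (PHP 7.4+, mbstring extension) instead:
$chars = mb_str_split($q);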
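To make the byte-versus-character difference concrete for the 你好 example, here is a small sketch. It uses preg_split with the u modifier, which only needs PCRE; mb_str_split gives the same result where the mbstring extension is available. The placeholder-based WHERE clause at the end is my own assumption about how you would feed the characters into the query.

```php
<?php
$q = "你好";

// str_split works on bytes: each of these characters is 3 bytes in UTF-8.
$bytes = str_split($q);

// Splitting on the empty pattern in UTF-8 mode yields whole characters.
$chars = preg_split('//u', $q, -1, PREG_SPLIT_NO_EMPTY);

// One "CHS LIKE ?" per character, joined with OR, plus the values to bind.
$where = implode(' OR ', array_fill(0, count($chars), 'CHS LIKE ?'));
$params = array_map(function ($c) { return '%' . $c . '%'; }, $chars);

echo count($bytes), " bytes, ", count($chars), " characters\n";
echo $where, "\n";
```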
I have this string "Beautiful  sunset" (notice the double space)
When I search in the database:
SELECT * FROM images WHERE title LIKE '%$string%'
and I search for "Beautiful sunset" (notice the single space) it won't return any results.
How can I tackle this?
Split the string on the space.
Now you have two strings, say $str1 and $str2.
Trim these two strings (i.e. remove leading and trailing whitespace),
then rebuild the string:
$string = $str1 . '%' . $str2;
Try this
SELECT * FROM images WHERE MATCH(title) AGAINST ('$string' IN BOOLEAN MODE)
Check this Link also
http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
You could split your search string into multiple parts on the space and then build SQL like this from the parts:
SELECT * FROM images WHERE title LIKE '%$part[0]%'
or title LIKE '%$part[1]%'
or title LIKE '%$part[2]%'
or title LIKE '%$part[3]%'
...
Make sure to skip double spaces / empty parts.
One way you can do this is to use string replace to replace spaces with a wildcard character:
$string = str_replace(" ", "%", $string);
$Query = "SELECT * FROM images WHERE title LIKE '%$string%'";
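A slightly more defensive sketch of the same idea: collapse any run of whitespace into a single % and escape the LIKE wildcards that the user may have typed. The function name likePattern is hypothetical, and the backslash escaping assumes MySQL's default LIKE escape character. The result is meant to be bound as a parameter, not pasted into the SQL string.

```php
<?php
// Hypothetical helper: turns user input into a LIKE pattern.
function likePattern(string $input): string
{
    // Escape the backslash first, then the LIKE wildcards % and _.
    $escaped = str_replace(['\\', '%', '_'], ['\\\\', '\\%', '\\_'], trim($input));

    // Any run of whitespace becomes one wildcard, so single and double
    // spaces in the search term produce the same pattern.
    $collapsed = preg_replace('/\s+/', '%', $escaped);

    return '%' . $collapsed . '%';
}

echo likePattern("Beautiful sunset"), "\n";
echo likePattern("Beautiful  sunset"), "\n";
```

Both calls print the same pattern, and that pattern matches a stored title regardless of how many spaces sit between the two words.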
If you don't know what string you're going to get (if it isn't "Beautiful sunset" all the time), you could explode the string and make a query based on that. Like so...
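$c = false;
$stringtosearch = "Beautiful sunset";
$query = "SELECT * FROM images WHERE";
foreach (explode(" ", $stringtosearch) as $b)
{
    if ($b === "") continue; // skip empty parts produced by double spaces
    if ($c)
    {
        $query .= " or";
    }
    // quote the value and add wildcards; escape $b first if it is user input
    $query .= " title LIKE '%" . $b . "%'";
    $c = true;
}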
And afterwards you would get the variable $query with your query string.
I want to take a url that does not have any apostrophes, commas or ampersands in it and match it with a record in a database that may have one of those characters.
For example:
mywebsite.com/bobs-big-boy
mywebsite.com/tom--jerry
mywebsite.com/one-two-three
rewrite to
index.php?name=bobs-big-boy
index.php?name=tom--jerry
index.php?name=one-two-three
Then in php I want to use the $_GET['name'] to match the records
bob's big boy
tom & jerry
one, two, three
Now my query looks like this:
"SELECT * from the_records WHERE name=$NAME";
I can't change the records, because they're business names. Is there a way I can write the query to ignore ampersands, commas and apostrophes in the db?
Yes you can but I'm pretty sure it will ignore any indexes you have on the column. And it's disgusting.
Something like
SELECT * FROM the_records
WHERE replace(replace(replace(name, '''', ''), ',', ''), '&', '') = $NAME
By the way taking a get variable like that and injecting it into the mysql query can be ripe for sql injection as far as I know.
pg, I know you said you can't change/update the content in the database you're selecting from, but does anything preclude you from making a table in another database you do have write access to? You could just make a map of urlnames to business names and it'd only be slow the first time you do the replace method.
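If you go the mapping-table route, the URL name can be derived from each business name once and stored next to it. A minimal sketch, assuming the URL rules are exactly what the examples in the question show (apostrophes, commas and ampersands dropped, each space turned into a dash); slugFromName is a made-up name.

```php
<?php
// Hypothetical helper: derives the URL name from a business name,
// following the three examples given in the question.
function slugFromName(string $name): string
{
    // Drop apostrophes, commas and ampersands.
    $stripped = str_replace(["'", ',', '&'], '', $name);

    // Each remaining space becomes a dash ("tom & jerry" keeps two of them).
    return str_replace(' ', '-', $stripped);
}

// Build the slug => business name map that would be stored in the side table.
$map = [];
foreach (["bob's big boy", "tom & jerry", "one, two, three"] as $name) {
    $map[slugFromName($name)] = $name;
}
print_r($map);
```

With the map stored in an indexed column, the lookup by $_GET['name'] becomes a plain equality match.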
Greetings,
This one took me a few minutes to puzzle out! There are actually a few specifics missing from your requirements, so I've tried to work through the problem with different assumptions, as stated below.
Here is the set of assumed input from the URL, as pulled from your example, along with a MySQL injection attack (just for giggles), and variations on the business names. The keys are the expected URLs and the values are the database values to match.
<?php
$names = array(
'bobs-big-boy'=>"bob's big boy",
'tom--jerry'=>'tom & jerry',
'tomjerry'=>'tom&jerry',
'one-two-three'=>'one, two, three',
'onetwothree'=>'one,two,three',
"anything' OR 'haxor'='haxor"=>'die-haxor-die',
);
?>
One clever way to do an end-run around MySQL's lack of regex replacement is to use SOUNDEX, and this approach would seem to mostly work in this case depending on the level of accuracy you need, the density and similarity of customer names, etc. For example, this generates the soundex values for the values above:
$soundex_test = $names;
$select = 'SELECT ';
foreach ($soundex_test as $name=>$dbname) {
echo '<p>'.$name.': '.soundex($name).' :: '.$dbname.': '.soundex($dbname).'</p>';
$select .= sprintf("SOUNDEX('%s'),", $name);
}
echo '<pre>MySQL queries with attack -- '.print_r($select,1).'</pre>';
So, assuming that there is not one customer named 'one, two, three' and a separate one named 'onetwothree', this approach should work nicely.
To use this method, your queries would look something like this:
$soundex_unclean = $names;
foreach ($soundex_unclean as $name=>$dbname) {
$soundex_unclean[$name] = sprintf("SELECT * from the_records WHERE name SOUNDS LIKE '%s';", $name).' /* matches name field = ['.$dbname.'] */';
}
echo '<pre>MySQL queries with attack -- '.print_r(array_values($soundex_unclean),1).'</pre>';
However, here is a run that DOES deal with the injection attack (note the new line). I know this isn't the focus of the question, but ajreal mentioned the issue, so I thought to deal with it as well:
$soundex_clean = $names;
foreach ($soundex_clean as $name=>$dbname) {
// strip out everything but alpha-numerics and dashes
$clean_name = preg_replace('/[^[:alnum:]-]/', '', $name);
$soundex_clean[$name] = sprintf("SELECT * from the_records WHERE name SOUNDS LIKE '%s';", $clean_name).' /* matches name field = ['.$dbname.'] */';
}
echo '<pre>MySQL queries with attack cleaned -- '.print_r($soundex_clean,1).'</pre>';
If this approach does not suit, and you decide that the inline replacement approach is sufficient, then do remember to add a replacement for the comma to the mix as well. As an example of that approach, I'm assuming here that the single quote, double quote, ampersand, and comma (i.e. ', ", &, and ,) are the only four special characters that are included in the database but deleted from the URL, and that any other non-alphanumeric character, spaces included, is converted to a dash (i.e. -).
First, a run that does not deal with the injection attack:
$unclean = $names;
foreach ($unclean as $name=>$dbname) {
$regex_name = preg_replace('/[-]+/', '[^[:alnum:]]+', $name);
$unclean[$name] = sprintf("SELECT * from the_records WHERE REPLACE(REPLACE(REPLACE(REPLACE(name, ',', ''), '&', ''), '\"', ''), \"'\", '') REGEXP '%s'", $regex_name);
}
echo '<pre>MySQL queries with attack -- '.print_r($unclean,1).'</pre>';
Second, a run that DOES deal with the attack:
$clean = $names;
foreach ($clean as $name=>$dbname) {
$regex_name = preg_replace('/[^[:alnum:]-]/', '', $name);
$regex_name = preg_replace('/[-]+/', '[^[:alnum:]]+', $regex_name);
$clean[$name] = sprintf("SELECT * from the_records WHERE REPLACE(REPLACE(REPLACE(REPLACE(name, ',', ''), '&', ''), '\"', ''), \"'\", '') REGEXP '%s'", $regex_name);
}
echo '<pre>MySQL queries with attack cleaned -- '.print_r($clean,1).'</pre>';
Aaaand that's enough brainstorming for me for one night! =o)
Using str_replace function we will grab the $name parameter and replace
ampersands (&) with ""
spaces (" ") with "-"
commas (",") with ""
apostrophes("'") with ""
str_replace ( mixed $search , mixed $replace , mixed $subject [, int &$count ] )
$Search = array("&", " ", ",", "'");
$Replace = array("", "-", "", "");
$ComparableString = str_replace($Search, $Replace, $_GET['name']);
After that we can do the sql query:
// mysql_* functions were removed in PHP 7; the mysqli version takes the connection first
$name = mysqli_real_escape_string($db_resource, $ComparableString);
SELECT * from the_records WHERE name='$name'
It's a little janky, but you could explode the GET and build a WHERE on multiple conditions.
Something like (untested):
$name_array = explode("-", $_GET['name']);
$sql_str = "SELECT * FROM the_records WHERE ";
$first_time = true;
foreach($name_array as $name){
if ($name != ""){
if ($first_time){
$sql_str .= "name LIKE \"%".$name."%\"";
$first_time = false;
}
else {
$sql_str .= " AND name LIKE \"%".$name."%\"";
}
}
}