Is there an easy way to parse a string for search terms including negative terms?
'this -that "the other thing" -"but not this" "-positive"'
would change to
array(
"positive" => array(
"this",
"the other thing",
"-positive"
),
"negative" => array(
"that",
"but not this"
)
)
so those terms could be used to search.
The code below will parse your query string and split it up into positive and negative search terms.
// parse the query string
$query = 'this -that "-that" "the other thing" -"but not this" ';
preg_match_all('/-*"[^"]+"|\S+/', $query, $matches);
// sort the terms
$terms = array(
    'positive' => array(),
    'negative' => array(),
);
foreach ($matches[0] as $match) {
    if ('-' == $match[0]) {
        $terms['negative'][] = trim(ltrim($match, '-'), '"');
    } else {
        $terms['positive'][] = trim($match, '"');
    }
}
print_r($terms);
Output
Array
(
[positive] => Array
(
[0] => this
[1] => -that
[2] => the other thing
)
[negative] => Array
(
[0] => that
[1] => but not this
)
)
For those looking for the same thing, I have created a gist with PHP and JavaScript versions:
https://gist.github.com/UziTech/8877a79ebffe8b3de9a2
function getSearchTerms($search) {
    $matches = null;
    preg_match_all("/-?\"[^\"]+\"|-?'[^']+'|\S+/", $search, $matches);
    // sort the terms
    $terms = [
        "positive" => [],
        "negative" => []
    ];
    foreach ($matches[0] as $i => $match) {
        $negative = ("-" === $match[0]);
        if ($negative) {
            $match = substr($match, 1);
        }
        if (($match[0] === '"' && substr($match, -1) === '"') || ($match[0] === "'" && substr($match, -1) === "'")) {
            $match = substr($match, 1, strlen($match) - 2);
        }
        if ($negative) {
            $terms["negative"][] = $match;
        } else {
            $terms["positive"][] = $match;
        }
    }
    return $terms;
}
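Once the terms are split, they still have to be applied to something. The following is only a minimal sketch of that step, assuming the searchable items are plain strings and that "matching" means a case-insensitive substring test. The helper name filterByTerms and the sample documents are made up for illustration.

```php
<?php
// Hypothetical helper (not part of the answers above): keep the documents that
// contain every positive term and none of the negative terms.
function filterByTerms(array $terms, array $documents): array
{
    return array_values(array_filter($documents, function (string $doc) use ($terms): bool {
        foreach ($terms['positive'] as $term) {
            if (stripos($doc, $term) === false) {
                return false; // a required term is missing
            }
        }
        foreach ($terms['negative'] as $term) {
            if (stripos($doc, $term) !== false) {
                return false; // an excluded term is present
            }
        }
        return true;
    }));
}

// Same shape as the array produced by the parsing code.
$terms = [
    'positive' => ['this', 'the other thing'],
    'negative' => ['that'],
];
$documents = [
    'this is the other thing',
    'this and that and the other thing',
    'nothing relevant here',
];
$result = filterByTerms($terms, $documents);
print_r($result);
```

For a database search the same two loops would build LIKE / NOT LIKE conditions instead, with the terms passed as bound parameters.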
I'm wondering if anyone can help me with the following regular expression: I can't match the multiline block CF.{Coordonnees Abonne}: when it is used with PHP's preg_match function.
Oddly, the regex seems to work when I test it online, even though the block ends up in a different group (regex101 example).
Here is the code:
<?php
$response = array(
1 => 'CF.{Temps}: 1',
2 => 'CF.{Etat}: return',
3 => 'CF.{Code}: 2',
4 => 'CF.{Values}: plaque',
5 => '',
6 => 'CF.{Coordonnees}: LA PERSONNE',
7 => ' ',
8 => ' 10000 LA VILLE',
9 => ' ',
10 => ' 0500235689',
11 => ' 0645788923',
12 => ' Login : test#mail.com',
13 => ' Password : PassWord!',
14 => '',
15 => 'CF.{Groupe}: 3',
16 => 'CF.{Date}: 4',
);
print_r(parseResponseBody($response));
function parseResponseBody(array $response, $delimiter = ':')
{
$responseArray = array();
$lastkey = null;
foreach ($response as $line) {
if(preg_match('/^([a-zA-Z0-9]+|CF\.{[^}]+})' . $delimiter . '\s(.*)|([a-zA-Z0-9].*)$/', $line, $matches)) {
$lastkey = $matches[1];
$responseArray[$lastkey] = $matches[2];
}
}
return $responseArray;
}
?>
Output :
Array
(
[CF.{Temps}] => 1
[CF.{Etat}] => return
[CF.{Code}] => 2
[CF.{Values}] => plaque
[CF.{Coordonnees}] => LA PERSONNE
[] =>
[CF.{Groupe}] => 3
[CF.{Date}] => 4
)
And here is the final result that I want to extract:
Array
(
[CF.{Temps}] => 1
[CF.{Etat}] => return
[CF.{Code}] => 2
[CF.{Values}] => plaque
[CF.{Coordonnees}] => LA PERSONNE
10000 LA VILLE
0500235689
0645788923
Login : test#mail.com
Password : PassWord!
[CF.{Groupe}] => 3
[CF.{Date}] => 4
)
You have to check whether the current line starts a new block or continues the previous one. Don't try to match both cases with a single pattern:
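function parseResponseBody(array $response, $delimiter = ':') {
    $array = [];
    $lastIndex = null;
    // quote the delimiter so it is always treated literally in the pattern
    $pattern = '~^\s*(CF\.{[^}]*})' . preg_quote($delimiter, '~') . '\s+(.*)~';
    foreach ($response as $line) {
        if (preg_match($pattern, $line, $matches)) {
            // a new "CF.{...}" key starts on this line
            $array[$lastIndex = $matches[1]] = $matches[2];
        } elseif ($lastIndex !== null && (bool) $line) {
            // continuation line: append it to the most recent key
            $array[$lastIndex] .= PHP_EOL . $line;
        }
    }
    return $array;
}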
I would do it this way:
function parse($response, $del=':', $nl="\n") {
    $result = [];
    $key = null;
    $pattern = sprintf('~(CF\.{[^}]+})%s \K.*~A', preg_quote($del, '~'));
    foreach ($response as $line) {
        if ( preg_match($pattern, $line, $m) ) {
            // trim trailing blank lines collected for the previous key
            if ( $key !== null )
                $result[$key] = rtrim($result[$key]);
            $key = $m[1];
            $result[$key] = $m[0];
        } elseif ( $key !== null ) {
            // no key on this line: append it to the last key's value
            $result[$key] .= $nl . $line;
        }
    }
    return $result;
}
var_export(parse($response));
The key is stored in the capture group 1 $m[1] but the whole match $m[0] returns only the value part (the \K feature discards all matched characters on its left from the match result). When the pattern fails, the current line is appended for the last key.
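To see the effect of \K in isolation, here is a small sketch using two of the sample lines from the question. The braces are escaped here purely for readability, and the variable names are illustrative.

```php
<?php
// Same idea as the pattern above: capture the key, then reset the match start
// with \K so that the overall match contains only the value.
// The A modifier anchors the match at the start of the subject.
$pattern = '~(CF\.\{[^}]+\}): \K.*~A';

// A line that starts a new key: group 1 is the key, the whole match is the value.
preg_match($pattern, 'CF.{Coordonnees}: LA PERSONNE', $m);
$key = $m[1];
$value = $m[0];

// A continuation line has no key, so the pattern does not match at all.
$continued = preg_match($pattern, '          10000 LA VILLE');

var_dump($key, $value, $continued);
```

Here $key is 'CF.{Coordonnees}', $value is 'LA PERSONNE' and $continued is 0, which is the case where the line gets appended to the previous key.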
The regex is fine, you just need to handle the case when there is no key:
function parseResponseBody(array $response, $delimiter = ':')
{
    $responseArray = array();
    $lastKey = null;
    foreach ($response as $line) {
        if(preg_match('/^([a-zA-Z0-9]+|CF\.{[^}]+})' . $delimiter . '\s(.*)|([a-zA-Z0-9].*)$/', $line, $matches)) {
            $key = $matches[1];
            if(empty($key)){
                // no key on this line: append it to the previous key's value
                $key = $lastKey;
                $responseArray[$key] .= PHP_EOL . $matches[3];
            }else{
                $responseArray[$key] = $matches[2];
            }
            $lastKey = $key;
        }
    }
    return $responseArray;
}
https://3v4l.org/rFIbk
The following PHP function replaces bad words with stars, but I need it to return one additional value indicating whether any bad words were found.
$badwords = array('dog', 'dala', 'bad3', 'ass');
$text = 'This is a dog. . Grass. is good but ass is bad.';
print_r( filterBadwords($text,$badwords));
function filterBadwords($text, array $badwords, $replaceChar = '*') {
$repu = preg_replace_callback(array_map(function($w) { return '/\b' . preg_quote($w, '/') . '\b/i'; }, $badwords),
function($match) use ($replaceChar) {
return str_repeat($replaceChar, strlen($match[0])); },
$text
);
return array('error' =>'Match/No Match', 'text' => $repu );
}// Func
Output if badwords found should be like
Array ( [error] => Match[text] => Bad word dog match. )
If no badwords found then
Array ( [error] => No Match[text] => Bad word match. )
You can use the following:
function filterBadwords($text, array $badwords, $replaceChar = '*') {
    //new bool var to see if there was any match
    $matched = false;
    $repu = preg_replace_callback(array_map(
        function($w)
        {
            return '/\b' . preg_quote($w, '/') . '\b/i';
        }, $badwords),
        //pass the $matched by reference
        function($match) use ($replaceChar, &$matched)
        {
            //if the $match array is not empty update $matched to true
            if(!empty($match))
            {
                $matched = true;
            }
            return str_repeat($replaceChar, strlen($match[0]));
        }, $text);
    //return response based on the bool value of $matched
    if($matched)
    {
        $return = array('error' =>'Match', 'text' => $repu );
    }
    else
    {
        $return = array('error' =>'No Match', 'text' => $repu );
    }
    return $return;
}
This passes $matched into the callback by reference, sets it whenever a replacement happens, and then builds the response based on its value.
Output(if matched):
array (size=2)
'error' => string 'Match' (length=5)
'text' => string 'This is a ***. . Grass. is good but *** is bad.'
Output(if none matched):
array (size=2)
'error' => string 'No Match' (length=8)
'text' => string 'This is a . . Grass. is good but is bad.'
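As an alternative to the by-reference flag, preg_replace_callback can report how many replacements it made through its fifth argument. A minimal sketch of that variant follows. The function name filterBadwordsCounted is made up here so that it does not clash with the function above.

```php
<?php
// Hypothetical variant of filterBadwords: let preg_replace_callback count the
// replacements instead of tracking a flag inside the callback.
function filterBadwordsCounted(string $text, array $badwords, string $replaceChar = '*'): array
{
    $patterns = array_map(function ($w) {
        return '/\b' . preg_quote($w, '/') . '\b/i';
    }, $badwords);

    $count = 0;
    $clean = preg_replace_callback(
        $patterns,
        function ($match) use ($replaceChar) {
            return str_repeat($replaceChar, strlen($match[0]));
        },
        $text,
        -1,     // no limit on the number of replacements
        $count  // filled with the total number of replacements done
    );

    return ['error' => $count > 0 ? 'Match' : 'No Match', 'text' => $clean];
}

$result = filterBadwordsCounted(
    'This is a dog. . Grass. is good but ass is bad.',
    ['dog', 'dala', 'bad3', 'ass']
);
print_r($result);
```

Because of the \b word boundaries, "Grass" is left alone while the standalone "ass" is replaced.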
<?php
$badwords = array('dog', 'dala', 'bad3', 'ass');
$text = 'This is a dog. . Grass. is good but ass is bad.';
$res=is_badword($badwords,$text);
echo "<pre>"; print_r($res);
function is_badword($badwords, $text)
{
$res=array('No Error','No Match');
foreach ($badwords as $name) {
if (stripos($text, $name) !== FALSE) {
$res=array($name,'Match');
return $res;
}
}
return $res;
}
?>
Output:
Array
(
[0] => dog
[1] => Match
)
I have the following array:
Array
(
[0] => Array
(
[data] => PHP
[attribs] => Array
(
)
[xml_base] =>
[xml_base_explicit] =>
[xml_lang] =>
)
[1] => Array
(
[data] => Wordpress
[attribs] => Array
(
)
[xml_base] =>
[xml_base_explicit] =>
[xml_lang] =>
)
)
and a variable like $var = 'Php, Joomla';
I have tried the following, but it is not working:
$key = in_multiarray('PHP', $array,"data");
function in_multiarray($elem, $array,$field)
{
$top = sizeof($array) - 1;
$bottom = 0;
while($bottom <= $top)
{
if($array[$bottom][$field] == $elem)
return true;
else
if(is_array($array[$bottom][$field]))
if(in_multiarray($elem, ($array[$bottom][$field])))
return true;
$bottom++;
}
return false;
}
So I want to check whether any value in $var exists in the array (case-insensitive).
How can I do it without a loop?
This should work for you:
(I put a few comments in the code to explain what's going on.)
<?php
//Array to search in
$array = array(
array(
"data" => "PHP",
"attribs" => array(),
"xml_base" => "",
"xml_base_explicit" => "",
"xml_lang" => ""
),
array(
"data" => "Wordpress",
"attribs" => array(),
"xml_base" => "",
"xml_base_explicit" => "",
"xml_lang" => "Joomla"
)
);
//Values to search
$var = "Php, Joomla";
//trim and strtolower all search values and put them in a array
$search = array_map(function($value) {
return trim(strtolower($value));
}, explode(",", $var));
//function to put all non array values into lowercase
function tolower($value) {
if(is_array($value))
return array_map("tolower", $value);
else
return strtolower($value);
}
//Search needle in haystack
function in_array_r($needle, $haystack, $strict = false) {
foreach ($haystack as $item) {
if (($strict ? $item === $needle : $item == $needle) || (is_array($item) && in_array_r($needle, $item, $strict))) {
return true;
}
}
return false;
}
//Search every value in array
foreach($search as $value) {
if(in_array_r($value, array_map("tolower", array_values($array))))
echo $value . " found<br />";
}
?>
Output:
php found
joomla found
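If only the "data" field needs to be checked, as in the question's own attempt, the explicit loops can be replaced with array functions. This is a sketch under that assumption, using a shortened copy of the question's array, and it needs PHP 5.5 or later for array_column.

```php
<?php
// Shortened copy of the array from the question.
$array = [
    ['data' => 'PHP', 'attribs' => [], 'xml_lang' => ''],
    ['data' => 'Wordpress', 'attribs' => [], 'xml_lang' => ''],
];
$var = 'Php, Joomla';

// Lowercased "data" value of every row.
$haystack = array_map('strtolower', array_column($array, 'data'));
// Lowercased, trimmed search values.
$needles = array_map('trim', explode(',', strtolower($var)));

// Search values that occur in the "data" column.
$found = array_values(array_intersect($needles, $haystack));
print_r($found);
```

$found ends up containing only "php", and the check "does any value exist" is simply whether $found is non-empty.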
As I understand it, you are trying to pass a string (e.g. 'php') and the key of the element to check ('data'),
and the value stored under that key can be either a single value or an array:
$key = in_multiarray("php", $array,"data");
var_dump($key);
function in_multiarray($elem, $array,$field)
{
$top = sizeof($array) - 1;
$bottom = 0;
while($bottom <= $top)
{
if(is_array($array[$bottom][$field]))
{
foreach($array[$bottom][$field] as $value)
{
if(strtolower(trim($value)) == strtolower(trim($elem)))
{
return true;
}
}
}
else if(strtolower(trim($array[$bottom][$field])) == strtolower(trim($elem)))
{
return true;
}
$bottom++;
}
return false;
}
I have the following input:
$input = [
0 => '$id000001',
1 => '$id000002',
2 => '$id000003',
3 => 'Alexandre'
];
$keywords = [
'$id000001' => 'function_name($+2)',
'$id000002' => '$user',
'$id000003' => '$-1 = $+1'
];
I would like to implement a function that will replace $input elements with $keywords elements, with the following output:
[
0 => 'function_name($+2)',
1 => '$user',
2 => '$-1 = $+1',
3 => 'Alexandre'
];
The catch is that my function has to replace every $(+|-)[0-9]+ placeholder (such as $+2 or $-1) with the value of the $input element it points to (after that element has itself been replaced), and then remove that element. The number is the row offset relative to the current index:
$-1 = $+1 will be replaced with $user = 'Alexandre'
function_name($+2) will be replaced with $-1 = $+1 (which is $user = 'Alexandre')
So, the final output will be:
[
0 => function_name($user = 'Alexandre')
]
OK, after fixing some infinite recursion problems, I came up with this:
function translate($input, array $keywords, $index = 0, $next = true)
{
if ((is_array($input) === true) &&
(array_key_exists($index, $input) === true))
{
$input[$index] = translate($input[$index], $keywords);
if (is_array($input[$index]) === true)
$input = translate($input, $keywords, $index + 1);
else
{
preg_match_all('/\$((?:\+|\-)[0-9]+)/i', $input[$index], $matches, PREG_SET_ORDER);
foreach ($matches as $match)
{
$element = 'false';
$offset = ($index + intval($match[1]));
$input = translate($input, $keywords, $offset, false);
if (array_key_exists($offset, $input) === true)
{
$element = $input[$offset];
unset($input[$offset]);
}
$input[$index] = str_replace($match[0], $element, $input[$index]);
}
if (empty($matches) === false)
$index--;
if ($next === true)
$input = translate(array_values($input), $keywords, $index + 1);
}
}
else if (is_array($input) === false)
$input = str_replace(array_keys($keywords), $keywords, $input);
return $input;
}
Maybe someone can find some optimizations.
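One possible simplification, offered only as a sketch: substitute the keywords first, then resolve each placeholder recursively against the original row positions and drop the rows that were inlined. The function name resolveRows is hypothetical, the circular-reference guard is an addition, and like the function above it inserts the raw value, so it does not add quotes around Alexandre.

```php
<?php
// Hypothetical alternative to translate(): offsets are relative to the
// original (not re-indexed) row positions.
function resolveRows(array $input, array $keywords): array
{
    // Step 1: keyword substitution on every row.
    $rows = array_map(function ($row) use ($keywords) {
        return str_replace(array_keys($keywords), array_values($keywords), $row);
    }, array_values($input));

    $consumed = []; // rows that were inlined into another row
    $visiting = []; // rows currently being resolved (cycle guard)

    // Step 2: replace each $+N / $-N with the resolved row it points to.
    $resolve = function (int $i) use (&$resolve, &$rows, &$consumed, &$visiting): string {
        if (isset($visiting[$i])) {
            throw new RuntimeException("Circular reference at row $i");
        }
        $visiting[$i] = true;
        $rows[$i] = preg_replace_callback(
            '/\$([+-]\d+)/',
            function (array $m) use ($i, &$resolve, &$rows, &$consumed): string {
                $target = $i + (int) $m[1];
                if (!array_key_exists($target, $rows)) {
                    return 'false'; // same fallback as translate() uses
                }
                $consumed[$target] = true;
                return $resolve($target);
            },
            $rows[$i]
        );
        unset($visiting[$i]);
        return $rows[$i];
    };

    foreach (array_keys($rows) as $i) {
        if (!isset($consumed[$i])) {
            $resolve($i);
        }
    }

    // Step 3: keep only the rows that were not inlined elsewhere.
    return array_values(array_diff_key($rows, $consumed));
}

$input = ['$id000001', '$id000002', '$id000003', 'Alexandre'];
$keywords = [
    '$id000001' => 'function_name($+2)',
    '$id000002' => '$user',
    '$id000003' => '$-1 = $+1',
];
$result = resolveRows($input, $keywords);
print_r($result);
```

With the sample input this leaves a single row, function_name($user = Alexandre).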
I'm interested in knowing whether I can detect inflections (e.g. dogs/dog), remove unimportant words ("made in the usa" -> "in" and "the" are not important), etc. in the search string entered by the user for the Magento search engine, without hard-coding many such scenarios in one big PHP code block. I can process the search string to a certain degree, but the result looks messy and ugly.
Any suggestions or pointers for making it an "intelligent" search engine?
Use this class:
class Inflection
{
static $plural = array(
'/(quiz)$/i' => "$1zes",
'/^(ox)$/i' => "$1en",
'/([m|l])ouse$/i' => "$1ice",
'/(matr|vert|ind)(?:ix|ex)$/i' => "$1ices",
'/(x|ch|ss|sh)$/i' => "$1es",
'/([^aeiouy]|qu)y$/i' => "$1ies",
'/(hive)$/i' => "$1s",
'/(?:([^f])fe|([lr])f)$/i' => "$1$2ves",
'/(shea|lea|loa|thie)f$/i' => "$1ves",
'/sis$/i' => "ses",
'/([ti])um$/i' => "$1a",
'/(tomat|potat|ech|her|vet)o$/i'=> "$1oes",
'/(bu)s$/i' => "$1ses",
'/(alias)$/i' => "$1es",
'/(octop)us$/i' => "$1i",
'/(ax|test)is$/i' => "$1es",
'/(us)$/i' => "$1es",
'/s$/i' => "s",
'/$/' => "s"
);
static $singular = array(
'/(quiz)zes$/i' => "$1",
'/(matr)ices$/i' => "$1ix",
'/(vert|ind)ices$/i' => "$1ex",
'/^(ox)en$/i' => "$1",
'/(alias)es$/i' => "$1",
'/(octop|vir)i$/i' => "$1us",
'/(cris|ax|test)es$/i' => "$1is",
'/(shoe)s$/i' => "$1",
'/(o)es$/i' => "$1",
'/(bus)es$/i' => "$1",
'/([m|l])ice$/i' => "$1ouse",
'/(x|ch|ss|sh)es$/i' => "$1",
'/(m)ovies$/i' => "$1ovie",
'/(s)eries$/i' => "$1eries",
'/([^aeiouy]|qu)ies$/i' => "$1y",
'/([lr])ves$/i' => "$1f",
'/(tive)s$/i' => "$1",
'/(hive)s$/i' => "$1",
'/(li|wi|kni)ves$/i' => "$1fe",
'/(shea|loa|lea|thie)ves$/i'=> "$1f",
'/(^analy)ses$/i' => "$1sis",
'/((a)naly|(b)a|(d)iagno|(p)arenthe|(p)rogno|(s)ynop|(t)he)ses$/i' => "$1$2sis",
'/([ti])a$/i' => "$1um",
'/(n)ews$/i' => "$1ews",
'/(h|bl)ouses$/i' => "$1ouse",
'/(corpse)s$/i' => "$1",
'/(us)es$/i' => "$1",
'/s$/i' => ""
);
static $irregular = array(
'move' => 'moves',
'foot' => 'feet',
'goose' => 'geese',
'sex' => 'sexes',
'child' => 'children',
'man' => 'men',
'tooth' => 'teeth',
'person' => 'people',
'admin' => 'admin'
);
static $uncountable = array(
'sheep',
'fish',
'deer',
'series',
'species',
'money',
'rice',
'information',
'equipment'
);
public static function pluralize( $string )
{
// save some time in the case that singular and plural are the same
if ( in_array( strtolower( $string ), self::$uncountable ) )
return $string;
// check for irregular singular forms
foreach ( self::$irregular as $pattern => $result )
{
$pattern = '/' . $pattern . '$/i';
if ( preg_match( $pattern, $string ) )
return preg_replace( $pattern, $result, $string);
}
// check for matches using regular expressions
foreach ( self::$plural as $pattern => $result )
{
if ( preg_match( $pattern, $string ) )
return preg_replace( $pattern, $result, $string );
}
return $string;
}
public static function singularize( $string )
{
// save some time in the case that singular and plural are the same
if ( in_array( strtolower( $string ), self::$uncountable ) )
return $string;
// check for irregular plural forms
foreach ( self::$irregular as $result => $pattern )
{
$pattern = '/' . $pattern . '$/i';
if ( preg_match( $pattern, $string ) )
return preg_replace( $pattern, $result, $string);
}
// check for matches using regular expressions
foreach ( self::$singular as $pattern => $result )
{
if ( preg_match( $pattern, $string ) )
return preg_replace( $pattern, $result, $string );
}
return $string;
}
public static function pluralize_if($count, $string)
{
if ($count == 1)
return "1 $string";
else
return $count . " " . self::pluralize($string);
}
}
If you have time, read up on how inflection works in general: http://en.wikipedia.org/wiki/Inflection
You can also keep all the inflection data in arrays or in XML. Look at how CodeIgniter exposes inflection in a friendly way: http://ellislab.com/codeigniter/user-guide/helpers/inflector_helper.html
Many frameworks have built-in inflection support, but mostly for English only. For other languages you will have to write your own rules, or look at unicode.org for inflection standards for the languages you need.
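The other half of the question, dropping unimportant words, can be handled with a plain stop-word list. A minimal sketch follows. The word list and the function name significantWords are assumptions for illustration, and a real store would want a longer, language-specific list.

```php
<?php
// Hypothetical stop-word list; extend it for your catalogue and language.
const STOP_WORDS = ['a', 'an', 'and', 'in', 'of', 'the', 'to'];

// Lowercase the query, split it on non-alphanumeric characters and drop stop words.
function significantWords(string $query, array $stopWords = STOP_WORDS): array
{
    $words = preg_split('/[^a-z0-9]+/', strtolower($query), -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_diff($words, $stopWords));
}

$words = significantWords('Made in the USA');
print_r($words);
```

Each remaining word could then be passed through Inflection::singularize() before it is handed to the search.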