PHP or Regex: string matches ALL characters in search pattern

I need to build a regex that will look for the occurrence of all characters in an inputted string.
For example, if the user inputs "equ" as the search parameter, "queen" and "obsequious" would match, but "qadaffi" and "tour" and "quail" would not.
Obviously I'm trying the basic /[equ]/ pattern and it's looking for "at least one of".
If there's a basic PHP function that would do this without regex, then that would be acceptable. But sad.

/[equ]/ is a character class which means it matches just one character. Try /.*equ.*/ instead. I haven't used the php matching functions, so the .*'s might be unnecessary.
Edit: Apparently they're definitely unnecessary, so just use /equ/.

Yeah, agreed that a simple for loop would be more efficient in your case.
Assuming $query = "que"; and $input = "queen"; or anything else:
$matched = true;
$len = strlen($query); // byte length; multibyte input needs mb_strlen() together with mb_substr()
for ($i = 0; $i < $len; $i++) {
    // Compare against false explicitly: a match at position 0 is still a match.
    if (strpos($input, $query[$i]) === false) {
        $matched = false;
        break;
    }
}
A very primitive loop to begin with.

@sln
@jancha
I've implemented a timer to measure the speeds. Oddly, I'm finding that the regex is faster than the loop in my code. Is this right?
$haystack = "Obsequious";
$needle = array('e','q','u');
$regex = "/^(?=.*e)(?=.*q)(?=.*u)/";
function trial(){
GLOBAL $haystack;
GLOBAL $needle;
foreach ($needle as $n) {
if (strpos($haystack, $n) === false) return false; // strpos() returns 0 for a match at the first character
}
return true;
}
function trial2(){
GLOBAL $haystack;
GLOBAL $regex;
if (preg_match($regex, $haystack)) {
return true;
}
return false;
}
print time_trial("trial");
print time_trial("trial2");
function time_trial($function, $iterations=100000){
$before = microtime(true);
for ($i=0 ; $i<$iterations ; $i++) {
call_user_func($function);
}
$after = microtime(true);
$total = round($after-$before, 4);
return "Executed timed trial '$function' // $iterations iterations // $total seconds<br />\n";
}

A regex is probably not the tool for this job. I'm not a PHP expert, but you could loop through each required character and check that it exists in the target string.
Otherwise, a regex does about the same thing, but it's slow and overkill:
/^(?=.*e)(?=.*q)(?=.*u)/
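If the regex route is taken anyway, the lookahead pattern above can be generated from the user's input rather than written by hand. The following is only a sketch: the helper name containsAllChars() is hypothetical, the match is case-sensitive, and it assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper: true when every character of $needle occurs somewhere in $haystack.
function containsAllChars($needle, $haystack)
{
    // Split into characters (UTF-8 aware) and ignore duplicates.
    $chars = array_unique(preg_split('//u', $needle, -1, PREG_SPLIT_NO_EMPTY));

    $pattern = '/^';
    foreach ($chars as $char) {
        // One lookahead per required character; preg_quote() escapes regex metacharacters.
        $pattern .= '(?=.*' . preg_quote($char, '/') . ')';
    }

    // s: let . match newlines, u: treat pattern and subject as UTF-8.
    return preg_match($pattern . '/su', $haystack) === 1;
}
```

For the question's example, containsAllChars('equ', 'queen') is true, while containsAllChars('equ', 'quail') is false because "quail" has no "e".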

Related

String between string with array in PHP, array order ISSUE

I'm facing an issue with a function that gets a string between two other strings.
function string_between($str, $starting_word, $ending_word) {
$subtring_start = strpos($str, $starting_word);
$subtring_start += strlen($starting_word);
foreach ($ending_word as $a){
$size = strpos($str, $a, $subtring_start) - $subtring_start;
}
return substr($str, $subtring_start, $size);
}
The issue is that the function searches for the first ending_word in the array.
An example will be easier to understand:
$array_a = ['the', 'amen']; // Starting strings
$array_b = [',', '.']; // Ending strings
$str = "Hello, the world. Then, it is over.";
Expected result:
"the world."
Current result:
"the world. Then,"
The function treats "," as the ending word because it is the first element of $array_b. However, "." is the first ending string that actually appears in the text after the starting word "the".
How can I make sure the function goes through the text and stops at the first element in the $str present in the array_b, whatever the position in the array?
Any idea?
Basically, you need to stop looping over the ending words as soon as one of them is found after the starting word.
That way the search ends at the first ending word in the array that occurs in the text. Here is the more complete code with other fixes:
function stringBetween($string, $startingWords, $endingWords) {
    foreach ($startingWords as $startingWord) {
        $substringStart = strpos($string, $startingWord);
        // strpos() returns false for "not found" and 0 for a match at the very start.
        if ($substringStart === false) {
            continue;
        }
        foreach ($endingWords as $endingWord) {
            $end = strpos($string, $endingWord, $substringStart + strlen($startingWord));
            if ($end !== false) {
                // Stop at the first ending word (in array order) found after the start.
                return substr($string, $substringStart, $end - $substringStart + strlen($endingWord));
            }
        }
    }
    return null;
}
$startArr = array('the', 'amen'); // Starting strings
$endArr = array('.', ','); // Ending strings
$str = "Hello, the world. Then, it is over.";
echo stringBetween($str, $startArr, $endArr); // the world.
This type of problem is best solved with PCRE regexes; the function needs only a couple of lines:
function string_between($str, $starts, $ends) {
    preg_match("/(?:{$starts}).*?(?:{$ends})/mi", $str, $m);
    // $m stays empty when nothing matches.
    return isset($m[0]) ? $m[0] : null;
}
Then call it like this:
echo string_between("Hello, the world. Then, it is over.", 'the|amen', ',|\.');
This produces: the world.
The trick of stopping at the nearest matching ending symbol is done with a non-greedy regex search, indicated by the question mark in the pattern .*?. You can even extend this function to accept arrays of starting and ending symbols; in that case, modify the function (possibly with implode('|', $arr)) to concatenate the symbols into a regex alternation.
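The array-based extension mentioned here could look roughly like the sketch below. The name string_between_any() is hypothetical, both arrays are assumed to be non-empty, and preg_quote() is added so that literal strings such as "." are not interpreted as regex syntax.

```php
<?php
// Hypothetical array-based variant of string_between().
function string_between_any($str, array $starts, array $ends)
{
    // Escape each literal so characters such as "." lose their regex meaning.
    $quote = function ($word) {
        return preg_quote($word, '/');
    };
    $startGroup = implode('|', array_map($quote, $starts));
    $endGroup   = implode('|', array_map($quote, $ends));

    // .*? is lazy, so the match stops at the nearest ending string.
    if (preg_match("/(?:{$startGroup}).*?(?:{$endGroup})/is", $str, $m)) {
        return $m[0];
    }
    return null;
}

echo string_between_any("Hello, the world. Then, it is over.", ['the', 'amen'], [',', '.']);
// the world.
```

Because the regex engine picks the nearest ending string itself, the order of the entries in the ending array does not matter here.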
Edited version
This works now. Iterate over the test strings in the first array, looking for the position of each one in the text. When one is found, search for every string from the second array, starting at the end of the first match.
To get the shortest hit, I store each position found for the second array and take the minimum.
You can try it at http://sandbox.onlinephpfunctions.com/code/0f1e5c97da62b4daaf0e49f52271fe288d1cacbb
$array_a = array('the', 'amen');
$array_b = array(',', '.', '#');
$str = "Hello, the world. Then, it is over.";
function searchString($str, $array_a, $array_b) {
    foreach ($array_a as $test) {
        $pos = strpos($str, $test);
        if ($pos === false) continue;
        $posStart = $pos + strlen($test);
        $found = [];
        foreach ($array_b as $test2) {
            $pos2 = strpos($str, $test2, $posStart);
            // INF marks "not found" so that min() ignores it.
            $found[] = ($pos2 !== false) ? $pos2 : INF;
        }
        $min = min($found);
        if ($min !== INF)
            return substr($str, $pos, $min - $pos) . $str[$min];
    }
    return '';
}
echo searchString($str, $array_a, $array_b);

preg_match() match two cases with one pattern?

How would I go about splitting one array into two arrays, depending on what each element starts with, using preg_match()?
I know how to do this using one expression, but I don't know how to use two expressions.
So far I have done this (don't ask why I'm not using strpos(); I need to use regex):
$gen = array(
'F1',
'BBC450',
'BBC566',
'F2',
'F31',
'SOMETHING123',
'SOMETHING456'
);
$f = array();
$bbc = array();
foreach($gen as $part) {
if(preg_match('/^F/', $part)) {
// Add to F array
array_push($f, $part);
} else if(preg_match('/^BBC/', $part)) {
// Add to BBC array
array_push($bbc, $part);
} else {
// Not F or BBC
}
}
So my question is: is it possible to do this using 1 preg_match() function?
Please ignore the SOMETHING part in the array, it's to show that using just one if else statement wouldn't solve this.
Thanks.
You can use an alternation together with the third argument to preg_match(), which receives the part of the string that matched.
$prefix = preg_match('/^(?:F|BBC)/', $part, $match) ? $match[0] : null;
switch ($prefix) {
    case 'F':
        $f[] = $part;
        break;
    case 'BBC':
        $bbc[] = $part;
        break;
    default:
        // Not F or BBC
}
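It is even possible without any loop, switch, or anything else (which is faster and more efficient than the accepted answer's solution).
<?php
// Join with "\n" so that no stray "\r" ends up inside a captured line.
preg_match_all("/(?:(^F.*$)|(^BBC.*$))/m", implode("\n", $gen), $matches);
// A group that did not take part in a match comes back as an empty string, so drop those.
$f = array_values(array_filter($matches[1], 'strlen'));
$bbc = array_values(array_filter($matches[2], 'strlen'));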
You can find an interactive explanation of the regular expression at regex101.com which I created for you.
The (not desired) strpos approach is nearly five times faster.
<?php
$c = count($gen);
$f = $bbc = array();
for ($i = 0; $i < $c; ++$i) {
if (strpos($gen[$i], "F") === 0) {
$f[] = $gen[$i];
}
elseif (strpos($gen[$i], "BBC") === 0) {
$bbc[] = $gen[$i];
}
}
Regular expressions are nice, but they are no silver bullet for everything.

Filter a set of bad words out of a PHP array

I have a PHP array of about 20,000 names, I need to filter through it and remove any name that has the word job, freelance, or project in the name.
Below is what I have started so far, it will cycle through the array and add the cleaned item to build a new clean array. I need help matching the "bad" words though. Please help if you can
$data1 = array('Phillyfreelance' , 'PhillyWebJobs', 'web2project', 'cleanname');
// freelance
// job
// project
$cleanArray = array();
foreach ($data1 as $name) {
# if a term is matched, we remove it from our array
if(preg_match('~\b(freelance|job|project)\b~i',$name)){
echo 'word removed';
}else{
$cleanArray[] = $name;
}
}
Right now it only matches whole words: if "freelance" is a name in the array, that item is removed, but something like ImaFreelancer is not. I need to remove any name that contains one of the matching words at all.
A regular expression is not really necessary here — it'd likely be faster to use a few stripos calls. (Performance matters on this level because the search occurs for each of the 20,000 names.)
With array_filter, which only keeps elements in the array for which the callback returns true:
$data1 = array_filter($data1, function($el) {
return stripos($el, 'job') === FALSE
&& stripos($el, 'freelance') === FALSE
&& stripos($el, 'project') === FALSE;
});
Here's a more extensible / maintainable version, where the list of bad words can be loaded from an array rather than having to be explicitly denoted in the code:
$data1 = array_filter($data1, function($el) {
$bad_words = array('job', 'freelance', 'project');
$word_okay = true;
foreach ( $bad_words as $bad_word ) {
if ( stripos($el, $bad_word) !== FALSE ) {
$word_okay = false;
break;
}
}
return $word_okay;
});
I'd be inclined to use the array_filter function and change the regex so that it no longer requires word boundaries:
$data1 = array('Phillyfreelance' , 'PhillyWebJobs', 'web2project', 'cleanname');
$cleanArray = array_filter($data1, function($w) {
return !preg_match('~(freelance|project|job)~i', $w);
});
Use of the preg_match() function and some regular expressions should do the trick; this is what I came up with and it worked fine on my end:
<?php
$data1=array('JoomlaFreelance','PhillyWebJobs','web2project','cleanname');
$cleanArray=array();
$badWords='/(job|freelance|project)/i';
foreach($data1 as $name) {
if(!preg_match($badWords,$name)) {
$cleanArray[]=$name;
}
}
echo implode(',', $cleanArray); // separator first; the reversed argument order was removed in PHP 8
?>
Which returned:
cleanname
Personally, I would do something like this:
$badWords = ['job', 'freelance', 'project'];
$names = ['JoomlaFreelance', 'PhillyWebJobs', 'web2project', 'cleanname'];
// Escape characters with special meaning in regular expressions.
$quotedBadWords = array_map(function($word) {
return preg_quote($word, '/');
}, $badWords);
// Create the regular expression.
$badWordsRegex = implode('|', $quotedBadWords);
// Filter out any names that match the bad words.
$cleanNames = array_filter($names, function($name) use ($badWordsRegex) {
return preg_match('/' . $badWordsRegex . '/i', $name) === 0; // 0 means no match; false is only returned on error
});
This should be what you want:
if (!preg_match('/(freelance|job|project)/i', $name)) {
$cleanArray[] = $name;
}

Php index of element in the array, update element

For some reason $pos is always < 0. The indexOf function works great; I use it in other code without problems.
For some reason, even after I add the element with array_push($groups, $tempDon); the next iteration continues to return -1.
$donations = $this->getInstitutionDonations($post->ID);
$groups=array();
foreach( $donations as $don ) : setup_postdata($don);
$pos = $this->indexOf($don, $groups);
print_r($pos);
if($pos < 0)
{
$tempDom = $don;
$tempDon->count = 1;
array_push($groups, $tempDon);
}
else
{
$tempDom = $groups[$pos];
$tempDon->count++;
array_splice($tempDon);
array_push($groups, $tempDon);
echo '<br><br><br>ahhhhhhhhhh<br><br>';
}
endforeach;
protected function indexOf($needle, $haystack) { // conversion of JavaScripts most awesome
for ($i=0;$i<count($haystack);$i++) { // indexOf function. Searches an array for
if ($haystack[$i] == $needle) { // a value and returns the index of the *first*
return $i; // occurance
}
}
return -1;
}
This looks like an issue of poor proofreading to me (note $tempDom vs $tempDon):
$tempDom = $don;
$tempDon->count = 1;
array_push($groups, $tempDon);
Your else block has similar issues.
I also completely agree with @hakre's comment regarding syntax inconsistencies.
EDIT
I'd also like to recommend that you make use of PHP's built-in array_search function in the body of your indexOf method rather than rolling your own.
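As a rough sketch of that suggestion, the body of indexOf could delegate to array_search(). It is written here as a standalone function rather than a protected method, purely for illustration.

```php
<?php
// Standalone sketch of the question's indexOf() method, built on array_search().
function indexOf($needle, array $haystack)
{
    // array_search() compares loosely (==) by default, like the original loop,
    // and returns false when nothing matches.
    $key = array_search($needle, $haystack);

    // Check with === so that a hit at key 0 is not mistaken for "not found".
    return $key === false ? -1 : $key;
}
```

The -1 return value is kept so that existing callers checking $pos < 0 continue to work.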

php: dynamic preg_replace

The code below is not a functioning method; it's just written to help you understand what I'm trying to do.
// $i = occurrence to replace
// $r = content to replace
private function inject($i, $r) {
// regex matches anything in the format {value|:value}
$output = preg_replace('/\{(.*?)\|\:(.*?)\}/', '$r', $this->source);
$output[$i]
}
How do I find the $i-th occurrence in $output and replace it with $r?
Note: all I want to do is use $i (which is a number) to pick which occurrence of the preg_replace pattern gets replaced. For example, I might want to replace the second occurrence of the pattern with the variable $r.
I think you can only accomplish such occurrence counting with a callback:
private function inject($i, $r) {
    $this->i = $i;
    $this->r = $r;
    // regex matches anything in the format {value|:value}
    $output = preg_replace_callback('/\{(.*?)\|\:(.*?)\}/',
        array($this, "inject_cb"), $this->source);
    return $output;
}
function inject_cb($match) {
    // Count down; the replacement happens on the call where the counter is 0.
    if ($this->i--) {
        return $match[0];
    }
    else {
        return $this->r;
    }
}
It leaves the first $i matches as they are and substitutes the temporary $this->r once, when the countdown reaches zero. This could also be done with a closure to avoid the ->i and ->r properties.
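A closure-based variant might look like the sketch below. The free function injectAt() and its $source parameter are hypothetical stand-ins for the method and $this->source, and $i is assumed to be a zero-based integer index, matching the countdown behaviour above.

```php
<?php
// Hypothetical free function; $source stands in for $this->source from the question.
function injectAt($source, $i, $r)
{
    $seen = 0; // number of matches visited so far

    return preg_replace_callback(
        '/\{(.*?)\|\:(.*?)\}/',
        function ($match) use (&$seen, $i, $r) {
            // Replace only the match whose zero-based index equals $i.
            return ($seen++ === $i) ? $r : $match[0];
        },
        $source
    );
}

echo injectAt('{a|:b} and {c|:d}', 1, 'X'); // {a|:b} and X
```

Capturing $seen by reference keeps the counter local to one call, so no temporary object properties are needed.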
