Php preg_match with wild card characters


I have some zip codes in an array, some of which include a wildcard character, like this:
$zip_codes = array( '12556', '765547', '234*', '987*' );
$target_zip = '2347890';
To check whether the target zip is already present in the array, I am doing this:
foreach( $zip_codes as $zip ) {
if ( preg_match( "/{$target_zip}.*$/i", $zip ) ) {
echo 'matched';
break;
}
else {
echo 'not matched';
}
}
But it's not matching the zip at all. Can someone tell me what the issue is here?

You need to turn your $zip values into valid regular expressions by converting * into .* (or perhaps \d*); then you can test them against $target_zip:
$zip_codes = array( '12556', '765547', '234*', '987*' );
$target_zip = '2347890';
foreach( $zip_codes as $zip ) {
echo $zip;
if (preg_match('/' . str_replace('*', '.*', $zip) . '/', $target_zip)) {
echo ' matched'. PHP_EOL;
break;
}
else {
echo ' not matched' . PHP_EOL;
}
}
Output:
12556 not matched
765547 not matched
234* matched
You haven't indicated whether you want the value in $zip_codes to match the entire $target_zip value or just part of it. The code above will work for just part (i.e. 234 will match 12345); if you don't want that, change the regex construction to:
if (preg_match('/^' . str_replace('*', '.*', $zip) . '$/', $target_zip)) {
The anchors will ensure that $zip matches the entirety of $target_zip.
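A minimal sketch of the anchored variant, assuming zip codes contain only digits and that * should stand for any run of digits (possibly empty). The helper name zip_matches is hypothetical; preg_quote() is used so that nothing except * can act as a regex metacharacter.

```php
<?php
// Hypothetical helper: returns true when $target matches $pattern,
// where '*' in $pattern stands for any run of digits (possibly empty).
function zip_matches(string $pattern, string $target): bool
{
    // Escape regex metacharacters first; preg_quote() turns '*' into '\*'.
    $quoted = preg_quote($pattern, '/');
    // Then swap the escaped asterisk for a digit wildcard and anchor both ends.
    $regex = '/^' . str_replace('\*', '\d*', $quoted) . '$/';
    return preg_match($regex, $target) === 1;
}

$zip_codes = array('12556', '765547', '234*', '987*');
$target_zip = '2347890';

$matched = false;
foreach ($zip_codes as $zip) {
    if (zip_matches($zip, $target_zip)) {
        $matched = true;
        break;
    }
}
echo $matched ? 'matched' : 'not matched', PHP_EOL;
```

Because the pattern is anchored, a plain entry such as '234' only matches the target '234' itself, while '234*' matches any zip that starts with 234.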

One of the problems is that 234* in a regex means 23 followed by any number of 4s, not 234 followed by anything.
The other problem is that you anchor the match at the end (using $) but not at the start, so 789 (with the .* appended) will also match, as it's in the middle of the target. In this code, I use ^{$zip}$, with * replaced by .* to match any trailing characters:
$zip_codes = array( '12556', '765547', '789', '234', '234*', '987*' );
$target_zip = '2347890';
foreach( $zip_codes as $zip ) {
$zip = str_replace("*", ".*", $zip);
if ( preg_match( "/^{$zip}$/i", $target_zip ) ) {
echo $zip.' matched'.PHP_EOL;
break;
}
else {
echo $zip.' not matched'.PHP_EOL;
}
}

You don't need regex here.
You can check for a * and, if there is one, compare everything before it with the same number of leading characters of the target; otherwise compare the whole string.
$zip_codes = array( '12556', '765547', '234*', '987*' );
$target_zip = '2347890';
foreach($zip_codes as $zip){
if(strpos($zip, "*") !== false){
// e.g. "234*" without the "*" is "234", which is compared with substr("2347890", 0, 3)
if(substr($zip, 0, -1) == substr($target_zip, 0, strlen($zip)-1)){
echo "wildcard match";
}
}else{
if($zip == $target_zip){
echo "exact match";
}
}
}
https://3v4l.org/hIGER
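The same idea can be wrapped in a small function. This is only a sketch, assuming the wildcard can appear only as the last character of an entry; zip_matches_plain is a hypothetical name.

```php
<?php
// Hypothetical helper: prefix match for patterns ending in '*',
// exact match otherwise. No regular expressions involved.
function zip_matches_plain(string $pattern, string $target): bool
{
    if (substr($pattern, -1) === '*') {
        $prefix = substr($pattern, 0, -1);
        // strncmp() compares only the first strlen($prefix) bytes.
        return strncmp($target, $prefix, strlen($prefix)) === 0;
    }
    return $pattern === $target;
}

$zip_codes = array('12556', '765547', '234*', '987*');
foreach ($zip_codes as $zip) {
    echo $zip, zip_matches_plain($zip, '2347890') ? ' matched' : ' not matched', PHP_EOL;
}
```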

The issue is that in the loop, the pattern is always the same:
if ( preg_match( "/2347890.*$/i", $zip ) ) {
I think you meant to use the value of $zip as part of the pattern, but that has its own issue: in 234* the * is a quantifier that repeats the last digit zero or more times:
if ( preg_match( "/234*.*$/i", $zip ) ) {
^^
As an alternative, you could extract the digits from each entry in $zip_codes using a capturing group, followed by an optional trailing *:
^(\d+)\**$
Then use strpos to check whether $target_zip starts with the extracted digits.
$zip_codes = array( '12556', '765547', '234*', '987*', '237' );
$target_zip = '2347890';
foreach($zip_codes as $zip ) {
$digitsOnly = preg_replace("~^(\d+)\**$~", "$1", $zip);
if (strpos($target_zip, $digitsOnly) === 0) {
echo "$zip matched $target_zip" . PHP_EOL;
break;
}
else {
echo "$zip not matched $target_zip" . PHP_EOL;
}
}
Output
12556 not matched 2347890
765547 not matched 2347890
234* matched 2347890

Related

Parse url with pattern in PHP?

How to determine, using regexp or something else in PHP, that following urls match some patterns with tokens (url => pattern):
node/11221 => node/%node
node/38429/news => node/%node/news
album/34234/shadowbox/321023 => album/%album/shadowbox/%photo
Thanks in advance!
Update 1
Wrote the following script:
<?php
$patterns = [
"node/%node",
"node/%node/news",
"album/%album/shadowbox/%photo",
"media/photo",
"blogs",
"news",
"node/%node/players",
];
$url = "node/11111/news";
foreach ($patterns as $pattern) {
$result_pattern = preg_replace("/\/%[^\/]+/x", '/*', $pattern);
$to_replace = ['/\\\\\*/']; // asterisks
$replacements = ['[^\/]+'];
$result_pattern = preg_quote($result_pattern, '/');
$result_pattern = '/^(' . preg_replace($to_replace, $replacements, $result_pattern) . ')$/';
if (preg_match($result_pattern, $url)) {
echo "<pre>" . $pattern . "</pre>";
}
}
?>
Could anyone review whether this code is good enough? And also explain why there are so many slashes in this part: $to_replace = ['/\\\\\*/']; (I found this exact replacement on the Internet.)
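On the slashes, a sketch of how that literal is read in two stages may help. In a single-quoted PHP string only \\ and \' are escape sequences, so the literal holds six characters. The regex engine then reads \\ as a literal backslash and \* as a literal asterisk, which is the backslash-asterisk pair that preg_quote() produced earlier. The variable names below are illustrative only.

```php
<?php
$literal = '/\\\\\*/';
echo $literal, PHP_EOL;          // prints /\\\*/  (what the regex engine receives)
echo strlen($literal), PHP_EOL;  // prints 6

// preg_quote() escapes both the '/' delimiter and the asterisk.
$quoted = preg_quote('node/*', '/');
echo $quoted, PHP_EOL;           // prints node\/\*

// The pattern therefore has to find a backslash followed by an asterisk
// and swap that pair for "one or more non-slash characters".
$regex = '/^(' . preg_replace($literal, '[^\/]+', $quoted) . ')$/';
var_dump(preg_match($regex, 'node/11111'));       // int(1)
var_dump(preg_match($regex, 'node/11111/news'));  // int(0)
```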

If you know the format beforehand you can use preg_match. For example, in the first pattern you know %node can only be digits. Matching against multiple patterns is just as easy: store each regex in the array:
$patterns = array(
'node/%node' => '|node/[0-9]+$|',
'node/%node/news' => '|node/[0-9]+/news|',
'album/%album/shadowbox/%photo' => '|album/[0-9]+/shadowbox/[0-9]+|',
'media/photo' => '|media/photo|',
'blogs' => '|blogs|',
'news' => '|news|',
'node/%node/players' => '|node/[0-9]+/players|',
);
$url = "node/11111/players";
foreach ($patterns as $pattern => $regex) {
preg_match($regex, $url, $results);
if (!empty($results)) {
echo "<pre>" . $pattern . "</pre>";
}
}
Notice how I added the end anchor $ to the end of the first rule; this ensures that it doesn't also match URLs meant for the second rule.
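To illustrate the anchoring point, here is a small sketch (the sample URL and the matching_rules helper are made up for illustration). Without ^ and $, a rule matches whenever its text appears anywhere in the URL, so several rules can fire for one URL.

```php
<?php
$url = 'node/11111/news';

// Unanchored: both rules match, because 'news' occurs inside the URL.
$loose = array(
    'node/%node/news' => '|node/[0-9]+/news|',
    'news'            => '|news|',
);
// Anchored at both ends: only the rule describing the whole URL matches.
$strict = array(
    'node/%node/news' => '|^node/[0-9]+/news$|',
    'news'            => '|^news$|',
);

// Hypothetical helper: names of all rules whose regex matches $url.
function matching_rules(array $rules, string $url): array
{
    $hits = array();
    foreach ($rules as $name => $regex) {
        if (preg_match($regex, $url) === 1) {
            $hits[] = $name;
        }
    }
    return $hits;
}

print_r(matching_rules($loose, $url));
print_r(matching_rules($strict, $url));
```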

Here is a generic version of the solution above:
<?php
// The url part
$url = "/node/123/hello/strText";
// The pattern part
$pattern = "/node/:id/hello/:test";
// Replace all variables with * using regex
$buffer = preg_replace("(:[a-z]+)", "*", $pattern);
// Explode to get strings at *
// In this case ['/node/','/hello/']
$buffer = explode("*", $buffer);
// Control variables for loop execution
$IS_MATCH = True;
$CAPTURE = [];
for ($i=0; $i < sizeof($buffer); $i++) {
$slug = $buffer[$i];
$real_slug = substr($url, 0 , strlen($slug));
if (!strcmp($slug, $real_slug)) {
$url = substr($url, strlen($slug));
$temp = explode("/", $url)[0];
$CAPTURE[sizeof($CAPTURE)+1] = $temp;
$url = substr($url,strlen($temp));
}else {
$IS_MATCH = False;
}
}
unset($CAPTURE[sizeof($CAPTURE)]);
if($IS_MATCH)
print_r($CAPTURE);
else
print "Not a match";
?>
You can convert the code above into a function and pass in parameters to check each pattern in the array. The first step uses a regex to convert all variables into *, then explodes the pattern by *. Finally, loop over the resulting array and compare each piece against the URL with simple string comparison to see whether the pattern matches.
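As one possible shape for such a function, the sketch below swaps the explode-and-compare loop for a generated regular expression with named groups. It assumes variables are written as :name with lowercase letters only, as in the example pattern; match_route is a hypothetical name.

```php
<?php
// Hypothetical helper: match $url against a pattern such as
// "/node/:id/hello/:test" and return the captured values, or null.
function match_route(string $pattern, string $url): ?array
{
    $segments = explode('/', $pattern);
    foreach ($segments as $i => $segment) {
        if (preg_match('~^:([a-z]+)$~', $segment, $m) === 1) {
            // A variable segment becomes a named group that stops at '/'.
            $segments[$i] = '(?P<' . $m[1] . '>[^/]+)';
        } else {
            // Literal segments are escaped so they match themselves.
            $segments[$i] = preg_quote($segment, '~');
        }
    }
    $regex = '~^' . implode('/', $segments) . '$~';
    if (preg_match($regex, $url, $matches) !== 1) {
        return null;
    }
    // Keep only the named captures, dropping the numeric duplicates.
    return array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);
}

print_r(match_route('/node/:id/hello/:test', '/node/123/hello/strText'));
```

Returning null for a non-match keeps "no match" distinct from "matched a pattern without variables", which yields an empty array.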

As long as the pattern is fixed, you can use preg_match() function:
$urls = array (
"node/11221",
"node/38429/news",
"album/34234/shadowbox/321023",
);
foreach ($urls as $url)
{
if (preg_match ("|node/([\d]+$)|", $url, $matches))
{
print "Node is {$matches[1]}\n";
}
elseif (preg_match ("|node/([\d]+)/news|", $url, $matches))
{
print "Node is {$matches[1]}\n";
}
elseif (preg_match ("|album/([\d]+)/shadowbox/([\d]+)$|", $url, $matches))
{
print "Album is {$matches[1]} and photo is {$matches[2]}\n";
}
}
For other patterns to match, adjust as necessary.

Find matches by word and check if it's a link or not

I want to create a PHP array like below, which contains two values, one for the matching word and one to check if the match is a link or not.
Input
$string = "test test"
How can I find all matches of the word 'test' and check whether each match found is a link or not?
Output
Array
(
[0] =>
Array
(
[0] test
[1] false
)
[1] =>
Array
(
[0] test
[1] true
)
)

You could use a regular expression for this:
$string = 'test <a href="http://test">link text</a>';
if (preg_match("#^\s*(.*?)\s*<a\s.*?href\s*=\s*['\"](.*?)['\"].*?>(.*?)</a\s*>#si",
$string, $match)) {
$textBefore = $match[1]; // test
$href = $match[2]; // http://test
$anchorText = $match[3]; // link text
// deal with these elements as you wish...
}
This solution is not case-sensitive, it will work with <A ...>...</A> just as well. If the href value is delimited with single quotes instead of double quotes, it will still work. Surrounding spaces of each value are ignored (trimmed).
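If the goal is the array shown in the question, an alternative sketch is to let PHP's DOM extension parse the markup instead of a regex. This assumes the dom extension is available and that the input looks like the sample below (the href is made up for illustration); find_word_matches is a hypothetical name.

```php
<?php
// Hypothetical helper: list every occurrence of $word in $html together
// with a flag saying whether that occurrence sits inside an <a> element.
function find_word_matches(string $html, string $word): array
{
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $doc = new DOMDocument();
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $pattern = '~\b' . preg_quote($word, '~') . '\b~i';
    $results = array();
    foreach ($xpath->query('//text()') as $textNode) {
        // A text node is link text when it has an <a> ancestor.
        $isLink = $xpath->query('ancestor::a', $textNode)->length > 0;
        preg_match_all($pattern, $textNode->nodeValue, $found);
        foreach ($found[0] as $hit) {
            $results[] = array($hit, $isLink);
        }
    }
    return $results;
}

// Assumed sample input; the href is illustrative only.
print_r(find_word_matches('test <a href="http://test">test</a>', 'test'));
```

Only text nodes are searched, so the word appearing inside an attribute such as href is not counted as a match.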

Try this code :
<?php
$string ="test test";
$link ='';
$word = '';
$flag = true;
for($i=0;$i<strlen($string);$i++){
if($string[$i] == '<' && $string[$i+1] == 'a'){
$flag=false;
while($string[$i++] != '>')
{
}
while($string[$i] != '<' && $string[$i+1] != '/' && $string[$i+2] != 'a' && $string[$i+3] != '>'){
$link .= $string[$i++];
}
}
else{
if($flag)
$word.=$string[$i];
}
}
echo 'Link :'.$link . "<br/>";
echo 'Word:'.$word;
// You can now manipulate Link and word as you wish
?>

PHP : How can I Highlight searched words in results and keep the words original text case?

I am displaying search results on a site where users can search for specific keywords.
On the results page I am trying to highlight the searched words in each result,
so the user can see which words matched where.
For example,
if the user searches for: mango
the original result item: This Post contains Mango.
the highlighted output I want: This Post contains <strong>Mango</strong>
I am using it like this:
<?php
//highlight all words
function highlight_words( $title, $searched_words_array) {
// loop through searched_words_array
foreach( $searched_words_array as $searched_word ) {
$title = highlight_word( $title, $searched_word); // highlight word
}
return $title; // return highlighted data
}
//highlight single word with color
function highlight_word( $title, $searched_word) {
$replace = '<strong>' . $searched_word . '</strong>'; // create replacement
$title = str_ireplace( $searched_word, $replace, $title ); // replace content
return $title; // return highlighted data
}
I am getting the searched words from the Sphinx search engine; the issue is that Sphinx returns the entered/matched words in lowercase.
So with the above code, my
result becomes: This Post contains <strong>mango</strong>
*Notice that the M in Mango became lowercase.
So my question is: how can I highlight a word, i.e. wrap <strong> and </strong> around the words matching the searched words,
without losing the original text case?
*Note that this is not the same question as how to highlight search results: my keywords array is in lowercase, and with the above method the original word gets replaced by the lowercase word.
How can I stop that?
The approach in the other question has the same problem, because the searched keywords are in lowercase: str_ireplace will match the word and replace it with the lowercase version.
Update:
I have combined various code snippets to get the behavior I was expecting.
For now it's working great.
function strong_words( $title, $searched_words_array) {
//for all words in array
foreach ($searched_words_array as $word){
$lastPos = 0;
$positions = array();
//find all positions of word
while (($lastPos = stripos($title, $word, $lastPos))!== false) {
$positions[] = $lastPos;
$lastPos = $lastPos + strlen($word);
}
//reverse sort numeric array
rsort($positions);
// highlight all occurrences
foreach ($positions as $pos) {
$title = strong_word($title , $word, $pos);
}
}
//apply strong html code to occurrences
$title = str_replace('#####','</strong>',$title);
$title = str_replace('*****','<strong>',$title);
return $title; // return highlighted data
}
function strong_word($title , $word, $pos){
//ugly hack to avoid using <strong> , </strong> here directly, as they could get replaced if a searched word contains characters from "strong"
$title = substr_replace($title, '#####', $pos+strlen($word) , 0) ;
$title = substr_replace($title, '*****', $pos , 0) ;
return $title;
}
$title = 'This is Great Mango00lk mango';
$words = array('man','a' , 'go','is','g', 'strong') ;
echo strong_words($title,$words);

Regex solution:
function highlight_word( $title, $searched_word) {
return preg_replace('#('.$searched_word.')#i','<strong>\1</strong>',$title) ;
}
Just be wary of special characters in $searched_word that may be interpreted as regex metacharacters; preg_quote() can escape them.
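A sketch of how the escaping could look when several words are highlighted at once. The function name highlight_words_regex is hypothetical, and the input is assumed to be plain text rather than HTML.

```php
<?php
// Hypothetical helper: wrap every case-insensitive occurrence of any
// search word in <strong>, keeping the casing found in the title.
function highlight_words_regex(string $title, array $searched_words): string
{
    if (count($searched_words) === 0) {
        return $title;
    }
    $escaped = array();
    foreach ($searched_words as $word) {
        // Escape metacharacters and the '#' delimiter used below.
        $escaped[] = preg_quote($word, '#');
    }
    $pattern = '#(' . implode('|', $escaped) . ')#i';
    // $1 re-inserts the text exactly as it appeared in $title.
    return preg_replace($pattern, '<strong>$1</strong>', $title);
}

echo highlight_words_regex('This Post contains Mango.', array('mango')), PHP_EOL;
// This Post contains <strong>Mango</strong>.
```

Doing all words in one pass also means the inserted <strong> tags are never rescanned, so a search word such as "strong" cannot match inside markup added for an earlier word.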

Here's a code snippet I wrote a while back that's working to do exactly what you want:
if(stripos($result->question, $word) !== FALSE){
$word_to_highlight = substr($result->question, stripos($result->question, $word), strlen($word));
$result->question = str_replace($word_to_highlight, '<span class="search-term">'.$word_to_highlight.'</span>', $result->question);
}

//will find all occurrences of all words and make them strong in html
function strong_words( $title, $searched_words_array) {
//for all words in array
foreach ($searched_words_array as $word){
$lastPos = 0;
$positions = array();
//find all positions of word
while (($lastPos = stripos($title, $word, $lastPos))!== false) {
$positions[] = $lastPos;
$lastPos = $lastPos + strlen($word);
}
//reverse sort numeric array
rsort($positions);
// highlight all occurrences
foreach ($positions as $pos) {
$title = strong_word($title , $word, $pos);
}
}
//apply strong html code to occurrences
$title = str_replace('#####','</strong>',$title);
$title = str_replace('*****','<strong>',$title);
return $title; // return highlighted data
}
function strong_word($title , $word, $pos){
//ugly hack to avoid using <strong> , </strong> here directly, as they could get replaced if a searched word contains characters from "strong"
$title = substr_replace($title, '#####', $pos+strlen($word) , 0) ;
$title = substr_replace($title, '*****', $pos , 0) ;
return $title;
}
$title = 'This is Great Mango00lk mango';
$word = array('man','a' , 'go','is','g', 'strong') ;
echo strong_words($title,$word);
This code will find all occurrences of all words and make them strong in html while keeping original text case.

function highlight_word( $content, $word, $color ) {
$replace = '<span style="background-color: ' . $color . ';">' . $word . '</span>'; // create replacement
$content = str_replace( $word, $replace, $content ); // replace content
return $content; // return highlighted data
}
function highlight_words( $content, $words, $colors ) {
$color_index = 0; // index of color (assuming it's an array)
// loop through words
foreach( $words as $word ) {
$content = highlight_word( $content, $word, $colors[$color_index] ); // highlight word
$color_index = ( $color_index + 1 ) % count( $colors ); // get next color index
}
return $content; // return highlighted data
}
// words to find
$words = array(
'normal',
'text'
);
// colors to use
$colors = array(
'#88ccff',
'#cc88ff'
);
// faking your results_text
$results_text = array(
array(
'ab' => 'AB #1',
'cd' => 'Some normal text with normal words isn\'t abnormal at all'
), array(
'ab' => 'AB #2',
'cd' => 'This is another text containing very normal content'
)
);
// loop through results (assuming $output1 is true)
foreach( $results_text as $result ) {
$result['cd'] = highlight_words( $result['cd'], $words, $colors );
echo '<fieldset><p>ab: ' . $result['ab'] . '<br />cd: ' . $result['cd'] . '</p></fieldset>';
}

PHP Sentence case a string with capitalized proper nouns using a known words dictionary?

I need to search a string of words against a dictionary of words(txt file) and capitalize any word that is not found.
I'm trying to split the string into an array of words and check them against the unix /usr/dict/words dictionary. If a match is found for the word it gets lcfirst($word) if no match then ucfirst( $word )
The dictionary is opened and put into an array using fgetcsv (I also tried using fgets and exploding on end of line).
function wnd_title_case( $string ) {
$file = fopen( "/users/chris/sites/wp-dev/trunk/core/words.txt", "rb" );
while ( !feof( $file ) ) {
$line_of_text = fgetcsv( $file );
$exceptions = array( $line_of_text );
}
fclose( $file );
$delimiters = array(" ", "-", "O'");
foreach ( $delimiters as $delimiter ) {
$words = explode( $delimiter, $string );
$newwords = array();
foreach ($words as $word) {
if ( in_array( strtoupper( $word ), $exceptions ) ) {
// check exceptions list for any words that should be lower case
$word = lcfirst( $word );
} elseif ( !in_array( $word, $exceptions ) ) {
// everything else capitalized
$word = ucfirst( $word );
}
array_push( $newwords, $word );
}
$string = join( $delimiter, $newwords );
}
$string = ucfirst( $string );
return $string;
}
I have verified that the file gets opened.
The desired output: Sentence case title string with proper nouns capitalized.
The current output: Title string with every word capitalized
Edit:
Using Jay's answer below I came up with a workable solution. My first problem was that my words dictionary contained both capitalized and non-capitalized words, so I found a proper names dictionary to check against using a regex callback. It's not perfect but gets it right most of the time.
function title_case( $string ) {
$fp = @fopen( THEME_DIR. "/_/inc/propernames", "r" );
$exceptions = array();
if ( $fp ) {
while( !feof($fp) ) {
$buffer = fgets( $fp );
array_push( $exceptions, trim($buffer) );
}
}
fclose( $fp );
$content = strtolower( $string );
$pattern = '~\b' . implode ( '|', $exceptions ) . '\b~i';
$content = preg_replace_callback ( $pattern, 'regex_callback', $content );
$new_content = $content;
return ucfirst( $new_content );
}
function regex_callback ( $data ) {
if ( strlen( $data[0] ) > 3 )
return ucfirst( strtolower( $data[0] ));
else return ( $data[0] );
}

The simplest way to do this with regex is the following:
Convert your text so that every word starts with an uppercase letter: $content = ucwords($original_content);
Using your array of dictionary words, create a regex by imploding all the words with a pipe character |, wrapping the alternation in a group, and surrounding it with word-boundary markers and delimiters followed by the case-insensitive flag, so you end up with ~\b(?:word1|word2|word3)\b~i (obviously with your large list). Without the group, \b would apply only to the first and last word.
Create a function that lowercases the matched value using strtolower, to be used with preg_replace_callback.
A working demo:
function regex_callback($data) {
return strtolower($data[0]);
}
$original_content = 'hello my name is jay gilford';
$words = array('hello', 'my', 'name', 'is');
$content = ucwords($original_content);
$pattern = '~\b(?:' . implode('|', $words) . ')\b~i';
$content = preg_replace_callback($pattern, 'regex_callback', $content);
echo $content;
You could also optionally apply strtolower to the content first, for consistency. The above code outputs hello my name is Jay Gilford
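One detail worth checking in any pattern built by imploding words with |: alternation binds more loosely than \b, so word boundaries only cover the whole list when the alternation sits inside a group. A small check with made-up words:

```php
<?php
$ungrouped = '~\bcat|dog\b~i';      // means: (\bcat) OR (dog\b)
$grouped   = '~\b(?:cat|dog)\b~i';  // means: \b (cat OR dog) \b

var_dump(preg_match($ungrouped, 'category')); // int(1): "\bcat" matches the start
var_dump(preg_match($ungrouped, 'bulldog'));  // int(1): "dog\b" matches the end
var_dump(preg_match($grouped, 'category'));   // int(0): no whole word "cat"
var_dump(preg_match($grouped, 'bulldog'));    // int(0): no whole word "dog"
var_dump(preg_match($grouped, 'my cat'));     // int(1)
```

If the dictionary words can contain regex metacharacters, passing each one through preg_quote() before imploding avoids surprises.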

preg_match must end with "/"?

In the preg_match below, I'm comparing against two static strings, $url and $my_folder...
$url = get_bloginfo('url');
//$url = 'http://site.com'
$my_folder = get_option('my_folder');
//$my_folder = 'http://site.com/somefolder';
I'm getting a match when the $my_folder string has a trailing slash
http://somefolder/go/
But this does not create a match...
http://somefolder/go
However, another problem is that this also matches...
http://somefolder/gone
Code is...
$my_folder = get_option('rseo_nofollow_folder');
if($my_folder !=='') $my_folder = trim($my_folder,'/');
$url = trim(get_bloginfo('url'),'/');
preg_match_all('~<a.*>~isU',$content["post_content"],$matches);
for ( $i = 0; $i <= sizeof($matches[0]); $i++){
if($my_folder !=='')
{
//HERES WHERE IM HAVING PROBLEMS
if ( !preg_match( '~nofollow~is',$matches[0][$i])
&& (preg_match('~' . $my_folder . '/?$~', $matches[0][$i])
|| !preg_match( '~'. $url .'/?$~',$matches[0][$i])))
{
$result = trim($matches[0][$i],">");
$result .= ' rel="nofollow">';
$content["post_content"] = str_replace($matches[0][$i], $result, $content["post_content"]);
}
}
else
{
//THIS WORKS FINE, NO PROBLEMS HERE
if ( !preg_match( '~nofollow~is',$matches[0][$i]) && (!preg_match( '~'.$url.'~',$matches[0][$i])))
{
$result = trim($matches[0][$i],">");
$result .= ' rel="nofollow">';
$content["post_content"] = str_replace($matches[0][$i], $result, $content["post_content"]);
}
}
}
return $content;

~^http://somefolder/go(?:/|$)~
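A sketch of how that pattern behaves on the three URLs from the question. It assumes the href value has already been extracted from the tag; in_folder is a hypothetical helper, and preg_quote() is added so that dots in the folder URL are matched literally.

```php
<?php
// Hypothetical helper: true when $href is the folder itself or lies below it.
function in_folder(string $href, string $folder): bool
{
    $folder = rtrim($folder, '/');
    // (?:/|$) requires a slash or the end of the string right after the folder.
    $regex = '~^' . preg_quote($folder, '~') . '(?:/|$)~';
    return preg_match($regex, $href) === 1;
}

var_dump(in_folder('http://somefolder/go', 'http://somefolder/go'));    // bool(true)
var_dump(in_folder('http://somefolder/go/', 'http://somefolder/go'));   // bool(true)
var_dump(in_folder('http://somefolder/gone', 'http://somefolder/go'));  // bool(false)
```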

You need to first remove the trailing slash and add '/?' at the end of your regexp
$my_folder = trim($my_folder,'/');
$url = trim(get_bloginfo('url'),'/');
if ( !preg_match( '~nofollow~is',$matches[0][$i])
&& (preg_match('~' . $my_folder . '/?$~', $matches[0][$i])
|| !preg_match( '~'. $url .'/?$~',$matches[0][$i])))

This is a shot in the dark, but try:
preg_match( '/' . preg_quote( get_bloginfo('url'), '/' ) . '?/', $matches[0][$i] )
You can use whatever char you want in place of the / chars. I'm guessing that you're using wordpress and guessing that get_bloginfo('url') is normalized to always have a trailing slash. If that is the case, the last slash will be selected optionally by the ? at the end of the regex.

You should just use strstr() or strpos() if they are fixed strings anyway.
Your example rewritten (the parentheses keep the grouping of the original condition):
if (!strstr($matches[0][$i], "nofollow")
and (strstr($matches[0][$i], $my_folder)
or !strstr($matches[0][$i], $url)))
strpos works similarly, but you need an explicit boolean comparison, because a match at position 0 is falsy:
if (strpos($matches[0][$i], "nofollow") === FALSE
and (strpos($matches[0][$i], $my_folder) !== FALSE
or strpos($matches[0][$i], $url) === FALSE))
