Strpos with exact matches - php

I have a function in php which I would like to perform a simple search on a string, using a kw as the search phrase, and return true if found.
This is what I have now:
for($i=0; $i<count($search_strings); $i++){
$pos = strpos($search_strings[$i], $kw_to_search_for);
}
This works fine, and does actually find the keyword inside the string being searched, but the problem is that strpos doesn't match exact phrases or words.
For instance, a search for 'HP' would return true if the word 'PHP' was in the string.
I know of preg_split and regular expressions which can be used to do exact matches, but in my case I don't know what the keyword is for every search, because the keyword is user-inputted.
So the keyword could be "hot-rods", "AC/DC", "Title:Subject" etc etc...
This means I cannot split the words and check them separately because I would have to use some kind of a dynamic pattern for the regex.
If anybody knows of a good solution I would much appreciate it.
I mean, basically I want exact matches only, so if the KW is "Prof" then this will return true if the match in the searched string is "Prof" and doesn't have any other characters surrounding it.
For instance "Professional" would have to be FALSE.

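You can use word boundaries \b. Pass the delimiter to preg_quote as well, so that a keyword such as "AC/DC" doesn't break the pattern:
if (preg_match("/\b" . preg_quote($kw_to_search_for, '/') . "\b/i", $search_strings[$i])) {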
// found
}
For instance:
echo preg_match("/\bProf\b/i", 'Professional'); // 0
echo preg_match("/\bProf\b/i", 'Prof'); // 1
The /i modifier makes the match case-insensitive.
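One caveat with \b: it only matches between a word character and a non-word character, so a keyword that starts or ends with punctuation (for example "C++") may not match as expected. Below is a minimal sketch of an alternative using lookarounds. It assumes "exact" means the keyword is not directly preceded or followed by a letter, digit, or underscore; the function name containsExactKeyword is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: true when $keyword occurs in $haystack without a
// word character (letter, digit, underscore) directly before or after it.
function containsExactKeyword(string $keyword, string $haystack): bool
{
    if ($keyword === '') {
        return false; // an empty keyword would match everywhere
    }
    // preg_quote() escapes regex metacharacters and the "/" delimiter,
    // so user input such as "AC/DC" is treated literally.
    $pattern = '/(?<!\w)' . preg_quote($keyword, '/') . '(?!\w)/i';
    return preg_match($pattern, $haystack) === 1;
}
```

With this sketch, containsExactKeyword('HP', 'I love PHP') is false, while containsExactKeyword('AC/DC', 'I like AC/DC a lot') is true.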

In my case I needed an exact match for professional, which should not succeed when the sentence only contains professional.bowler.
However, preg_match('/\bprofessional\b/i', 'Im a professional.bowler'); returned int(1), because \b treats the dot as a word boundary.
To resolve this I resorted to arrays to find an exact word match using isset on the keys.
Detection Demo
$wordList = array_flip(explode(' ', 'Im a professional.bowler'));
var_dump(isset($wordList['professional'])); //false
var_dump(isset($wordList['professional.bowler'])); //true
The method also works for directory paths, such as when altering the PHP include_path, which was my specific use case; preg_replace with \b does not remove the path there.
Replacement Demo
$removePath = '/path/to/exist-not' ;
$includepath = '.' . PATH_SEPARATOR . '/path/to/exist-not' . PATH_SEPARATOR . '/path/to/exist';
$wordsPath = str_replace(PATH_SEPARATOR, ' ', $includepath);
// Attempt with preg_replace: \b cannot match before the leading "/", so nothing is removed
$result = preg_replace('/\b' . preg_quote($removePath, '/'). '\b/i', '', $wordsPath);
var_dump(str_replace(' ', PATH_SEPARATOR, $result));
//".:/path/to/exist-not:/path/to/exist"
$paths = array_flip(explode(PATH_SEPARATOR, $includepath));
if(isset($paths[$removePath])){
unset($paths[$removePath]);
}
$includepath = implode(PATH_SEPARATOR, array_flip($paths));
var_dump($includepath);
//".:/path/to/exist"
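The same idea can be sketched without array_flip. This is an illustration only: the helper names hasExactToken and removeListEntry are hypothetical, tokens are assumed to be separated by whitespace, and the list separator is assumed to be a non-empty string.

```php
<?php
// Hypothetical helper: exact, case-sensitive match against whole
// whitespace-separated tokens.
function hasExactToken(string $token, string $sentence): bool
{
    $tokens = preg_split('/\s+/', $sentence, -1, PREG_SPLIT_NO_EMPTY);
    return in_array($token, $tokens, true);
}

// Hypothetical helper: remove every entry equal to $entry from a
// separator-delimited list such as an include path.
function removeListEntry(string $list, string $entry, string $separator): string
{
    $kept = array_filter(
        explode($separator, $list),
        function ($item) use ($entry) {
            return $item !== $entry;
        }
    );
    return implode($separator, $kept);
}
```

Here hasExactToken('professional', 'Im a professional.bowler') is false, and removeListEntry can be called with PATH_SEPARATOR as the separator for the include_path case.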

Related

Create a function to find a specific word in the title

I have the following title formation on my website:
It's no use going back to yesterday, because at that time I was... Lewis Carroll
The format is always: The phrase… (author).
I want to delete everything after the ellipsis (…), leaving only the sentence as the title. I thought of creating a function in php that would take the parts of the titles, throw them in an array and then I would work each part, identifying the only pattern I have in the title, which is the ellipsis… and then delete everything. But when I do that, in the X space of my array, it returns the following:
was...
Position 8 of the array holds the word together with the ellipsis, so I don't know how to find a pattern that deletes the author from the title; the ellipsis was my only pattern. Any ideas?
<?php
$a = get_the_title(155571);
$search = '... ';
if(preg_match("/{$search}/i", $a)) {
echo 'true';
}
?>
I tried with the code above and found the ellipsis, but I needed to bring it into an array to delete the part I need. I tried something like this:
<?php
define('WP_USE_THEMES', false);
require('./wp-blog-header.php');
global $wpdb;
$title_array = explode(' ', get_the_title(155571));
$search = '... ';
if (array_key_exists("/{$search}/i",$title_array)) {
echo "true";
}
?>
I started doing it this way, but it doesn't work, any ideas?
Thanks,
If you use a regex, you need to escape the search string, for example with preg_quote(), because the dot is a regex metacharacter.
But in your simple case, I would not use a regex and just search for the three dots from the end of the string.
Note: when the ellipsis is added by the browser, there is no way to detect it in PHP.
$title = 'The phrase... (author).';
echo getPlainTitle($title);
function getPlainTitle(string $title) {
$rpos = strrpos($title, '...');
return ($rpos === false) ? $title : substr($title, 0, $rpos);
}
will output
The phrase
First of all, since you're working with regular expressions, you need to remember that . has a special meaning there: it means "any character". So /... / just means "any three characters followed by a space", which isn't what you want. To match a literal . you need to escape it as \.
Secondly, rather than searching or splitting, you could achieve what you want by replacing part of the string. For instance, you could find everything after the ellipsis, and replace it with an empty string. To do that you want a pattern of "dot dot dot followed by anything", where "anything" is spelled .*, so \.\.\..*
$title = preg_replace('/\.\.\..*/', '', $title);
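The title in the question appears with both three dots (...) and the single-character ellipsis (…), so a variant that handles either may be useful. This is a sketch under assumptions: the function name stripAfterEllipsis is hypothetical, titles are assumed to be UTF-8, and everything from the first ellipsis onward is assumed to be the author.

```php
<?php
// Hypothetical helper: drop the first ellipsis and everything after it.
function stripAfterEllipsis(string $title): string
{
    // \.{3} matches three literal dots; \x{2026} is the single-character
    // ellipsis and needs the /u (UTF-8) modifier. /s lets "." match newlines.
    $result = preg_replace('/(?:\.{3}|\x{2026}).*$/su', '', $title);

    // preg_replace() returns null on error (for example invalid UTF-8),
    // in which case the title is returned unchanged.
    return $result ?? $title;
}
```

A title without any ellipsis is returned as-is.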

Regex only for specific domain name in URL

As much as I've tried I can't seem to find the correct regex to locate what I'm after here.
I only want to select the first instance of the url that matches the domain www.myweb.com from the following...
Some text https://www.myweb.com/page/cat/323123442321-rghe432 and then another https://www.adifferentsite.com/fsdhjss/erwr
I need to completely ignore the second url www.adifferentsite.com and only work with the first one that matches www.myweb.com, ignoring any other possible instances of www.myweb.com
Once the first matching domain is discovered I need to store the rest of the url that comes after it...
page/cat/323123442321-rghe432
...into a new variable $newvar, so...
$newvar = 'page/cat/323123442321-rghe432';
I'm trying :
return preg_replace_callback( '/http://www.myweb.com/\/[0-9a-zA-Z]+/', array( __CLASS__, 'my_callback' ), $newvar );
I've read tons of documents on how to detect url's but can't find anything about detecting a specific url.
I really can't grasp how to formulate regex so this formula is incorrect. Any help would be greatly appreciated.
EDIT Edited the question to be a bit more specific and hopefully a bit easier to resolve.
You can use a preg_replace_callback and pass an array into the anonymous function (or just your custom callback function) to fill it with all the necessary URL parts.
Here is a demo:
$rests = array();
$re = '~\b(https?://)www\.myweb\.com/(\S+)~';
$str = "Some text https://www.myweb.com/page/cat/323123442321-rghe432 and then another https://www.adifferentsite.com/fsdhjss/erwr";
echo $result = preg_replace_callback($re, function ($m) use (&$rests) {
array_push($rests, $m[2]);
return $m[1] . "embed.myweb.com/" . $m[2];
}, $str) . PHP_EOL;
print_r($rests);
Results:
Some text https://embed.myweb.com/page/cat/323123442321-rghe432 and then another https://www.adifferentsite.com/fsdhjss/erwr
Array
(
[0] => page/cat/323123442321-rghe432
)
A couple of words:
'~\b(https?://)www\.myweb\.com/(\S+)~' has ~ as a regex delimiter, so you do not have to escape /
It is declared with a single-quoted literal, so you do not have to use double-escaping for \\S
It matches and captures two substrings into capturing groups: \b(https?://) (which matches a whole word http or https followed by ://) and (\S+) (which matches one or more non-whitespace characters). These capturing groups are marked with (...) in the pattern and can be accessed in the callback via $m[n], where n is the number of the capturing group.
UPDATE
If you only need to replace the first occurrence of the URL, pass the limit argument to the preg_replace_callback:
$rest = "";
$re = '~\b(https?://)www\.myweb\.com/(\S+\b)~';
$str = "Some text https://www.myweb.com/page/cat/323123442321-rghe432, another http://www.myweb.com/page/cat/323123442321-rghe432 and then another https://www.adifferentsite.com/fsdhjss/erwr";
echo $result = preg_replace_callback($re, function ($m) use (&$rest) {
$rest = $m[2];
return $m[1] . "embed.myweb.com/" . $m[2];
}, $str, 1) . PHP_EOL;
// The final argument, 1, is the limit.
echo $rest;
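If only the path is needed and the text itself does not have to be rewritten, preg_match may be enough, since it stops at the first match. A minimal sketch, assuming the domain is always www.myweb.com and the URL ends at the next whitespace; the function name firstMyWebPath is hypothetical:

```php
<?php
// Hypothetical helper: return the part of the URL after the first
// www.myweb.com/ occurrence, or null when the text has no such URL.
function firstMyWebPath(string $text): ?string
{
    // preg_match() stops at the first match, so later URLs are ignored.
    if (preg_match('~\bhttps?://www\.myweb\.com/(\S+)~', $text, $m) === 1) {
        return $m[1];
    }
    return null;
}
```

For the sample text in the question, $newvar = firstMyWebPath($str); would hold page/cat/323123442321-rghe432.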

PHP get specific string from url before and after unknown characters

I know it may sound as a common question but I have difficulty understanding this process.
So I have this string:
http://domain.com/campaign/tgadv?redirect
And I need to get only the word "tgadv". But I don't know that the word is "tgadv", it could be whatever.
Also the url itself may change and become:
http://domain.com/campaign/tgadv
or
http://domain.com/campaign/tgadv/
So what I need is to create a function that will get whatever word is after campaign and before any other particular character. That's the logic..
The only certain thing is that the word will come after the word campaign/ and that any other character that will be after the word we are searching is a special one ( i.e. / or ? )
I tried understanding preg_match but really cannot get any good result from it..
Any help would be highly appreciated!
I would not use a regex for that. I would use parse_url and basename:
$bits = parse_url('http://domain.com/campaign/tgadv?redirect');
$filename = basename($bits['path']);
echo $filename;
However, if you want a regex solution, use something like this (note that this pattern only matches when the URL has a query string):
$pattern = '~(.*)/(.*)(\?.*)~';
preg_match($pattern, 'http://domain.com/campaign/tgadv?redirect', $matches);
$filename = $matches[2];
echo $filename;
Actually, preg_match sounds like the perfect solution to this problem. I assume you are having problems with the regex?
Try something like this:
<?php
$url = "http://domain.com/campaign/tgadv/";
$pattern = "#campaign/([^/\?]+)#";
preg_match($pattern, $url, $matches);
// $matches[1] will contain tgadv.
$path = "http://domain.com/campaign/tgadv?redirect";
$url_parts = parse_url($path);
$tgadv = ltrim(strrchr($url_parts['path'], '/'), '/'); // strrchr returns "/tgadv", so strip the leading slash
You don't really need a regex to accomplish this. You can do it using stripos() and substr().
For example:
$str = '....Your string...';
$offset = stripos($str, 'campaign/');
if ( $offset === false ){
// error: 'campaign/' wasn't found
}
$offset += strlen('campaign/');
$newStr = substr($str, $offset);
At this point $newStr will have all the text after 'campaign/'.
You then just need to use a similar process to find the special character position and use substr() to strip the string you want out.
You can also just use the good old string functions in this case, no need to involve regexps.
First find the string /campaign/, then take the substring with everything after it (tgadv/asd/whatever/?redirect), then find the next / or ? after the start of the string, and everything in between will be what you need (tgadv).
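The string-function approach described above can be sketched as follows. The function name campaignSlug is hypothetical, and the set of terminating characters (/, ? and #) is an assumption based on the examples in the question.

```php
<?php
// Hypothetical helper: return the segment that follows "campaign/",
// or null when the marker is missing or nothing follows it.
function campaignSlug(string $url): ?string
{
    $marker = 'campaign/';
    $start = stripos($url, $marker);
    if ($start === false) {
        return null; // marker not found
    }

    // Everything after "campaign/".
    $rest = substr($url, $start + strlen($marker));

    // strcspn() gives the length of the leading run that contains none of
    // the terminating characters, i.e. the length of the wanted word.
    $slug = substr($rest, 0, strcspn($rest, '/?#'));

    return $slug === '' ? null : $slug;
}
```

All three URL variants from the question yield tgadv with this sketch.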

Function which searches for a word in a text and highlights all the words which contain it

This function searches for words (from the $words array) inside a text and highlights them.
function highlightWords(Array $words, $text){ // Loop through array of words
foreach($words as $word){ // Highlight word inside original text
$text = str_replace($word, '<span class="highlighted">' . $word . '</span>', $text);
}
return $text; // Return modified text
}
Here is the problem:
Let's say $words = array("car", "drive");
Is there a way for the function to highlight not only the word car, but also words which contain the letters "car", like cars, carmania, etc.?
Thank you!
What you want is a regular expression, more specifically preg_replace or preg_replace_callback (the callback version would be recommended in your case).
<?php
$searchString = "The car is driving in the carpark, he's not holding to the right lane.\n";
// define your word list
$toHighlight = array("car","lane");
Because you need a regular expression to search your words and you might want or need variation or changes over time, it's bad practice to hard code it into your search words. Hence it's best to walk over the array with array_map and transform the searchword into the proper regular expression (here just enclosing it with / and adding the "accept everything until punctuation" expression)
$searchFor = array_map('addRegEx',$toHighlight);
// add the regEx to each word, this way you can adapt it without having to correct it everywhere
function addRegEx($word){
return "/" . preg_quote($word, '/') . '[^ ,.?]*/'; // escape the word, then accept everything up to a space, comma, dot or question mark
}
Next you wish to replace the word you found with your highlighted version, which means you need a dynamic change: use preg_replace_callback instead of regular preg_replace so that it calls a function for every match it finds and uses it to generate the proper result. Here we enclose the found word in its span tags
function highlight($word){
return "<span class='highlight'>$word[0]</span>";
}
$result = preg_replace_callback($searchFor,'highlight',$searchString);
print $result;
yields
The <span class='highlight'>car</span> is driving in the <span class='highlight'>carpark</span>, he's not holding to the right <span class='highlight'>lane</span>.
So just paste these code fragments after the other to get the working code, obviously. ;)
edit: the complete code below was altered a bit = placed in routines for easy use by original requester. + case insensitivity
complete code:
<?php
$searchString = "The car is driving in the carpark, he's not holding to the right lane.\n";
$toHighlight = array("car","lane");
$result = customHighlights($searchString,$toHighlight);
print $result;
// add the regEx to each word, this way you can adapt it without having to correct it everywhere
function addRegEx($word){
return "/" . preg_quote($word, '/') . '[^ ,.?]*/i'; // escape the word, then accept everything up to a space, comma, dot or question mark
}
function highlight($word){
return "<span class='highlight'>$word[0]</span>";
}
function customHighlights($searchString,$toHighlight){
// define your word list
$searchFor = array_map('addRegEx',$toHighlight);
$result = preg_replace_callback($searchFor,'highlight',$searchString);
return $result;
}
I haven't tested it, but I think this should do it:-
$text = preg_replace('/\b(\w*' . preg_quote($word, '/') . '\w*)\b/i', '<span class="highlighted">$1</span>', $text);
This looks for the string inside a complete bounded word and then puts the span around the whole lot using preg_replace and regular expressions.
function replace($format, $string, array $words)
{
foreach ($words as $word) {
$string = \preg_replace(
sprintf('#\b(?<string>[^\s]*%s[^\s]*)\b#i', \preg_quote($word, '#')),
\sprintf($format, '$1'), $string);
}
return $string;
}
// courtesy of http://slipsum.com/#.T8PmfdVuBcE
$string = "Now that we know who you are, I know who I am. I'm not a mistake! It
all makes sense! In a comic, you know how you can tell who the arch-villain's
going to be? He's the exact opposite of the hero. And most times they're friends,
like you and me! I should've known way back when... You know why, David? Because
of the kids. They called me Mr Glass.";
echo \replace('<span class="red">%s</span>', $string, [
'mistake',
'villain',
'when',
'Mr Glass',
]);
Since it's using an sprintf format for the surrounding string, you can change your replacement accordingly.
Excuse the 5.4 syntax
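One thing the loop-based versions above can run into is a later search word matching inside markup inserted for an earlier word (for example a search word such as "span"). A possible way around that is a single pass with one alternation pattern. The sketch below is an illustration under assumptions: the function name highlightContaining is hypothetical, the input text is assumed to be plain text without HTML, and "word" is assumed to mean a run of \w characters.

```php
<?php
// Hypothetical helper: wrap every whole word that contains one of the
// search terms in a highlight span, in a single pass over the text.
function highlightContaining(array $words, string $text): string
{
    // Empty terms would match every word, so drop them.
    $words = array_filter($words, function ($w) {
        return $w !== '';
    });
    if ($words === []) {
        return $text;
    }

    // Escape each term and join them into one alternation: car|drive
    $alternatives = implode('|', array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words));

    // \w* on both sides extends the match to the whole surrounding word.
    $pattern = '/\b\w*(?:' . $alternatives . ')\w*\b/i';

    // $0 refers to the entire matched word.
    return preg_replace($pattern, '<span class="highlighted">$0</span>', $text);
}
```

Calling highlightContaining(array("car", "drive"), $text) would highlight car, cars and carmania as well as drive.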

If a variable only contains one word

I would like to know how I could find out in PHP if a variable only contains 1 word. It should be able to recognise: "foo" "1326" ";394aa", etc.
It would be something like this:
$txt = "oneword";
if($txt == 1 word){ do.this; }else{ do.that; }
Thanks.
I'm assuming a word is defined as any string delimited by one space symbol
$txt = "multiple words";
if(strpos(trim($txt), ' ') !== false)
{
// multiple words
}
else
{
// one word
}
What defines one word? Are spaces allowed (perhaps for names)? Are hyphens allowed? Punctuation? Your question is not very clearly defined.
Going under the assumption that you just want to determine whether or not your value contains spaces, try using regular expressions:
http://php.net/manual/en/function.preg-match.php
<?php
$txt = "oneword";
if (preg_match("/ /", $txt)) {
echo "Multiple words.";
} else {
echo "One word.";
}
?>
Edit
The benefit to using regular expressions is that if you can become proficient in using them, they will solve a lot of your problems and make changing requirements in the future a lot easier. I would strongly recommend using regular expressions over a simple check for the position of a space, both for the complexity of the problem today (as again, perhaps spaces aren't the only way to delimit words in your requirements), as well as for the flexibility of changing requirements in the future.
Utilize the strpos function included within PHP.
Returns the position as an integer. If needle is not found, strpos()
will return boolean FALSE.
Besides strpos, an alternative would be explode and count:
$txt = trim("oneword secondword");
$words = explode( " ", $txt); // $words[0] = "oneword", $words[1] = "secondword"
if (count($words) == 1) {
// do this for one word
} else {
// do that for more than one word, assuming at least one word was entered
}
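The approaches in this thread can be combined into one small check. This is a sketch that assumes a "word" is any non-empty run of non-whitespace characters (so spaces, tabs and newlines all count as separators); the function name isSingleWord is hypothetical.

```php
<?php
// Hypothetical helper: true when $txt, ignoring surrounding whitespace,
// consists of exactly one run of non-whitespace characters.
function isSingleWord(string $txt): bool
{
    // ^\S+$ : one or more non-whitespace characters and nothing else.
    return preg_match('/^\S+$/', trim($txt)) === 1;
}
```

Under that assumption "foo", "1326" and ";394aa" all count as one word, while an empty string does not.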
