Matching word with php


After doing some research, I can't seem to find a solution to my problem. I have a list of bad words and I want to check whether a user left a comment containing any of them. I have tried different regular expressions with no success. (By the way, I'm no regex guru.)
Let's say I have the word $word = 'bi' on my list, and a comment that says $comment = 'he is bi'. I am using preg_match($pattern, $comment), where the pattern has been:
1) #$word#i
2) /\s+($word)/\s+/i
3) /\b($word)/\b/i
With this code:
if (preg_match($pattern, $commentdata['comment_content'])) {
echo 'spam';
}
else {
echo 'true';
}
I get:
1) spam (this is also the case for words like "combination", which I don't want to block)
2)true
3)true
How can I make a pattern that only matches the word and not the word within?

This does the job; you were close to the solution:
preg_match("~\b$word\b~i", $comment);
For particular cases like 'bi-directional', where the word is part of a hyphenated compound, you can use this instead:
preg_match("~(?<![a-z]-)\b$word\b(?!-[a-z])~i", $comment);
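Putting the word-boundary pattern to work on a whole list could look like the following minimal sketch. The function name containsBadWord() and its parameters are hypothetical, not from the question; preg_quote() is added on the assumption that list entries may contain regex metacharacters.

```php
<?php
// Hypothetical helper: true when $comment contains any listed word as a whole word.
function containsBadWord(string $comment, array $badWords): bool
{
    foreach ($badWords as $word) {
        // preg_quote() escapes regex metacharacters in the word, including the ~ delimiter.
        $pattern = '~\b' . preg_quote($word, '~') . '\b~i';
        if (preg_match($pattern, $comment) === 1) {
            return true;
        }
    }
    return false;
}
```

With $badWords = ['bi'], this flags "he is bi" but not "a nice combination". It still flags "bi-directional"; swap in the lookaround pattern above if hyphenated compounds should pass.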

$pattern = "/\b{$word}\b/i";
or
$pattern = "/(\b{$word}\b)/i";
Either will do the job.

Related

Create a function to find a specific word in the title

I have the following title formation on my website:
It's no use going back to yesterday, because at that time I was... Lewis Carroll
It always has the form: The phrase… (author).
I want to delete everything after the ellipsis (…), leaving only the sentence as the title. I thought of creating a function in php that would take the parts of the titles, throw them in an array and then I would work each part, identifying the only pattern I have in the title, which is the ellipsis… and then delete everything. But when I do that, in the X space of my array, it returns the following:
was...
In position 8 of the array comes the word and the ellipsis and I don't know how to find a pattern to delete the author of the title, my pattern was the ellipsis. Any idea?
<?php
$a = get_the_title(155571);
$search = '... ';
if(preg_match("/{$search}/i", $a)) {
echo 'true';
}
?>
I tried with the code above and found the ellipsis, but I needed to bring it into an array to delete the part I need. I tried something like this:
<?php
define('WP_USE_THEMES', false);
require('./wp-blog-header.php');
global $wpdb;
$title_array = explode(' ', get_the_title(155571));
$search = '... ';
if (array_key_exists("/{$search}/i",$title_array)) {
echo "true";
}
?>
I started doing it this way, but it doesn't work, any ideas?
Thanks,

If you use a regex, you need to escape the search string, as preg_quote() would do, because a dot is a special character in a pattern.
But in your simple case, I would not use a regex and would just search for the three dots from the end of the string.
Note: if the ellipsis is added by the browser rather than stored in the title, there is no way to detect it in PHP.
$title = 'The phrase... (author).';
echo getPlainTitle($title);
function getPlainTitle(string $title) {
$rpos = strrpos($title, '...');
return ($rpos === false) ? $title : substr($title, 0, $rpos);
}
will output
The phrase

First of all, since you're working with regular expressions, you need to remember that . has a special meaning there: it means "any character". So /... / just means "any three characters followed by a space", which isn't what you want. To match a literal . you need to escape it as \.
Secondly, rather than searching or splitting, you could achieve what you want by replacing part of the string. For instance, you could find everything after the ellipsis, and replace it with an empty string. To do that you want a pattern of "dot dot dot followed by anything", where "anything" is spelled .*, so \.\.\..*
$title = preg_replace('/\.\.\..*/', '', $title);
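The question's sample title uses three dots, while its description uses the single ellipsis character (…). A minimal sketch that handles both, assuming the title is valid UTF-8, might look like this; the function name stripAfterEllipsis() is hypothetical.

```php
<?php
// Hypothetical helper: drop everything from the first ellipsis onwards.
function stripAfterEllipsis(string $title): string
{
    // Match either three literal dots or U+2026, then the rest of the string.
    $result = preg_replace('/(?:\.{3}|\x{2026}).*$/us', '', $title);

    // preg_replace() returns null on error (for example invalid UTF-8); keep the title then.
    return $result === null ? $title : rtrim($result);
}
```

Unlike the strrpos() approach, this cuts at the first ellipsis rather than the last, which matters only if a title contains more than one.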

Matching a substring (an apostrophe) in a given word using regex

I have a server application which looks up where the stress is in Russian words. The end user writes a word жажда. The server downloads a page from another server which contains the stresses indicated with apostrophes for each case/declension like this жа'жда. I need to find that word in the downloaded page.
In Russian the stress is always written after a vowel. I've been using so far a regex that is a grouping of all possible combinations (жа'жда|жажда'). Is there a more elegant solution using just a regex pattern instead of making a PHP script which creates all these combinations?
EDIT:
I have a word жажда
The downloaded page contains the string жа'жда. (notice the
apostrophe, I do not before-hand know where the apostrophe in the
word is)
I want to match the word with apostrophe (жа'жда).
P.S.: So far I have a PHP script creating the string (жа'жда|жажда') used in regex (apostrophe is only after vowels) which matches it. My goal is to get rid of this script and use just regex in case it's possible.

If I understand your question correctly, namely:
"I have these options (d'isorder|di'sorder|dis'order|diso'rder|disor'der|disord'er|disorde'r|disorder') and one of these is in the downloaded page and I need to find out which one it is"
then this may suit your needs:
<pre>
<?php
$s = "d'isorder|di'sorder|dis'order|diso'rder|disor'der|disord'er|disorde'r|disorder'|disorde'";
$s = explode("|",$s);
print_r($s);
$matches = preg_grep("#[aeiou]'#", $s);
print_r($matches);
running example: https://eval.in/207282

Uhm... Is this ok with you?
<?php
function find_stresses($word, $haystack) {
$pattern = preg_replace('/[aeiou]/', '\0\'?', $word);
$pattern = "/\b$pattern\b/";
// word = 'disorder', pattern = "/\bdi'?so'?rde'?r\b/"
preg_match_all($pattern, $haystack, $matches);
return $matches[0];
}
$hay = "something diso'rder somethingelse";
find_stresses('disorder', $hay);
// => array(diso'rder)
You didn't specify if there can be more than one match, but if not, you could use preg_match instead of preg_match_all (faster). For example, in Italian language we have àncora and ancòra :P
Obviously if you use preg_match, the result would be a string instead of an array.
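The same idea can be adapted to the Russian words from the question. The sketch below is an assumption-laden variant: the vowel list, the function name findStressedForms(), and the choice to return only forms that contain an apostrophe are mine, not the answer's. It needs the u modifier so the pattern is treated as UTF-8.

```php
<?php
// Hypothetical helper: find forms of $word in $haystack that carry a stress apostrophe.
function findStressedForms(string $word, string $haystack): array
{
    // Escape the word, then allow an optional apostrophe after each Russian vowel.
    $quoted  = preg_quote($word, '/');
    $pattern = preg_replace('/[аеёиоуыэюя]/iu', '$0\'?', $quoted);

    // Lookarounds on \p{L} (any letter) stand in for \b to match whole words only.
    preg_match_all('/(?<!\p{L})' . $pattern . '(?!\p{L})/iu', $haystack, $matches);

    // Keep only the matches that actually contain a stress mark.
    return array_values(array_filter($matches[0], function ($match) {
        return strpos($match, "'") !== false;
    }));
}
```

For the word жажда and a page containing жа'жда, this returns an array holding just жа'жда; the unstressed spelling is matched by the pattern but filtered out.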

Based on your code, and on the requirements that no function is called and that the unstressed "disorder" is excluded, I think this is what you want. I have added some test data.
<pre>
<?php
// test code
$downloadedPage = "
there is some disorde'r
there is some disord'er in the example
there is some di'sorder in the example
there also' is some order in the example
there is some disorder in the example
there is some dso'rder in the example
";
$word = 'disorder';
preg_match_all("#".preg_replace("#[aeiou]#", "$0'?", $word)."#iu"
, $downloadedPage
, $result
);
print_r($result);
$result = preg_grep("#'#"
, $result[0]
);
print_r($result);
// the code you need
$word = 'also';
preg_match("#".preg_replace("#[aeiou]#", "$0'?", $word)."#iu"
, $downloadedPage
, $result
);
print_r($result);
$result = preg_grep("#'#"
, $result
);
print_r($result);
Working demo: https://eval.in/207312

PHP regex parse my visible link

I want to check my link in a website, but I also want to check is it visible. I wrote this code:
$content = file_get_contents('tmp/test.html');
$pattern = '/<a\shref="http:\/\/mywebsite.com(.*)">(.*)<\/a>/siU';
$matches = [];
if(preg_match($pattern, $content, $matches)) {
$link = $matches[0];
$displayPattern = '/display(.?):(.?)none/si';
if(preg_match($displayPattern, $link)) {
echo 'not visible';
} else {
echo 'visible';
}
} else {
echo 'not found the link';
}
It works, but not perfect. If the link is like this:
<a class="sg" href="http://mywebsite.com">mywebsite.com</a>
the first pattern won't work, but if I change the \s to (.*) it gives back the string starting from the first a tag. The second problem is that there are two patterns. Is there any way to merge the first with a negation of the second? The merged pattern would have two results: visible, or not found/invisible.

I'll try to guess.
You run into the problem when the HTML you fetch with file_get_contents looks like this:
<a class="sg" href="http://mywebsite.com">mywebsite.com</a>
.
.
.
mywebsite.com
Your regex will return everything from the first <a> tag onwards, because the dot also matches newlines (that is what the 's' flag does; remove it if you don't need it).
Therefore
.*
will keep consuming as much as it can, so you need to make it lazy (non-greedy)
(when it is lazy it stops as soon as the rest of the pattern can match), like this
.*?
Note that the U flag in your pattern inverts this behaviour: with U, .* is already lazy and .*? becomes greedy, so drop U if you write .*? explicitly.
Your regex should then look like this:
<a.*?href="http:\/\/mywebsite.com(.*?)">(.*?)<\/a>
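Even a lazy quantifier starts matching at the first <a in the document, so a pattern like this can still span several tags. As an alternative, here is a minimal sketch that uses PHP's DOM extension instead of a regex. The function name linkVisibility() is hypothetical, and the check only looks at an inline style attribute on the link itself; it does not see stylesheets or hidden parent elements.

```php
<?php
// Hypothetical helper: report whether a link to $url exists and is not hidden inline.
function linkVisibility(string $html, string $url): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy HTML
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $a) {
        if (strpos($a->getAttribute('href'), $url) !== 0) {
            continue; // href does not start with the URL we are looking for
        }
        // Same display:none test as in the question, applied to the style attribute only.
        $hidden = preg_match('/display\s*:\s*none/i', $a->getAttribute('style'));
        return $hidden ? 'not visible' : 'visible';
    }
    return 'not found';
}
```

Because attributes are read by name, the order of class, href and style inside the tag no longer matters.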

Search variable content for specific matches

I have the following code in my project:
$title = "In this title we have the word GUN";
$needed_words = array('War', 'Gun', 'Shooting');
foreach($needed_words as $needed_word) {
if (preg_match("/\b$needed_word\b/", $title)) {
$the_word = "ECHO THE WORD THATS FIND INSIDE TITLE";
}
}
I want to check whether $title contains one of 15 predefined words.
For example:
if $title contains one of the words "War, Gun, Shooting", then I want to assign the word that was found to $the_word.
Thanks in advance for your time!

Try this:
$makearray=array('war','gun','shooting');
$title='gun';
if(in_array($title,$makearray))
{
$if_included='the value you want to give';
echo $if_included;
}
Note: this only works if $title is exactly equal to one of the values in the array (including case); it does not search inside a longer title.

The best approach would be to use regular expressions, as they are the most flexible and give you more control over the words you want to match. To check that the string contains words like gun (but also guns) or shoot (but also shooting), you can do the following:
$words = array(
'war',
'gun',
'shoot'
);
$pattern = '/(' . implode(')|(', $words) . ')/i';
$if_included = (bool) preg_match($pattern, "Some text - here");
var_dump($if_included);
This matches more than it should. For example, it will also return true if the string contains "warning" (because it starts with "war"). You can improve this by adding constraints to individual patterns. For example:
$words = array(
'war(?![a-z])', // now it will match "war", but not "warning"
'gun',
'shoot'
);
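To get back the word that was found, as the question asks, one option is to test each listed word with word boundaries and the i modifier and return the first hit. This is a minimal sketch; the function name findFirstListedWord() is hypothetical.

```php
<?php
// Hypothetical helper: return the first listed word found in $title, or null if none is.
function findFirstListedWord(string $title, array $words): ?string
{
    foreach ($words as $word) {
        // \b keeps "War" from matching inside "warning"; i ignores case ("GUN" vs "Gun").
        if (preg_match('/\b' . preg_quote($word, '/') . '\b/i', $title) === 1) {
            return $word;
        }
    }
    return null;
}

$the_word = findFirstListedWord('In this title we have the word GUN', ['War', 'Gun', 'Shooting']);
```

The function returns the entry as spelled in the list ("Gun"), not as spelled in the title ("GUN").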

Function which searches for a word in a text and highlights all the words which contain it

This function searches for words (from the $words array) inside a text and highlights them.
function highlightWords(Array $words, $text){ // Loop through array of words
foreach($words as $word){ // Highlight word inside original text
$text = str_replace($word, '<span class="highlighted">' . $word . '</span>', $text);
}
return $text; // Return modified text
}
Here is the problem:
Lets say the $words = array("car", "drive");
Is there a way for the function to highlight not only the word car, but also words which contain the letters "car" like: cars, carmania, etc.
Thank you!

What you want is a regular expression, more specifically preg_replace or preg_replace_callback (the callback version is recommended in your case).
<?php
$searchString = "The car is driving in the carpark, he's not holding to the right lane.\n";
// define your word list
$toHighlight = array("car","lane");
Because you need a regular expression to search for your words, and the expression may need to change over time, it's bad practice to hard-code it into your search words. It's better to walk over the array with array_map and turn each search word into the proper regular expression (here by enclosing it in / and appending the "accept everything until punctuation" expression).
$searchFor = array_map('addRegEx',$toHighlight);
// add the regEx to each word, this way you can adapt it without having to correct it everywhere
function addRegEx($word){
return "/" . $word . '[^ ,\,,.,?,\.]*/';
}
Next, you want to replace each word you found with its highlighted version, which means the replacement is dynamic: use preg_replace_callback instead of plain preg_replace, so that it calls a function for every match it finds and uses the return value as the replacement. Here we enclose the found word in span tags:
function highlight($word){
return "<span class='highlight'>$word[0]</span>";
}
$result = preg_replace_callback($searchFor,'highlight',$searchString);
print $result;
yields
The <span class='highlight'>car</span> is driving in the <span class='highlight'>carpark</span>, he's not holding to the right <span class='highlight'>lane</span>.
So just paste these code fragments one after the other to get the working code.
Edit: the complete code below was altered a bit: it is wrapped in functions for easy reuse by the original requester, and it is now case-insensitive.
Complete code:
<?php
$searchString = "The car is driving in the carpark, he's not holding to the right lane.\n";
$toHighlight = array("car","lane");
$result = customHighlights($searchString,$toHighlight);
print $result;
// add the regEx to each word, this way you can adapt it without having to correct it everywhere
function addRegEx($word){
return "/" . $word . '[^ ,\,,.,?,\.]*/i';
}
function highlight($word){
return "<span class='highlight'>$word[0]</span>";
}
function customHighlights($searchString,$toHighlight){
// define your word list
$searchFor = array_map('addRegEx',$toHighlight);
$result = preg_replace_callback($searchFor,'highlight',$searchString);
return $result;
}

I haven't tested it, but I think this should do it:
$text = preg_replace('/\b(\w*' . preg_quote($word, '/') . '\w*)\b/', '<span class="highlighted">$1</span>', $text);
This looks for the string inside a complete bounded word and then puts the span around the whole word using preg_replace and regular expressions.
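Replacing one word at a time in a loop can end up matching inside markup inserted by an earlier pass (for instance if "span" or "class" is in the word list). A minimal sketch that avoids this by handling all words in a single pass is shown here; the function name highlightContaining() is hypothetical, and the class name follows the question.

```php
<?php
// Hypothetical helper: wrap every word that contains one of $words in a highlight span.
function highlightContaining(array $words, string $text): string
{
    if ($words === []) {
        return $text; // nothing to search for
    }
    // Escape each search word so it is treated literally inside the pattern.
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $words);

    // \w* on both sides extends the match to the whole surrounding word.
    $pattern = '/\b\w*(?:' . implode('|', $quoted) . ')\w*\b/i';

    // $0 in the replacement is the full matched word.
    return preg_replace($pattern, '<span class="highlighted">$0</span>', $text);
}
```

With $words = array("car", "drive"), this highlights car, cars and carmania, and also words such as "Oscar" that merely contain the letters.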

function replace($format, $string, array $words)
{
foreach ($words as $word) {
$string = \preg_replace(
sprintf('#\b(?<string>[^\s]*%s[^\s]*)\b#i', \preg_quote($word, '#')),
\sprintf($format, '$1'), $string);
}
return $string;
}
// courtesy of http://slipsum.com/#.T8PmfdVuBcE
$string = "Now that we know who you are, I know who I am. I'm not a mistake! It
all makes sense! In a comic, you know how you can tell who the arch-villain's
going to be? He's the exact opposite of the hero. And most times they're friends,
like you and me! I should've known way back when... You know why, David? Because
of the kids. They called me Mr Glass.";
echo \replace('<span class="red">%s</span>', $string, [
'mistake',
'villain',
'when',
'Mr Glass',
]);
Since it uses an sprintf format for the surrounding markup, you can change the replacement accordingly.
Excuse the PHP 5.4 syntax.
