Getting all sentences containing a particular word - PHP

I am trying to get all the sentences from a text that contain any word from a given set of words.
Here is my code, also available at
http://ideone.com/fork/O9XtOY
<?php
$var = array('one','of','here','Another');
$str = 'Start of sentence one. This is a wordmatch one two three four! Another, sentence here.';
foreach ($var as $val)
{
    $m = $val; // word
    $regex = '/[A-Z][^\.\!\;]*('.$m.')[^\.;!]*/';
    //
    if (preg_match($regex, $str, $match))
    {
        echo $match[0];
        echo "\n";
    }
}
Why did it not print the last sentence twice, even though both "here" and "Another" appear in it?
How can I skip a sentence if it is already in the list? I want to remove the redundancy and store the sentences in some data structure or variable so that I can use them later.

I'd say your approach is a bit too convoluted. It's easier to:
first get all sentences,
and then filter this set by your criteria.
E.g:
// keywords to search for
$needles = array('one', 'of', 'here', 'Another');
// input text
$text = 'Start of sentence one. This is a wordmatch one two three four! Another, sentence here.';
// get all sentences (the pattern could be too simple though)
if (preg_match_all('/.+?[!.]\s*/', $text, $match)) {
    // select only those fitting the criteria
    $hits = array_filter($match[0], function ($sentence) use($needles) {
        // check each keyword
        foreach ($needles as $needle) {
            // return early on first hit (or-condition)
            if (false !== strpos($sentence, $needle)) {
                return true;
            }
        }
        return false;
    });
    // log output
    print_r($hits);
}
demo: http://ideone.com/pZfOb5
Notes regarding:
if (preg_match_all('/.+?[!.]\s*/', $text, $match)) {
About the pattern:
.+? // select at least one char, ungreedy
[!.] // until one of the given sentence
// delimiters is found (could/should be extended as needed)
\s* // add all following whitespace
array_filter($match[0], function ($sentence) use($needles) {
array_filter does just what its name suggests: it returns a filtered version of the input array (here $match[0]). The supplied callback (the inline function) gets called for each element of the array and should return true or false depending on whether the current element should be part of the new array.
The use-syntax allows access to the $needles-array, which is needed inside the function.
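To keep the matching sentences around for later use, the same idea can be wrapped in a function that returns them as a plain list. This is only a sketch: the name sentencesContaining is hypothetical (not from the original answer), and it reuses the same simple sentence pattern. Because every sentence is tested exactly once, a sentence can never appear twice in the result, no matter how many keywords it contains.

```php
<?php
// Hypothetical helper: returns each sentence of $text that contains
// at least one of the $needles, without duplicates.
function sentencesContaining(string $text, array $needles): array
{
    // Same simple sentence pattern as above; extend the delimiters as needed.
    if (!preg_match_all('/.+?[!.]\s*/', $text, $match)) {
        return [];
    }
    $hits = array_filter($match[0], function ($sentence) use ($needles) {
        foreach ($needles as $needle) {
            if (strpos($sentence, $needle) !== false) {
                return true; // one keyword is enough
            }
        }
        return false;
    });
    // array_filter preserves the original keys, so reindex the result;
    // trim removes the whitespace captured after each delimiter.
    return array_values(array_map('trim', $hits));
}

$text = 'Start of sentence one. This is a wordmatch one two three four! Another, sentence here.';
$sentences = sentencesContaining($text, ['one', 'of', 'here', 'Another']);
print_r($sentences);
```

The returned array can then be passed around or looped over like any other list.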

This will solve your problem
<?php
$var = array('one','of','here','Another');
$str = 'Start of sentence one. This is a wordmatch one two three four! Another, sentence here.';
foreach ($var as $val)
{
if (stripos($str,$val) !== false)
{
echo $val;
echo "\n";
}
}

Related

Get characters right after match in a foreach loop

On my site I want to detect if someone mentions a username in a comment, like so: what's up /u/username.
How exactly can I extract the characters following /u/ in a foreach loop?
Something like this:
if (strpos($commentString, '/u/') !== false) {
foreach /u/ in $commentString {
$username = the text immediately after /u/, stopping at anything that isn't a letter or a number
}
}
You can use preg_match_all with a regex of
/u/([a-z0-9]+)
to capture the usernames in the text. For example:
$text = "what's up /u/username have you seen /u/user21 today?";
preg_match_all('#/u/([a-z0-9]+)#i', $text, $matches);
foreach ($matches[1] as $user) {
echo "found user $user\n";
}
Output:
found user username
found user user21
Demo on 3v4l.org

preg_match Array String Replacement

I have an array, $track['context'], which outputs the following:
常州, 的, 妈咪ZA, 已揽件快件已从, 常州, 发出快件到达
Each of these is a tracking detail.
I am running the array through the following code to try to run preg_match on each item in $track['context'], and then replace it if a string from $badWords is present:
$badWords = ['常州', '的']; // I would want these to end up being ['one', 'two']
$arrayToCheck = $track['context'];
foreach ($badWords as $badWord) {
if (preg_match("/($badWord)/", $arrayToCheck)) {
// Do I run my preg_match function here?
}
}
I would suggest a data structure for the bad words, where the word is the key and the replacement the value in an associative array.
Then you could loop over your content array, and do the replacement with a callback function:
// Sample data:
$track['context'] = array(
'qdf 常州',
'fdhlkjfq fdkq ',
'的 fdsqfsf'
);
// Make a translation table for the bad words:
$badWords = [
'常州' => 'one',
'的' => 'two'
];
// Build a regular expression that matches any of the above words:
$regexp = "/\b(" . implode('|', array_map('preg_quote', array_keys($badWords))) . ")\b/u";
// Iterate over the content
foreach ($track['context'] as &$subject) {
    $subject = preg_replace_callback($regexp, function($matches) use ($badWords) {
        // Replace the matched bad word with what we have mapped for it:
        return $badWords[$matches[0]];
    }, $subject);
}
// Output results:
print_r ($track['context']);
See it run on eval.in
First of all, make $badWords an array, like:
$badWords = array('bad_word1', 'bad_word2');
Second, I would use the strpos function, which finds the first occurrence of one string inside another.
Lastly, remember to set $noBadWordsFound to false in your code.
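A rough sketch of what the strpos approach could look like. The $replacements map and the sample $context array are assumptions made for illustration; only the words themselves come from the question.

```php
<?php
// Assumed mapping of bad word => replacement (illustrative only).
$replacements = ['常州' => 'one', '的' => 'two'];
// Assumed sample of the tracking details from the question.
$context = ['常州', '的', '妈咪ZA'];

foreach ($context as $i => $item) {
    foreach ($replacements as $badWord => $replacement) {
        // strpos returns false when the word is absent; compare strictly,
        // because a hit at position 0 would otherwise look like "not found".
        if (strpos($item, $badWord) !== false) {
            $context[$i] = str_replace($badWord, $replacement, $context[$i]);
        }
    }
}

print_r($context);
```

strpos and str_replace compare bytes, which is fine for finding and replacing whole UTF-8 substrings like these.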

Skip the sentence that has certain first word

I want to check the first word of some sentences. If the first word is For, And, Nor, But, Or, etc., I want to skip the sentence.
Here's the code:
<?php
$sentence = 'For me more';
$arr = explode(' ',trim($sentence));
if(stripos($arr[0],'for') or stripos($arr[0],'but') or stripos($arr[0],'it')){
//doing something
}
?>
The result is blank. What's wrong? Thank you :)
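Here, stripos returns 0 when the word is found at the start of the string (position 0), and 0 evaluates to false in a condition.
It returns false if the word is not found, so the two cases have to be told apart with a strict comparison.
You should write: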
if(stripos($arr[0],'for') !== false or stripos($arr[0],'but') !== false or stripos($arr[0],'it') !== false){
//skip
}
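The pitfall is easy to reproduce in isolation. A minimal check, using the same first word as in the question:

```php
<?php
$pos = stripos('For', 'for');

var_dump($pos);                    // int(0): found at the very start
var_dump((bool) $pos);             // bool(false): 0 is falsy, so a bare if() skips it
var_dump($pos !== false);          // bool(true): the strict comparison detects the hit
var_dump(stripos('For', 'but'));   // bool(false): really not found
```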
stripos returns the position of the first occurrence of the needle in the haystack.
Here the first occurrence is at position 0, which evaluates to false.
Try this as an alternative
$sentence = 'For me more';
// make all words lowercase
$arr = explode(' ', strtolower(trim($sentence)));
if(in_array($arr[0], array('for', 'but', 'it'))) {
//doing something
echo "found: $sentence";
} else {
echo 'failed';
}
Perhaps use preg_filter if you are going to know what the string to be evaluated is (i.e. you don't need to parse out sentences).
$filter_array = array(
    '/^for\s/i',
    '/^and\s/i',
    '/^nor\s/i',
    // etc.
);
$sentence = 'For me more';
// preg_filter takes the pattern(s) first, then the replacement, then the subject
$result = preg_filter($filter_array, '', trim($sentence));
if ($result === null) {
    // this sentence did not match the filters
}
This lets you define a set of filter regex patterns and check whether any of them matches. Note that in this case I just used '' as the "replacement" value, since you don't really care about actually making a replacement; this function just gives you a nice way to pass in an array of regular expressions.
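If there are several sentences to go through, a single alternation with a word boundary may be enough. The following is a sketch under assumptions: the sample sentences and the variable names are made up for illustration, and the word list only covers the conjunctions named in the question.

```php
<?php
// Assumed sample input.
$sentences = ['For me more', 'Nothing to skip here', 'but wait', 'Order matters'];

// ^ anchors at the first word; \b makes sure "Or" does not match "Order".
$skipPattern = '/^(?:for|and|nor|but|or)\b/i';

$kept = [];
foreach ($sentences as $sentence) {
    if (preg_match($skipPattern, trim($sentence))) {
        continue; // first word is on the skip list
    }
    $kept[] = $sentence;
}

print_r($kept);
```

Compared with stripos on the first word, this checks the whole word, so a sentence starting with "Forest" or "Italy" is not skipped by accident.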

mb_eregi_replace multiple matches get them

$string = 'test check one two test3';
$result = mb_eregi_replace ( 'test|test2|test3' , '<$1>' ,$string ,'i');
echo $result;
This should deliver: <test> check one two <test3>
Is it possible to find out that test and test3 were matched, without using another match function?
You can use preg_replace_callback instead:
$string = 'test check one two test3';
$matches = array();
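// longer alternatives go first: alternation tries them from left to right
$result = preg_replace_callback('/test3|test2|test/i' , function($match) use (&$matches) {
    // $matches is imported by reference so it is still filled after the call
    $matches[] = $match;
    return '<'.$match[0].'>';
}, $string);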
echo $result;
Here preg_replace_callback will call the passed callback function for each match of the pattern (note that its syntax differs from POSIX). In this case the callback function is an anonymous function that adds the match to the $matches array and returns the substitution string that the matches are to be replaced by.
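Two details decide whether this works in practice: a closure imports variables by value unless they are prefixed with &, and a regex alternation tries its branches from left to right, so "test" would win over "test3" if it were listed first. A self-contained sketch ($found is a name chosen here for illustration):

```php
<?php
$string = 'test check one two test3';
$found = [];

$result = preg_replace_callback(
    '/test3|test2|test/i',             // longest alternatives first
    function ($m) use (&$found) {      // & lets the closure write to $found
        $found[] = $m[0];
        return '<' . $m[0] . '>';
    },
    $string
);

echo $result, "\n";
print_r($found);
```

After the call, $result holds the replaced string and $found lists the matched words in order of appearance.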
Another approach would be to use preg_split to split the string at the matched delimiters while also capturing the delimiters:
// the capturing group is what makes PREG_SPLIT_DELIM_CAPTURE return the delimiters
$parts = preg_split('/(test3|test2|test)/i', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
The result is an array of alternating non-matching and matching parts.
As far as I know, eregi is deprecated.
You could do something like this:
<?php
$str = 'test check one two test3';
$to_match = array("test", "test2", "test3");
$rep = array();
foreach($to_match as $val){
$rep[$val] = "<$val>";
}
echo strtr($str, $rep);
?>
This too allows you to easily add more strings to replace.
The following function checks whether every one of the given words occurs in the string:
<?php
function searchword($string, $words)
{
$matchFound = count($words); // number of words that all have to match
$tempMatch = 0;
foreach ( $words as $word )
{
preg_match('/'.$word.'/',$string,$matches);
//print_r($matches);
if(!empty($matches))
{
$tempMatch++;
}
}
if($tempMatch==$matchFound)
{
return "found";
}
else
{
return "notFound";
}
}
$string = "test check one two test3";
/*** an array of words to highlight ***/
$words = array('test', 'test3');
$string = searchword($string, $words);
echo $string;
?>
If your string is utf-8, you could use preg_replace instead
$string = 'test check one two test3';
$result = preg_replace('/(test|test2|test3)/ui' , '<$1>' ,$string);
echo $result;
Obviously, with this kind of data to match, the result will be suboptimal:
<test> check one two <test>3
You'll need a longer approach than a direct search and replace with regular expressions (especially if some of your patterns are prefixes of other patterns).
To begin with, the code you want to enhance does not seem to do what it is meant to (at least not on my computer). You can try something like this:
$string = 'test check one two test3';
$result = mb_eregi_replace('(test|test2|test3)', '<\1>', $string);
echo $result;
I've removed the i flag (which of course makes little sense here). You'd still need to make the expression greedy, though.
As for the original question, here's a little proof of concept:
function replace($match){
$GLOBALS['matches'][] = $match;
return "<$match>";
}
$string = 'test check one two test3';
$matches = array();
$result = mb_eregi_replace('(test|test2|test3)', 'replace(\'\1\')', $string, 'e');
var_dump($result, $matches);
Please note this code is horrible and potentially insecure. I'd honestly go with the preg_replace_callback() solution proposed by Gumbo.

determine if a string contains one of a set of words in an array

I need a simple word filter that will kill a script if it detects a filtered word in a string.
Say my words are as below:
$showstopper = array('badword1', 'badword2', 'badword3', 'badword4');
$yourmouth = "im gonna badword3 you up";
if(something($yourmouth, $showstopper)){
//stop the show
}
You could implode the array of badwords into a regular expression, and see if it matches against the haystack. Or you could simply cycle through the array, and check each word individually.
From the comments:
$re = "/(" . implode("|", $showstopper) . ")/"; // '/(badword1|badword2)/'
if (preg_match($re, $yourmouth) > 0) { die("foulmouth"); }
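A slightly more defensive variant of the regex approach, as a sketch. The function name containsBadWord is hypothetical, and two things differ from the snippet above by assumption: the words are escaped with preg_quote, and \b limits matches to whole words, which may or may not be what you want.

```php
<?php
// Hypothetical helper: true if $text contains any of $badWords as a whole word.
function containsBadWord(string $text, array $badWords): bool
{
    if (!$badWords) {
        return false; // an empty alternation would match everywhere
    }
    // Escape regex metacharacters and the "/" delimiter in every word.
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $badWords);

    $re = '/\b(?:' . implode('|', $quoted) . ')\b/i';
    return preg_match($re, $text) === 1;
}

$showstopper = ['badword1', 'badword2', 'badword3', 'badword4'];
var_dump(containsBadWord('im gonna badword3 you up', $showstopper));
var_dump(containsBadWord('nothing to see here', $showstopper));
```

In the filter itself, the call would sit in the if condition in place of something($yourmouth, $showstopper).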
in_array() is your friend
$yourmouth_array = explode(' ',$yourmouth);
foreach($yourmouth_array as $key=>$w){
    if (in_array($w, $showstopper)) {
        // stop the show, like, replace that element with '***'
        $yourmouth_array[$key]= '***';
    }
}
$yourmouth = implode(' ',$yourmouth_array);
You might want to benchmark this vs the foreach and preg_match approaches.
$showstopper = array('badword1', 'badword2', 'badword3', 'badword4');
$yourmouth = "im gonna badword3 you up";
$check = str_replace($showstopper, '****', $yourmouth, $count);
if($count > 0) {
//stop the show
}
A fast solution involves checking the key as this does not need to iterate over the array. It would require a modification of your bad words list, however.
$showstopper = array('badword1' => 1, 'badword2' => 1, 'badword3' => 1, 'badword4' => 1);
$yourmouth = "im gonna badword3 you up";
// split words on space
$words = explode(' ', $yourmouth);
foreach($words as $word) {
// filter extraneous characters out of the word
$word = preg_replace('/[^A-Za-z0-9]*/', '', $word);
// check for bad word match
if (isset($showstopper[$word])) {
die('game over');
}
}
The preg_replace ensures users don't abuse your filter by typing something like bad_word3. It also ensures the array key check doesn't bomb.
Not sure why you would need to do this, but here's a way to check for and get the bad words that were used:
$showstopper = array('badword1', 'badword2', 'badword3', 'badword4');
$yourmouth = "im gonna badword3 you up badword1";
function badWordCheck( $var ) {
    global $yourmouth;
    // compare against false explicitly: a match at position 0 is also a hit
    return strpos($yourmouth, $var) !== false;
}
print_r(array_filter($showstopper, 'badWordCheck'));
array_filter() returns an array of the bad words that were found, so if its count() is 0, nothing bad was said.
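The same check can be written without a global by passing a closure to array_filter. This is a sketch with an assumed sample sentence that starts with a bad word, to show that a hit at position 0 is counted too:

```php
<?php
$showstopper = ['badword1', 'badword2', 'badword3', 'badword4'];
// Assumed sample input; note that it starts with a bad word.
$yourmouth = 'badword1 is what you said, then badword3';

$used = array_filter($showstopper, function ($word) use ($yourmouth) {
    return strpos($yourmouth, $word) !== false;
});

// array_filter keeps the original keys (0 and 2 here), so reindex.
$used = array_values($used);

print_r($used);
if (count($used) > 0) {
    echo "stop the show\n";
}
```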
