Check if a string contains multiple specific words - php

How can I check whether a string contains multiple specific words?
I can check for a single word using the following code:
$data = "text text text text text text text bad text text naughty";
if (strpos($data, 'bad') !== false) {
echo 'true';
}
But I want to check for more words, something like this:
$data = "text text text text text text text bad text text naughty";
if (strpos($data, 'bad || naughty') !== false) {
echo 'true';
}
(If any of these words is found, it should return true.)
But the code above does not work correctly. Any idea what I'm doing wrong?

For this, you will need Regular Expressions and the preg_match function.
Something like:
if(preg_match('(bad|naughty)', $data) === 1) { }
The reason your attempt didn't work
strpos() does not evaluate expressions; it searches for the needle exactly as written. The || inside the quotes is not an operator at all, so it is treated as part of the string.
In other words, your code looks for the literal text 'bad || naughty' rather than for either word. In a regular expression, alternation is written with a single | instead.

You can't do something like this:
if (strpos($data, 'bad || naughty') !== false) {
Instead, you can use a regex:
if(preg_match("/(bad|naughty|other)/i", $data)){
// one of these strings was found
}

strpos searches for the exact string you pass as the second parameter. If you want to check for multiple words, you have to resort to different tools:
regular expressions
if(preg_match("/\b(bad|naughty)\b/", $data)){
echo "Found";
}
(preg_match returns 1 if there is a match in the string, 0 otherwise.)
multiple strpos calls
if (strpos($data, 'bad')!==false or strpos($data, 'naughty')!== false) {
echo "Found";
}
explode
if (count(array_intersect(explode(' ', $data),array('bad','naughty')))) {
echo "Found";
}
The preferred solution, to me, is the first. It is clear and, although perhaps not the most efficient because of the regex, it does not report false positives: for example, it will not trigger the echo if the string contains the word badminton.
The regular expression can become a burden to write if there are a lot of words, but that is nothing you cannot solve with a line of PHP: $regex = '/\b('.join('|', $badWords).')\b/';
The second one is straightforward but can't differentiate bad from badminton.
The third splits the string into words only where they are separated by a space; a tab character will ruin your results.
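The one-line regex builder mentioned above can be made a little more robust by escaping each word with preg_quote() before joining. The following is only a minimal sketch; the helper name containsBadWord and the sample word list are assumptions made for illustration.

```php
<?php
// Hypothetical helper: true if $text contains any of $words as a whole word.
function containsBadWord(string $text, array $words): bool
{
    if ($words === []) {
        return false; // an empty alternation would match at any word boundary
    }
    // Escape regex metacharacters (and the "/" delimiter) in every word.
    $escaped = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $words);
    $regex = '/\b(' . implode('|', $escaped) . ')\b/i';
    return preg_match($regex, $text) === 1;
}

var_dump(containsBadWord('text bad text naughty', ['bad', 'naughty'])); // bool(true)
var_dump(containsBadWord('I play badminton', ['bad', 'naughty']));      // bool(false)
```

The i modifier makes the check case-insensitive; drop it if case matters.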

if(preg_match('[bad|naughty]', $data) === true) { }
The above is not quite correct.
"preg_match() returns 1 if the pattern matches given subject, 0 if it does not, or FALSE if an error occurred."
So it should be just:
if(preg_match('[bad|naughty]', $data)) { }

substr_count()
I want to add one more way of doing it, with substr_count() (in addition to all the other answers):
if (substr_count($data, 'bad') || substr_count($data, 'naughty')){
echo "Found";
}
substr_count() counts how many times the substring appears, so when it returns 0 you know that it was not found.
I would say this way is more readable than using strpos() (which was mentioned in one of the answers):
if (strpos($data, 'bad')!==false || strpos($data, 'naughty')!== false) {
echo "Found";
}

You have to call strpos for each word. Right now you are checking whether the string contains the literal text
'bad || naughty'
which it doesn't.
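Calling strpos once per word can be wrapped in a small loop. This is a sketch rather than a definitive implementation; the helper name containsAny is an assumption.

```php
<?php
// Hypothetical helper: true if $haystack contains at least one of $needles.
function containsAny(string $haystack, array $needles): bool
{
    foreach ($needles as $needle) {
        // Skip empty needles; how strpos() treats '' depends on the PHP version.
        if ($needle !== '' && strpos($haystack, $needle) !== false) {
            return true;
        }
    }
    return false;
}

$data = "text text text text text text text bad text text naughty";
var_dump(containsAny($data, ['bad', 'naughty'])); // bool(true)
var_dump(containsAny($data, ['bad || naughty'])); // bool(false): searched as literal text
```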

A simple solution using an array for the words to be tested and the array_reduce() function:
$words_in_data = array_reduce( array( 'bad', 'naughty' ), function ( $carry, $check ) use ( $data ) {
return ! $carry ? false !== strpos( $data, $check ) : $carry;
} );
Then you can simply use:
if( $words_in_data ){
echo 'true';
}

Here is a function that can perform this operation without using regular expressions, which could be slower. Instead of passing a single string as the needle, pass an array like this:
if (strposMultiple($data, ['bad', 'naughty']) !== false) {
//...
}
Here is the function:
function strposMultiple($haystack, $needle, $offset = 0) {
if(is_string($needle))
return strpos($haystack, $needle, $offset);
else {
$min = false;
foreach($needle as $n) {
$pos = strpos($haystack, $n, $offset);
if($pos !== false && ($min === false || $pos < $min)) { // ignore needles that were not found
$min = $pos;
}
}
return $min;
}
}

Related

Strpos returns false positives, what is a better php function to search for a specific word? [duplicate]

Consider:
$a = 'How are you?';
if ($a contains 'are')
echo 'true';
Suppose I have the code above, what is the correct way to write the statement if ($a contains 'are')?
Now with PHP 8 you can do this using str_contains:
if (str_contains('How are you', 'are')) {
echo 'true';
}
RFC
Before PHP 8
You can use the strpos() function which is used to find the occurrence of one string inside another one:
$haystack = 'How are you?';
$needle = 'are';
if (strpos($haystack, $needle) !== false) {
echo 'true';
}
Note that the use of !== false is deliberate (neither != false nor === true will return the desired result); strpos() returns either the offset at which the needle string begins in the haystack string, or the boolean false if the needle isn't found. Since 0 is a valid offset and 0 is "falsey", we can't use simpler constructs like !strpos($a, 'are').
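The pitfall described above is easiest to see when the needle sits at the very start of the haystack. A short sketch, with a sample string chosen only for illustration:

```php
<?php
$haystack = 'are you there?';
$position = strpos($haystack, 'are');

var_dump($position);           // int(0): found at the very start
var_dump($position != false);  // bool(false): loose comparison treats 0 like false
var_dump($position !== false); // bool(true): strict comparison tells 0 and false apart
var_dump(strpos($haystack, 'xyz') !== false); // bool(false): genuinely not found
```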
You could use regular expressions as it's better for word matching compared to strpos, as mentioned by other users. A strpos check for are will also return true for strings such as: fare, care, stare, etc. These unintended matches can simply be avoided in regular expression by using word boundaries.
A simple match for are could look something like this:
$a = 'How are you?';
if (preg_match('/\bare\b/', $a)) {
echo 'true';
}
On the performance side, strpos is about three times faster. When I did one million compares at once, it took preg_match 1.5 seconds to finish and for strpos it took 0.5 seconds.
Edit:
In order to search any part of the string, not just word by word, I would recommend using a regular expression like
$a = 'How are you?';
$search = 'are y';
if(preg_match("/{$search}/i", $a)) {
echo 'true';
}
The i at the end of the regular expression makes it case-insensitive; if you do not want that, you can leave it out.
Note that this can be problematic because the $search string isn't sanitized in any way: if $search comes from user input, it may contain characters that the regex engine interprets as pattern syntax, so the check can fail or match something unintended. Escaping it with preg_quote() first avoids this.
Also, Regex101 is a great tool for testing and seeing explanations of various regular expressions.
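One way to deal with unsanitized search text is to pass it through preg_quote() before building the pattern. A minimal sketch, assuming the "/" delimiter and a hypothetical helper name containsLiteral:

```php
<?php
// Hypothetical helper: case-insensitive substring search via a regex,
// with the search text escaped so it is always matched literally.
function containsLiteral(string $haystack, string $search): bool
{
    $pattern = '/' . preg_quote($search, '/') . '/i';
    return preg_match($pattern, $haystack) === 1;
}

var_dump(containsLiteral('How are you?', 'are y')); // bool(true)
var_dump(containsLiteral('How are you?', 'you?'));  // bool(true): "?" is matched literally
var_dump(containsLiteral('How are you', '.*'));     // bool(false): ".*" is not a wildcard here
```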
To combine both sets of functionality into a single multi-purpose function (including with selectable case sensitivity), you could use something like this:
function FindString($needle,$haystack,$i,$word)
{ // $i should be "" or "i" for case insensitive
if (strtoupper($word)=="W")
{ // if $word is "W" then word search instead of string in string search.
if (preg_match("/\b{$needle}\b/{$i}", $haystack))
{
return true;
}
}
else
{
if(preg_match("/{$needle}/{$i}", $haystack))
{
return true;
}
}
return false;
// Put quotes around true and false above to return them as strings instead of as bools/ints.
}
One more thing to keep in mind is that \b may not work as expected in languages other than English.
The explanation for this and the solution is taken from here:
\b represents the beginning or end of a word (Word Boundary). This
regex would match apple in an apple pie, but wouldn’t match apple in
pineapple, applecarts or bakeapples.
How about “café”? How can we extract the word “café” in regex?
Actually, \bcafé\b wouldn’t work. Why? Because “café” contains
non-ASCII character: é. \b can’t be simply used with Unicode such as
समुद्र, 감사, месяц and 😉 .
When you want to extract Unicode characters, you should directly
define characters which represent word boundaries.
The answer: (?<=[\s,.:;"']|^)UNICODE_WORD(?=[\s,.:;"']|$)
So in order to use the answer in PHP, you can use this function:
function contains($str, $word) {
// Works in Hebrew and any other unicode characters
// Thanks https://medium.com/#shiba1014/regex-word-boundaries-with-unicode-207794f6e7ed
// Thanks https://www.phpliveregex.com/
if (preg_match('/(?<=[\s,.:;"\']|^)' . $word . '(?=[\s,.:;"\']|$)/', $str)) return true;
return false; // no match
}
And if you want to search for array of words, you can use this:
function arrayContainsWord($str, array $arr)
{
foreach ($arr as $word) {
// Works in Hebrew and any other unicode characters
// Thanks https://medium.com/#shiba1014/regex-word-boundaries-with-unicode-207794f6e7ed
// Thanks https://www.phpliveregex.com/
if (preg_match('/(?<=[\s,.:;"\']|^)' . $word . '(?=[\s,.:;"\']|$)/', $str)) return true;
}
return false;
}
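A variant of the same idea, not taken from the quoted article, is to use the u modifier together with lookarounds on Unicode character properties, so that any letter or digit counts as part of a word. This is only a sketch: the helper name containsUnicodeWord is an assumption, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper: whole-word match for UTF-8 text.
// A "word character" is assumed to be any Unicode letter, digit, or underscore.
function containsUnicodeWord(string $text, string $word): bool
{
    $pattern = '/(?<![\p{L}\p{N}_])' . preg_quote($word, '/') . '(?![\p{L}\p{N}_])/u';
    return preg_match($pattern, $text) === 1;
}

var_dump(containsUnicodeWord('Un café, s\'il vous plaît', 'café')); // bool(true)
var_dump(containsUnicodeWord('La cafétéria est fermée', 'café'));   // bool(false)
var_dump(containsUnicodeWord('один месяц', 'месяц'));               // bool(true)
var_dump(containsUnicodeWord('в этом месяце', 'месяц'));            // bool(false)
```

Unlike the fixed list of separators, this treats every character that is not a letter, digit, or underscore as a boundary.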
As of PHP 8.0.0 you can now use str_contains
<?php
if (str_contains('abc', '')) {
echo "Checking the existence of the empty string will always return true";
}
Here is a little utility function that is useful in situations like this
// returns true if $needle is a substring of $haystack
function contains($needle, $haystack)
{
return strpos($haystack, $needle) !== false;
}
To determine whether a string contains another string you can use the PHP function strpos().
int strpos ( string $haystack , mixed $needle [, int $offset = 0 ] )
<?php
$haystack = 'how are you';
$needle = 'are';
if (strpos($haystack,$needle) !== false) {
echo "$haystack contains $needle";
}
?>
CAUTION:
If the needle you are searching for is at the beginning of the haystack, strpos() returns position 0. A loose comparison with == (or !=) against false will then give the wrong answer; you need the strict === (or !==).
A == sign is a comparison and tests whether the variable / expression / constant to the left has the same value as the variable / expression / constant to the right.
A === sign is a comparison that tests whether two variables / expressions / constants are equal AND have the same type, i.e. both are strings or both are integers.
One of the advantages of using this approach is that every PHP version supports this function, unlike str_contains().
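For code that has to run on both older and newer PHP versions, one common approach is to define str_contains() only when it is missing. The sketch below assumes that mirroring the PHP 8 behaviour for the empty needle is what you want.

```php
<?php
// Define str_contains() only on PHP versions that lack it (before 8.0).
if (!function_exists('str_contains')) {
    function str_contains($haystack, $needle)
    {
        // PHP 8 reports the empty string as always contained.
        return $needle === '' || strpos($haystack, $needle) !== false;
    }
}

var_dump(str_contains('How are you?', 'are')); // bool(true)
var_dump(str_contains('How are you?', 'Are')); // bool(false): case-sensitive
var_dump(str_contains('abc', ''));             // bool(true)
```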
While most of these answers will tell you if a substring appears in your string, that's usually not what you want if you're looking for a particular word, and not a substring.
What's the difference? Substrings can appear within other words:
The "are" at the beginning of "area"
The "are" at the end of "hare"
The "are" in the middle of "fares"
One way to mitigate this would be to use a regular expression coupled with word boundaries (\b):
function containsWord($str, $word)
{
return !!preg_match('#\\b' . preg_quote($word, '#') . '\\b#i', $str);
}
This method doesn't have the same false positives noted above, but it does have some edge cases of its own. Word boundaries match on non-word characters (\W), which are going to be anything that isn't a-z, A-Z, 0-9, or _. That means digits and underscores are going to be counted as word characters and scenarios like this will fail:
The "are" in "What _are_ you thinking?"
The "are" in "lol u dunno wut those are4?"
If you want anything more accurate than this, you'll have to start doing English language syntax parsing, and that's a pretty big can of worms (and assumes proper use of syntax, anyway, which isn't always a given).
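If digits and underscores should count as separators rather than as word characters, one option is to replace \b with lookarounds that only look at letters. This is a sketch under the assumption that the text uses ASCII letters only; the helper name containsWordLettersOnly is hypothetical.

```php
<?php
// Hypothetical helper: $word must not be directly preceded or followed by a letter.
function containsWordLettersOnly(string $str, string $word): bool
{
    // With the i modifier, [a-z] also covers upper-case letters.
    $pattern = '#(?<![a-z])' . preg_quote($word, '#') . '(?![a-z])#i';
    return preg_match($pattern, $str) === 1;
}

var_dump(containsWordLettersOnly('What _are_ you thinking?', 'are'));    // bool(true)
var_dump(containsWordLettersOnly('lol u dunno wut those are4?', 'are')); // bool(true)
var_dump(containsWordLettersOnly('The fares went up', 'are'));           // bool(false)
```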
Look at strpos():
<?php
$mystring = 'abc';
$findme = 'a';
$pos = strpos($mystring, $findme);
// Note our use of ===. Simply, == would not work as expected
// because the position of 'a' was the 0th (first) character.
if ($pos === false) {
echo "The string '$findme' was not found in the string '$mystring'.";
} else {
echo "The string '$findme' was found in the string '$mystring',";
echo " and exists at position $pos.";
}
Using strstr() or stristr() if your search should be case insensitive would be another option.
Peer to SamGoody and Lego Stormtroopr comments.
If you are looking for a PHP algorithm to rank search results based on proximity/relevance of multiple words
here comes a quick and easy way of generating search results with PHP only:
Issues with the other boolean search methods such as strpos(), preg_match(), strstr() or stristr()
can't search for multiple words
results are unranked
PHP method based on Vector Space Model and tf-idf (term frequency–inverse document frequency):
It sounds difficult but is surprisingly easy.
If we want to search for multiple words in a string the core problem is how we assign a weight to each one of them?
If we could weight the terms in a string based on how representative they are of the string as a whole,
we could order our results by the ones that best match the query.
This is the idea of the vector space model, not far from how SQL full-text search works:
function get_corpus_index($corpus = array(), $separator=' ') {
$dictionary = array();
$doc_count = array();
foreach($corpus as $doc_id => $doc) {
$terms = explode($separator, $doc);
$doc_count[$doc_id] = count($terms);
// tf–idf, short for term frequency–inverse document frequency,
// according to wikipedia is a numerical statistic that is intended to reflect
// how important a word is to a document in a corpus
foreach($terms as $term) {
if(!isset($dictionary[$term])) {
$dictionary[$term] = array('document_frequency' => 0, 'postings' => array());
}
if(!isset($dictionary[$term]['postings'][$doc_id])) {
$dictionary[$term]['document_frequency']++;
$dictionary[$term]['postings'][$doc_id] = array('term_frequency' => 0);
}
$dictionary[$term]['postings'][$doc_id]['term_frequency']++;
}
//from http://phpir.com/simple-search-the-vector-space-model/
}
return array('doc_count' => $doc_count, 'dictionary' => $dictionary);
}
function get_similar_documents($query='', $corpus=array(), $separator=' '){
$similar_documents=array();
if($query!=''&&!empty($corpus)){
$words=explode($separator,$query);
$corpus=get_corpus_index($corpus, $separator);
$doc_count=count($corpus['doc_count']);
foreach($words as $word) {
if(isset($corpus['dictionary'][$word])){
$entry = $corpus['dictionary'][$word];
foreach($entry['postings'] as $doc_id => $posting) {
//get term frequency–inverse document frequency
$score=$posting['term_frequency'] * log($doc_count + 1 / $entry['document_frequency'] + 1, 2);
if(isset($similar_documents[$doc_id])){
$similar_documents[$doc_id]+=$score;
}
else{
$similar_documents[$doc_id]=$score;
}
}
}
}
// length normalise
foreach($similar_documents as $doc_id => $score) {
$similar_documents[$doc_id] = $score/$corpus['doc_count'][$doc_id];
}
// sort from high to low
arsort($similar_documents);
}
return $similar_documents;
}
CASE 1
$query = 'are';
$corpus = array(
1 => 'How are you?',
);
$match_results=get_similar_documents($query,$corpus);
echo '<pre>';
print_r($match_results);
echo '</pre>';
RESULT
Array
(
[1] => 0.52832083357372
)
CASE 2
$query = 'are';
$corpus = array(
1 => 'how are you today?',
2 => 'how do you do',
3 => 'here you are! how are you? Are we done yet?'
);
$match_results=get_similar_documents($query,$corpus);
echo '<pre>';
print_r($match_results);
echo '</pre>';
RESULTS
Array
(
[1] => 0.54248125036058
[3] => 0.21699250014423
)
CASE 3
$query = 'we are done';
$corpus = array(
1 => 'how are you today?',
2 => 'how do you do',
3 => 'here you are! how are you? Are we done yet?'
);
$match_results=get_similar_documents($query,$corpus);
echo '<pre>';
print_r($match_results);
echo '</pre>';
RESULTS
Array
(
[3] => 0.6813781191217
[1] => 0.54248125036058
)
There are plenty of improvements to be made,
but the model provides a way of getting good results from natural-language queries,
which contain no boolean operators and which strpos(), preg_match(), strstr() or stristr() cannot rank.
NOTA BENE
Optionally, eliminate redundancy before indexing the words,
thereby reducing the index size, which results in lower storage requirements,
less disk I/O,
faster indexing and consequently faster search.
1. Normalisation
Convert all text to lower case
2. Stopword elimination
Eliminate words from the text which carry no real meaning (like 'and', 'or', 'the', 'for', etc.)
3. Dictionary substitution
Replace words with others which have an identical or similar meaning.
(ex:replace instances of 'hungrily' and 'hungry' with 'hunger')
Further algorithmic measures (snowball) may be performed to further reduce words to their essential meaning.
Replacing colour names with their hexadecimal equivalents and
reducing the precision of numeric values are other ways of normalising the text.
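The first two steps listed above (normalisation and stopword elimination) could be sketched as follows. The helper name normalise_terms and the short stopword list are assumptions for illustration, and strtolower() only handles ASCII letters.

```php
<?php
// Hypothetical helper: lower-case the text, split it on anything that is not
// a letter or digit, and drop stopwords. The stopword list is only an example.
function normalise_terms(string $text, array $stopwords = ['and', 'or', 'the', 'for']): array
{
    $lower = strtolower($text);                        // 1. normalisation (ASCII only)
    $terms = preg_split('/[^a-z0-9]+/', $lower, -1, PREG_SPLIT_NO_EMPTY);
    $kept  = array_diff($terms, $stopwords);           // 2. stopword elimination
    return array_values($kept);                        // re-index from 0
}

print_r(normalise_terms('Here you are! How are you? And the rest?'));
```

The resulting terms could be joined again with implode(' ', ...) before being passed to get_corpus_index(), so that 'are!' and 'Are' are counted as the same term as 'are'.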
RESOURCES
http://linuxgazette.net/164/sephton.html
http://snowball.tartarus.org/
MySQL Fulltext Search Score Explained
http://dev.mysql.com/doc/internals/en/full-text-search.html
http://en.wikipedia.org/wiki/Vector_space_model
http://en.wikipedia.org/wiki/Tf%E2%80%93idf
http://phpir.com/simple-search-the-vector-space-model/
Make use of case-insensitive matching using stripos():
if (stripos($string,$stringToSearch) !== false) {
echo 'true';
}
If you want to avoid the "falsey" and "truthy" problem, you can use substr_count:
if (substr_count($a, 'are') > 0) {
echo "at least one 'are' is present!";
}
It's a bit slower than strpos but it avoids the comparison problems.
if (preg_match('/(are)/', $a)) {
echo 'true';
}
Another option is to use the strstr() function. Something like:
if (strlen(strstr($haystack,$needle))>0) {
// Needle Found
}
Point to note: The strstr() function is case-sensitive. For a case-insensitive search, use the stristr() function.
I'm a bit impressed that none of the answers here that used strpos, strstr and similar functions mentioned Multibyte String Functions yet (2015-05-08).
Basically, if you're having trouble finding words with characters specific to some languages, such as German, French, Portuguese, Spanish, etc. (e.g.: ä, é, ô, ç, º, ñ), you may want to precede the functions with mb_. Therefore, the accepted answer would use mb_strpos or mb_stripos (for case-insensitive matching) instead:
if (mb_strpos($a,'are') !== false) {
echo 'true';
}
If you cannot guarantee that all your data is 100% in UTF-8, you may want to use the mb_ functions.
A good article to understand why is The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky.
In PHP, the best way to verify if a string contains a certain substring, is to use a simple helper function like this:
function contains($haystack, $needle, $caseSensitive = false) {
return $caseSensitive ?
(strpos($haystack, $needle) === FALSE ? FALSE : TRUE):
(stripos($haystack, $needle) === FALSE ? FALSE : TRUE);
}
Explanation:
strpos finds the position of the first occurrence of a case-sensitive substring in a string.
stripos finds the position of the first occurrence of a case-insensitive substring in a string.
myFunction($haystack, $needle) === FALSE ? FALSE : TRUE ensures that myFunction always returns a boolean and fixes unexpected behavior when the index of the substring is 0.
$caseSensitive ? A : B selects either strpos or stripos to do the work, depending on the value of $caseSensitive.
Output:
var_dump(contains('bare','are')); // Outputs: bool(true)
var_dump(contains('stare', 'are')); // Outputs: bool(true)
var_dump(contains('stare', 'Are')); // Outputs: bool(true)
var_dump(contains('stare', 'Are', true)); // Outputs: bool(false)
var_dump(contains('hair', 'are')); // Outputs: bool(false)
var_dump(contains('aren\'t', 'are')); // Outputs: bool(true)
var_dump(contains('Aren\'t', 'are')); // Outputs: bool(true)
var_dump(contains('Aren\'t', 'are', true)); // Outputs: bool(false)
var_dump(contains('aren\'t', 'Are')); // Outputs: bool(true)
var_dump(contains('aren\'t', 'Are', true)); // Outputs: bool(false)
var_dump(contains('broad', 'are')); // Outputs: bool(false)
var_dump(contains('border', 'are')); // Outputs: bool(false)
You can use the strstr function:
$haystack = "I know programming";
$needle = "know";
$flag = strstr($haystack, $needle);
if ($flag){
echo "true";
}
Without using an inbuilt function:
$haystack = "hello world";
$needle = "llo";
$i = $j = 0;
while (isset($needle[$i])) {
while (isset($haystack[$j]) && ($needle[$i] != $haystack[$j])) {
$j++;
$i = 0;
}
if (!isset($haystack[$j])) {
break;
}
$i++;
$j++;
}
if (!isset($needle[$i])) {
echo "YES";
}
else{
echo "NO ";
}
The function below also works and does not depend on any other function; it uses only native PHP string manipulation. Personally, I do not recommend this, but you can see how it works:
<?php
if (!function_exists('is_str_contain')) {
function is_str_contain($string, $keyword)
{
if (empty($string) || empty($keyword)) return false;
$keyword_first_char = $keyword[0];
$keyword_length = strlen($keyword);
$string_length = strlen($string);
// case 1
if ($string_length < $keyword_length) return false;
// case 2
if ($string_length == $keyword_length) {
if ($string == $keyword) return true;
else return false;
}
// case 3
if ($keyword_length == 1) {
for ($i = 0; $i < $string_length; $i++) {
// Check if keyword's first char == string's first char
if ($keyword_first_char == $string[$i]) {
return true;
}
}
}
// case 4
if ($keyword_length > 1) {
for ($i = 0; $i < $string_length; $i++) {
/*
the remaining part of the string is equal or greater than the keyword
*/
if (($string_length + 1 - $i) >= $keyword_length) {
// Check if keyword's first char == string's first char
if ($keyword_first_char == $string[$i]) {
$match = 1;
for ($j = 1; $j < $keyword_length; $j++) {
if (($i + $j < $string_length) && $keyword[$j] == $string[$i + $j]) {
$match++;
}
else {
break; // mismatch at this start position; try the next one
}
}
if ($match == $keyword_length) {
return true;
}
// end if first match found
}
// end if remaining part
}
else {
return false;
}
// end for loop
}
// end case4
}
return false;
}
}
Test:
var_dump(is_str_contain("test", "t")); //true
var_dump(is_str_contain("test", "")); //false
var_dump(is_str_contain("test", "test")); //true
var_dump(is_str_contain("test", "testa")); //false
var_dump(is_str_contain("a----z", "a")); //true
var_dump(is_str_contain("a----z", "z")); //true
var_dump(is_str_contain("mystringss", "strings")); //true
Many answers that use substr_count check whether the result is > 0. But since the if statement treats zero as false, you can skip that check and write directly:
if (substr_count($a, 'are')) {
To check if not present, add the ! operator:
if (!substr_count($a, 'are')) {
I had some trouble with this, and finally I chose to create my own solution. Without using regular expression engine:
function contains($text, $word)
{
$found = false;
$spaceArray = explode(' ', $text);
$nonBreakingSpaceArray = explode(chr(160), $text);
if (in_array($word, $spaceArray) ||
in_array($word, $nonBreakingSpaceArray)
) {
$found = true;
}
return $found;
}
You may notice that the previous solutions do not handle the case where the word is part of another word. To use your example:
$a = 'How are you?';
$b = "a skirt that flares from the waist";
$c = "are";
With the samples above, both $a and $b contain $c, but you may want your function to tell you that only $a contains $c.
Another option to finding the occurrence of a word from a string using strstr() and stristr() is like the following:
<?php
$a = 'How are you?';
if (strstr($a,'are')) // Case sensitive
echo 'true';
if (stristr($a,'are')) // Case insensitive
echo 'true';
?>
It can be done in three different ways:
$a = 'How are you?';
1- stristr()
if (strlen(stristr($a,"are"))>0) {
echo "true"; // are Found
}
2- strpos()
if (strpos($a, "are") !== false) {
echo "true"; // are Found
}
3- preg_match()
if( preg_match("/are/",$a) === 1) {
echo "true"; // are Found
}
The short-hand version
$result = false!==strpos($a, 'are');
Do not use preg_match() if you only want to check if one string is contained in another string. Use strpos() or strstr() instead as they will be faster. (http://in2.php.net/preg_match)
if (strpos($text, 'string_name') !== false){
echo 'get the string';
}
In order to find a 'word', rather than the occurrence of a series of letters that could in fact be a part of another word, the following would be a good solution.
$string = 'How are you?';
$array = explode(" ", $string);
if (in_array('are', $array) ) {
echo 'Found the word';
}
You should use a case-insensitive search, so that it won't matter whether the entered value is in lower or upper case.
<?php
$grass = "This is pratik joshi";
$needle = "pratik";
if (stripos($grass,$needle) !== false) {
/*If i EXCLUDE : !== false then if string is found at 0th location,
still it will say STRING NOT FOUND as it will return '0' and it
will goto else and will say NOT Found though it is found at 0th location.*/
echo 'Contains word';
}else{
echo "does NOT contain word";
}
?>
Here stripos finds the needle in the haystack without considering case (small/caps).
Maybe you could use something like this:
<?php
findWord('Test all OK');
function findWord($text) {
if (strstr($text, 'ok')) {
echo 'Found a word';
}
else
{
echo 'Did not find a word';
}
}
?>
If you want to check whether the string contains several specific words, you can do:
$badWords = array("dette", "capitale", "rembourser", "ivoire", "mandat");
$string = "a string with the word ivoire";
$matchFound = preg_match_all("/\b(" . implode("|", $badWords) . ")\b/i", $string, $matches);
if ($matchFound) {
echo "a bad word has been found";
}
else {
echo "your string is okay";
}
This is useful to avoid spam when sending emails for example.
The strpos function works fine, but if you want to do case-insensitive checking for a word in a paragraph then you can make use of the stripos function of PHP.
For example,
$result = stripos("I love PHP, I love PHP too!", "php");
if ($result === false) {
// Word does not exist
}
else {
// Word exists
}
Find the position of the first occurrence of a case-insensitive substring in a string.
If the word doesn't exist in the string then it will return false else it will return the position of the word.
A string can be checked with the below function:
function either_String_existor_not($str, $character) {
return strpos($str, $character) !== false;
}
You need to use the identical/not identical operators because strpos can return 0 as its index value. If you like ternary operators, consider using the following (it seems a little backwards, I'll admit):
echo FALSE === strpos($a,'are') ? 'false': 'true';
Check if string contains specific words?
This means the string has to be resolved into words (see note below).
One way to do this and to specify the separators is using preg_split (doc):
<?php
function contains_word($str, $word) {
// split string into words
// separators are substrings of at least one non-word character
$arr = preg_split('/\W+/', $str, -1, PREG_SPLIT_NO_EMPTY);
// now the words can be examined each
foreach ($arr as $value) {
if ($value === $word) {
return true;
}
}
return false;
}
function test($str, $word) {
if (contains_word($str, $word)) {
echo "string '" . $str . "' contains word '" . $word . "'\n";
} else {
echo "string '" . $str . "' does not contain word '" . $word . "'\n" ;
}
}
$a = 'How are you?';
test($a, 'are');
test($a, 'ar');
test($a, 'hare');
?>
A run gives
$ php -f test.php
string 'How are you?' contains word 'are'
string 'How are you?' does not contain word 'ar'
string 'How are you?' does not contain word 'hare'
Note: Here we do not mean that every sequence of symbols is a word.
A practical definition of a word is the one used by the PCRE regular expression engine, where words are substrings consisting of word characters only, separated by non-word characters.
A "word" character is any letter or digit or the underscore character,
that is, any character which can be part of a Perl " word ". The
definition of letters and digits is controlled by PCRE's character
tables, and may vary if locale-specific matching is taking place (..)

Finding a substring within a PHP array?

I have a PHP script that loops through each row of a CSV file and organizes each line into an array:
$counter = 0;
$file = file($ReturnFile);
foreach($file as $k){
if(preg_match('/"/', $k)==1){
$csv[] = explode(',', $k);
$counter++;
}
}
...
while($x<$counter){
$line=$csv[$x];
This works; my question is about how to find a substring within each line. This:
foreach($line as $value){
if($value==$name_search){
// action
works if the value of $line is exactly equal to the value of $name_search ($name_search is a person's last name). However, this doesn't work if there is a space or additional characters in the value of $line (for example: $line equal to "Wilson (ID: 345)" or "Wilson " won't match a $name_search value of "Wilson").
So far I've tried:
if(strpos($value, $res_name_search) !== false){
if(substr($value, 0, strrpos($value, ' '))==$res_name_search){
if(substr(strval($value), 0, strrpos(strval($value), ' '))==$res_name_search){
without success ... Do I have a syntax error and/or is there a better way to accomplish this?
I think you have inverted the parameters. The following should work:
if (strpos($res_name_search, $value) !== false)
A minor note: use stripos for case-insensitive search.
Try to use strpos like this: if (strpos($res_name_search, $value))
Use PHP's trim function
Convert both values to either lowercase or uppercase before comparing
Use var_dump to check the data types
Instead of using var_dump, you can cast $value and $name_search to string
Also check with ===
Remove spaces (if required)
Use a regular expression to remove (, ), :, -, ; etc.
And, of course, apply the strpos function
You can apply the points above in your logic (the order of the points may differ).
Try this:
$str = 'This is my test: wilson ';
$search = "wilson";
if(strpos(strtolower($str), strtolower($search)) !== false){
echo 'found it';
}
You can also try this:
if (preg_match('/'.strtolower($res_name_search).'/', strtolower($value)))
This is the sort of situation for which the built-in PHP function stristr exists. This function is the case-insensitive equivalent of the strstr function which, according to the docs:
Returns part of haystack string starting from and including the first occurrence of needle to the end of haystack.
Using this function, achieving such a task becomes as easy as:
foreach($line as $value){
if( stristr($value, $name_search) !== FALSE){
// substring was found in search string, perform your action
You can read more about it in the official documentation
I hope this helps.
I advise you to use the fgetcsv function; it returns an array of your columns for each iteration, as follows:
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
// ...
}
If, in your CSV file, the NAME column is in position 2, for example, and you want to know whether the token exists, just use
if (strpos($data[2], 'Wilson') !== FALSE) {
// do your job
}
If you want a case-insensitive match, use the stripos function.
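Putting the fgetcsv and stripos suggestions together might look like the sketch below. The helper name findRowsByName, the column index, and the sample rows are assumptions; an in-memory stream stands in for the real CSV file.

```php
<?php
// Hypothetical helper: return the rows whose column $column contains $name,
// ignoring case and surrounding whitespace.
function findRowsByName($handle, int $column, string $name): array
{
    $matches = [];
    // All fgetcsv() arguments are spelled out: no length limit, comma, double quote, backslash.
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if (isset($row[$column]) && stripos(trim($row[$column]), $name) !== false) {
            $matches[] = $row;
        }
    }
    return $matches;
}

// Demo data written to an in-memory stream instead of a file on disk.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "1,\"Wilson (ID: 345)\",NY\n2,\"Smith\",LA\n3,\"wilson \",TX\n");
rewind($handle);
$rows = findRowsByName($handle, 1, 'Wilson');
fclose($handle);

echo count($rows), PHP_EOL; // 2
```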

Problems with search & replace by using substr_replace

I have tried many approaches; some work partially, some don't work at all. This one doesn't work at all because the while condition returns false all the time for some reason, so it never starts replacing.
What I basically want is to enter a string, search for a word in it, and replace that word with a different one. I managed to get that working, except that the word was not replaced correctly when it started at the 0th position of the string: the output still began with the 0th letter of the string, followed by the new word. E.g. for "old is old", where I want to replace old with new, it would echo "onew is new".
Please also tell me if I should have done anything differently, in order to have cleaner and better-optimized code that speeds up the website.
Thank you.
<?php
$offset = 0;
if (isset($_POST['user_input']) && !empty ($_POST['user_input'])) {
$initial_string = $_POST['user_input'];
$string_length = strlen($initial_string);
$new_string = $initial_string;
if (isset($_POST['search_input']) && !empty ($_POST['search_input'])) {
$search_input = $_POST['search_input'];
$search_input_length = strlen($search_input);
} else {
echo 'Please write the string that you want to replace into the Search input'.'<br>'.PHP_EOL;
}
if (isset($_POST['replace_input']) && !empty ($_POST['replace_input'])) {
$replace_input = $_POST['replace_input'];
$replace_input_length = strlen($replace_input);
} else {
echo 'Please write the string that you want to switch to into the Replace input'.'<br>'.PHP_EOL;
}
while (strpos($new_string,$search_input,$offset) === true) {
$strpos = strpos($new_string,$search_input,$offset);
if ($offset<$string_length) {
$new_string = substr_replace($new_string,$replace_input,$strpos,$search_input_length);
$offset = $offset + $replace_input_length;
} else {
break;
}
}
}
echo $new_string;
?>
<hr>
<form action="index.php" method="POST">
<textarea name="user_input" rows="7" cols="30"></textarea><Br>
Search: <input type="value" name="search_input"><br>
Replace: <input type="value" name="replace_input"><br>
<input type="submit" value="submit">
</form>
There are many things wrong with your code. These are some important things to take care of:
isset($var) && !empty($var) is redundant. empty($var) also checks if the variable is set and returns true if it is not. Just !empty($var) will suffice.
You're checking if strpos() returns the boolean value true. strpos() never returns true. It either returns the position of the needle in the haystack, or false if the needle was not found in the haystack.
Fixing your current code
Change the while condition to check if strpos() returns a non-false value (which is the case when a match is found), and resume each search just past the text that was inserted:
// The length check guards against an empty search string, which would never stop matching
while ($search_input_length > 0 && ($strpos = strpos($new_string, $search_input, $offset)) !== false)
{
$new_string = substr_replace($new_string, $replace_input, $strpos, $search_input_length);
// Continue after the replacement so the inserted text is not searched again
$offset = $strpos + $replace_input_length;
}
This should correctly output:
new is new
Working demo
A better solution
Your current code seems unnecessarily complicated. Essentially, you're just trying to replace all the occurrences of a substring in a string. This is exactly what str_replace() does. Use that function instead. Your code can then be simplified to just:
if (validation goes here) {
$new_string = str_replace($search_input, $replace_input, $new_string);
}
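As a rough illustration of the str_replace() approach, the sketch below uses hard-coded values as stand-ins for the three form fields (the variable values are assumptions, not part of the original script):

```php
<?php
// Hypothetical stand-ins for the three form fields.
$user_input    = 'old is old';
$search_input  = 'old';
$replace_input = 'new';

$new_string = $user_input;
$count = 0;

// Only replace when there is something to search for.
if ($search_input !== '') {
    // The optional fourth argument receives the number of replacements made.
    $new_string = str_replace($search_input, $replace_input, $user_input, $count);
}

echo $new_string; // new is new
```

Because str_replace() handles every occurrence in one call, there is no offset bookkeeping and a match at position 0 needs no special treatment.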
If I understand correctly, you want to replace all the occurrences of a given substring (inside a larger string) with another substring.
In order to do that, you can simply use str_replace:
http://www.php.net/manual/en/function.str-replace.php
Replace the while loop with
$new_string = str_replace( $search_input, $replace_input, $initial_string );
It's possible that I don't understand exactly what you are trying to do, but it seems a little over-complicated. Why not just run a single str_replace over the whole user_input?
if (!empty($_POST['user_input']) && !empty($_POST['search_input']) && !empty($_POST['replace_input'])) {
$str = str_replace($_POST['search_input'], $_POST['replace_input'], $_POST['user_input']);
die(var_dump($str));
} else {
die('error');
}

No success by using strpos when searching a string

I have the following code/string:
$ids="#222#,#333#,#555#";
When I'm searching for a part using:
if(strpos($ids,"#222#"))
it won't find it. But when I'm searching without the hashes, it works using:
if(strpos($ids,"222"))
I've already tried using strval for the search parameter, but that doesn't work either.
strpos starts counting from 0, and returns false if nothing is found. You need to check if it's false with === like this...
if (strpos($ids, '#222#') === false) // not found
Or use !== if you want the opposite test...
if (strpos($ids, '#222#') !== false) // found
See the PHP Manual entry for more information
You are not explicitly testing for FALSE when using strpos. Use it like this:
if(strpos($string, '#222#') !== FALSE) {
// found
} else {
// not found
}
Explanation: You are using it like this:
if(strpos($string, '#222#')) {
// found
}
What is the problem with this? Answer: strpos() returns the position in the string where the substring was found. In your case that is 0, as it's at the beginning of the string. But 0 is treated as false by PHP unless you make an explicit check with === or !==.
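The behaviour described above can be seen directly with a few var_dump() calls. This is only a small demonstration using the string from the question; the variable names and the '#999#' needle are made up for illustration:

```php
<?php
$ids = '#222#,#333#,#555#';

$atStart = strpos($ids, '#222#'); // int(0): found at the very beginning
$later   = strpos($ids, '#333#'); // int(6): found further along
$missing = strpos($ids, '#999#'); // bool(false): not found

var_dump((bool) $atStart);    // bool(false): a bare if() treats this match as "not found"
var_dump($atStart !== false); // bool(true)
var_dump($missing !== false); // bool(false)
```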
It is working as expected. strpos() returns 0 because the string you're searching for is at the beginning of the string. You need to do a strict comparison against false:
Update your if() statement as follows:
if(strpos($ids, '#222') !== false)
{
// string was found!
}
Try with this :
$ids="#222#,#333#,#555#";
if(strpos($ids,"#222#") !== false)
{
echo "found";
}
You should use !== because the position of '#222#' is the 0th (first) character.

How do I check if a string contains a specific word?

Consider:
$a = 'How are you?';
if ($a contains 'are')
echo 'true';
Suppose I have the code above, what is the correct way to write the statement if ($a contains 'are')?
Now with PHP 8 you can do this using str_contains:
if (str_contains('How are you', 'are')) {
echo 'true';
}
RFC
Before PHP 8
You can use the strpos() function which is used to find the occurrence of one string inside another one:
$haystack = 'How are you?';
$needle = 'are';
if (strpos($haystack, $needle) !== false) {
echo 'true';
}
Note that the use of !== false is deliberate (neither != false nor === true will return the desired result); strpos() returns either the offset at which the needle string begins in the haystack string, or the boolean false if the needle isn't found. Since 0 is a valid offset and 0 is "falsey", we can't use simpler constructs like !strpos($a, 'are').
You could use regular expressions as it's better for word matching compared to strpos, as mentioned by other users. A strpos check for are will also return true for strings such as: fare, care, stare, etc. These unintended matches can simply be avoided in regular expression by using word boundaries.
A simple match for are could look something like this:
$a = 'How are you?';
if (preg_match('/\bare\b/', $a)) {
echo 'true';
}
On the performance side, strpos is about three times faster. When I did one million compares at once, it took preg_match 1.5 seconds to finish and for strpos it took 0.5 seconds.
Edit:
In order to search any part of the string, not just word by word, I would recommend using a regular expression like
$a = 'How are you?';
$search = 'are y';
if(preg_match("/{$search}/i", $a)) {
echo 'true';
}
The i at the end of regular expression changes regular expression to be case-insensitive, if you do not want that, you can leave it out.
Now, this can be quite problematic in some cases because the $search string isn't sanitized in any way. If $search comes from user input, it may contain characters that the regex engine interprets as pattern syntax rather than as literal text, so the check can fail or match something unintended. Passing $search through preg_quote() first avoids this.
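One way to address the unsanitized pattern is sketched below, assuming the goal is a literal, case-insensitive substring search. The function name containsLiteral is made up for this example:

```php
<?php
// Hypothetical helper: case-insensitive substring search via a regex,
// with the needle escaped so that it is matched literally.
function containsLiteral(string $haystack, string $search): bool
{
    // The second argument makes preg_quote() escape the "/" delimiter as well.
    $pattern = '/' . preg_quote($search, '/') . '/i';
    return preg_match($pattern, $haystack) === 1;
}

var_dump(containsLiteral('How are you?', 'are y')); // bool(true)
var_dump(containsLiteral('How are you?', 'you?'));  // bool(true): "?" is matched literally
var_dump(containsLiteral('How are you', 'you?'));   // bool(false): without escaping, "u?" would be optional
var_dump(containsLiteral('Price: 3x50', '3.50'));   // bool(false): "." no longer matches any character
```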
Also, here's a great tool for testing and seeing explanations of various regular expressions Regex101
To combine both sets of functionality into a single multi-purpose function (including with selectable case sensitivity), you could use something like this:
function FindString($needle,$haystack,$i,$word)
{ // $i should be "" or "i" for case insensitive
if (strtoupper($word)=="W")
{ // if $word is "W" then word search instead of string in string search.
if (preg_match("/\b{$needle}\b/{$i}", $haystack))
{
return true;
}
}
else
{
if(preg_match("/{$needle}/{$i}", $haystack))
{
return true;
}
}
return false;
// Put quotes around true and false above to return them as strings instead of as bools/ints.
}
One more thing to keep in mind is that \b may not work as expected in languages other than English.
The explanation for this and the solution is taken from here:
\b represents the beginning or end of a word (Word Boundary). This
regex would match apple in an apple pie, but wouldn’t match apple in
pineapple, applecarts or bakeapples.
How about “café”? How can we extract the word “café” in regex?
Actually, \bcafé\b wouldn’t work. Why? Because “café” contains
non-ASCII character: é. \b can’t be simply used with Unicode such as
समुद्र, 감사, месяц and 😉 .
When you want to extract Unicode characters, you should directly
define characters which represent word boundaries.
The answer: (?<=[\s,.:;"']|^)UNICODE_WORD(?=[\s,.:;"']|$)
So in order to use the answer in PHP, you can use this function:
function contains($str, $word) {
// Works in Hebrew and any other unicode characters
// Thanks https://medium.com/#shiba1014/regex-word-boundaries-with-unicode-207794f6e7ed
// Thanks https://www.phpliveregex.com/
// $word is inserted into the pattern as-is, so it must not contain regex metacharacters
return (bool) preg_match('/(?<=[\s,.:;"\']|^)' . $word . '(?=[\s,.:;"\']|$)/', $str);
}
And if you want to search for array of words, you can use this:
function arrayContainsWord($str, array $arr)
{
foreach ($arr as $word) {
// Works in Hebrew and any other unicode characters
// Thanks https://medium.com/#shiba1014/regex-word-boundaries-with-unicode-207794f6e7ed
// Thanks https://www.phpliveregex.com/
if (preg_match('/(?<=[\s,.:;"\']|^)' . $word . '(?=[\s,.:;"\']|$)/', $str)) return true;
}
return false;
}
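An alternative sketch, not taken from the article quoted above: instead of listing the separator characters, it uses PCRE's Unicode property classes with the u modifier and treats any letter, number, or underscore as part of a word. It assumes the input is valid UTF-8, and the function name is hypothetical:

```php
<?php
// Hypothetical helper: whole-word search that treats any Unicode letter,
// number, or underscore as part of a word. Assumes UTF-8 input.
function containsUnicodeWord(string $str, string $word): bool
{
    $quoted = preg_quote($word, '/');
    // The word must not be preceded or followed by a letter, number, or underscore.
    $pattern = '/(?<![\p{L}\p{N}_])' . $quoted . '(?![\p{L}\p{N}_])/u';
    return preg_match($pattern, $str) === 1;
}

// "\u{e9}" is the character é, written as an escape to avoid file-encoding issues.
var_dump(containsUnicodeWord("un caf\u{e9} noir", "caf\u{e9}"));       // bool(true)
var_dump(containsUnicodeWord("la caf\u{e9}t\u{e9}ria", "caf\u{e9}")); // bool(false)
var_dump(containsUnicodeWord("caf\u{e9}, merci", "caf\u{e9}"));       // bool(true)
```

The word is passed through preg_quote(), so it may safely contain characters such as "." or "+".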
As of PHP 8.0.0 you can now use str_contains
<?php
if (str_contains('abc', '')) {
echo "Checking the existence of the empty string will always return true";
}
Here is a little utility function that is useful in situations like this
// returns true if $needle is a substring of $haystack
function contains($needle, $haystack)
{
return strpos($haystack, $needle) !== false;
}
To determine whether a string contains another string you can use the PHP function strpos().
int strpos ( string $haystack , mixed $needle [, int $offset = 0 ] )
<?php
$haystack = 'how are you';
$needle = 'are';
if (strpos($haystack,$needle) !== false) {
echo "$haystack contains $needle";
}
?>
CAUTION:
If the needle you are searching for is at the beginning of the haystack, strpos() returns position 0. A loose == (or !=) comparison against false treats that 0 as "not found", so you need the strict === (or !==) comparison.
A == sign is a comparison that tests whether the variable / expression / constant on the left has the same value as the variable / expression / constant on the right, after type conversion.
A === sign is a comparison that tests whether two variables / expressions / constants are equal AND have the same type, i.e. both are strings or both are integers.
One of the advantages of using this approach is that every PHP version supports this function, unlike str_contains().
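If you would like to write str_contains() today and still run on older PHP versions, one possible approach is a small fallback definition based on strpos(). This is only a sketch, not an official polyfill:

```php
<?php
// Define str_contains() only when the running PHP version lacks it.
if (!function_exists('str_contains')) {
    function str_contains(string $haystack, string $needle): bool
    {
        // PHP 8 treats the empty needle as always present.
        return $needle === '' || strpos($haystack, $needle) !== false;
    }
}

var_dump(str_contains('how are you', 'are')); // bool(true)
var_dump(str_contains('how are you', 'how')); // bool(true), even at offset 0
var_dump(str_contains('how are you', 'why')); // bool(false)
```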
While most of these answers will tell you if a substring appears in your string, that's usually not what you want if you're looking for a particular word, and not a substring.
What's the difference? Substrings can appear within other words:
The "are" at the beginning of "area"
The "are" at the end of "hare"
The "are" in the middle of "fares"
One way to mitigate this would be to use a regular expression coupled with word boundaries (\b):
function containsWord($str, $word)
{
return !!preg_match('#\\b' . preg_quote($word, '#') . '\\b#i', $str);
}
This method doesn't have the same false positives noted above, but it does have some edge cases of its own. Word boundaries match on non-word characters (\W), which are going to be anything that isn't a-z, A-Z, 0-9, or _. That means digits and underscores are going to be counted as word characters and scenarios like this will fail:
The "are" in "What _are_ you thinking?"
The "are" in "lol u dunno wut those are4?"
If you want anything more accurate than this, you'll have to start doing English language syntax parsing, and that's a pretty big can of worms (and assumes proper use of syntax, anyway, which isn't always a given).
Look at strpos():
<?php
$mystring = 'abc';
$findme = 'a';
$pos = strpos($mystring, $findme);
// Note our use of ===. Simply, == would not work as expected
// because the position of 'a' was the 0th (first) character.
if ($pos === false) {
echo "The string '$findme' was not found in the string '$mystring'.";
} else {
echo "The string '$findme' was found in the string '$mystring',";
echo " and exists at position $pos.";
}
Using strstr() or stristr() if your search should be case insensitive would be another option.
Peer to SamGoody and Lego Stormtroopr comments.
If you are looking for a PHP algorithm to rank search results based on proximity/relevance of multiple words
here comes a quick and easy way of generating search results with PHP only:
Issues with the other boolean search methods such as strpos(), preg_match(), strstr() or stristr()
can't search for multiple words
results are unranked
PHP method based on Vector Space Model and tf-idf (term frequency–inverse document frequency):
It sounds difficult but is surprisingly easy.
If we want to search for multiple words in a string the core problem is how we assign a weight to each one of them?
If we could weight the terms in a string based on how representative they are of the string as a whole,
we could order our results by the ones that best match the query.
This is the idea of the vector space model, not far from how SQL full-text search works:
function get_corpus_index($corpus = array(), $separator=' ') {
$dictionary = array();
$doc_count = array();
foreach($corpus as $doc_id => $doc) {
$terms = explode($separator, $doc);
$doc_count[$doc_id] = count($terms);
// tf–idf, short for term frequency–inverse document frequency,
// according to wikipedia is a numerical statistic that is intended to reflect
// how important a word is to a document in a corpus
foreach($terms as $term) {
if(!isset($dictionary[$term])) {
$dictionary[$term] = array('document_frequency' => 0, 'postings' => array());
}
if(!isset($dictionary[$term]['postings'][$doc_id])) {
$dictionary[$term]['document_frequency']++;
$dictionary[$term]['postings'][$doc_id] = array('term_frequency' => 0);
}
$dictionary[$term]['postings'][$doc_id]['term_frequency']++;
}
//from http://phpir.com/simple-search-the-vector-space-model/
}
return array('doc_count' => $doc_count, 'dictionary' => $dictionary);
}
function get_similar_documents($query='', $corpus=array(), $separator=' '){
$similar_documents=array();
if($query!=''&&!empty($corpus)){
$words=explode($separator,$query);
$corpus=get_corpus_index($corpus, $separator);
$doc_count=count($corpus['doc_count']);
foreach($words as $word) {
if(isset($corpus['dictionary'][$word])){
$entry = $corpus['dictionary'][$word];
foreach($entry['postings'] as $doc_id => $posting) {
//get term frequency–inverse document frequency
$score=$posting['term_frequency'] * log($doc_count + 1 / $entry['document_frequency'] + 1, 2);
if(isset($similar_documents[$doc_id])){
$similar_documents[$doc_id]+=$score;
}
else{
$similar_documents[$doc_id]=$score;
}
}
}
}
// length normalise
foreach($similar_documents as $doc_id => $score) {
$similar_documents[$doc_id] = $score/$corpus['doc_count'][$doc_id];
}
// sort from high to low
arsort($similar_documents);
}
return $similar_documents;
}
CASE 1
$query = 'are';
$corpus = array(
1 => 'How are you?',
);
$match_results=get_similar_documents($query,$corpus);
echo '<pre>';
print_r($match_results);
echo '</pre>';
RESULT
Array
(
[1] => 0.52832083357372
)
CASE 2
$query = 'are';
$corpus = array(
1 => 'how are you today?',
2 => 'how do you do',
3 => 'here you are! how are you? Are we done yet?'
);
$match_results=get_similar_documents($query,$corpus);
echo '<pre>';
print_r($match_results);
echo '</pre>';
RESULTS
Array
(
[1] => 0.54248125036058
[3] => 0.21699250014423
)
CASE 3
$query = 'we are done';
$corpus = array(
1 => 'how are you today?',
2 => 'how do you do',
3 => 'here you are! how are you? Are we done yet?'
);
$match_results=get_similar_documents($query,$corpus);
echo '<pre>';
print_r($match_results);
echo '</pre>';
RESULTS
Array
(
[3] => 0.6813781191217
[1] => 0.54248125036058
)
There are plenty of improvements to be made
but the model provides a way of getting good results from natural queries,
which don't have boolean operators such as strpos(), preg_match(), strstr() or stristr().
NOTA BENE
Optionally eliminating redundancy prior to search the words
thereby reducing index size and resulting in less storage requirement
less disk I/O
faster indexing and a consequently faster search.
1. Normalisation
Convert all text to lower case
2. Stopword elimination
Eliminate words from the text which carry no real meaning (like 'and', 'or', 'the', 'for', etc.)
3. Dictionary substitution
Replace words with others which have an identical or similar meaning.
(ex:replace instances of 'hungrily' and 'hungry' with 'hunger')
Further algorithmic measures (snowball) may be performed to further reduce words to their essential meaning.
The replacement of colour names with their hexadecimal equivalents
The reduction of numeric values by reducing precision are other ways of normalising the text.
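The first two steps above (normalisation and stopword elimination) could be sketched as follows. The function name, the stopword list, and the ASCII-only splitting rule are assumptions chosen to keep the example short:

```php
<?php
// Hypothetical helper: lower-case the text, split it into terms, and drop stopwords.
function normalise_terms(string $text, array $stopwords): array
{
    $text = strtolower($text);
    // Split on anything that is not a lower-case ASCII letter or a digit.
    $terms = preg_split('/[^a-z0-9]+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    // array_diff() keeps the original keys, so reindex the result.
    return array_values(array_diff($terms, $stopwords));
}

$stopwords = ['and', 'or', 'the', 'for']; // illustrative list only

print_r(normalise_terms('The cat and the Hat!', $stopwords));
```

The resulting terms could be joined back together with implode(' ', ...) before being passed to get_corpus_index(), so that 'are', 'Are' and 'are!' count as the same term.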
RESOURCES
http://linuxgazette.net/164/sephton.html
http://snowball.tartarus.org/
MySQL Fulltext Search Score Explained
http://dev.mysql.com/doc/internals/en/full-text-search.html
http://en.wikipedia.org/wiki/Vector_space_model
http://en.wikipedia.org/wiki/Tf%E2%80%93idf
http://phpir.com/simple-search-the-vector-space-model/
Make use of case-insensitive matching using stripos():
if (stripos($string,$stringToSearch) !== false) {
echo 'true';
}
If you want to avoid the "falsey" and "truthy" problem, you can use substr_count:
if (substr_count($a, 'are') > 0) {
echo "at least one 'are' is present!";
}
It's a bit slower than strpos but it avoids the comparison problems.
if (preg_match('/(are)/', $a)) {
echo 'true';
}
Another option is to use the strstr() function. Something like:
if (strlen(strstr($haystack,$needle))>0) {
// Needle Found
}
Point to note: The strstr() function is case-sensitive. For a case-insensitive search, use the stristr() function.
I'm a bit impressed that none of the answers here that used strpos, strstr and similar functions mentioned Multibyte String Functions yet (2015-05-08).
Basically, if you're having trouble finding words with characters specific to some languages, such as German, French, Portuguese, Spanish, etc. (e.g.: ä, é, ô, ç, º, ñ), you may want to precede the functions with mb_. Therefore, the accepted answer would use mb_strpos or mb_stripos (for case-insensitive matching) instead:
if (mb_strpos($a,'are') !== false) {
echo 'true';
}
If you cannot guarantee that all your data is 100% in UTF-8, you may want to use the mb_ functions.
A good article to understand why is The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky.
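A short sketch of where the byte-based and multibyte functions differ. It assumes the mbstring extension is loaded and that the text is UTF-8; the sample strings are made up:

```php
<?php
// "\u{e9}" is the character é, which takes two bytes in UTF-8.
$text = "caf\u{e9} au lait";

$bytePos = strpos($text, 'au');                // int(6): offset counted in bytes
$charPos = mb_strpos($text, 'au', 0, 'UTF-8'); // int(5): offset counted in characters
var_dump($bytePos, $charPos);

// Case-insensitive matching of non-ASCII letters is where mb_stripos() helps:
// "\u{c9}" is É, the upper-case form of é.
var_dump(mb_stripos("CAF\u{c9}", "caf\u{e9}", 0, 'UTF-8') !== false); // bool(true)
```

For a plain "is it present?" check both strpos() and mb_strpos() agree; the difference shows up in the reported offsets and in case-insensitive comparisons.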
In PHP, the best way to verify if a string contains a certain substring, is to use a simple helper function like this:
function contains($haystack, $needle, $caseSensitive = false) {
return $caseSensitive ?
(strpos($haystack, $needle) === FALSE ? FALSE : TRUE):
(stripos($haystack, $needle) === FALSE ? FALSE : TRUE);
}
Explanation:
strpos finds the position of the first occurrence of a case-sensitive substring in a string.
stripos finds the position of the first occurrence of a case-insensitive substring in a string.
myFunction($haystack, $needle) === FALSE ? FALSE : TRUE ensures that myFunction always returns a boolean and fixes unexpected behavior when the index of the substring is 0.
$caseSensitive ? A : B selects either strpos or stripos to do the work, depending on the value of $caseSensitive.
Output:
var_dump(contains('bare','are')); // Outputs: bool(true)
var_dump(contains('stare', 'are')); // Outputs: bool(true)
var_dump(contains('stare', 'Are')); // Outputs: bool(true)
var_dump(contains('stare', 'Are', true)); // Outputs: bool(false)
var_dump(contains('hair', 'are')); // Outputs: bool(false)
var_dump(contains('aren\'t', 'are')); // Outputs: bool(true)
var_dump(contains('Aren\'t', 'are')); // Outputs: bool(true)
var_dump(contains('Aren\'t', 'are', true)); // Outputs: bool(false)
var_dump(contains('aren\'t', 'Are')); // Outputs: bool(true)
var_dump(contains('aren\'t', 'Are', true)); // Outputs: bool(false)
var_dump(contains('broad', 'are')); // Outputs: bool(false)
var_dump(contains('border', 'are')); // Outputs: bool(false)
You can use the strstr function:
$haystack = "I know programming";
$needle = "know";
$flag = strstr($haystack, $needle);
if ($flag){
echo "true";
}
Without using an inbuilt function:
$haystack = "hello world";
$needle = "llo";
$i = $j = 0;
while (isset($needle[$i])) {
while (isset($haystack[$j]) && ($needle[$i] != $haystack[$j])) {
$j++;
$i = 0;
}
if (!isset($haystack[$j])) {
break;
}
$i++;
$j++;
}
if (!isset($needle[$i])) {
echo "YES";
}
else{
echo "NO ";
}
The function below also works and does not depend on any other function; it uses only native PHP string manipulation. Personally, I do not recommend this, but you can see how it works:
<?php
if (!function_exists('is_str_contain')) {
function is_str_contain($string, $keyword)
{
if (empty($string) || empty($keyword)) return false;
$keyword_first_char = $keyword[0];
$keyword_length = strlen($keyword);
$string_length = strlen($string);
// case 1
if ($string_length < $keyword_length) return false;
// case 2
if ($string_length == $keyword_length) {
if ($string == $keyword) return true;
else return false;
}
// case 3
if ($keyword_length == 1) {
for ($i = 0; $i < $string_length; $i++) {
// Check if keyword's first char == string's first char
if ($keyword_first_char == $string[$i]) {
return true;
}
}
}
// case 4
if ($keyword_length > 1) {
for ($i = 0; $i < $string_length; $i++) {
/*
the remaining part of the string is equal or greater than the keyword
*/
if (($string_length + 1 - $i) >= $keyword_length) {
// Check if keyword's first char == string's first char
if ($keyword_first_char == $string[$i]) {
$match = 1;
for ($j = 1; $j < $keyword_length; $j++) {
if (($i + $j < $string_length) && $keyword[$j] == $string[$i + $j]) {
$match++;
}
else {
// mismatch at this starting position; try the next one instead of giving up
break;
}
}
if ($match == $keyword_length) {
return true;
}
// end if first match found
}
// end if remaining part
}
else {
return false;
}
// end for loop
}
// end case4
}
return false;
}
}
Test:
var_dump(is_str_contain("test", "t")); //true
var_dump(is_str_contain("test", "")); //false
var_dump(is_str_contain("test", "test")); //true
var_dump(is_str_contain("test", "testa")); //false
var_dump(is_str_contain("a----z", "a")); //true
var_dump(is_str_contain("a----z", "z")); //true
var_dump(is_str_contain("mystringss", "strings")); //true
Many answers that use substr_count check whether the result is >0. But since the if statement treats zero the same as false, you can skip that check and write directly:
if (substr_count($a, 'are')) {
To check if not present, add the ! operator:
if (!substr_count($a, 'are')) {
I had some trouble with this, and finally I chose to create my own solution. Without using regular expression engine:
function contains($text, $word)
{
$found = false;
$spaceArray = explode(' ', $text);
$nonBreakingSpaceArray = explode(chr(160), $text);
if (in_array($word, $spaceArray) ||
in_array($word, $nonBreakingSpaceArray)
) {
$found = true;
}
return $found;
}
You may notice that the previous solutions do not handle the case where the word is part of another word. To use your example:
$a = 'How are you?';
$b = "a skirt that flares from the waist";
$c = "are";
With the samples above, both $a and $b contain $c, but you may want your function to tell you that only $a contains $c.
Another option to finding the occurrence of a word from a string using strstr() and stristr() is like the following:
<?php
$a = 'How are you?';
if (strstr($a,'are')) // Case sensitive
echo 'true';
if (stristr($a,'are')) // Case insensitive
echo 'true';
?>
It can be done in three different ways:
$a = 'How are you?';
1- stristr()
if (strlen(stristr($a,"are"))>0) {
echo "true"; // are Found
}
2- strpos()
if (strpos($a, "are") !== false) {
echo "true"; // are Found
}
3- preg_match()
if( preg_match("/are/",$a) === 1) {
echo "true"; // are Found
}
The short-hand version
$result = false!==strpos($a, 'are');
Do not use preg_match() if you only want to check if one string is contained in another string. Use strpos() or strstr() instead as they will be faster. (http://in2.php.net/preg_match)
if (strpos($text, 'string_name') !== false){
echo 'get the string';
}
In order to find a 'word', rather than the occurrence of a series of letters that could in fact be a part of another word, the following would be a good solution.
$string = 'How are you?';
$array = explode(" ", $string);
if (in_array('are', $array) ) {
echo 'Found the word';
}
You should use a case-insensitive function, so that it won't matter whether the entered value is in lower case or upper case.
<?php
$grass = "This is pratik joshi";
$needle = "pratik";
if (stripos($grass,$needle) !== false) {
/*If i EXCLUDE : !== false then if string is found at 0th location,
still it will say STRING NOT FOUND as it will return '0' and it
will goto else and will say NOT Found though it is found at 0th location.*/
echo 'Contains word';
}else{
echo "does NOT contain word";
}
?>
Here stripos finds the needle in the haystack without considering case (lower/upper).
PHPCode Sample with output
Maybe you could use something like this:
<?php
findWord('Test all OK');
function findWord($text) {
if (strstr($text, 'ok')) {
echo 'Found a word';
}
else
{
echo 'Did not find a word';
}
}
?>
If you want to check if the string contains several specific words, you can do:
$badWords = array("dette", "capitale", "rembourser", "ivoire", "mandat");
$string = "a string with the word ivoire";
$matchFound = preg_match_all("/\b(" . implode("|", $badWords) . ")\b/i", $string, $matches);
if ($matchFound) {
echo "a bad word has been found";
}
else {
echo "your string is okay";
}
This is useful to avoid spam when sending emails for example.
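A variation on the same idea, sketched under the assumption that the word list may contain characters with a special meaning in regular expressions. The helper name findListedWords is hypothetical; it escapes each word with preg_quote() and returns the distinct words that were found:

```php
<?php
// Hypothetical helper: returns the distinct listed words found in $text
// as whole words, compared case-insensitively.
function findListedWords(string $text, array $words): array
{
    if ($words === []) {
        return [];
    }
    // Escape each word so that characters like "." or "+" are matched literally.
    $quoted = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/i';
    preg_match_all($pattern, $text, $matches);
    // Lower-case the hits so that "Bad" and "bad" count as the same word.
    return array_values(array_unique(array_map('strtolower', $matches[0])));
}

print_r(findListedWords('Bad text, naughty text, bad again', ['bad', 'naughty']));
var_dump(findListedWords('I play badminton', ['bad'])); // empty array: "bad" is only part of a word
```

Returning the matched words rather than a plain boolean makes it easy to report which word triggered the check; an empty array means none was found.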
The strpos function works fine, but if you want to do case-insensitive checking for a word in a paragraph then you can make use of the stripos function of PHP.
For example,
$result = stripos("I love PHP, I love PHP too!", "php");
if ($result === false) {
// Word does not exist
}
else {
// Word exists
}
Find the position of the first occurrence of a case-insensitive substring in a string.
If the word doesn't exist in the string then it will return false else it will return the position of the word.
A string can be checked with the below function:
function either_String_existor_not($str, $character) {
return strpos($str, $character) !== false;
}
You need to use identical/not identical operators because strpos can return 0 as its index value. If you like ternary operators, consider using the following (seems a little backwards I'll admit):
echo FALSE === strpos($a,'are') ? 'false': 'true';
Check if string contains specific words?
This means the string has to be resolved into words (see note below).
One way to do this and to specify the separators is using preg_split (doc):
<?php
function contains_word($str, $word) {
// split string into words
// separators are substrings of at least one non-word character
$arr = preg_split('/\W+/', $str, -1, PREG_SPLIT_NO_EMPTY);
// now the words can be examined each
foreach ($arr as $value) {
if ($value === $word) {
return true;
}
}
return false;
}
function test($str, $word) {
if (contains_word($str, $word)) {
echo "string '" . $str . "' contains word '" . $word . "'\n";
} else {
echo "string '" . $str . "' does not contain word '" . $word . "'\n" ;
}
}
$a = 'How are you?';
test($a, 'are');
test($a, 'ar');
test($a, 'hare');
?>
A run gives
$ php -f test.php
string 'How are you?' contains word 'are'
string 'How are you?' does not contain word 'ar'
string 'How are you?' does not contain word 'hare'
Note: Here "word" does not mean every arbitrary sequence of symbols.
A practical definition of a word is the one used by the PCRE regular expression engine, where words are substrings consisting of word characters only, separated by non-word characters.
A "word" character is any letter or digit or the underscore character,
that is, any character which can be part of a Perl " word ". The
definition of letters and digits is controlled by PCRE's character
tables, and may vary if locale-specific matching is taking place (..)
