How to search for an array of strings in another string in PHP?

First, note that what I need is the reverse of PHP's in_array function.
I need to search a string for every item of an array: if any of them is found, the function returns true; otherwise it returns false.
I need the fastest solution to this problem. Of course, this can be done by iterating over the array and using the strpos function.
Any suggestions are welcome.
Example Data:
$string = 'Alice goes to school every day';
$searchWords = array('basket','school','tree');
returns true
$string = 'Alice goes to school every day';
$searchWords = array('basket','cat','tree');
returns false
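The baseline the question mentions (iterate over the array and call strpos) can be sketched as follows. The name containsAny is hypothetical, and the comparison against false is explicit because a match at offset 0 would otherwise be treated as "not found".

```php
<?php
// Hypothetical helper: true as soon as any needle occurs in $haystack.
function containsAny(string $haystack, array $needles): bool
{
    foreach ($needles as $needle) {
        // Skip empty needles; compare with false explicitly because offset 0 is falsy.
        if ($needle !== '' && strpos($haystack, $needle) !== false) {
            return true;
        }
    }
    return false;
}

var_dump(containsAny('Alice goes to school every day', ['basket', 'school', 'tree'])); // bool(true)
var_dump(containsAny('Alice goes to school every day', ['basket', 'cat', 'tree']));    // bool(false)
```

This matches substrings, not whole words, so 'cat' would also be found inside 'education'.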

You should try preg_match:
if (preg_match('/' . implode('|', $searchWords) . '/', $string)) return true;
After some comments, here is a properly escaped solution:
function contains($string, Array $search, $caseInsensitive = false) {
// quote each term, including the '/' delimiter, so it is matched literally
$quoted = array_map(function ($term) { return preg_quote($term, '/'); }, $search);
$exp = '/' . implode('|', $quoted) . ($caseInsensitive ? '/i' : '/');
return preg_match($exp, $string) ? true : false;
}

function searchWords($string,$words)
{
foreach($words as $word)
{
if(stristr(" " . $string . " ", " " . $word . " ") !== false) //spaces either side to force a word; the string is padded so its first and last words can match
{
return true;
}
}
return false;
}
Usage:
$string = 'Alice goes to school every day';
$searchWords = array('basket','cat','tree');
if(searchWords($string,$searchWords))
{
//matches
}
Also note that stristr is used so that the match is not case-sensitive.
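Padding with spaces only recognises words delimited by spaces. A possible alternative, sketched here under the assumption that the search words consist of letters, digits or underscores, is the regex word boundary \b. The name matchesWholeWord is made up for this example.

```php
<?php
// Hypothetical helper: whole-word, case-insensitive match of any search word.
function matchesWholeWord(string $string, array $words): bool
{
    if ($words === []) {
        return false; // an empty alternation would match everything
    }
    // Quote each word (and the '/' delimiter) so it is matched literally.
    $quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $words);
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/i';
    return preg_match($pattern, $string) === 1;
}

var_dump(matchesWholeWord('Alice goes to school every day', ['ALICE', 'tree'])); // bool(true)
var_dump(matchesWholeWord('Alice goes to school every day', ['scho']));          // bool(false)
```

Unlike the space-padded stristr call, this also matches a word followed by punctuation, such as "school," or "day.".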

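Same as malko's example, but with the values properly escaped.
function contains( $string, array $search ) {
// preg_quote() takes a single string, so map it over the array
$quoted = array_map( function ( $term ) { return preg_quote( $term, '/' ); }, $search );
return 1 === preg_match( '/' . implode( '|', $quoted ) . '/', $string );
}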

If the string can be split on spaces, the following will work:
var_dump(array_intersect(explode(' ', $str), $searchWords) != null);
OUTPUT: for 2 examples you've provided:
bool(true)
bool(false)
Update:
If the string cannot be split on the space character, use code like this to split it at every word boundary:
var_dump(array_intersect(preg_split('~\b~', $str), $searchWords) != null);
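A related sketch, swapping array_intersect for a keyed lookup: split the string into tokens once, then test each token with isset() against the flipped search array, returning on the first hit. The name hasAnyToken is hypothetical, and the \W+ split assumes ASCII words and a case-sensitive comparison.

```php
<?php
// Hypothetical helper: tokenise once, then test each token against a lookup table.
function hasAnyToken(string $str, array $searchWords): bool
{
    $lookup = array_flip($searchWords); // word => index, so isset() can be used
    $tokens = preg_split('/\W+/', $str, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($tokens as $token) {
        if (isset($lookup[$token])) {
            return true; // stop at the first matching word
        }
    }
    return false;
}

var_dump(hasAnyToken('Alice goes to school every day', ['basket', 'school', 'tree'])); // bool(true)
var_dump(hasAnyToken('Alice goes to school every day', ['basket', 'cat', 'tree']));    // bool(false)
```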

There is always debate over what is faster so I thought I'd run some tests using different methods.
Tests Run:
strpos
preg_match with foreach loop
preg_match with regex or
indexed search with string to explode
indexed search as array (string already exploded)
Two sets of tests were run: one on a large text document (114,350 words) and one on a small text document (120 words). Within each set, all tests were run 100 times and then an average was taken. Tests did not ignore case, which doing so would have made them all faster. For the tests that searched an index, the text was indexed beforehand. I wrote the indexing code myself, and I'm sure it is less efficient than it could be; indexing took 17.92 seconds for the large file and 0.001 seconds for the small file.
Terms searched for included: gazerbeam (NOT found in the document), legally (found in the document), and target (NOT found in the document).
Results in seconds to complete a single test, sorted by speed:
Large File:
0.0000455808639526 (index without explode)
0.0009979915618897 (preg_match using regex or)
0.0011657214164734 (strpos)
0.0023632574081421 (preg_match using foreach loop)
0.0051533532142639 (index with explode)
Small File
0.000003724098205566 (strpos)
0.000005958080291748 (preg_match using regex or)
0.000012607574462891 (preg_match using foreach loop)
0.000021204948425293 (index without explode)
0.000060625076293945 (index with explode)
Notice that strpos is faster than preg_match (using regex or) for small files, but slower for large files. Other factors, such as the number of search terms will of course affect this.
Algorithms Used:
//strpos
$str = file_get_contents('text.txt');
$t = microtime(true);
foreach ($search as $word) if (strpos($str, $word) !== false) break; // a match at offset 0 also counts
$strpos += microtime(true) - $t;
//preg_match
$str = file_get_contents('text.txt');
$t = microtime(true);
foreach ($search as $word) if (preg_match('/' . preg_quote($word) . '/', $str)) break;
$pregmatch += microtime(true) - $t;
//preg_match (regex or)
$str = file_get_contents('text.txt');
$orstr = implode('|', array_map('preg_quote', $search)); // quote each term, not the '|' separators
$t = microtime(true);
if (preg_match('/' . $orstr . '/', $str)) {};
$pregmatchor += microtime(true) - $t;
//index with explode
$str = file_get_contents('textindex.txt');
$t = microtime(true);
$ar = explode(" ", $str);
foreach ($search as $word) {
$start = 0;
$end = count($ar);
do {
$diff = $end - $start;
$pos = floor($diff / 2) + $start;
$temp = $ar[$pos];
if ($word < $temp) {
$end = $pos;
} elseif ($word > $temp) {
$start = $pos + 1;
} elseif ($temp == $word) {
$found = 'true';
break;
}
} while ($diff > 0);
}
$indexwith += microtime(true) - $t;
//index without explode (already in array)
$str = file_get_contents('textindex.txt');
$found = 'false';
$ar = explode(" ", $str);
$t = microtime(true);
foreach ($search as $word) {
$start = 0;
$end = count($ar);
do {
$diff = $end - $start;
$pos = floor($diff / 2) + $start;
$temp = $ar[$pos];
if ($word < $temp) {
$end = $pos;
} elseif ($word > $temp) {
$start = $pos + 1;
} elseif ($temp == $word) {
$found = 'true';
break;
}
} while ($diff > 0);
}
$indexwithout += microtime(true) - $t;
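The timing pattern repeated in the snippets above (take microtime before and after, accumulate, then average) could be wrapped in a small helper. This is only a sketch: averageSeconds and the sample text are made up for illustration, and the figures it prints depend entirely on the machine, so none are shown.

```php
<?php
// Hypothetical harness: average wall-clock seconds per call over $runs calls.
function averageSeconds(callable $fn, int $runs = 100): float
{
    $total = 0.0;
    for ($i = 0; $i < $runs; $i++) {
        $t = microtime(true);
        $fn();
        $total += microtime(true) - $t;
    }
    return $total / $runs;
}

// Made-up sample text; the original tests read text.txt instead.
$str = str_repeat('lorem ipsum dolor sit amet ', 1000) . 'legally';
$search = ['gazerbeam', 'legally', 'target'];

$viaStrpos = averageSeconds(function () use ($str, $search) {
    foreach ($search as $word) {
        if (strpos($str, $word) !== false) break;
    }
});

$quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $search);
$pattern = '/' . implode('|', $quoted) . '/';
$viaRegex = averageSeconds(function () use ($str, $pattern) {
    preg_match($pattern, $str);
});

printf("strpos: %.10f s, preg_match (regex or): %.10f s\n", $viaStrpos, $viaRegex);
```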

Try this:
$string = 'Alice goes to school every day';
$words = explode(" ", $string); // split() was removed in PHP 7
$searchWords = array('basket','school','tree');
for($x = 0,$l = count($words); $x < $l;) {
if(in_array($words[$x++], $searchWords)) {
//....
}
}

The function below returns how many words of the string are found in the array; when $matches is false it stops at the first match and returns 1.
function inString($str, $arr, $matches=false)
{
$str = explode(" ", $str);
$c = 0;
for($i = 0; $i<count($str); $i++)
{
if(in_array($str[$i], $arr) )
{$c++;if($matches == false)break;}
}
return $c;
}

The linked answer below should help; you just need to customize it as required.
Check if array element exists in string
Customized:
function result_arrayInString($prdterms, $value){
// arrayInString() is the function from the linked answer
return arrayInString($prdterms, $value) ? true : false;
}
// e.g. result_arrayInString($prdterms, 208)
This may be helpful to you.

Related

Need help escaping html special character in regex and php [duplicate]

I need a function that returns the substring between two words (or two characters).
I'm wondering whether there is a php function that achieves that. I do not want to think about regex (well, I could do one but really don't think it's the best way to go). Thinking of strpos and substr functions.
Here's an example:
$string = "foo I wanna a cake foo";
We call the function: $substring = getInnerSubstring($string,"foo");
It returns: " I wanna a cake ".
Update:
So far I can only get one substring between two words in a single string. May I go a bit further and ask whether getInnerSubstring($str,$delim) can be extended to return every string found between the delim value? For example:
$string =" foo I like php foo, but foo I also like asp foo, foo I feel hero foo";
I would get an array like {"I like php", "I also like asp", "I feel hero"}.
If the strings are different (ie: [foo] & [/foo]), take a look at this post from Justin Cook.
I copy his code below:
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$fullstring = 'this is my [tag]dog[/tag]';
$parsed = get_string_between($fullstring, '[tag]', '[/tag]');
echo $parsed; // (result = dog)
Regular expressions are the way to go:
$str = 'before-str-after';
if (preg_match('/before-(.*?)-after/', $str, $match) == 1) {
echo $match[1];
}
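For the updated question (every substring between the delimiters), the same idea extends to preg_match_all. The following is a sketch, not a definitive implementation: allBetween is a hypothetical name, the delimiters are quoted with preg_quote so that characters like [ and / are matched literally, and the s modifier lets a match span several lines.

```php
<?php
// Hypothetical helper: every non-overlapping substring between $start and $end.
function allBetween(string $string, string $start, string $end): array
{
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';
    preg_match_all($pattern, $string, $matches);
    return $matches[1]; // contents of the first capture group, one entry per match
}

$string = ' foo I like php foo, but foo I also like asp foo, foo I feel hero foo';
print_r(array_map('trim', allBetween($string, 'foo', 'foo')));
// Array ( [0] => I like php [1] => I also like asp [2] => I feel hero )

print_r(allBetween('this is my [tag]dog[/tag]', '[tag]', '[/tag]'));
// Array ( [0] => dog )
```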
function getBetween($string, $start = "", $end = ""){
if (strpos($string, $start) !== false) { // required if $start does not exist in $string; a match at position 0 is valid
$startCharCount = strpos($string, $start) + strlen($start);
$firstSubStr = substr($string, $startCharCount, strlen($string));
$endCharCount = strpos($firstSubStr, $end);
if ($end === "" || $endCharCount === false) { // no end marker: take the rest of the string
$endCharCount = strlen($firstSubStr);
}
return substr($firstSubStr, 0, $endCharCount);
} else {
return '';
}
}
Sample use:
echo getBetween("abc","a","c"); // returns: 'b'
echo getBetween("hello","h","o"); // returns: 'ell'
echo getBetween("World","a","r"); // returns: ''
Use the strstr PHP function twice.
$value = "This is a great day to be alive";
$value = strstr($value, "is"); //gets all text from the first needle on (here the "is" inside "This")
$value = strstr($value, "be", true); //gets all text before needle
echo $value;
Outputs:
"is is a great day to "
function getInnerSubstring($string,$delim){
// "foo a foo" becomes: array(""," a ","")
$string = explode($delim, $string, 3); // also, we only need 2 items at most
// we check whether the 2nd is set and return it, otherwise we return an empty string
return isset($string[1]) ? $string[1] : '';
}
Example of use:
var_dump(getInnerSubstring('foo Hello world foo','foo'));
// prints: string(13) " Hello world "
If you want to remove surrounding whitespace, use trim. Example:
var_dump(trim(getInnerSubstring('foo Hello world foo','foo')));
// prints: string(11) "Hello world"
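The explode idea can also cover the "all substrings" case from the updated question. A minimal sketch, assuming a non-empty delimiter and that delimiters pair up in order (first opens, second closes, third opens, and so on); innerSubstrings is a hypothetical name.

```php
<?php
// Hypothetical helper: all pieces enclosed by a pair of identical delimiters.
function innerSubstrings(string $string, string $delim): array
{
    $pieces = explode($delim, $string);
    $last = count($pieces) - 1;
    $result = [];
    // Odd-numbered pieces sit between an opening and a closing delimiter.
    // The final piece has no delimiter after it, so it is never taken.
    for ($i = 1; $i < $last; $i += 2) {
        $result[] = $pieces[$i];
    }
    return $result;
}

var_dump(innerSubstrings('foo Hello world foo', 'foo'));
// array(1) { [0]=> string(13) " Hello world " }
```

With an odd number of delimiters the trailing, unclosed piece is dropped: innerSubstrings('foo a foo b foo c', 'foo') yields only ' a '.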
function getInbetweenStrings($start, $end, $str){
$matches = array();
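$regex = "/" . preg_quote($start, "/") . "([a-zA-Z0-9_]*)" . preg_quote($end, "/") . "/"; // quote the delimiters so they are matched literally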
preg_match_all($regex, $str, $matches);
return $matches[1];
}
For example, say you want the array of strings (keys) between ## in the following
example, where '/' doesn't fall in between:
$str = "C://##ad_custom_attr1##/##upn##/##samaccountname##";
$str_arr = getInbetweenStrings('##', '##', $str);
print_r($str_arr);
I like the regular expression solutions but none of the others suit me.
If you know there is only gonna be 1 result you can use the following:
$between = preg_replace('/(.*)BEFORE(.*)AFTER(.*)/s', '\2', $string);
Change BEFORE and AFTER to the desired delimiters.
Also keep in mind this function will return the whole string in case nothing matched.
This solution is multiline but you can play with the modifiers depending on your needs.
I'm not a PHP pro, but I recently ran into this wall too, and this is what I came up with.
function tag_contents($string, $tag_open, $tag_close){
$result = array(); // so an array is returned even when nothing matches
foreach (explode($tag_open, $string) as $key => $value) {
if(strpos($value, $tag_close) !== FALSE){
$result[] = substr($value, 0, strpos($value, $tag_close));
}
}
return $result;
}
$string = "i love cute animals, like [animal]cat[/animal],
[animal]dog[/animal] and [animal]panda[/animal]!!!";
echo "<pre>";
print_r(tag_contents($string , "[animal]" , "[/animal]"));
echo "</pre>";
//result
Array
(
[0] => cat
[1] => dog
[2] => panda
)
A vast majority of answers here don't answer the edited part, I guess they were added before. It can be done with regex, as one answer mentions. I had a different approach.
This function searches $string and finds the first string between $start and $end strings, starting at $offset position. It then updates the $offset position to point to the start of the result. If $includeDelimiters is true, it includes the delimiters in the result.
If the $start or $end string are not found, it returns null. It also returns null if $string, $start, or $end are an empty string.
function str_between(string $string, string $start, string $end, bool $includeDelimiters = false, int &$offset = 0): ?string
{
if ($string === '' || $start === '' || $end === '') return null;
$startLength = strlen($start);
$endLength = strlen($end);
$startPos = strpos($string, $start, $offset);
if ($startPos === false) return null;
$endPos = strpos($string, $end, $startPos + $startLength);
if ($endPos === false) return null;
$length = $endPos - $startPos + ($includeDelimiters ? $endLength : -$startLength);
$offset = $startPos + ($includeDelimiters ? 0 : $startLength); // update $offset even when the result is empty
if (!$length) return '';
$result = substr($string, $offset, $length);
return ($result !== false ? $result : null);
}
The following function finds all strings that are between two strings (no overlaps). It requires the previous function, and the arguments are the same. After execution, $offset points just past the last match (it is left unchanged if nothing was found).
function str_between_all(string $string, string $start, string $end, bool $includeDelimiters = false, int &$offset = 0): ?array
{
$strings = [];
$length = strlen($string);
while ($offset < $length)
{
$found = str_between($string, $start, $end, $includeDelimiters, $offset);
if ($found === null) break;
$strings[] = $found;
$offset += strlen($found) + ($includeDelimiters ? 0 : strlen($end)); // move offset past the end delimiter of the newfound string
}
return $strings;
}
Examples:
str_between_all('foo 1 bar 2 foo 3 bar', 'foo', 'bar') gives [' 1 ', ' 3 '].
str_between_all('foo 1 bar 2', 'foo', 'bar') gives [' 1 '].
str_between_all('foo 1 foo 2 foo 3 foo', 'foo', 'foo') gives [' 1 ', ' 3 '].
str_between_all('foo 1 bar', 'foo', 'foo') gives [].
If you're using foo as a delimiter, then look at explode()
<?php
function getBetween($content,$start,$end){
$r = explode($start, $content);
if (isset($r[1])){
$r = explode($end, $r[1]);
return $r[0];
}
return '';
}
?>
Example:
<?php
$content = "Try to find the guy in the middle with this function!";
$start = "Try to find ";
$end = " with this function!";
$output = getBetween($content,$start,$end);
echo $output;
?>
This will return "the guy in the middle".
Simple, short, and sweet. It's up to you to make any enhancements.
function getStringBetween($str, $start, $end)
{
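$pos1 = strpos($str, $start) + strlen($start); // first character after $start, whatever its length
$pos2 = strpos($str, $end, $pos1); // look for $end only after $start
return substr($str, $pos1, $pos2 - $pos1);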
}
If a single string contains multiple occurrences and the start and end patterns differ (for example [start] and [/end]),
here is a function that returns an array.
function get_string_between($string, $start, $end){
$return = array();
// every piece except the last one was terminated by $end
$pieces = array_slice(explode($end, $string), 0, -1);
foreach($pieces as $data) {
$str_pos = strpos($data, $start);
if ($str_pos === false) continue; // no $start in this piece
$return[] = substr($data, $str_pos + strlen($start));
}
return $return;
}
Here's a function
function getInnerSubstring($string, $boundstring, $trimit=false) {
$res = false;
$bstart = strpos($string, $boundstring);
if ($bstart !== false) {
$bend = strrpos($string, $boundstring);
if ($bend > $bstart)
$res = substr($string, $bstart+strlen($boundstring), $bend-$bstart-strlen($boundstring));
}
return ($trimit && $res !== false) ? trim($res) : $res; // keep false as false instead of trimming it to ''
}
Use it like
$string = "foo I wanna a cake foo";
$substring = getInnerSubstring($string, "foo");
echo $substring;
Output (note that it returns the spaces at the front and at the end of your string, if any)
I wanna a cake
If you want to trim result use function like
$substring = getInnerSubstring($string, "foo", true);
Result: This function will return false if $boundstring was not found in $string or if $boundstring exists only once in $string, otherwise it returns substring between first and last occurrence of $boundstring in $string.
References
strpos()
strrpos()
substr()
trim()
Improvement of Alejandro's answer. You can leave the $start or $end arguments empty and it will use the start or end of the string.
echo get_string_between("Hello my name is bob", "my", ""); //output: " name is bob"
private function get_string_between($string, $start, $end){ // Get
if($start != ''){ //If $start is empty, use start of the string
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
}
else{
$ini = 0;
}
if ($end == '') { //If $end is blank, use end of string
return substr($string, $ini);
}
else{
$len = strpos($string, $end, $ini) - $ini; //Work out length of string
return substr($string, $ini, $len);
}
}
private function getStringBetween(string $from, string $to, string $haystack): string
{
$fromPosition = strpos($haystack, $from) + strlen($from);
$toPosition = strpos($haystack, $to, $fromPosition);
$betweenLength = $toPosition - $fromPosition;
return substr($haystack, $fromPosition, $betweenLength);
}
Use:
<?php
$str = "...server daemon started with pid=6849 (parent=6848).";
$from = "pid=";
$to = "(";
echo getStringBetween($str,$from,$to);
function getStringBetween($str,$from,$to)
{
$sub = substr($str, strpos($str,$from)+strlen($from),strlen($str));
return substr($sub,0,strpos($sub,$to));
}
?>
A bit improved code from GarciaWebDev and Henry Wang. If empty $start or $end is given, function returns values from the beginning or to the end of the $string. Also Inclusive option is available, whether we want to include search result or not:
function get_string_between ($string, $start, $end, $inclusive = false){
$string = " ".$string;
if ($start == "") { $ini = 0; }
else { $ini = strpos($string, $start); }
if ($end == "") { $len = strlen($string); }
else { $len = strpos($string, $end, $ini) - $ini;}
if (!$inclusive) { $ini += strlen($start); }
else { $len += strlen($end); }
return substr($string, $ini, $len);
}
I have to add something to Julius Tilvikas's post. I was looking for a solution like the one he described, but I think there is a mistake: I don't really get the string between the two strings. I get more than that, because the length of the start string has to be subtracted. With that change, I really do get the string between the two strings.
Here are my changes of his solution:
function get_string_between ($string, $start, $end, $inclusive = false){
$string = " ".$string;
if ($start == "") { $ini = 0; }
else { $ini = strpos($string, $start); }
if ($end == "") { $len = strlen($string); }
else { $len = strpos($string, $end, $ini) - $ini - strlen($start);}
if (!$inclusive) { $ini += strlen($start); }
else { $len += strlen($end); }
return substr($string, $ini, $len);
}
Greetz
V
Try this; it works for me. It gets the data between two occurrences of the word test.
$str = "Xdata test HD01 test 1data";
$result = explode('test',$str);
print_r($result);
echo $result[1];
In PHP's strpos style this will return false if the start mark sm or the end mark em are not found.
This result (false) is different from an empty string that is what you get if there is nothing between the start and end marks.
function between( $str, $sm, $em )
{
$s = strpos( $str, $sm );
if( $s === false ) return false;
$s += strlen( $sm );
$e = strpos( $str, $em, $s );
if( $e === false ) return false;
return substr( $str, $s, $e - $s );
}
The function will return only the first match.
It's obvious but worth mentioning that the function will first look for sm and then for em.
This implies you may not get the desired result/behaviour if em has to be searched first and then the string have to be parsed backward in search of sm.
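A short usage sketch of the distinction described above. The function body is repeated from the answer so that the block runs on its own; the sample strings are made up. The point is that the caller has to compare with === false, because an empty string is also falsy.

```php
<?php
function between($str, $sm, $em)
{
    $s = strpos($str, $sm);
    if ($s === false) return false;
    $s += strlen($sm);
    $e = strpos($str, $em, $s);
    if ($e === false) return false;
    return substr($str, $s, $e - $s);
}

var_dump(between('[tag][/tag]', '[tag]', '[/tag]'));   // string(0) ""
var_dump(between('no marks here', '[tag]', '[/tag]')); // bool(false)

$value = between('this is my [tag]dog[/tag]', '[tag]', '[/tag]');
if ($value === false) {
    echo "marks not found\n";
} else {
    echo $value, "\n"; // dog
}
```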
This is the function I'm using for this. I combined two answers in one function for single or multiple delimiters.
function getStringBetweenDelimiters($p_string, $p_from, $p_to, $p_multiple=false){
//checking for valid main string
if (strlen($p_string) > 0) {
//checking for multiple strings
if ($p_multiple) {
$result = array();
// getting list of results by end delimiter
$result_list = explode($p_to, $p_string);
//looping through result list array
foreach ( $result_list AS $rlkey => $rlrow) {
// getting result start position
$result_start_pos = strpos($rlrow, $p_from);
// return only valid rows (the start delimiter may sit at position 0)
if ($result_start_pos !== false) {
// cleaning result string + removing $p_from text from result
$result[] = substr($rlrow, $result_start_pos + strlen($p_from));
}// end if
} // end foreach
// if single string
} else {
// result start point + removing $p_from text from result
$result_start_pos = strpos($p_string, $p_from) + strlen($p_from);
// length of result string
$result_length = strpos($p_string, $p_to, $result_start_pos) - $result_start_pos;
// cleaning result string
$result = substr($p_string, $result_start_pos, $result_length);
} // end if else
// if empty main string
} else {
$result = false;
} // end if else
return $result;
} // end func. get string between
For simple use (returns " two ", including the surrounding spaces):
$result = getStringBetweenDelimiters(" one two three ", 'one', 'three');
For getting each row in a table to result array :
$result = getStringBetweenDelimiters($table, '<tr>', '</tr>', true);
An edited version of what Alejandro García Iglesias posted.
This allows you to pick a specific location of the string you want to get based on the number of times the result is found.
function get_string_between_pos($string, $start, $end, $pos){
$cPos = 0;
$ini = 0;
$result = '';
for($i = 0; $i < $pos; $i++){
$ini = strpos($string, $start, $cPos);
if ($ini === false) return ''; // a start marker at position 0 is still a match
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
$result = substr($string, $ini, $len);
$cPos = $ini + $len;
}
return $result;
}
usage:
$text = 'string has start test 1 end and start test 2 end and start test 3 end to print';
//get $result = "test 1"
$result = $this->get_string_between_pos($text, 'start', 'end', 1);
//get $result = "test 2"
$result = $this->get_string_between_pos($text, 'start', 'end', 2);
//get $result = "test 3"
$result = $this->get_string_between_pos($text, 'start', 'end', 3);
strpos has an optional third argument that starts the search at a specific offset, so I store the previous position in $cPos; on the next iteration of the for loop, the search resumes where it left off.
Easy solution using substr:
$posStart = stripos($string, $start) + strlen($start);
$length = stripos($string, $end, $posStart) - $posStart; // search for $end only after $start
$substring = substr($string, $posStart, $length);
Use:
function getdatabetween($string, $start, $end){
$sp = strpos($string, $start)+strlen($start);
$ep = strpos($string, $end, $sp) - $sp; // length of the text between the two markers
$data = trim(substr($string, $sp, $ep));
return trim($data);
}
$dt = "Find string between two strings in PHP";
echo getdatabetween($dt, 'Find', 'in PHP');
I had some problems with the get_string_between() function, used here. So I came with my own version. Maybe it could help people in the same case as mine.
protected function string_between($string, $start, $end, $inclusive = false) {
$fragments = explode($start, $string, 2);
if (isset($fragments[1])) {
$fragments = explode($end, $fragments[1], 2);
if ($inclusive) {
return $start.$fragments[0].$end;
} else {
return $fragments[0];
}
}
return false;
}
wrote these some time back, found it very useful for a wide range of applications.
<?php
// substr_getbykeys() - Returns everything in a source string that exists between the first occurrence of each of the two key substrings
// - only returns first match, and can be used in loops to iterate through large datasets
// - arg 1 is the first substring to look for
// - arg 2 is the second substring to look for
// - arg 3 is the source string the search is performed on.
// - arg 4 is boolean and allows you to determine if returned result should include the search keys.
// - arg 5 is boolean and can be used to determine whether search should be case-sensitive or not.
//
function substr_getbykeys($key1, $key2, $source, $returnkeys, $casematters) {
if ($casematters === true) {
$start = strpos($source, $key1);
$end = strpos($source, $key2);
} else {
$start = stripos($source, $key1);
$end = stripos($source, $key2);
}
if ($start === false || $end === false) { return false; }
if ($start > $end) {
$temp = $start;
$start = $end;
$end = $temp;
}
if ( $returnkeys === true) {
$length = ($end + strlen($key2)) - $start;
} else {
$start = $start + strlen($key1);
$length = $end - $start;
}
return substr($source, $start, $length);
}
// substr_delbykeys() - Returns a copy of source string with everything between the first occurrence of both key substrings removed
// - only returns first match, and can be used in loops to iterate through large datasets
// - arg 1 is the first key substring to look for
// - arg 2 is the second key substring to look for
// - arg 3 is the source string the search is performed on.
// - arg 4 is boolean and allows you to determine if returned result should include the search keys.
// - arg 5 is boolean and can be used to determine whether search should be case-sensitive or not.
//
function substr_delbykeys($key1, $key2, $source, $returnkeys, $casematters) {
if ($casematters === true) {
$start = strpos($source, $key1);
$end = strpos($source, $key2);
} else {
$start = stripos($source, $key1);
$end = stripos($source, $key2);
}
if ($start === false || $end === false) { return false; }
if ($start > $end) {
$temp = $start;
$start = $end;
$end = $temp;
}
if ( $returnkeys === true) {
$start = $start + strlen($key1);
$length = $end - $start;
} else {
$length = ($end + strlen($key2)) - $start;
}
return substr_replace($source, '', $start, $length);
}
?>
With some error catching. Specifically, most of the functions presented require $end to exist, whereas in my case I needed it to be optional. Use this if $end is optional, and check for FALSE in case $start doesn't exist at all:
function get_string_between( $string, $start, $end ){
$start_ini = strpos( $string, $start );
if ( $start_ini === false ) {
return FALSE; // $start not found
}
$start_ini += strlen( $start ); // first character after $start
$end_pos = strpos( $string, $end, $start_ini );
if ( $end_pos === false ) {
return substr( $string, $start_ini ); // $end is optional: return the rest of the string
}
return substr( $string, $start_ini, $end_pos - $start_ini );
}
UTF-8 version of @Alejandro Iglesias's answer; it will work for non-Latin characters:
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = mb_strpos($string, $start, 0, 'UTF-8');
if ($ini == 0) return '';
$ini += mb_strlen($start, 'UTF-8');
$len = mb_strpos($string, $end, $ini, 'UTF-8') - $ini;
return mb_substr($string, $ini, $len, 'UTF-8');
}
$fullstring = 'this is my [tag]dog[/tag]';
$parsed = get_string_between($fullstring, '[tag]', '[/tag]');
echo $parsed; // (result = dog)
I use
if (count(explode("<TAG>", $input))>1){
$content = explode("</TAG>",explode("<TAG>", $input)[1])[0];
}else{
$content = "";
}
Substitute <TAG> with whatever delimiter you want.

keyword highlight is highlighting the highlights in PHP preg_replace()

I have a small search engine doing its thing, and want to highlight the results. I thought I had it all worked out till a set of keywords I used today blew it out of the water.
The issue is that preg_replace() is looping through the replacements, and later replacements are replacing the text I inserted into previous ones. Confused? Here is my pseudo function:
public function highlightKeywords ($data, $keywords = array()) {
$find = array();
$replace = array();
$begin = "<span class=\"keywordHighlight\">";
$end = "</span>";
foreach ($keywords as $kw) {
$find[] = '/' . str_replace("/", "\/", $kw) . '/iu';
$replace[] = $begin . "\$0" . $end;
}
return preg_replace($find, $replace, $data);
}
OK, so it works when searching for "fred" and "dagg" but sadly, when searching for "class" and "lass" and "as" it strikes a real issue when highlighting "Joseph's Class Group"
Joseph's <span class="keywordHighlight">Cl</span><span <span c<span <span class="keywordHighlight">cl</span>ass="keywordHighlight">lass</span>="keywordHighlight">c<span <span class="keywordHighlight">cl</span>ass="keywordHighlight">lass</span></span>="keywordHighlight">ass</span> Group
How would I get the latter replacements to only work on the non-HTML components, but to also allow the tagging of the whole match? e.g. if I was searching for "cla" and "lass" I would want "class" to be highlighted in full as both the search terms are in it, even though they overlap, and the highlighting that was applied to the first match has "class" in it, but that shouldn't be highlighted.
Sigh.
I would rather use a PHP solution than a jQuery (or any client-side) one.
Note: I have tried to sort the keywords by length, doing the long ones first, but that means the cross-over searches do not highlight, meaning with "cla" and "lass" only part of the word "class" would highlight, and it still murdered the replacement tags :(
EDIT: I have messed about, starting with pencil & paper, and wild ramblings, and come up with some very unglamorous code to solve this issue. It's not great, so suggestions to trim/speed this up would still be greatly appreciated :)
public function highlightKeywords ($data, $keywords = array()) {
$find = array();
$replace = array();
$begin = "<span class=\"keywordHighlight\">";
$end = "</span>";
$hits = array();
foreach ($keywords as $kw) {
$offset = 0;
while (($pos = stripos($data, $kw, $offset)) !== false) {
$hits[] = array($pos, $pos + strlen($kw));
$offset = $pos + 1;
}
}
if ($hits) {
usort($hits, function($a, $b) {
if ($a[0] == $b[0]) {
return 0;
}
return ($a[0] < $b[0]) ? -1 : 1;
});
$thisthat = array(0 => $begin, 1 => $end);
for ($i = 0; $i < count($hits); $i++) {
foreach ($thisthat as $key => $val) {
$pos = $hits[$i][$key];
$data = substr($data, 0, $pos) . $val . substr($data, $pos);
for ($j = 0; $j < count($hits); $j++) {
if ($hits[$j][0] >= $pos) {
$hits[$j][0] += strlen($val);
}
if ($hits[$j][1] >= $pos) {
$hits[$j][1] += strlen($val);
}
}
}
}
}
return $data;
}
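When overlapping keywords do not have to be merged into one highlight, a simpler variant is to combine all keywords into a single alternation and run preg_replace once, so inserted markup is never scanned again. This is a sketch of that different technique, not the approach above: highlightOnce is a hypothetical name, and with "cla" and "lass" it would highlight only "Cla" rather than the whole word.

```php
<?php
// Hypothetical variant of highlightKeywords(): one alternation, one pass.
function highlightOnce(string $data, array $keywords): string
{
    if ($keywords === []) {
        return $data;
    }
    // Longest first, so that of two keywords starting at the same position the longer one wins.
    usort($keywords, function ($a, $b) { return strlen($b) - strlen($a); });
    $quoted = array_map(function ($kw) { return preg_quote($kw, '/'); }, $keywords);
    $pattern = '/' . implode('|', $quoted) . '/iu';
    // preg_replace() returns null on error (e.g. invalid UTF-8); fall back to the input.
    return preg_replace($pattern, '<span class="keywordHighlight">$0</span>', $data) ?? $data;
}

echo highlightOnce("Joseph's Class Group", ['class', 'lass', 'as']);
// Joseph's <span class="keywordHighlight">Class</span> Group
```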
I've used the following to address this problem:
<?php
$protected_matches = array();
function protect($matches) { // the callback receives the match array by value
global $protected_matches;
return "\0" . array_push($protected_matches, $matches[0]) . "\0";
}
function restore($matches) { // the callback receives the match array by value
global $protected_matches;
return '<span class="keywordHighlight">' .
$protected_matches[$matches[1] - 1] . '</span>';
}
preg_replace_callback('/\x0(\d+)\x0/', 'restore',
preg_replace_callback($patterns, 'protect', $target_string));
The first preg_replace_callback pulls out all matches and replaces them with nul-byte-wrapped placeholders; the second pass replaces them with the span tags.
Edit: Forgot to mention that $patterns was sorted by string length, longest to shortest.
Edit; another solution
<?php
function highlightKeywords($data, $keywords = array(),
$prefix = '<span class="hilite">', $suffix = '</span>') {
$datacopy = strtolower($data);
$keywords = array_map('strtolower', $keywords);
$start = array();
$end = array();
foreach ($keywords as $keyword) {
$offset = 0;
$length = strlen($keyword);
while (($pos = strpos($datacopy, $keyword, $offset)) !== false) {
$start[] = $pos;
$end[] = $offset = $pos + $length;
}
}
if (!count($start)) return $data;
sort($start);
sort($end);
// Merge and sort start/end using negative values to identify endpoints
$zipper = array();
$i = 0;
$n = count($end);
while ($i < $n)
$zipper[] = count($start) && $start[0] <= $end[$i]
? array_shift($start)
: -$end[$i++];
// EXAMPLE:
// [ 9, 10, -14, -14, 81, 82, 86, -86, -86, -90, 99, -103 ]
// take 9, discard 10, take -14, take -14, create pair,
// take 81, discard 82, discard 86, take -86, take -86, take -90, create pair
// take 99, take -103, create pair
// result: [9,14], [81,90], [99,103]
// Generate non-overlapping start/end pairs
$a = array_shift($zipper);
$z = $x = null;
while ($x = array_shift($zipper)) {
if ($x < 0)
$z = $x;
else if ($z) {
$spans[] = array($a, -$z);
$a = $x;
$z = null;
}
}
$spans[] = array($a, -$z);
// Insert the prefix/suffix in the start/end locations
$n = count($spans);
while ($n--)
$data = substr($data, 0, $spans[$n][0])
. $prefix
. substr($data, $spans[$n][0], $spans[$n][1] - $spans[$n][0])
. $suffix
. substr($data, $spans[$n][1]);
return $data;
}
I had to revisit this subject myself today and wrote a better version of the above. I'll include it here. It's the same idea only easier to read and should perform better since it uses arrays instead of concatenation.
<?php
function highlight_range_sort($a, $b) {
$A = abs($a);
$B = abs($b);
if ($A == $B)
return $a < $b ? 1 : 0;
else
return $A < $B ? -1 : 1;
}
function highlightKeywords($data, $keywords = array(),
$prefix = '<span class="highlight">', $suffix = '</span>') {
$datacopy = strtolower($data);
$keywords = array_map('strtolower', $keywords);
// this will contain offset ranges to be highlighted
// positive offset indicates start
// negative offset indicates end
$ranges = array();
// find start/end offsets for each keyword
foreach ($keywords as $keyword) {
$offset = 0;
$length = strlen($keyword);
while (($pos = strpos($datacopy, $keyword, $offset)) !== false) {
$ranges[] = $pos;
$ranges[] = -($offset = $pos + $length);
}
}
if (!count($ranges))
return $data;
// sort offsets by abs(), positive
usort($ranges, 'highlight_range_sort');
// combine overlapping ranges by keeping lesser
// positive and negative numbers
$i = 0;
while ($i < count($ranges) - 1) {
if ($ranges[$i] < 0) {
if ($ranges[$i + 1] < 0)
array_splice($ranges, $i, 1);
else
$i++;
} else if ($ranges[$i + 1] < 0)
$i++;
else
array_splice($ranges, $i + 1, 1);
}
// create substrings
$ranges[] = strlen($data);
$substrings = array(substr($data, 0, $ranges[0]));
for ($i = 0, $n = count($ranges) - 1; $i < $n; $i += 2) {
// prefix + highlighted_text + suffix + regular_text
$substrings[] = $prefix;
$substrings[] = substr($data, $ranges[$i], -$ranges[$i + 1] - $ranges[$i]);
$substrings[] = $suffix;
$substrings[] = substr($data, -$ranges[$i + 1], $ranges[$i + 2] + $ranges[$i + 1]);
}
// join and return substrings
return implode('', $substrings);
}
// Example usage:
echo highlightKeywords("This is a test.\n", array("is"), '(', ')');
echo highlightKeywords("Classes are as hard as they say.\n", array("as", "class"), '(', ')');
// Output:
// Th(is) (is) a test.
// (Class)es are (as) hard (as) they say.
OP - something that's not clear in the question is whether $data can contain HTML from the get-go. Can you clarify this?
If $data can contain HTML itself, you are getting into the realm of attempting to parse a non-regular language with a regular-language parser, and that's not going to work out well.
In such a case, I would suggest loading the $data HTML into a PHP DOMDocument, getting hold of all of the textNodes and running one of the other perfectly good answers on the contents of each text block in turn.
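A rough sketch of that idea, in case it helps. highlight_html() is a made-up name, it handles a single keyword only, and it assumes the DOM extension is available and that the input is plain ASCII (DOMDocument::loadHTML needs extra care for other encodings). Text inside script or style elements is not treated specially here.

```php
<?php
// Sketch only: highlight_html() is a hypothetical helper, not part of any answer above.
function highlight_html(string $html, string $keyword): string
{
    if ($keyword === '') {
        return $html;
    }
    libxml_use_internal_errors(true);   // tolerate imperfect markup
    $doc = new DOMDocument();
    $doc->loadHTML('<body>' . $html . '</body>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $pattern = '/(' . preg_quote($keyword, '/') . ')/i';
    // Only text nodes are visited, so tag names and attributes are never touched.
    foreach (iterator_to_array($xpath->query('//body//text()')) as $node) {
        $parts = preg_split($pattern, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue;                   // keyword not present in this text node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) {         // odd indexes hold the captured keyword
                $span = $doc->createElement('span');
                $span->setAttribute('class', 'highlight');
                $span->appendChild($doc->createTextNode($part));
                $fragment->appendChild($span);
            } elseif ($part !== '') {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($fragment, $node);
    }
    // Serialize only the children of <body>, i.e. the original fragment.
    $out = '';
    foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo highlight_html('<p>Alice goes to <a href="/school">school</a></p>', 'school');
```

The keyword in the href attribute is left alone because only text nodes are rewritten, which is the whole point of going through the DOM instead of running a string search over the raw markup.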

How to get a substring between two strings in PHP?

I need a function that returns the substring between two words (or two characters).
I'm wondering whether there is a php function that achieves that. I do not want to think about regex (well, I could do one but really don't think it's the best way to go). Thinking of strpos and substr functions.
Here's an example:
$string = "foo I wanna a cake foo";
We call the function: $substring = getInnerSubstring($string,"foo");
It returns: " I wanna a cake ".
Update:
So far I can only get a single substring between two words in one string. Let me go a bit further and ask whether getInnerSubstring($str,$delim) can be extended to return every string that sits between occurrences of the delim value, for example:
$string =" foo I like php foo, but foo I also like asp foo, foo I feel hero foo";
I get an array like {"I like php", "I also like asp", "I feel hero"}.
If the strings are different (ie: [foo] & [/foo]), take a look at this post from Justin Cook.
I copy his code below:
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$fullstring = 'this is my [tag]dog[/tag]';
$parsed = get_string_between($fullstring, '[tag]', '[/tag]');
echo $parsed; // (result = dog)
Regular expressions are the way to go:
$str = 'before-str-after';
if (preg_match('/before-(.*?)-after/', $str, $match) == 1) {
echo $match[1];
}
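The same approach can be stretched to arbitrary delimiters and to every match, which is what the update in the question asks for. This is only a sketch: regex_between_all() is a made-up name, and the delimiters are passed through preg_quote so that characters such as [ or / are treated literally.

```php
<?php
// Sketch only: regex_between_all() is a hypothetical helper name.
function regex_between_all(string $str, string $start, string $end): array
{
    // Non-greedy (.*?) stops at the nearest $end; the s modifier lets . match newlines.
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';
    preg_match_all($pattern, $str, $matches);
    return $matches[1];   // contents of the first capture group, one entry per match
}

$string = ' foo I like php foo, but foo I also like asp foo, foo I feel hero foo';
print_r(regex_between_all($string, 'foo', 'foo'));
// Array ( [0] => " I like php " [1] => " I also like asp " [2] => " I feel hero " )
```

Wrap the result in array_map('trim', ...) if the surrounding spaces are not wanted.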
function getBetween($string, $start = "", $end = ""){
if (strpos($string, $start) !== false) { // false means $start does not exist in $string
$startCharCount = strpos($string, $start) + strlen($start);
$firstSubStr = substr($string, $startCharCount, strlen($string));
$endCharCount = strpos($firstSubStr, $end);
if ($end === "" || $endCharCount === false) { // no $end given or found: take the rest of the string
$endCharCount = strlen($firstSubStr);
}
return substr($firstSubStr, 0, $endCharCount);
} else {
return '';
}
}
Sample use:
echo getBetween("abc","a","c"); // returns: 'b'
echo getBetween("hello","h","o"); // returns: 'ell'
echo getBetween("World","a","r"); // returns: ''
Use PHP's strstr function twice.
$value = "This is a great day to be alive";
$value = strstr($value, "is"); //gets all text from needle on
$value = strstr($value, "be", true); //gets all text before needle
echo $value;
outputs:
"is is a great day to "
(The first call matches the "is" inside "This", and the space before "be" is kept.)
function getInnerSubstring($string,$delim){
// "foo a foo" becomes: array(""," a ","")
$string = explode($delim, $string, 3); // also, we only need 2 items at most
// we check whether the 2nd is set and return it, otherwise we return an empty string
return isset($string[1]) ? $string[1] : '';
}
Example of use:
var_dump(getInnerSubstring('foo Hello world foo','foo'));
// prints: string(13) " Hello world "
If you want to remove surrounding whitespace, use trim. Example:
var_dump(trim(getInnerSubstring('foo Hello world foo','foo')));
// prints: string(11) "Hello world"
function getInbetweenStrings($start, $end, $str){
$matches = array();
$regex = "/$start([a-zA-Z0-9_]*)$end/";
preg_match_all($regex, $str, $matches);
return $matches[1];
}
For example, say you want the array of strings (keys) between ## in the following
example, where '/' doesn't fall in between:
$str = "C://##ad_custom_attr1##/##upn##/##samaccountname##";
$str_arr = getInbetweenStrings('##', '##', $str);
print_r($str_arr);
I like the regular expression solutions but none of the others suit me.
If you know there is only gonna be 1 result you can use the following:
$between = preg_replace('/(.*)BEFORE(.*)AFTER(.*)/s', '\2', $string);
Change BEFORE and AFTER to the desired delimiters.
Also keep in mind this function will return the whole string in case nothing matched.
This solution is multiline but you can play with the modifiers depending on your needs.
Not a php pro. but i recently ran into this wall too and this is what i came up with.
function tag_contents($string, $tag_open, $tag_close){
$result = array(); // returned as-is when no tag is found
foreach (explode($tag_open, $string) as $key => $value) {
if(strpos($value, $tag_close) !== FALSE){
$result[] = substr($value, 0, strpos($value, $tag_close));
}
}
return $result;
}
$string = "i love cute animals, like [animal]cat[/animal],
[animal]dog[/animal] and [animal]panda[/animal]!!!";
echo "<pre>";
print_r(tag_contents($string , "[animal]" , "[/animal]"));
echo "</pre>";
//result
Array
(
[0] => cat
[1] => dog
[2] => panda
)
A vast majority of answers here don't answer the edited part, I guess they were added before. It can be done with regex, as one answer mentions. I had a different approach.
This function searches $string and finds the first string between $start and $end strings, starting at $offset position. It then updates the $offset position to point to the start of the result. If $includeDelimiters is true, it includes the delimiters in the result.
If the $start or $end string are not found, it returns null. It also returns null if $string, $start, or $end are an empty string.
function str_between(string $string, string $start, string $end, bool $includeDelimiters = false, int &$offset = 0): ?string
{
if ($string === '' || $start === '' || $end === '') return null;
$startLength = strlen($start);
$endLength = strlen($end);
$startPos = strpos($string, $start, $offset);
if ($startPos === false) return null;
$endPos = strpos($string, $end, $startPos + $startLength);
if ($endPos === false) return null;
$length = $endPos - $startPos + ($includeDelimiters ? $endLength : -$startLength);
$offset = $startPos + ($includeDelimiters ? 0 : $startLength);
$result = substr($string, $offset, $length); // an empty string when nothing lies between the delimiters
return ($result !== false ? $result : null);
}
The following function finds all strings that are between two strings (no overlaps). It requires the previous function, and the arguments are the same. After execution, $offset points just past the last match found.
function str_between_all(string $string, string $start, string $end, bool $includeDelimiters = false, int &$offset = 0): ?array
{
$strings = [];
$length = strlen($string);
while ($offset < $length)
{
$found = str_between($string, $start, $end, $includeDelimiters, $offset);
if ($found === null) break;
$strings[] = $found;
$offset += strlen($includeDelimiters ? $found : $found . $end); // move offset to the end of the newfound string ($offset is already past $start when delimiters are excluded)
}
return $strings;
}
Examples:
str_between_all('foo 1 bar 2 foo 3 bar', 'foo', 'bar') gives [' 1 ', ' 3 '].
str_between_all('foo 1 bar 2', 'foo', 'bar') gives [' 1 '].
str_between_all('foo 1 foo 2 foo 3 foo', 'foo', 'foo') gives [' 1 ', ' 3 '].
str_between_all('foo 1 bar', 'foo', 'foo') gives [].
If you're using foo as a delimiter, then look at explode()
<?php
function getBetween($content,$start,$end){
$r = explode($start, $content);
if (isset($r[1])){
$r = explode($end, $r[1]);
return $r[0];
}
return '';
}
?>
Example:
<?php
$content = "Try to find the guy in the middle with this function!";
$start = "Try to find ";
$end = " with this function!";
$output = getBetween($content,$start,$end);
echo $output;
?>
This will return "the guy in the middle".
Simple, short, and sweet. It's up to you to make any enhancements.
function getStringBetween($str, $start, $end)
{
$pos1 = strpos($str, $start) + strlen($start); // first character after $start
$pos2 = strpos($str, $end, $pos1); // look for $end only after $start
return substr($str, $pos1, $pos2 - $pos1);
}
If the pattern recurs several times in a single string and the start and end delimiters differ (for example [start] and [/end]),
here's a function which outputs an array.
function get_string_between($string, $start, $end){
$return = array();
$split_string = explode($end,$string);
foreach($split_string as $data) {
$str_pos = strpos($data,$start);
if ($str_pos === false) continue; // this chunk has no start delimiter
// keep everything after the start delimiter, whatever its length
$return[] = substr($data,$str_pos+strlen($start));
}
return $return;
}
Here's a function
function getInnerSubstring($string, $boundstring, $trimit=false) {
$res = false;
$bstart = strpos($string, $boundstring);
if ($bstart !== false) {
$bend = strrpos($string, $boundstring);
if ($bend !== false && $bend > $bstart)
$res = substr($string, $bstart+strlen($boundstring), $bend-$bstart-strlen($boundstring));
}
return $trimit ? trim($res) : $res;
}
Use it like
$string = "foo I wanna a cake foo";
$substring = getInnerSubstring($string, "foo");
echo $substring;
Output (note that it returns the spaces at the front and at the end of your string, if they exist)
I wanna a cake
If you want to trim result use function like
$substring = getInnerSubstring($string, "foo", true);
Result: This function will return false if $boundstring was not found in $string or if $boundstring exists only once in $string, otherwise it returns substring between first and last occurrence of $boundstring in $string.
References
strpos()
strrpos()
substr()
trim()
Improvement of Alejandro's answer. You can leave the $start or $end arguments empty and it will use the start or end of the string.
echo get_string_between("Hello my name is bob", "my", ""); //output: " name is bob"
private function get_string_between($string, $start, $end){ // Get
if($start != ''){ //If $start is empty, use start of the string
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
}
else{
$ini = 0;
}
if ($end == '') { //If $end is blank, use end of string
return substr($string, $ini);
}
else{
$len = strpos($string, $end, $ini) - $ini; //Work out length of string
return substr($string, $ini, $len);
}
}
private function getStringBetween(string $from, string $to, string $haystack): string
{
$fromPosition = strpos($haystack, $from) + strlen($from);
$toPosition = strpos($haystack, $to, $fromPosition);
$betweenLength = $toPosition - $fromPosition;
return substr($haystack, $fromPosition, $betweenLength);
}
Use:
<?php
$str = "...server daemon started with pid=6849 (parent=6848).";
$from = "pid=";
$to = "(";
echo getStringBetween($str,$from,$to);
function getStringBetween($str,$from,$to)
{
$sub = substr($str, strpos($str,$from)+strlen($from),strlen($str));
return substr($sub,0,strpos($sub,$to));
}
?>
Slightly improved code from GarciaWebDev and Henry Wang. If an empty $start or $end is given, the function returns everything from the beginning or up to the end of $string. There is also an $inclusive option that controls whether the delimiters themselves are included in the result:
function get_string_between ($string, $start, $end, $inclusive = false){
$string = " ".$string;
if ($start == "") { $ini = 0; }
else { $ini = strpos($string, $start); }
if ($end == "") { $len = strlen($string); }
else { $len = strpos($string, $end, $ini) - $ini;}
if (!$inclusive) { $ini += strlen($start); }
else { $len += strlen($end); }
return substr($string, $ini, $len);
}
I have to add something to Julius Tilvikas's post. I was looking for a solution like the one he describes, but I think there is a mistake in it: I don't get just the string between the two strings, I get more, because the length of the start string has to be subtracted. With that change, I really do get the string between the two strings.
Here are my changes to his solution:
function get_string_between ($string, $start, $end, $inclusive = false){
$string = " ".$string;
if ($start == "") { $ini = 0; }
else { $ini = strpos($string, $start); }
if ($end == "") { $len = strlen($string); }
else { $len = strpos($string, $end, $ini) - $ini - strlen($start);}
if (!$inclusive) { $ini += strlen($start); }
else { $len += strlen($end); }
return substr($string, $ini, $len);
}
Try this, it works for me: it gets the data between the two occurrences of the word test.
$str = "Xdata test HD01 test 1data";
$result = explode('test',$str);
print_r($result);
echo $result[1];
In PHP's strpos style this will return false if the start mark sm or the end mark em are not found.
This result (false) is different from an empty string that is what you get if there is nothing between the start and end marks.
function between( $str, $sm, $em )
{
$s = strpos( $str, $sm );
if( $s === false ) return false;
$s += strlen( $sm );
$e = strpos( $str, $em, $s );
if( $e === false ) return false;
return substr( $str, $s, $e - $s );
}
The function will return only the first match.
It's obvious but worth mentioning that the function will first look for sm and then for em.
This implies you may not get the desired result/behaviour if em has to be searched for first and the string then has to be parsed backward in search of sm.
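For that backward case, something along these lines might do. It is only a sketch under my own assumptions: between_last() is a made-up name, it looks for the last em first and then for the nearest sm before it, and it keeps the same false-on-failure convention.

```php
<?php
// Sketch only: between_last() is a hypothetical counterpart to between().
function between_last(string $str, string $sm, string $em)
{
    $e = strrpos($str, $em);            // last occurrence of the end mark
    if ($e === false) return false;
    $head = substr($str, 0, $e);        // everything before that end mark
    $s = strrpos($head, $sm);           // nearest start mark, searching backward
    if ($s === false) return false;
    return substr($head, $s + strlen($sm));
}

var_dump(between_last('a [x] b [y] c', '[', ']')); // string(1) "y"
```

As with between(), an empty string means the marks were adjacent, while false means one of them was missing.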
This is the function I'm using for this. I combined two answers in one function for single or multiple delimiters.
function getStringBetweenDelimiters($p_string, $p_from, $p_to, $p_multiple=false){
//checking for valid main string
if (strlen($p_string) > 0) {
//checking for multiple strings
if ($p_multiple) {
// getting list of results by end delimiter
$result_list = explode($p_to, $p_string);
//looping through result list array
foreach ( $result_list AS $rlkey => $rlrow) {
// getting result start position
$result_start_pos = strpos($rlrow, $p_from);
// calculating result length
$result_len = strlen($rlrow) - $result_start_pos;
// return only rows that contain $p_from
if ($result_start_pos !== false) {
// cleaning result string + removing $p_from text from result
$result[] = substr($rlrow, $result_start_pos + strlen($p_from), $result_len);
}// end if
} // end foreach
// if single string
} else {
// result start point, just past the $p_from text
$result_start_pos = strpos($p_string, $p_from) + strlen($p_from);
// length of result string: distance from the start point to $p_to
$result_length = strpos($p_string, $p_to, $result_start_pos) - $result_start_pos;
// cleaning result string
$result = substr($p_string, $result_start_pos, $result_length);
} // end if else
// if empty main string
} else {
$result = false;
} // end if else
return $result;
} // end func. get string between
For simple use (returns " two ", including the surrounding spaces):
$result = getStringBetweenDelimiters(" one two three ", 'one', 'three');
For getting each row in a table to result array :
$result = getStringBetweenDelimiters($table, '<tr>', '</tr>', true);
an edited version of what Alejandro García Iglesias put.
This allows you to pick a specific location of the string you want to get based on the number of times the result is found.
function get_string_between_pos($string, $start, $end, $pos){
$cPos = 0;
$ini = 0;
$result = '';
for($i = 0; $i < $pos; $i++){
$ini = strpos($string, $start, $cPos);
if ($ini === false) return ''; // no further occurrence of $start
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
$result = substr($string, $ini, $len);
$cPos = $ini + $len;
}
return $result;
}
usage:
$text = 'string has start test 1 end and start test 2 end and start test 3 end to print';
//get $result = "test 1"
$result = $this->get_string_between_pos($text, 'start', 'end', 1);
//get $result = "test 2"
$result = $this->get_string_between_pos($text, 'start', 'end', 2);
//get $result = "test 3"
$result = $this->get_string_between_pos($text, 'start', 'end', 3);
strpos takes an optional third argument that makes it start its search at a specific offset, so I store the previous position in $cPos; when the for loop runs again, the search resumes where the last match ended.
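To see the offset argument on its own, here is a small sketch that collects every position of a needle. all_positions() is a made-up name used only for illustration.

```php
<?php
// Sketch only: all_positions() is a hypothetical helper name.
function all_positions(string $haystack, string $needle): array
{
    $positions = [];
    if ($needle === '') {
        return $positions;              // an empty needle would never advance
    }
    $offset = 0;
    // The third argument tells strpos where to resume the search.
    while (($pos = strpos($haystack, $needle, $offset)) !== false) {
        $positions[] = $pos;
        $offset = $pos + strlen($needle);
    }
    return $positions;
}

print_r(all_positions('start a end start b end', 'start')); // positions 0 and 12
```

Note the strict !== false comparison: a match at position 0 is a perfectly valid result and must not be confused with "not found".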
An easy solution using substr:
$posStart = stripos($string, $start) + strlen($start);
$length = stripos($string, $end, $posStart) - $posStart; // search for $end only after $start
$substring = substr($string, $posStart, $length);
Use:
function getdatabetween($string, $start, $end){
$sp = strpos($string, $start)+strlen($start);
$ep = strpos($string, $end, $sp) - $sp; // length of the text between $start and $end
$data = trim(substr($string, $sp, $ep));
return trim($data);
}
$dt = "Find string between two strings in PHP";
echo getdatabetween($dt, 'Find', 'in PHP');
I had some problems with the get_string_between() function, used here. So I came with my own version. Maybe it could help people in the same case as mine.
protected function string_between($string, $start, $end, $inclusive = false) {
$fragments = explode($start, $string, 2);
if (isset($fragments[1])) {
$fragments = explode($end, $fragments[1], 2);
if ($inclusive) {
return $start.$fragments[0].$end;
} else {
return $fragments[0];
}
}
return false;
}
I wrote these some time back and have found them very useful for a wide range of applications.
<?php
// substr_getbykeys() - Returns everything in a source string that exists between the first occurrence of each of the two key substrings
// - only returns first match, and can be used in loops to iterate through large datasets
// - arg 1 is the first substring to look for
// - arg 2 is the second substring to look for
// - arg 3 is the source string the search is performed on.
// - arg 4 is boolean and allows you to determine if returned result should include the search keys.
// - arg 5 is boolean and can be used to determine whether search should be case-sensitive or not.
//
function substr_getbykeys($key1, $key2, $source, $returnkeys, $casematters) {
if ($casematters === true) {
$start = strpos($source, $key1);
$end = strpos($source, $key2);
} else {
$start = stripos($source, $key1);
$end = stripos($source, $key2);
}
if ($start === false || $end === false) { return false; }
if ($start > $end) {
$temp = $start;
$start = $end;
$end = $temp;
}
if ( $returnkeys === true) {
$length = ($end + strlen($key2)) - $start;
} else {
$start = $start + strlen($key1);
$length = $end - $start;
}
return substr($source, $start, $length);
}
// substr_delbykeys() - Returns a copy of source string with everything between the first occurrence of both key substrings removed
// - only returns first match, and can be used in loops to iterate through large datasets
// - arg 1 is the first key substring to look for
// - arg 2 is the second key substring to look for
// - arg 3 is the source string the search is performed on.
// - arg 4 is boolean and allows you to determine if returned result should include the search keys.
// - arg 5 is boolean and can be used to determine whether search should be case-sensitive or not.
//
function substr_delbykeys($key1, $key2, $source, $returnkeys, $casematters) {
if ($casematters === true) {
$start = strpos($source, $key1);
$end = strpos($source, $key2);
} else {
$start = stripos($source, $key1);
$end = stripos($source, $key2);
}
if ($start === false || $end === false) { return false; }
if ($start > $end) {
$temp = $start;
$start = $end;
$end = $temp;
}
if ( $returnkeys === true) {
$start = $start + strlen($key1);
$length = $end - $start;
} else {
$length = ($end + strlen($key2)) - $start;
}
return substr_replace($source, '', $start, $length);
}
?>
With some error catching. Specifically, most of the functions presented require $end to exist, when in my case I needed it to be optional. Use this if $end is optional, and check for FALSE if $start doesn't exist at all:
function get_string_between( $string, $start, $end ){
$start_pos = strpos( $string, $start );
if ( $start_pos === false ) {
return FALSE; // $start doesn't exist at all
}
$start_pos += strlen( $start );
$end_pos = strpos( $string, $end, $start_pos );
if ( $end_pos === false ) {
return substr( $string, $start_pos ); // no $end: return the rest of the string
}
return substr( $string, $start_pos, $end_pos - $start_pos );
}
UTF-8 version of @Alejandro Iglesias's answer, which will work for non-Latin characters:
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = mb_strpos($string, $start, 0, 'UTF-8');
if ($ini == 0) return '';
$ini += mb_strlen($start, 'UTF-8');
$len = mb_strpos($string, $end, $ini, 'UTF-8') - $ini;
return mb_substr($string, $ini, $len, 'UTF-8');
}
$fullstring = 'this is my [tag]dog[/tag]';
$parsed = get_string_between($fullstring, '[tag]', '[/tag]');
echo $parsed; // (result = dog)
I use
if (count(explode("<TAG>", $input))>1){
$content = explode("</TAG>",explode("<TAG>", $input)[1])[0];
}else{
$content = "";
}
Substitute <TAG> for whatever delimiter you want.

Remove a string from the beginning of a string

I have a string that looks like this:
$str = "bla_string_bla_bla_bla";
How can I remove the first bla_; but only if it's found at the beginning of the string?
With str_replace(), it removes all bla_'s.
Plain form, without regex:
$prefix = 'bla_';
$str = 'bla_string_bla_bla_bla';
if (substr($str, 0, strlen($prefix)) == $prefix) {
$str = substr($str, strlen($prefix));
}
Takes: 0.0369 ms (0.000,036,954 seconds)
And with:
$prefix = 'bla_';
$str = 'bla_string_bla_bla_bla';
$str = preg_replace('/^' . preg_quote($prefix, '/') . '/', '', $str);
Takes: 0.1749 ms (0.000,174,999 seconds) the 1st run (compiling), and 0.0510 ms (0.000,051,021 seconds) after.
Profiled on my server, obviously.
You can use regular expressions with the caret symbol (^) which anchors the match to the beginning of the string:
$str = preg_replace('/^bla_/', '', $str);
function remove_prefix($text, $prefix) {
if(0 === strpos($text, $prefix))
$text = substr($text, strlen($prefix)).'';
return $text;
}
Here's an even faster approach:
// strpos is faster than an unnecessary substr() and is built just for that
if (strpos($str, $prefix) === 0) $str = substr($str, strlen($prefix));
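One thing to keep in mind with strpos: when the prefix is not at the start, it still searches the rest of the string for it. A sketch of an alternative that only ever looks at the first strlen($prefix) bytes, using strncmp (strip_prefix() is a made-up name, and I have not benchmarked it):

```php
<?php
// Sketch only: strip_prefix() is a hypothetical helper name.
function strip_prefix(string $str, string $prefix): string
{
    // strncmp compares at most strlen($prefix) bytes from the start of both strings.
    if ($prefix !== '' && strncmp($str, $prefix, strlen($prefix)) === 0) {
        return substr($str, strlen($prefix));
    }
    return $str;
}

echo strip_prefix('bla_string_bla_bla_bla', 'bla_'); // string_bla_bla_bla
```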
Here.
$array = explode("_", $string);
if($array[0] == "bla") array_shift($array);
$string = implode("_", $array);
In PHP 8+ we can simplify using the str_starts_with() function:
$str = "bla_string_bla_bla_bla";
$prefix = "bla_";
if (str_starts_with($str, $prefix)) {
$str = substr($str, strlen($prefix));
}
https://www.php.net/manual/en/function.str-starts-with.php
EDIT: Fixed a typo (closing bracket) in the example code.
Nice speed, but this is hard-coded to depend on the needle ending with _. Is there a general version? – toddmo Jun 29 at 23:26
A general version:
$parts = explode($start, $full, 2);
if ($parts[0] === '') {
$end = $parts[1];
} else {
$fail = true;
}
Some benchmarks:
<?php
$iters = 100000;
$start = "/aaaaaaa/bbbbbbbbbb";
$full = "/aaaaaaa/bbbbbbbbbb/cccccccccc/dddddddddd/eeeeeeeeee";
$end = '';
$fail = false;
$t0 = microtime(true);
for ($i = 0; $i < $iters; $i++) {
if (strpos($full, $start) === 0) {
$end = substr($full, strlen($start));
} else {
$fail = true;
}
}
$t = microtime(true) - $t0;
printf("%16s : %f s\n", "strpos+strlen", $t);
$t0 = microtime(true);
for ($i = 0; $i < $iters; $i++) {
$parts = explode($start, $full, 2);
if ($parts[0] === '') {
$end = $parts[1];
} else {
$fail = true;
}
}
$t = microtime(true) - $t0;
printf("%16s : %f s\n", "explode", $t);
On my quite old home PC:
$ php bench.php
Outputs:
strpos+strlen : 0.158388 s
explode : 0.126772 s
Lots of different answers here. All seemingly based on string analysis. Here is my take on this using PHP explode to break up the string into an array of exactly two values and cleanly returning only the second value:
$str = "bla_string_bla_bla_bla";
$str_parts = explode('bla_', $str, 2);
$str_parts = array_filter($str_parts);
$final = array_shift($str_parts);
echo $final;
Output will be:
string_bla_bla_bla
Symfony users can install the string component and use trimPrefix()
u('file-image-0001.png')->trimPrefix('file-'); // 'image-0001.png'
I think substr_replace does what you want, where you can limit your replace to part of your string:
http://nl3.php.net/manual/en/function.substr-replace.php (This will enable you to only look at the beginning of the string.)
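A minimal sketch of how that could look. substr_replace by itself replaces whatever sits in the given range, so the check that the range really holds the prefix is still needed; cut_prefix() is a made-up name.

```php
<?php
// Sketch only: cut_prefix() is a hypothetical helper name.
function cut_prefix(string $str, string $prefix): string
{
    $len = strlen($prefix);
    // Only touch the string when its first $len bytes are exactly the prefix.
    if ($len > 0 && substr($str, 0, $len) === $prefix) {
        // Replace the range [0, $len) with an empty string.
        return substr_replace($str, '', 0, $len);
    }
    return $str;
}

echo cut_prefix('bla_string_bla_bla_bla', 'bla_'); // string_bla_bla_bla
```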
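You might be tempted by the count parameter of str_replace ( http://nl3.php.net/manual/en/function.str-replace.php ), but it only reports how many replacements were performed and cannot limit them. preg_replace has a limit parameter that caps the number of replacements, starting from the left, but even that will not enforce the match to be at the beginning.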

PHP Multiple Occurrences Of Words Within A String

I need to check a string to see if any word in it has multiple occurrences. So basically I will accept:
"google makes love"
but I don't accept:
"google makes google love" or "google makes love love google" etc.
Any ideas? Really don't know any way to approach this, any help would be greatly appreciated.
Based on Wicked Flea code:
function single_use_of_words($str) {
$words = explode(' ', trim($str)); //Trim to prevent any extra blank
if (count(array_unique($words)) == count($words)) {
return true; //Same amount of words
}
return false;
}
Try this:
function single_use_of_words($str) {
$words = explode(' ', $str);
$words = array_unique($words);
return implode(' ', $words);
}
No need for loops or arrays:
<?php
$needle = 'cat';
$haystack = 'cat in the cat hat';
if ( occursMoreThanOnce($haystack, $needle) ) {
echo 'Success';
}
function occursMoreThanOnce($haystack, $needle) {
return strpos($haystack, $needle) !== strrpos($haystack, $needle);
}
?>
<?php
$words = preg_split('/\W+/', $string, -1, PREG_SPLIT_NO_EMPTY); // split on runs of non-word characters
$wordsUnique = array_unique($words);
if (count($words) != count($wordsUnique)) {
echo 'Duplicate word found!';
}
?>
The regular expression way would definitely be my choice.
I did a little test on a string of 320 words with Veynom's function and a regular expression
function preg( $txt ) {
return !preg_match( '/\b(\w+)\b.*?\1/', $txt );
}
Here's the test
$time['preg'] = microtime( true );
for( $i = 0; $i < 1000; $i++ ) {
preg( $txt );
}
$time['preg'] = microtime( true ) - $time['preg'];
$time['veynom-thewickedflea'] = microtime( true );
for( $i = 0; $i < 1000; $i++ ) {
single_use_of_words( $txt );
}
$time['veynom-thewickedflea'] = microtime( true ) - $time['veynom-thewickedflea'];
print_r( $time );
And here's the result I got
Array
(
[preg] => 0.197616815567
[veynom-thewickedflea] => 0.487532138824
)
Which suggests that the RegExp solution, as well as being a lot more concise, is more than twice as fast (for a string of 320 words and 1000 iterations).
When I run the test over 10 000 iterations I get
Array
(
[preg] => 1.51235699654
[veynom-thewickedflea] => 4.99487900734
)
The non RegExp solution also uses a lot more memory.
So.. Regular Expressions for me cos they've got a full tank of gas
EDIT
The text I tested against has duplicate words, If it doesn't, the results may be different. I'll post another set of results.
Update
With the duplicates stripped out ( now 186 words ) the results for 1000 iterations is:
Array
(
[preg] => 0.235826015472
[veynom-thewickedflea] => 0.2528860569
)
About even.
function Accept($str)
{
$words = explode(" ", trim($str));
$len = count($words);
for ($i = 0; $i < $len; $i++)
{
for ($p = 0; $p < $len; $p++)
{
if ($p != $i && $words[$i] == $words[$p])
{
return false;
}
}
}
return true;
}
EDIT
Entire test script. Note, when printing "false" php just prints nothing but true is printed as "1".
<?php
function Accept($str)
{
$words = explode(" ", trim($str));
$len = count($words);
for ($i = 0; $i < $len; $i++)
{
for ($p = 0; $p < $len; $p++)
{
if ($p != $i && $words[$i] == $words[$p])
{
return false;
}
}
}
return true;
}
echo Accept("google makes love"), ", ", Accept("google makes google love"), ", ",
Accept("google makes love love google"), ", ", Accept("babe health insurance babe");
?>
Prints the correct output:
1, , ,
This seems fairly fast. It would be interesting to see (for all the answers) how the memory usage and time taken increase as you increase the length of the input string.
function check($str) {
//remove double spaces
$c = 1;
while ($c) $str = str_replace('  ', ' ', $str, $c); // collapse two spaces into one until none are left
//split into array of words
$words = explode(' ', $str);
foreach ($words as $key => $word) {
//remove current word from array
unset($words[$key]);
//if it still exists in the array it must be duplicated
if (in_array($word, $words)) {
return false;
}
}
return true;
}
Edit
Fixed issue with multiple spaces. I'm not sure whether it is better to remove these at the start (as I have) or check each word is non-empty in the foreach.
The simplest method is to loop through each word and check against all previous words for duplicates.
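A sketch of that loop, with one tweak: instead of rescanning all previous words each time, the words seen so far are kept as array keys, so each check is a single isset(). has_repeated_word() is a made-up name, and words are assumed to be separated by whitespace.

```php
<?php
// Sketch only: has_repeated_word() is a hypothetical helper name.
function has_repeated_word(string $str): bool
{
    $seen = [];
    // Split on any run of whitespace so double spaces do not create empty words.
    foreach (preg_split('/\s+/', trim($str), -1, PREG_SPLIT_NO_EMPTY) as $word) {
        if (isset($seen[$word])) {
            return true;                // this word appeared earlier
        }
        $seen[$word] = true;
    }
    return false;
}

var_dump(has_repeated_word('google makes love'));        // bool(false)
var_dump(has_repeated_word('google makes google love')); // bool(true)
```

The comparison is case-sensitive; run the input through strtolower() first if "Google" and "google" should count as the same word.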
Regular expression with backreferencing
http://www.regular-expressions.info/php.html
http://www.regular-expressions.info/named.html
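A sketch of what the backreference approach could look like. The pattern and the name has_duplicate_word() are my own illustration rather than something taken from the linked pages. The \b on both sides of \1 matters: without it, "cat" followed by "category" would be reported as a duplicate.

```php
<?php
// Sketch only: has_duplicate_word() is a hypothetical helper name.
function has_duplicate_word(string $str): bool
{
    // (\w+) captures a whole word; \1 must then reappear later as a whole word.
    return preg_match('/\b(\w+)\b.*\b\1\b/s', $str) === 1;
}

var_dump(has_duplicate_word('google makes love'));        // bool(false)
var_dump(has_duplicate_word('google makes google love')); // bool(true)
var_dump(has_duplicate_word('cat category'));             // bool(false)
```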
