Php search string (with wildcards)

Is there a way to put a wildcard in a string? I ask because I currently have a function that searches for a substring between two substrings (i.e., it grabs the contents between "my" and "has fleas" in the sentence "my dog has fleas", resulting in "dog").
function get_string_between($string, $start, $end){
$string = " ".$string;
$ini = strpos($string,$start);
if ($ini == 0) return "";
$ini += strlen($start);
$len = strpos($string,$end,$ini) - $ini;
return substr($string,$ini,$len);
}
What I want to do is have it search with a wildcard in the string. So say I search between "%WILDCARD%" and "has fleas" in the sentence "My dog has fleas" - it would still output "dog".
I don't know if I explained it too well but hopefully someone will understand me :P. Thank you very much for reading!

This is one of the few cases where regular expressions are actually helpful. :)
if (preg_match('/my (\w+) has/', $str, $matches)) {
echo $matches[1];
}
See the documentation for preg_match.
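As a minimal sketch of how the same idea could cover arbitrary delimiters, the helper below builds the pattern from two literal strings. The function name between() is hypothetical, and preg_quote() is used so that special characters in the delimiters are matched literally.

```php
<?php
// Hypothetical helper: returns the text between $start and $end, or null if not found.
function between(string $subject, string $start, string $end): ?string
{
    // preg_quote() escapes regex metacharacters (and the "/" delimiter) in the literals.
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';
    return preg_match($pattern, $subject, $m) === 1 ? $m[1] : null;
}

echo between('my dog has fleas', 'my ', ' has fleas'); // prints "dog"
```

The lazy (.*?) stops at the first occurrence of the end delimiter, which mirrors what the strpos-based function in the question does.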

A wildcard pattern can be converted to a regex pattern like this:
function wildcard_match($pattern, $subject) {
$pattern = strtr($pattern, array(
'*' => '.*?', // 0 or more (lazy) - asterisk (*)
'?' => '.', // 1 character - question mark (?)
));
return preg_match("/$pattern/", $subject);
}
If the string contains special characters, e.g. \.+*?^$|{}/'#, they should be escaped with a backslash.
Untested:
function wildcard_match($pattern, $subject) {
// quotemeta function has most similar behavior,
// it escapes \.+*?^$[](), but doesn't escape |{}/'#
// we don't include * and ?
$special_chars = "\.+^$[]()|{}/'#";
$special_chars = str_split($special_chars);
$escape = array();
foreach ($special_chars as $char) $escape[$char] = "\\$char";
$pattern = strtr($pattern, $escape);
$pattern = strtr($pattern, array(
'*' => '.*?', // 0 or more (lazy) - asterisk (*)
'?' => '.', // 1 character - question mark (?)
));
return preg_match("/$pattern/", $subject);
}
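An alternative sketch, assuming it is acceptable to let preg_quote() do the escaping: quote the whole pattern first, then turn the escaped wildcards back into regex tokens. The name wildcard_to_regex() is hypothetical and not part of the answer above.

```php
<?php
// Hypothetical helper: converts a "*" / "?" wildcard pattern into a PCRE pattern.
function wildcard_to_regex(string $pattern): string
{
    // Escape every regex metacharacter, including "*" and "?" (they become "\*" and "\?").
    $quoted = preg_quote($pattern, '/');
    // Translate the escaped wildcards into their regex equivalents.
    $regex = strtr($quoted, array('\\*' => '.*?', '\\?' => '.'));
    return '/' . $regex . '/s';
}

var_dump(preg_match(wildcard_to_regex('my * has'), 'my dog has fleas')); // int(1)
var_dump(preg_match(wildcard_to_regex('a.c'), 'abc'));                   // int(0)
```

Like the functions above, this pattern is not anchored, so it matches anywhere in the subject; add ^ and $ if a full-string match is wanted.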

Use a regex.
$string = "My dog has fleas";
if (preg_match("/\S+ (\S+) has fleas/", $string, $matches))
echo ($matches[1]);
else
echo ("Not found");
\S means any non-space character, + means one or more of the previous thing, so \S+ means match one or more non-space characters. (…) means capture the content of the submatch and put into the $matches array.

I agree that regex are much more flexible than wildcards, but sometimes all you want is a simple way to define patterns. For people looking for a portable solution (not *NIX only) here is my implementation of the function:
function wild_compare($wild, $string) {
$wild_i = 0;
$string_i = 0;
$wild_len = strlen($wild);
$string_len = strlen($string);
while ($string_i < $string_len && $wild[$wild_i] != '*') {
if (($wild[$wild_i] != $string[$string_i]) && ($wild[$wild_i] != '?')) {
return 0;
}
$wild_i++;
$string_i++;
}
$mp = 0;
$cp = 0;
while ($string_i < $string_len) {
if ($wild[$wild_i] == '*') {
if (++$wild_i == $wild_len) {
return 1;
}
$mp = $wild_i;
$cp = $string_i + 1;
}
else
if (($wild[$wild_i] == $string[$string_i]) || ($wild[$wild_i] == '?')) {
$wild_i++;
$string_i++;
}
else {
$wild_i = $mp;
$string_i = $cp++;
}
}
while ($wild[$wild_i] == '*') {
$wild_i++;
}
return $wild_i == $wild_len ? 1 : 0;
}
Naturally the PHP implementation is slower than fnmatch(), but it would work on any platform.
It can be used like this:
if (wild_compare('regex are * useful', 'regex are always useful') == 1) {
echo "I'm glad we agree on this";
}

If you insist on using a wildcard (and yes, PREG is much better), you can use the function fnmatch.
Example:
if (fnmatch('my * has', $str)) { }
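One detail worth checking when trying this: fnmatch() tests the whole string against the pattern, not a substring. A small sketch with the sentence from the question:

```php
<?php
$str = 'my dog has fleas';

// The pattern must account for the entire string.
var_dump(fnmatch('my * has', $str));  // bool(false): the trailing " fleas" is not covered
var_dump(fnmatch('my * has*', $str)); // bool(true)
```

Note that fnmatch() only reports whether the string matches; it does not return the part matched by the wildcard.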

Related

PHP Preg Replace. Remove strings inside {~ string ~} pattern, but skip <pre>{~ string ~}</pre> [duplicate]

I am using a WordPress plugin named Acronyms (https://wordpress.org/plugins/acronyms/). This plugin replaces acronyms with their description. It uses a PHP PREG_REPLACE function.
The issue is that it replaces the acronyms contained in a <pre> tag, which I use to present a source code.
Could you modify this expression so that it won't replace acronyms contained inside <pre> tags (not only directly, but in any moment)? Is it possible?
The PHP code is:
$text = preg_replace(
"|(?!<[^<>]*?)(?<![?.&])\b$acronym\b(?!:)(?![^<>]*?>)|msU"
, "<acronym title=\"$fulltext\">$acronym</acronym>"
, $text
);

You can use a PCRE SKIP/FAIL regex trick (also works in PHP) to tell the regex engine to only match something if it is not inside some delimiters:
(?s)<pre[^<]*>.*?<\/pre>(*SKIP)(*F)|\b$acronym\b
This means: skip all substrings starting with <pre> and ending with </pre>, and only then match $acronym as a whole word.
See demo on regex101.com
Here is a sample PHP demo:
<?php
$acronym = "ASCII";
$fulltext = "American Standard Code for Information Interchange";
$re = "/(?s)<pre[^<]*>.*?<\\/pre>(*SKIP)(*F)|\\b$acronym\\b/";
$str = "<pre>ASCII\nSometext\nMoretext</pre>More text \nASCII\nMore text<pre>More\nlines\nASCII\nlines</pre>";
$subst = "<acronym title=\"$fulltext\">$acronym</acronym>";
$result = preg_replace($re, $subst, $str);
echo $result;
Output:
<pre>ASCII
Sometext
Moretext</pre>More text 
<acronym title="American Standard Code for Information Interchange">ASCII</acronym>
More text<pre>More
lines
ASCII
lines</pre>
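A possible way to package this, sketched under a few assumptions: the function name mark_acronym() is hypothetical, the acronym is passed through preg_quote() in case it contains regex metacharacters, and preg_replace_callback() is used so that characters such as $ in the replacement are not treated as backreferences.

```php
<?php
// Hypothetical wrapper around the SKIP/FAIL pattern shown above.
function mark_acronym(string $text, string $acronym, string $fulltext): string
{
    // Skip whole <pre>...</pre> blocks; otherwise match the acronym as a whole word.
    $re = '~<pre\b[^>]*>.*?</pre>(*SKIP)(*F)|\b' . preg_quote($acronym, '~') . '\b~s';
    $title = htmlspecialchars($fulltext, ENT_QUOTES);

    return preg_replace_callback($re, function ($m) use ($title) {
        return '<acronym title="' . $title . '">' . $m[0] . '</acronym>';
    }, $text);
}

echo mark_acronym('<pre>ASCII</pre> ASCII', 'ASCII', 'American Standard Code for Information Interchange');
```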

It is also possible to use preg_split and keep the code block as a group, only replace the non-code block part then combine it back as a complete string:
function replace($s) {
return str_replace('"', '"', $s); // do something with `$s`
}
$text = 'Your text goes here...';
$parts = preg_split('#(<\/?[-:\w]+(?:\s[^<>]+?)?>)#', $text, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
$text = "";
$x = 0;
foreach ($parts as $v) {
if (trim($v) === "") {
$text .= $v;
continue;
}
if ($v[0] === '<' && substr($v, -1) === '>') {
if (preg_match('#^<(\/)?(?:code|pre)(?:\s[^<>]+?)?>$#', $v, $m)) {
$x = isset($m[1]) && $m[1] === '/' ? 0 : 1;
}
$text .= $v; // this is a HTML tag…
} else {
$text .= !$x ? replace($v) : $v; // process or skip…
}
}
return $text;
Taken from here.

URL conversion (into link and urldecode)

I have two questions about URL conversion in PHP.
First question: I need to convert text into links. I've written my own regex and read many forums, but every solution relies on www. or (ht|f)tp(s). I need a regex that converts domain names in text even without www or http, for example:
I like stackoverflow.com very much
into
I like <a href='http://stackoverflow.com'>stackoverflow.com</a> very much
Of course, it must handle periods, commas, etc., like:
I like stackoverflow.com.
into
I like <a href='http://stackoverflow.com'>stackoverflow.com</a>.
Second question: links with URL-encoded symbols are displayed readably on wiki, but on other sites they are displayed as a URL-encoded string (%XX%XX%XX). How does wiki do this? Thanks!

For your first question, I would not recommend doing that: it is very difficult to know whether a word containing a dot is a domain name, and people often forget to put a space after the period in the middle of a paragraph.
For your second question, it is simple: you URL-encode the link in the href but not the text between the opening and closing a tags. For example:
http://site.na/test.php?t=sqdl&54"dfd°=+
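A minimal sketch of that idea, assuming the URL is already percent-encoded: keep the encoded form in the href and show the decoded form as the link text. The function name readable_link() and the example.com URL are made up for illustration.

```php
<?php
// Hypothetical helper: encoded URL goes in href, decoded URL is shown to the reader.
function readable_link(string $encodedUrl): string
{
    $label = rawurldecode($encodedUrl); // e.g. "%20" becomes a space
    return '<a href="' . htmlspecialchars($encodedUrl, ENT_QUOTES) . '">'
         . htmlspecialchars($label, ENT_QUOTES) . '</a>';
}

echo readable_link('http://example.com/a%20b');
// <a href="http://example.com/a%20b">http://example.com/a b</a>
```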

The function auto_link from CodeIgniter URL helper can help you:
if ( ! function_exists('auto_link'))
{
function auto_link($str, $type = 'both', $popup = FALSE)
{
if ($type != 'email')
{
if (preg_match_all("#(^|\s|\()((http(s?)://)|(www\.))(\w+[^\s\)\<]+)#i", $str, $matches))
{
$pop = ($popup == TRUE) ? " target=\"_blank\" " : "";
for ($i = 0; $i < count($matches['0']); $i++)
{
$period = '';
if (preg_match("|\.$|", $matches['6'][$i]))
{
$period = '.';
$matches['6'][$i] = substr($matches['6'][$i], 0, -1);
}
$str = str_replace($matches['0'][$i],
$matches['1'][$i].'<a href="http'.
$matches['4'][$i].'://'.
$matches['5'][$i].
$matches['6'][$i].'"'.$pop.'>http'.
$matches['4'][$i].'://'.
$matches['5'][$i].
$matches['6'][$i].'</a>'.
$period, $str);
}
}
}
if ($type != 'url')
{
if (preg_match_all("/([a-zA-Z0-9_\.\-\+]+)@([a-zA-Z0-9\-]+)\.([a-zA-Z0-9\-\.]*)/i", $str, $matches))
{
for ($i = 0; $i < count($matches['0']); $i++)
{
$period = '';
if (preg_match("|\.$|", $matches['3'][$i]))
{
$period = '.';
$matches['3'][$i] = substr($matches['3'][$i], 0, -1);
}
$str = str_replace($matches['0'][$i], safe_mailto($matches['1'][$i].'@'.$matches['2'][$i].'.'.$matches['3'][$i]).$period, $str);
}
}
}
return $str;
}
}

Search For Characters in a String

$str1 = "HEADINGLEY";
$str2 = "HDNGLY";
How can I search string 1 to see whether it contains all the characters from string 2, in the order they appear in string 2, and return a true or false value?
Any help appreciated :D
If this helps, this is the code I have to date:
echo preg_match('/.*H.*D.*N.*G.*L.*Y.*/', $str1, $matches);
returns 0

I guess I'd use preg_match
http://us.php.net/manual/en/function.preg-match.php
...and turning $str2 into a regular expression using str_split to get an array and implode to turn it back into a string with ".*" as glue between characters.
http://us.php.net/manual/en/function.str-split.php
http://us.php.net/manual/en/function.implode.php
Update -- I should have suggested str_split instead of explode, and here is the code that will get you a regular expression from $str2:
$str2pat = "/.*" . implode(".*", str_split($str2)) . ".*/";

You could use some regular expression match like .*H.*D.*N.*G.*L.*Y.* and preg_match.
Should be easy enough to add the .* in the second string.

This code is in C# as I don't know PHP but should be easy to understand.
int index1 = 0, index2 = 0;
while (index1 < str1.Length && index2 < str2.Length)
{
if (str1[index1] == str2[index2]) index2++;
index1++;
}
if (index2 == str2.Length) return true;
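For readers who want the same two-index scan in PHP, here is a rough port. The function name is_subsequence() is hypothetical; the logic follows the C# loop above.

```php
<?php
// Hypothetical PHP port: true if every character of $needle appears in $haystack, in order.
function is_subsequence(string $needle, string $haystack): bool
{
    $j = 0;                 // position in $needle
    $n = strlen($needle);
    for ($i = 0, $len = strlen($haystack); $i < $len && $j < $n; $i++) {
        if ($haystack[$i] === $needle[$j]) {
            $j++;           // matched one more character of $needle
        }
    }
    return $j === $n;
}

var_dump(is_subsequence('HDNGLY', 'HEADINGLEY')); // bool(true)
var_dump(is_subsequence('YLG', 'HEADINGLEY'));    // bool(false)
```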

Treat each string as an array of characters and then use the array_diff function.
http://hu2.php.net/array_diff
$str1 = "HEADINGLEY";
$str2 = "HDNGLY";
$arr1 = str_split($str1, 1);
$arr2 = str_split($str2, 1);
$arr2_uniques = array_diff($arr2, $arr1);
return array() === $arr2_uniques;

Possibly this snippet will do it:
var contains = true;
for (var i=0; i < str2.length; i++)
{
var char = str2.charAt(i);
var indexOfChar = str1.indexOf(char);
if (indexOfChar < 0)
{
contains = false;
}
str1 = str1.substr(indexOfChar);
}

You can use the preg_match function, like this:
$split_str2 = str_split($str2);
// Create pattern from str2
$pattern = "/" . implode(".*", $split_str2) . "/";
// Find
$result = preg_match($pattern, $str1);
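If $str2 may contain characters that are special in a regex (a dot, a slash, a bracket), each character can be quoted before the pattern is assembled. This is a sketch only, and the helper name subsequence_regex() is an assumption.

```php
<?php
// Hypothetical helper: builds a pattern that matches the characters of $needle in order.
function subsequence_regex(string $needle): string
{
    // Quote each character so that ".", "/", "[" and friends are matched literally.
    $parts = array_map(function ($c) {
        return preg_quote($c, '/');
    }, str_split($needle));
    return '/' . implode('.*', $parts) . '/s';
}

var_dump(preg_match(subsequence_regex('HDNGLY'), 'HEADINGLEY')); // int(1)
var_dump(preg_match(subsequence_regex('a.c'), 'abc'));           // int(0)
```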

You can probably use a regular expression, but be wary of strings that contain regex-specific characters. This solution does it programmatically.
function contains_all_characters($str1, $str2)
{
$i1 = strlen($str1) - 1;
$i2 = strlen($str2) - 1;
while ($i2 >= 0)
{
if ($i1 == -1) return false;
if ($str1[$i1--] == $str2[$i2]) $i2--;
}
return true;
}

You can try:
$str = 'HEADINGLEY';
if (preg_match('/[^H]*H[^D]*D[^N]*N[^G]*G[^L]*L[^Y]*Y/', $str, $m))
var_dump($m[0]);
Update: Even better is to build regex like this and then use it:
$str = 'HEADINGLEY';
$pattern = 'HDNGLY';
$regex = '#' . preg_replace('#(.)#', '[^$1]*$1', $pattern) . '#';
if (preg_match($regex, $str, $m))
var_dump($m[0]);
OUTPUT:
string(10) "HEADINGLEY"

Convert lib_string to string w/o Regex

I need to convert lib_someString to someString inside a block of text using str_replace [not regex].
Here's an example to give an exact sense what I mean: lib_12345 => 12345. I need to do this for a bunch of instances in a block of text.
Below is my attempt. Problem I'm getting is that my function is not doing anything (I just get lib_id returned).
function extractLibId($val){ // function to get the "12345" in the above example
$lclRetVal = substr($val, 5, strlen($val));
return $lclRetVal;
}
function Lib($text){ // does the replace for all lib_ instances in the text
$lclVar = "lib_";
$text = str_replace($lclVar, "<a href='".extractLibId($lclVar)."'>".extractLibId($lclVar)."</a>", $text);
return $text;
}

A regexp is going to be faster and clearer, and you won't need to call your function for every possible 'lib_' string:
function Lib($text) {
$count = null;
return preg_replace('/lib_([0-9]+)/', '$1', $text, -1, $count);
}
$text = 'some text lib_123123 goes here lib_111';
$text = Lib($text);
Without a regexp (but every time Lib2 is called, a cute kitten dies somewhere):
function extractLibId($val) {
$lclRetVal = substr($val, 4);
return $lclRetVal;
}
function Lib2($text) {
$count = null;
while (($pos = strpos($text, 'lib_')) !== false) {
$end = $pos;
while ($end < strlen($text) && !in_array($text[$end], array(' ', ',', '.')))
$end++;
$sub = substr($text, $pos, $end - $pos);
$text = str_replace($sub, ''.extractLibId($sub).'', $text);
}
return $text;
}
$text = 'some text lib_123123 goes here lib_111';
$text = Lib2($text);
Use preg_replace.

Although it is possible to do what you need without regular expressions, you say you don't want to use them because of performance reasons. I doubt the other solution will be faster, so here is a simple regex to benchmark against:
echo preg_replace("/lib_(\w+)/", '$1', $str);
As shown here: http://codepad.org/xGj78r9r
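To produce the link markup the question's Lib() function was aiming for, the captured ID can be reused in the replacement. A minimal sketch, assuming the ID should serve as both the href and the link text as in the question; the name lib_links() is hypothetical.

```php
<?php
// Hypothetical helper: turns every lib_<id> into a link whose target and text are <id>.
function lib_links(string $text): string
{
    // \b keeps "lib_" from matching in the middle of a longer word; $1 is the captured ID.
    return preg_replace('/\blib_(\w+)/', '<a href="$1">$1</a>', $text);
}

echo lib_links('some text lib_123123 goes here lib_111');
// some text <a href="123123">123123</a> goes here <a href="111">111</a>
```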

Ignoring how ridiculous an area for optimization this is, even the simplest implementation with minimal validation takes only about 33% less time than a regex:
<?php
function uselessFunction( $val ) {
if( strpos( $val, "lib_" ) !== 0 ) {
return $val;
}
$str = substr( $val, 4 );
return "{$str}";
}
$l = 100000;
$now = microtime(TRUE);
while( $l-- ) {
preg_replace( '/^lib_(.*)$/', "$1", 'lib_someString' );
}
echo (microtime(TRUE)-$now)."\n";
//0.191093
$l = 100000;
$now = microtime(TRUE);
while( $l-- ) {
uselessFunction( "lib_someString" );
}
echo (microtime(TRUE)-$now);
//0.127598
?>

If you're restricted from using a regex, you're going to have a difficult time searching for a string you describe as "someString", i.e. one that is not precisely known in advance. If you know the string is exactly lib_12345, for example, then set $lclVar to that string. On the other hand, if you don't know the exact string in advance, you'll have to use a regex via preg_replace() or a similar function.

How do I return a part of text with a certain word in the middle?

If this is the input string:
$input = 'In biology (botany), a "fruit" is a part of a flowering
plant that derives from specific tissues of the flower, mainly one or
more ovaries. Taken strictly, this definition excludes many structures
that are "fruits" in the common sense of the term, such as those
produced by non-flowering plants';
And now I want to perform a search on the word tissues and consequently return only a part of the string, defined by where the result is, like this:
$output = '... of a flowering plant that derives from specific tissues of the flower, mainly one or more ovaries ...';
The search term may be in the middle.
How do I perform the aforementioned?

An alternative to my other answer using preg_match:
$word = 'tissues';
$matches = array();
$found = preg_match("/\b(.{0,30}$word.{0,30})\b/i", $string, $matches);
if ($found == 0) {
// string not found
} else {
$output = $matches[1];
}
This may be better as it uses word boundaries.
EDIT: To surround the search term with a tag, you'll need to slightly alter the regex. This should do it:
$word = 'tissues';
$matches = array();
$found = preg_match("/\b(.{0,30})$word(.{0,30})\b/i", $string, $matches);
if ($found == 0) {
// string not found
} else {
$output = $matches[1] . "<strong>$word</strong>" . $matches[2];
}

Use strpos to find the location of the word and substr to extract the quote. For example:
$word = 'tissues';
$pos = strpos($string, $word);
if ($pos === FALSE) {
// string not found
} else {
$start = $pos - 30;
if ($start < 0)
$start = 0;
$output = substr($string, $start, 70);
}
Use stripos for case insensitive search.
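Putting these pieces together, a sketch of a reusable function might look like the following. The name excerpt() and the $radius parameter are assumptions; the function works on bytes (not multibyte characters) and does not snap to word boundaries.

```php
<?php
// Hypothetical helper: returns up to $radius bytes of context on each side of $word.
function excerpt(string $text, string $word, int $radius = 30): ?string
{
    $pos = stripos($text, $word); // case-insensitive search
    if ($pos === false) {
        return null;              // word not found
    }
    $start = max(0, $pos - $radius);
    $end   = min(strlen($text), $pos + strlen($word) + $radius);
    $out   = substr($text, $start, $end - $start);

    // Add an ellipsis only on the sides where text was cut off.
    return ($start > 0 ? '... ' : '') . $out . ($end < strlen($text) ? ' ...' : '');
}

echo excerpt('abcdefghij', 'e', 2); // prints "... cdefg ..."
```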
