"preg_replace()" not working properly. - php

I have a text file from which I extracted all the domain addresses starting with http://
Now I want to replace every http://. in my matches array with "", but nothing happens and I'm not even getting an error.
$list = file_get_contents( 'file.txt' );
preg_match_all( "/http:\/\/.([a-z]{1,24}).([a-z^0-9-]{1,23}).([a-z]{1,3})/", $list, $matches );
for ($i=0; $i>=50; $i++) {
$pattern = array();
$replacement = array();
$pattern[0][$i] = "/http:\/\/.[w-w]{1,3}/";
$replacement[0][$i] = '';
preg_replace( $pattern[0][$i], $replacement[0][$i], $matches[0][$i] );
}
print_r($matches);

Your loop never runs because 0 >= 50 yields false. That said, what you're looking for is a map operation:
$matches = array_map(function($match) {
return preg_replace('~^http://w{1,3}~', '', $match);
}, $matches[0]);
print_r($matches);

The preg_match_all call also has a problem: an unescaped period in a regular expression matches any character, so it has to be escaped to match a literal dot.
$list = file_get_contents( 'file.txt' );
preg_match_all( "/http:\/\/([a-z]{1,24})\.([a-z^0-9-]{1,23})\.([a-z]{1,3})/", $list, $matches );
$pattern = "/http:\/\/(.[w-w]{1,3})/";
$replacement = '$1';
$matches[0] = preg_replace( $pattern, $replacement, $matches[0] );
print_r($matches);
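To make the effect of the escaped period visible, here is a minimal sketch. The sample text and the variable names $loose, $strict and $hosts are made up for illustration (the original file.txt is not shown), and the character class is simplified to [a-z0-9-].

```php
<?php
// Hypothetical sample input standing in for the contents of file.txt.
$list = "see http://www.example.com and http://wwwXexampleYcom for details";

// Unescaped periods match any character, so the bogus second address matches too.
preg_match_all('~http://[a-z]{1,24}.[a-z0-9-]{1,23}.[a-z]{1,3}~', $list, $loose);

// Escaped periods only match a literal ".".
preg_match_all('~http://[a-z]{1,24}\.[a-z0-9-]{1,23}\.[a-z]{1,3}~', $list, $strict);

// preg_replace() accepts an array subject, so no loop is needed
// to strip the leading "http://" (and an optional "www.") from every match.
$hosts = preg_replace('~^http://(?:www\.)?~', '', $strict[0]);

print_r($loose[0]);
print_r($strict[0]);
print_r($hosts);
```

Using ~ as the delimiter avoids having to escape every slash in http://.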

Related

Warnings in Regular Expression with Posix Collating Elements

I am trying to run the following regular-expression-based function in PHP, which returns the modified output at the end.
function vg_excerpt_more( $output ) {
$string = $output;
$pattern_auto_excerpt = '#([...]</p>)$#';
$pattern_manual_excerpt = '#(</p>)$#';
$replacement = ' [Continue...]</p>';
if ( preg_match( $pattern_auto_excerpt, $string ) ) {
$pattern = $pattern_auto_excerpt;
} else if ( preg_match( $pattern_manual_excerpt, $string ) ) {
$pattern = $pattern_manual_excerpt;
}
$output = preg_replace( $pattern, $replacement, $string );
return $output;
}
add_filter( 'the_excerpt', 'vg_excerpt_more' );
add_filter( 'excerpt_more', 'vg_excerpt_more' );
The string can end in either [...]</p> or </p>, so I have to check both cases.
The problem is that it throws these warnings:
WARNING: PREG_MATCH(): COMPILATION FAILED: POSIX COLLATING ELEMENTS
ARE NOT SUPPORTED AT OFFSET 1 in - 'preg_match( $pattern_auto_excerpt,
$string )'
and
WARNING: PREG_REPLACE(): EMPTY REGULAR EXPRESSION in - '$output =
preg_replace( $pattern, $replacement, $string );'
EDIT:
After useful replies by #user1852180 I moved ahead and did this -
function vg_excerpt_more( $output ) {
$string = $output;
$pattern = '';
// $pattern_auto_excerpt = '#(\[...\]</p>)$#';
$pattern_auto_excerpt = '#(\[(?:\.|…)+\])#';
$pattern_manual_excerpt = '#(</p>)$#';
$replacement = ' [Continue...]</p>';
if ( preg_match( $pattern_auto_excerpt, $string ) ) {
$pattern = '#(\[(?:\.|…)+\]</p>)$#';
if ( preg_match( $pattern, $string ) ) {
return preg_replace( $pattern, $replacement, $string ) . "Dummy2";
}
} else if ( preg_match( $pattern_manual_excerpt, $string ) ) {
$pattern = $pattern_manual_excerpt;
return preg_replace( $pattern, $replacement, $string ) . "Dummy";
}
return $output;
}
add_filter( 'the_excerpt', 'vg_excerpt_more' );
add_filter( 'excerpt_more', 'vg_excerpt_more' );
But I am still seeing [...] in the frontend along with the replacement.
PS. It also never prints 'Dummy2', always 'Dummy'.
You need to escape the brackets in the first pattern, and the dot:
$pattern_auto_excerpt = '#(\[(?:\.|…)+\]</p>)$#';
You don't need the if/else to check whether it has [...]; let the regex handle that by making the group optional with a question mark:
function vg_excerpt_more( $output ) {
$pattern = '#(?:\[(?:\.|…)+\])?</p>$#';
$replacement = ' [Continue...]</p>';
return preg_replace( $pattern, $replacement, $output );
}
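As background for the two warnings: an unescaped [...] is not read as literal text. PCRE sees [. followed by .] and treats it as POSIX collating-element syntax, which it does not support. Because neither preg_match() call succeeds, $pattern is never assigned, which is what produces the "empty regular expression" warning. The sketch below runs the optional-group pattern on both kinds of excerpt; the function name vg_excerpt_more_demo and the sample strings are made up for illustration.

```php
<?php
// Same pattern as in the answer: an optional "[...]" or "[…]" group,
// followed by a closing </p> at the end of the string.
function vg_excerpt_more_demo( $output ) {
    $pattern     = '#(?:\[(?:\.|…)+\])?</p>$#';
    $replacement = ' [Continue...]</p>';
    return preg_replace( $pattern, $replacement, $output );
}

// Hypothetical excerpts: two automatic ones and one manual one.
$auto_dots     = vg_excerpt_more_demo( '<p>Some text [...]</p>' );
$auto_ellipsis = vg_excerpt_more_demo( '<p>Some text […]</p>' );
$manual        = vg_excerpt_more_demo( '<p>Some text</p>' );

echo $auto_dots, "\n", $auto_ellipsis, "\n", $manual, "\n";
```

In all three cases the excerpt should end in " [Continue...]</p>" with no leftover bracketed dots.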

PHP Get the word replaced by preg_replace

How can I get the word that was replaced by the preg_replace() function?
preg_replace('/[#]+([A-Za-z0-9-_]+)/', '$0', $post );
I want to get the $1 variable so that I can use it further.
Capture it before you replace the expression:
// This is where the match will be kept
$matches = array();
$pattern = '/[#]+([A-Za-z0-9-_]+)/';
// Check if there are matches and capture the user (first group)
if (preg_match($pattern, $post, $matches)) {
// First match is the user
$user = $matches[1];
// Do the replace
$post = preg_replace($pattern, '$0', $post );
}
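Note that preg_match() stops at the first match. If a post can contain several tagged names, a variation with preg_match_all() collects all of them. This is only a sketch: the sample post and the bracket-wrapping replacement are made up for illustration.

```php
<?php
$pattern = '/[#]+([A-Za-z0-9-_]+)/';
$post    = 'hello #alice and #bob';   // hypothetical sample post

// Collect the first capture group of every match, not just the first one.
preg_match_all($pattern, $post, $matches);
$users = $matches[1];

// Then do the replacement; here each mention is simply wrapped in brackets.
$post = preg_replace($pattern, '[$0]', $post);

print_r($users);
echo $post, "\n";
```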
This isn't possible with preg_replace() as it returns the finished string/array, but does not preserve the replaced phrases. You can use preg_replace_callback() to manually achieve this.
$pattern = '/[#]+([A-Za-z0-9-_]+)/';
$subject = '#jurgemaister foo #hynner';
$tokens = array();
$result = preg_replace_callback(
$pattern,
function($matches) use(&$tokens) {
$tokens[] = $matches[1];
return ''.$matches[0].'';
},
$subject
);
echo $result;
// #jurgemaister foo #hynner
print_r($tokens);
// Array
// (
// [0] => jurgemaister
// [1] => hynner
// )
You should use preg_match in addition to preg_replace. preg_replace is just for replacing.
$regex = '/[#]+([A-Za-z0-9-_]+)/';
preg_match($regex, $post, $matches);
preg_replace($regex, '$0', $post );
You can't do that with preg_replace, but you can do it with preg_replace_callback:
preg_replace_callback($regex, function($matches){
notify_user($matches[1]);
return "<a href='/$matches[1]' target='_blank'>$matches[0]</a>";
}, $post);
Replace notify_user with whatever you would call to notify the user.
This can also be modified to check whether the user exists and replace only valid mentions.

PHP: How to find text NOT between particular tags?

Example input string: "[A][B][C]test1[/B][/C][/A] [A][B]test2[/B][/A] test3"
I need to find out what parts of text are NOT between the A, B and C tags. So, for example, in the above string it's 'test2' and 'test3'. 'test2' doesn't have the C tag and 'test3' doesn't have any tag at all.
It can also be nested like this:
Example input string2: "[A][B][C]test1[/B][/C][/A] [A][B]test2[C]test4[/C][/B][/A] test3"
In this example "test4" was added but "test4" has the A,B and C tag so the output wouldn't change.
Anyone got an idea how I could parse this?
This solution is not clean but it does the trick
$string = "[A][B][C]test1[/B][/C][/A] [A][B]test2[/B][/A] test3" ;
$string = preg_replace('/<A[^>]*>([\s\S]*?)<\/A[^>]*>/', '', strtr($string, array("["=>"<","]"=>">")));
$string = trim($string);
var_dump($string);
Output
string 'test3' (length=5)
Given that every one of your tags sits inside [A][/A], you can explode the string on [/A] and look for the piece that does not contain an [A] tag, like so:
$string = "[A][B][C]test1[/B][/C][/A] [A][B]test2[/B][/A] test3";
$found = ''; // this will be equal to test3
$boom = explode('[/A]', $string);
foreach ($boom as $val) {
// The piece without an opening [A] tag is the untagged text
if (strpos($val, '[A]') === false) { $found = trim($val); break; }
}
echo $found; // test3
Try the code below:
$str = 'test0[A]test1[B][C]test2[/B][/C][/A] [A][B]test3[/B][/A] test4';
$matches = array();
// Find and remove the unneeded strings
$pattern = '/(\[A\]|\[B\]|\[C\])[^\[]*(\[A\]|\[B\]|\[C\])[^\[]*(\[A\]|\[B\]|\[C\])([^\[]*)(\[\/A\]|\[\/B\]|\[\/C\])[^\[]*(\[\/A\]|\[\/B\]|\[\/C\])[^\[]*(\[\/A\]|\[\/B\]|\[\/C\])/';
preg_match_all( $pattern, $str, $matches );
$stripped_str = $str;
foreach ($matches[0] as $key=>$matched_pattern) {
$matched_pattern_str = str_replace($matches[4][$key], '', $matched_pattern); // matched pattern with text between A,B,C tags removed
$stripped_str = str_replace($matched_pattern, $matched_pattern_str, $stripped_str); // replace pattern string in text with stripped pattern string
}
// Get required strings
$pattern = '/(\[A\]|\[B\]|\[C\]|\[\/A\]|\[\/B\]|\[\/C\])([^\[]+)(\[A\]|\[B\]|\[C\]|\[\/A\]|\[\/B\]|\[\/C\])/';
preg_match_all( $pattern, $stripped_str, $matches );
$required_strings = array();
foreach ($matches[2] as $match) {
if (trim($match) != '') {
$required_strings[] = $match;
}
}
// Special case, possible string on start and end
$pattern = '/^([^\[]*)(\[A\]|\[B\]|\[C\]).*(\[\/A\]|\[\/B\]|\[\/C\])([^\[]*)$/';
preg_match( $pattern, $stripped_str, $matches );
if (trim($matches[1]) != '') {
$required_strings[] = $matches[1];
}
if (trim($matches[4]) != '') {
$required_strings[] = $matches[4];
}
print_r($required_strings);
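An alternative to the long patterns above is to split the string into tags and text, then keep a counter per tag. The following is a sketch under the assumption that "not between the A, B and C tags" means "not enclosed by all three at the same time"; the function name text_outside_all_tags is made up for illustration. Counters are used instead of a stack because the sample input closes its tags out of order.

```php
<?php
// Return the text fragments that are NOT enclosed by all of the given tags at once.
function text_outside_all_tags(string $input, array $tags = ['A', 'B', 'C']): array
{
    $names = implode('|', array_map('preg_quote', $tags));

    // Split on opening/closing tags, keeping the tags themselves as tokens.
    $tokens = preg_split(
        '~(\[/?(?:' . $names . ')\])~',
        $input,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $open  = array_fill_keys($tags, 0);  // how many times each tag is currently open
    $found = [];

    foreach ($tokens as $token) {
        if (preg_match('~^\[(/?)(' . $names . ')\]$~', $token, $m)) {
            // A tag: raise or lower its counter (never below zero).
            $open[$m[2]] = max(0, $open[$m[2]] + ($m[1] === '/' ? -1 : 1));
            continue;
        }
        // Plain text: keep it unless every tag is open right now.
        $insideAll = !in_array(0, $open, true);
        if (!$insideAll && trim($token) !== '') {
            $found[] = trim($token);
        }
    }
    return $found;
}

$first  = text_outside_all_tags('[A][B][C]test1[/B][/C][/A] [A][B]test2[/B][/A] test3');
$second = text_outside_all_tags('[A][B][C]test1[/B][/C][/A] [A][B]test2[C]test4[/C][/B][/A] test3');

print_r($first);
print_r($second);
```

Both sample strings from the question should yield test2 and test3, since test4 sits inside all three tags.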

how to do a preg_replace on a string in php?

I have some simple code that does a preg match:
$bad_words = array('dic', 'tit', 'fuc',); //for this example i replaced the bad words
for($i = 0; $i < sizeof($bad_words); $i++)
{
if(preg_match("/$bad_words[$i]/", $str, $matches))
{
$rep = str_pad('', strlen($bad_words[$i]), '*');
$str = str_replace($bad_words[$i], $rep, $str);
}
}
echo $str;
So, if $str was "dic" the result will be '***', and so on.
Now there is a small problem if $str is "f.u.c". The solution might be to use:
$pattern = '~f(.*)u(.*)c(.*)~i';
$replacement = '***';
$foo = preg_replace($pattern, $replacement, $str);
In this case I will get *** every time. My issue is putting all this code together.
I've tried:
$pattern = '~f(.*)u(.*)c(.*)~i';
$replacement = 'fuc';
$fuc = preg_replace($pattern, $replacement, $str);
$bad_words = array('dic', 'tit', $fuc,);
for($i = 0; $i < sizeof($bad_words); $i++)
{
if(preg_match("/$bad_words[$i]/", $str, $matches))
{
$rep = str_pad('', strlen($bad_words[$i]), '*');
$str = str_replace($bad_words[$i], $rep, $str);
}
}
echo $str;
The idea is that $fuc becomes "fuc", which I then place in the array so that the loop can do its job, but this doesn't seem to work.
First of all, you can do all of the bad word replacements with one (dynamically generated) regex, like this:
$bad_words = array('dic', 'tit', 'fuc',);
$str = preg_replace_callback("/\b(?:" . implode( '|', $bad_words) . ")\b/",
function( $match) {
return str_repeat( '*', strlen( $match[0]));
}, $str);
Now, you have the problem of people adding periods in between the word, which you can search for with another regex and replace them as well. However, you must keep in mind that . matches any character in a regex, and must be escaped (with preg_quote() or a backslash).
$bad_words = array_map( function( $el) {
return implode( '\.', str_split( $el));
}, $bad_words);
This will create a $bad_words array similar to:
array(
'd\.i\.c',
't\.i\.t',
'f\.u\.c'
)
Now, you can use this new $bad_words array just like the above one to replace these obfuscated ones.
Hint: You can make this array_map() call "better" in the sense that it can be smarter to catch more obfuscations. For example, if you wanted to catch a bad word separated with either a period or a whitespace character or a comma, you can do:
$bad_words = array_map( function( $el) {
return implode( '(?:\.|\s|,)', str_split( $el));
}, $bad_words);
Now if you make that obfuscation group optional, you'll catch a lot more bad words:
$bad_words = array_map( function( $el) {
return implode( '(?:\.|\s|,)?', str_split( $el));
}, $bad_words);
Now, bad words should match:
f.u.c
f,u.c
f u c
fu c
f.uc
And many more.
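Putting the pieces together, a minimal end-to-end sketch could look like this. The function name mask_bad_words, the placeholder words and the added i flag are assumptions for illustration. Because of the \b anchors, only whole words are masked, so a listed word that is merely the start of a longer word will not match.

```php
<?php
// Mask every listed word, also when its letters are separated by ".", "," or whitespace.
function mask_bad_words(string $str, array $bad_words): string
{
    // Build one sub-pattern per word: quoted letters joined by an optional separator.
    $parts = array_map(function ($word) {
        $letters = array_map(function ($ch) {
            return preg_quote($ch, '/');
        }, str_split($word));
        return implode('(?:\.|\s|,)?', $letters);
    }, $bad_words);

    $regex = '/\b(?:' . implode('|', $parts) . ')\b/i';

    // Replace each match with as many asterisks as the match is long.
    return preg_replace_callback($regex, function ($match) {
        return str_repeat('*', strlen($match[0]));
    }, $str);
}

// Placeholder words stand in for the real list.
$masked = mask_bad_words('a foo, a f.o.o and a f o o', ['foo', 'bar']);
echo $masked, "\n";
```

The separators are masked as well, because the callback pads to the length of the whole match.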

PHP Regex to match a list of words against a string

I have a list of words in an array. I need to look for matches on a string for any of those words.
Example word list
company
executive
files
resource
Example string
Executives are running the company
Here's the function I've written but it's not working
$matches = array();
$pattern = "/^(";
foreach( $word_list as $word )
{
$pattern .= preg_quote( $word ) . '|';
}
$pattern = substr( $pattern, 0, -1 ); // removes last |
$pattern .= ")/";
$num_found = preg_match_all( $pattern, $string, $matches );
echo $num_found;
Output
0
$regex = '(' . implode('|', $words) . ')';
<?php
$words_list = array('company', 'executive', 'files', 'resource');
$string = 'Executives are running the company';
foreach ($words_list as &$word) $word = preg_quote($word, '/');
$num_found = preg_match_all('/('.join('|', $words_list).')/i', $string, $matches);
echo $num_found; // 2
Make sure you add the 'm' flag to make the ^ match the beginning of a line:
$expression = '/foo/m';
Or remove the ^ if you don't mean to match the beginning of a line...
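For the sample string in the question, two things get in the way: the ^ anchor and case sensitivity ("Executives" starts with a capital E). The sketch below compares three variants; the variable names are made up for illustration. Since the sample string is a single line, the m flag on its own should not change the count here.

```php
<?php
$word_list = ['company', 'executive', 'files', 'resource'];
$string    = 'Executives are running the company';

// Quote each word and join them into one alternation.
$quoted = array_map(function ($w) {
    return preg_quote($w, '/');
}, $word_list);
$alternation = implode('|', $quoted);

// Original shape: anchored with ^ and case-sensitive.
$anchored = preg_match_all('/^(' . $alternation . ')/', $string, $m1);

// Anchor kept, "m" flag added.
$multiline = preg_match_all('/^(' . $alternation . ')/m', $string, $m2);

// No anchor, case-insensitive.
$fixed = preg_match_all('/(' . $alternation . ')/i', $string, $m3);

echo $anchored, ' ', $multiline, ' ', $fixed, "\n";
```

Only the last variant finds both "Executive" and "company".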
