preg match, replace url to bbcode - php

I wrote these preg_replace rules:
$subject = 'text LINK text text LINK2';
$search = array(
'/\<a href\="(.*)\">(.*)\<\/a\>/i'
);
$replace = array(
"[a href=\"$1\"]$2[/a]"
);
echo preg_replace($search, $replace, $subject);
When the text contains only one link, everything works fine; with more than one link, the code breaks.
This is what I get when there is more than one link:
"text [a href="http://google.com">LINK text text "

Change the pattern to '/\<a href\="(.*?)\">(.*?)\<\/a\>/i' to make the matching non-greedy.
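To see the difference, here is a minimal sketch that runs both the greedy and the non-greedy pattern on a subject with two links. The second URL (http://example.com) is a placeholder invented for the example; the question only shows http://google.com.

```php
<?php
// Subject with two links; the second URL is a placeholder, not taken from the question.
$subject = 'text <a href="http://google.com">LINK</a> text text <a href="http://example.com">LINK2</a>';
$replace = '[a href="$1"]$2[/a]';

// Greedy: (.*) runs to the last "> in the string, so both links collapse into one match.
$greedyResult = preg_replace('/<a href="(.*)">(.*)<\/a>/i', $replace, $subject);

// Lazy: (.*?) stops at the first "> and the first </a>, so each link is matched on its own.
$lazyResult = preg_replace('/<a href="(.*?)">(.*?)<\/a>/i', $replace, $subject);

echo $greedyResult, "\n";
echo $lazyResult, "\n";
```

With the greedy pattern only one [/a] appears in the output, which is the broken result described in the question; with the lazy pattern both links are converted.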

Here's a better regex - it deals with extra fields in the tags:
\<a (?:.*?)href\=[\"\']([^\"\']+?)[\"\'][^\>]*?\>(.+?)\<\/a\>
I think I've escaped all of the special characters in there, I'm not sure what PHP considers 'special', but basically this should match all of the following:
$subject = 'text <a id="test" href="http://google.com">LINK</a> text text LINK2 text LINK3';
In Perl you need the /g modifier on the end of the regex to match more than one link. PHP has no /g modifier (preg_replace rejects it as an unknown modifier) and replaces every match by default, so your search would be:
$search = array(
'/\<a (?:.*?)href\=[\"\']([^\"\']+?)[\"\'][^\>]*?\>(.+?)\<\/a\>/i'
);
If you only want to replace one instance in your target text, pass a limit as the fourth argument to preg_replace.
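For reference, preg_replace takes an optional fourth argument, $limit, which defaults to -1 (no limit), so all matches are replaced unless you say otherwise. The sketch below uses the pattern from this answer without the backslashes PHP does not need; the second link and its class attribute are made up for the example.

```php
<?php
// Sample subject; the second link and its class attribute are invented for illustration.
$subject = 'text <a id="test" href="http://google.com">LINK</a> text <a href="http://example.com" class="x">LINK2</a>';
$pattern = '/<a (?:.*?)href=["\']([^"\']+?)["\'][^>]*?>(.+?)<\/a>/i';
$replace = '[a href="$1"]$2[/a]';

// Default limit (-1): every match is replaced.
$all = preg_replace($pattern, $replace, $subject);

// Limit of 1: only the first match is replaced.
$first = preg_replace($pattern, $replace, $subject, 1);

echo $all, "\n";
echo $first, "\n";
```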

Related

string replace two matches that have an exact (partly) match within one string

I have a variable $text containing plain text that can include one or more email addresses. I use a regular expression to find these addresses and then turn them into clickable <a href="mailto:..."> links. This is my code, with an example that works fine:
$text = "this is the text that has a email#email.com in it and also test#email.com.";
if (preg_match_all('/[\p{L}0-9_.-]+#[0-9\p{L}.-]+\.[a-z.]{2,6}\b/u', $text, $mails)) {
    foreach ($mails[0] as $mail) {
        $text = str_replace($mail, '<a href="mailto:'.$mail.'">'.$mail.'</a>', $text);
    }
}
Problems occur when $text contains two email addresses where one is an exact partial match of the other, for example sometest#email.com and test#email.com. The string replacement also happens inside the longer address, because the shorter address is a full match there as well. How can I avoid this?
Why not use preg_replace?
str_replace can overwrite previous matches.
This should be good for you:
echo preg_replace(
'/([\p{L}0-9_.-]+#[0-9\p{L}.-]+\.[a-z.]{2,6}\b)/u',
'<a href="mailto:$1">$1</a>',
$text
);
Notice that I had to slightly modify the regular expression and wrap it in parentheses.
This is so that I can reference it in the replacement.
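A short sketch of why the single preg_replace call avoids the problem. The sample text is shortened from the question, and the mailto link format follows the question's description; counting the opening <a tags is just a quick way to compare the two results.

```php
<?php
$text = 'mail sometest#email.com and test#email.com.';
$pattern = '/([\p{L}0-9_.-]+#[0-9\p{L}.-]+\.[a-z.]{2,6}\b)/u';

// Loop with str_replace: the second pass also rewrites "test#email.com"
// inside the link already built for "sometest#email.com".
$looped = $text;
preg_match_all($pattern, $text, $mails);
foreach ($mails[0] as $mail) {
    $looped = str_replace($mail, '<a href="mailto:' . $mail . '">' . $mail . '</a>', $looped);
}

// Single preg_replace pass: each address is matched once, in the original text.
$single = preg_replace($pattern, '<a href="mailto:$1">$1</a>', $text);

echo substr_count($looped, '<a '), ' vs ', substr_count($single, '<a '), "\n";
```

The looped version ends up with nested, broken anchors, while the single pass produces exactly one link per address.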
You have to capture the character before your match to be sure it's a full match:
if(preg_match_all('/(.)([\p{L}0-9_.-]+#[0-9\p{L}.-]+\.[a-z.]{2,6}\b)/u',$text,$mails))
--------------------^^^
Then you just have to adjust your str_replace parameters a bit:
var_dump($mails);
$id = 0;
foreach ($mails[2] as $mail) {
    $text = str_replace($mails[1][$id].$mail, $mails[1][$id].'<a href="mailto:'.$mail.'">'.$mail.'</a>', $text);
    $id++;
}
For example : https://3v4l.org/qYpHo
Like so:
<?php
$string = "this is the text that has a email#email.com in it and also test#email.com.";
$search = array(
    "!(\s)([_\.0-9a-z-]+#([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})!i",
    "!^([_\.0-9a-z-]+#([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})!i"
);
$replace = array(
    "\\1<a href=\"mailto:\\2\">\\2</a>",
    "<a href=\"mailto:\\1\">\\1</a>"
);
echo preg_replace($search, $replace, $string);
?>
Result:
this is the text that has a <a href="mailto:email#email.com">email#email.com</a> in it and also <a href="mailto:test#email.com">test#email.com</a>.

php find # in string with regex

I have a PHP variable in which I need to show #value tokens as links.
The code looks like this.
$reg_exUrl = "/\#::(.*?)/";
// The Text you want to filter for urls
$text = "This is a #simple text from which we have to perform #regex operation";
// Check if there is a url in the text
if(preg_match($reg_exUrl, $text, $url)) {
    // make the urls hyper links
    echo preg_replace($reg_exUrl, ''.$url[0].'', $text);
} else {
    // if no urls in the text just return the text
    echo "IN Else #$".$text;
}
With \w you can match a word made of alphanumeric characters and underscores. Change your expression to this:
$reg_exUrl = "/#(.*?)\w+/";
$reg_exUrl = "/\#::(.*?)/";
This doesn't match for the following reasons:
1. There is no need to escape #, because it is not a special character.
2. Since you want to match just # followed by a word, there is no need for ::.
3. (.*?) is lazy because of the ? quantifier, so it matches as few characters as possible. At the end of a pattern that is the empty string, so it never captures the word you need.
If you still want to keep your pattern, you can modify it to
$reg_exUrl = "/#(.*?)\w+/";
But a more efficient one that still works is
$reg_exUrl = "/#\w+/";
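The third point is easy to check. The following sketch compares a trailing lazy group with \w+ on the text from the question; the :: is left out of the first pattern so that it can match at all.

```php
<?php
$text = 'This is a #simple text from which we have to perform #regex operation';

// A trailing lazy group is satisfied by zero characters, so it captures nothing.
preg_match('/#(.*?)/', $text, $lazy);

// \w+ must consume at least one word character, so the whole tag is matched.
preg_match_all('/#\w+/', $text, $tags);

var_dump($lazy[1]);   // the empty string
print_r($tags[0]);    // the two tags, '#simple' and '#regex'
```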
It's not clear to me exactly what you need to match. If you want to replace a # followed by any word characters:
$text = "This is a #simple text from which we have to perform #regex operation";
$reg_exUrl = "/#(\w+)/";
echo preg_replace($reg_exUrl, '$1', $text);
//Output:
//This is a simple text from which we have to perform regex operation
In the replacement, $0 refers to the whole matched text and $1 to the first group.
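As a sketch of how both references can be combined to build a link: the /tag/ URL scheme below is purely hypothetical, since the question does not say where the links should point.

```php
<?php
$text = 'This is a #simple text from which we have to perform #regex operation';

// $1 (the word without '#') goes into the hypothetical URL;
// $0 (the full match, including '#') is used as the link text.
$linked = preg_replace('/#(\w+)/', '<a href="/tag/$1">$0</a>', $text);

echo $linked, "\n";
```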

Make user name bolded in text in PHP

$text = 'Hello #demo here!';
$pattern = '/#(.*?)[ ]/';
$replacement = '<strong>${1}</strong> ';
echo preg_replace($pattern, $replacement, $text);
This works; I get HTML like this: Hello <strong>demo</strong> here!. But it does not work when #demo is at the end of the string, for example $text = 'Hello #demo';. How can I change my pattern so that it returns the same output whether or not the match is at the end of the string?
Question 2:
What if the string is like $text = 'Hello #demo!';? The ! should not be bolded. The match should stop at a space, the end of the string, or any non-word character.
Sorry for my bad English; I hope you understand what I need.
In order to select a word beginning with the # symbol, this regex will work:
$pattern = "/#(\w+)\b/";
`\w` is a shorthand character class for `[a-zA-Z0-9_]`. `\b` is an anchor for the beginning or end of a word, in this case the end. So the regex is saying: select something starting with a '#' followed by one or more word characters until the end of the word is reached.
Reference: http://www.regular-expressions.info/tutorial.
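A quick check of this pattern against the three strings from the question. Because the pattern no longer consumes the trailing space, the replacement here is written without the trailing space used in the question.

```php
<?php
$pattern = '/#(\w+)\b/';
$replacement = '<strong>$1</strong>';

// The three inputs mentioned in the question.
$samples = ['Hello #demo here!', 'Hello #demo', 'Hello #demo!'];

$results = [];
foreach ($samples as $sample) {
    $results[] = preg_replace($pattern, $replacement, $sample);
}

print_r($results);
```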
You could use a word boundary, that's what they're for:
$pattern = '/#(.+?)\b/';
This will work for question 2 also
You can add an option to match the end of the string:
#(.*?)(?= |\p{P}?$)
Replace with <strong>$1</strong>.
You can also use \p{P} (any Unicode punctuation symbol) to prevent punctuation from bold formatting.
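A sketch of the lookahead variant in PHP. The / delimiters and the u modifier are my additions to make it a complete PHP pattern. The last sample is made up to show a limitation: punctuation that is followed by more text still ends up inside the bold tags, because the lookahead only skips punctuation directly before the end of the string.

```php
<?php
// The lookahead requires a space, or optional punctuation followed by the end of the string.
$pattern = '/#(.*?)(?= |\p{P}?$)/u';
$replacement = '<strong>$1</strong>';

$samples = ['Hello #demo here!', 'Hello #demo', 'Hello #demo!', 'Hello #demo, ok'];

$results = [];
foreach ($samples as $sample) {
    $results[] = preg_replace($pattern, $replacement, $sample);
}

print_r($results);
```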

preg_replace() pattern to remove brackets and content in php

I want to remove the brackets and their content using preg_replace(), but I am unable to use a lazy (non-greedy) quantifier in the pattern, since the closing bracket is the last character. The text between the brackets has a random length and can contain numbers, underscores, and hyphens.
Code:
$array = array(
"Text i want to keep (txt to remove)",
"Random txt (some more random txt)",
"Keep this (remove)",
"I like bananas (txt)"
);
$pattern = "#pattern#";
$new_outputs = ''; // initialise before appending with .=
foreach($array as $new_txt){
    $new_outputs .= preg_replace($pattern, '', $new_txt)."\n";
}
echo $new_outputs;
Wanted output:
Text i want to keep
Random txt
Keep this
I like bananas
I do not use regular expressions much and couldn't find anything to solve my problem.
The following regular expression should do it:
$pattern = '#\(.*?\)#';
.*? is a non-greedy match of anything.
$new_outputs .= preg_replace('#\([^\)]*\)$#','',$new_txt);
This might help you:
$pattern = "/\([^)]*\)+/";
foreach($array as $new_txt){
    $new_outputs .= preg_replace($pattern, '', $new_txt)."\n";
}
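The patterns above remove the brackets but leave the space in front of them. If the wanted output should have no trailing space, one possible variation (my addition, not part of the answers) is to let the pattern consume the preceding whitespace as well:

```php
<?php
$array = array(
    "Text i want to keep (txt to remove)",
    "Random txt (some more random txt)",
    "Keep this (remove)",
    "I like bananas (txt)"
);

// \s* also removes the whitespace before the opening bracket;
// [^)]* matches everything up to the first closing bracket.
$pattern = '/\s*\([^)]*\)/';

$new_outputs = '';
foreach ($array as $new_txt) {
    $new_outputs .= preg_replace($pattern, '', $new_txt) . "\n";
}

echo $new_outputs;
```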

How can I change this regexp to match all keywords that are not inside an anchor

I'm trying to replace keywords inside some text with an anchor that opens a details window for said keyword. This is the code I use for the replacement:
$pattern = '%\b('.$keyword['Keyword'].')\b(?![^<]*</a>)%i';
$replacement = '<strong>\\1</strong>';
$text = preg_replace($pattern, $replacement, $text);
It's built to avoid words that are inside an anchor, so that already replaced multi-word keywords are left alone. So I don't replace "deviz" in the already replaced "detalii deviz". The exception works in every case except when the word I'm looking for is not the first word in the anchor. So, for example, it will NOT replace "deviz" in <a>deviz detalii</a> or just <a>deviz</a>, but WILL replace it in <a>detalii deviz</a>.
How should I change the pattern so that the regular expression avoids matching any word that is inside an anchor?
$text = 'deviz <a>deviz deviz deviz</a> deviz';
$pattern = '%\bdeviz\b(?![^<]*</a>)%i';
$text = preg_replace($pattern, 'replaced', $text);
echo $text;
// 'replaced <a>deviz deviz deviz</a> replaced'
Your regex seems to work fine - what's the problem?
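For completeness, here is a sketch of the same lookahead with the keyword taken from a variable, as in the question. The $keyword array content and the sample text are made up, and the preg_quote() call is my addition: it escapes any regex metacharacters in the keyword, including the % delimiter.

```php
<?php
// Hypothetical keyword record and sample text, for illustration only.
$keyword = ['Keyword' => 'deviz'];
$text = 'deviz <a>detalii deviz</a> deviz';

// (?![^<]*</a>) rejects a match if a closing </a> follows before any other tag starts,
// which means the keyword sits inside an anchor.
$pattern = '%\b(' . preg_quote($keyword['Keyword'], '%') . ')\b(?![^<]*</a>)%i';

$result = preg_replace($pattern, '<strong>$1</strong>', $text);

echo $result, "\n";
```

Note that the lookahead assumes the anchor contains no nested tags after the keyword; with markup such as <a>deviz <b>x</b></a> the keyword would still be replaced.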
