PHP regex help needed with /*<##> </##>*/

I am struggling with a regex and cannot get it to work.
I have already tried the following, based on an SO question and an online tool:
$text = preg_replace("%/\*<##>(?:(?!\*/).)</##>*\*/%s", "new", $text);
But nothing works.
My input string is:
$input = "something /*<##>old or something else</##>*/ something other";
and the expected result is:
something /*<##>new</##>*/ something other

I see two issues here: you have no capturing groups to put the delimiter markers back in your replacement, and your negative lookahead is missing a repetition operator, so (?:(?!\*/).) matches only a single character.
$text = preg_replace('%(/\*<##>)(?:(?!\*/).)*(</##>\*/)%s', '$1new$2', $text);
Alternatively, you can replace the lookahead construct with a lazy .*?, since you are using the s (dotall) modifier.
$text = preg_replace('%(/\*<##>).*?(</##>\*/)%s', '$1new$2', $text);
Or consider using \K (match reset) together with a lookahead to do this without capturing groups.
$text = preg_replace('%/\*<##>\K.*?(?=</##>\*/)%s', 'new', $text);
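As a quick check, here is a minimal sketch that runs the capturing-group variant and the \K variant against the input from the question. The variable names $withGroups and $withReset are only illustrative.

```php
<?php
$input = "something /*<##>old or something else</##>*/ something other";

// Variant 1: capture both markers and re-insert them around the new text.
$withGroups = preg_replace('%(/\*<##>).*?(</##>\*/)%s', '$1new$2', $input);

// Variant 2: \K drops the opening marker from the reported match, and the
// lookahead checks for the closing marker without consuming it.
$withReset = preg_replace('%/\*<##>\K.*?(?=</##>\*/)%s', 'new', $input);

echo $withGroups, PHP_EOL; // something /*<##>new</##>*/ something other
echo $withReset, PHP_EOL;  // something /*<##>new</##>*/ something other
```

Both calls leave the markers in place and replace only the text between them.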

Tested:
$input = "something /*<##>old or something else</##>*/ something other";
echo preg_replace('%(/\*<##>)(.*)(</##>\*/)%', '$1new$3', $input);

Related

Find next word after colon in regex

I am getting a result back from a Laravel console command like
Some text as: 'Nerad'
Now I tried
$regex = '/(?<=\bSome text as:\s)(?:[\w-]+)/is';
preg_match_all( $regex, $d, $matches );
but it returns an empty result.
My guess is that something is wrong with the single quotes, so I need to change the regex.
Any ideas?
Note that you get no match because the ' before Nerad is not matched, nor checked with the lookbehind.
If you need to check the context, but avoid including it into the match, in PHP regex, it can be done with a \K match reset operator:
$regex = '/\bSome text as:\s*\'\K[\w-]+/i';
The output array structure will be cleaner than when using a capturing group, and \K also lets you check context of unknown width (lookbehind patterns must be fixed-width in PHP PCRE regex):
$re = '/\bSome text as:\s*\'\K[\w-]+/i';
$str = "Some text as: 'Nerad'";
if (preg_match($re, $str, $match)) {
echo $match[0];
} // => Nerad
Alternatively, work from the end of the string and capture the word in a group. Group 1 will hold the required string.
/:\s*'(\w+)'$/
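To make the difference concrete, here is a small sketch comparing the pattern from the question with the end-anchored capture. It assumes the command output is exactly the sample line; $original, $anchored and $word are illustrative names.

```php
<?php
$str = "Some text as: 'Nerad'";

// Pattern from the question: the lookbehind ends right before the opening
// quote, and ' is not in [\w-], so nothing matches.
$original = preg_match_all('/(?<=\bSome text as:\s)(?:[\w-]+)/is', $str, $all);

// End-anchored pattern: the quotes are matched explicitly and only the
// word between them is captured in group 1.
$anchored = preg_match('/:\s*\'(\w+)\'$/', $str, $m);
$word = $anchored ? $m[1] : null;

echo $original, PHP_EOL; // 0
echo $word, PHP_EOL;     // Nerad
```

Note that (\w+) does not accept hyphens; use ([\w-]+) if the value may contain them.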

PHP exploding url from text, possible?

I need to extract the YouTube URL from this line:
[embed]https://www.youtube.com/watch?v=L3HQMbQAWRc[/embed]
Is it possible? I need to delete [embed] and [/embed].
preg_match is what you need.
<?php
$str = "[embed]https://www.youtube.com/watch?v=L3HQMbQAWRc[/embed]";
preg_match("/\[embed\](.*)\[\/embed\]/", $str, $matches);
echo $matches[1]; //https://www.youtube.com/watch?v=L3HQMbQAWRc
$string = '[embed]https://www.youtube.com/watch?v=L3HQMbQAWRc[/embed]';
$string = str_replace(['[embed]', '[/embed]'], '', $string);
See str_replace
Why not use str_replace? :) Quick and easy.
http://php.net/manual/de/function.str-replace.php
Just for good measure, you can also use positive lookbehinds and lookaheads in your regular expressions:
(?<=\[embed\])(.*)(?=\[\/embed\])
You'd use it like this:
$string = "[embed]https://www.youtube.com/watch?v=L3HQMbQAWRc[/embed]";
$pattern = '/(?<=\[embed\])(.*)(?=\[\/embed\])/';
preg_match($pattern, $string, $matches);
echo $matches[1];
Here is an explanation of the regex:
(?<=\[embed\]) is a Positive Lookbehind - it asserts that the match is immediately preceded by [embed], without including [embed] in the match.
(.*) is a Capturing Group - . matches any character (except a newline) with the Quantifier: * which matches between zero and unlimited times, as many times as possible. This is what is matched between the two lookarounds. These are the droids you're looking for.
(?=\[\/embed\]) is a Positive Lookahead - it asserts that the match is immediately followed by [/embed], without including [/embed] in the match.
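A short sketch of the lookaround behaviour described above. Because lookarounds consume nothing, the full match and group 1 are identical. The second part is an assumption beyond the question: with several [embed] tags on one line (the "first" and "second" strings are made up), a lazy .*? is needed so each tag is matched separately.

```php
<?php
$string = "[embed]https://www.youtube.com/watch?v=L3HQMbQAWRc[/embed]";
$pattern = '/(?<=\[embed\])(.*)(?=\[\/embed\])/';

$url = null;
if (preg_match($pattern, $string, $matches)) {
    // The tags are only asserted, never consumed, so $matches[0]
    // (full match) and $matches[1] (group 1) hold the same URL.
    $url = $matches[0];
}
echo $url, PHP_EOL; // https://www.youtube.com/watch?v=L3HQMbQAWRc

// Hypothetical input with two tags: the lazy quantifier stops at the
// first closing tag instead of running on to the last one.
$many = "[embed]first[/embed] and [embed]second[/embed]";
preg_match_all('/(?<=\[embed\]).*?(?=\[\/embed\])/', $many, $found);
print_r($found[0]); // first, second
```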

Issue with regular expression for identifying encrypted ids from string

I want to convert certain patterns into links, and it works fine as far as normal user ids are concerned. But now I want to do the same for encrypted ids as well.
Below is my code (it works):
$text = "hi how are you guys???... ##[Sam Thomas:10181] ##[Jack Daniel:11074] ##[Paul Walker:11043] ";
$pattern = "/##\[([^:]*):(\d*)\]/";
$matches = array();
preg_match_all($pattern, $text, $matches);
$output = preg_replace($pattern, "$1", $text);
Now I need to link text like:
"hi how are you guys???... ##[Sam Thomas:ZGNjAmD9ac3K] ##[Jack Daniel:ZGNjAmD9ac3K] ##[Paul Walker:ZGNjAmD9ac3K] ";
But the encrypted id is not matched by the above regular expression.
##\[([^:]*):(.*?)\]
^^
Try this. Just change \d* to .*? to accept anything, or to \w* to accept only letters, digits, and underscores. [^\]]* or [0-9a-zA-Z]* would work as well. See the demo:
https://regex101.com/r/vD5iH9/52
Change your regex to accept numbers and letters as well.
Something like this -
##\[([^:]*):([0-9a-zA-Z]*)\]
^^^^^^^^^^^ Replaced \d
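A minimal sketch of the alphanumeric pattern applied to a shortened version of the sample text, mixing one numeric and one encrypted id. PREG_SET_ORDER is used only to make the loop simpler, and $pairs is an illustrative name.

```php
<?php
$text = "hi ##[Sam Thomas:10181] ##[Jack Daniel:ZGNjAmD9ac3K]";
$pattern = '/##\[([^:]*):([0-9a-zA-Z]*)\]/';

// Group 1 is the display name, group 2 the numeric or encrypted id.
preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $m) {
    $pairs[$m[1]] = $m[2];
    echo $m[1], ' => ', $m[2], PHP_EOL;
}
// Sam Thomas => 10181
// Jack Daniel => ZGNjAmD9ac3K
```

If the encrypted ids can contain characters outside 0-9, a-z and A-Z (an assumption worth checking against the encoder), [^\]]* is the safer character class.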

preg_replace() seems to remove entire word instead of part of it

I'm trying to match a certain word and replace part of the word with certain text but leave the rest of the word intact. It is my understanding that adding parentheses to part of the regex pattern means that the pattern match within the parentheses gets replaced when you use preg_replace().
For testing purposes I used:
$text = 'batman';
echo $new_text = preg_replace('#(bat)man#', 'aqua', $text);
I only want 'bat' to be replaced by 'aqua' to get 'aquaman'. Instead, $new_text echoes 'aqua', leaving out the 'man' part.
preg_replace replaces the entire string matched by the regular expression, not just the part in parentheses:
$text = 'batman';
echo $new_text = preg_replace('#bat(man)#', 'aqua\\1', $text);
Capture man instead and append it to your aqua prefix
Another way of doing that is to use assertions:
$text = 'batman';
echo $new_text = preg_replace('#bat(?=man)#', 'aqua', $text);
I would not use preg_* functions for this and would just use str_replace():
echo str_replace('batman', 'aquaman', $text);
This is simpler as a regex is not really needed in this case. Otherwise it would be with a regular expression:
echo $new_text = preg_replace('#bat(man)#', 'aqua\\1', $text);
This will put your man back in after aqua when replacing the entire search phrase. preg_replace replaces the entire matching portion of the pattern.
The way you're trying to do it, it would be more like:
preg_replace('#bat(man)#', 'aqua$1', $text);
I'd use a positive lookahead:
preg_replace('/bat(?=man)/', 'aqua', $text)
Demo here: http://ideone.com/G9F4q
The parentheses create a capturing group, which means you can access the part matched by this group using \1.
You can either do what zerkms suggested or use a lookahead, which only checks what follows without including it in the match.
$text = 'batman';
echo $new_text = preg_replace('#bat(?=man)#', 'aqua', $text);
This will match "bat" but only if it is followed by "man", and only "bat" is replaced.
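The three behaviours discussed in this thread can be compared side by side in a short sketch. The 'batmobile' input is an extra illustration that is not part of the question.

```php
<?php
$text = 'batman';

// The whole match "batman" is replaced; the group is captured but unused.
$whole = preg_replace('#(bat)man#', 'aqua', $text);

// "man" is captured and written back through $1.
$captured = preg_replace('#bat(man)#', 'aqua$1', $text);

// The lookahead only checks for "man", so just "bat" is replaced.
$lookahead = preg_replace('#bat(?=man)#', 'aqua', $text);

// No "man" after "bat": the lookahead fails and nothing changes.
$untouched = preg_replace('#bat(?=man)#', 'aqua', 'batmobile');

echo $whole, PHP_EOL;     // aqua
echo $captured, PHP_EOL;  // aquaman
echo $lookahead, PHP_EOL; // aquaman
echo $untouched, PHP_EOL; // batmobile
```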

how could I combine these regex rules?

I'm detecting #replies in a Twitter stream with the following PHP code using regexes.
$text = preg_replace('!^#([A-Za-z0-9_]+)!', '#$1', $text);
$text = preg_replace('! #([A-Za-z0-9_]+)!', ' #$1', $text);
How can I best combine these two rules without false flagging email#domain.com as a reply?
OK, on second thought: not flagging whatever#email means that the character before the # has to be a non-word character, because a # preceded by a word character could be part of an email address. That leads to:
!(^|\W)#([A-Za-z0-9_]+)!
but then you have to use $2 instead of $1.
Since the ^ does not have to stand at the beginning of the RE, you can use grouping and | to combine those REs.
If you don't want to re-insert the whitespace you captured, you have to use a positive lookbehind:
$text = preg_replace('/(?<=^|\s)#(\w+)/',
'#$1', $text);
or "negative lookbehind":
$text = preg_replace('/(?<!\S)#(\w+)/',
'#$1', $text);
...whichever you find easier to understand.
Here's how I'd do the combination
$text = preg_replace('!(^| )#([A-Za-z0-9_]+)!', '$1#$2', $text);
$text = preg_replace('/(^|\W)#(\w+)/', '$1#$2', $text);
preg_replace('%(?<!\S)#([A-Za-z0-9_]+)%', '#$1', $text);
(?<!\S) loosely translates to "no preceding non-whitespace character". It is a sort of double negation, but it also works at the start of the string/line.
This won't consume any preceding character, won't need an extra capturing group for it, and won't match strings such as "foo-#host.com", which is a valid e-mail address.
Tested:
Input = 'foo bar baz-#qux.com bee #def goo#doo #woo'
Output = 'foo bar baz-#qux.com bee #def goo#doo #woo'
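To make the effect visible, the sketch below wraps each detected reply in square brackets. That bracket replacement is purely an assumption for illustration, as is the leading '#abc' added to the test input. It also runs the (^|\W) variant to show where the two approaches differ.

```php
<?php
$input = '#abc foo bar baz-#qux.com bee #def goo#doo #woo';

// Negative lookbehind: the # must not be preceded by a non-whitespace
// character, so it matches at the start or after whitespace only.
$strict = preg_replace('/(?<!\S)#(\w+)/', '[#$1]', $input);

// (^|\W) variant: any non-word character may precede the #, so the
// "-" in "baz-#qux.com" also qualifies.
$loose = preg_replace('/(^|\W)#(\w+)/', '$1[#$2]', $input);

echo $strict, PHP_EOL; // [#abc] foo bar baz-#qux.com bee [#def] goo#doo [#woo]
echo $loose, PHP_EOL;  // [#abc] foo bar baz-[#qux].com bee [#def] goo#doo [#woo]
```

Neither variant touches goo#doo, because the character before the # is a word character.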
Hey, guys, don't push it too far... Here it is:
!^\s*#([A-Za-z0-9_]+)!
I think you can use alternation, so look for the beginning of the string or a whitespace character:
'!(?:^|\s)#([A-Za-z0-9_]+)!'
