Regex to match words starting with hyphen - php

I have a regex that finds every match except one. The PHP code for the word match is:
$string = preg_replace("/\b".$wordToMatch."\b/","<span class='sp_err' style='background-color:yellow;'>".$wordToMatch."</span>",$string);
When $wordToMatch is "-abc" and $string is "The word -abc should match and abc-abc should not match", this regex fails to catch "-abc".
I want to enhance the regex so that it catches "-abc" as a standalone word in $string, but does not match the "-abc" inside "abc-abc".

In case your keywords can have non-word characters on both ends you can rely on lookarounds for a whole word match:
"/(?<!\\w)".$wordToMatch."(?!\\w)/"
Here, the negative lookbehind (?<!\w) makes sure there is no word character before the word to match, and the negative lookahead (?!\w) makes sure there is no word character after it. These subpatterns are unambiguous, whereas the meaning of \b depends on the context: it needs a word character on exactly one side, so it cannot match between a space and the hyphen in " -abc".
With this pattern, -abc is not matched inside abc-abc, and it is matched when it is not enclosed by word characters.
PHP demo:
$wordToMatch = "-abc";
$re = "/(?<!\\w)" . $wordToMatch . "(?!\\w)/";
$str = "abc-abc -abc";
$subst = "!$0!";
$result = preg_replace($re, $subst, $str);
echo $result; // => abc-abc !-abc!
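If the keyword can contain regex metacharacters (a dot, a plus sign, the delimiter itself), it probably should be escaped before it is concatenated into the pattern. The following is a minimal sketch of that idea; the helper name highlight_word and the "!" markers are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: wraps whole-word occurrences of $word in "!" markers.
function highlight_word(string $text, string $word): string
{
    // preg_quote() escapes regex metacharacters in the keyword;
    // the second argument also escapes the "/" delimiter.
    $pattern = '/(?<!\w)' . preg_quote($word, '/') . '(?!\w)/';
    return preg_replace($pattern, '!$0!', $text);
}

echo highlight_word('abc-abc -abc', '-abc'), "\n"; // abc-abc !-abc!
echo highlight_word('a.c abc a.c', 'a.c'), "\n";   // !a.c! abc !a.c!
```

Without preg_quote(), the keyword "a.c" would also match "abc", because the dot would act as a wildcard.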

Related

PHP Regular Expression Exclusion

Here is the sample PHP code:
<?php
$str = '10,000.1 $100,000.1';
$pattern = '/(?!\$)\d+(,\d{3})*\.?\d*/';
$replacement_str = 'Without$sign';
echo preg_replace($pattern, $replacement_str, $str);?>
The goal is to replace plain numbers only (i.e. "$100,000.1" should not be replaced). But the above code replaces both 10,000.1 and $100,000.1. How can I achieve the exclusion?
The assertion in (?!\$)\d+ is always true: the next character has to be a digit for \d+ to match, and a digit can never be a $.
Because the dot and the digits at the end of the pattern are optional, it could also produce a match that ends on a dot, for example 0,000.
Instead you can assert a whitespace boundary to the left, and optionally match a dot followed by 1 or more digits:
(?<!\S)\d+(?:,\d{3})*(?:\.\d+)?\b
Example:
$str = '10,000.1 $100,000.1';
$pattern = '/(?<!\S)\d+(?:,\d{3})*(?:\.\d+)?\b/';
$replacement_str = 'Without$sign';
echo preg_replace($pattern, $replacement_str, $str);
Output (note that the replacement text "Without$sign" reads oddly once it stands in for the number):
Without$sign $100,000.1
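To see why the lookahead does nothing and why a whitespace boundary is preferred over a plain (?<!\$), the three variants can be compared side by side. This is only an illustrative sketch; the short test string '$100' and the variable names are assumptions added here.

```php
<?php
$str = '10,000.1 $100,000.1';

// (?!\$) inspects the digit that \d+ is about to match, so it never rejects anything.
preg_match_all('/(?!\$)\d+/', '$100', $ahead);

// (?<!\$) rejects the digit right after "$", but the engine simply starts one digit later.
preg_match_all('/(?<!\$)\d+/', '$100', $behind);

// (?<!\S) only allows a match at the start of the string or after whitespace.
preg_match_all('/(?<!\S)\d+(?:,\d{3})*(?:\.\d+)?\b/', $str, $bounded);

echo implode(' ', $ahead[0]), "\n";   // 100
echo implode(' ', $behind[0]), "\n";  // 00
echo implode(' ', $bounded[0]), "\n"; // 10,000.1
```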

No solution for me: how can I replace the second occurrence of a find in PHP

I'm searching a paragraph (string) for a certain word, and I want to replace that word with another word, but only at the second occurrence of my find.
Here is what I tried:
$string = 'hello my name is hello';
$output = str_replace('hello', 'Gary', $string);
// desired output
//hello my name is Gary
It is very simple but I can't get it right. Please bear in mind that my string is very long and has all types of characters in it.
With this regex: /^.*?hello\b.*?\Khello/
^ asserts position at the start of the string
.*? lazily matches any characters (except newline), as few as possible
\b asserts position at a word boundary (^\w|\w$|\W\w|\w\W)
\K resets the starting point of the reported match. Any previously consumed characters are no longer included in the final match
Check this demo: https://regex101.com/r/lW2kK1/2
which gives you:
$re = "/^.*?hello\\b.*?\\Khello/";
$str = "hello my name is hello";
$subst = "Gary";
$result = preg_replace($re, $subst, $str);
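Since . does not match a newline unless the s flag is set, the pattern above may need adjusting for long multi-line text. A different technique, swapped in here purely as an illustration, is to count matches in a callback and replace only the n-th one. The function name replace_nth is hypothetical.

```php
<?php
// Hypothetical helper: replaces only the $n-th occurrence of a literal string.
function replace_nth(string $subject, string $search, string $replacement, int $n): string
{
    $count = 0;
    return preg_replace_callback(
        // preg_quote() makes the search term literal, whatever characters it contains.
        '/' . preg_quote($search, '/') . '/',
        function (array $m) use (&$count, $replacement, $n) {
            $count++;
            // Return the replacement for the n-th hit, the original text otherwise.
            return $count === $n ? $replacement : $m[0];
        },
        $subject
    );
}

echo replace_nth('hello my name is hello', 'hello', 'Gary', 2), "\n";
// hello my name is Gary
```

Because the search term is matched literally and no dot is involved, newlines in the subject do not affect the result.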

Matching all of a certain character after a Positive Lookbehind

I have been trying to get the regex right for this all morning long and I have hit the wall. In the following string I want to match every forward slash which follows .com/<first_word>, with the exception of any / after the URL.
$string = "http://example.com/foo/12/jacket Input/Output";
match------------------------^--^
The length of the words between slashes should not matter.
Regex: (?<=.com\/\w)(\/) results:
$string = "http://example.com/foo/12/jacket Input/Output"; // no match
$string = "http://example.com/f/12/jacket Input/Output";
matches--------------------^
Regex: (?<=\/\w)(\/) results:
$string = "http://example.com/foo/20/jacket Input/O/utput"; // misses the /'s in the URL
matches----------------------------------------^
$string = "http://example.com/f/2/jacket Input/O/utput"; // don't want the match between Input/Output
matches--------------------^-^--------------^
Because the lookbehind can have no modifiers and needs to be a zero length assertion I am wondering if I have just tripped down the wrong path and should seek another regex combination.
Is the positive lookbehind the right way to do this? Or am I missing something other than copious amounts of coffee?
NOTE: tagged with PHP because the regex should work in any of the preg_* functions.
If you want to use preg_replace then this regex should work:
$re = '~(?:^.*?\.com/|(?<!^)\G)[^/\h]*\K/~';
$str = "http://example.com/foo/12/jacket Input/Output";
echo preg_replace($re, '|', $str);
//=> http://example.com/foo|12|jacket Input/Output
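This replaces with a | every / that comes after the first / following the initial .com.
The negative lookbehind (?<!^) is needed so that \G cannot match at the start of the string, which avoids replacing slashes in a string that does not start with a .com URL, such as /foo/bar/baz/abcd.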
Use \K here along with \G, then grab the groups.
^.*?\.com\/\w+\K|\G(\/)\w+\K
See demo.
https://regex101.com/r/aT3kG2/6
$re = "/^.*?\\.com\\/\\w+\\K|\\G(\\/)\\w+\\K/m";
$str = "http://example.com/foo/12/jacket Input/Output";
preg_match_all($re, $str, $matches);
Replace
$re = "/^.*?\\.com\\/\\w+\\K|\\G(\\/)\\w+\\K/m";
$str = "http://example.com/foo/12/jacket Input/Output";
$subst = "|";
$result = preg_replace($re, $subst, $str);
Another \G and \K based idea.
$re = '~(?:^\S+\.com/\w|\G(?!^))\w*+\K/~';
The (?: non-capturing group either sets the entry point with ^\S+\.com/\w or glues the next match to the previous one with \G(?!^).
\w*+\K/ possessively matches any number of word characters up to a slash. \K resets the match start, so only the slash is matched.
See demo at regex101
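As a quick check, the pattern can be run against two of the sample strings from the question. This is a small sketch that assumes "|" as the replacement character, as in the other answers.

```php
<?php
$re = '~(?:^\S+\.com/\w|\G(?!^))\w*+\K/~';

$inputs = [
    'http://example.com/foo/12/jacket Input/Output',
    'http://example.com/f/2/jacket Input/O/utput',
];

foreach ($inputs as $str) {
    // Only slashes chained directly after ".com/<word>" are replaced.
    echo preg_replace($re, '|', $str), "\n";
}
// http://example.com/foo|12|jacket Input/Output
// http://example.com/f|2|jacket Input/O/utput
```

The chain stops at the first space, because \G can only continue where the previous match ended.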

PHP capture word that contains special char from string using RegEx

I have special words in a string that I would like to capture based on their prefix.
Example: "Special words such as ^to_this should be caught".
I need the word this because of the special prefix ^to_.
Here is my attempt, but it is not working:
preg_match('/\b(\w*^to_\w*)\b/', $str, $specialWordArr);
This returns an empty array.
Your code would be:
<?php
$mystring = 'Special words such as ^to_this should be caught';
$regex = '~[_^;]\w+[_^;](\w+)~';
if (preg_match($regex, $mystring, $m)) {
    $yourmatch = $m[1];
    echo $yourmatch; //=> this
}
?>
Explanation:
[_^;] Add the special characters to this character class to ensure that the word begins with a special character.
\w+ After the special character, there must be one or more word characters.
[_^;] The word characters must be followed by another special character.
(\w+) If these conditions are satisfied, capture the following one or more word characters into a group.
Without some additional examples this will work for what you've posted:
$str = 'Special words such as ^to_this should be caught';
preg_match('/\s\^to_(\w+)\s/', $str, $specialWordArr);
echo $specialWordArr[1]; //this
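The original attempt fails because an unescaped ^ inside the pattern is a start-of-string anchor, not a literal caret. A hedged sketch that escapes the prefix with preg_quote() and collects every such word follows; the longer sample sentence and the $prefix variable are assumptions added for illustration.

```php
<?php
$str = 'Words such as ^to_this and ^to_that should be caught, but not x^to_other';
$prefix = '^to_';

// preg_quote() turns "^" into a literal caret; (?<!\S) requires the prefix
// to sit at the start of the string or right after whitespace.
$pattern = '/(?<!\S)' . preg_quote($prefix, '/') . '(\w+)/';

preg_match_all($pattern, $str, $matches);
echo implode(', ', $matches[1]), "\n"; // this, that
```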

Replace symbol if it is preceded and followed by a word character

I want to change a specific character only if the characters immediately before and after it are English letters. In other words, the target character is inside a word, not at its start or end.
For example:
$string = "I am learn*ing *PHP today*";
I want this string to be converted as following.
$newString = "I am learn'ing *PHP today*";
$string = "I am learn*ing *PHP today*";
$newString = preg_replace('/(\w)\*(\w)/', '$1\'$2', $string);
// $newString = "I am learn'ing *PHP today*"
This will match an asterisk surrounded by word characters (letters, digits, underscores). If you only want to do alphabet characters you can do:
preg_replace('/([a-zA-Z])\*([a-zA-Z])/', '$1\'$2', 'I am learn*ing *PHP today*');
The most concise way would be to use "word boundary" assertions in your pattern. They represent a zero-width position between a "word" character and a "non-word" character. Since * is a non-word character, the word boundaries require both neighboring characters to be word characters.
No capture groups, no references.
Code:
$string = "I am learn*ing *PHP today*";
echo preg_replace('~\b\*\b~', "'", $string);
Output:
I am learn'ing *PHP today*
To replace the asterisk only between alphabetical characters, use [a-z] as a character range together with the i flag, which makes the regex case-insensitive. Since the character you want to replace is an asterisk, you also need to escape it with a backslash, because an unescaped asterisk means "match zero or more times" in a regular expression.
$newstring = preg_replace('/([a-z])\*([a-z])/i', "$1'$2", $string);
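One caveat with the capture-group form is that each match consumes its neighboring letters, so adjacent asterisks such as those in a*b*c are not all replaced. A lookaround variant, shown here as a sketch with an invented sample string, checks the neighbors without consuming them.

```php
<?php
$input = 'a*b*c *PHP today*';

// Capture groups consume the surrounding letters, so the "b" is used up
// by the first match and the second asterisk is skipped.
$consuming = preg_replace('/([a-z])\*([a-z])/i', '$1\'$2', $input);

// Lookbehind and lookahead only assert the neighbors, leaving them available.
$lookaround = preg_replace('/(?<=[a-z])\*(?=[a-z])/i', "'", $input);

echo $consuming, "\n";  // a'b*c *PHP today*
echo $lookaround, "\n"; // a'b'c *PHP today*
```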
To replace all occurrences of an asterisk surrounded by word characters:
$string = preg_replace('/(\w)\*(\w)/', '$1\'$2', $string);
AND
To replace all occurrences where asterisks are the start and end characters of the word:
$string = preg_replace('/\*(\w+)\*/','\'$1\'', $string);
