preg_replace up to and including a dot - PHP

I have a line in which I need to replace part of the text. This is the line:
/myhome/ishere/where/are.you.ha
Here is the regex I have:
preg_match('/(.+\/)[^.]+(.+\.ha)/', $input_line, $output_array);
It gives me this result:
/myhome/ishere/where/.you.ha
But I need a result like this:
/myhome/ishere/where/you.ha
Can anyone please help me remove this dot?

You could write the pattern as this, which will give you 2 capture groups that you can use:
(.+\/)[^.]+\.([^.]+\.ha)$
Explanation
(.+\/) Capture group 1, match 1+ chars and then the last /
[^.]+ Match 1+ non dots
\. Match a dot
([^.]+\.ha) Capture group 2, match 1+ non dots, then .ha
$ End of string
If you use $1$2 in the replacement:
$pattern = "/(.+\/)[^.]+\.([^.]+\.ha)$/";
$s = "/myhome/ishere/where/are.you.ha";
echo preg_replace($pattern, "$1$2", $s);
Output
/myhome/ishere/where/you.ha
Alternatively, use preg_match and read the 2 capture groups from the match array.
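A minimal sketch of the preg_match variant, assuming the same sample path as above. The variable names $m and $result are illustrative only. It rebuilds the string by concatenating the two capture groups:

```php
<?php
// Same pattern as above; the \. between the groups is matched but not captured.
$pattern = '/(.+\/)[^.]+\.([^.]+\.ha)$/';
$s = '/myhome/ishere/where/are.you.ha';

$result = $s; // fall back to the input if nothing matches
if (preg_match($pattern, $s, $m)) {
    // $m[1] is the directory part, $m[2] is the last name plus ".ha"
    $result = $m[1] . $m[2];
}
echo $result, PHP_EOL; // /myhome/ishere/where/you.ha
```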

You can use
/^(.+\/)[^.]*\.(.*\.ha)$/
See the regex demo. Details:
^ - start of a string
(.+\/) - Group 1: any one or more chars other than line break chars as many as possible and then /
[^.]* - zero or more chars other than a .
\. - a . char
(.*\.ha) - Group 2: any zero or more chars other than line break chars as many as possible and then .ha
$ - end of a string
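As a rough sketch, this anchored pattern can be dropped into preg_replace the same way. The second input string below is a made-up example, added only to show what [^.]* (zero or more) changes compared with [^.]+:

```php
<?php
$pattern = '/^(.+\/)[^.]*\.(.*\.ha)$/';

// The sample line from the question
$a = preg_replace($pattern, '$1$2', '/myhome/ishere/where/are.you.ha');
echo $a, PHP_EOL; // /myhome/ishere/where/you.ha

// Hypothetical input with nothing before the first dot:
// [^.]* may match zero characters, so the dot is still removed
$b = preg_replace($pattern, '$1$2', '/myhome/ishere/where/.you.ha');
echo $b, PHP_EOL; // /myhome/ishere/where/you.ha
```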

I found the answer: my mistake was leaving out the \. between the two groups.
(.+/)[^.]+\.(.+\.ha)
preg_replace('/(.+\/)[^.]+\.(.+\.ha)/', '$1$2', $input_lines);
This is how it works.

Related

Regex - Do not capture word if it contains another word

I'm currently fighting with a regex to achieve something and can't succeed on my own ...
Here my string: path/to/folder/#here
What I want is something like this: a:path/a:to/a:folder/#here
So my regex is /([^\/]+)/g, but the problem is that the result of preg_replace('/([^\/]+)/', 'a:$0', $str) will be: a:path/a:to/a:folder/a:#here ...
How can I skip the replacement if the captured group contains #?
I tried this one, without any success: /((?!#)[^\/]+)/g
Another option could be to match what you want to avoid, and use a SKIP FAIL approach.
/#[^/#]*(*SKIP)(*F)|[^/]+
/# Match literally
[^/#]*(*SKIP)(*F) Optionally match any char except / and # and then skip this match
| Or
[^/]+ Match 1+ occurrences of any char except /
See a regex demo and a PHP demo.
For example
$str = 'path/to/folder/#here';
echo preg_replace('~/#[^/#]*(*SKIP)(*F)|[^/]+~', 'a:$0', $str); // ~ as delimiter, because the pattern itself contains #
Output
a:path/a:to/a:folder/#here
You can use
(?<![^\/])[^\/#][^\/]*
See the regex demo. Details:
(?<![^\/]) - a negative lookbehind that requires either start of string or a / char to appear immediately to the left of the current location
[^\/#] - a char other than / and #
[^\/]* - zero or more chars other than /.
See the PHP demo:
$text = 'path/to/folder/#here';
echo preg_replace('~(?<![^/])[^/#][^/]*~', 'a:$0', $text);
// => a:path/a:to/a:folder/#here
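To see why the lookahead attempt from the question does not work, here is a small side-by-side sketch. The variable names $attempt and $fixed are illustrative; the patterns are the ones quoted above, written with ~ delimiters:

```php
<?php
$str = 'path/to/folder/#here';

// The attempt from the question: the lookahead only rejects a match that
// STARTS at "#", so the engine moves one character on and matches "here".
$attempt = preg_replace('~((?!#)[^/]+)~', 'a:$0', $str);
echo $attempt, PHP_EOL; // a:path/a:to/a:folder/#a:here

// The lookbehind version only lets a match begin at the start of the
// string or right after "/", so "here" (preceded by "#") is never tried.
$fixed = preg_replace('~(?<![^/])[^/#][^/]*~', 'a:$0', $str);
echo $fixed, PHP_EOL; // a:path/a:to/a:folder/#here
```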

Why doesn't a regex with lookaheads match?

I need (in PHP) to split a sentence by a word that cannot be the first or the last one in the sentence. Say the word is "pression" and here is my regex
/^.+?(?=[\s\.\,\:\;])pression(?=[\s\.\,\:\;]).+$/i
Live here: https://regex101.com/r/CHAhKj/1/
First, it doesn't match.
Next, I wonder whether it is possible at all to split that way. I tried a simplified example
print_r(preg_split('/^.+pizza.+$/', 'my pizza is cool'));
live here http://sandbox.onlinephpfunctions.com/code/10b674900fc1ef44ec79bfaf80e83fe1f4248d02
and it prints an array of 2 empty strings, when I expect
['my ', ' is cool']
I need (in PHP) to split a sentence by the word that cannot be the first or the last one in the sentence
You may use this regex:
(?<=[^\s.?]\h)pression(?=\h[^\s.?])
RegEx Details:
(?<=[^\s.?]\h): Lookbehind to assert that immediately before the current position we have a character that is not a whitespace, not a dot and not a ?, followed by a horizontal whitespace.
pression: Match the word pression
(?=\h[^\s.?]): Lookahead to assert that immediately after the current position we have a horizontal whitespace followed by a character that is not a whitespace, not a dot and not a ?
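A short sketch of how this lookaround pattern could be used with preg_split. The three sample sentences are invented for illustration; only the pattern comes from the answer above:

```php
<?php
$re = '/(?<=[^\s.?]\h)pression(?=\h[^\s.?])/';

// Word in the middle: the sentence is split around it
$middle = preg_split($re, 'There is some pression in the pipe.');
print_r($middle); // two parts: "There is some " and " in the pipe."

// Word first or last: no match, so the input comes back as one element
$first = preg_split($re, 'pression is high today');
$last  = preg_split($re, 'We measured the pression.');
echo count($first), ' ', count($last), PHP_EOL; // 1 1
```

Because the lookarounds do not consume the surrounding spaces, the spaces stay in the resulting parts, which matches the ['my ', ' is cool'] shape the question asks for.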
First, ^.+?(?=[\s\.\,\:\;])pression(?=[\s\.\,\:\;]).+$ can't match any string at all because the (?=[\s\.\,\:\;])p part requires p to be also either a whitespace char, or a ., ,, : or ;, which invalidates the whole match at once.
Second, the ^.+pizza.+$ pattern does not ensure that the matched pizza is not the first or last word in a sentence, because . matches whitespace, too. It also does not return anything meaningful from preg_split: preg_split uses each match as the delimiter, and since this pattern matches the entire string, the only chunks left are the two empty strings before and after the match.
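The preg_split behaviour described here is easy to reproduce. This is only a sketch using the sample sentence from the question; $whole and $word are illustrative names:

```php
<?php
$text = 'my pizza is cool';

// The pattern matches the whole string, so the "delimiter" is everything
// and only the empty strings on either side of it remain.
$whole = preg_split('/^.+pizza.+$/', $text);
var_dump($whole); // array with two empty strings

// Matching only the word itself leaves the surrounding text as the parts.
$word = preg_split('/pizza/', $text);
var_dump($word); // "my " and " is cool"
```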
That said, all you need is:
preg_match('~^(.*?\w\W+)pression(\W+\w.*)$~is', $text, $m)
See the regex demo. Details:
^ - start of string
(.*?\w\W+) - Capturing group 1: any zero or more chars, as few as possible, then a word char and then one or more non-word chars
pression - a word
(\W+\w.*) - Capturing group 2: one or more non-word chars, a word char, and then any zero or more chars as many as possible
$ - end of string.
s makes the . match across lines and i flag makes the pattern match in a case insensitive way.
See the PHP demo:
$text = "You can use any regular expression pression inside the lookahead ";
if (preg_match('~^(.*?\w\W+)pression(\W+\w.*)$~is', $text, $m)) {
echo $m[1] . " << | >> " . $m[2];
}
// => You can use any regular expression << | >> inside the lookahead

php Regex file path | second matching after specific character

I am trying to extract file paths, without the drive letter, from inside text. For example, from AvastSecureBrowserElevationService; C:\Program Files (x86)\AVAST Software\Browser\Application\elevation_service.exe [X] I want to extract :\Program Files (x86)\AVAST Software\Browser\Application\elevation_service.exe.
My current regex looks like this, but it stops at any space, and file names can contain spaces.
(?<=:\\)([^ ]*)
The solution I came up with is to match up to the first space character after a dot, because there is very little chance of a directory name containing a dot followed by a space, and I will always do a quick manual check. But I do not know how to write this as a regex.
You may use this regex for this purpose:
(?<=[a-zA-Z]):[^.]+\.\S+
RegEx Details:
(?<=[a-zA-Z]): Lookbehind to assert we have an English letter before :
:: Match literal :
[^.]+: Match 1+ non-dot characters
\.: Match literal .
\S+: Match 1+ non-whitespace characters
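A minimal sketch of this pattern in PHP, assuming the sample line from the question. Note that inside a single-quoted PHP string the single backslashes of the Windows path stay literal:

```php
<?php
$str = 'AvastSecureBrowserElevationService; C:\Program Files (x86)\AVAST Software\Browser\Application\elevation_service.exe [X]';

$path = null;
if (preg_match('/(?<=[a-zA-Z]):[^.]+\.\S+/', $str, $m)) {
    $path = $m[0]; // the drive letter is only asserted, not part of the match
}
echo $path, PHP_EOL;
// :\Program Files (x86)\AVAST Software\Browser\Application\elevation_service.exe
```

Since [^.]+ stops at the first dot, a directory name that contains a dot would cut the match short, which is the limitation the question already accepts.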
Here we would match the entire string, capture the part we wish to output, and use preg_replace to keep only that group:
.+C(:\\.+\..+?)\s.+
Test
$re = '/.+C(:\\\\.+\..+?)\s.+/m'; // four backslashes in the PHP string give \\ in the regex, i.e. one literal backslash
$str = 'AvastSecureBrowserElevationService; C:\\Program Files (x86)\\AVAST Software\\Browser\\Application\\elevation_service.exe [X]';
$subst = '$1';
$result = preg_replace($re, $subst, $str);
echo $result;
You can use the following regex:
[A-Z]\K:.+\.\w+
It will match any capital letter followed by :, then any string of characters ending with ., followed by at least one word character.
\K removes from the match what comes before it.
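A hedged sketch of the \K variant, again on the sample line from the question ($path is an illustrative name):

```php
<?php
$str = 'AvastSecureBrowserElevationService; C:\Program Files (x86)\AVAST Software\Browser\Application\elevation_service.exe [X]';

$path = null;
// [A-Z] must be present, but \K drops it from the reported match
if (preg_match('/[A-Z]\K:.+\.\w+/', $str, $m)) {
    $path = $m[0];
}
echo $path, PHP_EOL;
// :\Program Files (x86)\AVAST Software\Browser\Application\elevation_service.exe
```

Here .+ is greedy, so the match runs to the last dot on the line that is followed by a word character. That tolerates dots in directory names, but it would over-match if the trailing text contained something like another ".word".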

Get only alphanumeric part of regex string

Let's say I can have strings like these:
^(www.|)mysite1.com$
^(.*)mysite2.com(.*)$
^(www\.|)mysite3\.com$
How do I get only the mysite1, mysite2 or mysite3 part of such strings? I tried replacing the non-alphanumeric characters with an empty string using:
preg_replace("/[^A-Za-z0-9]/", '', $mystring);
But that returns me
mysite1com
mysite2com
mysite3com
Thanks in advance.
What you might do is use preg_match instead of preg_replace and use for example this regex:
\^\([^)]+\)\K[A-Za-z0-9]+
That would match
\^ # Match ^
\( # Match (
[^)]+ # Match not ) one or more times
\) # Match )
\K # Reset the starting point of the reported match
[A-Za-z0-9]+ # Match one or more upper/lowercase character or digit
For example:
preg_match("/\^\([^)]+\)\K[A-Za-z0-9]+/", "^(www.|)mysite1.com$", $matches);
echo $matches[0];
With preg_replace an approach could be to use 3 capturing groups where the value you want to keep is in the second group.
In the replacement, you would use $2:
(\^\([^)]+\))([A-Za-z0-9]+)(.*)
preg_replace("/(\^\([^)]+\))([A-Za-z0-9]+)(.*)/", '$2', $mystring);
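As a quick check, the \K pattern can be run over all three strings from the question. This is only a sketch; the array names $inputs and $names are made up for the example:

```php
<?php
// The three regex strings from the question, kept literal in single quotes
$inputs = [
    '^(www.|)mysite1.com$',
    '^(.*)mysite2.com(.*)$',
    '^(www\.|)mysite3\.com$',
];

$names = [];
foreach ($inputs as $input) {
    // Skip the leading "^( ... )" part, then report only the alphanumeric run
    if (preg_match('/\^\([^)]+\)\K[A-Za-z0-9]+/', $input, $m)) {
        $names[] = $m[0];
    }
}
print_r($names); // mysite1, mysite2, mysite3
```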

Regex to get only characters without space inside special tags

I have 2 texts in a string:
%Juan%
%Juan Gonzalez%
I want to get only %Juan% and not the one with the space. I have been trying several regexes without luck. I currently use:
/%(.*)%/U
but it matches both. I tried adding and playing with [^\s], but it doesn't work.
Any help please?
The issue is that . matches any character but a newline. The /U ungreedy mode only makes .* lazy and it captures a text from the % up to the first % to the right of the first %.
If your strings contain one pair of %...%, you may use
/%(\S+)%/
See the regex demo
The \S+ pattern matches 1+ characters other than whitespace.
If you have multiple %...% pairs, you may use
/%([^\h%]+)%/
See another regex demo. Here, [^\h%] is a negated character class that matches any character but a horizontal whitespace (\h) and the % symbol.
PHP demo:
$re = '/%([^\h%]+)%/';
$str = "%Juan%\n%Juan Gonzalez%";
preg_match_all($re, $str, $matches);
print_r($matches[1]);
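The difference between the two patterns shows up when two %...% pairs sit next to each other. A small sketch, using an invented input string (the name Ana is not from the question):

```php
<?php
$str = '%Juan%%Ana% %Juan Gonzalez%';

// \S also matches "%", so the greedy \S+ runs across the adjacent pair
preg_match_all('/%(\S+)%/', $str, $loose);
print_r($loose[1]); // one item: "Juan%%Ana"

// [^\h%] cannot cross a "%", so each pair is matched on its own
preg_match_all('/%([^\h%]+)%/', $str, $strict);
print_r($strict[1]); // two items: "Juan" and "Ana"
```

In both cases %Juan Gonzalez% is skipped, because neither \S nor [^\h%] can match the space.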
