regex issue with finding " a-Z , a-Z " - php

I am trying to find the comma after Oliver (inside the quoted part, before roy,Roy) in the following string:
"Username,Hello Person Oli,"Oliver,Who" roy,Roy"
I want to replace the comma in this part: Oliver,Who
I am using the following regex:
'/^"(?:[a-zA-Z]+)(?:,+)(?:[a-zA-Z]+)"$/'
However, it is not working with preg_replace.
This is my code:
$pregData = preg_replace('/^"(?:[a-zA-Z]+)(?:,+)(?:[a-zA-Z]+)"$/',';',$csv);
Any ideas why?
Sorry for the poor first message.

To take into account other characters than just a-z, you could use:
/"([^,]+?),(.+?)"/
" = quote
[^,]+?, = anything that's not a comma until the first comma
.+?" = anything else until the next quote
Note that the string on the left and on the right of the comma are captured by (...) constructs. That means that, if the expression matches, then the string on the left will be assigned to \1, while the string on the right will be assigned to \2. Therefore, if you want to replace this with something like left; right, you could use:
preg_replace('/"([^,]+?),(.+?)"/', '\1; \2', $csv)
If you just want to keep the left and right parts without the comma in the middle, you can simply replace the expression with the left part, followed by a space, followed by the right part.
preg_replace('/"([^,]+?),(.+?)"/', '\1 \2', $csv)
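As a quick check, here is a minimal sketch of both replacements. It assumes the outer quotes in the question are only string delimiters, so the sample value of $csv below is an assumption. Note that the quotes around the field are part of the match and are therefore dropped from the result.

```php
<?php
// Assumed sample input: the question's string without the outer quotes.
$csv = 'Username,Hello Person Oli,"Oliver,Who" roy,Roy';

// \1 is the text left of the comma, \2 the text right of it.
$withSemicolon = preg_replace('/"([^,]+?),(.+?)"/', '\1; \2', $csv);
$withSpace     = preg_replace('/"([^,]+?),(.+?)"/', '\1 \2', $csv);

echo $withSemicolon, PHP_EOL; // Username,Hello Person Oli,Oliver; Who roy,Roy
echo $withSpace, PHP_EOL;     // Username,Hello Person Oli,Oliver Who roy,Roy
```

If the quotes should stay, they can be added back literally in the replacement string.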

Your regex only matches when the entire string is a quoted pair of words: a-z at the beginning, a comma, and a-z again, with nothing before or after. That doesn't match your string, so you should remove the start-of-string (^) and end-of-string ($) anchors.
/"(?:[a-zA-Z]+)(?:,+)(?:[a-zA-Z]+)"/

If you want to replace only the comma, then in addition to removing the two wrong anchors you need to make the outer groups capturing:
'/"([a-zA-Z]+)(?:,+)([a-zA-Z]+)"/'
That way you can reuse the captured words around the comma using $1 and $2.
See it here on Regexr
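To see the difference the anchors make, here is a small sketch. The sample string assumes the outer quotes in the question are not part of the data, and the replacement '"$1;$2"' (keep the quotes, swap the comma for a semicolon) is just one possible choice.

```php
<?php
// Assumed sample input: the question's string without the outer quotes.
$csv = 'Username,Hello Person Oli,"Oliver,Who" roy,Roy';

// Original pattern: ^ and $ force the whole string to be one quoted pair,
// so nothing matches and the input comes back unchanged.
$anchored = preg_replace('/^"(?:[a-zA-Z]+)(?:,+)(?:[a-zA-Z]+)"$/', ';', $csv);

// Without anchors and with capturing groups, only the comma is swapped.
$fixed = preg_replace('/"([a-zA-Z]+)(?:,+)([a-zA-Z]+)"/', '"$1;$2"', $csv);

var_dump($anchored === $csv); // bool(true)
echo $fixed, PHP_EOL;         // Username,Hello Person Oli,"Oliver;Who" roy,Roy
```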

Related

Regex to get string between single or double quotes even if it's empty

Below is the REGEX which I am trying:
/((?<![\\\\])['"])((?:.(?!(?<![\\\\])\\1))*.?)\\1/
This is the text I am testing against:
val1=""val2>"2022-11-16 10:19:20"
I need the empty value (as for val1) to be matched as well,
i.e. I need something like the following in the matches:
""
2022-11-16 10:19:20
If I change the text to the following, I get the proper output:
val2>"2022-11-16 10:19:20"val1=""
Can anyone please let me know where I am going wrong?

Use alternatives to match the two cases.
One alternative matches the pair of quotes, the other uses lookarounds to match the inside of two quotes.
""|(?<=")[^"]+(?=")

In your pattern, the part (?:.(?!(?<![\\])\1))* first matches any character and only then asserts that what is to the right is not the group 1 value without a preceding escape \
So in the string ""val2>" your pattern matches the first " with the character class ['"], and then the . matches the second " before any assertion is made. From the position after that, it is true that what is to the right is not the group 1 value without a preceding \, so matching continues, and that is why the match is ""val2>" instead of ""
If the second example string does give you the proper output, you could swap the order: do the assertion first and then match the dot in the repeating part of the pattern, and omit the optional trailing .?
Note that the backslash does not have to be in square brackets.
(?<!\\)(['"])((?:(?!(?<!\\)\1).)*+)\1
See the updated regex101 demo.
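A minimal sketch of running that pattern in PHP against the first sample text. The ~ delimiters and the nowdoc string are my own choices here (a nowdoc avoids having to double the backslashes); the variable names are illustrative.

```php
<?php
// Nowdoc: the pattern is taken literally, with no PHP string escaping applied.
$pattern = <<<'RE'
~(?<!\\)(['"])((?:(?!(?<!\\)\1).)*+)\1~
RE;

$text = 'val1=""val2>"2022-11-16 10:19:20"';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches including quotes,
// $matches[2] holds the text between the quotes (empty string for val1).
var_dump($matches[0]);
var_dump($matches[2]);
```

Here group 2 should come back as an empty string followed by 2022-11-16 10:19:20, because the assertion now runs before each character is consumed.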

RegEx expression to hit only words with a-z and no umlauts

Can you help me out with this one? I have a list of words like this:
sachbearbeiter/-in
referent/-in
anlagenführer/-in
it-projektleiter/-in
I want to select only:
sachbearbeiter/-in
referent/-in
This is my current regex: ([a-z]+)/-(in)
The problem is that it matches all of them, even the ones with - and with ü.
Thank you in advance.

You can use anchors to match the word you want:
^([a-z]+)/-(in)$
^---- Here ----^
Working demo
Update: following your comment, if you want to accept umlauts as well, you can use the unicode flag with \w like this:
^(\w+)/-(in)$
Working demo
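The question does not name a language, so as an illustration only, here is how the two patterns could be tried in PHP with preg_grep. The ~ delimiter is my choice (it means / needs no escaping), and the second pattern relies on PHP's u modifier making \w Unicode-aware, which is an assumption worth checking on your setup. Keep in mind that \w also accepts digits and underscores.

```php
<?php
$words = [
    'sachbearbeiter/-in',
    'referent/-in',
    'anlagenführer/-in',
    'it-projektleiter/-in',
];

// Anchored a-z pattern: the whole entry must be letters, then "/-in".
$asciiOnly = array_values(preg_grep('~^([a-z]+)/-(in)$~', $words));
print_r($asciiOnly); // sachbearbeiter/-in, referent/-in

// \w with the u modifier (assumes the strings are valid UTF-8).
$withUmlauts = array_values(preg_grep('~^(\w+)/-(in)$~u', $words));
print_r($withUmlauts);
```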

You need to specify the beginning and end of the string so that the regex matches exactly those characters.
Change your regex to
^([a-z]+)/-(in)$
^ -> stands for the beginning of the string
$ -> stands for the end of the string

Your current regex, i.e. ([a-z]+)/-(in), does not escape the / character, and it also looks for substrings that match the pattern, so it will match each of the entries.
The regex should be: ^([a-z]+)\/-(in) i.e. it should start with lowercase letters only, followed by an escaped /

how to replace TOWER 2_3_C with T2_C using Preg_replace

I want to replace my string using preg_replace. I want to replace:
TOWER 2_3_C
with
T2_C
Actually I want to get the first letter and remove the second number with its underscore.
To do this I used:
return preg_replace('/([A-Z]). * ? (. * ?)_(.*?)_(.*?)/', '$1$2-$4', $a);
but it does not work.
Any Idea??

This should accomplish that. Given the string though you may want to make this stricter.
$string = 'TOWER 2_3_C';
echo preg_replace('/([A-Z]).*?(\d+?_).*?([A-Z])/', '$1$2$3', $string);
Regex101 Demo: https://regex101.com/r/iX9dP5/1
Strictly speaking, this doesn't remove the second number with its underscore. It finds an A-Z, then anything until a number (make it \d if you only want a single digit) followed by an underscore, then anything until the next A-Z. It is currently case sensitive; use the i modifier or add a-z to the character classes if you need otherwise.
In your regex you have issues with whitespace. For example:
. * ?
The . is a single character, the " *" is zero or more spaces, and the " ?" is an optional space. A quantifier applies to whatever immediately precedes it, so it has to directly follow the character it should quantify.
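For comparison, here is the pattern from this answer next to a stricter variant. The stricter pattern is only a sketch of what "stricter" could mean; it assumes the input always looks like WORD, space, number, underscore, number, underscore, letter, which the question does not confirm.

```php
<?php
$string = 'TOWER 2_3_C';

// Pattern from the answer: first capital, first "digits_" run, next capital.
$loose = preg_replace('/([A-Z]).*?(\d+?_).*?([A-Z])/', '$1$2$3', $string);

// Hypothetical stricter variant: anchors the whole string and names each part.
$strict = preg_replace('/^([A-Z])[A-Z]* (\d+)_\d+_([A-Z])$/', '${1}${2}_${3}', $string);

echo $loose, PHP_EOL;  // T2_C
echo $strict, PHP_EOL; // T2_C
```

The stricter version leaves any input that does not fit the assumed shape untouched, which may or may not be what you want.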

Finding match, removing the bits I don't want, and then putting it back in

I'm trying to parse through a file, find a particular match, filter it in some way, and then write that data back into the file with some of the characters removed. I've been trying different things for a couple of hours with preg_split and preg_replace, but my regular expression knowledge is limited, so I haven't made much progress.
I have a large file that has many instances like this: [something]{title:value}. I want to find everything between "[" and "}" and remove everything besides the "something" bit.
After that part is done, I want to find everything between "{" and "}" in whatever is left, like {title:value}, and then remove everything besides the "value" part. I'm sure there is some simple method to do this, so even just a resource on how to get started would be helpful.

Not sure if I get your meaning right (and haven't touched PHP for months), what about this?
$matches = array();
preg_match_all("/\[(.*?)\]\{.*?:(.*?)\}/", $str, $matches);
$something = $matches[1]; // $something stores all texts in the "something" part
$value = $matches[2]; // $value stores all texts in the "value" part
Doc for preg_match_all
For the regex pattern \[(.*?)\]\{.*?:(.*?)\}:
We escape all the [, ], { and } with a backslash because these characters have a special meaning in regex and need an escape to be treated as literal characters.
.*? is a lazy match-all, which matches any character until the following part of the pattern can match. It is used instead of .* so that it won't run past the closing symbols.
(.*?) is a capturing group; it grabs what we need, and PHP will put those matches in the $matches array.
So the entire thing is: match the [ character, then any string until the ] character and put it in capturing group 1, then the ]{ characters, then any string until the : character (no capturing group because we don't care about it), then the : character, then any string until the } character and put it in capturing group 2.
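A small runnable sketch of the above, using a made-up sample string (the contents of $str are an assumption, since the question does not show the real file):

```php
<?php
// Hypothetical sample text with two [something]{title:value} instances.
$str = 'intro [alpha]{title:one} text [beta]{name:two}';

$matches = array();
preg_match_all('/\[(.*?)\]\{.*?:(.*?)\}/', $str, $matches);

print_r($matches[1]); // alpha, beta  (the "something" parts)
print_r($matches[2]); // one, two     (the "value" parts)
```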

You can do it in one shot:
$txt = preg_replace('~\[\K[^]]*(?=])|{[^:}]+:\K[^}]+(?=})~', '', $txt);
\K discards from the match result everything that has been matched to its left.
The lookahead (?=...) (followed by) performs a check but adds nothing to the match result.
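To make it visible what \K and the lookaheads leave in the match, here is a sketch on a made-up one-line input (the sample text is an assumption). It first lists the matches and then applies the replacement exactly as written above.

```php
<?php
$txt = 'a [something]{title:value} b';
$pattern = '~\[\K[^]]*(?=])|{[^:}]+:\K[^}]+(?=})~';

// Only the text after \K and before the lookahead is part of each match.
preg_match_all($pattern, $txt, $m);
print_r($m[0]); // something, value

// Replacing with '' therefore removes exactly those two pieces.
echo preg_replace($pattern, '', $txt), PHP_EOL; // a []{title:} b
```

If the goal is to keep those pieces instead, the same pattern can be used with preg_match_all or with a different replacement string.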

php regex: or clause doesn't work

I need to write a regex that performs a double check: whether a string contains whitespace at the beginning or at the end, or consists entirely of whitespace, and whether the string contains only digits.
I've written this regex:
$regex = '/^(\s+ )| ^(\d+)$/';
but it doesn't work. What's wrong?

First things first: get your spaces right!
For example, (\s+ ) will match a minimum of one whitespace character (\s+) followed by another literal space ( )! The same applies to the space between | and ^. These spaces are matched literally every time, and that leads to wrong results.
If I get you right and you want to match on strings which
start with one or more spaces OR
end with one or more spaces OR
consist only of spaces OR
consist only of numbers
I'd use
/^(?:\s+.*|.*\s+$|\d+$)/
Demo # regex101
This way you match spaces at the start of the string (\s+.*) or (|) spaces at the end of the string (.*\s+$) or a completely numeric string (\d+$).
Insert capturing groups as needed.
This will match in case the whole string consists of spaces, too, because technically the string then starts with spaces.
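A quick way to check the four cases is to run the pattern over a few sample strings. The samples below are made up for illustration:

```php
<?php
$regex = '/^(?:\s+.*|.*\s+$|\d+$)/';

// Hypothetical test inputs: leading space, trailing space, only spaces,
// only digits, plain text, digits mixed with a letter.
$samples = [' abc', 'abc ', '   ', '123', 'abc', '12a'];

$results = [];
foreach ($samples as $s) {
    $results[$s] = preg_match($regex, $s);
    printf("%-7s => %s\n", var_export($s, true), $results[$s] ? 'match' : 'no match');
}
// The first four samples match; 'abc' and '12a' do not.
```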

The space before ^(\d+) prevents your regex from matching the numeric string.
It should be like below:
$regex = '/^\s*\d*\s*$/';

First of all, remove the space between | and ^. You are trying to match a space before the beginning of the line (^), so that cannot work.
I do not exactly understand what you want. Either a string that consists only of whitespace, or a number that may have whitespace at the beginning or end? Try this:
$regex = '/^\s*\d*\s*$/';
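To see what this pattern accepts, here is a short sketch with made-up inputs. Note that because every part is optional, the empty string matches too, and a string such as ' abc' (leading space followed by text) does not.

```php
<?php
$regex = '/^\s*\d*\s*$/';

// Hypothetical test inputs.
$samples = ['', '   ', '123', ' 123 ', 'abc', '12 3'];

$results = [];
foreach ($samples as $s) {
    $results[] = preg_match($regex, $s);
}

print_r($results); // 1, 1, 1, 1, 0, 0
```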
