I need to remove the outside brackets of a string, but not the inside ones.
For example:
"(-58)" -> "-58"
"('test')" -> "'test'"
"('st())" -> "st()"
" (hd)h(l() ) " -> "hd)h(l() " --> removed all chars up to the bracket
Hopefully you can see what I mean.
I know how to remove all the brackets inside a string, but am not sure how to remove just the first and last ones. I also need it to remove all the chars UP TO the bracket, as there could be a space before/after the bracket which I do not want.
Any help would be greatly appreciated.
One way is to use preg_replace(). This regex replaces leading and trailing brackets (only one) and spaces according to your examples:
/(^\s*\()|(\)\s*$)/
You can use it like this:
$string = ' (hd)h(l() ) ';
$pattern = '/(^\s*\()|(\)\s*$)/';
$replacement = '';
echo preg_replace($pattern, $replacement, $string); // Output: "hd)h(l() "
Using php's trim() may lead to unexpected over-matching. Consider the following implementation:
$strings=[
"(-58)", // -> "-58"
"('test')", // -> "'test'"
"('st())", // -> "st()"
" (hd)h(l() ) ", // -> "hd)h(l() " --> removed all chars up to the bracket
" ((2x parentheses))" // -> assumed to be "(2x parentheses)"
];
foreach($strings as $s){
// use double trim() to strip leading/trailing spaces, then parentheses
var_export(trim(trim($s,' '),'()'));
echo "\n";
}
Output:
'-58'
'\'test\''
'\'st'
'hd)h(l() '
'2x parentheses' // notice this string had two sets of parentheses removed!
Although several string manipulating functions could be used to generate the desired string, using regex is a more direct approach.
Given the following input data:
(-58)
('test')
('st())
(hd)h(l() )
((2x parentheses))
Gergo's pattern will accurately replace the desired characters with and empty string in 261 steps.
I would like to suggest a more efficient and brief pattern with equivalent accuracy given the OP's sample strings.
/^ ?\(|\) ?$/ #142 steps
Demo Link
Enhancements:
Capture groups are not needed for the pipe to separate the two alternatives. (improves efficiency and brevity)
Replace white-space character \s with literal space character. (improves brevity *and potentially accuracy)
Reduce quantifier on spaces from zero or more to zero or one. (more literal to sample data)
You can use trim() function as shown below
$a="(-58)";
$str=trim($a,"()");
echo $str;
Related
thanks by your help.
my target is use preg_replace + pattern for remove very sample strings.
then only using preg_replace in this string or others, I need remove ANY content into <tag and next symbol >, the pattern is so simple, then:
$x = '#<\w+(\s+[^>]*)>#is';
$s = 'DATA<td class="td1">111</td><td class="td2">222</td>DATA';
preg_match_all($x, $s, $Q);
print_r($Q[1]);
[1] => Array
(
[0] => class="td1"
[1] => class="td2"
)
work greath!
now I try remove strings using the same pattern:
$new_string = '';
$Q = preg_replace($x, "\\1$new_string", $s);
print_r($Q);
result is completely different.
what is bad in my use of preg_replace?
using only preg_replace() how I can remove this strings?
(we can use foreach(...) for remove each string, but where is the error in my code?)
my result expected when I intro this value:
$s = 'DATA<td class="td1">111</td><td class="td2">222</td>DATA';
is this output:
$Q = 'DATA<td>111</td><td>222</td>DATA';
Let's break down your RegEx, #<\w+(\s+[^>]*)>#is, and see if that helps.
# // Start delimiter
< // Literal `<` character
\w+ // One or more word-characters, a-z, A-Z, 0-9 or _
( // Start capturing group
\s+ // One or more spaces
[^>]* // Zero or more characters that are not the literal `>`
) // End capturing group
> // Literal `>` character
# // End delimiter
is // Ignore case and `.` matches all characters including newline
Given the input DATA<td class="td1">DATA this matches <td class="td1"> and captures class="td1". The difference between match and capture is very important.
When you use preg_match you'll see the entire match at index 0, and any subsequent captures at incrementing indexes.
When you use preg_replace the entire match will be replaced. You can use the captures, if you so choose, but you are replacing the match.
I'm going to say that again: whatever you pass as the replacement string will replace the entirety of the found match. If you say $1 or \\=1, you are saying replace the entire match with just the capture.
Going back to the sample after the breakdown, using $1 is the equivalent of calling:
str_replace('<td class="td1">', ' class="td1"', $string);
which you can see here: https://3v4l.org/ZkPFb
To your question "how to change [0] by $new_string", you are doing it correctly, it is your RegEx itself that is wrong. To do what you are trying to do, your pattern must capture the tag itself so that you can say "replace the HTML tag with all of the attributes with just the tag".
As one of my comments noted, this is where you'd invert the capturing. You aren't interesting in capturing the attributes, you are throwing those away. Instead, you are interested in capturing the tag itself:
$string = 'DATA<td class="td1">DATA';
$pattern = '#<(\w+)\s+[^>]*>#is';
echo preg_replace($pattern, '<$1>', $string);
Demo: https://3v4l.org/oIW7d
So, I'm doing some manipulation on lat/long pairs, and I need to turn this:
39.1889375383777,-94.48019109594397
into:
39.1889375383777 -94.48019109594397
I can't use str_replace, unless I want to have an array of 10 search and 10 replace strings, so I was hoping to use preg_replace:
$query1 = preg_replace( "/([0-9-]),([0-9-])/", "\1 \2", $query );
The problem is that the "-" gets lost:
39.1889375383777 94.48019109594397
Note, that I have a string containing a list of these, trying to do all at once:
[[39.1889375383777,-94.48019109594397],[39.18425796890108,-94.28288005131176],[39.41972019529712,-94.19956344733345],[39.41412315915102,-94.41932608390658],[39.34785744845041,-94.4893603307242],[39.1889375383777,-94.48019109594397]]
I managed to make this work with preg_replace_callback:
$str = preg_replace_callback( "/([0-9-]),([0-9-])/",
function ($matches) {return $matches[1] . " " . $matches[2];},
$query
);
But still not sure why the simpler preg_match didn't work?
Your main issue is that "\1 \2" define a "\x1\x20\x2" string, where the first character is a SOH char and the third one is STX char (see the ASCII table). To define backreferences, you need to use a literal backslash, "\\", or, better, use $n notation, and better inside a single-quoted string literal.
You can also use a solution without backreferences:
preg_replace('~(?<=\d),(?=-?\d)~', ' ', $str)
Details:
(?<=\d) - a location that is immediately preceded with a digit
, - a comma
(?=-?\d) - a location that is immediately followed with an optional - and a digit.
See the PHP demo:
$str = '[[39.1889375383777,-94.48019109594397],[39.18425796890108,-94.28288005131176],[39.41972019529712,-94.19956344733345],[39.41412315915102,-94.41932608390658],[39.34785744845041,-94.4893603307242],[39.1889375383777,-94.48019109594397]]';
echo preg_replace('~(?<=\d),(?=-?\d)~', ' ', $str);
// => [[39.1889375383777 -94.48019109594397],[39.18425796890108 -94.28288005131176],[39.41972019529712 -94.19956344733345],[39.41412315915102 -94.41932608390658],[39.34785744845041 -94.4893603307242],[39.1889375383777 -94.48019109594397]]
Maybe it can not be solved this issue as I want, but maybe you can help me guys.
I have a lot of malformed words in the name of my products.
Some of them has leading ( and trailing ) or maybe one of these, it is same for / and " signs.
What I do is that I am explode the name of the product by spaces, and examines these words.
So I want to replace them to nothing. But, a hard drive could be 40GB ATA 3.5" hard drive. I need to process all the word, but I can not use the same method for 3.5" as for () or // because this 3.5" is valid.
So I only need to replace the quotes, when it is at the start of the string AND at end of the string.
$cases = [
'(testone)',
'(testtwo',
'testthree)',
'/otherone/',
'/othertwo',
'otherthree/',
'"anotherone',
'anothertwo"',
'"anotherthree"',
];
$patterns = [
'/^\(/',
'/\)$/',
'~^/~',
'~/$~',
//Here is what I can not imagine, how to add the rule for `"`
];
$result = preg_replace($patterns, '', $cases);
This is works well, but can it be done in one regex_replace()? If yes, somebody can help me out the pattern(s) for the quotes?
Result for quotes should be this:
'"anotherone', //no quote at end leave the leading
'anothertwo"', //no quote at start leave the trailin
'anotherthree', //there are quotes on start and end so remove them.
You may use another approach: rather than define an array of patterns, use one single alternation based regex:
preg_replace('~^[(/]|[/)]$|^"(.*)"$~s', '$1', $s)
See the regex demo
Details:
^[(/] - a literal ( or / at the start of the string
| - or
[/)]$ - a literal ) or / at the end of the string
| - or
^"(.*)"$ - a " at the start of the string, then any 0+ characters (due to /s option, the . matches a linebreak sequence, too) that are captured into Group 1, and " at the end of the string.
The replacement pattern is $1 that is empty when the first 2 alternatives are matched, and contains Group 1 value if the 3rd alternative is matched.
Note: In case you need to replace until no match is found, use a preg_match with preg_replace together (see demo):
$s = '"/some text/"';
$re = '~^[(/]|[/)]$|^"(.*)"$~s';
$tmp = '';
while (preg_match($re, $s) && $tmp != $s) {
$tmp = $s;
$s = preg_replace($re, '$1', $s);
}
echo $s;
This works
preg_replace([[/(]?(.+)[/)]?|/\"(.+)\"/], '$1', $string)
I'm doing str_replace on a very long string and my $search is an array.
$search = array(
" tag_name_item ",
" tag_name_item_category "
);
$replace = array(
" tag_name_item{$suffix} ",
" tag_name_item_category{$suffix} "
);
echo str_replace($search, $replace, $my_really_long_string);
The reason why I added spaces on both $search and $replace is because I want to only match whole words. As you would have guessed from my code above, if I removed the spaces and my really long string is:
...
tag_name_item ...
tag_name_item_category ...
...
Then I would get something like
...
tag_name_item_sfx ...
tag_name_item_sfx_category ...
...
This is wrong because I want the following result:
...
tag_name_item_sfx ...
tag_name_item_category_sfx ...
...
So what's wrong?
Nothing really, it works. But I don't like it. Looks dirty, not well coded, inefficient.
I realized I can do something like this using regular expressions using the \b modifier but I'm not good with regex and so I don't know how to preg_replace.
A possible approach using regular expressions would/could look like this:
$result = preg_replace(
'/\b(tag_name_item(_category)?)\b/',
'$1' . $suffix,
$string
);
How it works:
\b: As you say are word boundaries, this is to ensure we're only matching words, not word parts
(: We want to use part of our match in the replacement string (tag_name_index has to be replaced with itself + a suffix). That's why we use a match group, so we can refer back to the match in the replacement string
tag_name_index is a literal match for that string.
(_category)?: Another literal match, grouped and made optional through use of the ? operator. This ensures that we're matching both tag_name_item and tag_name_item_category
): end of the first group (the optional _category match is the second group). This group, essentially, holds the entire match we're going to replace
\b: word boundary again
These matches are replaced with '$1' . $suffix. The $1 is a reference to the first match group (everything inside the outer brackets in the expression). You could refer to the second group using $2, but we're not interested in that group right now.
That's all there is to it really
More generic:
So, you're trying to suffix all strings starting with tag_name, which judging by your example, can be followed by any number of snake_cased words. A more generic regex for that would look something like this:
$result = preg_replace(
'/\b(tag_name[a-z_]*)\b/',
'$1' . $suffix,
$string
);
Like before, the use of \b, () and the tag_name literal remains the same. what changed is this:
[a-z_]*: This is a character class. It matches characters a-z (a to z), and underscores zero or more times (*). It matches _item and _item_category, just as it would match _foo_bar_zar_fefe.
These regex's are case-sensitive, if you want to match things like tag_name_XYZ, you'll probably want to use the i flag (case-insensitive): /\b(tag_name[a-z_]*)\b/i
Like before, the entire match is grouped, and used in the replacement string, to which we add $suffix, whatever that might be
To avoid the problem, you can use strtr that parses the string only once and chooses the longest match:
$pairs = [ " tag_name_item " => " tag_name_item{$suffix} ",
" tag_name_item_category " => " tag_name_item_category{$suffix} " ];
$result = strtr($str, $pairs);
This function replaces the entire whole word but not the substring with an array element which matches the word
<?PHP
function removePrepositions($text){
$propositions=array('/\b,\b/i','/\bthe\b/i','/\bor\b/i');
if( count($propositions) > 0 ) {
foreach($propositions as $exceptionPhrase) {
$text = preg_replace($exceptionPhrase, '', trim($text));
}
$retval = trim($text);
}
return $retval;
}
?>
See the entire example
I went through dozen of already answered Q without finding one that can help me.
I have a string like this:
aaa.{foo}-{bar} dftgyh {foo-bar}{bar} .? {.!} -! a}aaa{
and I want to obtain a string like this:
aaa{foo}-{bar}dftgyh{foo-bar}{bar}-aaaa
Essentially I want to keep:
valid word chars and hyphens wrapped in an open and a closed curly bracket, something that will match the regex \{[\w\-]+\}
all the valid word chars and hyphens outside curly brackets
Using this:
$result = preg_replace( array( "#\{[\w\-]+\}#", '#[\w\-]#' ), "", $string );
I obtain the exact contrary of what I want: I remove the part that I want to keep.
Sure I can use ^ inside the square brackets in the second pattern, but it will not work for the first.
I.e. this will not work (the second pattern in the array is valid, the first not):
$result = preg_replace( array( "#[^\{[\w\-]+\}]#", '#[^\w\-]#' ), "", $string );
So, whis the regex that allow me to obtain the wanted result?
You may consider matching what you want instead of replacing the characters you do not want. The following will match word characters and hyphen both inside and outside of curly braces.
$str = 'aaa.{foo}-{bar} dftgyh {foo-bar}{bar} .? {.!} -! a}aaa{';
preg_match_all('/{[\w-]+}|[\w-]+/', $str, $matches);
echo implode('', $matches[0]);
Output as expected:
aaa{foo}-{bar}dftgyh{foo-bar}{bar}-aaaa
Also an option to (*SKIP)(*F) the good stuff and do a preg_replace() with the remaining:
$str = preg_replace('~(?:{[-\w]+}|[-\w]+)(*SKIP)(*F)|.~' , "", $str);
test at regex101; eval.in