what is the regular expression for this - php

I want to parse this string:
(adv) much (thanks)
I want to remove the bracketed word (adv), brackets included, but keep (thanks).
The condition is: the word must be inside brackets and be 1 to 5 characters long.
I am using preg_match in PHP.

$matches = NULL;
preg_match("/\([^\)]{1,5}\)/", "(adv) much (thanks)", $matches);
var_export($matches);
array (
0 => '(adv)',
)

$str = '(adv) much (thanks)';
$str = preg_replace('/\(\w{1,5}\) ?/', '', $str);
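
As a quick check of the replacement above, here is a minimal sketch; the variable names are only illustrative. Note that \w{1,5} only accepts word characters inside the brackets, whereas [^\)]{1,5} in the first answer accepts any character other than the closing bracket.

```php
<?php
$input = '(adv) much (thanks)';

// Remove bracketed words of 1 to 5 word characters, plus one optional trailing space
$cleaned = preg_replace('/\(\w{1,5}\) ?/', '', $input);

echo $cleaned, "\n"; // much (thanks)
```

(thanks) survives because the word inside it has six characters, one more than the {1,5} quantifier allows.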

Related

Is it possible to know the position of a match in a subject string?

I have a file name where information has to be replaced. Here is a subject sample :
FileA-2014-11-01_K_1_A2_383.xxx
As many files are to be processed, this filename is first matched by a regex, say :
/[a-zA-Z]*-\d{4}-\d{2}-\d{2}_(\w)_(\d)_A2_(\d*)\.xxx$/
This regex will give me, using preg_match, the values to be replaced, here :
K=>A
1=>2
383=>666
My first try was to naively use str_replace, but it fails when the values to replace also occur elsewhere in the string. Here I get:
FileA-2024-22-02_A_2_A2_666.xxx
So the date is also modified by str_replace (as it was told to do).
So I wonder whether there is a way to know where a given match is located in the string, in order to do a clean replacement.
I'm now trying to invert the regex so that it captures the blocks that are not replaced, and then insert the replacement data between them. That regex would be:
/([a-zA-Z]*-\d{4}-\d{2}-\d{2}_)\w(_)\d(_A2_)\d*(\.xxx)$/
With that one, I'm able to keep the non-replaced parts. I now have to find a kind of index to know the replacement position in the string. I guess I can get there this way, but it seems somewhat complicated and error-prone.
Given that I only have the initial regex and the from=>to replacement map, is there a better way to do this?
[EDIT : solution]
<?php
$filename = "FileA-2014-11-01_K_1_A2_383.xxx";
$expected = "FileA-2014-11-01_A_2_A2_666.xxx";
$regex = "/[a-zA-Z]*-\d{4}-\d{2}-\d{2}_(\w)_(\d)_A2_(\d*)\.xxx$/";
global $replacements;
$replacements["K"] = "A";
$replacements["1"] = "2";
$replacements["383"] = "666";
$result = preg_replace_callback($regex, function($matches){
global $replacements;
print_r($matches);
// ended here. no way.
}, $filename);
if(strcmp($result,$expected)==0)
echo "preg_replace_callback() : Yep\n";
else
echo "preg_replace_callback() : Nop\n";
preg_match($regex, $filename, $matches, PREG_OFFSET_CAPTURE);
// remove useless global string match
array_shift($matches);
$result = $filename;
foreach($matches as $matchInfo){
$match = $matchInfo[0];
$position = $matchInfo[1];
$matchLength= strlen($match);
$beforeReplacementPart = substr($result, 0, $position);
$afterReplacementPart = substr($result, ($position + $matchLength));
$result = $beforeReplacementPart . $replacements[$match] . $afterReplacementPart;
}
if(strcmp($result,$expected)==0)
echo "preg_match() and substr game : Yep\n";
else
echo "preg_match() and substr game : Nop\n";
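
One caveat with the substr approach: the offsets come from the original string, so they stop being valid as soon as a replacement has a different length than the text it replaces. A possible variant, sketched here under that assumption, walks the captured groups from right to left and uses substr_replace(). The function name replaceCaptures is hypothetical and not part of the original post.

```php
<?php
// Hypothetical helper: replace each captured group through a from=>to map,
// using the offsets reported by PREG_OFFSET_CAPTURE.
function replaceCaptures(string $regex, string $subject, array $map): string
{
    if (!preg_match($regex, $subject, $matches, PREG_OFFSET_CAPTURE)) {
        return $subject; // no match: leave the subject untouched
    }
    array_shift($matches); // drop the full-pattern match

    // Go from the last group to the first so earlier offsets stay valid
    foreach (array_reverse($matches) as [$text, $offset]) {
        if ($offset >= 0 && array_key_exists($text, $map)) {
            $subject = substr_replace($subject, $map[$text], $offset, strlen($text));
        }
    }
    return $subject;
}

$regex = '/[a-zA-Z]*-\d{4}-\d{2}-\d{2}_(\w)_(\d)_A2_(\d*)\.xxx$/';
$map = array('K' => 'A', '1' => '2', '383' => '666');

echo replaceCaptures($regex, 'FileA-2014-11-01_K_1_A2_383.xxx', $map), "\n";
// FileA-2014-11-01_A_2_A2_666.xxx
```

Because only the captured positions are touched, the 1s in the date are left alone even though "1" is a key of the map.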

A regex that matches that filename:
$re = '/[a-zA-Z]*-\d{4}-\d{2}-\d{2}_(\w)_(\d)_A2_(\d*)\.xxx$/';
$str = 'FileA-2014-11-01_K_1_A2_383.xxx';
If you add PREG_OFFSET_CAPTURE as the fourth parameter ($flags) to the call to preg_match(), it will also return the offset of each captured string in the third parameter:
preg_match($re, $str, $matches, PREG_OFFSET_CAPTURE);
A print_r($matches) will reveal:
Array
(
[0] => Array
(
[0] => FileA-2014-11-01_K_1_A2_383.xxx
[1] => 0
)
[1] => Array
(
[0] => K
[1] => 17
)
[2] => Array
(
[0] => 1
[1] => 19
)
[3] => Array
(
[0] => 383
[1] => 24
)
)
$matches[0] is the part that matched the entire regex. $matches[1] is the first capturing sub-expression, $matches[2] is the second and so on.
$matches[1][0] is the fragment from the input string that matched the first regex sub-expression (\w) and $matches[1][1] is the offset in the input string where it was found. The same for $matches[N][0] and $matches[N][1] for the Nth sub-expression.
If you need to do a simple replacement then you don't need to bother about offsets but use preg_replace() or, if the replacement expression is complex or dynamic, preg_replace_callback().
Using preg_replace() you need to capture the parts you want to keep:
$re = '/([a-zA-Z]*-\d{4}-\d{2}-\d{2}_)\w_\d_A2_\d*(\.xxx)$/';
$str = 'FileA-2014-11-01_K_1_A2_383.xxx';
$new = preg_replace($re, '$1A_2_A2_666$2', $str);
echo($new."\n");
In the replacement string, $1 and $2 denote the sub-expressions from the regex. We marked them for capturing in order to re-use them in the replacement string.

At least preg_match_all() offers the option
PREG_OFFSET_CAPTURE
If this flag is passed, for every occurring match the appendant string offset will also be returned. Note that this changes the value of matches into an array where every element is an array consisting of the matched string at offset 0 and its string offset into subject at offset 1.
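
To make the structure described above concrete, here is a small sketch that lists every run of digits in the sample filename together with its offset. The pattern \d+ is only chosen for illustration and is not the asker's regex.

```php
<?php
$subject = 'FileA-2014-11-01_K_1_A2_383.xxx';

// With PREG_OFFSET_CAPTURE each entry of $matches[0] is array(text, offset)
preg_match_all('/\d+/', $subject, $matches, PREG_OFFSET_CAPTURE);

foreach ($matches[0] as [$text, $offset]) {
    echo $text, ' at offset ', $offset, "\n";
}

// Just the offsets, in order of appearance
$offsets = array_column($matches[0], 1);
```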

You could try the below regex.
([a-zA-Z]*-\d{4}-\d{2}-\d{2}(?:-\d*)?_)\w_\d(_A2)_\d*(\.xxx)$
Then replace the match with
\1A_2\2_666\3
$re = "~([a-zA-Z]*-\\d{4}-\\d{2}-\\d{2}(?:-\\d*)?_)\\w_\\d(_A2)_\\d*(\\.xxx)$~m";
$str = "FileA-2014-11-01_K_1_A2_383.xxx";
// Single quotes: in a double-quoted PHP string "\1" is an octal escape, not a backreference
$subst = '\1A_2\2_666\3';
$result = preg_replace($re, $subst, $str);

You can use:
$re = "/([a-zA-Z]+-\\d{4}-\\d{2}-\\d{2}_)\\w+_\\d+(_A2_)\\d+(\\.xxx)$/m";
$str = "FileA-2014-11-01_K_1_A2_383.xxx";
// Single quotes: in a double-quoted string PHP itself would try to interpolate ${1}
$subst = '${1}A_2${2}666${3}';
$result = preg_replace($re, $subst, $str);
//=> FileA-2014-11-01_A_2_A2_666.xxx

Perhaps it is possible to use this in your case:
$str = strtr($str, array('_K_1_'=>'_A_2_', '_383.'=>'_666.'));
or
$str = str_replace('_K_1_A2_383.xxx', '_A_2_A2_666.xxx', $str);
So there is no more ambiguity and the replacement is fast.

Regex to return contents of word and bracket

What Regex pattern would I need to extract the contents of a pair of parentheses in PHP which have a preceding specific string?
So if I have a statement
#includelayout( 'cms.layout.nav-header' )
I only need the contents of the parentheses that are directly preceded by #includelayout.
So I just want it to return:
'cms.layout.nav-header'
I am currently using:
preg_match('/(?<!\w)(?:\s*)#includelayout((?:\s*)?\(.*)/', $value, $matches);
which gives me
array (size=2)
0 => string '#includelayout( 'output.layout.nav-header' )' (length=46)
1 => string '( 'output.layout.nav-header' )' (length=30)
but I just can't get it to not return the parentheses.
Thanks

Get the matched group from index 1:
(?<=#includelayout\()([^)]*)
Sample code:
$re = "/(?<=#includelayout\\()([^)]*)/i";
$str = "#includelayout( 'cms.layout.nav-header' )";
preg_match_all($re, $str, $matches);

You could try the regex below to match the contents of the parentheses without the leading and trailing spaces inside them:
#includelayout\(\s*\K.*?(?=\s*\))
If you want to match all the characters enclosed within () that are preceded by the string #includelayout, including any spaces, then you could try the one below.
#includelayout\(\K[^)]*
Your PHP code would be,
<?php
$mystring = "#includelayout( 'cms.layout.nav-header' )";
$regex = '~#includelayout\(\K[^)]*~';
if (preg_match($regex, $mystring, $m)) {
    $yourmatch = $m[0];
    // Prints the content together with its surrounding spaces: " 'cms.layout.nav-header' "
    echo $yourmatch;
}
?>
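
If lookbehind and \K feel unfamiliar, a plain capture group gives the same result. The following is a minimal sketch of that alternative, not taken from the answers above; the variable names are illustrative.

```php
<?php
$line = "#includelayout( 'cms.layout.nav-header' )";
$layout = null;

// \s* on both sides keeps the padding spaces out of the capture group;
// the lazy .*? stops as soon as optional spaces and a closing parenthesis follow
if (preg_match('/#includelayout\(\s*(.*?)\s*\)/', $line, $m)) {
    $layout = $m[1];
    echo $layout, "\n"; // 'cms.layout.nav-header'
}
```

Here $m[0] still contains the whole statement, while $m[1] holds only the quoted layout name without the parentheses or padding.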

preg_replace() pattern to remove brackets and content in php

I want to remove the brackets and their content using preg_replace(), but I have not been able to get a lazy (non-greedy) pattern to work, since the closing bracket is the last character of the string. The text between the brackets has a variable length and can contain numbers, underscores, and hyphens.
code-
$array = array(
"Text i want to keep (txt to remove)",
"Random txt (some more random txt)",
"Keep this (remove)",
"I like bananas (txt)"
);
$pattern = "#pattern#";
$new_outputs = '';
foreach($array as $new_txt){
    $new_outputs .= preg_replace($pattern, '', $new_txt)."\n";
}
echo $new_outputs;
Wanted output-
Text i want to keep
Random txt
Keep this
I like bananas
I do not use regular expressions much and couldn't find anything to solve my problem.

The following regular expression should do it:
$pattern = '#\(.*?\)#';
.*? is a non-greedy match of anything.

$new_outputs .= preg_replace('#\([^\)]*\)$#','',$new_txt);

This might help you:
$pattern = "/\([^)]*\)+/";
foreach($array as $new_txt){
$new_outputs .= preg_replace($pattern, '', $new_txt)."\n";
}
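
The patterns in these answers remove the brackets but leave the space in front of them, so each output line ends with a trailing space. A possible refinement, sketched here with the sample data from the question, lets the pattern swallow that whitespace too; $lines is an illustrative name.

```php
<?php
$lines = array(
    "Text i want to keep (txt to remove)",
    "Random txt (some more random txt)",
    "Keep this (remove)",
    "I like bananas (txt)",
);

$new_outputs = ''; // initialise before appending with .=
foreach ($lines as $line) {
    // \s* also removes the whitespace in front of the opening bracket
    $new_outputs .= preg_replace('/\s*\([^)]*\)/', '', $line) . "\n";
}
echo $new_outputs;
```

The negated class [^)]* cannot run past the first closing bracket, so no lazy quantifier is needed.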

Why doesn't this PHP regular expression extract the url from the css value?

I want to extract the url from the background css property "url('/img/hw (11).jpg') no-repeat". I tried:
$re = '/url\(([\'\"]?.*\.[png|jpg|jpeg|gif][\'\"]?)\)/i';
$text = "url('/img/hw (11).jpg')";
preg_match_all($re, $text, $matches);
print_r($matches);
and it gives me :
Array
(
[0] => Array
(
)
[1] => Array
(
)
)

Here is the correct regex. The ".*" in the middle of your regex is too greedy. Also, replace the square brackets with parentheses: square brackets define a character class, not a list of alternatives. Also note that since you are using single quotes around the string, you do not need to escape the double quotes.
$re = '/url\(([\'"]?.[^\'"]*\.(png|jpg|jpeg|gif)[\'"]?)\)/i';

Try:
/url\(([\'\"]?.*\.(png|jpg|jpeg|gif)[\'\"]?)\)/i
instead. Square brackets match a single character out of the listed set, rather than doing the "or" comparison between whole words that you're looking for.

I think the problem lies in this part: [png|jpg|jpeg|gif]. A character class like this only ever matches a single character.
You should do this instead:
/url\([\'\"]?(.*\.(jpg|png|jpeg|gif)[\'\"]?)\)/
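
The difference between a character class and a group with alternation can be checked directly. The sketch below also shows one possible extraction pattern; it is an assumption for illustration, not one of the patterns from the answers, and it uses a backreference so that the closing quote has to match the opening one.

```php
<?php
// A character class matches exactly one character from the set p n g | j e i f
$classSingle = preg_match('/^[png|jpg|jpeg|gif]$/', 'g');   // 1
$classWord   = preg_match('/^[png|jpg|jpeg|gif]$/', 'jpg'); // 0

// A group with alternation matches one of the whole words
$groupWord   = preg_match('/^(png|jpg|jpeg|gif)$/', 'jpg'); // 1

$text = "url('/img/hw (11).jpg') no-repeat";
$url = null;

// Group 1: optional quote; group 2: the URL; \1 repeats the same quote (or nothing)
if (preg_match('/url\(([\'"]?)(.*?\.(?:png|jpe?g|gif))\1\)/i', $text, $m)) {
    $url = $m[2];
    echo $url, "\n"; // /img/hw (11).jpg
}
```

The parentheses inside the file name are not a problem here, because the match only ends where an image extension is directly followed by the closing quote and a closing parenthesis.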

Get content from bracket

With preg_match, how can I get the string between the brackets?
Example: sdsdds (sdsd) sdsdsd
And I want to get:
sdsd

preg_match('/\(([^\)]*)\)/', 'sdsdds (sdsd) sdsdsd', $matches);
echo $matches[1]; // sdsd
Matches characters within parentheses, including blank values. If you want to match multiple instances, you can use preg_match_all.
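
For the multiple-instance case, a short sketch with preg_match_all could look like this; the sample text is made up for illustration.

```php
<?php
$text = 'one (first) two (second) three ()';

// preg_match_all returns the number of matches; $matches[1] holds capture group 1 of each
$count = preg_match_all('/\(([^)]*)\)/', $text, $matches);

print_r($matches[1]); // first, second, and an empty string for "()"
```

Because the pattern uses * rather than +, the empty pair () is matched as well and contributes an empty string.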

preg_match('/\((.*?)\)/', $text, $a);
echo $a[1];

The simplest:
#\(([^\)]+)\)#
It's not very readable, because all the ( and ) must be escaped with \.
The # are delimiters.
Using preg_match:
$str = 'sdsdds (sdsd) sdsdsd';
$iMatches = preg_match('#\(([^\)]+)\)#', $str, $aMatches);
echo $aMatches[1]; // 'sdsd'
