regular expression for route parameter of URL - php

I'm not great with regular expressions, which is why I need your help.
Take this example from http://kohanaframework.org/3.3/guide/kohana/routing#examples:
Route::set('search', ':<query>', array('query' => '.*'))
->defaults(array(
'controller' => 'Search',
'action' => 'index',
));
This regular expression (.*) accepts everything I need:
"cat1/cat2/cat3"
but also:
"cat1/cat 2/ cat3",
"cat1/cat 2/ /// a |<>"?\':*"
How can I modify this expression to disallow:
1. any kind of whitespace ( "\s" )
2. more than one consecutive slash ( 'cat1/cat2' but not 'cat1/////cat2' )
3. each of the following symbols: [ "|", "<", ">", "\"", "?", "\", "'", ":", "*" ]
Thanks to everyone who tries to help.
define('CATEGORIES_RGXP', '(?:[^|<>\\?"\':*\s]+\/?)+');
Route::set('debug_route', '(<categories>/)<file>.<ext>',array(
'categories' => CATEGORIES_RGXP,
))
->defaults(array(
'controller' => 'index',
'action' => 'file',
));
Output of var_dump($this->request->param()); in the controller when I request "/cat1/cat2/////cat3/file.php":
array(3) {
["categories"]=>
string(14) "cat1/cat2/cat3"
["file"]=>
string(4) "file"
["ext"]=>
string(3) "php"
}
So it still lets a run of several slashes through.

The . matches every character (except a newline), which explains the observed behaviour.
Instead, use a negated character class, i.e. [^X], which means "match any character except X".
According to your requirements, you should then use:
^((?:[^|<>\\\/?"':*\s]+\/?)+)$
NODE                       EXPLANATION
--------------------------------------------------------------------------------
^                          the beginning of the string
--------------------------------------------------------------------------------
(                          group and capture to \1:
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (1 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    [^|<>\\\/?"':*\s]+     any character except: '|', '<', '>', '\\',
                           '\/', '?', '"', ''', ':', '*', whitespace
                           (\n, \r, \t, \f, and " ") (1 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
    \/?                    '/' (optional (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  )+                       end of grouping
--------------------------------------------------------------------------------
)                          end of \1
--------------------------------------------------------------------------------
$                          before an optional \n, and the end of the
                           string
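To see how the pattern above behaves outside the router, here is a minimal sketch that runs it through preg_match. The "~" delimiters and the sample strings are assumptions chosen for illustration; Kohana compiles route patterns itself, so this only checks the regex in isolation.

```php
<?php
// Pattern from the answer above, written with "~" delimiters so "/" needs no escaping.
// In a single-quoted PHP string, \\\\ yields a regex-level \\ (a literal backslash).
$pattern = '~^((?:[^|<>\\\\/?"\':*\s]+/?)+)$~';

// Illustrative inputs based on the question's examples.
$samples = [
    'cat1/cat2/cat3',    // single slashes only
    'cat1/cat 2/ cat3',  // contains spaces
    'cat1/////cat2',     // run of slashes
    'cat1/cat2:x',       // contains a forbidden symbol
];

foreach ($samples as $sample) {
    $ok = preg_match($pattern, $sample) === 1;
    echo str_pad($sample, 20), $ok ? 'accepted' : 'rejected', PHP_EOL;
}
```

Only the first sample should be accepted. Note that the pattern also accepts a single trailing slash, since the \/? after the last segment is optional.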

Related

Get content in parentheses following right after string using regex in php

I have a PHP file loaded as a string. I am looking for places where certain functions are called, and I want to extract the arguments passed to them.
I need to match the following cases:
some_function_name("abc123", ['key' => 'value'])
some_function_name("abc123", array("key" => 'value'))
So far I have this, but it breaks as soon as the arguments contain nested parentheses:
(function_name)\(([^()]+)\)
$text = "test test test test some_function_name('abc123', ['key' => 'value']) sdohjsh dsfkjh spkdo sdfopmsdfohp some_function_name('abc123', array('key' => 'value'))";
preg_match_all('/\w+\(.*?\)(\)|!*)/', $text, $matches);
var_dump($matches[0]);
Is this the result you want?
$text = "blah some_function_name('abc123', ['key' => 'value']) blah some_function_name('abc123', array('key' => 'value')) blah";
preg_match_all('/\w+\(.+?(?:array\(.+?\)|\[.+?\])\)/', $text, $matches);
var_dump($matches);
Output:
array(1) {
[0]=>
array(2) {
[0]=>
string(48) "some_function_name('abc123', ['key' => 'value'])"
[1]=>
string(53) "some_function_name('abc123', array('key' => 'value'))"
}
}
Explanation:
\w+ # 1 or more word character (i.e. [a-zA-Z0-9_])
\( # opening parenthesis
.+? # 1 or more any character, not greedy
(?: # non capture group
array\(.+?\) # array(, 1 or more any character, )
| # OR
\[.+?\] # [, 1 or more any character, ]
) # end group
\) # closing parenthesis
I managed to solve it using the following pattern:
((\'.*?\'|\".*?\")(\s*,\s*.*?)*?\);?
Thanks everyone for your suggestions!
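Since the original problem was nesting, another option worth considering is a PCRE recursive subpattern, which matches balanced parentheses to any depth. This is a sketch rather than any of the posted solutions: the input string is made up from the question's examples, and it assumes no unbalanced parentheses appear inside string literals.

```php
<?php
// Hypothetical input, modelled on the question's examples.
$text = "x some_function_name('abc123', ['key' => 'value']) y "
      . "some_function_name('abc123', array('key' => 'value')) z";

// Group 1 matches one balanced (...) pair; (?1) recurses into group 1
// whenever a nested "(" is found. [^()]++ consumes everything else.
$pattern = '/some_function_name(\((?:[^()]++|(?1))*\))/';

preg_match_all($pattern, $text, $matches);

// Strip the outer parentheses to get the raw argument lists.
$args = array_map(function ($m) {
    return substr($m, 1, -1);
}, $matches[1]);

print_r($args);
```

For robust analysis of real source files, PHP's built-in tokenizer (token_get_all) is usually a safer basis than a regex, because it understands strings and comments.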

How to replace a substring with help of preg_replace

I have a string that consists of repeated words. I want to replace a substring 'OK' located between 'L3' and 'L4'. Below you can find my code:
$search = "/(?<=L3).*(OK).*(?=L4)/";
$replace = "REPLACEMENT";
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";
$str = preg_replace($search, $replace, $subject);
If I use that pattern with preg_match, it finds the correct substring (the third 'OK'). However, when I apply the pattern with preg_replace, it replaces the substring that matches the full pattern instead of the parenthesized subpattern.
So could you please advise what I should change in my code? I know there are plenty of similar questions about regex, but as far as I understand my pattern is correct and I'm only confused by the preg_replace function.
Your regex matches a place in the string that is preceded by L3, then any 0+ chars other than linebreak symbols up to the last OK substring, and then any 0+ chars up to the place followed by L4. preg_replace replaces that whole match, not just the text captured by the group.
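The difference between the whole match and the captured group can be checked directly. The sketch below reuses the question's pattern and subject (the subject is only split across two lines for readability):

```php
<?php
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), "
         . "'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";
$search = "/(?<=L3).*(OK).*(?=L4)/";

preg_match($search, $subject, $m);
var_dump($m[0]); // whole match: everything between L3 and L4
var_dump($m[1]); // group 1: just "OK"

// preg_replace() swaps out the whole match ($m[0]), not group 1.
$result = preg_replace($search, 'REPLACEMENT', $subject);
echo $result, PHP_EOL;
```

Because the lookarounds consume nothing, the text between L3 and L4 is what gets replaced, leaving L3REPLACEMENTL4 in the output.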
A possible solution is to use 2 capturing groups around the subpatterns before and after the OK, and use backreferences in the replacement pattern:
$search = "/(L3.*?)OK(.*?L4)/";
$replace = "REPLACEMENT";
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";
$str = preg_replace($search, '$1'.$replace.'$2', $subject);
echo $str; // => 'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'REPLACEMENT'), 'L4' => ('John', 'Madrid', 'OK')
If there cannot be any L3.5 in between L3 and L4, the (L3.*?)OK(.*?L4) pattern is safe to use. It will match and capture L3 and then 0+ chars other than a linebreak up to the first OK, then will match OK, and then will match and capture 0+ chars up to the first L4.
If there can be no L4, use a (?:(?!L4).)* tempered greedy token matching any symbol other than a linebreak symbol that is not starting an L4 sequence:
'~(L3(?:(?!L4).)*)OK~'
NOTE: If you want to make the regexps safer, add ' around L# inside the patterns.
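As a quick check of the tempered greedy token variant, the following sketch applies it to the question's subject. The ${1} form of the backreference is a choice made here so that a replacement starting with a digit cannot be misread as part of the group number.

```php
<?php
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), "
         . "'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";

// (?:(?!L4).)* consumes characters one at a time, but never steps onto "L4",
// so the OK that gets replaced is always located before the next L4.
$result = preg_replace('~(L3(?:(?!L4).)*)OK~', '${1}REPLACEMENT', $subject);

echo $result, PHP_EOL;
```

Only the OK in the L3 entry changes; the other three remain untouched.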

How to rewrite a string and get params by pattern in php

$string = '/start info#example.com';
$pattern = '/{command} {name}#{domain}';
I want to get the params as an array in PHP, like the example below:
['command' => 'start', 'name' => 'info', 'domain' => 'example.com']
and
$string = '/start info#example.com';
$pattern = '/{command} {email}';
['command' => 'start', 'email' => 'info#example.com']
and
$string = '/start info#example.com';
$pattern = '{command} {email}';
['command' => '/start', 'email' => 'info#example.com']
If it's a single-line string, you can use preg_match with a regular expression such as this:
preg_match('/^\/(?P<command>\w+)\s(?P<name>[^#]+)\#(?P<domain>.+?)$/', '/start info#example.com', $match );
Depending on the variation in the data, you may have to adjust the regex a bit. This outputs
command [1-6] start
name [7-11] info
domain [12-23] example.com
The resulting array will also contain the numeric indexes.
https://regex101.com/r/jN8gP7/1
Just to break this down a bit, in English.
The leading ^ is the start of the line, followed by a literal /, then a named capture of \w+ (any of a-z, A-Z, 0-9, _), then a whitespace character \s, then a named capture of anything but the # sign ([^#]+), then the # sign itself, then a named capture of anything (.+?) up to the end $.
This will capture anything in this format,
(abc123_ ) space (anything but #)#(anything)
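The question asks for something more general than one fixed regex: deriving the capture groups from the {placeholder} pattern itself. A minimal sketch of that idea follows. The function name extractParams is hypothetical, and it assumes placeholder names are valid PCRE group names (letters, digits and underscores, not starting with a digit) and that each placeholder should match lazily.

```php
<?php
// Hypothetical helper: turns "{name}" placeholders into named capture groups
// and returns the named matches, or null when the string does not fit.
function extractParams(string $pattern, string $string): ?array
{
    // Split into literal text and placeholder names (odd indexes are names).
    $parts = preg_split('/\{(\w+)\}/', $pattern, -1, PREG_SPLIT_DELIM_CAPTURE);

    $regex = '';
    foreach ($parts as $i => $part) {
        $regex .= ($i % 2 === 1)
            ? '(?P<' . $part . '>.+?)'   // placeholder: lazy named group
            : preg_quote($part, '~');    // literal text, escaped for "~" delimiters
    }

    if (preg_match('~^' . $regex . '$~', $string, $m) !== 1) {
        return null;
    }

    // Keep only the named keys, dropping the numeric duplicates.
    return array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
}

print_r(extractParams('/{command} {name}#{domain}', '/start info#example.com'));
print_r(extractParams('/{command} {email}', '/start info#example.com'));
print_r(extractParams('{command} {email}', '/start info#example.com'));
```

Filtering on string keys also addresses the remark above about numeric indexes appearing next to the named ones.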

Bad json encoding

I'm trying to parse the content of a txt file into JSON. This is the file content:
[19-02-2016 16:48:45.505547] [info] System done.
0: array(
'ID' => 'Example 2'
)
This is my code for parsing the file:
$fh = fopen($file, "r");
$content = array();
$content["trace"] = array();
while ($line = fgets($fh))
{
$raw = preg_split("/[\[\]]/", $line);
$entry = array();
$entry["date"] = trim($raw[1]);
$entry["type"] = trim($raw[3]);
$entry["message"] = trim($raw[4]);
$content["trace"][] = $entry;
}
fclose($fh);
return $content;
and this is what is returned from $content:
{
"trace": [{
"date": "19-02-2016 16:48:45.505547"
"type": "info"
"message": "System done."
}, {
"date": ""
"type": ""
"message": ""
}, {
"date": ""
"type": ""
"message": ""
}, {
"date": ""
"type": ""
"message": ""
}]
}
UPDATE I'm expecting this:
{
"trace": [{
"date": "19-02-2016 16:48:45.505547"
"type": "info"
"message": "System done."
"ID": Example 2
}]
}
As you can see, the array block is read as separate lines, so the while loop creates extra empty entries. I just want to add a new index after message holding the array content. How can I achieve this?
UPDATE WITH MORE CONTENT IN FILE
[19-02-2016 16:57:17.104504] [info] system done.
0: array(
'ID' => 'john foo'
)
[19-02-2016 16:57:17.110482] [info] transaction done.
0: array(
'ID' => 'john foo'
)
Expected result:
{
"trace": [20]
0: {
"date": "19-02-2016 16:57:17.104504"
"type": "info"
"message": "system done."
"ID": john foo
}
1: {
"date": "19-02-2016 16:57:17.110482"
"type": "info"
"message": "transaction done."
"ID": john foo
}
...
Try this:
Code
<?php
$file = 'test.log';
$content = array();
$content["trace"] = array();
$input = file_get_contents($file);
preg_match_all('/\[(.*)\][\s]*?\[(.*?)\][\s]*?(.*)[\s][^\']*\'ID\'[ ]*=>[ ]*\'(.*)\'/', $input, $regs, PREG_PATTERN_ORDER);
for ($i = 0; $i < count($regs[0]); $i++) {
$content['trace'][] = array(
'date' => $regs[1][$i],
'type' => trim($regs[2][$i]),
'message' => trim($regs[3][$i]),
'ID' => trim($regs[4][$i]),
);
}
// return $content;
echo '<pre>'; print_r($content); echo '</pre>'; // For testing only
$content = json_encode($content); // For testing only
echo '<pre>' . $content . '</pre>'; // For testing only
Result
PHP array:
Array
(
[trace] => Array
(
[0] => Array
(
[date] => 19-02-2016 16:57:17.104504
[type] => info
[message] => system done.
[ID] => john foo
)
[1] => Array
(
[date] => 19-02-2016 16:57:17.110482
[type] => info
[message] => transaction done.
[ID] => john foo
)
)
)
Json object (string):
{
"trace":[
{
"date":"19-02-2016 16:57:17.104504",
"type":"info",
"message":"system done.",
"ID":"john foo"
},
{
"date":"19-02-2016 16:57:17.110482",
"type":"info",
"message":"transaction done.",
"ID":"john foo"
}
]
}
Notes re. the RegEx:
The file is read as a whole into a string variable ($input).
The preg_match_all(RegEx) also scans the entire input.
The code iterates over all its hits, where the groups contain these parts…
1: date
2: type
3: message
4: ID
The RegEx in detail:
\[ Match the character “[” literally
( Match the regular expression below and capture its match into backreference number 1
. Match any single character that is not a line break character
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
\] Match the character “]” literally
[\s] Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
*? Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
\[ Match the character “[” literally
( Match the regular expression below and capture its match into backreference number 2
. Match any single character that is not a line break character
*? Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
)
\] Match the character “]” literally
[\s] Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
*? Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
( Match the regular expression below and capture its match into backreference number 3
. Match any single character that is not a line break character
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
[\s] Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
[^'] Match any character that is NOT a “'”
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
'ID' Match the characters “'ID'” literally
[ ] Match the character “ ”
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
=> Match the characters “=>” literally
[ ] Match the character “ ”
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
' Match the character “'” literally
( Match the regular expression below and capture its match into backreference number 4
. Match any single character that is not a line break character
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
' Match the character “'” literally
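An alternative to one large regex over the whole file is to keep the question's line-by-line approach and attach each key/value line to the most recent log entry. The sketch below assumes the log format shown in the question and embeds the sample text as a string so it runs on its own; a real script would read the file instead.

```php
<?php
// Sample log text copied from the question.
$input = <<<'LOG'
[19-02-2016 16:57:17.104504] [info] system done.
0: array(
'ID' => 'john foo'
)
[19-02-2016 16:57:17.110482] [info] transaction done.
0: array(
'ID' => 'john foo'
)
LOG;

$trace = [];
foreach (preg_split('/\R/', $input) as $line) {
    if (preg_match('/^\[(.*?)\]\s*\[(.*?)\]\s*(.*)$/', $line, $m)) {
        // Header line "[date] [type] message": start a new entry.
        $trace[] = ['date' => $m[1], 'type' => $m[2], 'message' => trim($m[3])];
    } elseif ($trace && preg_match("/^\s*'(\w+)'\s*=>\s*'(.*)'/", $line, $m)) {
        // Key/value line: attach it to the most recent entry.
        $trace[count($trace) - 1][$m[1]] = $m[2];
    }
    // Any other line ("0: array(" or ")") is ignored.
}

echo json_encode(['trace' => $trace], JSON_PRETTY_PRINT), PHP_EOL;
```

This variant picks up any quoted key, not only 'ID', and skips lines it does not recognise instead of producing empty entries.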

preg_match dot and slash

I would like to allow two more characters in this regular expression: the dot . and the slash /.
'numericString' => array(
'pattern' => '^[a-zćęłńóśźżA-Z0-9\s]+$',
)
How can I do that?
Add \. and \/:
'numericString' => array(
'pattern' => '^[a-zćęłńóśźżA-Z0-9\s\.\/]+$',
)
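Note: outside a character class you must escape the characters "^.[$()|*+?{\" with a backslash ('\'), as
they have special meaning. Inside [...] the dot is already literal, so escaping it there is optional but harmless.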
Use the code below:
'numericString' => array(
'pattern' => '^[a-zćęłńóśźżA-Z0-9\s\.\/]+$',
)
A literal . is expressed in a regex as \.
A literal / is expressed as \/
Note: not all regex flavours require escaping the /, only the ones that use it for delimiting the regex.
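To try the extended character class with plain preg_match, a sketch like the following can be used. The "~" delimiters, the u modifier and the test strings are assumptions added here: the delimiter choice removes the need to escape the slash, and u makes PCRE treat the pattern and subject as UTF-8 so the Polish letters are matched as whole characters.

```php
<?php
// Same character class as above, with "." and "/" added unescaped.
// Assumes this source file and the input are UTF-8 encoded.
$pattern = '~^[a-zćęłńóśźżA-Z0-9\s./]+$~u';

var_dump(preg_match($pattern, 'ul. źródlana 12/4')); // letters, dot, space, slash
var_dump(preg_match($pattern, 'a+b'));               // "+" is not in the class
```

The first call should report a match and the second should not.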
