Parsing string - with regex or something similar? - php

I'm writing a routing class and need help. I need to parse the $controller variable and assign parts of that string to other variables. Here are examples of $controller:
$controller = "admin/package/AdminClass::display";
//$path = "admin/package";
//$class = "AdminClass";
//$method = "display";
$controller = "AdminClass::display";
//$path = "";
//$class = "AdminClass";
//$method = "display";
$controller = "display";
//$path = "";
//$class = "";
//$method = "display";
These three cases are all I need to handle. I could write a long procedure for them, but I would like a simple regex solution using preg_match_all.
Any suggestions on how to do this?

The following regex should accomplish this; you can then save the captured groups to $path, $class, and $method.
(?:(.+)/)?(?:(.+)::)?(.+)
Here is a Rubular:
http://www.rubular.com/r/1vPIhwPUub
Your php code might look something like this:
$regex = '/(?:(.+)\/)?(?:(.+)::)?(.+)/';
preg_match($regex, $controller, $matches);
$path = $matches[1];
$class = $matches[2];
$method = $matches[3];

This assumes that path elements, the class name, and the method name can only contain letters.
The full regex is the following:
^(?:((?:[a-zA-Z]+/)*)([a-zA-Z]+)::)?([a-zA-Z]+)$
It uses two non-capturing groups: the first makes the whole path-and-class part optional, the second avoids capturing individual path elements.
Explanation:
a path element is one or more letters followed by a /: [a-zA-Z]+/;
there may be zero or more of them, so we must apply the * quantifier to the above; but that regex is not an atom, so we need a group. As we do not want to capture individual path elements, we use a non-capturing group: (?:[a-zA-Z]+/)*;
we want to capture the full path if it is there, so we put a capturing group around this: ((?:[a-zA-Z]+/)*);
the class name is one or more letters, and we want to capture it: ([a-zA-Z]+);
if present, it follows the path and is followed by two colons: ((?:[a-zA-Z]+/)*)([a-zA-Z]+)::;
but all this is optional: we must therefore put a group around all this, which again we do not want to capture: (?:((?:[a-zA-Z]+/)*)([a-zA-Z]+)::)?;
finally, it is followed by a method name, which is NOT optional this time, and which we want to capture: (?:((?:[a-zA-Z]+/)*)([a-zA-Z]+)::)?([a-zA-Z]+);
and we want this to match the whole line: we need to anchor it both at the beginning and at the end, which gives the final result: ^(?:((?:[a-zA-Z]+/)*)([a-zA-Z]+)::)?([a-zA-Z]+)$
Phew.
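As a rough sketch of how the anchored regex above could be used from PHP: the function name parse_controller is made up for illustration, and the letters-only assumption is kept. Note that the path group captures the trailing slash, so it is trimmed here to get "admin/package" as in the question.

```php
<?php
// Hypothetical helper (not from the original answers): splits
// "path/Class::method" with the anchored, letters-only regex.
function parse_controller(string $controller): ?array
{
    $regex = '~^(?:((?:[a-zA-Z]+/)*)([a-zA-Z]+)::)?([a-zA-Z]+)$~';
    if (!preg_match($regex, $controller, $m)) {
        return null; // does not fit the letters-only format
    }
    return [
        'path'   => rtrim($m[1], '/'), // group 1 keeps the trailing slash
        'class'  => $m[2],             // empty string when no class was given
        'method' => $m[3],
    ];
}

foreach (['admin/package/AdminClass::display', 'AdminClass::display', 'display'] as $c) {
    print_r(parse_controller($c));
}
```

Using ~ as the delimiter avoids having to escape the slashes inside the pattern.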

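$pieces = explode('/', $controller);
// the last piece is "Class::method" or just "method"
$last = array_pop($pieces);
// everything before the last slash is the path ('' if there is none)
$path = implode('/', $pieces);
$p2 = explode('::', $last);
// the method is always the last part
$method = array_pop($p2);
// the class is whatever is left, or '' if no class was given
$class = $p2 ? $p2[0] : '';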

Related

How to check a specific URL path with preg_match

How can I check whether a specific path matches a pattern?
Example:
I have a path with one or more unknown variables:
$pathPattern = 'user/?/stats';
And let's say I received this path:
$receivedPath = 'user/12/stats';
So, how can I check whether the received path matches my pattern?
I tried something like the code below, but it didn't work.
$pathPattern = 'user/?/stats';
$receivedPath = 'user/12/stats';
$pathPatternReg = str_replace('?','.*',$pathPattern);
echo preg_match('/$pathPatternReg/', $receivedPath);
Thank you.
For an unknown numeric segment, the regex should look something like this: user\/[0-9]+\/stats
It could be used like this (note that preg_match needs delimiters around the pattern):
if (preg_match('/user\/[0-9]+\/stats/', $variable)) { .... }
As stated by Tom, you possibly have to escape the '/' characters with a '\'.
You only want to capture one specific part of the path, the number in the middle. Most regex engines support this with round brackets (a capturing group), like this:
$pattern = '/user\/([0-9]+)\/stats/';
Notice the round brackets around [0-9]+: they tell preg_match to store this part of the match in the $matches array.
So, your code could look like this:
$subject = "user/12/stats";
$pattern = '/user\/([0-9]+)\/stats/';
$matches = array();
if (preg_match($pattern, $subject, $matches)) {
// there was a match
// The $matches array now looks like this:
// { "user/12/stats", "12" }
// { <whole matched string>, <string in first parenthesis>, .... }
$user_id = $matches[1];
...
}
(not tested)
See also here: https://secure.php.net/manual/en/function.preg-match.php
Thank you both.
This is a solution:
$pathPattern = 'user/?/stats';
$receivedPath = 'user/12/stats';
$pathPatternReg = str_replace(['/','?'],['\/','.*'],$pathPattern);
$pattern = "/^$pathPatternReg/";
echo preg_match($pattern, $receivedPath);
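A slightly stricter variant of the same idea, as a sketch only: the helper name path_matches is hypothetical. It escapes the pattern with preg_quote, lets each ? stand for exactly one path segment instead of .*, and anchors both ends so extra segments do not slip through.

```php
<?php
// Hypothetical helper: turns a pattern such as "user/?/stats" into an
// anchored regex and tests the received path against it.
function path_matches(string $pathPattern, string $receivedPath): bool
{
    // Escape every regex metacharacter, including "?" and the "#" delimiter.
    $quoted = preg_quote($pathPattern, '#');
    // preg_quote() turned each "?" into "\?"; replace that with "one segment".
    $regex = '#^' . str_replace('\?', '[^/]+', $quoted) . '$#';
    return preg_match($regex, $receivedPath) === 1;
}

var_dump(path_matches('user/?/stats', 'user/12/stats'));   // bool(true)
var_dump(path_matches('user/?/stats', 'user/12/x/stats')); // bool(false)
```

Choosing # as the delimiter means the slashes in the path need no escaping.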

SPLIT URL in PHP

I have the URL below in my code and I want to split it and get the number from it.
For example, from the URL below I need to fetch 123456:
https://review-test.com/#/c/123456/
I have tried this, but it is not working:
$completeURL = https://review-test.com/#/c/123456/ ;
list($url, $number) = explode('#c', preg_replace('/^.*\/+/', '', $completeURL));
Use parse_url
It's specifically made for this sort of thing.
You can do this without using regex also -
$completeURL = 'https://review-test.com/#/c/123456/' ;
list($url, $number) = explode('#c', str_replace('/', '', $completeURL));
echo $number;
If you want to get the /c/123456/ params, you will need to execute the following:
$url = 'https://review-test.com/#/c/123456/';
$url_fragment = parse_url($url, PHP_URL_FRAGMENT);
$fragments = explode('/', $url_fragment);
$fragments = array_filter(array_map('trim', $fragments));
$fragments = array_values($fragments);
PHP_URL_FRAGMENT makes parse_url return the component of the URL after the #.
After parse_url you will end up with a string like this: '/c/123456/'
explode('/', $url_fragment) returns an array with empty elements where the leading and trailing '/' were.
array_map with trim removes excess whitespace from each element. It does not matter in this case, but in a real-world scenario you had better trim.
array_filter then removes the empty elements.
If you var_dump the result at this point, you can see that the array needs to be reindexed, which is what array_values($fragments) does.
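To go from those steps to the number itself, something along these lines might work. The function name fragment_number is invented for this sketch, and it assumes the number is always the last non-empty segment of the fragment.

```php
<?php
// Hypothetical helper: returns the last non-empty segment of the URL
// fragment, or null when there is no fragment or it is not numeric.
function fragment_number(string $url): ?string
{
    $fragment = parse_url($url, PHP_URL_FRAGMENT); // "/c/123456/" for the sample URL
    if (!is_string($fragment)) {
        return null; // no fragment, or the URL could not be parsed
    }
    $segments = array_values(array_filter(
        array_map('trim', explode('/', $fragment)),
        function (string $s): bool { return $s !== ''; }
    ));
    $last = end($segments); // false when there are no segments
    return ($last !== false && preg_match('/^\d+$/', $last) === 1) ? $last : null;
}

echo fragment_number('https://review-test.com/#/c/123456/'), "\n"; // 123456
```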
You should try this: basename
basename — Returns trailing name component of path
<?php
echo basename("https://review-test.com/#/c/123456/");
?>
Demo : http://codepad.org/9Ah83qaP
Alternatively, you can use a plain regex to fetch the number directly from the string:
preg_match('!\d+!', "https://review-test.com/#/c/123456/", $matches);
print_r($matches);
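Simply:
$parts = explode('/', rtrim($completeURL, '/'));
$tmp = end($parts);
This splits the string on '/' and takes the last element; the rtrim is needed because the URL ends with a '/', which would otherwise leave an empty last element.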
If you have no other option than regex, for your example data you could use preg_match to split your url instead of preg_replace.
An approach could be to:
Capture the first part as a group (.+\/)
Then capture your number as a group (\d+)
Followed by a forward slash at the end of the line \/$
This will take the last number from the URL that is followed by a trailing forward slash.
Then you could use list and skip the first item of the $matches array because that will contain the text that matched the full pattern.
$completeURL = "https://review-test.com/#/c/123456/";
preg_match('/(.+\/)(\d+)\/$/', $completeURL, $matches);
list(, $url, $number) = $matches;

PHP: preg_replace() to get "parent" component of NameSpace

How can I use the preg_replace() replace function to only return the parent "component" of a PHP NameSpace?
Basically:
Input: \Base\Ent\User; Desired Output: Ent
I've been doing this using substr() but I want to convert it to regex.
Note: Can this be done without preg_match_all()?
Right now, I also have a code to get all parent components:
$s = '\\Base\\Ent\\User';
print preg_replace('~\\\\[^\\\\]*$~', '', $s);
//=> \Base\Ent
But I only want to return Ent.
Thank you!
As Rocket Hazmat says, explode is almost certainly going to be better here than a regex. I would be surprised if it's actually slower than a regex.
But, since you asked, here's a regex solution:
$path = '\Base\Ent\User';
$search = preg_match('~([^\\\\]+)\\\\[^\\\\]+$~', $path, $matches);
if($search) {
$parent = $matches[1];
}
else {
$parent = ''; // handles the case where the path is just, e.g., "User"
}
echo $parent; // echos Ent
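For comparison, the explode approach mentioned at the start of this answer could look roughly like this. The function name parent_component is made up for the example.

```php
<?php
// Hypothetical helper: returns the second-to-last namespace component,
// or '' when the name has no parent (e.g. just "User").
function parent_component(string $namespace): string
{
    // Drop leading/trailing backslashes, then split: ["Base", "Ent", "User"]
    $parts = explode('\\', trim($namespace, '\\'));
    $count = count($parts);
    return $count >= 2 ? $parts[$count - 2] : '';
}

echo parent_component('\Base\Ent\User'), "\n"; // Ent
```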
I think maybe preg_match might be a better choice for this.
$s = '\\Base\\Ent\\User';
$m = [];
preg_match('/([^\\\\]*)\\\\[^\\\\]*$/', $s, $m);
print $m[1];
If you read the regular expression backwards, from the $, it says to match many things that aren't backslashes, then a backslash, then many things that aren't backslashes, and save that match for later (in $m).
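How about
$path = '\Base\Ent\User';
// cut off the last component, then take what follows the last backslash
$section = substr(strrchr(substr($path, 0, strrpos($path, "\\")), "\\"), 1);
Or
$path = '\Base\Ent\User';
// skip past the second backslash, then take what precedes the next one
// (this only works when the parent is the second component, as here)
$section = strstr(substr($path, strpos($path, "\\", 1) + 1), "\\", true);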

Function to remove GET variable with php

I have this URI:
http://localhost/index.php?properties&status=av&page=1
I am fetching the basename of the URI using the following code:
$basename = basename($_SERVER['REQUEST_URI']);
The above code gives me the following string:
index.php?properties&status=av&page=1
I want to remove the last variable from the string, i.e. &page=1. Please note that the value of page will not always be 1. With that in mind, I would like to trim the variable this way:
Trim from the end of the string back to the first delimiter encountered, i.e. &
Update:
I would like to remove &page=1 from the string, no matter which position it is in.
How do I do this?
Instead of hacking around with regular expressions, you should parse the string as a URL (which is what it is):
$string = 'index.php?properties&status=av&page=1';
$parts = parse_url($string);
$queryParams = array();
parse_str($parts['query'], $queryParams);
Now just remove the parameter
unset($queryParams['page']);
and rebuild the url
$queryString = http_build_query($queryParams);
$url = $parts['path'] . '?' . $queryString;
There are many roads that lead to Rome. I'd do it with a RegEx:
$myString = 'index.php?properties&status=av&page=1';
$myNewString = preg_replace("/\&[a-z0-9]+=[0-9]+$/i","",$myString);
if you only want the &page=1-type parameters, the last line would be
$myNewString = preg_replace("/\&page=[0-9]+/i","",$myString);
if you also want to get rid of the possibility that page is the only or first parameter:
$myNewString = preg_replace("/[\&]*page=[0-9]+/i","",$myString);
Thank you, guys, but I think I have found a better solution. @KingCrunch suggested a solution, which I extended and converted into a function. The function below can remove (unset) any URI variable without any regex hacks. I am posting it as it might help someone.
function unset_uri_var($variable, $uri) {
$parseUri = parse_url($uri);
$arrayUri = array();
parse_str($parseUri['query'], $arrayUri);
unset($arrayUri[$variable]);
$newUri = http_build_query($arrayUri);
$newUri = $parseUri['path'].'?'.$newUri;
return $newUri;
}
now consider the following uri
index.php?properties&status=av&page=1
//To remove properties variable
$url = unset_uri_var('properties', basename($_SERVER['REQUEST_URI']));
//Outputs index.php?status=av&page=1
//To remove page variable
$url = unset_uri_var('page', basename($_SERVER['REQUEST_URI']));
//Outputs index.php?properties=&status=av
Hope this helps someone, and thank you @KingCrunch for your solution :)
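One possible refinement of the function above, offered as a sketch only: it tolerates URIs without a query string and drops the trailing ? when no parameters remain. The name remove_query_param is hypothetical, and like the original it assumes a relative URI, so any scheme or host would be discarded.

```php
<?php
// Hypothetical variant of unset_uri_var(): same parse/unset/rebuild idea,
// but safe for URIs without a query string.
function remove_query_param(string $uri, string $name): string
{
    $parts = parse_url($uri);
    $params = [];
    parse_str($parts['query'] ?? '', $params); // '' when there is no query
    unset($params[$name]);
    $path = $parts['path'] ?? '';
    $query = http_build_query($params);
    // Only append "?" when something is left in the query string.
    return $query === '' ? $path : $path . '?' . $query;
}

echo remove_query_param('index.php?properties&status=av&page=1', 'page'), "\n";
// index.php?properties=&status=av
```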
$pos = strrpos($_SERVER['REQUEST_URI'], '&');
$url = substr($_SERVER['REQUEST_URI'], 0, $pos);
Documentation for strrpos.
A regex that matches the page parameter wherever it appears: /(&|(?<=\?))page=.*?(?=&|$)/. Here's example code:
$regex = '/(&|(?<=\?))page=.*?(?=&|$)/';
$urls = array(
'index.php?properties&status=av&page=1',
'index.php?properties&page=1&status=av',
'index.php?page=1',
);
foreach($urls as $url) {
echo preg_replace($regex, '', $url), "\n";
}
Output:
index.php?properties&status=av
index.php?properties&status=av
index.php?
Regex explanation:
(&|(?<=\?)) -- either match a & or a ?, but if it's a ?, don't put it in the match and just ignore it (you don't want urls like index.php&status=av)
page=.*? -- matches page=[...]
(?=&|$) -- look for a & or the end of the string ($), but don't include them for the replacement (this group helps the previous one find out exactly where to stop matching)
You could use a regex (as Chris suggests), but it's not the most efficient solution (lots of overhead using that engine). It's easy to do with some string parsing:
<?php
//$url="http://localhost/index.php?properties&status=av&page=1";
$base=basename($_SERVER['REQUEST_URI']);
echo "Basename yields: $base<br />";
//Find the last ampersand
$lastAmp=strrpos($base,"&");
//Filter, catch no ampersands found
$removeLast=($lastAmp===false?$base:substr($base,0,$lastAmp));
echo "Without Last Parameter: $removeLast<br />";
?>
The trick is, can you guarantee that $page will be stuck on the end? If it is - great, if it isn't... what you asked for may not always solve the problem.

Compare strings and extract variables?

Could someone tell me how I would do this. I have 3 strings.
$route = '/user/$1/profile/$2';
$path = '/user/profile/$1/$2';
$url = '/user/jason/profile/friends';
What I need to do is check to see if the url conforms to the route. I am trying to do this as follows.
$route_segments = explode('/', $route);
$url_segments = explode('/', $url);
$count = count($url_segments);
for($i=0; $i < $count; $i++) {
if ($route_segments[$i] != $url_segments[$i] && ! preg_match('/\$[0-9]/', $route_segments[$i])) {
return false;
}
}
I assume the regex works, it's the first I have ever written by myself. :D
This is where I am stuck. How do I compare the following strings:
$route = '/user/$1/profile/$2';
$url = '/user/jason/profile/friends';
So I end up with:
array (
'$1' => 'jason',
'$2' => 'friends'
);
I assume that with this array I could then str_replace these values into the $path variable?
$route_segments = explode('/',$route);
$url_segments = explode('/',$url);
$combined_segments = array_combine($route_segments,$url_segments);
Untested and not sure how it reacts with unequal array lengths, but that's probably what you're looking for regarding an element-to-element match. Then you can pretty much iterate the array and look for $ and use the other value to replace it.
EDIT
/^\x24[0-9]+$/
You were close on the regex, except that you need to escape the $, because in a regex it is the anchor for end of string (hence the \x24). [0-9]+ matches one or more digits. The ^ anchors the match to the beginning of the string and, as explained, the $ anchors it to the end. This ensures the segment is always a dollar sign followed by a number.
(actually, netcoder has a nice solution)
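For the segment-by-segment idea described in this answer, a rough sketch might look like the following. The function name extract_route_vars is invented here, and it walks the two segment lists in parallel instead of calling array_combine, so that a length mismatch can be rejected up front.

```php
<?php
// Hypothetical helper: compares route and URL segment by segment and
// collects the values for "$n" placeholders; null means "no match".
function extract_route_vars(string $route, string $url): ?array
{
    $routeSegments = explode('/', $route);
    $urlSegments = explode('/', $url);
    if (count($routeSegments) !== count($urlSegments)) {
        return null; // different number of segments can never match
    }
    $vars = [];
    foreach ($routeSegments as $i => $segment) {
        if (preg_match('/^\$[0-9]+$/', $segment)) {
            $vars[$segment] = $urlSegments[$i]; // placeholder: remember the value
        } elseif ($segment !== $urlSegments[$i]) {
            return null; // literal segment differs
        }
    }
    return $vars;
}

$vars = extract_route_vars('/user/$1/profile/$2', '/user/jason/profile/friends');
// $vars is ['$1' => 'jason', '$2' => 'friends']
echo strtr('/user/profile/$1/$2', $vars), "\n"; // /user/profile/jason/friends
```

strtr with an array replaces the longest keys first, which keeps $1 from clobbering part of $10.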
I did something similar in a small framework of my own.
My solution was to transform the template URL: /user/$1/profile/$2
into a regexp capable of parsing parameters: ^\/user\/([^/]*)/profile/([^/]*)\/$
I then check if the regexp matches or not.
You can have a look at my controller code if you need to.
You could do this:
$route = '/user/$1/profile/$2';
$path = '/user/profile/$1/$2';
$url = '/user/jason/profile/friends';
$regex_route = '#'.preg_replace('/\$[0-9]+/', '([^/]*)', $route).'#';
if (preg_match($regex_route, $url, $matches)) {
$real_path = $path;
for ($i=1; $i<count($matches); $i++) {
$real_path = str_replace('$'.$i, $matches[$i], $real_path);
}
echo $real_path; // outputs /user/profile/jason/friends
} else {
// route does not match
}
You could replace any occurrence of $n by a named group with the same number (?P<_n>[^/]+) and then use it as pattern for preg_match:
$route = '/user/$1/profile/$2';
$path = '/user/profile/$1/$2';
$url = '/user/jason/profile/friends';
$pattern = '~^' . preg_replace('/\\\\\$([1-9]\d*)/', '(?P<_$1>[^/]+)', preg_quote($route, '~')) . '$~';
if (preg_match($pattern, $url, $match)) {
var_dump($match);
}
This prints in this case:
array(5) {
[0]=>
string(27) "/user/jason/profile/friends"
["_1"]=>
string(5) "jason"
[1]=>
string(5) "jason"
["_2"]=>
string(7) "friends"
[2]=>
string(7) "friends"
}
Using a regular expression allows you to use the wildcards at any position in the path and not just as a separate path segment (e.g. /~$1/ for /~jason/ would work too). And named subpatterns allow you to use an arbitrary order (e.g. /$2/$1/ works as well).
And for a quick fail you can additionally use the atomic grouping syntax (?>…).
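To finish the job the question asked for, the named groups can be substituted back into $path. This is only a sketch built on the pattern above; the use of preg_replace_callback and the variable $realPath are additions for illustration.

```php
<?php
$route = '/user/$1/profile/$2';
$path  = '/user/profile/$1/$2';
$url   = '/user/jason/profile/friends';

// Same pattern construction as above: each $n becomes a named group _n.
$pattern = '~^' . preg_replace('/\\\\\$([1-9]\d*)/', '(?P<_$1>[^/]+)', preg_quote($route, '~')) . '$~';

$realPath = null; // stays null when the route does not match
if (preg_match($pattern, $url, $match)) {
    // Replace every $n in $path with the value captured in group _n.
    $realPath = preg_replace_callback(
        '/\$([1-9]\d*)/',
        function (array $m) use ($match) {
            return $match['_' . $m[1]] ?? $m[0]; // leave unknown placeholders as-is
        },
        $path
    );
}
echo $realPath, "\n"; // /user/profile/jason/friends
```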
