Compare strings and extract variables?

Could someone tell me how I would do this. I have 3 strings.
$route = '/user/$1/profile/$2';
$path = '/user/profile/$1/$2';
$url = '/user/jason/profile/friends';
What I need to do is check to see if the url conforms to the route. I am trying to do this as follows.
$route_segments = explode('/', $route);
$url_segments = explode('/', $url);
$count = count($url_segments);
for($i=0; $i < $count; $i++) {
if ($route_segments[$i] != $url_segments[$i] && ! preg_match('/\$[0-9]/', $route_segments[$i])) {
return false;
}
}
I assume the regex works, it's the first I have ever written by myself. :D
This is where I am stuck. How do I compare the following strings:
$route = '/user/$1/profile/$2';
$url = '/user/jason/profile/friends';
So I end up with:
array (
'$1' => 'jason',
'$2' => 'friends'
);
I assume that with this array I could then str_replace these values into the $path variable?

$route_segments = explode('/',$route);
$url_segments = explode('/',$url);
$combined_segments = array_combine($route_segments,$url_segments);
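Untested. Note that array_combine() requires both arrays to have the same number of elements (PHP 8 throws a ValueError, earlier versions return false with a warning), so compare the counts first. That gives you the element-to-element match you're looking for. You can then iterate the array, look for keys starting with $, and use the corresponding value as the replacement.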
EDIT
/^\x24[0-9]+$/
Close on the regex. The $ has to be escaped because an unescaped $ anchors the match to the end of the string; \x24 is one way to write a literal dollar sign (\$ works too). The [0-9]+ matches one or more digits. The ^ anchors the match to the beginning of the string and, as explained, the trailing $ anchors it to the end. This will ensure the segment is always a dollar sign followed by a number and nothing else.
(actually, netcoder has a nice solution)
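For the segment-by-segment idea, here is a rough sketch of what the iteration could look like. The function name match_route_segments is made up for illustration, and it assumes a placeholder always fills a whole path segment.

```php
<?php
// Hypothetical helper: returns ['$1' => ..., '$2' => ...],
// or null when the URL does not fit the route.
function match_route_segments(string $route, string $url): ?array
{
    $route_segments = explode('/', $route);
    $url_segments = explode('/', $url);

    // Different segment counts can never match.
    if (count($route_segments) !== count($url_segments)) {
        return null;
    }

    $vars = [];
    foreach ($route_segments as $i => $segment) {
        if (preg_match('/^\$[0-9]+$/', $segment)) {
            // Placeholder segment: remember what the URL has in this position.
            $vars[$segment] = $url_segments[$i];
        } elseif ($segment !== $url_segments[$i]) {
            // Literal segment that differs: no match.
            return null;
        }
    }
    return $vars;
}

$vars = match_route_segments('/user/$1/profile/$2', '/user/jason/profile/friends');
if ($vars !== null) {
    // strtr() swaps all placeholders in one pass, longest keys first.
    echo strtr('/user/profile/$1/$2', $vars), "\n"; // /user/profile/jason/friends
}
```

strtr() is used instead of repeated str_replace() calls so that a placeholder such as $10 is not partly replaced by the value for $1.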

I did something similar in a small framework of my own.
My solution was to transform the template URL: /user/$1/profile/$2
into a regexp capable of parsing parameters: ^\/user\/([^/]*)/profile/([^/]*)\/$
I then check if the regexp matches or not.
You can have a look at my controller code if you need to.

You could do this:
$route = '/user/$1/profile/$2';
$path = '/user/profile/$1/$2';
$url = '/user/jason/profile/friends';
$regex_route = '#^'.preg_replace('/\$[0-9]+/', '([^/]*)', $route).'$#'; // anchored so the whole URL has to match
if (preg_match($regex_route, $url, $matches)) {
$real_path = $path;
for ($i=1; $i<count($matches); $i++) {
$real_path = str_replace('$'.$i, $matches[$i], $real_path);
}
echo $real_path; // outputs /user/profile/jason/friends
} else {
// route does not match
}

You could replace any occurrence of $n by a named group with the same number (?P<_n>[^/]+) and then use it as pattern for preg_match:
$route = '/user/$1/profile/$2';
$path = '/user/profile/$1/$2';
$url = '/user/jason/profile/friends';
$pattern = '~^' . preg_replace('/\\\\\$([1-9]\d*)/', '(?P<_$1>[^/]+)', preg_quote($route, '~')) . '$~';
if (preg_match($pattern, $url, $match)) {
var_dump($match);
}
This prints in this case:
array(5) {
[0]=>
string(28) "/user/jason/profile/friends"
["_1"]=>
string(5) "jason"
[1]=>
string(5) "jason"
["_2"]=>
string(7) "friends"
[2]=>
string(7) "friends"
}
Using a regular expression allows you to use the wildcards at any position in the path and not just as a separate path segment (e.g. /~$1/ for /~jason/ would work too). And named subpatterns allows you to use an arbitrary order (e.g. /$2/$1/ works as well).
And for a quick fail you can additionally use the atomic grouping syntax (?>…).
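To go from the named groups to the final path, something along these lines might work. It is only a sketch: it reuses the $pattern from above and assumes every $n in $path has a matching _n group; anything else is left untouched.

```php
<?php
$route = '/user/$1/profile/$2';
$path = '/user/profile/$1/$2';
$url = '/user/jason/profile/friends';

$pattern = '~^' . preg_replace('/\\\\\$([1-9]\d*)/', '(?P<_$1>[^/]+)', preg_quote($route, '~')) . '$~';

$real_path = null;
if (preg_match($pattern, $url, $match)) {
    // Replace each $n in the path with the value captured under the name _n.
    $real_path = preg_replace_callback(
        '/\$([1-9]\d*)/',
        function (array $m) use ($match) {
            // Leave placeholders without a captured value as they are.
            return $match['_' . $m[1]] ?? $m[0];
        },
        $path
    );
}
echo $real_path, "\n"; // /user/profile/jason/friends
```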

Related

extract part of a string before and after characters

I need to extract a variable based on a portion of a string. The string corresponds to a third-level domain name, as in the example below.
$variable1 = "subdomain1.domain24.com";
$variable2 = "subdomain2.newdomain24.com";
I need to extract the domain name only, excluding the subdomain, the TLD, and the number 24. All domains end with "24.com"
So the result must be:
for variable1 : domain
for variable2 : newdomain

Regular expressions are one way to do this kind of task:
the domain must follow a dot, so the pattern starts with \.
24\.com$ says to match 24.com at the end of the string
https://www.php.net/manual/en/function.preg-match.php
preg_match('/\.(?<domain>[^\.]+)24\.com$/', 'subdomain2.newdomain24.com', $matches );
var_dump($matches);
// array(3) {
// [0]=> string(16) ".newdomain24.com"
// ["domain"]=> string(9) "newdomain"
// [1]=> string(9) "newdomain"
// }

Explode your string on . and remove the last 2 characters (as they are always 24):
$urls = [
"subdomain1.domain24.com",
"subdomain2.newdomain24.com",
];
foreach ($urls as $url) {
$parts = explode('.', $url);
$domain = substr($parts[1], 0, -2);
var_dump($domain);
}
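If the hosts can have more than one subdomain level, indexing from the end may be safer than $parts[1]. A minimal sketch, with a made-up function name base_domain and the assumption that anything not ending in "24.com" should be rejected:

```php
<?php
// Hypothetical helper: returns the domain label without the trailing "24",
// or null if the host does not end in "24.com".
function base_domain(string $host): ?string
{
    $parts = explode('.', $host);
    if (count($parts) < 2 || end($parts) !== 'com') {
        return null;
    }
    // The label just before the TLD, however many subdomains precede it.
    $label = $parts[count($parts) - 2];
    if (substr($label, -2) !== '24') {
        return null;
    }
    return substr($label, 0, -2);
}

var_dump(base_domain('subdomain1.domain24.com'));    // string(6) "domain"
var_dump(base_domain('subdomain2.newdomain24.com')); // string(9) "newdomain"
```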

php regex to extract single parameter value from string

I'm working with a string containing parameters, separated by some special characters in PHP with preg_match
An example could be like this one, which has four parameters.
1stparm?#?1111?#?2ndParm?#?2222?#?3rdParm?#?3333?#?4thparm?#?444?#?
Each parameter name is followed by ?#?, and its value is right next to it, ending with ?#? (note: values can be strings or numbers, and even special characters)
I've probably overcomplicated my regex, which works in SOME cases, but not if I search for the last parameter in the string..
This example returns 2222 as the correct value (in group 1) for 2ndParm
(?:.*)2ndParm\?#\?(.*?)\?#\?(?=.)(.*)
but it fails if 2ndParm is the last one in the string as in the following example:
1stparm?#?1111?#?2ndParm?#?2222?#?
I'd also appreciate help in returning just one group with my result. I haven't been able to do so, but since the value I'm interested in is always in group 1, I can get it easily anyway.

Without regex:
$str ='1stparm?#?1111?#?2ndParm?#?2222?#?3rdParm?#?3333?#?4thparm?#?444?#?';
// flat list: key, value, key, value, ...
$keyval = explode('?#?', trim($str, '?#'));
$result = [];
// walk the list two items at a time
foreach (array_chunk($keyval, 2) as [$key, $value]) {
    $result[$key] = $value;
}
print_r($result);

You don't need to use a regex for everything, and you should have a serious talk with whoever invented this horrid format about the fact that JSON, YAML, TOML, XML, etc exist.
function bizarre_unserialize($in) {
    // flat list: key, value, key, value, ...
    $tmp = explode('?#?', $in);
    // drop the empty string left behind by the trailing delimiter
    if (end($tmp) === '') {
        array_pop($tmp);
    }
    // group the flat list into [key, value] pairs
    $tmp = array_chunk($tmp, 2);
    // rearrange to key-value
    return array_combine(array_column($tmp, 0), array_column($tmp, 1));
}
$input = '1stparm?#?1111?#?2ndParm?#?2222?#?3rdParm?#?3333?#?4thparm?#?444?#?';
var_dump(
bizarre_unserialize($input)
);
Output:
array(4) {
["1stparm"]=>
string(4) "1111"
["2ndParm"]=>
string(4) "2222"
["3rdParm"]=>
string(4) "3333"
["4thparm"]=>
string(3) "444"
}

You can use
(?P<key>.+?)
\Q?#?\E
(?P<value>.+?)
\Q?#?\E
in verbose mode, see a demo on regex101.com.
The \Q...\E construct disables the ? and # "super-powers" (no need to escape them here).
In PHP this could be
<?php
$string = "1stparm?#?1111?#?2ndParm?#?2222?#?3rdParm?#?3333?#?4thparm?#?444?#?";
$regex = "~(?P<key>.+?)\Q?#?\E(?P<value>.+?)\Q?#?\E~";
preg_match_all($regex, $string, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
echo $match["key"] . " = " . $match["value"] . "\n";
}
?>
Which yields
1stparm = 1111
2ndParm = 2222
3rdParm = 3333
4thparm = 444
Or shorter:
$result = array_map(
function($x) {return array($x["key"] => $x["value"]);}, $matches);
print_r($result);
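Note that the array_map() version gives a list of one-element arrays. If a single key => value map is wanted (so one parameter can be looked up directly, even the last one in the string), array_column() with its third argument is one option. A sketch using the same pattern:

```php
<?php
$string = "1stparm?#?1111?#?2ndParm?#?2222?#?3rdParm?#?3333?#?4thparm?#?444?#?";
$regex = '~(?P<key>.+?)\Q?#?\E(?P<value>.+?)\Q?#?\E~';
preg_match_all($regex, $string, $matches, PREG_SET_ORDER);

// Second argument: column holding the values; third: column to use as keys.
$result = array_column($matches, 'value', 'key');
print_r($result);

echo $result['4thparm'], "\n"; // 444
```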

Remove only 3 characters after a specific string

Hi I have this string for example
http://aaa-aaaa.com/bbbb-bbbbbbbbbb-2/it/clients/
I want to remove only the 3 characters after bbbb-bbbbbbbbbb-2/, so basically I want to remove the it/ part. (This part may not always be it/; it can be es/, en/, or another language, but it is always 2 characters plus the slash.)

The following will work provided the URL structure doesn't change. I assume you want to remove the language part of the URL.
<?php
$url = "http://aaa-aaaa.com/bbbb-bbbbbbbbbb-2/it/clients/";
$parsedURL = parse_url($url);
$path = explode('/', $parsedURL['path']);
unset($path[2]);
$url = "{$parsedURL['scheme']}://{$parsedURL['host']}";
$url .= implode('/', $path);
var_dump($url);
// string(47) "http://aaa-aaaa.com/bbbb-bbbbbbbbbb-2/clients/"

You can use a regex to select the target part of the string and pass it to preg_replace().
$url = "http://aaa-aaaa.com/bbbb-bbbbbbbbbb-2/it/clients/";
echo preg_replace("#(.*)\w{2}/([^/]+/)$#", "$1$2", $url);

Define your languages in an array. If none of the languages in the array appears in the URL, the URL is returned unchanged.
$mLanguages = ["en","it","bg","gr"];
$mURL = "http://aaa-aaaa.com/bbbb-bbbbbbbbbb-2/it/clients/";
$mURL = removeLanguageFromURL($mURL, $mLanguages);
echo $mURL; // Output http://aaa-aaaa.com/bbbb-bbbbbbbbbb-2/clients/
function removeLanguageFromURL($mURL, $mLanguages){
    foreach($mLanguages as $language){ // Search languages
        // Match the language only as a whole path segment, so "/it/" is removed but "/items/" is not
        $mURL = str_replace('/' . $language . '/', '/', $mURL);
    }
    return $mURL;
}
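The same whitelist idea can also be written as a single preg_replace() call. This is only a sketch: strip_language_segment is a made-up name, and it assumes the language code always sits between two slashes and that only the first occurrence should go.

```php
<?php
// Hypothetical helper: removes the first path segment that is exactly
// one of the given language codes.
function strip_language_segment(string $url, array $languages): string
{
    if ($languages === []) {
        return $url; // nothing to look for
    }
    // Quote each code so it is treated literally inside the pattern.
    $codes = implode('|', array_map(function ($code) {
        return preg_quote($code, '#');
    }, $languages));
    // The lookahead keeps the following slash, so "/it/" becomes "/".
    return preg_replace('#/(?:' . $codes . ')(?=/)#', '', $url, 1);
}

echo strip_language_segment(
    'http://aaa-aaaa.com/bbbb-bbbbbbbbbb-2/it/clients/',
    ['en', 'it', 'bg', 'gr']
), "\n"; // http://aaa-aaaa.com/bbbb-bbbbbbbbbb-2/clients/
```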

Parsing string - with regex or something similar?

I'm writing a routing class and need help. I need to parse the $controller variable and assign parts of that string to other variables. Here are examples of $controller:
$controller = "admin/package/AdminClass::display";
//$path = "admin/package";
//$class = "AdminClass";
//$method = "display";
$controller = "AdminClass::display";
//$path = "";
//$class = "AdminClass";
//$method = "display";
$controller = "display";
//$path = "";
//$class = "";
//$method = "display";
These three situations are all I need to handle. Yes, I could write a long procedure for them, but what I want is a simple regex solution using preg_match_all.
Any suggestion how to do this?

The following regex should accomplish this for you, you can then save the captured groups to $path, $class, and $method.
(?:(.+)/)?(?:(.+)::)?(.+)
Here is a Rubular:
http://www.rubular.com/r/1vPIhwPUub
Your php code might look something like this:
$regex = '/(?:(.+)\/)?(?:(.+)::)?(.+)/';
preg_match($regex, $controller, $matches);
$path = $matches[1];
$class = $matches[2];
$method = $matches[3];

This supposes that path elements, the class name, and the method name can only contain letters.
The full regex is the following:
^(?:((?:[a-zA-Z]+/)*)([a-zA-Z]+)::)?([a-zA-Z]+)$
Two non capturing groups: the first one which makes all the path and class optional, the second which avoids the capture of individual path elements.
Explanation:
a path element is one or more letters followed by a /: [a-zA-Z]+/;
there may be zero or more of them: we must apply the * quantifier to the above; but the regex is not an atom, we therefore need a group. As we do not want to capture individual path elements, we use a non capturing group: (?:[a-zA-Z]+/)*;
we want to capture the full path if it is there, we must use a capturing group over this ((?:[a-zA-Z]+/)*);
the class name is one or more letters, and we want to capture it: ([a-zA-Z]+);
if present, it follows the path, and is followed by two colons: ((?:[a-zA-Z]+/)*)([a-zA-Z]+)::;
but all this is optional: we must therefore put a group around all this, which again we do not want to capture: (?:((?:[a-zA-Z]+/)*)([a-zA-Z]+)::)?;
finally, it is followed by a method name, which is NOT optional this time, and which we want to capture: (?:((?:[a-zA-Z]+/)*)([a-zA-Z]+)::)?([a-zA-Z]+);
and we want this to match the whole line: we need to anchor it both at the beginning and at the end, which gives the final result: ^(?:((?:[a-zA-Z]+/)*)([a-zA-Z]+)::)?([a-zA-Z]+)$
Phew.
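In PHP, the full regex could be used roughly like this. The wrapper parse_controller is a made-up name; the only thing added to the regex is the ~ delimiter, plus an rtrim() because the captured path includes its trailing slash.

```php
<?php
// Hypothetical wrapper around the full regex above.
// Returns null when the string does not match at all.
function parse_controller(string $controller): ?array
{
    $regex = '~^(?:((?:[a-zA-Z]+/)*)([a-zA-Z]+)::)?([a-zA-Z]+)$~';
    if (!preg_match($regex, $controller, $m)) {
        return null;
    }
    return [
        'path'   => rtrim($m[1], '/'), // group 1 ends with a slash when present
        'class'  => $m[2],
        'method' => $m[3],
    ];
}

print_r(parse_controller('admin/package/AdminClass::display'));
print_r(parse_controller('AdminClass::display'));
print_r(parse_controller('display'));
```

When the optional part does not take part in the match, preg_match() still fills groups 1 and 2 with empty strings, because group 3 after them did match.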

$pieces = explode('/', $controller);
// the last piece holds "Class::method" (or just "method")
$last = array_pop($pieces);
// whatever precedes it is the path
$path = implode('/', $pieces);
$p2 = explode('::', $last);
// without "::" there is only a method name
$method = array_pop($p2);
$class = $p2 ? $p2[0] : '';

PHP: Best way to extract text within parenthesis?

What's the best/most efficient way to extract text set between parenthesis? Say I wanted to get the string "text" from the string "ignore everything except this (text)" in the most efficient manner possible.
So far, the best I've come up with is this:
$fullString = "ignore everything except this (text)";
$start = strpos('(', $fullString);
$end = strlen($fullString) - strpos(')', $fullString);
$shortString = substr($fullString, $start, $end);
Is there a better way to do this? I know in general using regex tends to be less efficient, but unless I can reduce the number of function calls, perhaps this would be the best approach? Thoughts?

I'd just do a regex and get it over with. Unless you are doing enough iterations that it becomes a huge performance issue, it's just easier to code (and to understand when you look back on it).
$text = 'ignore everything except this (text)';
preg_match('#\((.*?)\)#', $text, $match);
print $match[1];

So, actually, the code you posted doesn't work: substr()'s parameters are $string, $start and $length, and strpos()'s parameters are $haystack, $needle. Slightly modified:
$str = "ignore everything except this (text)";
$start = strpos($str, '(');
$end = strpos($str, ')', $start + 1);
$length = $end - $start;
$result = substr($str, $start + 1, $length - 1);
Some subtleties: I used $start + 1 in the offset parameter in order to help PHP out while doing the strpos() search on the second parenthesis; we increment $start one and reduce $length to exclude the parentheses from the match.
Also, there's no error checking in this code: you'll want to make sure $start and $end do not === false before performing the substr.
As for using strpos/substr versus regex; performance-wise, this code will beat a regular expression hands down. It's a little wordier though. I eat and breathe strpos/substr, so I don't mind this too much, but someone else may prefer the compactness of a regex.
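With the error checking mentioned above added, the strpos/substr version might look like this. The function name between_parens and the choice to return null on failure are my own assumptions.

```php
<?php
// Hypothetical helper: text inside the first "(...)" pair,
// or null when there is no such pair.
function between_parens(string $str): ?string
{
    $start = strpos($str, '(');
    if ($start === false) {
        return null; // no opening parenthesis
    }
    $end = strpos($str, ')', $start + 1);
    if ($end === false) {
        return null; // no closing parenthesis after the opening one
    }
    // Skip the "(" itself and stop just before the ")".
    return substr($str, $start + 1, $end - $start - 1);
}

var_dump(between_parens('ignore everything except this (text)')); // string(4) "text"
var_dump(between_parens('no parentheses here'));                  // NULL
```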

Use a regular expression:
if( preg_match( '!\(([^\)]+)\)!', $text, $match ) )
$text = $match[1];

I think this is the fastest way to get the words between the first pair of parentheses in a string.
$string = 'ignore everything except this (text)';
$string = explode(')', (explode('(', $string)[1]))[0];
echo $string;

The already posted regex solutions - \((.*?)\) and \(([^\)]+)\) - do not return the innermost strings between opening and closing parentheses. If the string is Text (abc(xyz 123) they both return (abc(xyz 123) as the whole match, and not (xyz 123).
The pattern that matches substrings (use with preg_match to fetch the first and preg_match_all to fetch all occurrences) in parentheses without other open and close parentheses in between is, if the match should include parentheses:
\([^()]*\)
Or, you want to get values without parentheses:
\(([^()]*)\) // get Group 1 values after a successful call to preg_match_all, see code below
\(\K[^()]*(?=\)) // this and the one below get the values without parentheses as whole matches
(?<=\()[^()]*(?=\)) // less efficient, not recommended
Replace * with + if there must be at least 1 char between ( and ).
Details:
\( - an opening round bracket (must be escaped to denote a literal parenthesis as it is used outside a character class)
[^()]* - zero or more characters other than ( and ) (note these ( and ) do not have to be escaped inside a character class as inside it, ( and ) cannot be used to specify a grouping and are treated as literal parentheses)
\) - a closing round bracket (must be escaped to denote a literal parenthesis as it is used outside a character class).
The \(\K part in an alternative regex matches ( and omits it from the match value (with the \K match reset operator). (?<=\() is a positive lookbehind that requires a ( to appear immediately to the left of the current location, but the ( is not added to the match value since lookbehind (lookaround) patterns are not consuming. (?=\)) is a positive lookahead that requires a ) char to appear immediately to the right of the current location.
PHP code:
$fullString = 'ignore everything except this (text) and (that (text here))';
if (preg_match_all('~\(([^()]*)\)~', $fullString, $matches)) {
print_r($matches[0]); // Get whole match values
print_r($matches[1]); // Get Group 1 values
}
Output:
Array ( [0] => (text) [1] => (text here) )
Array ( [0] => text [1] => text here )
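For comparison, a short sketch of the \K variant on the same input. Here the values come back as whole matches, so only $matches[0] is needed:

```php
<?php
$fullString = 'ignore everything except this (text) and (that (text here))';

// \K drops the "(" from the reported match; the lookahead checks for ")"
// without consuming it.
preg_match_all('~\(\K[^()]*(?=\))~', $fullString, $matches);
print_r($matches[0]);
```

This should print an array holding text and text here, the same values as Group 1 above.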

This is sample code to extract all the text between '[' and ']' and store it in 2 separate arrays (i.e. text inside the brackets in one array and text outside the brackets in another array):
function extract_text($string)
{
$text_outside=array();
$text_inside=array();
$t="";
for($i=0;$i<strlen($string);$i++)
{
if($string[$i]=='[')
{
$text_outside[]=$t;
$t="";
$t1="";
$i++;
while($i<strlen($string) && $string[$i]!=']') // stop at the end of the string if a '[' is never closed
{
$t1.=$string[$i];
$i++;
}
$text_inside[] = $t1;
}
else {
if($string[$i]!=']')
$t.=$string[$i];
else {
continue;
}
}
}
if($t!="")
$text_outside[]=$t;
var_dump($text_outside);
echo "\n\n";
var_dump($text_inside);
}
Output:
extract_text("hello how are you?");
will produce:
array(1) {
[0]=>
string(18) "hello how are you?"
}
array(0) {
}
extract_text("hello [http://www.google.com/test.mp3] how are you?");
will produce
array(2) {
[0]=>
string(6) "hello "
[1]=>
string(13) " how are you?"
}
array(1) {
[0]=>
string(30) "http://www.google.com/test.mp3"
}

This function may be useful.
// declared as a plain function here so the calls below work outside a class
function getStringBetween($str,$from,$to, $withFromAndTo = false)
{
$sub = substr($str, strpos($str,$from)+strlen($from),strlen($str));
if ($withFromAndTo)
return $from . substr($sub,0, strrpos($sub,$to)) . $to;
else
return substr($sub,0, strrpos($sub,$to));
}
$inputString = "ignore everything except this (text)";
$outputString = getStringBetween($inputString, '(', ')');
echo $outputString;
//output will be text
$outputString = getStringBetween($inputString, '(', ')', true);
echo $outputString;
//output will be (text)
strpos() => finds the position of the first occurrence of a substring in a string.
strrpos() => finds the position of the last occurrence of a substring in a string.

function getStringsBetween($str, $start='[', $end=']', $with_from_to=true){
$arr = [];
$last_pos = 0;
$last_pos = strpos($str, $start, $last_pos);
while ($last_pos !== false) {
$t = strpos($str, $end, $last_pos);
$arr[] = ($with_from_to ? $start : '').substr($str, $last_pos + 1, $t - $last_pos - 1).($with_from_to ? $end : '');
$last_pos = strpos($str, $start, $last_pos+1);
}
return $arr;
}
This is a small improvement on the previous answer that returns all matches in array form:
getStringsBetween('[T]his[] is [test] string [pattern]') will return:
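The return value is missing from the post. Tracing the function by hand with its default arguments suggests the result below; the snippet repeats the function unchanged so it can be run on its own to check.

```php
<?php
function getStringsBetween($str, $start='[', $end=']', $with_from_to=true){
    $arr = [];
    $last_pos = 0;
    $last_pos = strpos($str, $start, $last_pos);
    while ($last_pos !== false) {
        $t = strpos($str, $end, $last_pos);
        $arr[] = ($with_from_to ? $start : '').substr($str, $last_pos + 1, $t - $last_pos - 1).($with_from_to ? $end : '');
        $last_pos = strpos($str, $start, $last_pos+1);
    }
    return $arr;
}

// With the delimiters kept (the default):
print_r(getStringsBetween('[T]his[] is [test] string [pattern]'));
// Array ( [0] => [T] [1] => [] [2] => [test] [3] => [pattern] )

// Without the delimiters:
print_r(getStringsBetween('[T]his[] is [test] string [pattern]', '[', ']', false));
// Array ( [0] => T [1] => [2] => test [3] => pattern )
```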
