Parse URL with Regex using PHP - php

I have a url formatted like this:
http://www.example.com/detail/state-1/county-2/street-3
What I'd like to do is parse out the values for state (1), county (2), and street (3).
I could do this using a combination of substr(), strpos(), and a loop. I think a regex would be faster, but I'm not sure what the pattern should be.

$pieces = parse_url($input_url);
$path = trim($pieces['path'], '/');
$segments = explode('/', $path);
$key_to_val = array(); // initialise the result before filling it
foreach($segments as $segment) {
$keyval = explode('-', $segment);
if(count($keyval) == 2) {
$key_to_val[$keyval[0]] = $keyval[1];
}
}
/*
$key_to_val: Array (
[state] => 1,
[county] => 2,
[street] => 3
)
*/

Could just do this:
<?php
$url = "http://www.example.com/detail/state-1/county-2/street-3";
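// explode() yields '', 'detail', 'state-1', ...; skip the first two entries.
list( , , $state, $county, $street) = explode('/', parse_url($url, PHP_URL_PATH));
// $state is 'state-1', $county is 'county-2', $street is 'street-3'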
?>

if (preg_match_all('#([a-z0-9_]*)-(\\d+)#i', $url, $matches, PREG_SET_ORDER)) {
$matches = array(
array(
'state-1',
'state',
'1',
),
array(
'county-2',
'county',
'2',
),
array(
'street-3',
'street',
'3',
)
);
}
Note: that is the structure of the $matches array (what it would look like if you var_dump'd it).
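As a rough sketch of how those match sets could be turned into a key => value map: the helper name urlKeyValues is made up for this illustration, and the pattern is applied to the path only (an assumption, so that a hyphenated host name cannot produce a false match).

```php
<?php
// urlKeyValues is a hypothetical helper name, not part of the original answers.
function urlKeyValues($url)
{
    // Only look at the path, so a host such as "my-2.example.com" cannot match.
    $path = (string) parse_url($url, PHP_URL_PATH);
    $pairs = array();
    if (preg_match_all('#([a-z0-9_]+)-(\d+)#i', $path, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // $m[1] is the key ("state"), $m[2] the numeric value ("1").
            $pairs[$m[1]] = (int) $m[2];
        }
    }
    return $pairs;
}

$result = urlKeyValues('http://www.example.com/detail/state-1/county-2/street-3');
print_r($result);
```

Casting to int is a design choice here; drop the cast if the values should stay strings.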

This regex pattern should work:
$subject = 'http://www.example.com/detail/state-1/county-2/street-3';
$pattern = '#state-(\\d+)/county-(\\d+)/street-(\\d+)#'; // PCRE patterns need delimiters
preg_match($pattern, $subject, $matches);
// $matches[1], $matches[2], $matches[3] now hold the values 1, 2, 3 respectively
print_r($matches);

Related

PHP Regex to find first 3 match between slash

I have a string like this:
$url = '/controller/method/para1/para2/';
Expected output:
Array(
[0] => 'controller',
[1] => 'method',
[2] => array(
[0] => 'para1',
[1] => 'para2'
)
)
I am trying to build a regex to achieve this but not able to construct the pattern properly.
Please assist.
I tried to use explode function to split,
$split_url = explode('/',$url);
$controller = $split_url[1];
$method = $split_url[2];
unset($split_url[0]);
unset($split_url[1]);
unset($split_url[2]);
$para = $split_url;
But this is really not a great way of doing this and is prone to errors.
Without regex:
$url = '/controller/method/para1/para2/para3/';
$arr = explode('/', trim($url, '/'));
$result = array_slice($arr, 0, 2);
$result[] = array_slice($arr, 2);
print_r($result);
Note: if you need to always have parameters at the same index (even if there is no method or parameters), you can change $result[] = array_slice($arr, 2); to $result[2] = array_slice($arr, 2);
Here's a slightly nasty method using explode:
$url = '/controller/method/para1/para2/para3/';
# get rid of leading and trailing slashes
$url = trim($url, '/');
$arr = explode('/', $url);
$results = array( $arr[0], $arr[1], array_slice($arr, 2) );
print_r($results);
Output:
Array
(
[0] => controller
[1] => method
[2] => Array
(
[0] => para1
[1] => para2
[2] => para3
)
)
It will work for any number of para elements.
And just to show that regexes are not scary (they're lovely, fluffy, friendly things), here's a regex version:
preg_match_all("/\/(\w+)/", $url, $matches);
$arr = $matches[1];
$results = array( $arr[0], $arr[1], array_slice($arr, 2) );
It's actually very easy to match this URL -- just search for / followed by alphanumeric characters (\w+).
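One caveat worth sketching: \w does not match hyphens or dots, so a segment such as para-1 would be cut down to para. A possible variant, assuming segments may contain any character except a slash, matches [^/]+ instead. The function name splitRoute is hypothetical.

```php
<?php
// splitRoute is a hypothetical name used only for this illustration.
function splitRoute($url)
{
    // [^/]+ matches any run of non-slash characters, so hyphens and dots survive.
    preg_match_all('~[^/]+~', $url, $matches);
    $segments = $matches[0];

    return array(
        isset($segments[0]) ? $segments[0] : null, // controller
        isset($segments[1]) ? $segments[1] : null, // method
        array_slice($segments, 2),                 // remaining parameters
    );
}

print_r(splitRoute('/controller/method/para-1/para.2/'));
```

Returning null for a missing controller or method keeps the parameters at index 2 in every case.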
How about something like:
$url = '/controller/method/para1/para2/para3/';
$regex = '~^/([^/]+)/([^/]+)/(?:(.*)/)?$~';
if(preg_match($regex, $url, $matches)) {
$controller = $matches[1];
$method = $matches[2];
$parameters = explode('/', $matches[3]);
}
This will capture 3 segments separated by a leading/trailing /. The 3rd segment of parameters can then be split with explode(). To get the array exactly like in your question:
$array = array($controller, $method, $parameters);
// Array
// (
// [0] => controller
// [1] => method
// [2] => Array
// (
// [0] => para1
// [1] => para2
// [2] => para3
// )
// )
An alternate way of thinking about this is to actually parse your route to determine the controller and then pass the remaining route components off to the controller to determine what to do.
$url = '/controller/method/para1/para2/para3/';
$route_parts = explode('/', trim($url, '/')); // we don't need leading and trailing forward slashes
$controller_str = array_shift($route_parts);
$method_str = array_shift($route_parts);
// instantiate controller object by some means (a factory pattern shown here for demo purposes)
$controller = controllerFactory::getInstance($controller_str);
// set method on controller
$controller->setMethod($method_str);
// pass parameters to controller
$controller->setParams($route_parts);
// do whatever with controller
$controller->execute();

Why does php str_replace with multiple arrays give wrong result, but for loop gives correct result?

I'm trying to replace the characters (numbers and letters) in a string. When I try the "php" way, it gives the wrong result for some of the characters. Why?
PHP-WAY:
$find = array( "0","1","2","3","4","5","6","7","8","9","a","b","c","d","e","f" );
$replace = array( "a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p" );
$haystack = "a5c9a06bfacf5f12cf01ab3f202f6c78";
//This incorrectly returns: kpmjkkglpkmppplmmpklklnpmkmpgmhi
echo str_replace( $find, $replace, $haystack );
LOOP WAY:
$find = array( "0","1","2","3","4","5","6","7","8","9","a","b","c","d","e","f" );
$replace = array( "a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p" );
$haystack = "a5c9a06bfacf5f12cf01ab3f202f6c78";
//This correctly returns: kfmjkaglpkmpfpbcmpabkldpcacpgmhi
$newStr = "";
$chars = str_split( $haystack );
for ( $i = 0, $length = count( $chars ); $i < $length; $i++ )
{
$newStr .= $replace[ array_search( $chars[ $i ], $find ) ];
}
echo $newStr;
Why is the first one incorrect? Am I using it wrong?
The cause is the order of entries in your arrays. str_replace() processes each search/replace pair in turn over the whole string, so if a '1' gets replaced with 'b', that 'b' will subsequently be replaced with 'l'. Use strtr() rather than str_replace() if you want to prevent that behaviour.
$find = array( "0","1","2","3","4","5","6","7","8","9","a","b","c","d","e","f" );
$replace = array( "a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p" );
$haystack = "a5c9a06bfacf5f12cf01ab3f202f6c78" ;
echo strtr($haystack, array_combine($find, $replace));
Your own code only does a single replace because it's looping against your string, not against the from/to arrays.
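A two-entry example makes the cascade easier to see. This is only a minimal sketch; the variable names $cascaded and $singlePass are invented for the illustration.

```php
<?php
$find    = array('1', 'b');
$replace = array('b', 'l');

// str_replace applies the pairs one after another to the whole string:
// '1' becomes 'b', then every 'b' (including the new one) becomes 'l'.
$cascaded = str_replace($find, $replace, '1b');

// strtr scans the string once and never re-examines replaced text.
$singlePass = strtr('1b', array_combine($find, $replace));

echo $cascaded, "\n";   // ll
echo $singlePass, "\n"; // bl
```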
Just use strtr
$haystack = "a5c9a06bfacf5f12cf01ab3f202f6c78" ;
echo strtr($haystack, implode($find), implode($replace));
Or preg_replace_callback
$find = array_flip($find);
echo preg_replace_callback('/[a-f0-9]/', function ($v) use($replace, $find) {
return $replace[$find[$v[0]]];
}, $haystack);
Output
kfmjkaglpkmpfpbcmpabkldpcacpgmhi
As @MarkBaker pointed out, str_replace does not make a single pass through the string; it applies each search/replace pair in turn to the result of the previous one, so later pairs can rewrite earlier replacements. Instead, use strtr (which works like the Linux tr command):
$tr = array( "0" => "a","1" => "b","2" => "c","3" => "d","4" => "e","5" => "f","6" => "g","7" => "h","8" => "i","9" => "j","a" => "k","b" => "l","c" => "m","d" => "n","e" => "o","f" => "p" );
$haystack = "a5c9a06bfacf5f12cf01ab3f202f6c78";
echo strtr( $haystack, $tr );

Turn text inside brackets to an array PHP

If I have a string that looks like this:
$myString = "[sometext][moretext][993][112]This is a long text";
I want it to be turned into:
$string = "This is a long text";
$arrayDigits[0] = 993;
$arrayDigits[1] = 112;
$arrayText[0] = "sometext";
$arrayText[1] = "moretext";
How can I do this with PHP?
I understand Regular Expressions is the solution. Please notice that $myString was just an example. There can be several brackets, not just two of each, as in my example.
Thanks for your help!
This is what I came up with.
<?php
#For better display
header("Content-Type: text/plain");
#The String
$myString = "[sometext][moretext][993][112]This is a long text";
#Initialize the array
$matches = array();
#Fill it with matches. It would populate $matches[1].
preg_match_all("|\[(.+?)\]|", $myString, $matches);
#Remove each bracketed group (non-greedy, so text between groups is kept), and assign to $string.
$string = preg_replace("|\[.+?\]|", "", $myString);
#Display the results.
print_r($matches[1]);
print_r($string);
After that, you can iterate over the $matches array and check each value to assign it to a new array.
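That iteration step might look roughly like the following. It is a sketch that assumes "digits" means a string made up only of 0-9 (checked with ctype_digit) and that the bracketed groups all sit at the start of the string.

```php
<?php
$myString = "[sometext][moretext][993][112]This is a long text";

// Capture the content of every bracketed group.
preg_match_all('/\[(.+?)\]/', $myString, $matches);

$arrayDigits = array();
$arrayText   = array();
foreach ($matches[1] as $value) {
    if (ctype_digit($value)) {
        $arrayDigits[] = (int) $value; // purely numeric group
    } else {
        $arrayText[] = $value;         // anything else counts as text
    }
}

// Strip only the leading run of bracketed groups from the string.
$string = preg_replace('/^(?:\[[^\]]*\])+/', '', $myString);

print_r($arrayDigits);
print_r($arrayText);
echo $string, "\n";
```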
Try this:
$s = '[sometext][moretext][993][112]This is a long text';
preg_match_all('/\[(\w+)\]/', $s, $m);
$m[1] will contain all the texts in the brackets; after this you can check the type of each value. Alternatively, you could run preg_match_all twice: first with the pattern /\[(\d+)\]/ (which returns the array of digits), then with the pattern /\[([a-zA-Z]+)\]/ (which returns the words):
$s = '[sometext][moretext][993][112]This is a long text';
preg_match_all('/\[(\d+)\]/', $s, $matches);
$arrayOfDigits = $matches[1];
preg_match_all('/\[([a-zA-Z]+)\]/', $s, $matches);
$arrayOfWords = $matches[1];
For cases like yours you can make use of named subpatterns to "tokenize" your string. With a little code, this can be made easily configurable with an array of tokens:
$subject = "[sometext][moretext][993][112]This is a long text";
$groups = array(
'digit' => '\[\d+]',
'text' => '\[\w+]',
'free' => '.+'
);
Each group contains the subpattern and its name. They match in order, so if the group digit matches, it won't give text a chance (which is necessary here because \d+ is a subset of \w+). This array can then be turned into a full pattern:
foreach($groups as $name => &$subpattern)
$subpattern = sprintf('(?<%s>%s)', $name, $subpattern);
unset($subpattern);
$pattern = sprintf('/(?:%s)/', implode('|', $groups));
The pattern looks like this:
/(?:(?<digit>\[\d+])|(?<text>\[\w+])|(?<free>.+))/
All that is left to do is to execute it against your string, capture the matches and filter them for some normalized output:
if (preg_match_all($pattern, $subject, $matches))
{
$matches = array_intersect_key($matches, $groups);
$matches = array_map('array_filter', $matches);
$matches = array_map('array_values', $matches);
print_r($matches);
}
The matches are now nicely accessible in an array:
Array
(
[digit] => Array
(
[0] => [993]
[1] => [112]
)
[text] => Array
(
[0] => [sometext]
[1] => [moretext]
)
[free] => Array
(
[0] => This is a long text
)
)
The full example at once:
$subject = "[sometext][moretext][993][112]This is a long text";
$groups = array(
'digit' => '\[\d+]',
'text' => '\[\w+]',
'free' => '.+'
);
foreach($groups as $name => &$subpattern)
$subpattern = sprintf('(?<%s>%s)', $name, $subpattern);
unset($subpattern);
$pattern = sprintf('/(?:%s)/', implode('|', $groups));
if (preg_match_all($pattern, $subject, $matches))
{
$matches = array_intersect_key($matches, $groups);
$matches = array_map('array_filter', $matches);
$matches = array_map('array_values', $matches);
print_r($matches);
}
You could try something along the lines of:
<?php
function parseString($string) {
// identify data in brackets
static $pattern = '#(?:\[)([^\[\]]+)(?:\])#';
// result container
$t = array(
'string' => null,
'digits' => array(),
'text' => array(),
);
$t['string'] = preg_replace_callback($pattern, function($m) use(&$t) {
// shove matched string into digits/text groups
$t[is_numeric($m[1]) ? 'digits' : 'text'][] = $m[1];
// remove the brackets from the text
return '';
}, $string);
return $t;
}
$string = "[sometext][moretext][993][112]This is a long text";
$result = parseString($string);
var_dump($result);
/*
$result === array(
"string" => "This is a long text",
"digits" => array(
"993", // captured values are strings, not integers
"112",
),
"text" => array(
"sometext",
"moretext",
),
);
*/
(Requires PHP 5.3 or later, since it uses closures.)

Parse RFC 822 compliant addresses in a TO header

I would like to parse an email address list (like the one in a TO header) with preg_match_all to get the user name (if exists) and the E-mail. Something similar to mailparse_rfc822_parse_addresses or Mail_RFC822::parseAddressList() from Pear, but in plain PHP.
Input :
"DOE, John \(ACME\)" <john.doe@somewhere.com>, "DOE, Jane" <jane.doe@somewhere.com>
Output :
array(
array(
'name' => 'DOE, John (ACME)',
'email' => 'john.doe@somewhere.com'
),
array(
'name' => 'DOE, Jane',
'email' => 'jane.doe@somewhere.com'
)
)
Don't need to support strange E-mail format (/[a-z0-9._%-]+@[a-z0-9.-]+.[a-z]{2,4}/i for email part is OK).
I can't use explode because the comma can appear in the name. str_getcsv doesn't work, because I can have:
DOE, John \(ACME\) <john.doe@somewhere.com>
as input.
Update:
For the moment, I've got this :
public static function parseAddressList($addressList)
{
$pattern = '/^(?:"?([^<"]+)"?\s)?<?([^>]+@[^>]+)>?$/';
if (preg_match($pattern, $addressList, $matches)) {
return array(
array(
'name' => stripcslashes($matches[1]),
'email' => $matches[2]
)
);
} else {
$parts = str_getcsv($addressList);
$result = array();
foreach($parts as $part) {
if (preg_match($pattern, $part, $matches)) {
$result[] = array(
'name' => stripcslashes($matches[1]),
'email' => $matches[2]
);
}
}
return $result;
}
}
but it fails on:
"DOE, \"John\"" <john.doe@somewhere.com>
I need the pattern to allow an escaped quote (\") inside the name, but I don't remember how to do this.
Finally I did it:
public static function parseAddressList($addressList)
{
$pattern = '/^(?:"?((?:[^"\\\\]|\\\\.)+)"?\s)?<?([a-z0-9._%-]+@[a-z0-9.-]+\\.[a-z]{2,4})>?$/i';
if (($addressList[0] != '<') and preg_match($pattern, $addressList, $matches)) {
return array(
array(
'name' => stripcslashes($matches[1]),
'email' => $matches[2]
)
);
} else {
$parts = str_getcsv($addressList);
$result = array();
foreach($parts as $part) {
if (preg_match($pattern, $part, $matches)) {
$item = array();
if ($matches[1] != '') $item['name'] = stripcslashes($matches[1]);
$item['email'] = $matches[2];
$result[] = $item;
}
}
return $result;
}
}
But I'm not sure it works for all cases.
I don't know that RFC, but if the format is always as you showed then you can try something like:
preg_match_all("/\"([^\"]*)\"\\s+<([^<>]*)>/", $string, $matches);
print_r($matches);
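If quoted names may contain escaped characters, one possible extension is sketched below. It is not a full RFC 822 parser: it assumes every address is wrapped in angle brackets, that names containing commas are quoted, and that backslash is the only escape character. The function name parseAddresses is hypothetical.

```php
<?php
// parseAddresses is a hypothetical helper, not a PHP built-in.
function parseAddresses($header)
{
    // Group 1: quoted display name (backslash escapes allowed).
    // Group 2: unquoted display name without commas.
    // Group 3: the address between angle brackets.
    $pattern = '/(?:"((?:[^"\\\\]|\\\\.)*)"\s*|([^<>,"]*?)\s*)<([^<>@\s]+@[^<>\s]+)>/';

    $result = array();
    if (preg_match_all($pattern, $header, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // Unescape quoted names; trim stray whitespace from unquoted ones.
            $name = $m[1] !== '' ? stripslashes($m[1]) : trim($m[2]);
            $item = array('email' => $m[3]);
            if ($name !== '') {
                $item = array('name' => $name, 'email' => $m[3]);
            }
            $result[] = $item;
        }
    }
    return $result;
}

$list = parseAddresses('"DOE, John \(ACME\)" <john.doe@somewhere.com>, "DOE, Jane" <jane.doe@somewhere.com>');
print_r($list);
```

Because the pattern is applied with preg_match_all over the whole header, there is no need to split on commas first, which sidesteps the comma-in-name problem for quoted names.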

str_replace() with associative array

You can use arrays with str_replace():
$array_from = array ('from1', 'from2');
$array_to = array ('to1', 'to2');
$text = str_replace ($array_from, $array_to, $text);
But what if you have associative array?
$array_from_to = array (
'from1' => 'to1',
'from2' => 'to2'
);
How can you use it with str_replace()?
Speed matters - array is big enough.
$text = strtr($text, $array_from_to);
By the way, that is still a one dimensional "array."
$array_from_to = array (
'from1' => 'to1',
'from2' => 'to2'
);
$text = str_replace(array_keys($array_from_to), $array_from_to, $text);
The to field will ignore the keys in your array. The key function here is array_keys.
$text='yadav+RAHUL(from2';
$array_from_to = array('+' => 'Z1',
'-' => 'Z2',
'&' => 'Z3',
'&&' => 'Z4',
'||' => 'Z5',
'!' => 'Z6',
'(' => 'Z7',
')' => 'Z8',
'[' => 'Z9',
']' => 'Zx1',
'^' => 'Zx2',
'"' => 'Zx3',
'*' => 'Zx4',
'~' => 'Zx5',
'?' => 'Zx6',
':' => 'Zx7',
"'" => 'Zx8');
$text = strtr($text,$array_from_to);
echo $text;
// output is: yadavZ1RAHULZ7from2
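The two approaches are not always interchangeable when one key is a prefix of another, as with '&' and '&&' in the map above. A minimal sketch of the difference (the sample string 'a&&b' and the variable names are made up for the illustration):

```php
<?php
$map  = array('&' => 'Z3', '&&' => 'Z4');
$text = 'a&&b';

// strtr with an array tries the longest matching key first.
$viaStrtr = strtr($text, $map);

// str_replace works through the keys in array order, so '&' is replaced
// twice before '&&' is ever tried.
$viaStrReplace = str_replace(array_keys($map), array_values($map), $text);

echo $viaStrtr, "\n";      // aZ4b
echo $viaStrReplace, "\n"; // aZ3Z3b
```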
$search = array('{user}', '{site}');
$replace = array('Qiao', 'stackoverflow');
$subject = 'Hello {user}, welcome to {site}.';
echo str_replace ($search, $replace, $subject);
Results in Hello Qiao, welcome to stackoverflow..
$array_from_to = array (
'from1' => 'to1',
'from2' => 'to2',
);
This is not a two-dimensional array, it's an associative array.
Expanding on the first example, where we place the $search as the keys of the array, and the $replace as its values, the code would look like this.
$searchAndReplace = array(
'{user}' => 'Qiao',
'{site}' => 'stackoverflow'
);
$search = array_keys($searchAndReplace);
$replace = array_values($searchAndReplace);
# Our subject is the same as our first example.
echo str_replace ($search, $replace, $subject);
Results in Hello Qiao, welcome to stackoverflow..
$keys = array_keys($array);
$values = array_values($array);
$text = str_replace($keys, $values, $string);
