extract part of a path with filename with php - php

I need to extract a portion of urls using php. The last 6 segments of the url are the part I need. The first part of the url varies in length and number of directories. So if I have a url like this:
https://www.random.ccc/random2/part1/part2/part3/2017/08/file.txt
or this:
https://www.random.vov/part1/part2/part3/2016/08/file.pdf
What I need is this:
/part1/part2/part3/2017/08/file.txt
or this:
/part1/part2/part3/2016/08/file.pdf
I have tried this:
$string = implode("/",array_slice(explode("/",$string,8),6,4));
which works ok on the first example but not the second. I am not so good with regex and I suppose that is the way. What is the most graceful solution?

Your approach is fine, though adding parse_url in there to isolate just the path will help a lot:
$path = parse_url($url, PHP_URL_PATH); // just the path part of the URL
$parts = explode('/', $path); // all the components
$parts = array_slice($parts, -6); // the last six
$path = '/' . implode('/', $parts); // back together as a string, with the leading slash restored
Try it online at 3v4l.org.
Now, to qualify: if you only need the path as a string, then parse_url alone is enough. If, however, you need to work with the individual segments (such as keeping only the last six, as asked), then use the common pattern of explode/manipulate/implode.
I have left each of these steps separate in the above so you can debug and choose the parts that work best for you.
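For reuse, the same steps can be wrapped in a function. This is a minimal sketch, not part of the original answer: the name lastPathSegments and its $count parameter are hypothetical, and it assumes the input is a URL that parse_url() can handle.

```php
<?php
// Hypothetical helper: returns the last $count path segments of $url,
// joined with "/" and prefixed with a leading slash.
function lastPathSegments(string $url, int $count = 6): string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return ''; // no path component, or a URL that parse_url() rejects
    }
    // trim() avoids empty segments caused by the leading or trailing slash
    $parts = explode('/', trim($path, '/'));
    return '/' . implode('/', array_slice($parts, -$count));
}

echo lastPathSegments('https://www.random.ccc/random2/part1/part2/part3/2017/08/file.txt'), "\n";
echo lastPathSegments('https://www.random.vov/part1/part2/part3/2016/08/file.pdf'), "\n";
```

Both example URLs from the question should come out in the /part1/part2/part3/... form, regardless of how many directories precede them.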

Use this, substituting $url as you wish:
$url= "https://www.random.vov/part1/part2/part3/2016/08/file.pdf";
preg_match("%/[^/]*?/[^/]*?/[^/]*?/[^/]*?/[^/]*?/[^/]*?$%", $url, $matches);
echo $matches[0];
best regards!
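The repeated group in that pattern can also be written with a quantifier. The following is a sketch of an equivalent pattern, assuming the URL has at least six path segments and no query string:

```php
<?php
$url = 'https://www.random.vov/part1/part2/part3/2016/08/file.pdf';

// (?:/[^/]*){6} means "a slash followed by non-slash characters", six times;
// $ anchors the match at the end of the string.
if (preg_match('%(?:/[^/]*){6}$%', $url, $matches)) {
    echo $matches[0], "\n";
}
```

Because the pattern runs against the whole URL, a URL with fewer than six path segments can match into the "//host" part, so isolating the path with parse_url first is the safer choice when the input is not guaranteed to be that deep.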

Related

get last part of url dynamic

I found a way to get the last part of the url, I just don't know if there's an even better way since I want it to be dynamic.
This is the way I did it:
$url = $_SERVER['REQUEST_URI'];
$categoryName = basename($url);
The last part of the url in this case is always a category(horror for e.g) that's in my database, so the url will always looks like this:
http://localhost:8888/blog/public/index.php/categories/Horror
or
http://localhost:8888/blog/public/index.php/categories/Fantasy
I think you got my point.
Well, the question is, is there a better way or is mine okay? Especially when looking at the
$_SERVER['REQUEST_URI']
Use explode() to split URL by / delimiter and use end() to get last item of array.
$url = "http://localhost:8888/blog/public/index.php/categories/Horror";
$parts = explode("/", $url); // end() takes its argument by reference, so it needs a variable
$categoryName = end($parts);
// Horror
You can always use a simple regex to get it.
$re = '#.*/(.*)#m';
$str = 'http://localhost:8888/blog/public/index.php/categories/Horror';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
echo $matches[0][1];
//outputs `Horror`
If you are using Laravel, Request::segments() returns the path segments as an array; assign it to a variable and pass that to end().
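One thing to watch with $_SERVER['REQUEST_URI'] is that it includes the query string, so basename() on the raw value would return something like "Fantasy?page=2". A small sketch that strips the query first (the function name lastSegment and the ?page=2 example are hypothetical):

```php
<?php
// Hypothetical helper: last path segment of a request URI, ignoring any query string.
function lastSegment(string $requestUri): string
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    return is_string($path) ? basename($path) : '';
}

echo lastSegment('/blog/public/index.php/categories/Horror'), "\n";
echo lastSegment('/blog/public/index.php/categories/Fantasy?page=2'), "\n";
```

In a real request you would call it as lastSegment($_SERVER['REQUEST_URI']). Since the value comes from the client, it should still be checked against the categories in the database before use.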

PHP: How can I split this string correctly?

I have a bunch of URL segments.
I know they will either come in the form of:
myString?var1=value
or
myString/name.php?var1=value
In either case I need to take the URL and get the myString bit out of it. So something that will:
Search for either the first / or ?
Turn whatever came before it into a var and throw the rest of the string out.
How can this be achieved?
A simple way to do it
$parts = explode('?', $str);
$parts = explode('/', $parts[0]);
$yourFinalString = $parts[0];
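The same result can be had without building intermediate arrays. This sketch uses strcspn() to find the first / or ?; the function name leadingSegment is hypothetical:

```php
<?php
// Hypothetical helper: everything before the first "/" or "?" in $str.
function leadingSegment(string $str): string
{
    // strcspn() returns the length of the initial run that contains
    // neither "/" nor "?"; if neither occurs, that is the whole string.
    return substr($str, 0, strcspn($str, '/?'));
}

echo leadingSegment('myString?var1=value'), "\n";
echo leadingSegment('myString/name.php?var1=value'), "\n";
```

Both calls should print myString.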

Convert absolute to relative url with preg_replace

(I searched, and found lots of questions about converting relative to absolute urls, but nothing for absolute to relative.)
I'd like to take input from a form field and end up with a relative url. Ideally, this would be able to handle any of the following inputs and end up with /page-slug.
http://example.com/page-slug
http://www.example.com/page-slug
https://example.com/page-slug
https://www.example.com/page-slug
example.com/page-slug
/page-slug
And maybe more I'm not thinking of...?
Edit: I'd also like this to work for something where the relative url is e.g. /page/post (i.e. something with more than one slash).
Take a look at parse_url if you are always working with URLs. Specifically:
parse_url($url, PHP_URL_PATH)
FYI, I tested it against all your input, and it worked on all except: example.com/page-slug
Try this regexp:
#^        The start of the string
(
  ://     Match either ://
  |       or
  [^/]    a character that is not a /
)+        one or more times
#
And replace the match with the empty string.
$pattern = '#^(://|[^/])+#';
$replacement = '';
echo preg_replace($pattern, $replacement, $string);
I think you want the part of the URL after the hostname, you can use parse_url:
$path = parse_url($url, PHP_URL_PATH);
Note that this gets the whole of the URL after the hostname, so http://example.com/page/slug will give /page/slug.
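To cover the scheme-less input (example.com/page-slug) that parse_url treats as a path, one option is to prefix such input with // so that it parses as a host plus path. This is a sketch under a stated assumption: any input that has no scheme and does not start with / is taken to begin with a hostname. The function name toRelativePath is hypothetical.

```php
<?php
// Hypothetical helper. Assumption: input with no "://" and no leading slash
// starts with a hostname, as in "example.com/page-slug".
function toRelativePath(string $input): string
{
    $input = trim($input);
    if ($input !== '' && $input[0] !== '/' && strpos($input, '://') === false) {
        $input = '//' . $input; // network-path form, so parse_url() sees a host
    }
    $path = parse_url($input, PHP_URL_PATH);
    return (is_string($path) && $path !== '') ? $path : '/';
}

$inputs = [
    'http://example.com/page-slug',
    'https://www.example.com/page-slug',
    'example.com/page-slug',
    '/page-slug',
    'https://example.com/page/post',
];
foreach ($inputs as $input) {
    echo $input, ' => ', toRelativePath($input), "\n";
}
```

The trade-off is that a genuinely relative input such as page/post would be misread as host "page", so this only fits forms where users are expected to paste either a full address or a path starting with a slash.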
I would just do this a little hacky way if you know your application. I would use a regex to search for
[a-z].([(com|org|net)])

PHP Regex on URL - split into variables

I am trying to implement a php script which will run on every call to my site, look for a certain pattern of URL, then explode the URL and perform a redirect.
Basically I want to run this on a new CMS to catch all incoming links from the old CMS and redirect based on a mapping: say, an article ID stripped from the URL is mapped to the same article ID imported into the new CMS's DB.
I can do the implementation, the redirect etc, but I am lost on the regex.
I need to catch any occurrences of:
domain.com/content/view/*/34/ or domain.com/content/view/*/30/ (where * is a wildcard) and capture * and the 30 or 34 in a variable which I will then use in a DB query.
If the following is encountered:
domain.com/content/view/*/34/1/*/
I need to capture the first * and the second *.
I'd be very grateful to anyone who can give me a hand with this.
I'm not sure regular expressions are the way to go. I think it would probably be easier to use explode ('/' , $url) and check by looping over that array.
Here are the steps I would follow:
$url = parse_url($url, PHP_URL_PATH);
$url = trim($url, '/');
$parts = explode ('/' , $url);
Then you can check if
($parts[0]=='content' && $parts[1]=='view' && $parts[3]=='34')
You can also easily get the information you want with $parts[2].
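Put together, those steps might look like the sketch below. The function name matchByExplode and the array keys are hypothetical, "my-article" and "other" stand in for the wildcards, and the URL is assumed to include a scheme so that parse_url separates the host from the path.

```php
<?php
// Hypothetical helper: returns the captured parts for /content/view/*/30/ or
// /content/view/*/34/ (optionally followed by /<n>/*/), or null if no match.
function matchByExplode(string $url): ?array
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    $parts = explode('/', trim($path, '/'));
    if (count($parts) >= 4
        && $parts[0] === 'content'
        && $parts[1] === 'view'
        && in_array($parts[3], ['30', '34'], true)
    ) {
        return [
            'first'  => $parts[2],
            'id'     => $parts[3],
            'second' => $parts[5] ?? null, // only present in the longer form
        ];
    }
    return null;
}

print_r(matchByExplode('http://domain.com/content/view/my-article/34/'));
print_r(matchByExplode('http://domain.com/content/view/my-article/34/1/other/'));
```

Strict comparisons (=== and the third argument to in_array) keep values such as "34abc" or "034" from matching by accident.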
It's actually very simple. A more flexible and straightforward approach is to explode() the URL into an array called something like $segments, and then test against that. If you have a very small number of expected URLs, this kind of approach is probably easier to maintain and to read.
I wouldn't recommend doing this in the htaccess file because of the performance overhead.
First, I would use the PHP function parse_url() to get the path, devoid of any protocol or hostname.
Once you have that the following code should get you the info you need.
<?php
$url = 'http://domain.com/content/view/*/34/'; // first example
$url = 'http://domain.com/content/view/*/34/1/*/'; // second example (overrides the first; comment out one of the two)
$url_array = parse_url($url);
$path = $url_array['path'];
// Match the URL against regular expressions
if (preg_match('/content\/view\/([^\/]+)\/([0-9]+)\//i', $path, $matches)) {
    print_r($matches);
}
if (preg_match('/content\/view\/([^\/]+)\/([0-9]+)\/([0-9]+)\/([^\/]+)/i', $path, $matches)) {
    print_r($matches);
}
?>
([^/]+) matches one or more characters other than a forward slash
([0-9]+) matches one or more digits
Though you can probably write a single regular expression to match most URL variants, consider using multiple regular expressions to check for different types of URLs. Depending on how much traffic you get, the speed hit won't be all that terrible.
Also, I recommend reading Mastering Regular Expressions, published by O'Reilly. A good knowledge of regular expressions will come in handy quite often.
http://www.regular-expressions.info/php.html
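For comparison, here is what a single regular expression covering both URL shapes could look like. It is a sketch, not the answer's own code: the function name captureViewUrl and the array keys are hypothetical, "abc" and "def" stand in for the wildcards, and the optional tail is assumed to be "/<digits>/<segment>" as in the second example.

```php
<?php
// Hypothetical helper: one pattern for both URL shapes.
function captureViewUrl(string $url): ?array
{
    $path = parse_url($url, PHP_URL_PATH);
    // "#" as the delimiter avoids escaping every "/" in the pattern.
    $pattern = '#^/content/view/([^/]+)/(\d+)(?:/\d+/([^/]+))?/?$#i';
    if (!is_string($path) || !preg_match($pattern, $path, $m)) {
        return null;
    }
    // $m[3] is absent when the optional group did not take part in the match.
    return ['first' => $m[1], 'id' => $m[2], 'second' => $m[3] ?? null];
}

print_r(captureViewUrl('http://domain.com/content/view/abc/34/'));
print_r(captureViewUrl('http://domain.com/content/view/abc/34/1/def/'));
```

Anchoring with ^ and $ makes the match stricter than the unanchored patterns above, which is usually what you want before issuing a redirect.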

Regular expression to extract from URI

I need a regular expression to extract from two types of URIs
http://example.com/path/to/page/?filter
http://example.com/path/to/?filter
Basically, in both cases I need to somehow isolate and return
/path/to
and
?filter
That is, both /path/to and filter is arbitrary. So I suppose I need 2 regular expressions for this? I am doing this in PHP but if someone could help me out with the regular expressions I can figure out the rest. Thanks for your time :)
EDIT: Just to clarify, if for example
http://example.com/help/faq/?sort=latest
I want to get /help/faq and ?sort=latest
Another example
http://example.com/site/users/all/page/?filter=none&status=2
I want to get /site/users/all and ?filter=none&status=2. Note that I do not want to get the page!
Using parse_url might be easier and have fewer side effects than a regex:
$querystring = parse_url($url, PHP_URL_QUERY);
$path = parse_url($url, PHP_URL_PATH);
You could then use explode on the path to get the first two segments:
$segments = explode("/", $path);
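A fuller sketch of the parse_url route follows. It rests on one reading of the question that the source does not confirm: that a trailing "page" segment is a fixed marker to drop, which is what makes /path/to/page/ and /path/to/ both yield /path/to. The function name splitUri is hypothetical.

```php
<?php
// Hypothetical helper: returns [path, query] with the query prefixed by "?".
// Assumption: a trailing "/page" segment is a literal marker to be removed.
function splitUri(string $url): array
{
    $path  = rtrim((string) parse_url($url, PHP_URL_PATH), '/');
    $query = parse_url($url, PHP_URL_QUERY);

    if (substr($path, -5) === '/page') {
        $path = substr($path, 0, -5);
    }
    return [$path, is_string($query) ? '?' . $query : ''];
}

print_r(splitUri('http://example.com/help/faq/?sort=latest'));
print_r(splitUri('http://example.com/site/users/all/page/?filter=none&status=2'));
```

If "page" is not a fixed word in your URLs, the stripping step would need a different rule, such as always removing the last segment.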
Try this:
^http://[^/?#]+/([^/?#]+/[^/?#]+)[^?#]*\?([^#]*)
This will get you the first two URL path segments and query.
not tested but:
^https?://[^ /]+[^ ?]+.*
This should match http and https URLs with or without a path: [^ /]+ matches the hostname, [^ ?]+ matches everything up to the ? (of ?filter, for instance), and .* matches the rest, i.e. any characters except a newline.
Have you considered using explode() instead (http://nl2.php.net/manual/en/function.explode.php) ? The task seems simple enough for it. You would need 2 calls (one for the / and one for the ?) but it should be quite simple once you did that.
