Getting file name with regular expression in PHP

I'm trying to use a regular expression to get just the file name from a URL, for example:
$link = "http://localhost/website/index.php";
$pattern = '/.*?\.php';
preg_match($pattern, $link, $matches);
but it returns "//localhost/website/index.php" instead of "index".

Does your code even run? You haven't used any delimiters...
With preg_match, you could use a negated character class instead. In your pattern, / matches the first / and then .*? matches every character up to .php, which is why you get the whole path back. If you want only index, the simplest approach is a capture group, like so:
$link = "http://localhost/website/index.php";
$pattern = '~([^/]+)\.php~';
preg_match($pattern, $link, $matches);
echo $matches[1]; # Get the captured group from the array $matches
Or you can simply use the basename function:
echo basename($link, ".php");

I think you would be much better off using a function dedicated to the purpose, rather than a custom regular expression.
Since the example you provided is actually a URL, you could use the parse_url function:
http://php.net/manual/en/function.parse-url.php
You should also look at the pathinfo function (well done PHP on the naming consistency there!):
http://php.net/manual/en/function.pathinfo.php
You could then do something like this:
$url = 'http://localhost/path/file.php';
$url_info = parse_url($url);
$full_path = $url_info['path'];
$path_info = pathinfo($full_path);
$file_name = $path_info['filename'] . '.' . $path_info['extension'];
print $file_name; // outputs "file.php"
This might seem more verbose than using regular expressions, but it is likely to be faster and, more importantly, much more robust.
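To get just "index" as the question asks, the same two functions can be combined with the PATHINFO_FILENAME flag. This is a minimal sketch; the helper name fileStem is made up for illustration and is not part of PHP.

```php
<?php
// Hypothetical helper (not a PHP built-in): returns the file name
// without its extension, or null when the URL has no usable path.
function fileStem(string $url): ?string
{
    // Ask parse_url() for the path component only.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return null;
    }
    // PATHINFO_FILENAME is the base name with the last extension removed.
    $stem = pathinfo($path, PATHINFO_FILENAME);
    return $stem === '' ? null : $stem;
}

echo fileStem('http://localhost/website/index.php'), "\n"; // index
```

Because the query string is dropped by parse_url before pathinfo runs, a URL such as http://localhost/path/file.php?x=1 yields "file".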

Related

PHP Regex to get second occurrence from the path

I have a path "../uploads/e2c_name_icon/" and I need to extract e2c_name_icon from the path.
What I tried is using the str_replace function:
$msg = str_replace("../uploads/","","../uploads/e2c_name_icon/");
This results in the output "e2c_name_icon/", so I strip the slash as well:
$msg=str_replace("/","","e2c_name_icon/");
Is there a better way to do this? I am looking for an alternative method, such as a regular expression.
Try this. Outputs: e2c_name_icon
<?php
$path = "../uploads/e2c_name_icon/";
// Outputs: 'e2c_name_icon'
echo explode('/', $path)[2];
However, this is technically the third component of the path, the ../ being the first. If you always need to get the third index, then this should work. Otherwise, you'll need to resolve the relative path first.
Use basename function provided by PHP.
$var = "../uploads/e2c_name_icon/";
echo basename( $var ); // prints e2c_name_icon
If you strictly want to get the last part of the URL after '../uploads',
then you could use this:
$url = '../uploads/e2c_name_icon/';
$regex = '/\.\.\/uploads\/(\w+)/';
preg_match($regex, $url, $m);
print_r($m); // $m[1] holds the captured directory name when the pattern matches
You can trim after the str_replace.
echo $msg = trim(str_replace("../uploads/","","../uploads/e2c_name_icon/"), "/");
I don't think you need to use regex for this. Simple string functions are usually faster.
You could also use strrpos to find the second last /, then trim off both /.
$path = "../uploads/e2c_name_icon/";
echo $msg = trim(substr($path, strrpos($path, "/",-2)),"/");
I added -2 in strrpos to skip the last /. That means it returns the position of the / after uploads.
So substr will return /e2c_name_icon/ and trim will remove both /.
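The steps above can be traced with a small sketch. The helper name lastDirName is hypothetical, and the length guard is an added assumption, since a negative offset larger than the string length is rejected by strrpos.

```php
<?php
// Hypothetical helper; mirrors the strrpos()/substr()/trim() steps above.
function lastDirName(string $path): string
{
    // Guard (assumption): strrpos() rejects an offset beyond the string length.
    if (strlen($path) < 2) {
        return trim($path, "/");
    }
    // Search backwards while skipping the final character (the trailing "/").
    $pos = strrpos($path, "/", -2);
    if ($pos === false) {
        return trim($path, "/"); // no earlier slash: the whole string is the name
    }
    // substr() keeps "/e2c_name_icon/"; trim() removes both slashes.
    return trim(substr($path, $pos), "/");
}

echo strrpos("../uploads/e2c_name_icon/", "/", -2), "\n"; // 10, the slash after "uploads"
echo lastDirName("../uploads/e2c_name_icon/"), "\n";      // e2c_name_icon
```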
You'd be much better off using the native PHP path functions vs trying to parse it yourself.
For example:
$path = "../uploads/e2c_name_icon/";
$msg = basename(realpath($path)); // e2c_name_icon, provided the directory exists (realpath() returns false otherwise)

PHP regex: How to remove ?file in url?

My URLs look like this:
http://mywebsite.com/movies/937-lan-kwai-fong-2?file=Rae-Ingram&q=
http://mywebsite.com/movies/937-big-daddy?file=something&q=
I want to get "lan-kwai-fong-2" and "big-daddy", so I use this code, but it doesn't work. Please help me fix it! If you can shorten it, even better!
$url= $_SERVER['REQUEST_URI'];
preg_replace('/\?file.*/','',$url);
preg_match('/[a-z][\w\-]+$/',$url,$matches);
$matches= str_replace("-"," ",$matches[0]);
First, there are some issues with your code, which I'm going to go over because they are general things:
preg_replace does not work by reference, so you are never actually modifying the URL. You need to assign the result of the replace to a variable:
// this would overwrite the current value of $url with the replaced value
$url = preg_replace('/\?file.*/','',$url);
It is possible that preg_match will not find anything, so you need to test the result:
// it should also be noted that sometimes you may need a more exact test here
// because it can return false (if there's an error) or 0 (if there is no match)
if (preg_match('/[a-z][\w\-]+$/',$url,$matches)) {
// do stuff
}
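A more exact test could distinguish the three return values explicitly. This is only a sketch; the function name matchState is made up for illustration, and the @ operator is used here just to keep the warning from an invalid pattern out of the output.

```php
<?php
// Hypothetical helper: reports which of preg_match()'s three outcomes occurred.
function matchState(string $pattern, string $subject): string
{
    // @ silences the warning that an invalid pattern would raise.
    $result = @preg_match($pattern, $subject);
    if ($result === false) {
        return 'error';      // e.g. the pattern failed to compile
    }
    return $result === 1 ? 'match' : 'no match';
}

echo matchState('/[a-z][\w\-]+$/', '/movies/937-big-daddy'), "\n"; // match
echo matchState('/x$/', 'abc'), "\n";                              // no match
echo matchState('/(/', 'abc'), "\n";                               // error
```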
Now, with that out of the way, you are making this more difficult than it needs to be. There are specific functions for working with URLs: parse_url and parse_str.
You can use these to easily work with the information:
$urlInfo = parse_url($_SERVER['REQUEST_URI']);
$movie = basename($urlInfo['path']); // yields 937-the-movie-title
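To go from "937-lan-kwai-fong-2" to the "lan-kwai-fong-2" the question asks for, one option is to strip the leading numeric id afterwards. The sketch below assumes every title starts with digits followed by a dash; movieSlug is a hypothetical name.

```php
<?php
// Hypothetical helper: last path segment with the leading "<digits>-" removed.
function movieSlug(string $requestUri): ?string
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    $last = basename($path); // e.g. "937-lan-kwai-fong-2"
    // Assumption: the id is a run of digits followed by a single dash.
    $slug = preg_replace('/^\d+-/', '', $last);
    return ($slug === null || $slug === '') ? null : $slug;
}

echo movieSlug('/movies/937-lan-kwai-fong-2?file=Rae-Ingram&q='), "\n"; // lan-kwai-fong-2
echo movieSlug('/movies/937-big-daddy?file=something&q='), "\n";        // big-daddy
```

If spaces are wanted instead of dashes, str_replace("-", " ", $slug) can be applied to the result, as in the original code.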
Just replace
preg_replace('/\?file.*/','',$url);
with
$url= preg_replace('/\?file.*/','',$url);
Regex works, and parse_url is the right way to do it. But for something quick and dirty I would usually use explode. I think it's clearer.
@list($path, $query) = explode("?", $url, 2); // separate path from query
$parts = explode("/", $path);
$match = array_pop($parts); // get last part of path (array_pop() needs a variable)
How about this:
$url = $_SERVER['REQUEST_URI'];
preg_match('/\/[^-]+-([^?]+)\?/', $url, $matches);
$str = isset($matches[1]) ? $matches[1] : false;
The pattern works as follows:
match a '/'
match anything besides '-' up to the first '-'
capture anything besides '?' up to (not including) '?'

Get two results without repeating preg_match and file_get_contents

I'm a newbie to PHP,
and I need to get two results from the same page: og:image and og:video.
This is my current code:
preg_match('/property="og:video" content="(.*?)"/', file_get_contents($url), $matchesVideo);
preg_match('/property="og:image" content="(.*?)"/', file_get_contents($url), $matchesThumb);
$videoID = ($matchesVideo[1]) ? $matchesVideo[1] : false;
$videoThumb = ($matchesThumb[1]) ? $matchesThumb[1] : false;
Is there a way to execute the same operation without duplicating my code?
Save the file contents to a variable, and if you want to run a single regular expression, you can opt for:
$file = file_get_contents($url);
preg_match_all('/property="og:(?P<type>video|image)" content="(?P<content>.*?)"/', $file, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
$match['type'] ...
$match['content'] ...
}
As @hakre points out, the first parenthesis pair is not needed:
A parenthesis pair that starts with the no-capture modifier ?: takes part in the match but is not stored.
The capture groups use named subpatterns (?P<name>...); the type group accepts either of the two words, video|image.
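Filled in, the loop could collect the matches into an array keyed by type. This is a sketch under the assumption that each tag appears with property before content, exactly as in the pattern; ogTags and the sample HTML are made up for illustration.

```php
<?php
// Hypothetical helper: maps "video"/"image" to the matching content attribute.
function ogTags(string $html): array
{
    $pattern = '/property="og:(?P<type>video|image)" content="(?P<content>.*?)"/';
    $found = [];
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            // PREG_SET_ORDER gives one array per match, with the named groups as keys.
            $found[$match['type']] = $match['content'];
        }
    }
    return $found;
}

// Sample markup (assumed); in practice this would be file_get_contents($url).
$html = '<meta property="og:image" content="http://example.com/thumb.jpg">' . "\n"
      . '<meta property="og:video" content="http://example.com/clip.mp4">';

$tags = ogTags($html);
$videoID    = $tags['video'] ?? false;
$videoThumb = $tags['image'] ?? false;
echo $videoID, "\n", $videoThumb, "\n";
```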
There is no problem with having those two lines. What I would change though is the double call to file_get_contents($url).
Just change it to:
$html = file_get_contents($url);
preg_match('/property="og:video" content="(.*?)"/', $html, $matchesVideo);
preg_match('/property="og:image" content="(.*?)"/', $html, $matchesThumb);
Is there a way to execute the same operation without duplicating my code?
There are always two ways to do that:
Buffer an execution result - instead of executing multiple times.
Encode the repetition - extract parameters from code.
In programming you normally make use of both. For example the buffering of the file I/O operation:
$buffer = file_get_contents($url);
And for the matching, you encode the repetition:
$match = function ($what) use ($buffer) {
$pattern = sprintf('/property="og:%s" content="(.*?)"/', $what);
$result = preg_match($pattern, $buffer, $matches);
return $result ? $matches[1] : NULL;
};
$match('video');
$match('image');
This is only an example to show what I meant. It depends a bit on how far you want to take this: the latter approach allows you to replace the matching with a different implementation, such as an HTML parser, but you might find it too much code for what you need at the moment and go with the buffering alone.
E.g. the following could be applicable as well:
$buffer = file_get_contents($url);
$mask = '/property="og:%s" content="(.*?)"/';
preg_match(sprintf($mask, 'video'), $buffer, $matchesVideo);
preg_match(sprintf($mask, 'image'), $buffer, $matchesThumb);
Hope this helps.

PHP get specific string from url before and after unknown characters

I know it may sound like a common question, but I have difficulty understanding this process.
So I have this string:
http://domain.com/campaign/tgadv?redirect
And I need to get only the word "tgadv". But I don't know that the word is "tgadv", it could be whatever.
Also the url itself may change and become:
http://domain.com/campaign/tgadv
or
http://domain.com/campaign/tgadv/
So what I need is to create a function that will get whatever word is after campaign and before any other particular character. That's the logic..
The only certain thing is that the word will come after the word campaign/ and that any other character that will be after the word we are searching is a special one ( i.e. / or ? )
I tried understanding preg_match but really cannot get any good result from it..
Any help would be highly appreciated!
I would not use a regex for that. I would use parse_url and basename:
$bits = parse_url('http://domain.com/campaign/tgadv?redirect');
$filename = basename($bits['path']);
echo $filename;
However, if you want a regex solution, use something like this:
$pattern = '~(.*)/(.*)(\?.*)~';
preg_match($pattern, 'http://domain.com/campaign/tgadv?redirect', $matches);
$filename = $matches[2];
echo $filename;
Actually, preg_match sounds like the perfect solution to this problem. I assume you are having problems with the regex?
Try something like this:
<?php
$url = "http://domain.com/campaign/tgadv/";
$pattern = "#campaign/([^/\?]+)#";
preg_match($pattern, $url, $matches);
// $matches[1] will contain tgadv.
$path = "http://domain.com/campaign/tgadv?redirect";
$url_parts = parse_url($path);
$tgadv = ltrim(strrchr($url_parts['path'], '/'), '/'); // strrchr() returns "/tgadv", so strip the leading slash
You don't really need a regex to accomplish this. You can do it using stripos() and substr().
For example:
$str = '....Your string...';
$offset = stripos($str, 'campaign/');
if ( $offset === false ){
// error: 'campaign/' wasn't found
}
$offset += strlen('campaign/');
$newStr = substr($str, $offset);
At this point $newStr will have all the text after 'campaign/'.
You then just need to use a similar process to find the special character position and use substr() to strip the string you want out.
You can also just use the good old string functions in this case, no need to involve regexps.
First find the string /campaign/, then take the substring with everything after it (tgadv/asd/whatever/?redirect), then find the next / or ? after the start of the string, and everything in between will be what you need (tgadv).
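The string-function approach described above can be sketched as one small function. The name segmentAfter is hypothetical, and treating '#' as a terminator alongside '/' and '?' is an added assumption.

```php
<?php
// Hypothetical helper: returns the word that follows $marker, up to the
// next "/", "?" or "#", or null when the marker or the word is missing.
function segmentAfter(string $url, string $marker = 'campaign/'): ?string
{
    $offset = stripos($url, $marker);
    if ($offset === false) {
        return null; // marker not present
    }
    // Everything after the marker, e.g. "tgadv?redirect".
    $rest = substr($url, $offset + strlen($marker));
    // strcspn() gives the length of the leading run free of the listed characters.
    $word = substr($rest, 0, strcspn($rest, '/?#'));
    return $word === '' ? null : $word;
}

echo segmentAfter('http://domain.com/campaign/tgadv?redirect'), "\n"; // tgadv
echo segmentAfter('http://domain.com/campaign/tgadv'), "\n";          // tgadv
echo segmentAfter('http://domain.com/campaign/tgadv/'), "\n";         // tgadv
```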

Determine User Input Contains URL

I have an input form field which collects mixed strings.
I need to determine whether a posted string contains a URL (e.g. http://link.com, link.com, www.link.com, etc.) so it can then be anchored properly as needed.
An example of this would be microblogging functionality, where the processing script anchors anything that looks like a link. Another example is this very post, where 'http://link.com' got anchored automatically.
I believe I should approach this on display and not on input. How could I go about it?
You can use regular expressions to call a function on every match in PHP. You can for example use something like this:
<?php
function makeLink($match) {
// Parse link.
$substr = substr($match, 0, 6);
if ($substr != 'http:/' && $substr != 'https:' && $substr != 'ftp://' && $substr != 'news:/' && $substr != 'file:/') {
$url = 'http://' . $match;
} else {
$url = $match;
}
return '<a href="' . $url . '">' . $match . '</a>';
}
function makeHyperlinks($text) {
// Find links and call the makeLink() function on them.
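// The /e modifier was removed in PHP 7, so use preg_replace_callback() instead.
return preg_replace_callback('/((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:#=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)])/', function ($m) { return makeLink($m[1]); }, $text);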
}
?>
You will want to use a regular expression to match common URL patterns. PHP offers a function called preg_match that allows you to do this.
The regular expression itself could take several forms, but here is something to get you started (also maybe just Google 'URL regex'):
'/^(((http|https|ftp)://)?([[a-zA-Z0-9]-.])+(.)([[a-zA-Z0-9]]){2,4}([[a-zA-Z0-9]/+=%&_.~?-]))$/'
So your code should look something like this:
$matches = array(); // will hold the results of the regular expression match
$string = "http://www.astringwithaurl.com";
$regexUrl = '/^(((http|https|ftp):\/\/)?([[a-zA-Z0-9]\-\.])+(\.)([[a-zA-Z0-9]]){2,4}([[a-zA-Z0-9]\/+=%&_\.~?\-]*))*$/';
preg_match($regexUrl, $string, $matches);
print_r($matches); // an array of matched patterns
From here, you just want to wrap those URL patterns in an anchor/href tag and you're done.
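The wrapping step could look like the sketch below. It deliberately uses a much simpler pattern than the one above (an assumption made for readability): it only recognises text starting with http://, https:// or www., so a bare link.com is not detected and trailing punctuation may end up inside the link. The function name linkify is hypothetical.

```php
<?php
// Hypothetical helper: escapes the text for HTML and wraps URL-like
// substrings in anchor tags. The pattern is a simplified assumption.
function linkify(string $text): string
{
    $pattern = '~((?:https?://|www\.)[^\s<>"\']+)~i';
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates:
    // plain text, URL, plain text, URL, ...
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    $out = '';
    foreach ($parts as $i => $part) {
        $safe = htmlspecialchars($part, ENT_QUOTES, 'UTF-8');
        if ($i % 2 === 1) {
            // Odd indexes hold the captured URLs; add a scheme when missing.
            $href = stripos($part, 'http') === 0 ? $safe : 'http://' . $safe;
            $out .= '<a href="' . $href . '">' . $safe . '</a>';
        } else {
            $out .= $safe;
        }
    }
    return $out;
}

echo linkify('see www.link.com now'), "\n";
// see <a href="http://www.link.com">www.link.com</a> now
```

Doing the escaping at display time, as the question suggests, keeps the stored input unchanged and avoids emitting raw user-supplied HTML.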
Just how accurate do you want to be? Given just how varied URLs can be, you're going to have to draw the line somewhere. For instance, www.ca is a perfectly valid hostname and does bring up a site, but it's not something you'd EXPECT to work.
You should investigate regular expressions for this.
You will build a pattern that will match the part of your string that looks like a URL and format it appropriately.
It will come out something like this (lifted this, haven't tested it):
$pattern = "((https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))+[\w\d:##%/;$()~_?\+-=\\\.&]*)";
preg_match($pattern, $input_string, $url_matches, PREG_OFFSET_CAPTURE, 3);
$url_matches will contain an array of all of the parts of the input string that matched the url pattern.
You can use $_SERVER['HTTP_HOST'] to get the host information.
<?php
$host = $_SERVER['HTTP_HOST'];
?>