PHP Regular expression


I would like to capture the last folder in paths without the year. For this string path I would need just 'Millers Crossing' not 'Movies\Millers Crossing' which is what my current regex captures.
G:\Movies\Millers Crossing [1990]
preg_match('/\\\\(.*)\[\d{4}\]$/i', $this->parentDirPath, $title);

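How about basename and substr instead of complicated expressions?
$title = substr(basename($this->parentDirPath), 0, -6);
This assumes that there will always be a year in the format [xxxx] at the end of the string. It also leaves the space before the bracket in place, so use -7 (for a single space) or wrap the result in rtrim() if that matters.
(And it works with *nix-style paths too ;))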
Update: You can still use basename to get the folder and then apply a regular expression:
$folder = basename($this->parentDirPath);
preg_match('#^(.*?)\s*(?:\[\d{4}\])?$#', $folder, $match);
$title = $match[1];
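A minimal sketch of the basename-plus-regex idea, assuming the path may contain Windows backslashes even when the script runs on Linux (basename() only treats \ as a separator on Windows, so the sketch normalizes them first). The helper name folderTitle() is hypothetical and not part of the answer above.

```php
<?php
// Hypothetical helper: last folder name without a trailing " [YYYY]" block.
function folderTitle(string $path): string
{
    // Normalize "\" to "/" so basename() behaves the same on every platform.
    $folder = basename(str_replace('\\', '/', $path));

    // Remove an optional year in square brackets plus the whitespace before it.
    return (string) preg_replace('/\s*\[\d{4}\]$/', '', $folder);
}

echo folderTitle('G:\Movies\Millers Crossing [1990]'), PHP_EOL; // Millers Crossing
```

Folders without a year pass through unchanged, because the pattern simply does not match.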

Try
preg_match('/\\\\([^\\\\]*)\[\d{4}\]$/i', $this->parentDirPath, $title);
Basically, instead of matching any character with ., you're matching any character but \.
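To make the two levels of escaping visible, here is a small sketch of the negated-class approach run against the path from the question. The rtrim() call is my addition, since the capture still includes the space in front of the year.

```php
<?php
$path = 'G:\Movies\Millers Crossing [1990]';

// In a single-quoted PHP string, four backslashes become two characters,
// which PCRE then reads as one literal backslash. The same rule applies
// inside the negated character class.
$pattern = '/\\\\([^\\\\]*)\[\d{4}\]$/';

if (preg_match($pattern, $path, $m)) {
    // The capture ends with the space before "[1990]", so trim it.
    $title = rtrim($m[1]);
    echo $title, PHP_EOL; // Millers Crossing
}
```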

It looks like you want something like this:
/([^\\]+)\s\[\d{4}\]$/
That's what I'd go with, at least. Should only include whatever comes after the last backslash in the string, and the movie title will be in the first capture group.
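Where the + sits relative to the capture group matters for this pattern. A quick sketch comparing both placements on the path from the question (written as PHP strings, so each regex backslash is doubled again):

```php
<?php
$path = 'G:\Movies\Millers Crossing [1990]';

// Quantifier outside the group: the group is overwritten on every repetition,
// so only the last character before the space survives.
preg_match('/([^\\\\])+\s\[\d{4}\]$/', $path, $outside);

// Quantifier inside the group: the whole run of non-backslash characters is captured.
preg_match('/([^\\\\]+)\s\[\d{4}\]$/', $path, $inside);

echo $outside[1], PHP_EOL; // g
echo $inside[1], PHP_EOL;  // Millers Crossing
```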

Simpler approach:
([^\\]*)\s?\[\d{4}\]$
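Note that inside a PHP string literal a single regex backslash has to be written as \\\\, so the quadruple backslash in your pattern is not the problem; the real issue is that .* also matches backslashes. You can make life easier by using a character class that excludes the characters you don't want, by prefixing it with a caret (^).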

Related

PHP regex last occurrence of words

My string is: /var/www/domain.com/public_html/foo/bar/folder/another/..
I want to remove the root folder from this string, to get only public folder, because some servers have multiple websites inside.
My actual regex is: /^(.*?)(www|public_html|public|html)/s
My actual result is: /domain.com/public_html/foo/bar/folder/another/..
But I want to match up to the last occurrence and get something like this: /foo/bar/folder/another/..
Thanks!

You have to use a greedy quantifier and to check if the alternative is enclosed between slashes using lookarounds:
/^.*(?<![^\/])(?:www|public(?:_html)?|html)(?![^\/])/
About the lookarounds: I use negative lookarounds with a negated character class to check if there is a slash or the limit of the string at the same time. This way you are sure that for instance html is a folder and not the part of another folder name.
I removed the s modifier that is useless. I removed the capture groups too since the goal is to replace all with an empty string.
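A short sketch of how the lookarounds behave in practice. The second path is a made-up example (not from the question) where a folder name merely starts with html:

```php
<?php
// Pattern from the answer above: greedy prefix, then a folder name that must be
// bounded by "/" or the string edge on both sides.
$pattern = '/^.*(?<![^\/])(?:www|public(?:_html)?|html)(?![^\/])/';

$paths = [
    '/var/www/domain.com/public_html/foo/bar/folder/another/..',
    '/var/www/site/htmlstuff/x', // hypothetical: "htmlstuff" must not count as "html"
];

$results = [];
foreach ($paths as $path) {
    $results[] = preg_replace($pattern, '', $path);
}

print_r($results);
// [0] => /foo/bar/folder/another/..
// [1] => /site/htmlstuff/x
```

In the second case the match falls back to www, because html is followed by a character that is not a slash.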

The ? makes your expression non-greedy which is not actually what you want here. Try:
^(.*)(www|public_html|public|html)
which should keep going until the last match.
Demo: https://regex101.com/r/v5WbB3/1/
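The difference between the lazy and the greedy version can be checked directly on the string from the question. A minimal sketch:

```php
<?php
$path = '/var/www/domain.com/public_html/foo/bar/folder/another/..';
$names = 'www|public_html|public|html';

// Lazy: stops at the first alternative it can reach ("www").
$lazy = preg_replace("/^(.*?)($names)/", '', $path);

// Greedy: consumes everything, then backtracks to the last alternative.
$greedy = preg_replace("/^(.*)($names)/", '', $path);

echo $lazy, PHP_EOL;   // /domain.com/public_html/foo/bar/folder/another/..
echo $greedy, PHP_EOL; // /foo/bar/folder/another/..
```

Keep in mind that without boundaries this version also matches html inside a longer folder name.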

Regex After Last / and Before period

Sorry if the title is confusing. All I'm trying to do is some simple regex:
The text: /thing/images/info.gif
And what I want is: info
My regex (not fully working): ([^\/]+$)(.*?)(?=\.gif)
(Note: [^\/]+$ returns info.gif)
Thanks for any help!

I'd say you don't need to match all the string, so you can be much more generic. If you know your string always contains a path you can just use:
preg_match( '/([^\/]+)\.\w+$/', "/thing/images/info.gif", $matches) ;
print_r( $matches );
and it will be valid for any filename, even names that contain dots like my_file.name.jpg or spaces like /thing/images/my image.gif
The structure is (from the end of the regex moving to the left):
$ anchors the match at the end of the string
\.\w+ matches a dot followed by one or more word characters (the extension)
([^\/]+) captures one or more characters that are not a slash (your filename; a slash marks where the directories start)
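To check the claim about dots and spaces, here is a small sketch that runs the pattern over the three paths mentioned above. The pathinfo() line at the end is a built-in alternative I am adding for comparison; it is not part of the answer.

```php
<?php
$pattern = '/([^\/]+)\.\w+$/';

$samples = [
    '/thing/images/info.gif',
    '/thing/images/my_file.name.jpg',
    '/thing/images/my image.gif',
];

$names = [];
foreach ($samples as $path) {
    if (preg_match($pattern, $path, $m)) {
        $names[] = $m[1]; // filename without the last extension
    }
}

print_r($names); // info, my_file.name, my image

// Built-in alternative: pathinfo() also drops only the last extension.
$viaPathinfo = pathinfo('/thing/images/info.gif', PATHINFO_FILENAME);
echo $viaPathinfo, PHP_EOL; // info
```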

Not sure how much more complex the string is but this seems to work on the test string:
preg_match('![^/.]+(?=\.gif)!', '/thing/images/info.gif', $m);
Matching NOT / NOT . followed by .gif.

In editors (Sublime):
Find: ^(.*)(\/)(.*)(\.)(.*)$
Replace with: \3
In PHP:
<?php
preg_match('/^(.*)(\/)(.*)(\.)(.*)$/', '/thing/images/info.gif', $match);
echo $match[3];

PHP preg_replace pattern only seems to work if its wrong?

I have a string that looks like this
../Clean_Smarty_Projekt/tpl/templates_c\.
../Clean_Smarty_Projekt/tpl/templates_c\..
I want to replace ../, \. and \.. with a regular expression.
Before, I did this like this:
$result = str_replace(array("../","\..","\."),"",$str);
And there it (pattern) has to be in this order because changing it makes the output a little buggy. So I decided to use a regular expression.
Now I came up with this pattern
$result = preg_replace('/(\.\.\/)|(\\[\.]{1,2})/',"",$str);
which actually returns only empty strings...
Reason: (\\[\.]{1,2})
In Regex101 it's all OK. (Took me a couple of minutes to realize that I don't need the /g in preg_replace)
If I use this pattern in preg_replace I have to write (\\\\[\.]{1,2}) to get it to work. But that looks wrong because I'm not searching for two backslashes.
Of course I know the escaping rules (escaping backslashes).
Why doesn't this match correctly?

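I suggest you use a different regex delimiter so you don't have to escape /. The backslash issue is separate: the pattern passes through PHP's string parser first, so inside the string literal you need to write four backslashes \\\\ (three \\\ often work too) to hand the regex engine the \\ that matches a single backslash.
$string = '../Clean_Smarty_Projekt/tpl/templates_c\.'."\n".'../Clean_Smarty_Projekt/tpl/templates_c\..';
echo preg_replace('~\.\./|\\\\\.{1,2}~', '', $string);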
Output:
Clean_Smarty_Projekt/tpl/templates_c
Clean_Smarty_Projekt/tpl/templates_c
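The two parsing layers are easier to see when you print what PHP actually hands to the regex engine. A minimal sketch, using one of the strings from the question:

```php
<?php
$str = '../Clean_Smarty_Projekt/tpl/templates_c\..';

// Layer 1: PHP turns four backslashes in the literal into two characters.
// Layer 2: PCRE reads those two characters as one literal backslash.
$pattern = '~\.\./|\\\\\.{1,2}~';

$backslashCount = strlen('\\\\');
$result = preg_replace($pattern, '', $str);

echo $backslashCount, PHP_EOL; // 2
echo $pattern, PHP_EOL;        // ~\.\./|\\\.{1,2}~
echo $result, PHP_EOL;         // Clean_Smarty_Projekt/tpl/templates_c
```

The echoed pattern is what Regex101 would need as input, which is why a pattern copied from there has to gain extra backslashes in PHP source.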

Regular Expression to collect everything after the last /

I'm new at regular expressions and wonder how to phrase one that collects everything after the last /.
I'm extracting an ID used by Google's GData.
my example string is
http://spreadsheets.google.com/feeds/spreadsheets/p1f3JYcCu_cb0i0JYuCu123
Where the ID is: p1f3JYcCu_cb0i0JYuCu123
Oh and I'm using PHP.

This matches at least one of (anything not a slash) followed by end of the string:
[^/]+$
Notes:
No parens because it doesn't need any groups - result goes into group 0 (the match itself).
Uses + (instead of *) so that if the last character is a slash it fails to match (rather than matching empty string).
But, most likely a faster and simpler solution is to use your language's built-in string list processing functionality - i.e. ListLast( Text , '/' ) or equivalent function.
For PHP, the closest function is strrchr which works like this:
strrchr( Text , '/' )
This includes the slash in the results - as per Teddy's comment below, you can remove the slash with substr:
substr( strrchr( Text, '/' ), 1 );
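One detail worth handling: strrchr returns false when the string contains no slash at all. A small sketch with a guard for that case; the function name afterLastSlash() is hypothetical.

```php
<?php
// Hypothetical helper: everything after the last "/", or '' if there is nothing.
function afterLastSlash(string $text): string
{
    $tail = strrchr($text, '/'); // false when "/" does not occur

    // Drop the leading slash; the cast covers older PHP versions where
    // substr() could return false for an empty remainder.
    return $tail === false ? '' : (string) substr($tail, 1);
}

echo afterLastSlash('http://spreadsheets.google.com/feeds/spreadsheets/p1f3JYcCu_cb0i0JYuCu123'), PHP_EOL;
// p1f3JYcCu_cb0i0JYuCu123
```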

Generally:
/([^/]*)$
The data you want would then be the match of the first group.
Edit: Since you're using PHP, you could also use strrchr, which returns everything from the last occurrence of a character in a string up to the end. Or you could use a combination of strrpos and substr: first find the position of the last occurrence, then get the substring from that position up to the end. Or explode and array_pop: split the string at the / and take just the last part.
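The last two non-regex options could look roughly like this. It is only a sketch; the variable names are mine.

```php
<?php
$url = 'http://spreadsheets.google.com/feeds/spreadsheets/p1f3JYcCu_cb0i0JYuCu123';

// strrpos + substr: locate the last "/" and take everything after it.
$pos = strrpos($url, '/');
$viaSubstr = $pos === false ? $url : substr($url, $pos + 1);

// explode + array_pop: array_pop() needs a variable, not an expression.
$parts = explode('/', $url);
$viaExplode = array_pop($parts);

echo $viaSubstr, PHP_EOL;  // p1f3JYcCu_cb0i0JYuCu123
echo $viaExplode, PHP_EOL; // p1f3JYcCu_cb0i0JYuCu123
```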

You can also get the "filename", or the last part, with the basename function.
<?php
$url = 'http://spreadsheets.google.com/feeds/spreadsheets/p1f3JYcCu_cb0i0JYuCu123';
echo basename($url); // "p1f3JYcCu_cb0i0JYuCu123"
On my box I could just pass the full URL. It's possible you might need to strip off http:/ from the front.
Basename and dirname are great for moving through anything that looks like a unix filepath.

/^.*\/(.*)$/
^ = start of the row
.*\/ = greedy match from the start of the row up to the last occurrence of /
(.*) = group of everything that comes after the last occurrence of /

You can also use a plain string split:
$str = "http://spreadsheets.google.com/feeds/spreadsheets/p1f3JYcCu_cb0i0JYuCu123";
$s = explode("/",$str);
print end($s);

This pattern will not capture the last slash in $0, and it won't match anything if there's no characters after the last slash.
/(?<=\/)([^\/]+)$/
Edit: but it requires lookbehind, not supported by ECMAScript (Javascript, Actionscript), Ruby or a few other flavors. If you are using one of those flavors, you can use:
/\/([^\/]+)$/
But it will capture the last slash in $0.
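A quick sketch of what ends up in $0 and $1 for both variants, using the URL from the question:

```php
<?php
$url = 'http://spreadsheets.google.com/feeds/spreadsheets/p1f3JYcCu_cb0i0JYuCu123';

// Lookbehind: the slash is checked but not consumed.
preg_match('/(?<=\/)([^\/]+)$/', $url, $withLookbehind);

// No lookbehind: the slash is part of the overall match.
preg_match('/\/([^\/]+)$/', $url, $withoutLookbehind);

echo $withLookbehind[0], PHP_EOL;    // p1f3JYcCu_cb0i0JYuCu123
echo $withoutLookbehind[0], PHP_EOL; // /p1f3JYcCu_cb0i0JYuCu123
echo $withoutLookbehind[1], PHP_EOL; // p1f3JYcCu_cb0i0JYuCu123
```

If you read group 1 instead of group 0, both patterns give the same result.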

Not a PHP programmer, but strrpos seems a more promising place to start. Find the rightmost '/', and everything past that is what you are looking for. No regex used.
Find position of last occurrence of a char in a string

Based on @Mark Rushakoff's answer, a solution that also handles URLs with a query string or fragment:
<?php
$path = "http://spreadsheets.google.com/feeds/spreadsheets/p1f3JYcCu_cb0i0JYuCu123?var1&var2#hash";
// Everything from the last "?" onwards: ?var1&var2#hash
$vars = strrchr($path, "?");
// Strip that suffix from the basename; dumps "p1f3JYcCu_cb0i0JYuCu123"
var_dump(preg_replace('/'. preg_quote($vars, '/') . '$/', '', basename($path)));
?>
Regular Expression to collect everything after the last /
How to get file name from full path with PHP?
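As a different technique from the one above (swapped in by me, not taken from the answer), parse_url can separate the path from the query string and fragment before basename ever sees them. A minimal sketch:

```php
<?php
$path = 'http://spreadsheets.google.com/feeds/spreadsheets/p1f3JYcCu_cb0i0JYuCu123?var1&var2#hash';

// parse_url() with PHP_URL_PATH returns only the path component,
// so the query string and fragment never reach basename().
$urlPath = parse_url($path, PHP_URL_PATH);

// parse_url() can return null or false, so only call basename() on a string.
$id = is_string($urlPath) ? basename($urlPath) : '';

echo $id, PHP_EOL; // p1f3JYcCu_cb0i0JYuCu123
```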

How to write regex to find one directory in a URL?

Here is the subject:
http://www.mysite.com/files/get/937IPiztQG/the-blah-blah-text-i-dont-need.mov
What I need using regex is only the bit before the last / (including that last / too)
The 937IPiztQG string may change; it will contain a-z A-Z 0-9 - _
Here's what I tried:
$code = strstr($url, '/http:\/\/www\.mysite\.com\/files\/get\/([A-Za-z0-9]+)./');
EDIT: I need to use regex because I don't actually know the URL. I have a string like this...
a song
more text
oh and here goes some more blah blah
I need it to read that string and cut off filename part of the URLs.

You really don't need a regexp here. Here is a simple solution:
echo basename(dirname('http://www.mysite.com/files/get/937IPiztQG/the-blah-blah-text-i-dont-need.mov'));
// echoes "937IPiztQG"
Also, I'd like to quote Jamie Zawinski:
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."

This seems far too simple to use regex. Use something similar to strrpos to look for the last occurrence of the '/' character, and then use substr to trim the string.

/http:\/\/www\.mysite\.com\/files\/get\/([^\/]+)\//
How about something like this? Which should capture anything that's not a /, 1 or more times before a /.

The greediness of the regexp will ensure this works fine: ^.*/

The strstr() function does not take a regular expression for any of its arguments; it's the wrong function for regex matching or replacement.
Are you thinking of preg_replace()?
But a function like basename() would be more appropriate.

Try this
$ok=preg_match('#mysite\.com/files/get/([^/]*)#i',$url,$m);
if($ok) $code=$m[1];
Then give these pages a good read:
http://www.php.net/preg_match
preg_replace
Note:
the use of "#" as a delimiter, to avoid having to escape every "/"
the "i" flag, making the match case-insensitive (allowing more liberal spellings of the MySite.com domain name)
the $m array of captured results
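Since the question's edit mentions a larger block of text containing several URLs, preg_match_all may be a better fit than a single preg_match. A sketch under that assumption; the sample text and the second URL are made up for illustration.

```php
<?php
// Hypothetical input standing in for the text blob mentioned in the question.
$text = <<<TXT
a song http://www.mysite.com/files/get/937IPiztQG/the-blah-blah-text-i-dont-need.mov
more text http://www.MySite.com/files/get/ab_C-12/other.mov
TXT;

// "#" as delimiter avoids escaping "/"; the class follows the characters
// listed in the question (a-z, A-Z, 0-9, "-", "_").
preg_match_all('#mysite\.com/files/get/([A-Za-z0-9_-]+)/#i', $text, $m);

print_r($m[1]); // 937IPiztQG, ab_C-12
```

$m[1] holds one entry per URL found, in the order they appear in the text.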
