PHP regex vsprintf adding slash on file extension

I have a problem with the following piece of code: pastebin. For example:
/^\/index\.php\/index\/home\/(\w+)$/
It adds a backslash before the .php extension. Any ideas how to fix it?

If you pass that example as the URI, line 10 of your code calls preg_quote($uri), and that is the reason. Since the dot (.) has a special meaning in a regex, the function escapes it.
But I believe that is what you want: if you strip that backslash, your regex will match ANY character at that position, not just the dot. So all of these would be valid:
indexBphp
index-php
indexmphp
index.php
etc...
An unescaped dot in a regex means "match any character at this position", so I believe nothing is actually wrong here.
If you still want an unescaped dot there, one way is to build the regex in two separate parts:
$urlDivided = explode('.php', $url);
$this->finalRegex = preg_quote($urlDivided[0]) . '.php' . preg_quote($urlDivided[1]);
Obviously, the method above assumes that the URL always contains the '.php' extension, and only once. You should add sanity checks.
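The escaping described in this answer can be checked directly. The following is a minimal sketch; the URI is a made-up example (the asker's pastebin code is not available), and passing '/' as the delimiter to preg_quote() is an assumption about how the pattern is built.

```php
<?php
// Hypothetical URI, not taken from the asker's code.
$uri = '/index.php/index/home';

// preg_quote() escapes regex metacharacters such as the dot;
// passing '/' as the second argument also escapes the delimiter.
$quoted = preg_quote($uri, '/');
echo $quoted, "\n"; // \/index\.php\/index\/home

$escaped   = '/^' . $quoted . '$/';
$unescaped = '/^\/index.php\/index\/home$/'; // dot left unescaped on purpose

var_dump(preg_match($escaped, '/index.php/index/home'));   // int(1)
var_dump(preg_match($escaped, '/indexBphp/index/home'));   // int(0)
var_dump(preg_match($unescaped, '/indexBphp/index/home')); // int(1)
```

The last call shows why the backslash is worth keeping: without it, the pattern accepts any character in place of the dot.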

Related

Regex to replace domain of url if it's ending with .css

I'm trying to write a PHP script that replaces ONLY the domain of every URL in the content with a new domain, if the URL ends with .css.
For example:
www.example.com/asset/css/style.css
After checking condition and replacement we have:
www.new-domain.net/asset/css/style.css
Could anyone please help me find the correct pattern for this?
So far I've tried this:
preg_replace('/[http://].*\.(css)/i','www.new-domain.net',$Html_contents)

If I understood correctly, you should try something like:
preg_replace('/(https?:\/\/|)?[^\/]*(?=\/.*\.css$)/i','$1www.new-domain.net',$Html_contents)
Where
(https?:\/\/|)? means that the string http:// (or https://) is optional
[^\/]* means "any run of characters other than /"
(?=\/.*\.css$) means "a /, followed by anything, followed by a literal dot, followed by css, followed by end of string"

If the domain is static, you can do this without a regex:
$old_domain = 'https://www.example.com/asset/css/style.css';
// Only touch URLs that end in .css
if (substr($old_domain, -4) === '.css') {
    echo str_replace('www.example.com', 'www.new-domain.net', $old_domain);
}
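For content that contains several URLs, a callback-based replacement is one possible approach. The sketch below uses a different pattern from the ones in the answers above and rests on assumptions: replaceCssDomain() is a hypothetical helper name, hosts are assumed to contain at least one dot, and URLs are assumed to end at whitespace, quotes or angle brackets. Protocol-relative URLs (//host/...) are deliberately not handled.

```php
<?php
// Hypothetical helper: swaps the host of every URL whose path ends in .css.
function replaceCssDomain(string $html, string $newDomain): string
{
    // (?<![\w/.-])  do not start in the middle of a word or path
    // group 1: optional scheme, group 2: host with a dot, group 3: path ending in .css
    $pattern = '~(?<![\w/.-])(https?://)?([a-z0-9-]+(?:\.[a-z0-9-]+)+)(/[^\s"\'<>]*\.css)\b~i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($newDomain): string {
            // Keep scheme and path, replace only the host.
            return $m[1] . $newDomain . $m[3];
        },
        $html
    );
}

echo replaceCssDomain('www.example.com/asset/css/style.css', 'www.new-domain.net'), "\n";
// www.new-domain.net/asset/css/style.css
```

URLs that do not end in .css, such as script sources, are left untouched because the third group cannot match.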

regex to clean up url

I am looking for a way to get a valid url out of a string like:
$string = 'http://somesite.com/directory//sites/9/my_forms/3-895a3e/somefilename.jpg|:||:||:||:|19845';
My original solution was:
preg_match('#^[^:|]*#', str_replace('//', '/', $string), $modifiedPath);
But obviously it is going to remove a slash from the http:// instead of the one in the middle of the string.
The output I expect from the original is:
http://somesite.com/directory/sites/9/my_forms/3-895a3e/somefilename.jpg
I could always break off the http part of the string first but would like a more elegant solution in the form of regex if possible. Thanks.

This will do exactly what you are asking:
<?php
$string = 'http://somesite.com/directory//sites/9/my_forms/3-895a3e/somefilename.jpg|:||:||:||:|19845';
preg_match('/^([^|]+)/', $string, $m); // get everything up to and NOT including the first pipe (|)
$string = $m[1];
$string = preg_replace('/(?<!:)\/\//', '/' ,$string); // replace all occurrences of // as long as they are not preceded by :
echo $string; // outputs: http://somesite.com/directory/sites/9/my_forms/3-895a3e/somefilename.jpg
exit;
?>
EDIT:
(?<!X) in regular expressions is the syntax for what is called a lookbehind. The X is replaced with the character(s) we are testing for.
The following expression would match every instance of a double slash (//):
\/\/
But we need to make sure that the match we are looking for is NOT preceded by the : character so we need to 'lookbehind' our match to see if the : character is there. If it is then we don't want it to be counted as a match:
(?<!:)\/\/
The ! is what says NOT to match in our lookbehind. If we changed it to (?<=:)\/\/ (a positive lookbehind), then it would only match the double slashes that did have the : preceding them.
A quick lookahead and lookbehind tutorial can explain it all better than I can.
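The difference between the negative and the positive lookbehind can be seen side by side. This is a small sketch on a shortened, made-up URL rather than the exact string from the question.

```php
<?php
// Shortened example URL (an assumption, not the asker's full string).
$url = 'http://somesite.com/directory//sites/9/file.jpg';

// Negative lookbehind: collapse // unless a colon comes right before it.
$collapsed = preg_replace('/(?<!:)\/\//', '/', $url);
echo $collapsed, "\n"; // http://somesite.com/directory/sites/9/file.jpg

// Positive lookbehind: count only the // that directly follows a colon.
$count = preg_match_all('/(?<=:)\/\//', $url);
echo $count, "\n"; // 1
```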

Assuming all your strings are in the form given, you need only the simplest of regexes to do this; if you want an elegant solution, a complex regex is definitely not what you need. Also, double slashes are legal in a URL, and most servers treat them like a single slash, just as in a Unix path, so you may not need to get rid of them at all.
Why not just
$url = preg_split('/\|/', $string)[0]; // index the result directly; array_shift() expects a variable, not a function result
?
If you really, really care about getting rid of the double slashes in the URL, then you can follow this with
$url = preg_replace('/([^:])\/\//', '$1/', $url);
or even combine them into
$url = preg_replace('/([^:])\/\//', '$1/', preg_split('/\|/', $string)[0]);
although that last form gets a little bit hairy.

Since this is a quite strictly defined situation, I'd consider just one preg to be the most elegant solution.
Off the top of my head:
$sanitizedURL = preg_replace('~((?<!:)/(?=/)|\\|.+)~', '', $rawURL);
Basically, this looks for any forward slash that IS NOT preceded by a colon (:) and IS followed by another forward slash. It also matches the first pipe character and everything after it.
Anything found is removed from the result.
I can explain the RegEx in more detail if you like.
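As a quick check, the single-pattern approach can be run against the string from the question. $rawURL is simply given the question's input here; nothing else is assumed.

```php
<?php
$rawURL = 'http://somesite.com/directory//sites/9/my_forms/3-895a3e/somefilename.jpg|:||:||:||:|19845';

// First alternative: a slash not preceded by ':' and followed by another slash.
// Second alternative: the first pipe and everything after it.
$sanitizedURL = preg_replace('~((?<!:)/(?=/)|\\|.+)~', '', $rawURL);

echo $sanitizedURL, "\n";
// http://somesite.com/directory/sites/9/my_forms/3-895a3e/somefilename.jpg
```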

Regex to find lines that start with /*

I need a regular expression to find all the lines that begin with /*
$num_queries = preg_match_all(
'REG_EXP',
file_get_contents(__DIR__ . DIR_PLANTILLAS . '/' . 'temp_template.sql')
);
I tried '^\/\*.*' but it does not work.

If you use the string /^\/\*.*/ in the preg_match() function, it'll work. This pattern matches /* at the start of the subject, optionally followed by some text.
Make sure the regular expression is applied to each line. I recommend that you first split the string (the file contents) on newlines. You can use the function preg_split() to do so.
If you don't want to split the file contents into lines first, you can use the following pattern: /(^|\n)\/\*(.*)/. That pattern first matches either the beginning of the string or a newline, followed by /*, optionally followed by some text.
Notice that in the patterns /^\/\*.*/ and /(^|\n)\/\*(.*)/ the / is used as the delimiter. That means that any further occurrences of / must be escaped.

Please note that you are dealing with multiline content, but by default ^ means the beginning of the content, not the beginning of a line.
Try (\/[^\r\n]+)[\r\n$]+

Try this.
^\/\*+[^\n]*$
Edit: correction re escaping the /
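Another option the answers above do not spell out is the m (multiline) modifier, which makes ^ and $ anchor at every line. A minimal sketch, assuming a small made-up SQL string in place of the contents of temp_template.sql:

```php
<?php
// Hypothetical template contents, standing in for the asker's file.
$sql = "/* first query */\nSELECT 1;\n/* second query */\nSELECT 2;\n  /* indented */";

// With the m modifier, ^ matches at the start of each line, not only at
// the start of the whole string.
$num_queries = preg_match_all('/^\/\*.*$/m', $sql, $matches);

echo $num_queries, "\n"; // 2
print_r($matches[0]);    // the two lines that start with /*
```

The indented comment is not counted because its line starts with spaces rather than /*.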

php PCRE regex to get only the file name that terminates in .txt

I am trying to form a PCRE regex in PHP, specifically for use with preg_replace, that will match any number of characters that make up a text (.txt) file name; from this I will derive the directory of the file.
My initial approach was to define the terminating .txt string, then specify a character match on every character except / or \, so I ended up with something like:
'/[^\\\\/]*\.txt$/'
but this didn't seem to work at all. I assumed it might be interpreting the negation in De Morgan's form, i.e.:
(A+B)' <=> A'B'
but after attempting this test:
'/[^\\\\]\|[^/]*\.txt$/'
I got the same result, which made me think that I shouldn't escape the or operator (|), but that also failed to match. Does anyone know what I'm doing wrong?

The following regular expression should work for getting the filename of .txt files:
$regex = "#.*[\\\\/](.*?\.txt)$#";
How it works:
.* is greedy and thus forces match to be as far to the right as possible.
[\\\\/] ensures that we have a \ or / in front of the filename.
(.*?\.txt) uses non-greedy matching to ensure that the filename is as small as possible, followed by .txt, capturing it into group 1.
$ forces match to be at end of string.
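The pattern above can be wrapped into a small function to get both the file name and its directory. This is a sketch only: splitTxtPath() is a hypothetical name and the sample paths are made up. It also sidesteps a likely cause of the failure in the question, where an unescaped / inside the character class ends a /-delimited pattern early; using # as the delimiter avoids that.

```php
<?php
// Hypothetical helper: returns [directory, file name] for a path ending
// in a .txt file, or null when the path does not match.
function splitTxtPath(string $path): ?array
{
    // Same pattern as in the answer, written as a single-quoted string.
    if (preg_match('#.*[\\\\/](.*?\.txt)$#', $path, $m) !== 1) {
        return null;
    }
    $file = $m[1];
    // Everything before the file name is the directory part.
    $dir = substr($path, 0, -strlen($file));
    return [$dir, $file];
}

var_dump(splitTxtPath('/var/data/report.txt')); // ['/var/data/', 'report.txt']
var_dump(splitTxtPath('C:\files\notes.txt'));   // ['C:\files\', 'notes.txt']
var_dump(splitTxtPath('/var/data/image.png'));  // NULL
```

Note that the pattern requires a separator before the file name, so a bare 'notes.txt' without a directory returns null.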

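Try this pattern: '/\b(?P<files>[\w.-]+\.txt)\b/mi' (the hyphen goes last in the character class, because PCRE2, used by PHP 7.3 and later, rejects [\w-.] as an invalid range)
$PATTERN = '/\b(?P<files>[\w.-]+\.txt)\b/mi';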
$subject = 'foo.bar.txt plop foo.bar.txtbaz foo.txt';
preg_match_all($PATTERN, $subject, $matches);
var_dump($matches["files"]);

removing dots and slashes regex - non relative

How can I remove the leading slashes and dots from a non-root-relative path?
For instance, from ../../../somefile/here/ (regardless of how deep it is) I want to get just /somefile/here/

No regex needed; use ltrim() with "/." as the character list instead. Like this:
echo "/".ltrim("../../../somefile/here/", "/.");
This outputs:
/somefile/here/

You could use the realpath() function PHP provides. This requires the file to exist, however.

If I understood you correctly:
$path = "/".str_replace("../","","../../../somefile/here/");

This should work:
<?php
echo "/".preg_replace('/\.\.\/+/',"","../../../somefile/here/")
?>

You could try :
<?php
$str = '../../../somefile/here/';
$str = preg_replace('~(?:\.\./)+~', '/', $str);
echo $str,"\n";
?>

(\.*/)*(?<capturegroup>.*)
The first group matches some number of dots followed by a slash, an unlimited number of times; the second group is the one you're interested in. This will strip your leading slash, so prepend a slash.
Beware that this is doing absolutely no verification that your leading string of slashes and periods isn't something patently stupid. However, it won't strip leading dots off your path, like the obvious ([./])* pattern for the first group would; it finds the longest string of dots and slashes that ends with a slash, so it won't hurt your real path if it begins with a dot.
Be aware that the obvious "/." ltrim() strategy will strip leading dots from directory names, which is bad if your first directory has one. That is entirely plausible, since leading dots are used for hidden directories.
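The difference is easy to see with a path whose first real directory is hidden. In this sketch the path is a made-up example, and the regex is an anchored, non-capturing variant of the first group described above.

```php
<?php
// Hypothetical path whose first directory starts with a dot.
$path = '../../.hidden/here/';

// ltrim() strips every leading '/' and '.', including the dot of ".hidden".
$withLtrim = '/' . ltrim($path, '/.');
echo $withLtrim, "\n"; // /hidden/here/

// The regex only removes runs of dots that are followed by a slash.
$withRegex = '/' . preg_replace('~^(?:\.*/)*~', '', $path);
echo $withRegex, "\n"; // /.hidden/here/
```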
