Rapidshare URL not matching correctly

I'm trying to make sure that a Rapidshare URL is valid when a user submits it through my form.
This is the regex that I've come up with so far:
http://rapidshare.com/files/[0-9]+/[a-zA-Z0-9\._-]+
A rapidshare link looks like this:
http://rapidshare.com/files/168501977/some_random-file.zip
My pattern matches, but not entirely correctly. For example, if we use this input:
http://rapidshare.com/files/168501977/some_random-file.zip£%^$
preg_match() still reports a match and lets it through, even though there are illegal characters at the end of the URL. I want the pattern to match the entire input, not just a leading portion of it.
Any help would be appreciated, cheers!

You need to anchor the regex pattern. Use ^ to anchor the beginning and $ to anchor the end, so the pattern becomes:
^http://rapidshare.com/files/[0-9]+/[a-zA-Z0-9\._-]+$
This prevents the partial match that your example input produces. Note that the unescaped . in rapidshare.com matches any character; write \. if you want a literal dot.
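A minimal sketch of the anchored check in PHP. The helper name isRapidshareUrl is made up for illustration. The sketch also escapes the dot in rapidshare.com, uses # as the delimiter so the slashes need no escaping, and adds the D modifier so that $ cannot match just before a trailing newline.

```php
<?php
// Hypothetical helper: returns true only when the WHOLE string is a
// Rapidshare file URL of the form shown in the question.
function isRapidshareUrl(string $url): bool
{
    // ^ and $ anchor both ends; D makes $ match only at the very end.
    $pattern = '#^http://rapidshare\.com/files/[0-9]+/[a-zA-Z0-9._-]+$#D';
    return preg_match($pattern, $url) === 1;
}

var_dump(isRapidshareUrl('http://rapidshare.com/files/168501977/some_random-file.zip'));     // bool(true)
var_dump(isRapidshareUrl('http://rapidshare.com/files/168501977/some_random-file.zip£%^$')); // bool(false)
```

Comparing the result with === 1 also keeps a regex error (preg_match returning false) from being mistaken for a match.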

Validate the start and the end of your string using ^ and $. Example:
^ht{2}p:\/{2}rapidshare\.com\/files\/\d+\/[\.a-zA-Z0-9_-]+$

Related

PHP Regex to match url contains url fragment

I have a URL fragment, page/login, and I need to know whether another URL contains it.
These should match:
/admin/page/login/
/admin/page/login
admin/page/login
http://www.dot.com/admin/page/login
/admin/page/login?id=10
/admin/page/login/id/10
/admin/page/login/?id=10
/admin/page/login/user?id=10
/admin/page/login/user/?id=10
page/login
page/login/
page/login/id/10
/page/login/id/10
And these should not:
/admin/firstpage/login
admin/page/loginOk
/admin/page/loginOk/id/10
mypage/login/id/10
/mypage/login/id/10
mypage/login
I tried page\/login[\/\s\?] and \/?page\/login[\/\s\?], without success.

You can use a word boundary so that partial words such as mypage are not matched.
\bpage\/login[\/\s?]
Demo: https://regex101.com/r/yhNsdw/1/
Also, if you change your delimiter, none of the forward slashes will need to be escaped.
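A rough PHP sketch of the word-boundary approach, using ~ as the delimiter so the slashes are unescaped. The function name containsLoginFragment is hypothetical. One assumption goes beyond the answer: the trailing character class is extended to (?:[/\s?]|$), because when each URL is tested as a separate string, an input that ends right after login has no following character for the class to match.

```php
<?php
// Hypothetical helper. (?:[/\s?]|$) is the answer's character class plus an
// end-of-string alternative; that alternative is an addition for this sketch.
function containsLoginFragment(string $url): bool
{
    return preg_match('~\bpage/login(?:[/\s?]|$)~', $url) === 1;
}

$shouldMatch = ['/admin/page/login/', '/admin/page/login', 'page/login', '/admin/page/login?id=10'];
$shouldNotMatch = ['/admin/firstpage/login', 'admin/page/loginOk', 'mypage/login'];

foreach ($shouldMatch as $url) {
    var_dump(containsLoginFragment($url)); // bool(true) for each
}
foreach ($shouldNotMatch as $url) {
    var_dump(containsLoginFragment($url)); // bool(false) for each
}
```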

Regex in php: Compulsory second occurrence of word

I need to match a few URLs for an application I'm working on.
So, I've got this reference string:
content/course/32/lesson/61/content/348
and I need a pattern that matches either
content
OR
content/course/[number]/lesson/[number]/content/[number]
What I've done so far is come up with this pattern:
$my_regex = "/content(\/?|(\/course\/\d{1,4}\/lesson\/\d{1,4}\/content\/\d{1,4}))$/";
However, it has a problem: the following string matches when it should not:
content/course/32/lesson/61/content
I'm thinking that it's got something to do with the word content repeating twice but I'm not entirely sure.
Any help is much appreciated.

The reason for the match is the alternation.
content\/?$
matches
content/course/32/lesson/61/content
To fix this, add a ^ (beginning of line) to the start of your regex to ensure the entire string is matched and not only the ending:
/^content(\/?|(\/course\/\d{1,4}\/lesson\/\d{1,4}\/content\/\d{1,4}))$/
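The effect of the added ^ can be checked with a short PHP sketch. The function name matchesContentPath is hypothetical; the pattern is the anchored one above, rewritten with ~ as the delimiter and with a non-capturing group, which does not change what it matches.

```php
<?php
// Hypothetical helper wrapping the anchored pattern from the answer.
function matchesContentPath(string $path): bool
{
    $pattern = '~^content(?:/?|/course/\d{1,4}/lesson/\d{1,4}/content/\d{1,4})$~';
    return preg_match($pattern, $path) === 1;
}

$inputs = [
    'content',
    'content/course/32/lesson/61/content/348',
    'content/course/32/lesson/61/content', // matched before ^ was added
];
foreach ($inputs as $input) {
    printf("%-42s %s\n", $input, matchesContentPath($input) ? 'match' : 'no match');
}
```

The first two inputs report a match and the third does not, because content\/?$ can no longer match only the tail of the string.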

This also works, although only the first alternative is anchored at the start of the string:
/(^content\/?|content\/course\/\d{1,4}\/lesson\/\d{1,4}\/content\/\d{1,4})$/

multiple text replace inside a string, while keeping the selected variable

I have a web script that builds an HTML page in a PHP string, then delivers it to the user. All of the pages are generated by index.php, each with a unique URL.
domain.host.com/index.php?loadpage=/BLAH
The homepage is static HTML, but every other page is dynamically generated into this PHP string. It may seem like I'm rambling; I'm just trying to give as much information as possible. I have written some JavaScript to modify the link URL:
BLAH Link
This shows the nice, neat link in the status bar, but the JavaScript sends the browser to the URL I want (I have no need to modify the URL bar, as this is in an iframe).
These links are fine on the static page, but the dynamically generated page held in the PHP string is a little harder. I need to search through the string for every occurrence of:
href="?loadpage=/ [WILDCARD] " title=
and replace it with:
href="http://domain.com/ [WILDCARD] " onclick="location.href='?loadpage=/ [WILDCARD] '; return false;" title=
This seems very complicated to me. I think it could be done with ereg or preg match/replace, but I have no clue about regex.
In short, I need some way of searching through a PHP string that contains the full page HTML and replacing the first string with the second, on every occurrence of a link with '?loadpage=/'. Each link will have a different [WILDCARD], so I presume the script will need to find every occurrence, save the [WILDCARD] to a variable, then do the replacement and insert the saved value into the new URL.
EDIT:
Just to clarify what the original link looks like:
<a id="random" href="?loadpage=/BLAH" title="BLAH Title"></a>
This is why I am only searching from the href attribute.

You are right, what you need is a regex (your need for a wildcard replace is the clue). This answer is not supposed to be a complete solution, just to give you an idea of how regexes work. I will leave it to you to integrate this with PHP (try preg_replace, or preg_match_all if you only need to find the links).
This is the pattern you want to match:
"\?loadpage=\/([^"]*)"
The \ is an escape for characters that have special meaning in regexes.
So, ignoring the escapes, this is:
"?loadpage=/    the start of the string, up to the wildcard part
( )             capturing parentheses, marking the part that you want to access in the replacement string
[^"]*           any number of occurrences of any character that is NOT a double quote
                  ^ is the negation symbol
                  * indicates "zero or more occurrences"
"               followed by the closing double-quote character
Now you need a replacement string. For this you just need to know that your capturing parentheses allow you to recall that part of the match. In most regex flavours you can refer to these captures through a series of numbered variables, usually written $1, $2, $3... or \1, \2, \3... In your case you only have one capture to deal with.
So your replacement string could look like:
"http://domain.com/$1/" onclick="location.href='?loadpage=/$1'; return false"
In perl you would put the whole thing together like this:
$string =~ s|"\?loadpage=\/([^"]*)"|"http://domain.com/$1/" onclick=\"location.href='?loadpage=/$1'\; return false"|g;
Note that you don't need to escape your quote marks. This may differ in PHP.
As you will see, it easily gets very cryptic. regular-expressions.info is a useful online reference.
Just so you know what you are looking at (you won't need to do this in PHP):
=~ is the Perl regex operator (you won't use this in PHP; take a look at the preg_replace documentation)
then you have the form
s|match_pattern|replace_pattern|g;
where s indicates replacement (as opposed to simple matching)
g indicates global matching (otherwise the process will stop on the first match)
||| are the separators. They are usually written ///, but then you would have to escape every / in your URLs, which doubles the illegibility.
But this is now too much Perl-specific detail, so read the PHP regex docs!
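For reference, here is a minimal sketch of how the same substitution might look in PHP with preg_replace. The function name rewriteLoadpageLinks is invented for illustration, and domain.com is the placeholder host from the question. Unlike the Perl example, this sketch includes href= in the match and omits the trailing slash after $1, following the output the question asks for.

```php
<?php
// Hypothetical helper: rewrites every href="?loadpage=/..." in $html.
// preg_replace replaces all occurrences by default, so no "g" flag is needed.
function rewriteLoadpageLinks(string $html): string
{
    // ~ is the delimiter; ([^"]*) captures the wildcard part as $1.
    $pattern = '~href="\?loadpage=/([^"]*)"~';
    $replacement = 'href="http://domain.com/$1" onclick="location.href=\'?loadpage=/$1\'; return false;"';

    // Fall back to the untouched input if preg_replace reports an error (null).
    return preg_replace($pattern, $replacement, $html) ?? $html;
}

$html = '<a id="random" href="?loadpage=/BLAH" title="BLAH Title"></a>';
echo rewriteLoadpageLinks($html), "\n";
// <a id="random" href="http://domain.com/BLAH" onclick="location.href='?loadpage=/BLAH'; return false;" title="BLAH Title"></a>
```

The sketch copies the captured text verbatim into the new attributes, so it assumes the wildcard part contains nothing that needs escaping for HTML or JavaScript.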

Convert PHP RegEx to JavaScript RegEx

I have a PHP regular expression I'm using to get the YouTube video code out of a URL.
I'd love to match this with a client-side regular expression in JavaScript. Can anyone tell me how to convert the following PHP regex to JavaScript?
preg_match("#(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=v\/)[^&\n]+(?=\?)|(?<=embed/)[^&\n]+|(?<=v=)[^&\n]+|(?<=youtu.be/)[^&\n]+#", $url, $matches);
Much appreciated, thanks!

I think the only problem is the lookbehind assertions (?<=...), which older JavaScript engines do not support (lookbehind was only added to the language in ES2018).
Their advantage is that you can use them to ensure that a pattern is preceded by something without including that something in the match.
So, for engines without lookbehind, you need to remove them: change (?<=v=)[a-zA-Z0-9-]+(?=&) to v=[a-zA-Z0-9-]+(?=&). Now, however, your match starts with "v=".
If you just need to validate and don't need the matched part, then that's fine and you are done.
If you need the part after v=, put the wanted pattern into a capturing group instead and continue working with the captured values:
v=([a-zA-Z0-9-]+)(?=&)
You will then find the matched substring in $1 for the first group, $2 for the second, $3 for the third, and so on.
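The capture-group form can be tried out in PHP before porting it. This is a small sketch that assumes only the first alternative of the original expression; the function name extractVideoId and the video ID in the sample URL are made up.

```php
<?php
// Hypothetical helper using the lookbehind-free pattern from the answer:
// "v=" is matched as ordinary text and only the ID lands in group 1.
function extractVideoId(string $url): ?string
{
    if (preg_match('~v=([a-zA-Z0-9-]+)(?=&)~', $url, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// The video ID below is a made-up placeholder.
var_dump(extractVideoId('http://www.youtube.com/watch?v=abc123-XYZ&feature=related')); // string(10) "abc123-XYZ"
var_dump(extractVideoId('http://www.youtube.com/watch?v=abc123-XYZ'));                 // NULL
```

The second call returns NULL because this alternative requires an & after the ID. The same pattern text works in a JavaScript regex literal, /v=([a-zA-Z0-9-]+)(?=&)/, where the captured ID is element 1 of the array returned by match().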

You can replace your lookbehind assertions using the approach in this post:
Javascript: negative lookbehind equivalent?

Regex in preg_replace to detect url format and extract elements

I need to replace certain user-entered URLs with embedded Flash objects, and I'm having trouble with the regex that I'm using to match the URL. I think it's mainly because the URLs are SEO-friendly and therefore a bit more difficult to parse.
URL structure: http://www.site.com/item/item_title_that_can_include_1('_etc-32CHARACTERALPHANUMERICGUID
I need to both detect a URL in that format and capture the 32CHARACTERALPHANUMERICGUID, which always comes after the - in the URL.
Something like this:
$ret = preg_replace('#http://www\.site\.com/item/([^-])-([a-zA-Z0-9]+)#','<embed>itemid=$2</embed>', $ret);
For some reason, the above does not find a match for a URL in the specified format. I'm new to regexes, so I think I'm missing something fairly obvious.

You should check out parse_url().
Examine the results - it was made for parsing URLs. You'll be able to extract the data you require from the tokens returned.
If you are regex crazy, try this...
/^http:\/\/www\.site\.com\/item\/[^-]*\-([a-zA-Z0-9]{32})$/
Your example is almost there, but...
When you use a negated character class, i.e. [^-], you still need a quantifier. I added *, meaning zero or more.
You don't seem to use the item title, so we won't bother capturing it.
You should use beginning (^) and end ($) anchors if the string is always exactly like that.
You say the GUID is 32 chars, so we may as well explicitly state that with the {32} quantifier.
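Both suggestions can be sketched in PHP as follows. The function names and the sample GUID are invented for illustration. The second function is one possible way of combining parse_url() with a regex, not something the answer spells out: it inspects only the path and does not check the host.

```php
<?php
// The answer's anchored pattern, with # as the delimiter.
// Returns the captured GUID, or null when the URL does not match.
function extractItemGuid(string $url): ?string
{
    $pattern = '#^http://www\.site\.com/item/[^-]*-([a-zA-Z0-9]{32})$#';
    return preg_match($pattern, $url, $m) === 1 ? $m[1] : null;
}

// Variant built on parse_url(): only the path is examined, and the greedy .*
// lets the title itself contain hyphens. The host is not verified here.
function extractItemGuidFromPath(string $url): ?string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    return preg_match('#^/item/.*-([a-zA-Z0-9]{32})$#', $path, $m) === 1 ? $m[1] : null;
}

$guid = str_repeat('a1B2', 8); // made-up 32-character GUID

var_dump(extractItemGuid("http://www.site.com/item/some_title-$guid") === $guid);         // bool(true)
var_dump(extractItemGuid("http://www.site.com/item/some-title-$guid"));                   // NULL
var_dump(extractItemGuidFromPath("http://www.site.com/item/some-title-$guid") === $guid); // bool(true)
```

The second call shows a limit of [^-]*: a title that itself contains a hyphen is rejected, which the path-based variant tolerates.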
