Follow cloaked link with chain of numbers to front - php

I'm building a small link-cloaking script, but each link has to be matched together with a different trailing number, e.g. 'mylinkname1-1597'. By the way, the number is always an integer.
The problem is that I never know the number in advance, so I thought of using a regex, but something is failing.
Here's what I got now:
$pattern = '/-([0-9]+)/';
$v = $_GET['v'];
if ($v == 'mylinkname1'.'-'.$pattern) {$link = 'http://example1.com/';}
if ($v == 'mylinkname2'.'-'.$pattern) {$link = 'http://example2.com/';}
if ($v == 'mylinkname3'.'-'.$pattern) {$link = 'http://example3.com/';}
header("Location: $link") ;
exit();

The dash is already in the pattern so you don't have to add it in the if clause.
You can omit the capturing group around the digits -[0-9]+, and you have to use the pattern with preg_match.
You might update the format of the if statements to:
$pattern = '-[0-9]+';
if (preg_match("/mylinkname1$pattern/", $v)) {$link = 'http://example1.com/';}
To prevent mylinkname1-1597 from matching as part of a larger string, you can surround the pattern with the anchors ^ and $, which assert the start and end of the string, or with word boundaries \b.
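A minimal sketch of the anchored variant follows. It assumes a hypothetical helper resolve_link() and a lookup array in place of the separate if statements; neither appears in the original code.

```php
<?php
// Hypothetical lookup table: link name => target URL (values taken from the question).
$targets = [
    'mylinkname1' => 'http://example1.com/',
    'mylinkname2' => 'http://example2.com/',
    'mylinkname3' => 'http://example3.com/',
];

// Hypothetical helper: returns the target URL, or null when $v does not
// have the form "<known name>-<digits>".
function resolve_link(string $v, array $targets): ?string
{
    // ^ and $ anchor the match to the whole string; the D modifier stops $
    // from also matching just before a trailing newline.
    if (preg_match('/^([a-z0-9]+)-[0-9]+$/D', $v, $m) && isset($targets[$m[1]])) {
        return $targets[$m[1]];
    }
    return null;
}

var_dump(resolve_link('mylinkname1-1597', $targets));   // the example1 URL
var_dump(resolve_link('xmylinkname1-1597x', $targets)); // NULL
```

The captured name is looked up in the array, so adding a link means adding one array entry rather than another if statement.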

There is no need for regular expressions here at all: just split the string on the hyphen and match only the first part. I also recommend a switch statement once you have three or more if/else branches:
$v=explode('-',$_GET['v']);
switch ($v[0]) {
case "mylinkname1":
$link = 'http://example1.com/';
break;
case "mylinkname2":
$link = 'http://example2.com/';
break;
case "mylinkname3":
$link = 'http://example3.com/';
break;
default:
exit("something not right"); // stop here: $link is not set, so do not redirect
}
header("Location: $link") ;
exit();
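The same idea can be sketched with an array lookup instead of a switch. The helper name link_for() is hypothetical and only for illustration; the names and URLs are the ones from the question.

```php
<?php
// Hypothetical helper (not part of the original answer): maps the part of $v
// before the first hyphen to a target URL, or returns null for unknown names.
function link_for(string $v): ?string
{
    $targets = [
        'mylinkname1' => 'http://example1.com/',
        'mylinkname2' => 'http://example2.com/',
        'mylinkname3' => 'http://example3.com/',
    ];
    // Limit of 2: only the first hyphen splits the string.
    $name = explode('-', $v, 2)[0];
    return $targets[$name] ?? null;
}

var_dump(link_for('mylinkname2-42')); // the example2 URL
var_dump(link_for('unknown-42'));     // NULL
```

The caller can then send the Location header only when the result is not null, and respond with an error otherwise. Note that this approach does not check that the part after the hyphen is a number.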

Related

What is the Regex for Only One # Character?

I am looking for a Regex to use in PHP in order to match one character; the # symbol.
For example, if I typed: P#ssword into an input, the Regex will match. If I typed P##ssword into an input, the regex will not match.
Here is my PHP Code that I am using:
<?php
session_start();
if($_SERVER['REQUEST_METHOD'] == "POST") {
$conn=mssql_connect('d','dd','d');
mssql_select_db('d',$conn);
if(! $conn )
{
die('Could not connect: ' . mssql_get_last_message());
}
$username = ($_POST['username']);
$password = ($_POST['password']);
if (preg_match("[\W]",$_POST["password"]))
{
if (!preg_match("^[^#]*#[^#]*$",$_POST["password"]))
{
header("location:logingbm.php");
} else {
}
}
if(!filter_var($_POST['username'], FILTER_VALIDATE_EMAIL))
{
if ($_POST["username"])
{
if ($_POST["password"])
{
$result = mssql_query("SELECT * FROM staffportal WHERE email='".$username."' AND
password='".$password."'");
if(mssql_num_rows($result) > 0) {
$_SESSION['staff_logged_in'] = 1;
$_SESSION['username'] = $username;
}}}} else {
if ($_POST["password"])
{
$result = mssql_query("SELECT * FROM staffportal WHERE email='".$username."' AND
password='".$password."'");
if(mssql_num_rows($result) > 0) {
$_SESSION['staff_logged_in'] = 1;
$_SESSION['username'] = $username;
}}}}
if(!isset($_SESSION['staff_logged_in'])) {
header("location:logingbm.php");
echo "<script>alert('Incorrect log-in information!');</script>";
} else {
header("location:staffportal.php");
}
?>
Other lightweight approaches...
Without regex
Just use substr_count (see demo)
<?php
$str1 = "pa#s#s";
$str2 = "pa#ss";
echo (substr_count($str1,"#")==1)?"beauty\n":"abject\n"; // abject
echo (substr_count($str2,"#")==1)?"beauty\n":"abject\n"; // beauty
With regex
EDIT: just saw that Sam wrote something equivalent.
If you want to use regex, you could use this fairly simple regex:
#
How? This code (see demo)
<?php
$str1 = "pa#s#s";
$str2 = "pa#ss";
$regex = "~#~";
echo (preg_match_all($regex,$str1,$m)==1)?"beauty\n":"abject\n"; // abject
echo (preg_match_all($regex,$str2,$m)==1)?"beauty\n":"abject\n"; // beauty
The easiest way would be to use the return value of preg_match_all().
Returns the number of full pattern matches (which might be zero), or FALSE if an error occurred.
Example:
$count = preg_match_all('/#/', $password, $matches);
Non-regex solution (based on @cdhowie's comment):
$string = 'P#ssword';
$length = strlen($string);
$count = 0;
for($i = 0; $i < $length; $i++) {
if($string[$i] === '#') {
$count++;
}
}
This works because you can access the characters of a string as you would the elements of a normal array (with $var = 'foo', $var[0] is 'f').
As I said in my comment, your pattern needs delimiters /, #, ~ or whatever you want (see the PHP doc for that and test yourself).
To quickly check that a string contains exactly one #, you can do this:
if (preg_match('~\A[^#]*#[^#]*\z~', $yourstr))
echo 'There is one #';
else
echo 'There is more than one # or zero #';
This regexp will do what you want:
^[^#]*#[^#]*$
This matches any line that contains one and only one #.
Explanation
^ matches the beginning of the line
[^#]* matches everything before the #
# matches the # character
[^#]* matches everything after the #
$ matches the end of the line
Use
preg_match('~^[^#]*#[^#]*$~', $passwd); // Matches $passwd if it contains exactly one # character (~ is the delimiter, because # occurs in the pattern itself)
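A short sketch of the pattern in use, assuming a hypothetical helper name has_exactly_one_hash(). It uses ~ as the delimiter because the pattern itself contains #.

```php
<?php
// Hypothetical helper name; the pattern is the one explained above.
function has_exactly_one_hash(string $s): bool
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match('~^[^#]*#[^#]*$~', $s) === 1;
}

foreach (['P#ssword', 'P##ssword', 'Password'] as $candidate) {
    printf("%s: %s\n", $candidate, has_exactly_one_hash($candidate) ? 'match' : 'no match');
}
```

Comparing the result with === 1 keeps an error return (false) from being mistaken for "no match" or "match".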
Here's what your regex code means:
If there is at least one non-word character in the string ([\W]), there must be exactly one # character. There may be any number of any other characters before and after the #: letters, digits, control characters, punctuation, anything. Anything but #.
What I'm wondering is, are you trying to say there can be no more than one # (i.e. zero or one)? That's pretty simple, conceptually; just get rid of the first regex check ("[\W]") and change the second regex to this:
"^[^#]*(?:#[^#]*)?$"
In other words:
Start by consuming any characters that are not #. If you see a #, go ahead and consume it, then resume matching whatever non-# characters remain. If that doesn't leave you at the end of the string, it can only mean there was more than one #. Abandon the attempt immediately and report a failed match.
Of course, this still leaves you with the problem of which other characters you want to allow. I'm pretty sure [^#]* is not what you want.
Also, "[\W]" may be working as you intended, but it's only by accident. You could have written it "/\W/" or "~\W~" or "(\W)" and it would work just the same. You may have meant those square brackets to form a character class, but they're not even part of the regex; they're the regex delimiters.
So why did it work, you ask? \W is a predefined character class, equivalent to [^\w]. You can use it inside a regular character class, but it works fine on its own.
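As a rough illustration of the "zero or one" variant, here is the pattern wrapped in explicit ~ delimiters. The helper name has_at_most_one_hash() is an assumption made for this sketch.

```php
<?php
// Hypothetical helper name; the regex is the "zero or one #" pattern from above,
// written with ~ delimiters so that the # inside it is not taken as a delimiter.
function has_at_most_one_hash(string $s): bool
{
    return preg_match('~^[^#]*(?:#[^#]*)?$~', $s) === 1;
}

foreach (['Password', 'P#ssword', 'P##ssword'] as $candidate) {
    printf("%s: %s\n", $candidate, has_at_most_one_hash($candidate) ? 'accepted' : 'rejected');
}
```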

Regular expression to disallow double underscores

I am writing my website user registration part, I have a simple regular expression as follows:
if(preg_match("/^[a-z0-9_]{3,15}$/", $username)){
// OK...
}else{
echo "error";
exit();
}
I don't want to let users have usernames like '___' or 'x________y'. This is the function I just wrote to replace double underscores:
function replace_repeated_underScores($string){
$final_str = '';
$str_len = strlen($string);
$prev_char = '';
for($i = 0; $i < $str_len; $i++){
if($i > 0){ // from the second character on, remember the previous one
$prev_char = $string[$i - 1];
}
$this_char = $string[$i];
if($prev_char == '_' && $this_char == '_'){
}else{
$final_str .= $this_char;
}
}
return $final_str;
}
And it works just fine, but I wonder if I could also check this with regular expression and not another function.
I would appreciate any help.
Just add negative look-ahead to check whether there is double underscore in the name or not.
/^(?!.*__)[a-z0-9_]{3,15}$/
(?!pattern), called zero-width negative look-ahead, will check that it is not possible to find the pattern, ahead in the string from the "current position" (current position is the position that the regex engine is at). It is zero-width, since it doesn't consume text in the process, as opposed to the part outside. It is negative, since the match would only continue if there is no way to match the pattern (all possibilities are exhausted).
The pattern is .*__, so it simply means that the match will only continue if it cannot find a match for .*__, i.e no double underscore __ ahead in the string. Since the group does not consume text, you will still be at the start of the string when it starts to match the later part of the pattern [a-z0-9_]{3,15}$.
You already allow uppercase usernames by using strtolower; nevertheless, it is still possible to validate with the regex directly by adding the case-insensitive flag i:
/^(?!.*__)[a-z0-9_]{3,15}$/i
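A small sketch of the look-ahead pattern applied to the usernames from the question. The function name is_valid_username() is hypothetical.

```php
<?php
// Hypothetical helper name; the pattern is the case-insensitive one shown above.
function is_valid_username(string $username): bool
{
    // (?!.*__) rejects any name containing two consecutive underscores;
    // the rest enforces 3 to 15 characters from a-z, 0-9 and _.
    return preg_match('/^(?!.*__)[a-z0-9_]{3,15}$/i', $username) === 1;
}

foreach (['john_doe', 'John_Doe', '___', 'x________y', 'ab'] as $name) {
    printf("%s: %s\n", $name, is_valid_username($name) ? 'OK' : 'error');
}
```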

PHP reg_ex finding multiple words in a string

Given the nature of the situation, I need a single pattern to do the following:
Create a pattern that finds:
exact match for single words
an exact match for a combination of 2 words.
a match of 2 words that could be found in a string.
My issue is with #3. Currently I have:
$pattern = '/\s*(foo|bar|blah|some+text|more+texted)\s*/';
How can I extend this pattern so that it finds "bad text" in any combination in a string?
Any ideas?
To check string for word bad use regex
/\bbad\b/
To check string for phrase bad text use regex
/\bbad text\b/
To check string for any combination of words bad and text use regex
/\b(bad|text)\s+(?!\1)(?:bad|text)\b/
To check string for presence of words bad and text use regex
/(?=.*\bbad\b)(?=.*\btext\b)/
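The last two patterns can be compared side by side in a short sketch. The function names contains_both_words() and has_adjacent_pair() are made up for this illustration.

```php
<?php
// Hypothetical helper: true when both words occur anywhere in the string, in any order.
function contains_both_words(string $s): bool
{
    return preg_match('/(?=.*\bbad\b)(?=.*\btext\b)/', $s) === 1;
}

// Hypothetical helper: true only when the two words are adjacent
// ("bad text" or "text bad"), separated by whitespace.
function has_adjacent_pair(string $s): bool
{
    return preg_match('/\b(bad|text)\s+(?!\1)(?:bad|text)\b/', $s) === 1;
}

foreach (['bad text', 'text bad', 'this text is bad', 'bad bad', 'bad apple'] as $sample) {
    printf(
        "%-18s both: %-3s adjacent: %s\n",
        $sample,
        contains_both_words($sample) ? 'yes' : 'no',
        has_adjacent_pair($sample) ? 'yes' : 'no'
    );
}
```

The (?!\1) look-ahead is what stops a repeated word such as "bad bad" from counting as a combination.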
Couple ways to do this, but here's an easy one
$array_needles = array("needle1", "needle2"); // add further needles as needed
$array_found_needles = array();
$haystack = "haystack";
foreach ($array_needles as $key=>$val) {
if(stristr($haystack, $val) !== false) {
//do whatever you want if its found
$array_found_needles[] = $val; //save the value found
}
}
$found = count($array_found_needles);
if ($found == 0) {
//do something with no needles found
} else if($found == 1) {
//do something with 1 needle found
} else if($found == 2) {
//do something with two needles found, etc
}

Regex to match Youtube URL's

I am trying to validate a Youtube URL using regex:
preg_match('~http://youtube.com/watch\?v=[a-zA-Z0-9-]+~', $videoLink)
It kind of works, but it can match URLs that are malformed. For example, this will match OK:
http://www.youtube.com/watch?v=Zu4WXiPRek
But so will this:
http://www.youtube.com/watch?v=Zu4WX£&P!ek
And this won't:
http://www.youtube.com/watch?v=!Zu4WX£&P4ek
I think it's because of the + operator. It seems to match only the first character after v=, when it needs to match everything after v= against [a-zA-Z0-9-]. Any help is appreciated, thanks.
To provide an alternative that is larger and much less elegant than a regex, but works with PHP's native URL parsing functions so it might be a bit more reliable in the long run:
$url = "http://www.youtube.com/watch?v=Zu4WXiPRek";
$query_string = parse_url($url, PHP_URL_QUERY); // v=Zu4WXiPRek
$query_string_parsed = array();
parse_str($query_string, $query_string_parsed); // an array with all GET params
echo($query_string_parsed["v"]); // Will output Zu4WXiPRek that you can then
// validate for [a-zA-Z0-9] using a regex
The problem is that you are not requiring any particular number of characters in the v= part of the URL. So, for instance, checking
http://www.youtube.com/watch?v=Zu4WX£&P!ek
will match
http://www.youtube.com/watch?v=Zu4WX
and therefore return true. You need to either specify the number of characters you need in the v= part:
preg_match('~http://youtube.com/watch\?v=[a-zA-Z0-9-]{10}~', $videoLink)
or specify that the group [a-zA-Z0-9-] must be the last part of the string:
preg_match('~http://youtube.com/watch\?v=[a-zA-Z0-9-]+$~', $videoLink)
Your other example
http://www.youtube.com/watch?v=!Zu4WX£&P4ek
does not match, because the + sign requires that at least one character must match [a-zA-Z0-9-].
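A sketch of the anchored approach against the three URLs from the question. The function name looks_like_watch_url() is hypothetical, and the optional "www.", the optional "s" in the scheme, the escaped dots and the underscore in the character class are assumptions added here, not part of the original pattern.

```php
<?php
// Hypothetical helper: true only when the whole string is a watch URL whose
// v= value consists solely of the allowed characters.
function looks_like_watch_url(string $videoLink): bool
{
    // ^ ... $ with the D modifier forces the character class to run to the
    // very end of the string, so a stray character anywhere fails the match.
    $pattern = '~^https?://(?:www\.)?youtube\.com/watch\?v=[a-zA-Z0-9_-]+$~D';
    return preg_match($pattern, $videoLink) === 1;
}

$urls = [
    'http://www.youtube.com/watch?v=Zu4WXiPRek',
    'http://www.youtube.com/watch?v=Zu4WX£&P!ek',
    'http://www.youtube.com/watch?v=!Zu4WX£&P4ek',
];
foreach ($urls as $url) {
    printf("%s => %s\n", $url, looks_like_watch_url($url) ? 'valid' : 'invalid');
}
```

Because of the end anchor, a URL with further query parameters after v= is also rejected; splitting the URL with parse_url first avoids that restriction.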
Short answer:
preg_match('%(http://www.youtube.com/watch\?v=(?:[a-zA-Z0-9-])+)(?:[&"\'\s])%', $videoLink)
There are a few assumptions made here, so let me explain:
I added a capturing group ( ... ) around the entire http://www.youtube.com/watch?v=blah part of the link, so that we can say "I want to get the whole validated link up to and including the ?v=movieHash".
I added the non-capturing group (?: ... ) around your character set [a-zA-Z0-9-] and left the + sign outside of that. This will allow us to match all allowable characters up to a certain point.
Most importantly, you need to tell it how you expect your link to terminate. I'm taking a guess for you with (?:[&"\'\s])
- Will it be in HTML format (e.g. an anchor tag)? If so, the link in href will obviously end with a " or '.
- Or maybe there's more to the query string, so there would be an & after the value of v.
- Maybe there's a space or line break after the end of the link (\s).
The important piece is that you can get much more accurate results if you know what's surrounding what you are searching for, as is the case with many regular expressions.
This non-capturing group (in which I'm making assumptions for you) will take a stab at finding and ignoring all the extra junk after what you care about (the ?v=awesomeMovieHash).
Results:
http://www.youtube.com/watch?v=Zu4WXiPRek
- Group 1 contains the http://www.youtube.com/watch?v=Zu4WXiPRek
http://www.youtube.com/watch?v=Zu4WX&a=b
- Group 1 contains http://www.youtube.com/watch?v=Zu4WX
http://www.youtube.com/watch?v=!Zu4WX£&P4ek
- No match
a href="http://www.youtube.com/watch?v=Zu4WX&size=large"
- Group 1 contains http://www.youtube.com/watch?v=Zu4WX
http://www.youtube.com/watch?v=Zu4WX£&P!ek
- No match
The "v=..." blob is not guaranteed to be the first parameter in the query part of the URL. I'd recommend using PHP's parse_url() function to break the URL into its component parts. You can also reassemble a pristine URL if someone began the string with "https://" or simply used "youtube.com" instead of "www.youtube.com", etc.
function get_youtube_vidid ($url) {
$vidid = false;
$valid_schemes = array ('http', 'https');
$valid_hosts = array ('www.youtube.com', 'youtube.com');
$valid_paths = array ('/watch');
$bits = parse_url ($url);
if (! is_array ($bits)) {
return false;
}
if (! (array_key_exists ('scheme', $bits)
and array_key_exists ('host', $bits)
and array_key_exists ('path', $bits)
and array_key_exists ('query', $bits))) {
return false;
}
if (! in_array ($bits['scheme'], $valid_schemes)) {
return false;
}
if (! in_array ($bits['host'], $valid_hosts)) {
return false;
}
if (! in_array ($bits['path'], $valid_paths)) {
return false;
}
$querypairs = explode ('&', $bits['query']);
if (count ($querypairs) < 1) {
return false;
}
foreach ($querypairs as $querypair) {
list ($key, $value) = array_pad (explode ('=', $querypair, 2), 2, ''); // tolerate pairs without '='
if ($key == 'v') {
if (preg_match ('/^[a-zA-Z0-9\-_]+$/', $value)) {
# Set the return value
$vidid = $value;
}
}
}
return $vidid;
}
The following regex should match most YouTube links:
$pattern='#(((http(s)?://(www\.)?)|(www\.)|\s)(youtu\.be|youtube\.com)/(embed/|v/|watch(\?v=|\?.+&v=|/))?([a-zA-Z0-9._\/~\#&=;%+?!-]+))#si';

find occurrence of a set of words

I have a pattern with a small list of words that are illegal to use as nicknames set in a pattern variable like this:
$pattern = 'webmaster|admin|webadmin|sysadmin';
Using preg_match, how can I achieve so that nicknames with these words are forbidden, but registering something like "admin2" or "thesysadmin" is allowed?
This is the expression I have so far:
preg_match('/^['.$pattern.']/i','admin');
// Should not be allowed
Note: Using a \b didn't help much.
What about not using regex at all ?
And working with explode and in_array ?
For instance, this would do :
$pattern = 'webmaster|admin|webadmin|sysadmin';
$forbidden_words = explode('|', $pattern);
It explodes your pattern into an array, using | as separator.
And this :
$word = 'admin';
if (in_array($word, $forbidden_words)) {
echo "<p>$word is not OK</p>";
} else {
echo "<p>$word is OK</p>";
}
will get you
admin is not OK
Whereas this (same code ; only the word changes) :
$word = 'admin2';
if (in_array($word, $forbidden_words)) {
echo "<p>$word is not OK</p>";
} else {
echo "<p>$word is OK</p>";
}
will get you
admin2 is OK
This way, no need to worry about finding the right regex, to match full-words : it'll just match exact words ;-)
Edit : one problem might be that the comparison will be case-sensitive :-(
Working with everything in lowercase will help with that :
$pattern = strtolower('webmaster|admin|webadmin|sysadmin'); // just to be sure ;-)
$forbidden_words = explode('|', $pattern);
$word = 'aDMin';
if (in_array(strtolower($word), $forbidden_words)) {
echo "<p>$word is not OK</p>";
} else {
echo "<p>$word is OK</p>";
}
Will get you :
aDMin is not OK
(I saw the 'i' flag in the regex only after posting my answer ; so, had to edit it)
Edit 2 : and, if you really want to do it with a regex, you need to know that :
^ marks the beginning of the string
and $ marks the end of the string
So, something like this should do :
$pattern = 'webmaster|admin|webadmin|sysadmin';
$word = 'admin';
if (preg_match('#^(' . $pattern . ')$#i', $word)) {
echo "<p>$word is not OK</p>";
} else {
echo "<p>$word is OK</p>";
}
$word = 'admin2';
if (preg_match('#^(' . $pattern . ')$#i', $word)) {
echo "<p>$word is not OK</p>";
} else {
echo "<p>$word is OK</p>";
}
Parentheses are probably not necessary, but I like using them, to isolate what I wanted.
And, you'll get the same kind of output :
admin is not OK
admin2 is OK
You probably don't want to use [ and ] : they mean "any character that is between us", and not "the whole string that is between us".
And, as the reference : manual of the preg syntax ;-)
So, the forbidden words can be part of their username but not the whole thing?
In .NET, the pattern would be:
Allowed = Not RegEx.Match("admin", "^(webmaster|admin|webadmin|sysadmin)$")
The "^" matches the beginning of the string, the "$" matches the end, so it's looking for an exact match on one of those words. I'm a bit fuzzy on the corresponding PHP syntax.
