I have a PHP script that includes different pages for special referers:
$ref_found = false;
// get referer if exists
$referer = false;
if ( isset($_SERVER['HTTP_REFERER']) ) {
$referer = $_SERVER['HTTP_REFERER'];
// get content of list.txt
$list = explode(chr(10), file_get_contents('list.txt'));
foreach ( $list as $l ) {
if ( strlen($l) > 0 ) {
if ( strpos( $referer, $l ) ) {
$ref_found = true;
}
}
}
}
// include the correct file
if ( $ref_found ) {
require_once('special_page.html');
} else {
require_once('regular_page.html');
}
Referer DB is in simple txt file (list.txt) and it looks like this:
domain1.com
domain2.com
domain3.com
Unfortunately, this script works only for the last domain in the list (domain3.com).
What should I add? \n ?
Or would it be a better idea to store the domain list in a different way?
The problem is that when you explode() your list of domain names, you end up with whitespace around each item. At the very least, you will have a carriage return (\r) left over, since the line breaks in your file are probably \r\n and you are splitting on \n only.
So you're checking against something like " domain1.com", or more likely "domain1.com\r". Since this extra whitespace doesn't exist in the referrer header, it doesn't match when you expect it to. The last line of the file has no line break after it, which is why only domain3.com works.
By calling trim() on each value you find, you'll get a clean domain name that you can use to do a more useful comparison:
$list = explode("\n", file_get_contents('list.txt'));
foreach ($list as $l) {
$l = trim($l);
if ((strlen($l) > 0) && (strpos($referer, $l) !== false)) {
$ref_found = true;
break;
}
}
I made a couple other minor updates to your code as well:
I switched away from using chr() and just used a string literal ("\n"). As long as you use double-quotes, it'll be a literal newline character, instead of an actual \ and n, and the string literal is much easier to understand for somebody reading your code.
I kept splitting on "\n" (chr(10), line feed) rather than "\r" (chr(13), carriage return). There are several newline formats, but the most common are "\n" and "\r\n". By exploding on "\n" and trimming each item, your code will work with both formats, whereas "\r" would only work with the second.
I combined your two if statements. This is a very minor update that doesn't have much effect except to (in my opinion) make the code easier to read.
I updated your strpos() to do a literal comparison to false (!==). It's probably not an issue with this code because the referrer value will start with http://, but it's a good habit to get into. If the substring happens to occur at the beginning of the parent string, strpos() will return 0, which will be interpreted as false in your original code.
I added a break statement in your loop if you found a matching domain name. Once you find one and set the flag, there's no reason to continue checking the rest of the domains in the list, and break allows you to cancel the rest of the foreach loop.
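If you would rather not split the file by hand, a minimal sketch along these lines may help. It assumes list.txt holds one domain per line. The helper name referer_matches() is made up for illustration, and matching on the parsed host instead of the whole referer string is a stricter variant, not what the original script did.

```php
<?php
// Hypothetical helper: true if the referer's host equals, or is a subdomain of, a listed domain.
function referer_matches(string $referer, array $domains): bool
{
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // missing or unparsable referer
    }
    $host = strtolower($host);
    foreach ($domains as $domain) {
        $domain = strtolower(trim($domain)); // drops a stray "\r" and spaces
        if ($domain === '') {
            continue;
        }
        // exact match, or a subdomain such as www.domain1.com
        if ($host === $domain || substr($host, -strlen($domain) - 1) === '.' . $domain) {
            return true;
        }
    }
    return false;
}

// file() with these flags drops the trailing "\n" and skips empty lines;
// a trailing "\r" from Windows line endings is still handled by trim() above.
$domains = is_readable('list.txt')
    ? (file('list.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) ?: array())
    : array();
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$ref_found = referer_matches($referer, $domains);
```

Comparing hosts avoids a false positive when a listed domain merely appears somewhere in the path or query string of the referer.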
chr(10) == "\n"
chr(13) == "\r"
"\n" is most likely what you want.
When we check:
dir1/dir2/../file.txt ==== this is same as =====> dir1/file.txt
I would like to know whether something similar exists in PHP, for example:
$name= "Hello ". $variable . "World";
If I had $variable = "../Hi" (or anything like that), would it remove the previous part (like a backspace) and print Hi World?
(P.S. I don't control the PHP file; I am asking how attackers could achieve that.)
(P.S. 2: I have no words for the downvoters who closed this. I think you have trouble analysing questions before you close them.)
In PHP there is no special ../ (or any other string) that, when concatenated to another string, produces anything other than the original string followed by the new string. Concatenation, regardless of the content of the strings, always results in:
"<String1><String2>" = "<String1>"."<String2>";
Nothing will 'erase' prior tokens in a string or anything like that; concatenation by itself is completely harmless.
Caveat: this changes if the string is used somewhere that interprets it in a specific way, where any character or group of characters in the ../ is treated specially, such as:
In a string used for regex pattern
In a string used as a file path (in that case, when it's evaluated it will do exactly what you'd expect if you'd typed it)
A string used in a SQL query without properly escaping (as with binding params/values via prepared statements)
etc...
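To make the file-path caveat concrete, here is a minimal sketch. The function name resolve_dots() and the sample paths are invented for illustration. It only mimics, for relative paths, how a filesystem collapses .. segments, and is not a substitute for realpath() or proper input validation.

```php
<?php
// Concatenation itself never removes anything:
$variable = "../Hi";
$name = "Hello " . $variable . "World";   // "Hello ../HiWorld"

// Hypothetical helper: collapse "." and ".." segments of a relative path
// the way a filesystem would when resolving it.
function resolve_dots(string $path): string
{
    $out = array();
    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;               // skip empty and current-directory segments
        }
        if ($segment === '..') {
            array_pop($out);        // go up one level (no-op when already at the top)
        } else {
            $out[] = $segment;
        }
    }
    return implode('/', $out);
}

// Only when the string is interpreted as a path does "../" change the outcome:
$resolved = resolve_dots('uploads/' . '../config.php');   // "config.php"
```

This is the situation an attacker looks for: user input concatenated into something that is later handed to a file function.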
Now, suppose you want to remove the word preceding each word that starts with ../ in a sentence, replicating how .. in a path means "go up one level" (in effect undoing the previous directory step in the path).
Here's a basic algorithm to start you out (if you are able to change the source code) :
Use explode with delimiter " " on the string.
Create a new array
Iterate over the returned array; if an entry does not start with ../, append it to the new array
If an entry starts with ../, remove the last element of the new array
Then append that entry, with the ../ replaced by the empty string "", to the end of the new array
Once at end of array (all strings processed), implode() with delimiter " "
Here's an example:
<?php
$variable = "../Hi";
$string = "Hello ". $variable . " World"; // Note: I added a space prior to the W
$arr = array();
foreach(explode(" ", $string) as $word) {
if (substr( $word, 0, 3 ) === "../") {
if(!empty($arr)){
array_pop($arr);
}
$arr[] = str_replace("../", "", $word);
} else {
$arr[] = $word;
}
}
echo implode(" ", $arr);
I am getting an "Array to string conversion" error in PHP.
I am using the "variable" (which should be a string) as the third parameter to str_replace. So, in summary (a very simplified version of what's going on):
$str = "very long string";
str_replace("tag", $some_other_array, $str);
$str is throwing the error, and I have been trying to fix it all day. The things I have tried are:
if(is_array($str)) die("its somehow an array");
serialize($str); //inserted this before str_replace call.
I have spent all day on it, and no, it's not something stupid like variables being the wrong way around; it is something bizarre. I have even dumped it to a file, and it's a string.
My hypothesis:
The string is too long and php can't deal with it, turns into an array.
The $str value in this case is nested and called recursively, the general flow could be explained like this:
--code
//pass by reference
function the_function ($something, &$OFFENDING_VAR, $something_else) {
while(preg_match($something, $OFFENDING_VAR)) {
$OFFENDING_VAR = str_replace($x, $y, $OFFENDING_VAR); // this is the error
}
}
So it may be something strange due to str_replace, but that would mean that at some point str_replace would have to return an array.
Please help me work this out, its very confusing and I have wasted a day on it.
---- ORIGINAL FUNCTION CODE -----
//This function gets called with multiple different "Target Variables" Target is the subject
//line, from and body of the email filled with << tags >> so the str_replace function knows
//where to replace them
function perform_replacements($replacements, &$target, $clean = TRUE,
$start_tag = '<<', $end_tag = '>>', $max_substitutions = 5) {
# Construct separate tag and replacement value arrays for use in the substitution loop.
$tags = array();
$replacement_values = array();
foreach ($replacements as $tag_text => $replacement_value) {
$tags[] = $start_tag . $tag_text . $end_tag;
$replacement_values[] = $replacement_value;
}
# TODO: this badly needs refactoring
# TODO: auto upgrade <<foo>> to <<foo_html>> if foo_html exists and acting on html template
# Construct a regular expression for use in scanning for tags.
$tag_match = '/' . preg_quote($start_tag) . '\w+' . preg_quote($end_tag) . '/';
# Perform the substitution until all valid tags are replaced, or the maximum substitutions
# limit is reached.
$substitution_count = 0;
while (preg_match ($tag_match, $target) && ($substitution_count++ < $max_substitutions)) {
$target = serialize($target);
$temp = str_replace($tags,
$replacement_values,
$target); //This is the line that is failing.
unset($target);
$target = $temp;
}
if ($clean) {
# Clean up any unused search values.
$target = preg_replace($tag_match, '', $target);
}
}
How do you know $str is the problem and not $some_other_array?
From the manual:
If search and replace are arrays, then str_replace() takes a value
from each array and uses them to search and replace on subject. If
replace has fewer values than search, then an empty string is used for
the rest of replacement values. If search is an array and replace is a
string, then this replacement string is used for every value of
search. The converse would not make sense, though.
The second parameter can only be an array if the first one is as well.
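One way to test that suspicion is to look for array entries inside the replacement array before calling str_replace(). This is only a sketch: the helper name find_bad_replacements() and the sample data are made up, and it assumes the replacements are passed as a tag => value map, as in the question's function.

```php
<?php
// Hypothetical helper: return the tags whose replacement value is itself an array.
function find_bad_replacements(array $replacements): array
{
    $bad = array();
    foreach ($replacements as $tag => $value) {
        if (is_array($value)) {
            $bad[] = $tag; // str_replace() would have to convert this array to a string
        }
    }
    return $bad;
}

$replacements = array(
    'name'  => 'Alice',
    'items' => array('a', 'b'),   // nested array: a likely culprit
);
$bad = find_bad_replacements($replacements);   // array('items')
```

If the returned list is non-empty, the subject string was never the problem; one of the replacement values was.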
I was working on a dynamic way to update a config.php file and I ran into an interesting glitch that I can't quite solve. Below is my code for updating the config.php file:
if( isset( $_POST['submitted'] ) ) {
$config_keys = array();
foreach( $_POST as $key => $value ) {
if( substr( $key, 0, 7 ) == 'config-' ) {
$config_keys[ substr( $key, 7 ) ] = $value;
}
}
$config_content = file_get_contents( dirname(__FILE__) . '/../../inc/config.php' );
foreach( $config_keys as $key => $value ) {
$config_content = preg_replace(
"/config\['$key'\](\s*)=(\s*)(['\"]?).*?(['\"]?);/",
"config['$key']$1=$2$3$value$4;",
$config_content
);
$config[$key] = $value;
}
file_put_contents( dirname(__FILE__) . '/../../inc/config.php', $config_content );
}
The logic is fairly sound. It searches for any POST variables prefixed by "config-" then uses everything after "config-" as the name of the key in our config file to update. The config file takes the form:
$config['var1'] = 'value1';
$config['var2'] = 123;
$config['var3'] = '...';
In 90% of cases this works perfectly, however if $value begins with a numeral then $3 and the first numeral of $value are completely ignored during the replacement.
For instance, I have the following value in my config file:
$config['ls_key'] = '136609a7b4....'; // Rest of key has been truncated
If I don't change this value and leave the key untouched but submit my form then this line suddenly looks like so:
$config['ls_key'] = 36609a7b4...'; // Rest of key has been truncated
The lack of single quote prevents the config file from parsing (breaking the entire site) and we've lost data to boot! After reading the PHP preg_replace manual I have tried using braces in several locations (modifying "Example #1 Using backreferences followed by numeric literals"). None of the following worked:
"config['$key']$1=$2${3}$value$4;",
"config['$key']$1=$2$3${value}$4;",
"config['$key']$1=$2$3{$value}$4;",
"config['$key']$1=$2{$3}$value$4;", // This one actually leads to syntax errors
"config['$key']${1}=${2}${3}$value${4};",
The first 3 lead to the exact same problem, having no effect on the replacement. The fourth doesn't work at all (syntax errors), and the fifth actually causes EVERY backreference to be ignored. I've also tried using single quotes and concatenation like so:
'config[\'$key\']$1=$2$3' . $value . '$4;',
Again, I had the same problem as the 3 prior examples and my original script.
Hoping someone has solved this before or at least has a new idea.
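The double-quoted interpolation is getting in the way. Once $value is appended, a value starting with a digit turns $3 into a two-digit reference such as $31, which matches nothing; and inside double quotes PHP parses ${3} itself before preg_replace() ever sees it. Building the replacement with single quotes and ${3} works: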
'config[\''.$key.'\']$1=$2${3}'.$value.'$4;'
Also note that you should properly escape the following (meta characters):
$key in the regex with preg_quote
$key and $value in the replacement, there is no built in function to do this (preg_quote escapes too much)
And also escape the regex delimiter and quote delimiter used if present.
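Since there is no built-in function for escaping the replacement string, one alternative (a different technique from the one above, not a correction of it) is preg_replace_callback(), which never parses the replacement for backreferences. A minimal sketch, with a made-up helper name update_config_value() and sample config text:

```php
<?php
// Hypothetical helper: replace the value assigned to $config[$key] in the file text.
function update_config_value(string $content, string $key, string $value): string
{
    // preg_quote() keeps regex metacharacters in $key from changing the pattern.
    $pattern = '/config\[\'' . preg_quote($key, '/') . '\'\](\s*)=(\s*)([\'"]?).*?([\'"]?);/';

    // The callback receives the captured groups as an array, so nothing in
    // $value is ever parsed as a backreference such as $31 or \1.
    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($key, $value) {
            return "config['" . $key . "']" . $m[1] . '=' . $m[2] . $m[3] . $value . $m[4] . ';';
        },
        $content
    );

    return $result ?? $content; // fall back to the input if the regex engine fails
}

$before = "\$config['ls_key'] = '136609a7b4';\n\$config['var2'] = 123;\n";
$after  = update_config_value($before, 'ls_key', '136609a7b4');
// $after is identical to $before: the opening quote and the first digit survive.
```

Note that this sketch does not escape quote characters inside $value, so a value containing ' would still produce a broken config file.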
Try \g{1} for the group 1 (and accordingly for the other groups)
See the php manual on backreferences
Update:
Of course Qtax is right, this is the syntax for backreferences within the regular expression. (+1 for Qtax)
So I'm writing a PHP script that will read in a CSS file, then put the comments and actual CSS in separate arrays. The script will then build a page with the CSS and comments all nicely formatted.
The basic logic for the script is this:
Read in a new line
If it starts with a forward slash or
ends with an opening bracket, set a
bool for CSS or comments to true
Add that line to the appropriate
element in the appropriate array
If the last character is a backslash
(end of a comment) or the first
character is a closing bracket (end
of a CSS tag), set necessary bool to
false
Rinse, repeat
If someone sees an error in that, feel free to point it out, but I think it should do what I want.
The tricky part is the last if statement, checking if the last character is a backslash. Right now I have:
if ($line{(strlen($line) - 3)} == "\\") {do stuff}
where $line is the last line read in from the file. Not entirely sure why I have to go back 3 characters, but I'm guessing it's because there's a newline at the end of each string when reading it in from a file. However, this if statement is never true, even though there are definitely lines which end with slashes. This
echo "<br />str - 3: " . $line{(strlen($line)-3)};
even returns a backslash, yet the if statement is never triggered.
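Note that "\\" inside a PHP string literal is an escaped, single backslash, so the comparison is against one character, not two. If these lines end a CSS comment, the final character is a forward slash (/), not a backslash, and the fixed offset of 3 depends on the line ending. Try using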
substr($line, -2)
instead. (You might have to change it to -3. The reason for this is because the newline character might be included at the end of the string.)
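To avoid guessing between -2 and -3, the line can be trimmed before looking at its ends. A minimal sketch, assuming comments are delimited by /* and */ as in standard CSS; the two helper names are made up:

```php
<?php
// Hypothetical helper: does the line end with the CSS comment terminator?
// rtrim() removes trailing "\r", "\n", spaces and tabs first.
function ends_css_comment(string $line): bool
{
    return substr(rtrim($line), -2) === '*/';
}

// Hypothetical helper: does the line begin with the CSS comment opener?
function starts_css_comment(string $line): bool
{
    return substr(ltrim($line), 0, 2) === '/*';
}

$line = "/* test rule */\r\n";
$opens  = starts_css_comment($line);   // true
$closes = ends_css_comment($line);     // true
```

Because the newline is stripped first, the same check works whether the file uses "\n" or "\r\n" line endings.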
@mcritelli: CSS comments look like /* comment */ though, so just searching for a backslash won't tell you if it's starting or ending the comment. Here's a very basic script I tested which loops through a 'line' and can do something at the beginning and end of a comment:
<?php
$line = "/* test rule */";
$line .= ".test1 { ";
$line .= " text-decoration: none; ";
$line .= "}/* end of test rule */";
$len = strlen($line);
for ($i = 0; $i < $len - 1; $i++) // stop one short so $i + 1 stays in range
{
$pair = $line[$i] . $line[$i + 1];
if ($pair == "/*")
{
// start of a comment, do something
}
elseif ($pair == "*/")
{
// end of a comment, do something
}
}
?>
I am trying to validate a Youtube URL using regex:
preg_match('~http://www\.youtube\.com/watch\?v=[a-zA-Z0-9-]+~', $videoLink)
It kind of works, but it can match URLs that are malformed. For example, this will match ok:
http://www.youtube.com/watch?v=Zu4WXiPRek
But so will this:
http://www.youtube.com/watch?v=Zu4WX£&P!ek
And this won't:
http://www.youtube.com/watch?v=!Zu4WX£&P4ek
I think it's because of the + operator. It's matching what seems to be the first character after v=, when it needs to try and match everything behind v= with [a-zA-Z0-9-]. Any help is appreciated, thanks.
To provide an alternative that is larger and much less elegant than a regex, but works with PHP's native URL parsing functions so it might be a bit more reliable in the long run:
$url = "http://www.youtube.com/watch?v=Zu4WXiPRek";
$query_string = parse_url($url, PHP_URL_QUERY); // v=Zu4WXiPRek
$query_string_parsed = array();
parse_str($query_string, $query_string_parsed); // an array with all GET params
echo($query_string_parsed["v"]); // Will output Zu4WXiPRek that you can then
// validate for [a-zA-Z0-9] using a regex
The problem is that you are not requiring any particular number of characters in the v= part of the URL. So, for instance, checking
http://www.youtube.com/watch?v=Zu4WX£&P!ek
will match
http://www.youtube.com/watch?v=Zu4WX
and therefore return true. You need to either specify the number of characters you need in the v= part:
preg_match('~http://www\.youtube\.com/watch\?v=[a-zA-Z0-9-]{10}~', $videoLink)
or specify that the group [a-zA-Z0-9-] must be the last part of the string:
preg_match('~http://www\.youtube\.com/watch\?v=[a-zA-Z0-9-]+$~', $videoLink)
Your other example
http://www.youtube.com/watch?v=!Zu4WX£&P4ek
does not match, because the + sign requires that at least one character must match [a-zA-Z0-9-].
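The difference between the unanchored and anchored forms can be checked directly. This is a small sketch using the three URLs from the question; the patterns include www. and a leading ^ so that they line up with those sample URLs, which is an assumption on my part.

```php
<?php
$unanchored = '~http://www\.youtube\.com/watch\?v=[a-zA-Z0-9-]+~';
$anchored   = '~^http://www\.youtube\.com/watch\?v=[a-zA-Z0-9-]+$~';

$urls = array(
    'http://www.youtube.com/watch?v=Zu4WXiPRek',
    'http://www.youtube.com/watch?v=Zu4WX£&P!ek',
    'http://www.youtube.com/watch?v=!Zu4WX£&P4ek',
);

// preg_match() returns 1 on a match and 0 on no match.
$results = array();
foreach ($urls as $url) {
    $results[] = array(
        'unanchored' => preg_match($unanchored, $url),
        'anchored'   => preg_match($anchored, $url),
    );
}
// First URL: both match. Second URL: only the unanchored pattern matches,
// because it stops at "Zu4WX". Third URL: neither matches.
```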
Short answer:
preg_match('%(http://www\.youtube\.com/watch\?v=(?:[a-zA-Z0-9-])+)(?:[&"\'\s]|$)%', $videoLink)
There are a few assumptions made here, so let me explain:
I added a capturing group ( ... ) around the entire http://www.youtube.com/watch?v=blah part of the link, so that we can say "I want to get the whole validated link up to and including the ?v=movieHash"
I added the non-capturing group (?: ... ) around your character set [a-zA-Z0-9-] and left the + sign outside of that. This will allow us to match all allowable characters up to a certain point.
Most importantly, you need to tell it how you expect your link to terminate. I'm taking a guess for you with (?:[&"\'\s]|$), where the |$ alternative covers a link that simply ends the string.
?) Will it be in html format (e.g. anchor tag) ? If so, the link in href will obviously end with a " or '.
?) Or maybe there's more to the query string, so there would be an & after the value of v.
?) Maybe there's a space or line break after the end of the link \s.
The important piece is that you can get much more accurate results if you know what's surrounding what you are searching for, as is the case with many regular expressions.
This non-capturing group (in which I'm making assumptions for you) will take a stab at finding and ignoring all the extra junk after what you care about (the ?v=awesomeMovieHash).
Results:
http://www.youtube.com/watch?v=Zu4WXiPRek
- Group 1 contains the http://www.youtube.com/watch?v=Zu4WXiPRek
http://www.youtube.com/watch?v=Zu4WX&a=b
- Group 1 contains http://www.youtube.com/watch?v=Zu4WX
http://www.youtube.com/watch?v=!Zu4WX£&P4ek
- No match
a href="http://www.youtube.com/watch?v=Zu4WX&size=large"
- Group 1 contains http://www.youtube.com/watch?v=Zu4WX
http://www.youtube.com/watch?v=Zu4WX£&P!ek
- No match
The "v=..." blob is not guaranteed to be the first parameter in the query part of the URL. I'd recommend using PHP's parse_url() function to break the URL into its component parts. You can also reassemble a pristine URL if someone began the string with "https://" or simply used "youtube.com" instead of "www.youtube.com", etc.
function get_youtube_vidid ($url) {
$vidid = false;
$valid_schemes = array ('http', 'https');
$valid_hosts = array ('www.youtube.com', 'youtube.com');
$valid_paths = array ('/watch');
$bits = parse_url ($url);
if (! is_array ($bits)) {
return false;
}
if (! (array_key_exists ('scheme', $bits)
and array_key_exists ('host', $bits)
and array_key_exists ('path', $bits)
and array_key_exists ('query', $bits))) {
return false;
}
if (! in_array ($bits['scheme'], $valid_schemes)) {
return false;
}
if (! in_array ($bits['host'], $valid_hosts)) {
return false;
}
if (! in_array ($bits['path'], $valid_paths)) {
return false;
}
$querypairs = explode ('&', $bits['query']);
if (count ($querypairs) < 1) {
return false;
}
foreach ($querypairs as $querypair) {
list ($key, $value) = array_pad (explode ('=', $querypair, 2), 2, ''); // tolerate pairs without "="
if ($key == 'v') {
if (preg_match ('/^[a-zA-Z0-9\-_]+$/', $value)) {
# Set the return value
$vidid = $value;
}
}
}
return $vidid;
}
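The following regex will match most common forms of YouTube link (the # inside the character class is escaped because # is also the delimiter, and the hyphen is moved to the end so it is not read as a range):
$pattern='#(((http(s)?://(www\.)?)|(www\.)|\s)(youtu\.be|youtube\.com)/(embed/|v/|watch(\?v=|\?.+&v=|/))?([a-zA-Z0-9._/~\#&=;%+?!-]+))#si';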