Building Reducisaurus URLs - php

I'm trying to use Reducisaurus Web Service to minify CSS and Javascript but I've run into a problem...
Suppose I've two unminified CSS at:
http:/domain.com/dynamic/styles/theme.php?color=red
http:/domain.com/dynamic/styles/typography.php?font=Arial
According to the docs I should call the web service like this:
http:/reducisaurus.appspot.com/css?url=http:/domain.com/dynamic/styles/theme.php?color=red
And if I want to minify both CSS files at once:
http:/reducisaurus.appspot.com/css?url1=http:/domain.com/dynamic/styles/theme.php?color=red&url2=http:/domain.com/dynamic/styles/theme.php?color=red
If I wanted to specify a different number of seconds for the cache (3600 for instance) I would use:
http:/reducisaurus.appspot.com/css?url=http:/domain.com/dynamic/styles/theme.php?color=red&expire_urls=3600
And again for both CSS files at once:
http:/reducisaurus.appspot.com/css?url1=http:/domain.com/dynamic/styles/theme.php?color=red&url2=http:/domain.com/dynamic/styles/theme.php?color=red&expire_urls=3600
Now my question is, how does Reducisaurus knows how to separate the URLs I want? How does it know that &expire_urls=3600 is not part of my URL? And how does it know that &url2=... is not a GET argument of url1? I'm I doing this right? Do I need to urlencode my URLs?
I took a peek into the source code and although my Java is very poor it seems that the methods acquireFromRemoteUrl() and getSortedParameterNames() from the BaseServlet.java file hold the answers to my question - if a GET argument name contains - or _ they should be ignored?!
What about multiple &url(n)s?

Yes, you need to URL encode your URLs before you submit them as a parameter to another webservice.
E.g.
http://google.com
Becomes
http%3A%2F%2Fgoogle.com
If you do that, no special characters like ?, &, = et cetera survive the process that could confuse the webservice.
(Not quite sure what you're asking with your second question, sorry.)

everything which starts with url is threated as a new url, so you cannot pass a parameter called url2 as a get argument of url1.
Every param name that does not contain a '-' will be treated as input.
So if you do
...?file1=...&url1=...&max-age=604800,
the max-age will not be treated as input.
However,
...?file1=...&url1=...&maxage=604800
here the maxage will be treated as input.

Related

Grab all text after variable

I'm a little unsure of how to word this one but essentially, I want to achieve the following:
http://my.website/?url=http://another.website/?var1=data&var2=moredata&id=119
And for the URL variable to be: http://another.website/?var1=data&var2=moredata&id=119
Naturally, PHP sees var2 and id as new variables. This would be used to pass a full URL from one page to another, however, it poses an issue when the page already has its own variables in the URL!
Any help appreciated!
You need to encode the secondary url which you're putting inside the url variable when you create it. This will ensure it doesn't contain special querystring characters that the receiving website will misunderstand. If the code in my.website is PHP too then the urlencode function (http://php.net/manual/en/function.urlencode.php) is your friend. For example:
urlencode("http://another.website/?var1=data&var2=moredata&id=119")
produces
http%3A%2F%2Fanother.website%2F%3Fvar1%3Ddata%26var2%3Dmoredata%26id%3D119
which will not be misunderstood by the PHP code reading it as containing further separate variables.

How to check if a given URL string ends with "/index.*" or simply "/" and remove it, in PHP?

I'm trying to check for /index.[ php | htm | html | asp ], etc. (basically any wildcard suffix)
or simply "/"
to convert a canonical URL such as http://www.example.com/index.php or http://www.example.com/ to just http://www.example.com so I can just use one consistent PHP variable in my canonicalmeta tag throughout all the pages on my site.
I dont want the script to effect URLs that are NOT homepages such as http://www.example.com/page.php
REGEX preferred, but not necessary.
I'm working from the variable $current_Webpage_Complete_URL_Address in "domain_info.php" from this script: http://www.perfecterrordocs.com
I want to use just one PHP variable, formed from modifying $current_Webpage_Complete_URL_Address, if that makes any sense.
Please ask for further clarification, if neccassary.
Edit: Also, I want the newly formed variable to be named
$current_Webpage_Canonical_URL_Address
Edit(2): I just ran into another problem. Even when I do find a match with preg_match how do I remove that particular ending sub-string?
Final Answer (Works Perfectly!!):
$current_Webpage_Canonical_URL_Address = preg_replace('/((\\index)\.[a-z]+)$/', '', $current_Webpage_Complete_URL_Address);
$current_Webpage_Canonical_URL_Address = preg_replace('/\/$/', '', $current_Webpage_Canonical_URL_Address);

Which special characters can cause a file path to be misinterpreted?

For example, there is function (pseudo code):
if ($_GET['path'] ENDS with .mp3 extension) { read($_GET['path']); }
but is it possible, that hacker in a some way, used a special symbol/method, i.e.:
path=file.php^example.mp3
or
path=file.php+example.mp3
or etc...
if something such symbol exists in php, as after that symbol, everything was ignored, and PHP tried to open file.php..
p.s. DONT POST ANSWERS about PROTECTION! I NEED TO KNOW IF THIS CODE can be bypassed, as I AM TO REPORT MANY SCRIPTS for this issue (if this is really an issue).
if something such symbol exists in php, as after that symbol, everything was ignored, and PHP tried to open file.php..
Yes, such a symbol exists; it is called the 'null byte' ("\0").
Because in C (the language used to write the PHP engine) the end of a 'string' is signalled by the null byte. So, whenever a null byte is encountered, the string will end.
If you want the string to end with .mp3 you should manually append it.
Having said that, it is, generally speaking, a very bad idea to accept a user supplied path from a security standpoint (and I believe you are interested in the security aspect of this, because you originally posted this question on security.SE).
Consider the situation where:
$_GET['path'] = "../../../../../etc/passwd\0";
or a variation on this theme.
The leading concept in programming is "Don't trust user input". So the main problem in your case is not a special character its how you work with your data. So you shouldn't use a path given by a user because the user can manipulate the path or other variables.
To escape a user input to prevent bad characters you can use htmlspecialchars or you can filter your get input with filter_input something like that:
$search_html = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_SPECIAL_CHARS);
WE CAN'T TELL IF YOU IF THE CODE CAN BE "BYPASSED" BECAUSE YOU'VE NOT GIVEN US ANY PHP CODE
As to the question of whether its possible to trick PHP into processing a file it shouldn't based on the end of the string, then the answer is only if there is another file somewhere else which has the same ending. However, by default, PHP will happily read from URLs using the same functionality as reading from local files, consider:
http://yourserver.com/yourscript.php?path=http%3A%2F%2Fevilserver.com%2Fpwnd_php.txt%3Ffake_end%3Dmp3

cutting special chars in folder name when using GET

I've been visiting stackoverflow.com for a long time and always found the solution to my problem. But this time it's different. That's why I'm posting my first question here.
The situation looks like this: My website provides a directory explorer which allows users to download whole directory as a zip file. The problem is I end up with error when I want to download a dir containg special characters in it's name, i.e. 'c++'. I don't want to force users to NOT name their folders with those special chars, so I need a clue on this one. I noticed that the whole problem comes down to GET protocol. I use ajax POST for example to roll out the directory content, but for making a .zip file and downloading it I need GET:
var dir_clicked = $(e.target).attr('path'); //let's say it equals '/c++'
window.location = 'myDownloadSite.php?directory_path='+dir_clicked;
I studied whole track of dir_clicked variable, step by step, and it seems that the variable in adress is sent correctly (I see the correct url in browser) but typing:
echo $_GET['directory_path']
in myDownloadSite.php prints
'/c'
instead of
'/c++'
Why the GET protocol is cutting my pluses?
You can use:
encodeURIComponent() //to get the url then use
decodeURIComponent() //to decode and access ur filename.
Use urlencode() and urldecode() on server side.
Try encoding your URI with encodeURI(url) JavaScript function.
window.location = encodeURI('myDownloadSite.php?directory_path=' + dir_clicked);
Maybe use encodeURIComponent() and then remove all %xx occurrences?
When the information is posted it is encoded with special chars, sounds like you just need to decode them before using the information.
You can use php function urldecode() to decode the folder names before using them...
$_GET[directory_path]=urldecode($_GET[directory_path]);

Jquery, ajax and the ampersand conundrum

I know that I should encodeURI any url passed to anything else, because I read this:
http://www.digitalbart.com/jquery-and-urlencode/
I want to share the current time of the current track I am listening to.
So I installed the excellent yoururls shortener.
And I have a bit of code that puts all the bits together, and makes the following:
track=2&time=967
As I don't want everyone seeing my private key, I have a little php file which takes the input, and appends the following, so it looks like this:
http://myshorten.example/yourls-api.php?signature=x&action=shorturl&format=simple&url=http://urltoshorten?track=2&time=967
So in the main page, I call the jquery of $("div.shorturl").load(loadall);
It then does a little bit of CURL and then shortener returns a nice short URL.
Like this:
$myurl='http://myshorten.example/yourls-api.php?signature=x&action=shorturl&format=simple&url=' . $theurl;
$ch = curl_init($myurl);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
curl_close($ch);
if ($data === false) {
echo 'cURL failed';
exit;
}
echo $data;
All perfect.
Except... the URL which is shortened is always in the form of http://urltoshorten?track=2 - anything after the ampersand is shortened.
I have tried wrapping the whole URL in php's URLencode, I've wrapped the track=2&time=967 in both encodeURI and encodeURIComponent, I've evem tried wrapping the whole thing in one or both.
And still, the & breaks it, even though I can see the submitted url looks like track=1%26time%3D5 at the end.
If I paste this or even the "plain" version with the unencoded url either into the yoururls interface, or submit it to the yoururls via the api as a normal URL pasted into the location bar of the browser, again it works perfectly.
So it's not yoururls at fault, it seems like the url is being encoded properly, the only thing I can think of is CURL possibly?
Now at this point you might be thinking "why not replace the & with a * and then convert it back again?".
OK, so when the url is expanded, I get the values from
var track = $.getUrlVar('track');
var time = $.getUrlVar('time');
so I COULD lose the time var, then do a bit of finding on where the * is in track and then assume the rest of anything after * is the time, but it's a bit ugly, and more to the point, it's not really the correct way to do things.
If anyone could help me, it would be appreciated.
I have tried wrapping the whole URL in php's URLencode
That is indeed what you have to do (assuming by ‘URL’ you mean inner URL being passed as a component of the outer URL). Any time you put a value in a URL component, you need to URL-encode, whether the value you're setting is a URL or not.
$myurl='http://...?...&url='.rawurlencode($theurl);
(urlencode() is OK for query parameters like this, but rawurlencode() is also OK for path parts, so unless you really need spaces to look slightly prettier [+ vs %20], I'd go for rawurlencode() by default.)
This will give you a final URL like:
http://myshorten.example/yourls-api.php?signature=x&action=shorturl&format=simple&url=http%3A%2F%2Furltoshorten%3Ftrack%3D2%26time%3D967
Which you should be able to verify works. If it doesn't, there is something wrong with yourls-api.php.
I have tried wrapping the whole URL in php's URLencode, I've wrapped the track=2&time=967 in both encodeURI and encodeURIComponent, I've evem tried wrapping the whole thing in one or both. And still, the & breaks it, even though I can see the submitted url looks like track=1%26time%3D5 at the end.
Maybe an explanation of how HTTP variables work will help you out.
If I'm getting a page with the following variables and values:
var1 = Bruce Oxford
var2 = Brandy&Wine
var3 = ➋➌➔ (unicode chars)
We uri-encode the var name and the value of the var, ie:
var1 = Bruce+Oxford
var2 = Brandy%26Wine
var3 = %E2%9E%8B%E2%9E%8C%E2%9E%94
What we are not doing is encoding the delimiting charecters, so what the request data will look like for the above is:
?var1=Bruce+Oxford&var2=Brandy%26Wine&var3=%E2%9E%8B%E2%9E%8C%E2%9E%94
Rather than:
%3Fvar1%3DBruce+Oxford%26var2%3DBrandy%26Wine%26var3%3D%E2%9E%8B%E2%9E%8C%E2%9E%94
Which is of course just gibberish.

Categories