char limit with php file_get_contents() and Google Chart API? - php

The specific issue I am working on is enabling HTTPS with the Google Charts API, and a possible character limit when using PHP's file_get_contents() on a URL string. Let me take you through what is going on. I have made good progress using some tutorials on the net, specifically to enable HTTPS. I am using the 'basic method' from this tutorial:
http://webguru.org/2009/11/09/php/how-to-use-google-charts-api-in-your-secure-https-webpage/
I have a chart.php file with this code in it:
<?php
// Server-side proxy: fetch the chart image from Google over plain HTTP
// and echo it back, so the browser only ever talks to our HTTPS URL.
$url = urldecode($_GET['api_url']);
$image_contents = file_get_contents($url);
echo $image_contents;
exit;
?>
I am calling this file from my main page, passing a 'test' Google chart URL (I have used many different ones) to it, which is 513 chars long:
$chartUrl = urlencode('http://chart.apis.google.com/chart?chxl=0:|Jan|Feb|Mar|Jun|Jul|Aug|1:|100|75|50|25|0&chxt=x,y&chs=300x150&cht=lc&chd=t:60.037,57.869,56.39,51.408,42.773,39.38,38.494,31.165,30.397,26.876,23.841,20.253,16.232,13.417,12.677,15.248,16.244,13.434,10.331,10.58,9.738,10.717,11.282,10.758,10.083,17.299,6.142,19.044,7.331,8.898,14.494,17.054,16.546,13.559,13.892,12.541,16.004,20.026,18.529,20.265,23.13,27.584,28.966,31.691,36.72,40.083,41.538,42.788,42.322,43.593,44.326,46.152,46.312,47.454&chg=25,25&chls=0.75,-1,-1');
To display the image in my main page I am using this code:
<img src="https://mysite.com/chart.php?api_url=<?php echo $chartUrl; ?>" />
With the example $chartUrl above (513 characters unencoded), nothing displays. Everything works fine until the unencoded $chartUrl string exceeds 512 characters. For example, with the string below (512 characters long):
$chartUrl = urlencode('http://chart.apis.google.com/chart?chxl=0:|Jan|Feb|Mar|Jun|Jul|Aug|1:|100|75|50|25|0&chxt=x,y&chs=300x150&cht=lc&chd=t:60.037,57.869,56.39,51.408,42.773,39.38,38.494,31.165,30.397,26.876,23.841,20.253,16.232,13.417,12.677,15.248,16.244,13.434,10.331,10.58,9.738,10.717,11.282,10.758,10.083,17.299,6.142,19.044,7.331,8.898,14.494,17.054,16.546,13.559,13.892,12.54,16.004,20.026,18.529,20.265,23.13,27.584,28.966,31.691,36.72,40.083,41.538,42.788,42.322,43.593,44.326,46.152,46.312,47.454&chg=25,25&chls=0.75,-1,-1');
the chart shows up. The two strings differ by a single character. The 'real' Google Charts API string that I will be using in the final version is about 1250 characters long.
So is this a limit in file_get_contents()? I have looked at cURL as an alternative, but its specifics go over my head. Can someone confirm the character limit and, if possible, make some suggestions?
Many thanks,
Neil

Edit: Unlike I stated below, this is probably not a server problem: Apache's limit on GET strings is said to be around 4000 bytes. The workaround I suggest is still valid, though, so I'm leaving this answer in place.
This is an awful lot of data to put in a GET string, and it could be a server-side limitation (Apache handling the request) as much as a client-side one (file_get_contents() sending the request).
I would look for an alternative way of doing this, e.g. by storing the long URL in a session variable with a random key:
$_SESSION["URL_1923843294284"] = $loooooong_url;
and pass that random key in the URL:
<img src="https://mysite.com/chart.php?api_url=1923843294284" />
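A minimal sketch of that approach, assuming both pages call session_start() (the uniqid() key generation and the 404 handling are my additions, not part of the original suggestion):

<?php
// Main page: store the long chart URL in the session under a random key.
session_start();
$key = uniqid('', true); // any sufficiently random id works
$_SESSION['URL_' . $key] = $loooooong_url;
echo '<img src="https://mysite.com/chart.php?api_url=' . $key . '" />';

And in chart.php, resolve the key back to the stored URL instead of trusting a raw URL from the client (which also stops the script acting as an open proxy):

<?php
session_start();
$key = 'URL_' . $_GET['api_url'];
if (!isset($_SESSION[$key])) {
    header('HTTP/1.1 404 Not Found');
    exit;
}
echo file_get_contents($_SESSION[$key]);
exit;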
Update: There does not seem to be a native length limit to file_get_contents() according to this question. This may well be a server issue.
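Since the asker found cURL's specifics daunting, here is a minimal sketch of the same fetch done with PHP's cURL extension; it behaves like file_get_contents() here, but makes it easier to inspect errors (curl_error()) if the server turns out to be the problem:

<?php
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body as a string
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow any redirects
$image_contents = curl_exec($ch);
curl_close($ch);
echo $image_contents;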

Related

file_get_contents not working on url with hashtag symbol in it

I have url which I want to parse:
http://www.ntvplus.ru/tv/#genre=all&channel=3385&channel=3384&channel=3416&date=22.02.2016&date=23.02.2016&date=24.02.2016&date=25.02.2016&time=day
I tried using both file_get_contents() and cURL, but they both seem to send the request to http://www.ntvplus.ru/tv/ only and ignore everything after the hash symbol.
I tried escaping it but still no luck.
Can someone please provide a working solution for this?
The fragment (everything after the # symbol) is client-side only; it is never sent to the server, so it is ignored by a server-side request such as file_get_contents() (or even cURL).
As @brevis has suggested in the comments, you have to use the query string version:
$content = file_get_contents('http://www.ntvplus.ru/tv/?genre=all&channel=3385&channel=3384&channel=3416&date=22.02.2016&date=23.02.2016&date=24.02.2016&date=25.02.2016&time=day');
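If the fragment-style URL is the only form you have, one sketch is to rewrite the first # into a ? before fetching; this assumes, as in this case, that the fragment holds exactly the parameters the server expects as a query string:

<?php
// Everything after '#' never leaves the browser, so rewrite the fragment
// into a query string before making the server-side request.
$url = 'http://www.ntvplus.ru/tv/#genre=all&channel=3385&time=day'; // abbreviated
$fetchable = preg_replace('/#/', '?', $url, 1); // replace only the first '#'
$content = file_get_contents($fetchable);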

Decode a byte encoded string via my URL

We have a PHP site on Zend Framework with a backend Postgresql database. Our primary character encoding is UTF-8.
I just checked our error log and found a strange entry. My URL is as follows:
www.mydomain.com/schuhe-für-breite-füsse
however someone (or maybe a bot) has tried to access this URL as follows:
www.mydomain.com/schuhe-f\xc3\xbcr-breite-f\xc3\xbcsse/
It's the first time I've seen something like the above. Two things are happening on my page:
1) The above URL is queried against our CMS. This works fine for some reason; I think PostgreSQL realises it is byte-encoded and converts it back when trying to find this SEF URL in our database.
2) An Ajax request is made on the page, passing the same SEF URL. This fails. I believe the backslashes are causing a problem in JavaScript.
To avoid this I want to decode any URL that is encoded like this. However a quick test of the following code did not decode anything for me :(
$landing_sef_url = $this->_getParam('landing_sef_url');
$utf8=html_entity_decode($landing_sef_url);
$iso8859=utf8_decode($utf8);
$test3 = html_entity_decode($landing_sef_url, 1, "ISO-8859-1");
$test4 = urldecode($landing_sef_url);
echo utf8_decode("$landing_sef_url");
echo "<br/><br/>";
die($landing_sef_url . " -- $utf8 -- $iso8859 <br/>$test3<br/>$test4");
I found the above via various posts online but they all print back the same result - schuhe-f\xc3\xbcr-breite-f\xc3\xbcsse
Any help would be MUCH appreciated. Many thanks!
This method seems to do what you're looking for:
http://li.php.net/manual/en/function.stripcslashes.php
But if you're just looking to unescape \x## sequences, you could also do this with a fairly simple regular expression.
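For instance (a quick sketch, using the string from the question):

<?php
// stripcslashes() understands C-style \x## escapes and turns them back
// into raw bytes: here the UTF-8 pair C3 BC, i.e. "ü".
$raw = 'schuhe-f\xc3\xbcr-breite-f\xc3\xbcsse';
echo stripcslashes($raw), "\n"; // schuhe-für-breite-füsse (as UTF-8)

// Regex alternative that unescapes only \x## sequences:
$decoded = preg_replace_callback('/\\\\x([0-9a-fA-F]{2})/', function ($m) {
    return chr(hexdec($m[1]));
}, $raw);
echo $decoded, "\n"; // same result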

node.js hmac.digest() output seems wrong

I'm trying to implement a Facebook library with node.js, and the request signing isn't working. I have the PHP example seen here translated into node. I'm trying it out with the example given there, where the secret is the string "secret". My code looks like this:
var signedRequest = request.signed_request.split('.');
var sig = b64url.decode(signedRequest[0]);
var expected = crypto.createHmac('sha256', 'secret').update(signedRequest[1]).digest();
console.log(sig == expected); // false
I can't console.log the decoded strings themselves, because they have special characters that cause the console to clear (if you have a suggestion to get around that please let me know) but I can output the b64url encodings of them.
The expected encoded sig, as you can see on the FB documentation, is
vlXgu64BQGFSQrY0ZcJBZASMvYvTHu9GQ0YM9rjPSso
My expected value, when encoded, is
wr5Vw6DCu8KuAUBhUkLCtjRlw4JBZATCjMK9wovDkx7Dr0ZDRgzDtsK4w49Kw4o
So why do I think it's digest that's wrong? Maybe the error is on my side? Well, if I execute the exact example in PHP given in the documentation, the correct result comes out. But if I change the hash_hmac call so the last parameter is false, outputting hex, I get
YmU1NWUwYmJhZTAxNDA2MTUyNDJiNjM0NjVjMjQxNjQwNDhjYmQ4YmQzMWVlZjQ2NDM0NjBjZjZiOGNmNGFjYQ==
Now, if I go back to my javascript code, and change my hmac code to .digest("hex") instead of the default "binary" and log the base64 encoding of the result, I get... surprise!
YmU1NWUwYmJhZTAxNDA2MTUyNDJiNjM0NjVjMjQxNjQwNDhjYmQ4YmQzMWVlZjQ2NDM0NjBjZjZiOGNmNGFjYQ
Same, except the trailing == padding is missing, but I think that's a console thing; I can't imagine that being the issue, since without the padding it isn't even a valid base64 string length.
So, how come the digest method outputs the correct result when using hex, but the wrong answer when using binary? Is the binary not quite the same as the "raw" output of the PHP equivalent? And if that's the case what is the correct way to call it?
We have discovered that this was indeed a bug in the crypto lib, and was a known issue logged on GitHub. We will have to upgrade and get the fix.
I am Tesserex's partner. I believe the answer may have been combination of both Tesserex's self posted answer and Juicy Scripter's answer. We were still using Node ver. 0.4.7. The bug Tesserex mentioned can be found here: https://github.com/joyent/node/issues/324. I'm not entirely certain that this bug affected us, but it seems a good possibility. We updated Node to ver 0.6.5 and applied Juicy Scripter's solution and everything is now working. Thank you.
As a note about the suggestion of using existing libraries: most of the existing libraries require Express, which is something we are trying to avoid due to some of the specifics of our application. The existing libraries also tend to assume that you're using node.js as a web server, answering a single user's request at a time. We are using persistent connections with websockets, and our Facebook client will be handling session data for multiple users simultaneously. Eventually I hope to make our Facebook client open source for use with applications like ours.
Actually there is no problem with digest; the result of b64url.decode is a UTF-8 string by default (the encoding can be specified via the second parameter). If you use:
var sig = b64url.decode(signedRequest[0], 'binary');
var expected = crypto.createHmac('sha256', 'secret').update(signedRequest[1]).digest();
// sig === expected
signature and result of digest will be the same.
You may also check this by turning the digest result into a UTF-8 encoded string:
var sig = b64url.decode(signedRequest[0]);
var expected = crypto.createHmac('sha256', 'secret').update(signedRequest[1]).digest();
var expected_buffer = new Buffer(expected, 'binary'); // 'expected' is a binary string
// sig === expected_buffer.toString()
Also you may consider using existing libraries to do that kind of work (and probably more), to name a few:
facebook-wrapper
everyauth
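For reference, a small PHP sketch of the hex-versus-raw distinction discussed above (the payload is a placeholder, not a real Facebook signed request): hash_hmac()'s fourth argument selects raw binary output, which corresponds to Node's digest('binary'), while false yields hex like digest('hex').

<?php
$secret  = 'secret';
$payload = 'example-signed-request-payload'; // placeholder

$raw = hash_hmac('sha256', $payload, $secret, true);  // raw bytes
$hex = hash_hmac('sha256', $payload, $secret, false); // lowercase hex

var_dump(bin2hex($raw) === $hex); // bool(true)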

cutting special chars in folder name when using GET

I've been visiting stackoverflow.com for a long time and always found the solution to my problem. But this time it's different. That's why I'm posting my first question here.
The situation looks like this: my website provides a directory explorer which allows users to download a whole directory as a zip file. The problem is that I end up with an error when I want to download a directory containing special characters in its name, e.g. 'c++'. I don't want to force users NOT to name their folders with those special chars, so I need a clue on this one. I noticed that the whole problem comes down to the GET request. I use Ajax POST, for example, to roll out the directory contents, but for making a .zip file and downloading it I need GET:
var dir_clicked = $(e.target).attr('path'); //let's say it equals '/c++'
window.location = 'myDownloadSite.php?directory_path='+dir_clicked;
I traced the dir_clicked variable step by step, and it seems the variable is sent correctly in the address (I see the correct URL in the browser), but typing:
echo $_GET['directory_path']
in myDownloadSite.php prints
'/c'
instead of
'/c++'
Why is the GET request cutting off my pluses?
You can use encodeURIComponent() to encode the value before putting it in the URL, and decodeURIComponent() to decode it and access your filename.
Use urlencode() and urldecode() on the server side.
Try encoding your URI with the encodeURI(url) JavaScript function.
window.location = encodeURI('myDownloadSite.php?directory_path=' + dir_clicked);
Maybe use encodeURIComponent() and then remove all %xx occurrences?
When the information is sent, special characters need to be encoded; it sounds like you just need to decode them before using the information.
You can use php function urldecode() to decode the folder names before using them...
$_GET['directory_path'] = urldecode($_GET['directory_path']);
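To tie the answers together, a small server-side sketch (the /c++ value is the one from the question; note that PHP decodes $_GET automatically, so an extra urldecode() is only needed if the value was double-encoded):

<?php
// A literal '+' must travel as %2B; an unencoded '+' in a query string
// decodes to a space on the server.
echo urlencode('/c++'), "\n"; // %2Fc%2B%2B -> arrives in $_GET as "/c++"

// What happens when the client sends the pluses unencoded:
parse_str('directory_path=/c++', $params);
var_dump($params['directory_path']); // string(4) "/c  " (pluses became spaces)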

Why doesn't jQuery.parseJSON() work on all servers?

Hey there, I have an Arabic contact script that uses Ajax to retrieve a response from the server after filling the form.
On some Apache servers, jQuery.parseJSON() throws an invalid JSON exception for the same JSON it parses perfectly on other servers. This exception is thrown only in Chrome and IE.
The JSON content is encoded using PHP's json_encode() function. I tried sending the correct header with the JSON data and setting the charset to UTF-8, but that didn't help.
This is one of the JSON responses I try to parse (I removed the second part of it because it's long):
{"pageTitle":"\u062e\u0637\u0623 \u0639\u0646\u062f \u0627\u0644\u0625\u0631\u0633\u0627\u0644 !"}
Note: The language of this data is Arabic; that's why it looks like this after being encoded with PHP's json_encode().
You can try to make a request in the examples given below and look at the full response data using Firebug or the WebKit developer tools. The response passes JSONLint!
Finally, I have two urls using the same version of the script, try to browse them using chrome or IE to see the error in the broken example.
The working example : http://namodg.com/n/
The broken example: http://www.mt-is.co.cc/my/call-me/
Update: To clarify further, I managed to fix this by using the old eval() to parse the content. I released another version with this fix; it was like this:
// Parse the JSON data
try
{
    // Use jquery's default parser
    data = $.parseJSON(data);
}
catch(e)
{
    /*
     * Fix a bug where strange unicode chars in the json data makes the jQuery
     * parseJSON() throw an error (only on some servers), by using the old eval() - slower though!
     */
    data = eval( "(" + data + ")" );
}
I still want to know if this is a bug in jquery's parseJSON() method, so that I can report it to them.
Found the problem! It was very hard to notice, but I saw something funny about that opening brace... there seemed to be a couple of little dots near it. I used this JavaScript bookmarklet to find out what it was:
javascript:window.location='http://www.google.com/search?q=u+'+('000'+prompt('String?').charCodeAt(prompt('Index?')).toString(16)).slice(-4)
The bookmarklet took me to the Google results page. Guess what the problem is! There is an invisible character, repeated twice actually, at the beginning of your output. The zero-width no-break space, also called the Unicode byte order mark (BOM), is the reason why jQuery is rejecting your otherwise valid JSON and why pasting the JSON into JSONLint mysteriously works (depending on how you do it).
One way to get this unwanted character into your output is to save your PHP files using Windows Notepad in UTF-8 mode! If this is what you are doing, get another text editor such as Notepad++. Resave all your PHP files without the BOM to fix your problem.
Step 1: Set up Notepad++ to encode files in UTF-8 without BOM by default.
Step 2: Open each existing PHP file, change the Encoding setting, and resave it.
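If re-saving files by hand is error-prone, a quick sketch for finding the offending files (the glob pattern is an assumption; adjust it for your directory tree):

<?php
// Scan PHP files for a leading UTF-8 BOM (bytes EF BB BF), which gets
// prepended to every response the file produces and breaks strict JSON parsers.
foreach (glob('*.php') as $file) {
    if (file_get_contents($file, false, null, 0, 3) === "\xEF\xBB\xBF") {
        echo "BOM found in $file\n";
    }
}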
You should try using json2.js (it's on https://github.com/douglascrockford/JSON-js)
Even John Resig (creator of jQuery) says you should:
This version of JSON.js is highly recommended. If you're still using the old version, please please upgrade (this one, undoubtedly, cause less issues than the previous one).
http://ejohn.org/blog/the-state-of-json/
I don't see anything related to parseJSON()
The only difference I see is that in the working example a session cookie is set (I guess it is needed for the "captcha", the mathematical calculation), while in the broken example no session cookie is set. So maybe the comparison of the calculation result fails without the session cookie.
