Google App Engine (PHP): Response too large

When using the get_headers() php function in a deployed app, e.g.:
$aHeaders = get_headers("http://[...].mp3", 1);
echo $aHeaders['Content-Length'];
I get the following error:
PHP Warning: get_headers(http://[...].mp3): failed to open stream:
Response too large in /base/data/home/apps/[...]/main.php
The error doesn't appear when the file is small (e.g. 100kb).
I need to get the size of a file on an external server without having to download it. Also, I can't use curl as it is not supported by GAE. Any ideas?

Have you tried sending a HEAD request instead of a GET? A GET downloads the whole body, whereas a HEAD returns only the headers:
stream_context_set_default(
array(
'http' => array(
'method' => 'HEAD'
)
)
);
$headers = get_headers('http://[...].mp3', 1);
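Once the headers come back, the size still has to be read out of the array. The following is a minimal sketch, not part of the original answer: content_length_from_headers() is a hypothetical helper name, and it assumes the server actually sends a Content-Length header. It allows for two details of get_headers($url, 1): header names keep whatever case the server used, and a header repeated across redirects arrives as an array.

```php
<?php
// Hypothetical helper: extract Content-Length from the array returned by
// get_headers($url, 1). Returns an int, or null when the header is absent.
function content_length_from_headers(array $headers)
{
    foreach ($headers as $name => $value) {
        // Numeric keys hold status lines such as "HTTP/1.1 200 OK"; skip them.
        if (!is_string($name) || strcasecmp($name, 'Content-Length') !== 0) {
            continue;
        }
        // After redirects the header may appear once per hop, as an array.
        // The last entry belongs to the final response.
        if (is_array($value)) {
            $value = end($value);
        }
        $value = trim((string) $value);
        return preg_match('/^\d+$/', $value) ? (int) $value : null;
    }
    return null;
}
```

get_headers() returns false on failure, so check its result before passing it to the helper.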

Related

PHP script no longer receiving data from context stream

We have a "legacy" script that stopped working a little while back. Pretty sure it's because the endpoint it's connecting to changed from http to https, and the old http address now returns a 301.
I've never done anything other than tiny changes to PHP scripts, so am a little out of my depth here.
Note that our PHP version is old - 5.3.0. This may well be part of the problem.
The script as-is (relevant bit anyway):
$uri = "http://www.imf.org/external/np/fin/data/rms_mth.aspx"
."?SelectDate=$date&reportType=CVSDR&tsvflag=Y";
$opts = array('http' => array(
'proxy' => 'tcp://internal.proxy.address:port',
'method' => 'GET',
'request_fulluri' => true)
);
$ctx = stream_context_create($opts);
$lines = file($uri, false, $ctx);
foreach ($lines as $line)
...
This returns nothing any more. The link btw is the IMF link for exchange rates, so that is open to all - if you open it you'll get a download with a rate table in it. The rest of the script basically parses this for the data we want.
Now, pretty sure our proxy is OK. Running some tests with curl gives the following results:
curl --proxy tcp://internal.proxy.address:port -v 'https://www.imf.org/external/np/fin/data/rms_mth.aspx?SelectDate=05/28/2020&reportType=CVSDR&tsvflag=Y'
(specify https) works just fine.
curl --proxy tcp://internal.proxy.address:port -v 'http://www.imf.org/external/np/fin/data/rms_mth.aspx?SelectDate=05/28/2020&reportType=CVSDR&tsvflag=Y'
(specify http) does not work, and returns a 301 redirect
curl --proxy tcp://internal.proxy.address:port -v -L 'http://www.imf.org/external/np/fin/data/rms_mth.aspx?SelectDate=05/28/2020&reportType=CVSDR&tsvflag=Y'
(specify http with follow redirects) then works OK.
I've tried a few things after some googling. It seems I need opts for 'ssl' as well when using https. So I've made the following changes
$uri = "https://www.imf.org/external/np/fin/data/rms_mth.aspx"
."?SelectDate=$date&reportType=CVSDR&tsvflag=Y";
$opts = array('http' => array(
'proxy' => 'tcp://internal.proxy.address:port',
'method' => 'GET',
'request_fulluri' => true),
'ssl' => array(
'verify_peer' => false,
'verify_peer_name' => false,
'SNI_enabled' => false)
);
Sadly, the SNI_enabled flag was introduced after 5.3.0, so I don't think this helps. There's also a follow_location context option for http, but that was introduced in 5.3.4, so also no use.
(BTW, I have little to no control over the version of PHP we have, so while I appreciate higher versions may offer better solutions, that's not a lot of use to me I'm afraid).
Basically, I am now stuck. No combination of these parameters or settings returns any data at all. I can see it works via curl and the proxy, so it's not a general connectivity issue.
Any and all suggestions gratefully received!
Update: After adding the lines to enable error reporting, the error code is for the stream connecting:
Warning: file(https://www.imf.org/external/np/fin/data/rms_mth.aspx?SelectDate=05/28/2020&reportType=CVSDR&tsvflag=Y): failed to open stream: Cannot connect to HTTPS server through proxy in /usr/bass/apps/htdocs/BASS/mods/module.XSM.php on line 79
(line 79 is the $lines = ... line)
So it doesn't connect in the php script, but running the same connection via the proxy in curl works fine. What's the difference in php that causes this?
You can use PHP's cURL functions to fetch the response from the URL, and then use explode() to split the response into lines.
$uri = "https://www.imf.org/external/np/fin/data/rms_mth.aspx"
."?SelectDate=$date&reportType=CVSDR&tsvflag=Y";
$opts = array(
CURLOPT_URL => $uri,
CURLOPT_PROXY => 'tcp://internal.proxy.address:port',
CURLOPT_HEADER => false,
CURLOPT_RETURNTRANSFER => true
);
$ch = curl_init();
curl_setopt_array($ch, $opts);
$lines = curl_exec($ch);
curl_close($ch);
$lines = explode("\n", $lines); // breaking the whole response string line by line
foreach ($lines as $line)
...
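One detail of the explode("\n", ...) step is worth a sketch. If the server sends Windows-style line endings (an assumption, it is not confirmed for this endpoint), every line keeps a trailing "\r". The hypothetical helper below, split_lines(), splits on any common line ending and drops empty lines.

```php
<?php
// Hypothetical helper: split a response body into lines whether the server
// uses "\r\n", "\n" or "\r", and discard empty lines.
function split_lines($body)
{
    return preg_split('/\r\n|\r|\n/', $body, -1, PREG_SPLIT_NO_EMPTY);
}
```

curl_exec() returns false on failure, so check its result before splitting. If the script keeps the old http:// address, CURLOPT_FOLLOWLOCATION is the PHP counterpart of the -L flag used in the curl tests.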

PHP simplexml_load_file and file_get_contents with email and password

I'd like to fetch a password-protected XML file in my script via a URL like:
account@domain.com:password@domain2.com/file.xml
I found a version of the URL that works in a web browser (typing it in shows the XML with no redirect or password prompt):
account%40domain.com:password@domain2.com/file.xml
So I tried this URL with simplexml_load_file() and file_get_contents(), but it did not work. (I know these functions need URL-encoding because of the special character in the URL, but account%40domain.com:password%40domain2.com/file.xml does not work either.)
So I tried another solution found on Stack Overflow:
$username = 'account@domain.com';
$password = 'password';
$context = stream_context_create(array(
'http' => array(
'header' => "Authorization: Basic " . base64_encode("$username:$password")
)
));
$data = file_get_contents('domain2.com/file.xml', false, $context);
And I get error:
...failed to open stream: No such file or directory in...
I also tried curl, with no success either (it returned a 302). I'm out of ideas for fixing this. Can anyone help me, please?
file_get_contents will follow redirects (e.g. 302) automatically, but you do need to tell it that you're fetching a remote file, by providing the scheme. Change
$data = file_get_contents('domain2.com/file.xml', false, $context);
to
$data = file_get_contents('http://domain2.com/file.xml', false, $context);
(or https:// if appropriate)
If you don't provide the scheme, it will try to open a file on the local machine - i.e. a file called file.xml under the directory domain2.com
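To guard against this in a script, the scheme can be checked before the request. This is only a sketch: ensure_scheme() is a hypothetical helper, and the http default is an assumption. It uses parse_url() to detect a scheme and prepends one when none is found. It does not try to handle every malformed input.

```php
<?php
// Hypothetical helper: prepend a default scheme when the URL has none, so
// PHP uses the http(s) wrapper instead of looking for a local file.
function ensure_scheme($url, $default = 'http')
{
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return $scheme ? $url : $default . '://' . $url;
}

// Example (network call, left commented out):
// $data = file_get_contents(ensure_scheme('domain2.com/file.xml'), false, $context);
```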

file_get_contents() gets 403 from api.github.com every time

I call myself an experienced PHP developer, but this one is driving me crazy. I'm trying to get the release information of a repository in order to display update warnings, but I keep getting 403 errors. To simplify things, I used the simplest possible call to GitHub's API: GET https://api.github.com/zen. It is a kind of hello world.
This works:
directly in the browser
with a plain curl https://api.github.com/zen in a terminal
with a PHP GitHub API class like php-github-api
This does not work:
with a simple file_get_contents() from a PHP script
This is my whole simplified code:
<?php
$content = file_get_contents("https://api.github.com/zen");
var_dump($content);
?>
The browser shows Warning: file_get_contents(https://api.github.com/zen): failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden, and the variable $content is the boolean false.
I guess I'm missing some HTTP header fields, but I can't find that information in the API docs, and my terminal curl call works without any special headers.
This happens because GitHub requires you to send a User-Agent header. It doesn't need to be anything specific. This will do:
$opts = [
'http' => [
'method' => 'GET',
'header' => [
'User-Agent: PHP'
]
]
];
$context = stream_context_create($opts);
$content = file_get_contents("https://api.github.com/zen", false, $context);
var_dump($content);
The output is:
string(35) "Approachable is better than simple."
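PHP's http context also has a dedicated user_agent option, which is used when no User-Agent line is given in the header option. The sketch below wraps it in a hypothetical helper, context_with_user_agent(). The value 'my-app' is a placeholder, not something GitHub prescribes.

```php
<?php
// Hypothetical helper: build a stream context that sends a User-Agent via
// the http wrapper's "user_agent" option rather than a raw header line.
function context_with_user_agent($userAgent = 'my-app')
{
    return stream_context_create(array(
        'http' => array(
            'method'     => 'GET',
            'user_agent' => $userAgent,
        ),
    ));
}

// Example (network call, left commented out):
// $content = file_get_contents('https://api.github.com/zen', false, context_with_user_agent());
```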

File cache in gae PHP

I'm having some trouble using GAE PHP as a simple proxy with file_get_contents().
When I load a file for the first time, I get the latest version available.
But if I change the content of the file, I don't get the latest version immediately.
$result = file_get_contents('http://example.com/'.$url);
The temporary solution I found was to add a random variable at the end of the query string, which gives me a fresh version of the file every time:
$result = file_get_contents('http://example.com/'.$url.'?r=' . rand(0, 9999));
But this trick doesn't work for API calls that already have parameters, for example.
I tried disabling the APC cache in GAE's php.ini (using apc.enabled = "0") and I called clearstatcache() in my script, but neither works.
Any ideas?
Thanks.
As described in the App Engine documentation, the http stream wrapper uses urlfetch. As seen in another question, urlfetch provides a public/shared cache and as such does not allow individual apps to clear it. For your own services you can set the HTTP cache headers to reduce or avoid caching as necessary.
Additionally, you can add HTTP request headers indicating the maximum age of the data that may be returned. The Python example given in the mailing list thread is:
result = urlfetch.fetch(url, headers = {'Cache-Control' : 'max-age=300'})
Per php.net file_get_contents http header example and HTTP header documentation a modified example would be:
<?php
$opts = [
'http' => [
'method' => 'GET',
'header' => "Cache-Control: max-age=60\r\n",
],
];
$context = stream_context_create($opts);
$file = file_get_contents('http://www.example.com/', false, $context);
?>
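On the random-parameter workaround from the question: one possible reason it fails for URLs that already carry parameters (an assumption, the question does not say) is that '?r=' gets appended after an existing query string. The hypothetical helper add_cache_buster() below picks the separator according to what the URL already contains. It ignores URL fragments, and whether the remote API tolerates an extra parameter is not guaranteed.

```php
<?php
// Hypothetical helper: append a cache-busting parameter "r", using "&" when
// the URL already has a query string and "?" otherwise.
function add_cache_buster($url, $token)
{
    $separator = (strpos($url, '?') === false) ? '?' : '&';
    return $url . $separator . 'r=' . rawurlencode($token);
}

// Example (network call, left commented out):
// $result = file_get_contents(add_cache_buster('http://example.com/' . $url, (string) mt_rand()));
```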

PHP Get Content of HTTP 400 Response

I am using PHP with the Amazon Payments web service. I'm having problems with some of my requests. Amazon is returning an error as it should, however the way it goes about it is giving me problems.
Amazon returns XML data with a message about the error, but it also throws an HTTP 400 (or even 404 sometimes). This makes file_get_contents() throw an error right away and I have no way to get the content. I've tried using cURL also, but never got it to give me back a response.
I really need a way to get the XML returned regardless of HTTP status code. It has an important "message" element that gives me clues as to why my billing requests are failing.
Does anyone have a cURL example or otherwise that will allow me to do this? All my requests currently use file_get_contents() but I am not opposed to changing them. Everyone else seems to think cURL is the "right" way.
You have to define a custom stream context (the third argument of file_get_contents()) with the ignore_errors option turned on.
As a follow-up to DoubleThink's post, here is a working example:
$url = 'http://whatever.com';
//Set stream options
$opts = array(
'http' => array('ignore_errors' => true)
);
//Create the stream context
$context = stream_context_create($opts);
//Open the file using the defined context
$file = file_get_contents($url, false, $context);
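With ignore_errors on, the body is returned even for a 400 or 404, so the script has to check the status itself. After a call through the http wrapper, PHP fills the local variable $http_response_header with the response header lines. The sketch below, with the hypothetical helper last_status_code(), reads the status code from such an array. It takes the last status line because a redirect chain produces several.

```php
<?php
// Hypothetical helper: return the status code of the final response from an
// array of header lines (for example $http_response_header), or null.
function last_status_code(array $responseHeaders)
{
    $code = null;
    foreach ($responseHeaders as $line) {
        // Status lines look like "HTTP/1.1 400 Bad Request".
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Example (after the file_get_contents() call above):
// if (last_status_code($http_response_header) >= 400) { /* parse the error XML in $file */ }
```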
