PHP script no longer receiving data from context stream - php

We have a "legacy" script that stopped working a little while back. Pretty sure it's because the endpoint it's connecting to changed from http to https, and the old http address now returns a 301.
I've never done anything other than tiny changes to PHP scripts, so am a little out of my depth here.
Note that our PHP version is old - 5.3.0. This may well be part of the problem.
The script as-is (relevant bit anyway):
$uri = "http://www.imf.org/external/np/fin/data/rms_mth.aspx"
."?SelectDate=$date&reportType=CVSDR&tsvflag=Y";
$opts = array('http' => array(
'proxy' => 'tcp://internal.proxy.address:port',
'method' => 'GET',
'request_fulluri' => true)
);
$ctx = stream_context_create($opts);
$lines = file($uri, false, $ctx);
foreach ($lines as $line)
...
This returns nothing any more. The link btw is the IMF link for exchange rates, so that is open to all - if you open it you'll get a download with a rate table in it. The rest of the script basically parses this for the data we want.
Now, pretty sure our proxy is OK. Running some tests with curl gives the following results:
curl --proxy tcp://internal.proxy.address:port -v https://www.imf.org/external/np/fin/data/rms_mth.aspx?SelectDate=05/28/2020&reportType=CVSDR&tsvflag=Y
(specify https) works just fine.
curl --proxy tcp://internal.proxy.address:port -v http://www.imf.org/external/np/fin/data/rms_mth.aspx?SelectDate=05/28/2020&reportType=CVSDR&tsvflag=Y
(specify http) does not work, and shows a 301 error
curl --proxy tcp://internal.proxy.address:port -v -L http://www.imf.org/external/np/fin/data/rms_mth.aspx?SelectDate=05/28/2020&reportType=CVSDR&tsvflag=Y
(specify http with follow redirects) then works OK.
I've tried a few things after some googling. It seems I need opts for 'ssl' as well when using https. So I've made the following changes
$uri = "https://www.imf.org/external/np/fin/data/rms_mth.aspx"
."?SelectDate=$date&reportType=CVSDR&tsvflag=Y";
$opts = array('http' => array(
'proxy' => 'tcp://internal.proxy.address:port',
'method' => 'GET',
'request_fulluri' => true),
'ssl' => array(
'verify_peer' => false,
'verify_peer_name' => false,
'SNI_enabled' => false)
);
Sadly, the SNI_enabled flag was introduced after 5.3.0, so I don't think this helps. There's also a follow_location context option for http, but that was introduced in 5.3.4, so also no use.
(BTW, I have little to no control over the version of PHP we have, so while I appreciate higher versions may offer better solutions, that's not a lot of use to me I'm afraid).
Basically, I am now stuck. No combination of these parameters or settings returns any data at all. I can see it works via curl and the proxy, so it's not a general connectivity issue.
Any and all suggestions gratefully received!
Update: After adding the lines to enable error reporting, the error code is for the stream connecting:
Warning: file(https://www.imf.org/external/np/fin/data/rms_mth.aspx?SelectDate=05/28/2020&reportType=CVSDR&tsvflag=Y): failed to open stream: Cannot connect to HTTPS server through proxy in /usr/bass/apps/htdocs/BASS/mods/module.XSM.php on line 79
(line 79 is the $lines = ... line)
So it doesn't connect in the php script, but running the same connection via the proxy in curl works fine. What's the difference in php that causes this?

You can use the PHP cURL functions to fetch the response from the URL, and then use explode() to split the response into lines.
$uri = "https://www.imf.org/external/np/fin/data/rms_mth.aspx"
."?SelectDate=$date&reportType=CVSDR&tsvflag=Y";
$opts = array(
CURLOPT_URL => $uri,
CURLOPT_PROXY => 'tcp://internal.proxy.address:port',
CURLOPT_HEADER => false,
CURLOPT_RETURNTRANSFER => true
);
$ch = curl_init();
curl_setopt_array($ch, $opts);
$lines = curl_exec($ch);
curl_close($ch);
$lines = explode("\n", $lines); // breaking the whole response string line by line
foreach ($lines as $line)
...
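A slightly more defensive variant of the cURL approach can be sketched as follows. It assumes the curl extension is available; splitLines and fetchLines are hypothetical helper names, and the proxy string is the same placeholder used in the question. CURLOPT_FOLLOWLOCATION is included so the old http address would also work, but on old PHP versions it cannot be enabled when open_basedir or safe_mode is set.

```php
<?php
// Split a response body into lines regardless of line-ending style.
function splitLines($body)
{
    $lines = preg_split('/\r\n|\r|\n/', $body);
    // Drop the empty element left behind by a trailing newline.
    if (end($lines) === '') {
        array_pop($lines);
    }
    return $lines;
}

// Fetch $uri through $proxy; returns an array of lines, or false on failure.
function fetchLines($uri, $proxy)
{
    $ch = curl_init();
    curl_setopt_array($ch, array(
        CURLOPT_URL            => $uri,
        CURLOPT_PROXY          => $proxy,
        CURLOPT_FOLLOWLOCATION => true,  // follow the 301 from http to https
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HEADER         => false,
    ));
    $body = curl_exec($ch);
    if ($body === false) {
        // Log the reason instead of silently returning nothing.
        error_log('cURL error: ' . curl_error($ch));
    }
    curl_close($ch);
    return $body === false ? false : splitLines($body);
}
```

Unlike file(), which keeps the newline at the end of each line by default, the lines returned here have their line endings removed.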

Related

cannot convert JSON response from windows-1253 to utf8

I'm trying to parse a JSON response from a web service I have no control over.
These are the headers
This is the body I see in php with sensitive parts hidden
I'm using guzzle http client to send the request and to retrieve the response
If I try to decode it directly I receive an empty object so I'm assuming a conversion is needed so I am trying to convert the response contents like this
json_decode(iconv($charset, 'UTF-8', $contents))
or
mb_convert_encoding($contents, 'UTF-8', $charset);
both of which throw an exception.
Notice: iconv(): Wrong charset, conversion from 'windows-1253' to 'UTF-8' is not allowed in Client.php on line 205
Warning: mb_convert_encoding(): Illegal character encoding specified in Client.php on line 208
I've used this piece of code successfully before but I can't understand why it fails now.
Sending the same request using POSTMAN correctly retrieves the data without broken characters and it seems to show the same headers and body received.
I'm updating based on comments.
mb_detect_encoding($response->getBody()) -> UTF-8
mb_detect_encoding($response->getBody()->getContents()) -> ASCII
json_last_error_msg -> Malformed UTF-8 characters, possibly incorrectly encoded
Additionally as a trial and error attempt I tried all iconv encodings to see if any could convert it to utf-8 without an error to detect the encoding using this one
private function detectEncoding($str){
$iconvEncodings = [...]
$finalEncoding = "unknown";
foreach($iconvEncodings as $encoding){
try{
iconv($encoding, 'UTF-8', $str);
return $encoding;
}
catch (\Exception $exception){
continue;
}
}
return $finalEncoding;
}
Apparently no encoding worked and everything gave the same exception. I'm assuming the problem is with retrieving the response json correctly via guzzle and not with iconv itself. It can't be that it's not any of the 1000+ ones.
Some more info with CURL
I just retried the same payload using CURL
/**
* @param $options
* @return bool|string
*/
public function makeCurlRequest($options)
{
$payload = json_encode($options);
// Prepare new cURL resource
$ch = curl_init($this->softoneurl);
curl_setopt_array($ch, [
CURLOPT_RETURNTRANSFER => true, // return web page
CURLOPT_HEADER => false, // don't return headers
CURLOPT_FOLLOWLOCATION => true, // follow redirects
CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
CURLOPT_ENCODING => "", // handle compressed
CURLOPT_USERAGENT => "test", // name of client
CURLOPT_AUTOREFERER => true, // set referrer on redirect
CURLOPT_CONNECTTIMEOUT => 120, // time-out on connect
CURLOPT_TIMEOUT => 120, // time-out on response
CURLINFO_HEADER_OUT => true,
CURLOPT_POST => true,
CURLOPT_POSTFIELDS => $payload,
]);
// Set HTTP Header for POST request
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Content-Type: application/json',
'Content-Length: ' . strlen($payload))
);
// Submit the POST request
$result = curl_exec($ch);
// Close cURL session handle
curl_close($ch);
return $result;
}
I received the exact same string and the exact same results with converting it. Perhaps an option I'm missing?
Apparently there's something wrong with iconv itself in the environment and it's not application specific. Running the following code via SSH
php -r "var_dump(iconv('Windows-1253', 'UTF-8', 'test'));"
yields
PHP Notice: iconv(): Wrong charset, conversion from `Windows-1253' to `UTF-8' is not allowed in Command line code on line 1
PHP Stack trace:
PHP 1. {main}() Command line code:0
PHP 2. iconv(*uninitialized*, *uninitialized*, *uninitialized*) Command line code:1
Command line code:1:
bool(false)
Perhaps some dependency is missing
About 14 hours of troubleshooting later I'm able to answer my own question correctly. In my case since this was running in the context of a CLI command, it caused an issue due to missing libraries. Basically the CLI php binary didn't have access to some libraries iconv needed.
More specifically the gconv libraries.
In my case in Debian 9 it was located in
/usr/lib/x86_64-linux-gnu/gconv
and this folder contains a lot of libraries for each encoding used.
A good way to see this, on a system where you have root access, is to run the command
strace iconv -f <needed_encoding> -t utf-8
It will list the many directories that iconv tries to access, including the gconv folder, and will point you to the location of the ones you need to make available in your SSH environment. If you don't have root access, you have to ask your hosting provider.
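A quick way to check from PHP itself whether the current binary can perform a given conversion is sketched below. It assumes the iconv extension is loaded; iconvSupports and toUtf8 are hypothetical helper names. Running it once under the CLI and once under the web server should show whether the two environments differ.

```php
<?php
// True when this PHP binary's iconv can convert from $charset to UTF-8.
function iconvSupports($charset)
{
    // @ suppresses the "Wrong charset" message; iconv() returns false on failure.
    return @iconv($charset, 'UTF-8', 'test') !== false;
}

// Convert $text to UTF-8, failing loudly when the charset is unavailable.
function toUtf8($text, $charset)
{
    if (!iconvSupports($charset)) {
        throw new RuntimeException("iconv cannot convert from $charset in this environment");
    }
    return iconv($charset, 'UTF-8', $text);
}
```

For example, var_dump(iconvSupports('Windows-1253')); would print bool(false) in an environment like the one described above, where the gconv modules are not reachable.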
Try this:
$response = $guzzle->request('GET', $url);
$type = $response->getHeader('content-type');
$parsed = Psr7\parse_header($type);
$original_body = (string)$response->getBody();
$utf8_body = mb_convert_encoding($original_body, 'UTF-8', $parsed[0]['charset'] ?: 'UTF-8');
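If the Guzzle header helper is not available, the charset can be pulled out of the Content-Type value with a small function. This is a minimal sketch; charsetFromContentType is a hypothetical name. Note that the question's warning indicates mb_convert_encoding() did not accept windows-1253 in that environment, so the extracted charset may still need to go through iconv().

```php
<?php
// Extract the charset parameter from a Content-Type header value.
// Returns $default when the header carries no charset parameter.
function charsetFromContentType($header, $default = 'UTF-8')
{
    if (preg_match('/;\s*charset\s*=\s*"?([^";\s]+)"?/i', $header, $m)) {
        return $m[1];
    }
    return $default;
}
```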
For those who have the same issue, there is a simpler way to resolve it. I know it's three years later, but you can also set a header.
header('Content-Type: application/json; charset=windows-1253');
that solved my problem instantly.

How to get the PHP include function to work with proxy settings?

upload.php file:
<?php
$incfile = $_REQUEST["file"];
include($incfile);
?>
proxy.php file:
<?php
$context = array(
'http' => array(
'proxy' => "tcp://proxy.example.com:80",
'request_fulluri' => true,
'verify_peer' => false,
'verify_peer_name' => false,
)
);
stream_context_set_default($context);
?>
php.ini:
auto_prepend_file=proxy.php
allow_url_include=1
I browse to http://testexample.com/upload.php?file=http://example.com/file.php but http://example.com/file.php times out with error Warning: include(): failed to open stream: Connection timed out. I played with echo file_get_contents and used the URL path and that works fine as it appears to honor the proxy settings. So does anyone know what the issue might be with using include or why it does not use my proxy settings?
Edit: As a workaround I used this code below:
<?php
$incfile = $_REQUEST["file"];
$filecontent = file_get_contents($incfile);
eval($filecontent);
?>
The problem with this, though, is that it reads the PHP in as a string rather than as a whole file. So I have to remove the PHP opening and closing tags, which changes the GET request body and so affects my results. So even though it kind of works, the include function is really what I need.
So you need your HTTP requests for example.com to go through proxy.example.com. Would it suffice to simply override DNS for example.com to point to proxy.example.com, perhaps in the hosts file, on this development server? Then you could
include 'http://example.com/file.php';
If you want to limit the solution to PHP, you could define a custom stream wrapper for your proxy.
http://php.net/manual/en/function.stream-wrapper-register.php
http://php.net/manual/en/stream.streamwrapper.example-1.php
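On the eval() workaround from the question's edit: eval() starts out in PHP mode, so prefixing the fetched source with a closing tag lets it parse a complete file, opening and closing tags included, without stripping anything. A minimal sketch; runPhpSource is a hypothetical helper name, and evaluating remotely fetched code is only reasonable when the source is fully trusted.

```php
<?php
// Hypothetical helper: evaluate a complete PHP file held in a string.
function runPhpSource($source)
{
    // The leading closing tag leaves PHP mode first, so the tags inside
    // $source are handled the same way they would be in an included file.
    return eval('?>' . $source);
}
```

With this, the workaround becomes runPhpSource(file_get_contents($incfile)); and the file_get_contents() call still goes through the default stream context, which the question reports does honor the proxy settings.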

file_get_contents() gets 403 from api.github.com every time

I call myself an experienced PHP developer, but this one drives me crazy. I'm trying to get the release information of a repository to display update warnings, but I keep getting 403 errors. To simplify it, I used the simplest call in GitHub's API: GET https://api.github.com/zen. It is a kind of hello world.
This works
directly in the browser
with a plain curl https://api.github.com/zen in a terminal
with a PHP-Github-API-Class like php-github-api
This does not work
with a simple file_get_contents() from a PHP script
This is my whole simplified code:
<?php
$content = file_get_contents("https://api.github.com/zen");
var_dump($content);
?>
The browser shows Warning: file_get_contents(https://api.github.com/zen): failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden, and $content is the boolean false.
I guess I'm missing some HTTP header fields, but I can't find that information in the API docs, and my terminal curl call works without any special headers.
This happens because GitHub requires you to send a User-Agent header. It doesn't need to be anything specific. This will do:
$opts = [
'http' => [
'method' => 'GET',
'header' => [
'User-Agent: PHP'
]
]
];
$context = stream_context_create($opts);
$content = file_get_contents("https://api.github.com/zen", false, $context);
var_dump($content);
The output is:
string(35) "Approachable is better than simple."

XML-RPC failing to respond to POST requests via cURL in PHP

I'm having some issues with calling WordPress XML-RPC via cURL in PHP. It's a WordPress.com hosted blog, and the XML-RPC file is located at http://sunseekerblogbook.com/xmlrpc.php.
Starting yesterday (or at least, yesterday was when it was noticed), cURL has been failing with error #52: Empty reply from server.
The code snippet we're using is below:
$ch = curl_init('http://sunseekerblogbook.com/xmlrpc.php');
curl_setopt_array($ch, [
CURLOPT_HEADER => false,
CURLOPT_HTTPHEADER => [
'Content-Type: text/xml'
],
CURLOPT_POSTFIELDS => xmlrpc_encode_request('wp.getPosts', [
1,
WP_USERNAME,
WP_PASSWORD,
[
'number' => 15
]
]),
CURLOPT_RETURNTRANSFER => true
]);
$ret = curl_exec($ch);
$data = xmlrpc_decode($ret, 'UTF-8');
Using cURL directly however, everything returns exactly as expected:
$output = [];
exec('curl -d "<?xml version=\"1.0\" encoding=\"UTF-8\"?><methodCall><methodName>wp.getPosts</methodName><params><param><value><int>1</int></value></param><param><value><string>' . WP_USERNAME . '</string></value></param><param><value><string>' . WP_PASSWORD . '</string></value></param><param><value><struct><member><name>number</name><value><int>15</int></value></member></struct></value></param></params></methodCall>" sunseekerblogbook.com/xmlrpc.php', $output);
$data = xmlrpc_decode(implode('', $output), 'UTF-8');
We've been successfully able to query WordPress since July 2013, and we're at a dead-end as to why this has happened. It doesn't look like PHP or cURL have been updated/changed recently on the server, but the first code snippet has failed on every server we've tried it on now (with PHP 5.4+).
Using the http://sunseekerblogbook.wordpress.com/xmlrpc.php link gives the same issue.
Is there anything missing from the PHP code that would cause this issue? That it's suddenly stopped working over 12 months down the line is what has flummoxed me.
Managed to fix it. Looking at the headers sent by cURL, the only differences were that the cURL command line uses Content-Type: application/x-www-form-urlencoded and that the user agent was set to User-Agent: curl/7.30.0.
The choice of content type didn't affect it, but setting a user agent sorted it! It seems WordPress.com (but not self-hosted WordPress.org sites running the latest v3.9.2) now requires a user agent for XML-RPC requests, though this hasn't been documented anywhere that I can find.
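The fix described above can be sketched as a small helper that builds the cURL options with a user agent added. It assumes the curl extension is loaded; xmlRpcCurlOptions is a hypothetical name and the user agent string is an arbitrary example, since the answer does not say which value was used.

```php
<?php
// Build cURL options for an XML-RPC POST that also sends a User-Agent header.
function xmlRpcCurlOptions($requestXml, $userAgent)
{
    return array(
        CURLOPT_HEADER         => false,
        CURLOPT_HTTPHEADER     => array('Content-Type: text/xml'),
        CURLOPT_USERAGENT      => $userAgent,   // the setting that resolved the empty reply
        CURLOPT_POSTFIELDS     => $requestXml,  // a string body makes this a POST
        CURLOPT_RETURNTRANSFER => true,
    );
}
```

It would be used with the original snippet as curl_setopt_array($ch, xmlRpcCurlOptions($requestXml, 'example-client/1.0'));, where $requestXml is the result of the xmlrpc_encode_request() call.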

'&' becomes '&amp;' when trying to get contents from a URL

I was running my web server for months with the same code, where I fetched the content of a URL using this line:
$response = file_get_contents('http://femoso.de:8019/api/2/getVendorLogin?' . http_build_query(array('vendor'=>$vendor,'user'=>$login,'pw'=>$pw),'','&'));
But now something must have changed, as it suddenly stopped working.
Previously the URL looked as it should:
http://femoso.de:8019/api/2/getVendorLogin?vendor=100&user=test&pw=test
but now I get an error in my nginx log saying that I requested the following URL which returned a 403
http://femoso.de:8019/api/2/getVendorLogin?vendor=100&amp;user=test&amp;pw=test
I know that something changed on the target server, but I think that shouldn't affect me, should it?
I have already spent hours reading and searching through Google and Stack Overflow, but all the suggested approaches, such as
urlencode() or
htmlspecialchars() etc...
didn't work for me.
For your information, the environment is a zend application with a nginx server on my end and a php webservice with apache on the other end.
Like I said, it changed without any change on my side!
Thanks
Let's find out the culprit!
1) Is it http_build_query ? Try replacing:
'http://femoso.de:8019/api/2/getVendorLogin?' . http_build_query(array('vendor'=>$vendor,'user'=>$login,'pw'=>$pw))
with:
"http://femoso.de:8019/api/2/getVendorLogin?vendor={$vendor}&user={$login}&pw={$pw}"
2) Is some kind of post-processing in place? Try replacing '&' with chr(38)
3) Maybe give cURL a try and play with it a little?
$ch = curl_init();
curl_setopt_array($ch, array(
CURLOPT_URL => 'http://femoso.de:8019/api/2/getVendorLogin?' . http_build_query(array('vendor'=>$vendor,'user'=>$login,'pw'=>$pw)),
CURLOPT_RETURNTRANSFER => true,
CURLOPT_HEADER => true, // include response header in result
//CURLOPT_FOLLOWLOCATION => true, // uncomment to follow redirects
CURLINFO_HEADER_OUT => true, // track request header, see var_dump below
));
$data = curl_exec($ch);
$requestHeaders = curl_getinfo($ch, CURLINFO_HEADER_OUT); // read before closing the handle
curl_close($ch);
var_dump($data, $requestHeaders);
exit;
Sounds like your arg_separator.output is set to "&amp;" in your php.ini. Either comment that line out or change it to just "&"
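The effect of that setting on http_build_query() can be reproduced in a few lines. This is a sketch that uses ini_set() to simulate a php.ini containing the HTML entity as the separator; the parameter values are taken from the example URL in the question.

```php
<?php
$params = array('vendor' => 100, 'user' => 'test', 'pw' => 'test');

// Simulate a php.ini that sets arg_separator.output to the HTML entity.
ini_set('arg_separator.output', '&amp;');

// Without an explicit separator, http_build_query() uses the ini value.
$implicit = http_build_query($params);

// Passing the separator as the third argument overrides the ini value.
$explicit = http_build_query($params, '', '&');

echo $implicit, "\n"; // vendor=100&amp;user=test&amp;pw=test
echo $explicit, "\n"; // vendor=100&user=test&pw=test
```

Passing the separator explicitly keeps the generated URL independent of whatever the server's php.ini happens to contain.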
I'm no expert, but that's the way the computer reads the address, since & is a special character; it has something to do with encoding. A simple fix would be to filter it using str_replace(), or something along those lines.
