I am using the script below to get data from a website. Data is returned, but it is gzip-compressed or in some other encoded format. I tried gzdecode, but it does not work on it. Is there any way to see the clean data from this request?
I tried each of these:
curl_setopt($ch, CURLOPT_ENCODING , 'deflate');
curl_setopt($ch, CURLOPT_ENCODING , 'gzip');
curl_setopt($ch, CURLOPT_ENCODING , 'br');
but none of them works. Below is the cURL request:
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_URL, 'https://www.example.com');
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 20);
curl_setopt($ch, CURLOPT_TIMEOUT, 20);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_ENCODING , 'deflate');
$response = curl_exec($ch);
$d = curl_getinfo( $ch );
curl_getinfo shows the following:
I can see that the site is using "br" encoding, i.e. Content-Encoding: br
br encoding is Brotli encoding. You can request it in the Accept-Encoding header with curl_setopt($ch, CURLOPT_ENCODING , 'br'), but a libcurl built without Brotli support (it was only added in 7.57.0) won't decode it, i.e., you will have to decode the output explicitly.
You can probably use this PHP extension: https://github.com/kjdev/php-ext-brotli
You can also try to use curl_setopt($ch, CURLOPT_ENCODING , 'identity'), and, if the server you are calling behaves properly, get the data uncompressed.
I guess you've already tried leaving the Accept-Encoding header out completely. Unfortunately, according to the specs, this does not prevent the output from being encoded.
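If you do end up decoding the body yourself, something along these lines might work. It is only a sketch: decode_body() is a made-up helper name, and the brotli_uncompress() call assumes the Brotli extension linked above is installed and exposes that function. The Content-Encoding value has to come from the response headers.

```php
<?php
/**
 * Decode a response body according to its Content-Encoding value.
 * decode_body() is a hypothetical helper, not part of cURL or PHP.
 */
function decode_body(string $body, string $contentEncoding): string
{
    switch (strtolower(trim($contentEncoding))) {
        case '':
        case 'identity':
            return $body; // nothing to undo
        case 'gzip':
        case 'x-gzip':
        case 'deflate':
            // zlib_decode() accepts gzip, zlib and raw deflate streams.
            $decoded = zlib_decode($body);
            break;
        case 'br':
            // Assumption: brotli_uncompress() is provided by the brotli extension.
            if (!function_exists('brotli_uncompress')) {
                throw new RuntimeException('Brotli body received but the brotli extension is not loaded.');
            }
            $decoded = brotli_uncompress($body);
            break;
        default:
            throw new RuntimeException("Unsupported Content-Encoding: $contentEncoding");
    }
    if ($decoded === false) {
        throw new RuntimeException("Could not decode $contentEncoding body.");
    }
    return $decoded;
}
```

Usage would be roughly $clean = decode_body($response, 'br'); once you know which encoding the server actually used.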
In the header I allowed only gzip and deflate and removed br, and that worked for me. So instead of $header[] = 'Accept-Encoding: gzip, deflate, br'; I used $header[] = 'Accept-Encoding: gzip, deflate';
Thanks for the help, everyone.
curl_setopt($ch, CURLOPT_ENCODING , 'deflate');
curl_setopt($ch, CURLOPT_ENCODING , 'gzip');
curl_setopt($ch, CURLOPT_ENCODING , 'br');
Subsequent calls overwrite the previous value; they don't add to it. If you want to support deflate, gzip, and br, separate them with commas, e.g.
curl_setopt($ch, CURLOPT_ENCODING , 'gzip,deflate,br');
However, br is a recent addition to curl: br support was first added in version 7.57.0, released on November 29, 2017, so you might want to add
if(!defined("CURL_VERSION_BROTLI")){
// https://github.com/curl/curl/blob/f762fec323f36fd7da7ad6eddfbbae940ec3229e/include/curl/curl.h#L2720
define("CURL_VERSION_BROTLI",(1<<23));
}
if(!(curl_version()["features"] & CURL_VERSION_BROTLI)){
throw new \RuntimeException("this script requires brotli support added to libcurl (added in libcurl version 7.57.0, released November 29 2017), please update your libcurl installation.");
}
to ensure that br is actually supported by your php's libcurl, if you require it.
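If br is merely nice to have rather than required, a softer variant is to build the encoding list from whatever the local libcurl reports. This is a sketch only: pick_encodings() is a hypothetical helper, and the fallback constant values are taken from curl.h in case the PHP build does not define them.

```php
<?php
// Fallbacks for PHP builds that do not define these constants (values from curl.h).
if (!defined('CURL_VERSION_LIBZ')) {
    define('CURL_VERSION_LIBZ', 1 << 3);
}
if (!defined('CURL_VERSION_BROTLI')) {
    define('CURL_VERSION_BROTLI', 1 << 23);
}

/**
 * Build a comma-separated CURLOPT_ENCODING value from a libcurl feature bitmask.
 * pick_encodings() is a hypothetical helper name.
 */
function pick_encodings(int $features): string
{
    $encodings = [];
    if ($features & CURL_VERSION_LIBZ) {
        // zlib covers both gzip and deflate.
        $encodings[] = 'gzip';
        $encodings[] = 'deflate';
    }
    if ($features & CURL_VERSION_BROTLI) {
        $encodings[] = 'br';
    }
    // With no decoder available, ask for an unencoded response.
    return $encodings ? implode(',', $encodings) : 'identity';
}

// Usage (needs the curl extension):
// curl_setopt($ch, CURLOPT_ENCODING, pick_encodings(curl_version()['features']));
```

One call with the combined list replaces the three separate curl_setopt() calls, so nothing gets overwritten.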
I'm trying to curl (in PHP) a URL and send a custom header in the request. But then I also need to be able to view the response header that is returned. I'm querying an external API that I don't control.
I've tried using both the CURLOPT_HTTPHEADER and CURLOPT_HEADER options but they don't seem to work well together. CURLOPT_HEADER seems to overwrite the request headers so I can't authenticate my request BUT I then can view the headers in the response. If I take CURLOPT_HEADER out, I can successfully authenticate, but can't view headers.
PHP Code:
$url = "http://url-goes-here";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT ,0);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
$headers = array("X-Auth-Token: $token");
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_HEADER, 1);
$out = curl_exec($ch);
curl_close($ch);
As per the documentation, it has to be
curl_setopt($ch, CURLOPT_HEADER, 1); // or TRUE
instead of
curl_setopt($ch, CURLOPT_HEADER, $headers);
EDIT: I must say I can't replicate the issue. I've created two scripts: a.php (using your content, with $url changed to 'http://localhost/b.php') and b.php (queried by a.php) which contains this:
<?php
foreach (getallheaders() as $name => $value) {
echo "$name: $value\n";
}
So when I run php a.php, I get this:
HTTP/1.1 200 OK
Date: Tue, 16 Feb 2016 02:42:45 GMT
Server: Apache/2.4.10 (Fedora) PHP/5.6.15
X-Powered-By: PHP/5.6.15
Content-Length: 50
Content-Type: text/html; charset=UTF-8
Host: localhost
Accept: */*
X-Auth-Token: xxx
Which means 1) a.php is getting the response headers successfully, and 2) b.php is receiving the request headers (including X-Auth-Token) from a.php as well. I'd suggest you try something similar and see if your web server (or your application) is playing tricks on you.
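Once CURLOPT_HEADER is on, the headers and body come back as one string, so you usually want to split them. A rough sketch follows; split_response() is a name I made up, and the header size is assumed to come from curl_getinfo($ch, CURLINFO_HEADER_SIZE), read before curl_close().

```php
<?php
/**
 * Split a raw response (headers + body) at the given header size.
 * split_response() is a hypothetical helper name.
 */
function split_response(string $raw, int $headerSize): array
{
    $headerText = (string) substr($raw, 0, $headerSize);
    $body       = (string) substr($raw, $headerSize);

    $headers = [];
    foreach (preg_split('/\r?\n/', trim($headerText)) as $line) {
        // Status lines ("HTTP/1.1 200 OK") and blank separators have no colon.
        if (strpos($line, ':') === false) {
            continue;
        }
        [$name, $value] = explode(':', $line, 2);
        // Header names are case-insensitive, so normalise them to lower case.
        $headers[strtolower(trim($name))] = trim($value);
    }

    return ['headers' => $headers, 'body' => $body];
}

// Usage (inside the script, before curl_close($ch)):
// $parts = split_response($out, curl_getinfo($ch, CURLINFO_HEADER_SIZE));
```

If the server sends several header blocks (for example a 100 Continue followed by the real response), later values simply overwrite earlier ones in this sketch.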
Consider following URL:
click here
There is some encoding into Japanese characters. Firefox browser on my PC is able to detect it automatically and show the characters. For Chrome, on the other hand, I have to change the encoding manually to "Shift_JIS" to see the japanese characters.
If I try to access the content via PHP-cURL, the encoded text appears garbled like this
���ϕi�̂��ƂȂ��I�݂��Ȃ̃N�`�R�~�T�C�g�������������i�A�b�g�R�X���j�ɂ��܂����I
I tried:
curl_setopt($ch, CURLOPT_ENCODING, 'Shift_JIS');
I also tried (after downloading the curl response):
$output_str = mb_convert_encoding($curl_response, 'Shift_JIS', 'auto');
$output_str = mb_convert_encoding($curl_response, 'SJIS', 'auto');
But that does not work either.
Here is the full code
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: en-US,en;q=0.5',
'Connection: keep-alive'
));
//curl_setopt($ch, CURLOPT_ENCODING, 'SJIS');
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_TIMEOUT, 20);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$response = curl_exec($ch);
That page doesn't return valid HTML, it's actually Javascript. If you fetch it with curl and output it, add header('Content-type: text/html; charset=shift_jis'); to your code and when you load it in Chrome the characters will display properly.
Since the HTML doesn't specify the character set, you can specify it from the server using header().
To actually convert the encoding so it will display properly in your terminal, you can try the following:
Use iconv() to convert to UTF-8
$curl_response = iconv('shift-jis', 'utf-8', $curl_response);
Use mb_convert_encoding() to convert to UTF-8
$curl_response = mb_convert_encoding($curl_response, 'utf-8', 'shift-jis');
Both of those methods worked for me and I was able to see Japanese characters displayed correctly on my terminal.
UTF-8 should be fine, but if you know your system is using something different, you can try that instead.
Hope that helps.
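If you would rather not hard-code the source charset, you could read it from the Content-Type response header when there is one and fall back to Shift_JIS otherwise. A sketch under those assumptions: both function names are made up, mbstring must be loaded, and mb_convert_encoding() will complain if the server announces a charset label mbstring does not know.

```php
<?php
/**
 * Extract the charset parameter from a Content-Type header value, if any.
 * Hypothetical helper name.
 */
function charset_from_content_type(string $contentType): ?string
{
    if (preg_match('/charset\s*=\s*"?([^";\s]+)/i', $contentType, $m)) {
        return $m[1];
    }
    return null;
}

/**
 * Convert a response body to UTF-8, using the announced charset when present
 * and $fallbackCharset otherwise. Hypothetical helper name.
 */
function to_utf8(string $body, ?string $contentType, string $fallbackCharset = 'SJIS'): string
{
    $charset = $contentType !== null ? charset_from_content_type($contentType) : null;
    return mb_convert_encoding($body, 'UTF-8', $charset ?? $fallbackCharset);
}

// Usage (before curl_close($ch)):
// $utf8 = to_utf8($response, curl_getinfo($ch, CURLINFO_CONTENT_TYPE));
```

Note that the conversion goes from the page's encoding to UTF-8, not the other way round as in the question.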
The following code will output the Japanese characters correctly in the browser:-
<?php
// create a new cURL resource
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, $setUrlHere);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
// grab URL content
$response = curl_exec($ch);
// close cURL resource, and free up system resources
curl_close($ch);
header('Content-type: text/html; charset=shift_jis');
echo $response;
I tried several approaches before deciding to ask this question here, and none of them worked. I'm trying to decode and read the data from a site that uses gzip.
I'm using cURL & PHP. When I try to decode and print the result, I'm getting a long list of garbled special characters such as:
JHWkdsU01EUXdWa1pXYTFOdFZsZFRiaz
VoVW14S2NGbFljRmRXYkdSWVpFZEdWRT
FYVWtoWmEyaExXVlpLTm1KR1VsWmlXR2
If I run the below PHP script I got an error like:
PHP Warning: gzdecode(): data error in /var/www/mn.php on line 20
Here's my current code:
<?
$data_string = '9999';
$ch = curl_init('http://example.com/getN.php&keyword=');
curl_setopt( $ch, CURLOPT_USERAGENT, 'Darwin/15.0.0' );
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);
curl_setopt($ch,CURLOPT_ENCODING , 'gzip');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_TIMEOUT,5);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE); // Follow redirects
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Content-Type: application/x-www-form-urlencoded',
'Accept-Encoding: gzip, deflate',
'Content-Length: ' . strlen($data_string))
);
$result = gzdecode ( curl_exec($ch) );
curl_close($ch);
print_r($result);
?>
I also tried enabling the deflate module with:
a2enmod deflate
/etc/init.d/apache2 restart
and enabled zlib in php.ini.
I also tried testing it directly:
curl -sH 'Accept-encoding: gzip' http://example.com/getN.php&keyword=9999 | gunzip -
I got the same result.
Here is the info from the site:
HTTP/1.1 200 OK
Server: nginx
Date: Thu, 15 Oct 2015 00:41:54 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Vary: Accept-Encoding
X-Powered-By: PHP/5.4.31
X-Frame-Options: SAMEORIGIN
Content-Encoding: gzip
please help
I notice your code has
curl_setopt($ch,CURLOPT_ENCODING , 'gzip');
and a gzdecode() call later on. If instructed to accept encoded content, cURL handles decoding automatically for you, without the need to manually do it after curl_exec(). Its return value is already decoded if you told cURL to accept encoded transfer.
That said, the page you are trying to download may not actually be encoded with gzip, but with another method. As stated in the manual, try specifying an empty string:
# Enable all supported encoding types.
curl_setopt($ch, CURLOPT_ENCODING, '');
This enables all supported encoding types. And don't use gzdecode(). The result should be already decoded.
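If you are not sure whether the string you got back is still compressed, you could check for the gzip magic bytes before calling gzdecode(), which avoids the "data error" warning on already-decoded data. A small sketch; both function names are made up.

```php
<?php
/**
 * True when the data starts with the gzip magic bytes 0x1f 0x8b.
 * Hypothetical helper name.
 */
function looks_gzipped(string $data): bool
{
    return strncmp($data, "\x1f\x8b", 2) === 0;
}

/**
 * Decode the data only when it still looks gzip-compressed;
 * otherwise (or if decoding fails) return it unchanged.
 * Hypothetical helper name.
 */
function decode_if_gzipped(string $data): string
{
    if (!looks_gzipped($data)) {
        return $data; // cURL (or the server) already handed over plain data
    }
    $decoded = gzdecode($data);
    return $decoded === false ? $data : $decoded;
}
```

With CURLOPT_ENCODING set, the first branch is the one you should normally hit.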
Thanks, all. It finally started working after I took your advice, removed gzdecode and a few other things, and kept the Accept-Encoding header set to gzip. Here is the final code:
<?
$data_string = '9999';
$ch = curl_init('http://example.com/getN.php&keyword=');
curl_setopt( $ch, CURLOPT_USERAGENT, 'Darwin/15.0.0' );
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_TIMEOUT,5);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE); // Follow redirects
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Content-Type: application/x-www-form-urlencoded',
'Accept-Encoding: gzip',
'Content-Length: ' . strlen($data_string))
);
$result = curl_exec($ch);
curl_close($ch);
print $result;
?>
I'm trying to post a file with curl in PHP, but the file is never uploaded/accepted by the server. I have searched and tried for several hours, but I can't find what's wrong; everyone else's examples and code seem to work, but not this one.
Here is the code:
<?php
$url = "http://jpptst.ams.se/0.52/default.aspx";
$headers = array(
"Content-Type: text/xml; charset=iso-8859-1",
"Accept: text/xml"
);
$data = array("file" => "@documents/xmls/1298634571.xml");
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_VERBOSE, false);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
$response = curl_exec($ch);
curl_close($ch);
var_dump($response);
?>
The result I get:
string(904) "HTTP/1.1 100 Continue
HTTP/1.1 200 OK
Date: Mon, 25 Jul 2011 19:13:41 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 659"
That's all I get; the file is never accepted by the server.
If anyone can help me with this problem it would be much appreciated :)
Thanks!
You're trying to upload a file via HTTP post, so sending a Content-type: text/xml header is inappropriate. An HTTP file upload is actually done as multipart/form-data, and is actually pretty much identical to a MIME-encoded email attachment. PHP's curl will fill in the header details for you automatically. As well, the Accept header is not necessary either.
Check that the path to the .xml file you're trying to upload is correct. You've not specified a leading / to it, so the path is relative to where your PHP script is executing from.
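On newer PHP versions the string-prefix upload syntax used in the question no longer works (it was deprecated in PHP 5.5 and removed in PHP 7.0), and CURLFile is the replacement. A sketch of how the post fields might be built; build_upload_fields() is a made-up helper, and the "file" field name and text/xml type are simply carried over from the question.

```php
<?php
/**
 * Build a CURLOPT_POSTFIELDS array that uploads one file as multipart/form-data.
 * build_upload_fields() is a hypothetical helper name.
 */
function build_upload_fields(string $path, string $fieldName = 'file', string $mimeType = 'text/xml'): array
{
    // Resolve relative paths against the current working directory.
    $realPath = realpath($path);
    if ($realPath === false || !is_readable($realPath)) {
        throw new InvalidArgumentException("Cannot read upload file: $path");
    }
    return [$fieldName => new CURLFile($realPath, $mimeType, basename($realPath))];
}

// Usage: leave Content-Type out of CURLOPT_HTTPHEADER so cURL can set
// multipart/form-data (with its boundary) by itself.
// curl_setopt($ch, CURLOPT_POSTFIELDS, build_upload_fields('documents/xmls/1298634571.xml'));
```

The exception also catches the wrong-path case early, instead of sending a request with no file in it.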
Replace:
$data = array("file" => "@documents/xmls/1298634571.xml");
With this:
$data = array("file" => "@".realpath('documents/xmls/1298634571.xml'));
Try it; it might work, but I'm not sure.
EDIT:
Try this out:
<?php
$xmldatafile = "documents/xmls/1298634571.xml"; // Make sure the file path is correct

// POST the given fields to $url and return the response body.
function postData($postFields, $url){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $postFields);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

$xmlData = file_get_contents($xmldatafile);
// URL-encode the XML so characters such as & and + survive the form encoding.
$postFields = 'data='.urlencode($xmlData);
$result = postData($postFields, "http://jpptst.ams.se/0.52/default.aspx");
?>
I am trying to post a xml string to a remote perl script via cURL. I want the xml string to be posted as a post parameter 'myxml'. See the code I am using below:
$url = 'http://myurl.com/cgi-bin/admin/xml/xml_append_list_init.pl';
$xml = '<?xml version="1.0" standalone="yes"?>
<SUB_appendlist>
<SUB_user>username</SUB_user>
<SUB_pass>password</SUB_pass>
<list_id>129</list_id>
<append>
<subscriber>
<address>test@test.comk</address>
<first_name>Test</first_name>
<last_name>Test</last_name>
</subscriber>
</append>
</SUB_appendlist>';
$ch = curl_init(); //initiate the curl session
curl_setopt($ch, CURLOPT_URL, $url); //set to url to post to
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // tell curl to return data in a variable
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: text/xml", "Content-length: ".strlen($xml)));
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, 'myxml='.urlencode($xml)); // post the xml
curl_setopt($ch, CURLOPT_TIMEOUT, (int)30); // set timeout in seconds
$xmlResponse = curl_exec($ch);
curl_close ($ch);
However the remote server is not seeing the data in the 'myxml' parameter. And I get the following response back in $xmlResponse
HTTP/1.1 200 OK
Date: Fri, 15 Apr 2011 12:00:44 GMT
Server: Apache/2.2.9 (Debian)
Vary: Accept-Encoding
Content-Length: 0
Content-Type: text/html; charset=ISO-8859-1
I'm not a cURL expert by any measure, so I may be doing something in my cURL request which is obviously wrong. I would appreciate it if anyone can shed any light or spot any problems in this. Hope that is enough information.
Cheers,
Adrian.
The body of your message is not text/xml data. It is application/x-www-form-urlencoded data. You have form data containing XML, not plain XML.
Your problem is akin to trying to open MyDoc.zip in MS Word. You have to deal with it as a zip file before dealing with it as Word.
Based on my reading of the PHP manual, you want to remove:
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: text/xml", "Content-length: ".strlen($xml)));
and change the POSTFIELDS line to:
curl_setopt($ch, CURLOPT_POSTFIELDS, array(
'myxml' => $xml
));
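One thing to be aware of: when CURLOPT_POSTFIELDS is given an array, cURL sends the body as multipart/form-data. If the Perl script expects application/x-www-form-urlencoded instead, you could pass an encoded string. A sketch, where build_form_body() is a made-up helper name:

```php
<?php
/**
 * Build an application/x-www-form-urlencoded body carrying the XML in one field.
 * build_form_body() is a hypothetical helper name.
 */
function build_form_body(string $xml, string $fieldName = 'myxml'): string
{
    // http_build_query() URL-encodes the value, so <, >, & and spaces are safe.
    return http_build_query([$fieldName => $xml]);
}

// Usage: cURL sets the urlencoded Content-Type and Content-Length by itself
// when CURLOPT_POSTFIELDS is a string.
// curl_setopt($ch, CURLOPT_POSTFIELDS, build_form_body($xml));
```

Either way, the manual Content-Type: text/xml and Content-length headers should go, since they describe the raw XML rather than the body that is actually sent.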
That is not the correct content type for all browsers.
see this article
sometimes the content type for xml is: application/rss+xml