I'm trying to translate a curl command I was given into a command I can run through PHP.
The command is:
curl -F customerid=902 -F username=API1 -F password=somepassword -F reportname=1002 http://somerandomurl.com/api/v1/getreportcsv
When I try to run this through PHP (and eventually through C#), however, the web service returns an error. Any idea what in my code could be causing the error? I think the web service must be very particular about the headers/request:
$url = "http://somerandomurl.com/api/v1/getreportcsv";
$fields = [
"customerid" => "902",
"username" => "API1",
"password" => "somepassword",
"reportname" => "1002"
];
$fields_string = "";
foreach($fields as $key=>$value) { $fields_string .= $key.'='.$value.'&'; }
rtrim($fields_string,'&');
//open connection
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_POST,count($fields));
curl_setopt($ch,CURLOPT_POSTFIELDS,$fields_string);
//execute post
$result = curl_exec($ch);
print $result;
Wireshark shows the following differences:
The below is the one that does not work:
POST /somefolder/api/v1/getreportcsv HTTP/1.1
Host: somehost
Accept: */*
Content-Length: 65
Content-Type: application/x-www-form-urlencoded
customerid=902&username=API1&password=somepassword&reportname=1002&
HTTP/1.1 200 OK
Server: GlassFish Server Open Source Edition 3.1.1
Content-Type: text/html;charset=UTF-8
Content-Length: 6
Date: Thu, 26 Nov 2015 22:51:23 GMT
ERROR
Whereas this one works:
POST /someurl/api/v1/getreportcsv HTTP/1.1
User-Agent: curl/7.33.0
Host: somehost
Accept: */*
Content-Length: 459
Expect: 100-continue
Content-Type: multipart/form-data; boundary=------------------------4b0d14cc31a40c5b
HTTP/1.1 100 Continue
--------------------------4b0d14cc31a40c5b
Content-Disposition: form-data; name="customerid"
902
--------------------------4b0d14cc31a40c5b
Content-Disposition: form-data; name="username"
API1
--------------------------4b0d14cc31a40c5b
Content-Disposition: form-data; name="password"
somepassword
--------------------------4b0d14cc31a40c5b
Content-Disposition: form-data; name="reportname"
1002
--------------------------4b0d14cc31a40c5b--
HTTP/1.1 200 OK
Server: GlassFish Server Open Source Edition 3.1.1
Content-Type: text/html;charset=UTF-8
Transfer-Encoding: chunked
Date: Thu, 26 Nov 2015 23:13:57 GMT
2000
...snip... the results of the api
Obviously they are very different requests, but I wouldn't expect the service to be so particular about the format.
The question seems very specific to the service in question.
However, the problem might be with headers. According to curl man page:
-F [...] causes curl to POST data using the Content-Type
multipart/form-data according to RFC 2388
However, according to PHP manual, CURLOPT_POST option will send data using application/x-www-form-urlencoded.
According to the same manual, if value of CURLOPT_POSTFIELDS is an array, the Content-Type header will be set to multipart/form-data. You may also try to set the content type explicitly as a header.
Try setting the following cURL options:
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: multipart/form-data'));
If that does not work, it might help to analyse all the headers sent by the command-line curl using the -v parameter and try to set them in PHP. It might also be a good idea to set the Content-Length header explicitly.
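The difference between a string body and an array body can be sketched as follows. This is a minimal illustration, assuming the field values and placeholder URL from the question; the variable $urlencoded is only there for comparison, and the request itself is left commented out.

```php
<?php
// Field values and URL are the placeholders from the question.
$url = 'http://somerandomurl.com/api/v1/getreportcsv';
$fields = [
    'customerid' => '902',
    'username'   => 'API1',
    'password'   => 'somepassword',
    'reportname' => '1002',
];

// String body: cURL sends Content-Type: application/x-www-form-urlencoded.
// http_build_query() URL-encodes the values and leaves no trailing "&".
$urlencoded = http_build_query($fields);

// Array body: cURL builds a multipart/form-data request, like "curl -F".
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

// Uncomment to actually send the request:
// $result = curl_exec($ch);
// if ($result === false) { echo curl_error($ch); }
```

Passing the array lets cURL generate the boundary itself, so no explicit Content-Type header should be needed in that case.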
I am trying to get the contents of a page from Amazon using file_get_html(), but the output shows weird characters when echoed. Can anyone please explain how I can resolve this issue?
I also found the following two related questions on Stack Overflow but they did not solve my issue. :)
file_get_html() returns garbage
Uncompress gzip compressed http response
Here is my code:
$options = array(
'http'=>array(
'header'=>
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n".
"Accept-language: en-US,en;q=0.5\r\n" .
"User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6\r\n"
)
);
$context = stream_context_create($options);
$amazon_url = 'https://www.amazon.com/my-url';
$amazon_html = file_get_contents($amazon_url, false, $context);
Here is the output I get:
��T]o�6}��`���0��݊-��"[�bh�tN�b0��.%%�$P��#�(Ų�� ������F#����A�
about 115k characters like this show up in the browser window.
These are my new headers:
$options = array(
'http'=>array(
'header'=>
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n".
"Accept-language: en-US,en;q=0.5\r\n"
)
);
Will using cURL resolve this issue?
Update:
I tried cURL. Still getting the garbage output. Here are my response headers:
HTTP/1.1 200 OK
Date: Sun, 18 Nov 2018 20:29:28 GMT
Server: Apache/2.4.33 (Win32) OpenSSL/1.1.0h PHP/7.2.5
X-Powered-By: PHP/7.2.5
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
Can anyone explain the negative votes?
I did a research myself.
Found some related questions on Stack Overflow which did not solve my problem.
Provided all the information that I thought would be helpful.
What else should I include in the question?
Here is my whole code for curl at present. This is the URL I am scraping.
$handle = curl_init();
curl_setopt($handle, CURLOPT_URL, $amazon_url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($handle);
curl_close($handle);
echo $data;
The output is just a bunch of characters I mentioned above. Here are my request headers:
Host: localhost
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:63.0) Gecko/20100101 Firefox/63.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: AMCV_17EB401053DAF4840A490D4C%40AdobeOrg=-227196251%7CMCIDTS%7C17650%7CMCMID%7C67056225185486460220940124683302119708%7CMCAID%7CNONE%7CMCOPTOUT-1524907071s%7CNONE; mjx.menu=renderer%3ACommonHTML; _ga=GA1.1.2019605490.1529649408; csm-hit=adb:adblk_no&tb:s-3521C4J8F2EP1V0MMQEP|1542578145652&t:1542578146256
Upgrade-Insecure-Requests: 1
Pragma: no-cache
Cache-Control: no-cache
These are from the Network Tab. The response headers are the same as I mentioned above.
Here is the output after adding curl_setopt($handle, CURLOPT_HEADER, 1); to my code:
HTTP/1.1 200 OK
Server: Server
Content-Type: text/html; charset=UTF-8
Strict-Transport-Security: max-age=47474747; includeSubDomains; preload
x-amz-id-1: 7A162B8JKV6MGZQ3PCH2
Vary: Accept-Encoding,User-Agent,X-Amzn-CDN-Cache
Content-Encoding: gzip
x-amz-rid: 7A162B8JKV6MGZQ3PCH2
Cache-Control: no-transform
X-Frame-Options: SAMEORIGIN
Date: Sun, 18 Nov 2018 22:42:51 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Connection: Transfer-Encoding
Set-Cookie: x-wl-uid=1a4u8+XgF+IhFF/iavy9mKZCAA0g4HiIYZXR8hKjxGtmOtBW+j67wGABv7ZOTxDRcab+7Qmpjqds=;
Here's the solution:
I ran into the same issue when scraping Amazon.
Simply add the following option before sending your cURL request:
curl_setopt($handle, CURLOPT_ENCODING, 'gzip,deflate,sdch');
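For the file_get_contents() variant from the question, where nothing decodes the body automatically, a minimal sketch could decompress the response by hand. The helper name maybe_gunzip is hypothetical, and the sketch assumes the zlib extension is available and that the server answered with Content-Encoding: gzip, as the header dump above shows.

```php
<?php
// Hypothetical helper: decompress a body only if it starts with the
// two gzip magic bytes; anything else is returned unchanged.
function maybe_gunzip(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);
        // gzdecode() returns false on corrupt input; fall back to the raw body.
        return $decoded === false ? $body : $decoded;
    }
    return $body;
}

// With cURL, passing an empty string instead of a fixed list asks for every
// encoding the installed libcurl supports and lets cURL decode the response:
// curl_setopt($handle, CURLOPT_ENCODING, '');
```

Usage would be along the lines of echo maybe_gunzip($amazon_html); after the file_get_contents() call.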
I have to upload a ZIP file over HTTPS, and this works only via the Linux cURL command. I don't understand what I am missing in the PHP cURL request...
Linux cURL [working]:
curl -v -x http://api.test.sandbox.mobile.de:8080 -u USER:PASS -X POST --data-binary @502.zip https://services.mobile.de/upload-api/upload/502.zip
Response:
POST /upload-api/upload/502.zip HTTP/1.1
User-Agent: curl/7.38.0
Host: services.mobile.de
Accept: */*
Content-Length: 6026
Content-Type: application/x-www-form-urlencoded
Expect: 100-continue
HTTP/1.1 100 Continue } [data not shown]
HTTP/1.1 201 Created
Date: Tue, 06 Dec 2016 12:40:41 GMT
Content-Type: text/html;charset=utf-8
Vary: Accept-Encoding
Transfer-Encoding: chunked
PHP cURL [not working]:
$ch = curl_init();
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Authorization: Basic '. base64_encode("USER:PASS"),
'Content-Type: text/plain'
));
curl_setopt($ch,CURLOPT_PROXY, 'api.test.sandbox.mobile.de:8080');
curl_setopt($ch,CURLOPT_URL, 'https://services.mobile.de/upload-api/upload/502.zip');
curl_setopt($ch,CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch,CURLOPT_POST, 1);
curl_setopt($ch,CURLOPT_POSTFIELDS, [ 'file' => new CURLFile('502.zip') ]);
curl_setopt($ch,CURLOPT_VERBOSE, 1);
$result = curl_exec($ch);
curl_close($ch);
Response:
POST /upload-api/upload/502.zip HTTP/1.1
Host: services.mobile.de
Accept: */*
Content-Length: 6225
Expect: 100-continue
Content-Type: text/plain; boundary=------------------------835f6ea75f783449
HTTP/1.1 100 Continue
HTTP/1.1 201 Created
Date: Tue, 06 Dec 2016 13:36:21 GMT
Content-Type: text/html;charset=utf-8
Vary: Accept-Encoding
Transfer-Encoding: chunked
The site documentation says:
"The upload file must be sent as an HTTP-Payload and in binary format, Multipart and Encoding are not supported."
I also noticed that the Content-Length is not the same... Why?
Thank you in advance for your advice!
Get rid of the line:
'Content-Type: text/plain'
You are setting the content type for the entire message and it is not formatting the POST data correctly.
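Since the documentation quoted in the question rules out multipart, one way to mirror --data-binary @502.zip is to pass the file contents as a plain string, which cURL sends unchanged as the request body. The sketch below is an assumption-laden illustration: the function name build_raw_upload is hypothetical, the URL, proxy and credentials are the question's placeholders, and no Content-Type is set so that cURL's default matches the working command. The multipart wrapper (boundary lines and part headers) is also the likely reason the PHP request was 6225 bytes while the raw file was 6026.

```php
<?php
// Hypothetical wrapper: prepare a POST whose body is the raw file content.
function build_raw_upload(string $url, string $path, string $userpwd, string $proxy)
{
    $body = file_get_contents($path);
    if ($body === false) {
        throw new RuntimeException("Cannot read $path");
    }

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_PROXY          => $proxy,
        CURLOPT_USERPWD        => $userpwd, // same role as "-u USER:PASS"
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $body,    // a string is sent as-is, not as multipart
        CURLOPT_RETURNTRANSFER => true,
    ]);
    return $ch;
}

// Example call with the question's placeholder values (request not sent here):
// $ch = build_raw_upload('https://services.mobile.de/upload-api/upload/502.zip',
//                        '502.zip', 'USER:PASS', 'api.test.sandbox.mobile.de:8080');
// $result = curl_exec($ch);
```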
I am trying to post a form to a remote website via PHP cURL.
Here is my cURL configuration (I added several explanations in the comments):
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $action);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST'); //without this line the request is being sent with GET method (I can see that with curl_getinfo)
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_TIMEOUT, 50);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postData); //$postData is an urlencoded string
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36');
curl_setopt($ch, CURLOPT_REFERER,$url);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Accept-Encoding: gzip, deflate',
'Accept-Language: en-US,en;q=0.8',
'Expect:',
'Content-Type: application/x-www-form-urlencoded',
'Connection: keep-alive',
'Cache-Control: max-age=0',
'Origin: http://XXX',
));
After executing the request with this configuration, I receive the following response:
string(610) "HTTP/1.1 302 Found
Location: http://XXX
Vary: Accept-Encoding
Content-type: text/html; charset=utf-8
Server: DWS
Content-Length: 15536
Accept-Ranges: bytes
Date: Wed, 17 Dec 2014 10:59:35 GMT
X-Varnish: 2567206754
Age: 0
Via: 1.1 varnish
Connection: keep-alive
HTTP/1.1 411 Length Required
Content-Type: text/html
Server: DWS
Content-Length: 357
Accept-Ranges: bytes
Date: Wed, 17 Dec 2014 10:59:35 GMT
X-Varnish: 2567207187
Age: 0
Via: 1.1 varnish
Connection: keep-alive
I tried to add a Content-Length header to the cURL configuration:
'Content-Length: ' . strlen($postData)
But then cURL fails with error 52 (Empty reply from server).
In order to make sure that the content length I am specifying is in fact correct, I tried to add a custom string to CURLOPT_POSTFIELDS (like 'foo=bar') and set Content-Length: 7, but the result was the same.
I also tried to convert the whole code to use the Zend 2 Http Client, but with no luck.
I think I've read all other posts about the cURL 52 error, but none of them seemed to have anything in common with Content-Length header, so I hope that someone here might help me out.
Please let me know if you need any more information from my part.
The POST request to the URL specified in the $action variable returns an HTTP 302 redirect, and when following it cURL sends the next request using GET; see the 2nd paragraph here: http://curl.haxx.se/docs/manpage.html#-L.
You already use CURLOPT_CUSTOMREQUEST to get around this, but it needs CURLOPT_POSTREDIR as well, as documented in the Strings section here: http://evertpot.com/curl-redirect-requestbody/ Without it, the redirected request keeps the POST method name but loses its body, which would explain the 411 Length Required.
You should not set Content-Length explicitly; let cURL handle that.
Alternatively you could manually "follow" the Location header and use the URL from that header in the $action variable (it may be useful to try that for testing first).
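A minimal sketch of the CURLOPT_POSTREDIR approach is shown below. The function name build_redirect_safe_post is hypothetical, $action and $postData stand for the question's variables, and the sketch assumes CURLOPT_CUSTOMREQUEST can be dropped once CURLOPT_POST and CURLOPT_POSTREDIR are both set; that has not been verified against the site in question.

```php
<?php
// Hypothetical wrapper around the question's options.
function build_redirect_safe_post(string $action, string $postData)
{
    $ch = curl_init($action);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $postData, // cURL derives Content-Length from this
        CURLOPT_FOLLOWLOCATION => true,
        // Bitmask: 1 keeps POST after a 301, 2 after a 302, 4 after a 303.
        CURLOPT_POSTREDIR      => 1 | 2,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HEADER         => true,
    ]);
    return $ch;
}

// Example (request not sent here):
// $ch = build_redirect_safe_post($action, $postData);
// $response = curl_exec($ch);
```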
I am using Speech API v2 with PHP, here is a code:
$file_to_upload = array('myfile'=>'@'.$filename.'.flac');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.google.com/speech-api/v2/recognize?output=json&lang=ru-RU&key=___my_api_key___");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: audio/x-flac; rate=8000"));
curl_setopt($ch, CURLOPT_POSTFIELDS, $file_to_upload);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result=curl_exec ($ch);
Google responds with two JSON objects: the first is empty, and the second has the valid response I expect. This makes parsing and further processing difficult. See the HTTP dump:
My POST request:
POST /speech-api/v2/recognize?output=json&lang=ru-RU&key=___my_api_key___ HTTP/1.1
Host: www.google.com
Accept: */*
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36
Content-Length: 13123
Expect: 100-continue
Content-Type: audio/x-flac; rate=8000; boundary=----------------------------9641e899ac92
------------------------------9641e899ac92
Content-Disposition: form-data; name="myfile"; filename="/tmp/voice/1400157667.6440-in.wav.flac"
Content-Type: application/octet-stream
fLaC..."......e..\......! ..{..!y>..7..............................( ...reference libFLAC 1.2.1 20070917.
...encoded binary data...
------------------------------9641e899ac92--
Response with duplicate result of recognition:
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Disposition: attachment
Cache-Control: no-transform
X-Content-Type-Options: nosniff
Pragma: no-cache
Date: Thu, 15 May 2014 12:41:09 GMT
Server: S3 v1.0
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic
Transfer-Encoding: chunked
e
{"result":[]} <--- first one
f8
{"result":[{"alternative":[{"transcript":"............","confidence":0.73531097},{"transcript":"................"},{"transcript":".............."},{"transcript":"................"},{"transcript":"............ .."}],"final":true}],"result_index":0} <--- second one
0
Why could this happen? When I used API v1, there was only one response. Other v2 examples on the internet also show only one.
Thanks a lot.
First of all, be sure that the language you are using supports speaker diarization. For instance, Google does not provide speaker diarization for Spanish in Colombia, but it does for Spanish from Spain:
Language Support
Besides, sometimes a slight alteration of the audio is needed, which can be achieved using ffmpeg:
ffmpeg -i input.wav -ac 1 -ab 128k -filter:a "volume=0.9,equalizer=f=4000:t=h:w=200:g=-2" output.wav
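Independently of why the empty object appears, the two-object body shown in the question can be parsed line by line. The sketch below is only an illustration: the helper name first_nonempty_result is hypothetical, and it assumes the body returned by curl_exec() holds one JSON object per line (cURL already strips the chunk-size lines such as "e" and "f8" from the dump).

```php
<?php
// Hypothetical helper: decode a body holding one JSON object per line and
// return the first object whose "result" array is not empty, or null.
function first_nonempty_result(string $body): ?array
{
    foreach (preg_split('/\R/', trim($body)) as $line) {
        $decoded = json_decode($line, true);
        if (is_array($decoded) && !empty($decoded['result'])) {
            return $decoded;
        }
    }
    return null;
}

// Usage with the question's variable:
// $recognition = first_nonempty_result($result);
```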
I have spotted a "weird" php CURL behavior that is sending me nuts. Basically what I am doing is making a digest authenticated call with curl. Here's an extract of my code:
curl_setopt($this->c, CURLOPT_HTTPAUTH, CURLAUTH_DIGEST);
curl_setopt($this->c, CURLOPT_USERPWD, $username . ":" . $password);
It works fine and the server actually comes back with a "YES, YOU PROVIDED THE RIGHT CREDENTIALS" kind of message. Only trouble is, the raw http response is a bit odd as it includes, as a matter of fact, 2 responses instead of one. Here's what curl_exec($this->c) spits out:
HTTP/1.0 401 Unauthorized
Date: Tue, 23 Oct 2012 08:41:18 GMT
Server: Apache/2.2.20 (Ubuntu)
X-Powered-By: PHP/5.3.6-13ubuntu3.9
WWW-Authenticate: Digest realm="dynamikrest-testing",qop="auth",nonce="5086582e95104",opaque="4b24e95490812b28b3bf139f9fbc9a66"
Vary: Accept-Encoding
Content-Length: 9
Connection: close
Content-Type: text/html
HTTP/1.1 200 OK
Date: Tue, 23 Oct 2012 08:41:18 GMT
Server: Apache/2.2.20 (Ubuntu)
X-Powered-By: PHP/5.3.6-13ubuntu3.9
Vary: Accept-Encoding
Content-Length: 9
Connection: close
Content-Type: text/html
"success"
I don't get why it includes the first response from the server (the one in which it states it requires authentication).
Can anyone throw some light on the issue? How do I avoid the responses' cumulation?
Cheers
It looks like curl has the same behavior if you use the -I option for headers:
curl -I --digest -u root:somepassword http://localhost/digest-test/
returns:
HTTP/1.1 401 Authorization Required
Date: Fri, 31 May 2013 13:48:35 GMT
Server: Apache/2.2.22 (Ubuntu)
WWW-Authenticate: Digest realm="Test Page", nonce="9RUL3wPeBAA=52ef6531dcdd1de61f239ed6dd234a3288d81701", algorithm=MD5, domain="/digest-test/ http://localhost", qop="auth"
Vary: Accept-Encoding
Content-Type: text/html; charset=iso-8859-1
HTTP/1.1 200 OK
Date: Fri, 31 May 2013 13:48:35 GMT
Server: Apache/2.2.22 (Ubuntu)
Authentication-Info: rspauth="4f5f8237e9760f777255f6618c21df4c", cnonce="MTQ3NDk1", nc=00000001, qop=auth
Vary: Accept-Encoding
Content-Type: text/html;charset=UTF-8
X-Pad: avoid browser bug
To get only the second set of headers, you could try this (not an optimal solution):
<?php
$ch = curl_init();
// set url
curl_setopt($ch, CURLOPT_URL, "http://localhost/digest-test/");
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_DIGEST);
curl_setopt($ch, CURLOPT_USERPWD, "root:test");
// first authentication with a head request
curl_setopt($ch, CURLOPT_NOBODY, 1);
curl_exec($ch);
// then get the real output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_HTTPGET, 1);
$output = curl_exec($ch);
echo $output;
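Another option, sketched here under the assumption that CURLOPT_HEADER is enabled and the header blocks arrive back to back as in the dumps above, is to make a single request and discard the earlier header blocks afterwards. The helper name split_final_response is hypothetical.

```php
<?php
// Hypothetical helper: split a raw cURL response (CURLOPT_HEADER enabled)
// into the final header block and the body, skipping earlier responses
// such as the 401 digest challenge or a "100 Continue".
function split_final_response(string $raw): array
{
    $headers = '';
    // Peel off header blocks while the remainder still starts with a status line.
    while (strncmp($raw, 'HTTP/', 5) === 0) {
        $parts   = explode("\r\n\r\n", $raw, 2);
        $headers = $parts[0];
        $raw     = $parts[1] ?? '';
    }
    return ['headers' => $headers, 'body' => $raw];
}

// Usage with the output of curl_exec():
// $final = split_final_response($output);
// echo $final['body'];
```

The sketch would misfire on a body that itself begins with "HTTP/", so it is only suitable where that cannot happen.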
I hit the same problem, and I think it was caused by PHP being compiled against an ancient version of libcurl (7.11.0 in my case, which is now nearly 10 years old). On a different machine with a more recent version of libcurl (7.29.0) the same code was fine, and my problems ended after getting my host to recompile their PHP to use the latest they had available (7.30.0).
This fix was suggested by a thread on the curl-library mailing list from 2008, where a user discovered the problem affected version 7.10.6 but not 7.12.1. I've searched the libcurl changelog around 7.12.0 and failed to find any clear entry about fixing this problem, though it might be covered by "general HTTP authentication improvements". Still, I'm now pretty confident that an old libcurl is the problem.
You can check which version of libcurl is used by your PHP from the 'cURL Information' entry in the output of phpinfo();
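The same information is available from code through curl_version(). A minimal sketch, where the 7.12.1 threshold is simply the version reported as unaffected in the mailing-list thread mentioned above:

```php
<?php
// curl_version() describes the libcurl build this PHP binary is linked against.
$info = curl_version();
echo 'libcurl ', $info['version'], PHP_EOL;

// 7.12.1 is the version the mailing-list thread reported as unaffected.
if (version_compare($info['version'], '7.12.1', '<')) {
    echo 'This libcurl may be old enough to show the digest problem.', PHP_EOL;
}
```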