Linux cURL vs PHP cURL - POST Request - php

I have to upload a ZIP file over HTTPS, and this works only with the Linux cURL command. I don't understand what I am missing in the PHP cURL request.
Linux cURL [working]:
curl -v -x http://api.test.sandbox.mobile.de:8080 -u USER:PASS -X POST --data-binary @502.zip https://services.mobile.de/upload-api/upload/502.zip
Response:
POST /upload-api/upload/502.zip HTTP/1.1
User-Agent: curl/7.38.0
Host: services.mobile.de
Accept: */*
Content-Length: 6026
Content-Type: application/x-www-form-urlencoded
Expect: 100-continue
HTTP/1.1 100 Continue } [data not shown]
HTTP/1.1 201 Created
Date: Tue, 06 Dec 2016 12:40:41 GMT
Content-Type: text/html;charset=utf-8
Vary: Accept-Encoding
Transfer-Encoding: chunked
PHP cURL [not working]:
$ch = curl_init();
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Authorization: Basic '. base64_encode("USER:PASS"),
'Content-Type: text/plain'
));
curl_setopt($ch,CURLOPT_PROXY, 'api.test.sandbox.mobile.de:8080');
curl_setopt($ch,CURLOPT_URL, 'https://services.mobile.de/upload-api/upload/502.zip');
curl_setopt($ch,CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch,CURLOPT_POST, 1);
curl_setopt($ch,CURLOPT_POSTFIELDS, [ 'file' => new CURLFile('502.zip') ]);
curl_setopt($ch,CURLOPT_VERBOSE, 1);
$result = curl_exec($ch);
curl_close($ch);
Response:
POST /upload-api/upload/502.zip HTTP/1.1
Host: services.mobile.de
Accept: */*
Content-Length: 6225
Expect: 100-continue
Content-Type: text/plain; boundary=------------------------835f6ea75f783449
HTTP/1.1 100 Continue
HTTP/1.1 201 Created
Date: Tue, 06 Dec 2016 13:36:21 GMT
Content-Type: text/html;charset=utf-8
Vary: Accept-Encoding
Transfer-Encoding: chunked
The site's documentation says:
"The upload file must be sent as an HTTP-Payload and in binary format, Multipart and Encoding are not supported."
I also noticed that the Content-Length is not the same... Why?
Thank you in advance for your advice!

Remove this line:
'Content-Type: text/plain'
It overrides the content type of the entire request. cURL then appends its multipart boundary to your header (your verbose output shows text/plain; boundary=...), so the POST body is not labelled correctly.
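Beyond the header, passing an array containing a CURLFile to CURLOPT_POSTFIELDS makes cURL build a multipart/form-data body, which the quoted documentation says is not supported. The multipart framing would also explain the larger Content-Length (6225 instead of 6026 bytes). Below is a minimal sketch of sending the raw file bytes instead, which is roughly what --data-binary does on the command line. The helper name raw_upload_options is hypothetical, and the sketch has not been tested against the mobile.de service.

```php
<?php
// Hypothetical helper: build cURL options that POST a raw string body.
// When CURLOPT_POSTFIELDS is a string, cURL sends it as-is (no multipart framing).
function raw_upload_options(string $url, string $body, string $userPwd, string $proxy): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_PROXY          => $proxy,
        CURLOPT_USERPWD        => $userPwd, // same as -u USER:PASS
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $body,    // raw bytes of the ZIP file
        CURLOPT_RETURNTRANSFER => true,
    ];
}

// Performs the upload; returns the response body, or false on failure.
function upload_raw(string $url, string $path, string $userPwd, string $proxy)
{
    $ch = curl_init();
    curl_setopt_array($ch, raw_upload_options($url, file_get_contents($path), $userPwd, $proxy));
    return curl_exec($ch);
}
```

It could be called as upload_raw('https://services.mobile.de/upload-api/upload/502.zip', '502.zip', 'USER:PASS', 'api.test.sandbox.mobile.de:8080'). With a string body and no explicit header, cURL defaults to Content-Type: application/x-www-form-urlencoded, which matches the working command-line request.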

Related

Getting garbage output when scraping a webpage in PHP [duplicate]

I am trying to get the contents of a page from Amazon using file_get_html(), but the output comes out as weird characters when I echo it. Can anyone please explain how I can resolve this issue?
I also found the following two related questions on Stack Overflow but they did not solve my issue. :)
file_get_html() returns garbage
Uncompress gzip compressed http response
Here is my code:
$options = array(
'http'=>array(
'header'=>
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n".
"Accept-language: en-US,en;q=0.5\r\n" .
"User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6\r\n"
)
);
$context = stream_context_create($options);
$amazon_url = 'https://www.amazon.com/my-url';
$amazon_html = file_get_contents($amazon_url, false, $context);
Here is the output I get:
��T]o�6}��`���0��݊-��"[�bh�tN�b0��.%%�$P��#�(Ų�� ������F#����A�
About 115k characters like this show up in the browser window.
These are my new headers:
$options = array(
'http'=>array(
'header'=>
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n".
"Accept-language: en-US,en;q=0.5\r\n"
)
);
Will using cURL resolve this issue?
Update:
I tried cURL. Still getting the garbage output. Here are my response headers:
HTTP/1.1 200 OK
Date: Sun, 18 Nov 2018 20:29:28 GMT
Server: Apache/2.4.33 (Win32) OpenSSL/1.1.0h PHP/7.2.5
X-Powered-By: PHP/7.2.5
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
Can anyone explain the negative votes?
I did my own research.
I found some related questions on Stack Overflow, but they did not solve my problem.
I provided all the information that I thought would be helpful.
What else should I include in the question?
Here is my whole code for curl at present. This is the URL I am scraping.
$handle = curl_init();
curl_setopt($handle, CURLOPT_URL, $amazon_url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($handle);
curl_close($handle);
echo $data;
The output is just a bunch of characters I mentioned above. Here are my request headers:
Host: localhost
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:63.0) Gecko/20100101 Firefox/63.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: AMCV_17EB401053DAF4840A490D4C%40AdobeOrg=-227196251%7CMCIDTS%7C17650%7CMCMID%7C67056225185486460220940124683302119708%7CMCAID%7CNONE%7CMCOPTOUT-1524907071s%7CNONE; mjx.menu=renderer%3ACommonHTML; _ga=GA1.1.2019605490.1529649408; csm-hit=adb:adblk_no&tb:s-3521C4J8F2EP1V0MMQEP|1542578145652&t:1542578146256
Upgrade-Insecure-Requests: 1
Pragma: no-cache
Cache-Control: no-cache
These are from the Network Tab. The response headers are the same as I mentioned above.
Here is the output after adding curl_setopt($handle, CURLOPT_HEADER, 1); to my code:
HTTP/1.1 200 OK
Server: Server
Content-Type: text/html; charset=UTF-8
Strict-Transport-Security: max-age=47474747; includeSubDomains; preload
x-amz-id-1: 7A162B8JKV6MGZQ3PCH2
Vary: Accept-Encoding,User-Agent,X-Amzn-CDN-Cache
Content-Encoding: gzip
x-amz-rid: 7A162B8JKV6MGZQ3PCH2
Cache-Control: no-transform
X-Frame-Options: SAMEORIGIN
Date: Sun, 18 Nov 2018 22:42:51 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Connection: Transfer-Encoding
Set-Cookie: x-wl-uid=1a4u8+XgF+IhFF/iavy9mKZCAA0g4HiIYZXR8hKjxGtmOtBW+j67wGABv7ZOTxDRcab+7Qmpjqds=;
Here's the solution:
I ran into the same issue when scraping Amazon.
Simply add the following option before sending your cURL request:
curl_setopt($handle, CURLOPT_ENCODING, 'gzip,deflate,sdch');
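The response headers above include Content-Encoding: gzip, so the "garbage" is most likely the compressed body printed without decoding. As a rough sketch: with cURL, an empty string for CURLOPT_ENCODING asks for every encoding the local libcurl build supports and decodes the reply automatically. With file_get_contents() the body can be decoded by hand. The function names below are hypothetical.

```php
<?php
// Hypothetical helper for bodies fetched with file_get_contents():
// gzip streams begin with the magic bytes 0x1f 0x8b.
function decode_if_gzipped(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = @gzdecode($body);
        // Fall back to the original bytes if decoding fails.
        return $decoded === false ? $body : $decoded;
    }
    return $body;
}

// Hypothetical helper for the cURL route: let cURL negotiate and decode.
function fetch_decoded(string $url)
{
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    // '' = send an Accept-Encoding header listing all supported encodings.
    curl_setopt($handle, CURLOPT_ENCODING, '');
    return curl_exec($handle);
}
```

For example, echo decode_if_gzipped(file_get_contents($amazon_url, false, $context)); should print readable HTML if the server answered with gzip.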

How to make a cURL request that produces the same response headers as Firefox

When I browse to a page with Firefox and click a download link, the following headers are shown when I inspect the request in network inspector:
Connection: keep-alive
Content-Disposition: attachment; filename="example_file.mp3"
Content-Length: 35181829
Content-Transfer-Encoding: binary
Content-Type: audio/mpeg
Date: Fri, 19 Aug 2016 18:19:02 GMT
Keep-Alive: timeout=60
Server: nginx
X-Powered-By: PHP/5.4.45
However, when I use cURL to visit the same address, I get this:
Connection: keep-alive
Content-Length: 1918
Content-Type: text/html; charset=UTF-8
Date: Fri, 19 Aug 2016 20:46:23 GMT
Keep-Alive: timeout=60
Server: nginx
X-Powered-By: PHP/5.4.45
How can I form a request with cURL that gives me the same response as Firefox?
In Firefox, open the Network tab in the developer tools (F12) and load the URL of the page you need.
Take note of all the request headers sent to the server.
Example:
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: nl,en-US;q=0.7,en;q=0.3
Connection: keep-alive
Cookie: _ga=GA1.2.598213448.1471644637; _gat=1
Host: mariannesdelights.be
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0
Put all the headers in an array like this:
$headers = array('HeaderName:HeaderValue','HeaderName2:HeaderValue2');
Use the PHP function curl_setopt() to set the headers on the request:
curl_setopt($ch,CURLOPT_HTTPHEADER,$headers);
That should produce the exact same HTTP-Response headers.
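As an illustration of the array format described above, here is a small sketch that turns name/value pairs copied from the Network tab into the list CURLOPT_HTTPHEADER expects. The helper name to_header_lines is hypothetical, and the values are the example headers from this answer, not the ones your download needs.

```php
<?php
// Hypothetical helper: ['Name' => 'value'] => ['Name: value', ...]
function to_header_lines(array $headers): array
{
    $lines = [];
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

$headers = to_header_lines([
    'Accept'          => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language' => 'nl,en-US;q=0.7,en;q=0.3',
    'Cookie'          => '_ga=GA1.2.598213448.1471644637; _gat=1',
    'User-Agent'      => 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0',
]);
// Then: curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
```

Accept-Encoding is left out on purpose: if you send it by hand, cURL will not decompress the reply for you, whereas setting CURLOPT_ENCODING handles both the header and the decoding.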

Webservice works through curl command line but not PHP

I'm trying to translate a curl command I was given into a command I can run through PHP.
The command is:
curl -F customerid=902 -F username=API1 -F password=somepassword -F reportname=1002 http://somerandomurl.com/api/v1/getreportcsv
When I try to run this through PHP (and eventually through C#) however, the web service returns an error. Any idea what could be wrong with my code that is making it error? I think the web service must be very specific about the headers/request:
$url = "http://somerandomurl.com/api/v1/getreportcsv";
$fields = [
"customerid" => "902",
"username" => "API1",
"password" => "somepassword",
"reportname" => "1002"
];
$fields_string = "";
foreach($fields as $key=>$value) { $fields_string .= $key.'='.$value.'&'; }
rtrim($fields_string,'&');
//open connection
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_POST,count($fields));
curl_setopt($ch,CURLOPT_POSTFIELDS,$fields_string);
//execute post
$result = curl_exec($ch);
print $result;
Wireshark shows the following differences:
The request below (from PHP) is the one that does not work:
POST /somefolder/api/v1/getreportcsv HTTP/1.1
Host: somehost
Accept: */*
Content-Length: 65
Content-Type: application/x-www-form-urlencoded
customerid=902&username=API1&password=somepassword&reportname=1002&
HTTP/1.1 200 OK
Server: GlassFish Server Open Source Edition 3.1.1
Content-Type: text/html;charset=UTF-8
Content-Length: 6
Date: Thu, 26 Nov 2015 22:51:23 GMT
ERROR
Whereas this one works:
POST /someurl/api/v1/getreportcsv HTTP/1.1
User-Agent: curl/7.33.0
Host: somehost
Accept: */*
Content-Length: 459
Expect: 100-continue
Content-Type: multipart/form-data; boundary=------------------------4b0d14cc31a40c5b
HTTP/1.1 100 Continue
--------------------------4b0d14cc31a40c5b
Content-Disposition: form-data; name="customerid"
902
--------------------------4b0d14cc31a40c5b
Content-Disposition: form-data; name="username"
API1
--------------------------4b0d14cc31a40c5b
Content-Disposition: form-data; name="password"
somepassword
--------------------------4b0d14cc31a40c5b
Content-Disposition: form-data; name="reportname"
1002
--------------------------4b0d14cc31a40c5b--
HTTP/1.1 200 OK
Server: GlassFish Server Open Source Edition 3.1.1
Content-Type: text/html;charset=UTF-8
Transfer-Encoding: chunked
Date: Thu, 26 Nov 2015 23:13:57 GMT
2000
...snip... the results of the api
Obviously they are very different requests, but I wouldn't expect a web service to be this particular about the request format.
The question seems very specific to the service in question.
However, the problem might be with headers. According to curl man page:
-F [...] causes curl to POST data using the Content-Type
multipart/form-data according to RFC 2388
However, according to the PHP manual, the CURLOPT_POST option sends data as application/x-www-form-urlencoded.
According to the same manual, if the value of CURLOPT_POSTFIELDS is an array, the Content-Type header is set to multipart/form-data. You may also try setting the content type explicitly as a header.
Try setting the following cURL options:
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, count($fields));
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: multipart/form-data'));
If that does not work, it might help to inspect all the headers sent by the command-line curl using the -v parameter and then set the same ones in PHP. It might also be a good idea to set the Content-Length header explicitly.
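To make the string-versus-array distinction concrete, here is a short sketch using the field names from the question. The helper name multipart_post_options is hypothetical. A string body goes out URL-encoded, while an array body is sent as multipart/form-data, which is what curl -F produces.

```php
<?php
$fields = [
    'customerid' => '902',
    'username'   => 'API1',
    'password'   => 'somepassword',
    'reportname' => '1002',
];

// String body: application/x-www-form-urlencoded.
// http_build_query() also escapes values and leaves no trailing '&'.
$urlencoded = http_build_query($fields);

// Hypothetical helper: array body, so cURL builds multipart/form-data
// and generates the boundary itself (the equivalent of curl -F ...).
function multipart_post_options(string $url, array $fields): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $fields,
        CURLOPT_RETURNTRANSFER => true,
    ];
}
```

Usage would be along the lines of curl_setopt_array($ch, multipart_post_options($url, $fields)); followed by curl_exec($ch). No explicit Content-Type header should be needed in that case.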

How to log the cURL calls into a file without using Wireshark to capture packets?

I am using PHP and cURL to send an HTTP request to a REST API. However, I need to validate some of the data going in and out.
How can I get a copy of the entire request logged into a file?
I have added this line to my request
curl_setopt($ch, CURLOPT_VERBOSE, true);
I am hoping to get a copy that looks like this saved to a log file:
--- START REQUEST
POST /icws/3048186002/interactions/3002853105 HTTP/1.1
Host: servername:8018
Accept: */*
Cookie: icws_3048186002=64425290-29e5-432a-9ee1-c303cdee1f79
ININ-ICWS-CSRF-Token: WAp3cmF0Y2xpZmZlWBJJQ1dTLUFQSS1jb25uZWN0b3JYJGQwYjhlMWRhLTZlOTktNGIwMC05NGNlLTY5MDdkZjUwOWI5Y1gJMTAuMC40LjE4
ININ-ICWS-Session-ID: 3048186002
Content-Type: application/json
Content-Length: 35
{"attributes":{"logged":1}}
--- END REQUEST
--- START RESPOND
HTTP/1.1 200 OK
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
Content-Type: application/vnd.inin.icws+JSON; charset=utf-8
Date: Wed, 02 Sep 2015 20:39:26 GMT
Server: HttpPluginHost
Content-Length: 0
--- END RESPOND
How can I capture this data without having to use WireShark? I would think cURL and PHP are advanced enough to allow you to log such a thing.
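One possible approach, sketched here rather than tested against this API, is to redirect cURL's verbose output to a file with CURLOPT_STDERR. The helper name verbose_log_options and the log path are hypothetical. As far as I know, the verbose stream contains the request and response headers but not the bodies, so those would have to be appended to the log separately.

```php
<?php
// Hypothetical helper: options that send cURL's verbose trace to a log file.
function verbose_log_options(string $logPath): array
{
    $log = fopen($logPath, 'a');
    if ($log === false) {
        throw new RuntimeException("Cannot open log file: $logPath");
    }
    return [
        CURLOPT_VERBOSE => true, // emit "> request" and "< response" header lines
        CURLOPT_STDERR  => $log, // write them to the file instead of stderr
    ];
}

// Hypothetical helper: append a body (request payload or response) to the same log.
function log_body(string $logPath, string $label, string $body): void
{
    file_put_contents($logPath, "--- $label\n$body\n", FILE_APPEND);
}
```

A request could then be set up with curl_setopt_array($ch, verbose_log_options('/tmp/curl.log')); before curl_exec($ch), with log_body() called for the JSON payload and for the returned response.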

Google Speech API duplicates responses

I am using Speech API v2 with PHP. Here is the code:
$file_to_upload = array('myfile'=>'@'.$filename.'.flac');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.google.com/speech-api/v2/recognize?output=json&lang=ru-RU&key=___my_api_key___");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: audio/x-flac; rate=8000"));
curl_setopt($ch, CURLOPT_POSTFIELDS, $file_to_upload);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result=curl_exec ($ch);
Google responds with two JSON objects: the first is empty, and the second contains the valid response I expect. That makes parsing and further processing difficult. See the HTTP dump:
My POST request:
POST /speech-api/v2/recognize?output=json&lang=ru-RU&key=___my_api_key___ HTTP/1.1
Host: www.google.com
Accept: */*
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36
Content-Length: 13123
Expect: 100-continue
Content-Type: audio/x-flac; rate=8000; boundary=----------------------------9641e899ac92
------------------------------9641e899ac92
Content-Disposition: form-data; name="myfile"; filename="/tmp/voice/1400157667.6440-in.wav.flac"
Content-Type: application/octet-stream
fLaC..."......e..\......! ..{..!y>..7..............................( ...reference libFLAC 1.2.1 20070917.
...encoded binary data...
------------------------------9641e899ac92--
Response with duplicate result of recognition:
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Disposition: attachment
Cache-Control: no-transform
X-Content-Type-Options: nosniff
Pragma: no-cache
Date: Thu, 15 May 2014 12:41:09 GMT
Server: S3 v1.0
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic
Transfer-Encoding: chunked
e
{"result":[]} <--- first one
f8
{"result":[{"alternative":[{"transcript":"............","confidence":0.73531097},{"transcript":"................"},{"transcript":".............."},{"transcript":"................"},{"transcript":"............ .."}],"final":true}],"result_index":0} <--- second one
0
Why does this happen? When I used API v1, there was only one response. Other examples of v2 on the internet also show only one.
Thanks a lot.
First of all, make sure that the language you are using supports speaker diarization. For instance, Google does not provide speaker diarization for Spanish (Colombia), but it does for Spanish (Spain):
Language Support
Besides, a slight alteration of the audio is sometimes needed, which can be achieved using ffmpeg:
ffmpeg -i input.wav -ac 1 -ab 128k -filter:a "volume=0.9,equalizer=f=4000:t=h:w=200:g=-2" output.wav
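On the parsing difficulty raised in the question: the dump shows each JSON object arriving on its own line, so one workaround is to decode the body line by line and keep only objects whose result is non-empty. This is a sketch based solely on the dump above. The function name extract_speech_results is hypothetical, and it does not explain why the empty object is sent.

```php
<?php
// Hypothetical helper: split the response into lines, decode each as JSON,
// and keep only the objects that carry a non-empty "result" array.
function extract_speech_results(string $body): array
{
    $results = [];
    foreach (preg_split('/\R/', $body, -1, PREG_SPLIT_NO_EMPTY) as $line) {
        $decoded = json_decode($line, true);
        if (is_array($decoded) && !empty($decoded['result'])) {
            $results[] = $decoded;
        }
    }
    return $results;
}
```

With $result from curl_exec(), extract_speech_results($result) would return a list whose first element holds the alternatives, for example $list[0]['result'][0]['alternative'][0]['transcript'].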
