Issue replicating PHP CURL request like browser - php

I am having some issues with a browser request that I am trying to replicate with cURL. I am currently working on a university project and am stuck.
I am trying to replicate a browser request to the following URL: http://vm.tiktok.com/e9VDx8/. When I visit the page in my browser, I am redirected to a page with a video and some other content. When I try using cURL, I am shown a 404 Not Found page instead. My cURL request looks like the following:
$ch = curl_init();
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_USERAGENT, $USER_AGENT);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, realpath('./cookies.txt'));
curl_setopt($ch, CURLOPT_COOKIEFILE, realpath('./cookies.txt'));
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_HEADER, TRUE);
curl_setopt($ch, CURLOPT_URL, $url);
$result = curl_exec($ch);
I have looked at the headers from the original URL in the browser and tried to copy paste them into curl but still I get the 404 page. If I copy the browser request as a curl request from chrome developer tools and run it in terminal it works fine.
curl "http://vm.tiktok.com/e9VDx8/" -H "Connection: keep-alive" -H "Upgrade-Insecure-Requests: 1" -H "User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8" -H "Accept-Encoding: gzip, deflate" -H "Accept-Language: en-US,en;q=0.9,fr-CA;q=0.8,fr;q=0.7" -H "Cookie: _ga=GA1.2.213365735.1552156986; _gid=GA1.2.1717226934.1552319684; tt_webid=6667489497775638018" --compressed
Any help would be really appreciated. I am stumped.

It turns out that I figured this issue out minutes after posting for help. Earlier in my script I truncate the URL to make sure there are no invalid characters and such. While doing so I also changed the URL to lower case, which caused the issue, since the URLs are case sensitive.
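For anyone hitting the same trap: a sketch of a case-preserving sanitizer, which lowercases only the scheme and host (which are case-insensitive) and leaves the case-sensitive path untouched. The function name and exact behavior are an illustration, not the asker's original helper; port, user info, and fragment handling are omitted for brevity.

```php
<?php
// Sketch: sanitize a URL without lowercasing the case-sensitive path.
// Only the scheme and host are safe to lowercase.
function sanitize_url(string $url): string {
    $url = trim($url);
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return $url; // not a parseable absolute URL; leave as-is
    }
    $scheme = strtolower($parts['scheme'] ?? 'http');
    $host   = strtolower($parts['host']);
    $path   = $parts['path'] ?? '/';                       // case preserved
    $query  = isset($parts['query']) ? '?' . $parts['query'] : '';
    return $scheme . '://' . $host . $path . $query;
}

// sanitize_url(' HTTP://VM.TikTok.com/e9VDx8/ ')
//   → 'http://vm.tiktok.com/e9VDx8/'  (path case kept intact)
```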

Related

PHP cURL File Download over HTTP2

I have a PHP 7.4 script that downloads a zip file using cURL. Both servers are:
Apache/2.4.51 (Fedora)
Fedora 35
OpenSSL version 1.1.11
If I use CURL_HTTP_VERSION_1_0 everything works; CURL_HTTP_VERSION_2_0 does not. Apache on the server I am calling has protocol h2 set. Below are the pertinent lines of code.
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_0); // this is where I change to ver 2
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1) Gecko/20061024 BonEcho/2.0");
$html = curl_exec($ch);
The error I get using CURL_HTTP_VERSION_2_0 is: Curl Error: transfer closed with 4 bytes remaining to read
Also, I can successfully cURL from the cli to the server from the same box the script is on with --http2.
What else should I try? Is there other info I should post to help answer?
EDIT: Is it possible the Content-Length header is being incorrectly set on the sending side?
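One way to narrow this down is to run the same download under both protocol versions and capture curl's verbose log for comparison; the log shows the negotiated protocol and the framing around the point where the transfer dies. The `fetch()` helper below is a hypothetical sketch, not part of the original script:

```php
<?php
// Sketch: download the same URL under a given HTTP version and return the
// outcome plus curl's verbose log, so HTTP/1.x and HTTP/2 runs can be diffed.
function fetch(string $url, int $httpVersion): array {
    $ch  = curl_init($url);
    $log = fopen('php://temp', 'w+'); // capture verbose output in memory
    curl_setopt_array($ch, [
        CURLOPT_HTTP_VERSION   => $httpVersion,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_VERBOSE        => true,
        CURLOPT_STDERR         => $log,
    ]);
    $body = curl_exec($ch);
    rewind($log);
    $result = [
        'ok'     => $body !== false,
        'error'  => curl_error($ch),
        'length' => is_string($body) ? strlen($body) : 0,
        'log'    => stream_get_contents($log),
    ];
    curl_close($ch);
    fclose($log);
    return $result;
}

// Compare: fetch($url, CURL_HTTP_VERSION_1_1) vs fetch($url, CURL_HTTP_VERSION_2_0)
```

If the HTTP/2 log shows the server closing the stream early while announcing a Content-Length, that would support the suspicion in the edit above that the sending side sets the header incorrectly.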

Curl and file_get_contents timeouts in PHP (command line curl with same URL works normally)

I'm trying to retrieve the contents of a URL: https://www.cyber.gov.au/.
If I use wget or curl from the command line, all is fine. The response is almost instant.
$ wget https://www.cyber.gov.au/
--2020-11-17 08:47:12-- https://www.cyber.gov.au/
Resolving www.cyber.gov.au (www.cyber.gov.au)... 92.122.153.122, 92.122.153.201
Connecting to www.cyber.gov.au (www.cyber.gov.au)|92.122.153.122|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 41951 (41K) [text/html]
Saving to: ‘index.html’
index.html 100%[=========================================>] 40.97K --.-KB/s in 0.002s
2020-11-17 08:47:13 (18.8 MB/s) - ‘index.html’ saved [41951/41951]
However, when I try to connect to the same URL through PHP curl, it times out with the message:
Operation timed out after 5001 milliseconds with 0 bytes received
I've reduced this to a test case:
$handle = curl_init('https://www.cyber.gov.au/');
curl_setopt($handle, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($handle, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($handle, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($handle, CURLOPT_TIMEOUT, 5);
curl_setopt($handle, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36');
$output = curl_exec($handle);
echo $output;
curl_close($handle);
I also tried with various combinations of these additional curl settings, with no change:
curl_setopt($handle, CURLOPT_FRESH_CONNECT, true);
curl_setopt($handle, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4); // Also tried specifying v6
curl_setopt($handle, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($handle, CURLOPT_SSL_VERIFYHOST, 0);
It doesn't seem to be the DNS resolution time:
echo curl_getinfo($handle, CURLINFO_NAMELOOKUP_TIME); // 0.012 seconds
I've tried this on different machines, with different versions of PHP (7.2.12 and 7.4.10), and I get the same behaviour. Other URLs, both HTTP and HTTPS, work as expected. I get the same on CLI PHP as through Apache. Trying file_get_contents() gives a similar result, it just times out. Adding verbose curl logging didn't provide any more information.
curl --version gives curl 7.47.0 and curl 7.58.0 on the machines I tested on.
Can anyone spot what's going on or point me in the right direction to find out more about the problem?
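One diagnostic that goes beyond the name-lookup time already checked: dump all of curl's timing phases after the attempt, which shows whether the five seconds are spent on TCP connect, the TLS handshake, or waiting for the first byte. This is a sketch built from the test case above, with the phases that curl exposes via curl_getinfo():

```php
<?php
// Sketch: break the request into curl's timing phases to see which one
// eats the 5-second budget (DNS, TCP connect, TLS, or time-to-first-byte).
$handle = curl_init('https://www.cyber.gov.au/');
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
curl_setopt($handle, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($handle, CURLOPT_TIMEOUT, 5);
curl_exec($handle); // result ignored; we only want the timings

foreach ([
    'dns'        => CURLINFO_NAMELOOKUP_TIME,
    'connect'    => CURLINFO_CONNECT_TIME,
    'tls'        => CURLINFO_APPCONNECT_TIME,
    'first byte' => CURLINFO_STARTTRANSFER_TIME,
    'total'      => CURLINFO_TOTAL_TIME,
] as $phase => $opt) {
    printf("%-10s %.3fs\n", $phase, curl_getinfo($handle, $opt));
}
curl_close($handle);
```

A phase that sits at 0.000 never completed; for example, a nonzero connect time but zero TLS time would point at the handshake rather than the network path, which can then be chased with openssl s_client.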

PHP cURL methods time out on some URLs, but command line always works

When I attempt to use PHP's cURL methods for SOME URLs, it times out. When I use the commandline for the same URL, it works just fine.
I am using AWS and have a t2.medium box running the php-55 apache libraries from yum.
Here is my PHP code:
function curl($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36');
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 2);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array(
        'Accept-Language: en-us'
    ));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4);
    $fh = fopen('/home/ec2-user/curllog', 'w');
    curl_setopt($ch, CURLOPT_STDERR, $fh);
    $a = curl_exec($ch);
    curl_close($ch);
    fclose($fh);
    $headers = explode("\n", $a);
    var_dump($headers);
    var_dump($a);
    exit;
    return $result;
}
So here is call that works just fine:
curl('http://www.google.com');
And this returns the data for the homepage of google.
However, I try another URL:
curl('http://www.trulia.com/profile/agent-1391347/overview');
And I get this in the curllog:
[ec2-user@central Node]$ cat ../curllog
* Hostname was NOT found in DNS cache
* Trying 23.0.160.99...
* Connected to www.trulia.com (23.0.160.99) port 80 (#0)
> GET /profile/agent-1391347/overview HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36
Host: www.trulia.com
Accept: */*
Accept-Language: en-us
* Operation timed out after 10002 milliseconds with 0 bytes received
* Closing connection 0
If I run this from the command line:
curl -s www.trulia.com/profile/agent-1391347/overview
It IMMEDIATELY returns (within 1 second) with NO output. This is expected. However when I run this:
curl -sL www.trulia.com/profile/agent-1391347/overview
It returns the page properly, just as I would want.
So, what is wrong with my curl?
PHP 5.5.20
Here is the cURL bit from my phpinfo():
curl
cURL support => enabled
cURL Information => 7.38.0
Age => 3
Features
AsynchDNS => Yes
CharConv => No
Debug => No
GSS-Negotiate => No
IDN => Yes
IPv6 => Yes
krb4 => No
Largefile => Yes
libz => Yes
NTLM => Yes
NTLMWB => Yes
SPNEGO => Yes
SSL => Yes
SSPI => No
TLS-SRP => No
Protocols => dict, file, ftp, ftps, gopher, http, https, imap, imaps, ldap, ldaps, pop3, pop3s, rtsp, scp, sftp, smtp, smtps, telnet, tftp
Host => x86_64-redhat-linux-gnu
SSL Version => NSS/3.16.2 Basic ECC
ZLib Version => 1.2.7
libSSH Version => libssh2/1.4.2
I have checked your function curl() and it seems fine; no need to change anything in it. All you need to do is pass the URL as-is as the parameter; there is no need to change HTTPS to HTTP:
curl('http://www.trulia.com/profile/agent-1391347/overview');
Reason:
You already told cURL not to verify the SSL certificate:
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
Let me know if you need any explanation.
The verbose output shows a clear timeout problem:
Operation timed out after 10002 milliseconds with 0 bytes received
This signals a problem with your network setup. Such problems are harder to locate; the cause can be on your own end (e.g. in the context of the webserver or the PHP executable) or on the other end. Both are possible to a certain extent. However, the server accepts both requests even though they have different request headers, so it is more likely that this is related to the execution context, which also matches how you describe it.
Check whether any security or networking layer restricts PHP from performing those requests. E.g. try a different server image if you're not into system administration and troubleshooting. From what is shared in your question, it is hard to say what exactly causes your timeout.
Try increasing the timeout values in the following lines:
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
Those are pretty short timeout values; CURLOPT_TIMEOUT in particular limits the entire execution time. Try giving larger values:
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 15);
curl_setopt($ch, CURLOPT_TIMEOUT, 30);
You have two timeout options:
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
The first one, CURLOPT_CONNECTTIMEOUT, is the maximum amount of time allowed to make the connection to the server.
You can disable it by setting it to 0:
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 0);
But that is not a good idea in a production environment, because the connection attempt will then never time out.
Now CURLOPT_TIMEOUT. From the PHP documentation:
The maximum number of seconds to allow cURL functions to execute.
Set it to some higher value:
curl_setopt($ch, CURLOPT_TIMEOUT, 20); // 20 seconds
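Putting the suggestions together, here is a sketch with more generous timeouts and an explicit check for curl's timeout error code (28, CURLE_OPERATION_TIMEDOUT), so a timeout is distinguishable from other failures such as DNS errors:

```php
<?php
// Sketch: generous timeouts plus an explicit timeout check.
// CURLOPT_CONNECTTIMEOUT bounds only the connection phase;
// CURLOPT_TIMEOUT bounds the whole transfer.
$ch = curl_init('http://www.trulia.com/profile/agent-1391347/overview');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 15); // connection phase only
curl_setopt($ch, CURLOPT_TIMEOUT, 30);        // entire transfer

$body = curl_exec($ch);
if ($body === false) {
    if (curl_errno($ch) === CURLE_OPERATION_TIMEDOUT) {
        // If CURLINFO_CONNECT_TIME is ~0 here, the connect itself never
        // completed; otherwise the server connected but stalled mid-transfer.
        printf("Timed out after %.1fs (connect took %.3fs)\n",
            curl_getinfo($ch, CURLINFO_TOTAL_TIME),
            curl_getinfo($ch, CURLINFO_CONNECT_TIME));
    } else {
        echo 'cURL error ', curl_errno($ch), ': ', curl_error($ch), "\n";
    }
}
curl_close($ch);
```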

File get contents and cURL error

I have problems with file_get_contents and cURL. When I do this:
$myFile = 'http://example.com/asset_compress/assets/get/bbb.js?file%5B0%5D=myfile.js';
$a=file_get_contents( $myFile );
I get this error:
Warning (2): file_get_contents
(http://example.com/asset_compress/assets/get/bbb.js?file%5B0%5D=myfile.js)
[function.file-get-contents]: failed to open stream:
HTTP request failed! HTTP/1.0 404 Not Found
[APP/Controller/MyController.php, line 1373]
Then I tried CURL like this:
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $myFile);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($curl, CURLOPT_USERAGENT, $this->userAgent);
curl_setopt($curl, CURLOPT_VERBOSE, true);
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
$a = curl_exec($curl);
curl_close($curl);
And I get this error:
404 Not Found: The resource requested could not be found on this server!
But when I write http://example.com/asset_compress/assets/get/bbb.js?file%5B0%5D=myfile.js to my browser's address bar, I get the file perfectly. The headers of the browser is like this:
Request URL:http://example.com/asset_compress/assets/get/bbb.js?file%5B0%5D=myfile.js
Request Method:GET
Status Code:200 OK
Request Headers
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.3
Accept-Encoding:gzip,deflate,sdch
Accept-Language:tr,en-US;q=0.8,en;q=0.6
Cache-Control:max-age=0
Connection:keep-alive
Host:example.com
User-Agent:Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.28 Safari/537.31
Query String Parameters
file[0]:myfile.js
Response Headers
Connection:close
Content-Length:15911
Content-Type:application/javascript; charset=UTF-8
Date:Fri, 29 Mar 2013 20:33:43 GMT
Server:Apache
X-Powered-By:PleskLin
I suspected file_get_contents and when I make this, I get output perfectly:
$d1 = file_get_contents("http://www.yahoo.com");
print_r($d1);
When I try cURL I get the 404 error. How can I further diagnose why I get a 404 here, despite getting a 200 from the browser request?
Is this being hosted on a local computer, or elsewhere?
I was having a similar issue that my host determined to be an ISP down the line blocking my requests.
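As a general diagnostic for "browser gets 200, cURL gets 404" cases, it can help to record exactly what cURL sent and received and diff it against the browser headers shown above. A sketch using CURLINFO_HEADER_OUT (the URL is the one from the question):

```php
<?php
// Sketch: capture the outgoing request headers and the response metadata
// so the cURL request can be compared line-by-line with the browser's.
$myFile = 'http://example.com/asset_compress/assets/get/bbb.js?file%5B0%5D=myfile.js';
$curl = curl_init($myFile);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLINFO_HEADER_OUT, true); // record headers curl sends
curl_setopt($curl, CURLOPT_HEADER, true);      // include response headers
curl_exec($curl);

echo "Sent:\n", curl_getinfo($curl, CURLINFO_HEADER_OUT);
echo "HTTP status: ", curl_getinfo($curl, CURLINFO_HTTP_CODE), "\n";
echo "Final URL:   ", curl_getinfo($curl, CURLINFO_EFFECTIVE_URL), "\n";
curl_close($curl);
```

Differences in the Host header, Accept-Encoding, cookies, or an unexpected final URL usually narrow down why the server routes the two requests differently.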

Call SOAP method via CURL in PHP,call require client authentication

OVERVIEW:
The code makes a call to the escreen web service using SOAP and cURL, with client authentication required. Currently I am not getting any result, only HTTP 403 and 500 errors.
The call requires a client authentication cert to be present on the calling site.
CODE:
$content = "<TicketRequest>
<Version>1.0</Version>
<Mode>Test</Mode>
<CommitAction></CommitAction>
<PartnerInfo>
<UserName>xxxxxxxxxx</UserName>
<Password>xxxxxxxxxxx</Password>
</PartnerInfo>
<RequestorOrderID></RequestorOrderID>
<CustomerIdentification>
<IPAddress></IPAddress>
<ClientAccount>xxxxxxxxxx</ClientAccount>
<ClientSubAccount>xxxxxxxxxx</ClientSubAccount>
<InternalAccount></InternalAccount>
<ElectronicClientID></ElectronicClientID>
</CustomerIdentification>
<TicketAction>
<Type></Type>
<Params>
<Param>
<ID>4646</ID>
<Value></Value>
</Param>
</Params>
</TicketAction>
</TicketRequest>";
$wsdl = "https://services.escreen.com/SingleSignOnStage/SingleSignOn.asmx";
$headers = array( "Content-type: text/xml;charset=\"utf-8\"",
"Accept: text/xml",
"Cache-Control: no-cache",
"Pragma: no-cache",
// "SOAPAction: \"\"",
"Content-length: ".strlen($content),
);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $wsdl);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_VERBOSE, '1');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $content);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, '1');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, '1');
//curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: text/xml"));
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
//curl_setopt($ch, CURLOPT_HTTPHEADER, array('SOAPAction: ""'));
curl_setopt($ch, CURLOPT_CAPATH, '/home/pps/');
curl_setopt($ch, CURLOPT_CAINFO, '/home/pps/authority.pem');
curl_setopt($ch, CURLOPT_SSLCERT, 'PROTPLUSSOL_SSO.pem');
curl_setopt($ch, CURLOPT_SSLCERTPASSWD, 'xxxxxxxxxxxx');
$output = curl_exec($ch);
// Check if any error occurred
if (curl_errno($ch)) {
    echo 'Error no : ' . curl_errno($ch) . ' Curl error: ' . curl_error($ch);
}
print_r($output);
QUESTIONS:
I need to call the RequestTicket method and pass the XML string to it.
I don't know how to do it here(pass the method name to call).
For client authentication they gave us three certs, one root cert, one intermediate
cert, and a client authentication cert, PROTPLUSSOL_SSO.pem (it was a .pfx file). Since we are on Linux, we converted them to PEM. In cURL calls I could not find a way to include both the root cert and the intermediate cert, so I combined them into a new PEM file, copying in the intermediate cert and then the root cert, and named it authority.pem.
I am not sure whether that works and would like your opinion.
With the current code I am getting this error:
Error no : 77 Curl error: error setting certificate verify locations: CAfile: /home/pps/authority.pem CApath: /home/pps/
If I suppress the cURL error message, I get a blank page with the page title "403 - Forbidden. Access is denied."
If I comment out the CURLOPT_CAPATH and CURLOPT_CAINFO lines, it gives an HTTP 500 error page with the message as content and the following at the top:
HTTP/1.1 500 Internal Server Error. Cache-Control: private Content-Type: text/html Server: Microsoft-IIS/7.5 X-AspNet-Version: 1.1.4322 X-Powered-By: ASP.NET Date: Thu, 02 Sep 2010 14:46:38 GMT Content-Length: 1208
If I comment out those lines as well as CURLOPT_SSLCERT and CURLOPT_SSLCERTPASSWD, it gives a 403 error with the message as content.
So I would request you to help me out by pointing out whats wrong with the current code.
Thank you.
PHP comes with a SOAP client:
http://php.net/manual/en/book.soap.php
You can tell it to use a certificate by passing the local_cert option to the constructor.
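A rough sketch of that approach, with placeholder paths and passphrase: SoapClient presents the client certificate itself via local_cert/passphrase, so the SOAP envelope and method dispatch don't have to be hand-rolled with cURL. The "?WSDL" suffix and the RequestTicket parameter shape are assumptions; check the service's actual WSDL for the real method signature.

```php
<?php
// Sketch (placeholder cert path, passphrase, and parameter name): call the
// service through PHP's built-in SoapClient instead of raw cURL.
$content = '<TicketRequest>...</TicketRequest>'; // the XML from the question

$options = [
    'local_cert'         => '/home/pps/PROTPLUSSOL_SSO.pem', // client cert + key (PEM)
    'passphrase'         => 'xxxxxxxxxxxx',
    'trace'              => true, // keep last request/response for debugging
    'connection_timeout' => 5,
];

if (class_exists('SoapClient')) {
    try {
        $client = new SoapClient(
            'https://services.escreen.com/SingleSignOnStage/SingleSignOn.asmx?WSDL',
            $options
        );
        // Methods defined in the WSDL are called like native PHP methods;
        // 'ticketRequestXml' is a hypothetical parameter name.
        $response = $client->RequestTicket(['ticketRequestXml' => $content]);
        print_r($response);
    } catch (SoapFault $e) {
        echo 'SOAP error: ', $e->getMessage(), "\n";
    }
}
```

This also answers the method-name question: with a WSDL-mode client, RequestTicket is simply invoked by name, and on failure $client->__getLastRequest() (enabled by 'trace') shows the exact envelope that was sent.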
