I wrote code to send requests through a list of proxy server IP addresses, but it fails with the error
504 Gateway Time-out
I also tried increasing the timeout in php.ini, but that did not help. Here is the code I am using:
<?php
$curl = curl_init();
$timeout = 300;
$proxies = file("proxy.txt");
$r="https://www.youtube.com/watch?v=iglQXfPXJHE";
//$r ="https://www.youtube.com/watch?v=rcWMxmKbj7c";
// Not more than 2 at a time
for($x=0;$x<2000; $x++){
//setting time limit to zero will ensure the script doesn't get timed out
set_time_limit(300);
//now we will separate proxy address from the port
//$PROXY_URL=$proxies[$getrand[$x]];
echo $proxies[$x];
curl_setopt($curl, CURLOPT_URL,$r);
curl_setopt($curl , CURLOPT_PROXY , preg_replace('/\s+/', '',$proxies[$x]));
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5");
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($curl, CURLOPT_REFERER, "http://google.com/");
$text = curl_exec($curl);
echo "Hit Generated:";
echo htmlentities($x);
}
?>
Any help is appreciated. Thank you
Given a text file called proxy.txt with the following tab-separated content:
198.110.57.6 8080 US United States anonymous no yes 1 minute ago
35.193.215.131 8080 US United States anonymous no yes 1 minute ago
198.50.219.239 80 CA Canada anonymous no yes 1 minute ago
217.61.124.144 80 IT Italy anonymous no yes 1 minute ago
171.255.199.5 80 VN Vietnam anonymous no no 1 minute ago
And the following PHP code
define('ROOT','c:/wwwroot');
function curlproxy( $url, $ip, $port, $https ){
$cacert=ROOT . '/cacert.pem';
$curl=curl_init();
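/* $https arrives as the text "yes" or "no" from the proxy list; a loose ==true test would treat "no" as true */
if( $https===true || trim( (string)$https )==='yes' ){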
curl_setopt( $curl, CURLOPT_SSL_VERIFYPEER, 0 );
curl_setopt( $curl, CURLOPT_SSL_VERIFYHOST, 2 );
curl_setopt( $curl, CURLOPT_CAINFO, realpath( $cacert ) );
$proxy='https://'.$ip .':' . $port;
} else {
$proxy='http://'.$ip .':' . $port;
}
$vbh = fopen('php://temp', 'w+');
curl_setopt( $curl, CURLOPT_URL, $url );
curl_setopt( $curl, CURLOPT_AUTOREFERER, TRUE );
curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, TRUE );
curl_setopt( $curl, CURLOPT_FRESH_CONNECT, TRUE );
curl_setopt( $curl, CURLOPT_FORBID_REUSE, TRUE );
curl_setopt( $curl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1 );
/* CURLOPT_CLOSEPOLICY has been dropped here: it never had any effect and was removed in PHP 5.6 */
curl_setopt( $curl, CURLOPT_MAXCONNECTS, 1 );
curl_setopt( $curl, CURLOPT_FAILONERROR, TRUE );
curl_setopt( $curl, CURLOPT_RETURNTRANSFER, TRUE );
curl_setopt( $curl, CURLOPT_CONNECTTIMEOUT, 20 );
curl_setopt( $curl, CURLOPT_TIMEOUT, 20 );
curl_setopt( $curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36' );
curl_setopt( $curl, CURLINFO_HEADER_OUT, FALSE );
curl_setopt( $curl, CURLOPT_NOBODY, TRUE );
curl_setopt( $curl, CURLOPT_PROXY, $proxy );
curl_setopt( $curl, CURLOPT_HTTPPROXYTUNNEL, TRUE );
curl_setopt( $curl, CURLOPT_PROXYTYPE, CURLPROXY_HTTP );
curl_setopt( $curl, CURLOPT_VERBOSE, TRUE );
curl_setopt( $curl, CURLOPT_NOPROGRESS, TRUE );
curl_setopt( $curl, CURLOPT_STDERR, $vbh );
$payload=(object)array_filter( array(
'response' => curl_exec( $curl ),
'info' => (object)curl_getinfo( $curl ),
'errors' => curl_error( $curl ),
'request' => array(
'url' => $url,
'ip' => $ip,
'port' => $port,
'https' => $https,
'proxy' => $proxy
)
)
);
curl_close( $curl );
rewind( $vbh );
$payload->verbose=stream_get_contents( $vbh );
fclose( $vbh );
return $payload;
}
$data=array();
$url='https://www.youtube.com/watch?v=iglQXfPXJHE';
$list = file('c:/temp/proxy.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
foreach( $list as $i => $line ){
list($ip,$port,$code,$country,$anonymous,$google,$https,$up)=explode(chr(9),$line);
$data[]=curlproxy( $url, $ip, $port, $https );
}
echo '<pre>',print_r($data,true),'</pre>';
This gave reasonable results for a few proxies chosen at random from free-proxy-list.net. A small excerpt of the output is shown here:
Array
(
[0] => stdClass Object
(
[info] => stdClass Object
(
[url] => https://lightspeed.ravennaschools.org/access?YT91X4Q5J1HNFABCRE8BZ5FZ22WS4KQ2
[content_type] => text/html
[http_code] => 200
[header_size] => 721
[request_size] => 970
[filetime] => -1
[ssl_verify_result] => 19
[redirect_count] => 1
[total_time] => 1.763
[namelookup_time] => 0
[connect_time] => 0.14
[pretransfer_time] => 0.624
[size_upload] => 0
[size_download] => 0
[speed_download] => 0
[speed_upload] => 0
[download_content_length] => 0
[upload_content_length] => 0
[starttransfer_time] => 0.764
[redirect_time] => 0.999
[certinfo] => Array
(
)
)
[request] => Array
(
[url] => https://www.youtube.com/watch?v=iglQXfPXJHE
[ip] => 198.110.57.6
[port] => 8080
[https] => yes
[proxy] => https://198.110.57.6:8080
)
[verbose] => * About to connect() to proxy 198.110.57.6 port 8080 (#0)
* Trying 198.110.57.6... * connected
* Connected to 198.110.57.6 (198.110.57.6) port 8080 (#0)
* Establish HTTP proxy tunnel to www.youtube.com:443
> CONNECT www.youtube.com:443 HTTP/1.1
Host: www.youtube.com:443
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36
Proxy-Connection: Keep-Alive
< HTTP/1.1 200 Connection established
<
* Proxy replied OK to CONNECT request
* successfully set certificate verify locations:
* CAfile: C:\wwwroot\cacert.pem
CApath: none
* SSL connection using AES256-SHA
* Server certificate:
* subject: C=US; ST=California; L=Mountain View; O=Google Inc; CN=*.google.com
* start date: 2017-07-25 08:46:44 GMT
* expire date: 2017-10-17 08:28:00 GMT
* subjectAltName: www.youtube.com matched
* issuer: C=US; ST=California; L=Bakersfield; O=Lightspeed Systems; OU=Support; CN=Lightspeed Rocket; emailAddress=support@lightspeedsystems.com
* SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
> HEAD /watch?v=iglQXfPXJHE HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36
Host: www.youtube.com
Accept: */*
< HTTP/1.1 302 Moved Temporarily
< Server: squid/3.3.13
< Date: Thu, 10 Aug 2017 06:52:52 GMT
< Content-Length: 0
< Location: https://lightspeed.ravennaschools.org/access?YT91X4Q5J1HNFABCRE8BZ5FZ22WS4KQ2
< X-Cache: MISS from lightspeed.ravennaschools.org
< Via: 1.1 lightspeed.ravennaschools.org (squid/3.3.13)
< Connection: close
<
* Closing connection #0
* Issue another request to this URL: 'https://lightspeed.ravennaschools.org/access?YT91X4Q5J1HNFABCRE8BZ5FZ22WS4KQ2'
* About to connect() to proxy 198.110.57.6 port 8080 (#0)
* Trying 198.110.57.6... * connected
* Connected to 198.110.57.6 (198.110.57.6) port 8080 (#0)
* Establish HTTP proxy tunnel to lightspeed.ravennaschools.org:443
> CONNECT lightspeed.ravennaschools.org:443 HTTP/1.1
Host: lightspeed.ravennaschools.org:443
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36
Proxy-Connection: Keep-Alive
< HTTP/1.1 200 Connection established
<
* Proxy replied OK to CONNECT request
* successfully set certificate verify locations:
* CAfile: C:\wwwroot\cacert.pem
CApath: none
* SSL connection using AES256-SHA
* Server certificate:
* subject: OU=Domain Control Validated; CN=lightspeed.ravennaschools.org
* start date: 2017-08-01 16:01:01 GMT
* expire date: 2020-08-01 16:01:01 GMT
* subjectAltName: lightspeed.ravennaschools.org matched
* issuer: C=US; ST=California; L=Bakersfield; O=Lightspeed Systems; OU=Support; CN=Lightspeed Rocket; emailAddress=support@lightspeedsystems.com
* SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
> HEAD /access?YT91X4Q5J1HNFABCRE8BZ5FZ22WS4KQ2 HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36
Host: lightspeed.ravennaschools.org
Accept: */*
Referer: https://www.youtube.com/watch?v=iglQXfPXJHE
< HTTP/1.1 200 OK
< Server: nginx/1.10.0
< Date: Thu, 10 Aug 2017 06:52:53 GMT
< Content-Type: text/html
< Expires: Thu, 10 Aug 2017 06:52:52 GMT
< Cache-Control: no-cache
< Cache-Control: no-cache
< Pragma: no-cache
< X-UA-Compatible: IE=Edge,chrome=1
< X-Lightspeed: suite
< X-Cache: MISS from lightspeed.ravennaschools.org
< Via: 1.1 lightspeed.ravennaschools.org (squid/3.3.13)
< Connection: keep-alive
* no chunk, no close, no size. Assume close to signal end
<
* Closing connection #0
)
If, however, the sole aim of this script is to increase the hit counter, you may need to rethink the approach, because the counter does not seem to be affected. Perhaps the above will still be of use.
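Each line of proxy.txt carries its HTTPS flag as the text yes or no, and in PHP a non-empty string such as "no" is still truthy, so it can help to turn that column into a real boolean while parsing. The following is only a sketch: parseProxyLine() is a hypothetical helper name, and the column order is assumed to be the one shown in the sample file above.

```php
<?php
// Hypothetical helper: parse one tab-separated line of the proxy list into named fields.
function parseProxyLine(string $line): ?array
{
    // file() keeps the trailing newline unless told otherwise, so strip it first.
    $fields = explode("\t", rtrim($line, "\r\n"));
    if (count($fields) < 7) {
        return null; // blank or malformed line
    }
    [$ip, $port, $code, $country, $anonymity, $google, $https] = $fields;
    if (filter_var($ip, FILTER_VALIDATE_IP) === false || !ctype_digit($port)) {
        return null; // not an IP address / port pair
    }
    return [
        'ip'    => $ip,
        'port'  => (int)$port,
        // Convert the "yes"/"no" text into a real boolean.
        'https' => strtolower(trim($https)) === 'yes',
    ];
}

$line = "198.110.57.6\t8080\tUS\tUnited States\tanonymous\tno\tyes\t1 minute ago\n";
var_dump(parseProxyLine($line));
```

Lines that return null can simply be skipped in the foreach loop before calling curlproxy().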
I am sending a POST request in PHP via cURL to a REST API that uses XML. When I use Postman or Advanced REST Client, I get an XML response to my POST request. However, when I use PHP and cURL, I cannot get the XML response back. What do I need to do to retrieve it? Eventually I need to obtain a token that I can then use to perform inserts, updates and gets through this API via XML.
Here is the code that I am currently using:
$curl = curl_init();
curl_setopt_array($curl, array(
CURLOPT_URL => 'https://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx',
CURLOPT_RETURNTRANSFER => true,
CURLOPT_ENCODING => '',
CURLOPT_MAXREDIRS => 10,
CURLOPT_TIMEOUT => 0,
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
CURLOPT_CUSTOMREQUEST => 'POST',
CURLOPT_HTTPHEADER => array(
'xxxxxx-Username: xxx',
'xxxxxx-Password: xxx',
'content-type: application/xml'
),
));
$response = curl_exec($curl);
curl_close($curl);
echo $response;
Currently I am getting a blank page. I have tried quite a few solutions, such as the following:
//header("Content-Type: text/xml");
//header('Content-type: application/xml');
//$decoded = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $response);
//echo $decoded;
//echo $response;
//print_r($response);
// set up your xml result
$xml = new SimpleXMLElement($response, LIBXML_NOCDATA);
// loop through the results
$cnt = count($xml->Result);
for($i=0; $i<$cnt; $i++){
echo 'XML : First Name: = ';
}
but nothing gives me back what I get from Postman or Advanced REST Client, which for this particular command is the following:
<?xml version="1.0" encoding="UTF-8"?>
<AuthInfo>
<token/>
<AuthStatus>
<Id>503</Id>
<Description>There's no proapi manager running with the given company code: crmapp</Description>
</AuthStatus>
</AuthInfo>
I understand that at this stage there is an issue with my url that I need to fix, but I still should be able to receive that error back via XML.
Can anyone please help me get this XML response back so that I can progress my interface?
Thank you in advance,
Adri
Thanks again, Professor. Here is the full debug output with the latest version of PHP and cURL:
Verbose debug info
* Trying xxx.xx.xxx.xxx:443...
* Connected to xxxxx-xx-xx.xxxxxxxx.com.au (xxx.xx.xxx.xxx) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: D:/Adri/PHP/MoW/famac/cacert.pem
* CApath: D:/Adri/PHP/MoW/famac/cacert.pem
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
* subject: CN=*.prontohosted.com.au
* start date: Jun 2 00:00:00 2020 GMT
* expire date: Sep 4 00:00:00 2022 GMT
* subjectAltName: host "xxxxx-xx-xx.xxxxxxxx.com.au" matched cert's "*.xxxxxxxx.com.au"
* issuer: C=GB; ST=Greater Manchester; L=Salford; O=Sectigo Limited; CN=Sectigo RSA Domain Validation Secure Server CA
* SSL certificate verify ok.
> GET /xxxxx/rest/xxx.xxx/login HTTP/1.1
Host: xxxxx-xx-xx.xxxxxxxx.com.au
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.38 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.38
Accept: */*
Accept-Encoding: deflate, gzip
xxxxxx-Username: xxx
xxxxxx-Password: xxx
Content-Type: application/xml
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
< Date: Tue, 09 Nov 2021 11:34:57 GMT
< Server: Apache
< Referrer-Policy: origin-when-cross-origin, strict-origin-when-cross-origin
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
< Content-Security-Policy: img-src 'self' *.xxxxx.net *.xxxxx.com.au https://www.google.com https://*.googleapis.com/ www.google-analytics.com stats.g.doubleclick.net http://*.xxxxx-xxxxx.com *.twitter.com *.twimg.com data: blob: https://*.google.com https://*.gstatic.com https://*.googleapis.com; frame-src * blob:; script-src 'self' 'unsafe-inline' 'unsafe-eval' *.xxxxx.net *.xxxxx.com.au https://*.google.com www.google-analytics.com *.twitter.com *.twimg.com https://*.googleapis.com https://jawj.github.io https://*.gstatic.com; connect-src 'self' wss: blob: *.twitter.com www.google-analytics.com stats.g.doubleclick.net; base-uri 'none'; style-src 'self' 'unsafe-inline' *.twitter.com *.twimg.com https://*.google.com *.googleapis.com https://*.gstatic.com; font-src 'self' data: https://*.googleapis.com https://fonts.gstatic.com; child-src * blob:; object-src 'none'; default-src 'self' blob:
< X-Permitted-Cross-Domain-Policies: master-only
< Content-Type: text/html; charset=UTF-8
< Content-Length: 994
* The requested URL returned error: 404
* Closing connection 0
Info
stdClass Object
(
[url] => https://xxxxx-xx-xx.xxxxxxxx.com.au/xxxxx/rest/xxx.xxx/login
[content_type] => text/html; charset=UTF-8
[http_code] => 404
[header_size] => 1271
[request_size] => 350
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 0.232624
[namelookup_time] => 0.029367
[connect_time] => 0.05058
[pretransfer_time] => 0.162497
[size_upload] => 0
[size_download] => 0
[speed_download] => 0
[speed_upload] => 0
[download_content_length] => 994
[upload_content_length] => 0
[starttransfer_time] => 0.232609
[redirect_time] => 0
[redirect_url] =>
[primary_ip] => xxx.xx.xxx.xxx
[certinfo] => Array
(
)
[primary_port] => 443
[local_ip] => xxx.xxx.x.xxx
[local_port] => 52711
[http_version] => 2
[protocol] => 2
[ssl_verifyresult] => 0
[scheme] => HTTPS
[appconnect_time_us] => 162464
[connect_time_us] => 50580
[namelookup_time_us] => 29367
[pretransfer_time_us] => 162497
[redirect_time_us] => 0
[starttransfer_time_us] => 232609
[total_time_us] => 232624
)
Can you please let me know what you think of this? While I am no longer getting the previous error, I still seem unable to receive the XML response back. :(
Thank you in advance, Adri
The curl function I use is as follows. It has extra debugging information in the output and the default settings can be easily overridden at runtime by supplying a different $options argument. I'm not suggesting this is the answer but with a better set of options configured and better debug info you should get closer.
function curl( $url=NULL, $options=NULL, $headers=false ){
$cacert='c:/wwwroot/cacert.pem';
$vbh = fopen('php://temp', 'w+');
/*
Download a copy of CACERT.pem from
https://curl.haxx.se/docs/caextract.html
save to webserver and modify the $cacert variable
to suit - ensuring that the path you choose is
readable.
*/
$res=array(
'response' => NULL,
'info' => array( 'http_code' => 100 ),
'headers' => NULL,
'errors' => NULL
);
if( is_null( $url ) ) return (object)$res;
session_write_close();
/* Initialise curl request object - these should be OK as-is */
$curl=curl_init();
if( parse_url( $url,PHP_URL_SCHEME )=='https' ){
curl_setopt( $curl, CURLOPT_SSL_VERIFYPEER, true );
curl_setopt( $curl, CURLOPT_SSL_VERIFYHOST, 2 );
curl_setopt( $curl, CURLOPT_CAINFO, $cacert );
curl_setopt( $curl, CURLOPT_CAPATH, $cacert );
}
/* Define standard options */
curl_setopt( $curl, CURLOPT_URL,trim( $url ) );
curl_setopt( $curl, CURLOPT_AUTOREFERER, true );
curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $curl, CURLOPT_FAILONERROR, true );
curl_setopt( $curl, CURLOPT_HEADER, false );
curl_setopt( $curl, CURLINFO_HEADER_OUT, false );
curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $curl, CURLOPT_BINARYTRANSFER, true );
curl_setopt( $curl, CURLOPT_CONNECTTIMEOUT, 20 );
curl_setopt( $curl, CURLOPT_TIMEOUT, 60 );
curl_setopt( $curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.38 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.38' );
curl_setopt( $curl, CURLOPT_MAXREDIRS, 10 );
curl_setopt( $curl, CURLOPT_ENCODING, '' );
/* enhanced debug */
curl_setopt( $curl, CURLOPT_VERBOSE, true );
curl_setopt( $curl, CURLOPT_NOPROGRESS, true );
curl_setopt( $curl, CURLOPT_STDERR, $vbh );
/* Assign runtime parameters as options to override defaults if needed. */
if( isset( $options ) && is_array( $options ) ){
foreach( $options as $param => $value ) curl_setopt( $curl, $param, $value );
}
/* send any headers with the request that are needed */
if( $headers && is_array( $headers ) ){
curl_setopt( $curl, CURLOPT_HTTPHEADER, $headers );
}
/* Execute the request and store responses */
$res=(object)array(
'response' => curl_exec( $curl ),
'info' => (object)curl_getinfo( $curl ),
'errors' => curl_error( $curl )
);
rewind( $vbh );
$res->verbose=stream_get_contents( $vbh );
fclose( $vbh );
curl_close( $curl );
return $res;
}
Then, to use it:
$url='https://www.example.com/api/';
$args=array();
$headers=array(
'xxxxxx-Username: xxx',
'xxxxxx-Password: xxx',
'Content-Type: application/xml'
);
$res=curl( $url, $args, $headers );
if( $res->info->http_code==200 ){
#cool - use $res->response in further processing
print_r($res->response); # without the second argument, so the response is printed rather than returned
}else{
# useful information will be displayed here...
printf('<h1>Verbose debug info</h1><pre>%s</pre>',print_r($res->verbose,true));
printf('<h1>Info</h1><pre>%s</pre>',print_r($res->info,true));
}
Update: how to send POST data
You use the $options parameter to supply different runtime configuration to the curl request, like so:
$url='https://www.example.com/api/';
$args=array(
CURLOPT_POST => true,
CURLOPT_POSTFIELDS => $send_body
);
$headers=array(
'xxxxxx-Username: xxx',
'xxxxxx-Password: xxx',
'Content-Type: application/xml'
);
$res=curl( $url, $args, $headers );
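Once a body does come back, the token and status could be read with SimpleXML. This is a rough sketch rather than a definitive implementation: parseAuthInfo() is a hypothetical helper, and the element names are taken only from the sample response in the question. Note that the curl() function above sets CURLOPT_FAILONERROR, which makes curl_exec() return false for HTTP status codes of 400 and above, so an error document sent with such a status would only be visible after passing CURLOPT_FAILONERROR => false in $options.

```php
<?php
// Hypothetical helper: read the token and status out of an AuthInfo response body.
function parseAuthInfo(string $body): ?array
{
    // Collect libxml errors quietly instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($body);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if ($xml === false) {
        return null; // not well-formed XML
    }
    return [
        'token'       => (string)$xml->token,
        'statusId'    => (int)$xml->AuthStatus->Id,
        'description' => (string)$xml->AuthStatus->Description,
    ];
}

// The sample response from the question, used here in place of $res->response.
$sample = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<AuthInfo>
<token/>
<AuthStatus>
<Id>503</Id>
<Description>There's no proapi manager running with the given company code: crmapp</Description>
</AuthStatus>
</AuthInfo>
XML;

$auth = parseAuthInfo($sample);
printf("status %d: %s\n", $auth['statusId'], $auth['description']);
```

When the result is echoed into a browser, raw XML tags are not rendered as text, which can look like a blank page; echo htmlspecialchars($res->response) or "view source" shows what was actually received.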
I have the method below, which is meant to get the response from a web server using cURL.
function login (string $_login, string $_password) : string {
$url = "https://acweb.net.br/api/orcamentos/login";
$fields = [
"login" => $_login,
"password" => $_password
];
$headers = [
"Try" => "Trying"
];
$ch = curl_init();
curl_setopt( $ch, CURLOPT_URL, $url);
curl_setopt( $ch, CURLOPT_POST, true);
curl_setopt( $ch, CURLOPT_POSTFIELDS, $fields);
curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, false );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
return curl_exec( $ch );
}
It works fine!
I can read the posted values on the server with
print_r ($_POST)
But I can't get the value set with CURLOPT_HTTPHEADER.
EDIT:
I tried this:
print_r ($_SERVER)
but it wasn't there.
How can I get the value set with CURLOPT_HTTPHEADER?
All HTTP_ headers in $_SERVER:
[HTTP_HOST] => ctemcasb.com.br
[HTTP_CONNECTION] => keep-alive
[HTTP_UPGRADE_INSECURE_REQUESTS] => 1
[HTTP_USER_AGENT] => Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36
[HTTP_SEC_FETCH_USER] => ?1
[HTTP_ACCEPT] => text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
[HTTP_SEC_FETCH_SITE] => none
[HTTP_SEC_FETCH_MODE] => navigate
[HTTP_ACCEPT_ENCODING] => gzip, deflate, br
[HTTP_ACCEPT_LANGUAGE] => pt-BR,pt;q=0.9,en-US;q=0.8,en;q=0.7
[HTTP_COOKIE] => PHPSESSID=j4cqqdc83fia68nk0gsglqk1bv
HTTP_TRY does not exist.
What now?
I did this on the server:
print_r($_SERVER)
and
print_r ($_SERVER["HTTP_TRY"]);
The headers shouldn't be an associative array; they should be an indexed array of strings, each in the form "Name: value".
$headers = [
'Try: Trying',
'Content-Type: text/html',
...
];
Then you should be able to access the header with $_SERVER['HTTP_TRY'], since PHP exposes request headers in $_SERVER with an HTTP_ prefix, with the name upper-cased and hyphens converted to underscores.
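If the headers already live in an associative array, they can be converted rather than retyped. A minimal sketch, assuming two hypothetical helpers (toCurlHeaders() and serverKeyFor() are names made up for this example, as is the X-Api-Version header):

```php
<?php
// Hypothetical helper: turn ['Name' => 'value'] into the ['Name: value'] list
// that CURLOPT_HTTPHEADER expects.
function toCurlHeaders(array $headers): array
{
    $lines = [];
    foreach ($headers as $name => $value) {
        // Entries that are already "Name: value" strings (integer keys) pass through unchanged.
        $lines[] = is_int($name) ? $value : $name . ': ' . $value;
    }
    return $lines;
}

// Hypothetical helper: the $_SERVER key under which PHP exposes a custom request header.
function serverKeyFor(string $headerName): string
{
    return 'HTTP_' . strtoupper(str_replace('-', '_', $headerName));
}

$headers = toCurlHeaders(['Try' => 'Trying', 'X-Api-Version' => '2']);
print_r($headers); // the list to hand to curl_setopt( $ch, CURLOPT_HTTPHEADER, $headers )
echo serverKeyFor('Try'), "\n";
echo serverKeyFor('X-Api-Version'), "\n";
```

On the receiving side, print_r($_SERVER) should then list the keys that serverKeyFor() produces, provided the request really came from the cURL script and not from a browser visiting the same URL.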
I'm trying to get some data from this website: https://stubhub.com.
1- With file_get_contents:
$url= 'https://www.stubhub.com';
$html = file_get_contents($url);
echo $html;
I get:
Warning: file_get_contents(https://stubhub.com): failed to open stream: HTTP request failed! HTTP/1.0 405 Method Not Allowed
2- With CURL:
$url= 'https://www.stubhub.com';
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($curl, CURLOPT_HEADER, true);
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_REFERER, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$html = curl_exec($curl);
$response = curl_getinfo($curl, CURLINFO_HTTP_CODE);
curl_close($curl);
var_dump($html);
var_dump($response);
But I get:
bool(false) int(0)
I tried to add some headers like User-Agent and proxy:
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201');
$proxy = '185.135.226.159:23500';
curl_setopt($curl, CURLOPT_PROXY, $proxy);
But again I get the same.
I have allow_url_fopen=On, so what's wrong?
function curl( $url=NULL, $options=NULL ){
$cacert='c:/wwwroot/cacert.pem'; # <----- download your own copy and configure this path
$vbh = fopen('php://temp', 'w+');
$res=array(
'response' => NULL,
'info' => array( 'http_code' => 100 ),
'headers' => NULL,
'errors' => NULL
);
if( is_null( $url ) ) return (object)$res;
session_write_close();
/* Initialise curl request object */
$curl=curl_init();
if( parse_url( $url,PHP_URL_SCHEME )=='https' ){
curl_setopt( $curl, CURLOPT_SSL_VERIFYPEER, true );
curl_setopt( $curl, CURLOPT_SSL_VERIFYHOST, 2 );
curl_setopt( $curl, CURLOPT_CAINFO, $cacert );
}
/* Define standard options */
curl_setopt( $curl, CURLOPT_URL,trim( $url ) );
curl_setopt( $curl, CURLOPT_AUTOREFERER, true );
curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $curl, CURLOPT_FAILONERROR, true );
curl_setopt( $curl, CURLOPT_HEADER, false );
curl_setopt( $curl, CURLINFO_HEADER_OUT, false );
curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $curl, CURLOPT_BINARYTRANSFER, true );
curl_setopt( $curl, CURLOPT_CONNECTTIMEOUT, 20 );
curl_setopt( $curl, CURLOPT_TIMEOUT, 60 );
curl_setopt( $curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36' );
curl_setopt( $curl, CURLOPT_MAXREDIRS, 10 );
curl_setopt( $curl, CURLOPT_ENCODING, '' );
curl_setopt( $curl, CURLOPT_VERBOSE, true );
curl_setopt( $curl, CURLOPT_NOPROGRESS, true );
curl_setopt( $curl, CURLOPT_STDERR, $vbh );
/* Assign runtime parameters as options */
if( isset( $options ) && is_array( $options ) ){
foreach( $options as $param => $value ) curl_setopt( $curl, $param, $value );
}
/* Execute the request and store responses */
$res=(object)array(
'response' => curl_exec( $curl ),
'info' => (object)curl_getinfo( $curl ),
'errors' => curl_error( $curl )
);
rewind( $vbh );
$res->verbose=stream_get_contents( $vbh );
fclose( $vbh );
curl_close( $curl );
return $res;
}
$url='https://www.stubhub.com/';
$res = curl( $url );
if( $res->info->http_code==200 ){
printf('<pre>%s</pre>',print_r( $res->info,true ));
printf('<pre>%s</pre>',print_r( $res->verbose,true ));
}
This will output:
stdClass Object
(
[url] => https://www.stubhub.com/
[content_type] => text/html
[http_code] => 200
[header_size] => 1304
[request_size] => 214
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 0.609
[namelookup_time] => 0.25
[connect_time] => 0.265
[pretransfer_time] => 0.39
[size_upload] => 0
[size_download] => 1194
[speed_download] => 1960
[speed_upload] => 0
[download_content_length] => 1194
[upload_content_length] => -1
[starttransfer_time] => 0.609
[redirect_time] => 0
[redirect_url] =>
[primary_ip] => 23.43.75.46
[certinfo] => Array
(
)
[primary_port] => 443
[local_ip] => 192.168.0.56
[local_port] => 5042
)
* Trying 23.43.75.46...
* TCP_NODELAY set
* Connected to www.stubhub.com (23.43.75.46) port 443 (#0)
* ALPN, offering http/1.1
* successfully set certificate verify locations:
CAfile: c:/wwwroot/cacert.pem
CApath: none
* SSL connection using TLSv1.2 / ECDHE-ECDSA-AES256-GCM-SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
* subject: C=US; ST=California; L=San Francisco; O=Stubhub, Inc.; OU=Technology; CN=www.stubhub.com
* start date: Jun 11 00:00:00 2018 GMT
* expire date: Jan 9 12:00:00 2020 GMT
* subjectAltName: host "www.stubhub.com" matched cert's "www.stubhub.com"
* issuer: C=US; O=DigiCert Inc; CN=DigiCert ECC Secure Server CA
* SSL certificate verify ok.
> GET / HTTP/1.1
Host: www.stubhub.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36
Accept: */*
Accept-Encoding: deflate, gzip
< HTTP/1.1 200 OK
< Server: nginx
< Content-Type: text/html
< Expires: Thu, 01 Jan 1970 00:00:01 GMT
< Cache-Control: private, no-cache, no-store, must-revalidate
< Surrogate-Control: no-store, bypass-cache
< Content-Encoding: gzip
< X-EdgeConnect-MidMile-RTT: 163
< X-EdgeConnect-Origin-MEX-Latency: 24
< X-Akamai-Transformed: 9 624 0 pmb=mTOE,1mRUM,1
< Date: Sat, 20 Oct 2018 16:25:57 GMT
< Content-Length: 1194
< Connection: keep-alive
< Vary: Accept-Encoding
< Set-Cookie: DC=lvs31;Path=/;Domain=stubhub.com;Expires=Sat, 20-Oct-2018 16:55:56 GMT;Max-Age=1800
< Set-Cookie: akacd_PCF_Prod=1540053357~rv=98~id=53e183ee10a83152497c9102c8c7dee7; path=/; Expires=Sat, 20 Oct 2018 16:35:57 GMT
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< Set-Cookie: _abck=10D08E1267D29C2EDBEA32445BD116805C7A3616AB3500001557CB5B9AD22713~-1~e+BGOJkoD/UwtPOWH75YXUSo6Kzyd7sF6nTkkw89JfE=~-1~-1; expires=Sun, 20 Oct 2019 16:25:57 GMT; max-age=31536000; path=/; domain=.stubhub.com
< Set-Cookie: bm_sz=7C06CFF7557E22DEC7855EC89DF628B0~QAAQFjZ6XGg5goBmAQAAIypMkhVJRZxwtVU8097T7Q8Z2TcGPZR0XRtAVFY3TBHGsR4EW51MqZlCAyk3cMPDJEmukVvLunM36/5Kn1gtoxarUtgkqBvlfudWZBJb2xc1rHdnMhdsAXoHWLaGt0NwROSXckDe48kkqu2Kw3suRgrWcqDlj7Y1akARK8OYnoa6; Domain=.stubhub.com; Path=/; Expires=Sat, 20 Oct 2018 20:25:56 GMT; Max-Age=14399; HttpOnly
<
* Connection #0 to host www.stubhub.com left intact
To access the actual response body, process $res->response: load it into DOMDocument or whatever you intend to use. Good luck.
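As a rough illustration of that last step, the body could be loaded into DOMDocument like this. It is only a sketch: summariseHtml() is a hypothetical helper, and the inline HTML string stands in for $res->response so that the example runs without a network call.

```php
<?php
// Hypothetical helper: pull the <title> text and all link targets out of an HTML string.
function summariseHtml(string $html): array
{
    if (trim($html) === '') {
        return ['title' => null, 'links' => []]; // nothing to parse
    }
    $dom = new DOMDocument();
    // Real-world markup is rarely valid; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = $dom->getElementsByTagName('title');
    $title = $titles->length > 0 ? trim($titles->item(0)->textContent) : null;

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $links[] = $a->getAttribute('href');
        }
    }
    return ['title' => $title, 'links' => $links];
}

// Stand-in for $res->response.
$html = '<html><head><title>Example</title></head><body><a href="/events">Events</a></body></html>';
print_r(summariseHtml($html));
```

Since curl_exec() returns false on failure, it is worth checking is_string($res->response) before passing it in.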
I'm trying to simulate a real browser request using cURL with rotating proxies. I searched for a solution, but none of the answers worked.
Here is the code:
$url= 'https://www.stubhub.com/';
$proxy = '1.10.185.133:30207';
$userAgent = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36';
$curl = curl_init();
curl_setopt( $curl, CURLOPT_URL, trim($url) );
curl_setopt($curl, CURLOPT_REFERER, trim($url));
curl_setopt( $curl, CURLOPT_RETURNTRANSFER, TRUE );
curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, TRUE );
curl_setopt( $curl, CURLOPT_CONNECTTIMEOUT, 0 );
curl_setopt( $curl, CURLOPT_TIMEOUT, 0 );
curl_setopt( $curl, CURLOPT_AUTOREFERER, TRUE );
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
$cacert='C:/xampp/htdocs/cacert.pem';
curl_setopt( $curl, CURLOPT_CAINFO, $cacert );
curl_setopt($curl, CURLOPT_COOKIEFILE,__DIR__."/cookies.txt");
curl_setopt ($curl, CURLOPT_COOKIEJAR, dirname(__FILE__) . '/cookies.txt');
curl_setopt($curl, CURLOPT_MAXREDIRS, 5);
curl_setopt( $curl, CURLOPT_USERAGENT, $userAgent );
//Headers
$header = array();
$header[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$header[] = "Accept-Language: cs,en-US;q=0.7,en;q=0.3";
$header[] = "Accept-Encoding: utf-8";
$header[] = "Connection: keep-alive";
$header[] = "Host: www.gumtree.com";
$header[] = "Origin: https://www.stubhub.com";
$header[] = "Referer: https://www.stubhub.com";
curl_setopt( $curl, CURLOPT_HEADER, $header );
curl_setopt($curl, CURLOPT_PROXYTYPE, CURLPROXY_HTTP);
curl_setopt($curl, CURLOPT_HTTPPROXYTUNNEL, TRUE);
curl_setopt($curl, CURLOPT_PROXY, $proxy);
curl_setopt($curl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
$data = curl_exec( $curl );
$info = curl_getinfo( $curl );
$error = curl_error( $curl );
$all = array( 'data' => $data, 'info' => $info, 'error' => $error );
echo '<pre>';
print_r($all);
echo '</pre>';
Here is what I get when I run the script:
Array
(
[data] => HTTP/1.1 200 OK
HTTP/1.0 405 Method Not Allowed
Server: nginx
Content-Type: text/html; charset=UTF-8
Accept-Ranges: bytes
Expires: Thu, 01 Jan 1970 00:00:01 GMT
Cache-Control: private, no-cache, no-store, must-revalidate
Surrogate-Control: no-store, bypass-cache
Content-Length: 9411
X-EdgeConnect-MidMile-RTT: 203
X-EdgeConnect-Origin-MEX-Latency: 24
Date: Sat, 03 Nov 2018 17:15:56 GMT
Connection: close
Strict-Transport-Security: max-age=31536000; includeSubDomains
[info] => Array
(
[url] => https://www.stubhub.com/
[content_type] => text/html; charset=UTF-8
[http_code] => 405
[header_size] => 487
[request_size] => 608
[filetime] => -1
[ssl_verify_result] => 0
[redirect_count] => 0
[total_time] => 38.484
[namelookup_time] => 0
[connect_time] => 2.219
[pretransfer_time] => 17.062
[size_upload] => 0
[size_download] => 9411
[speed_download] => 244
[speed_upload] => 0
[download_content_length] => 9411
[upload_content_length] => -1
[starttransfer_time] => 23.859
[redirect_time] => 0
[redirect_url] =>
[primary_ip] => 1.10.186.132
[certinfo] => Array
(
)
[primary_port] => 42150
[local_ip] => 192.168.1.25
[local_port] => 59320
)
[error] =>
)
I also get a reCAPTCHA page, which says:
Due to high volume of activity from your computer, our anti-robot software has blocked your access to stubhub.com. Please solve the puzzle below and you will immediately regain access.
When I visit the website using any browser, the website is displayed.
But with the above script, it is not.
So what am I missing to make the cURL request look like a real browser request and not be detected as a bot?
Or, if there is an API or library that could do it, please mention it.
Would Guzzle or similar fix this issue?
"So what am I missing to make the curl request like a real browser request"
My guess is they are using a simple cookie check. There are more sophisticated methods that allow recognizing automation such as cURL with a high degree of reliability, especially if coupled with lists of proxy IP addresses or IPs of known bangers.
Your first step is to intercept the outgoing browser request using pcap or something similar, then try and replicate it using cURL.
One other simple thing to check is whether your cookie jar has been seeded with some telltale. I routinely do that too, since most scripts on the Internet are just copy-pastes and don't pay much attention to these details.
The thing that would for sure make you bounce from any of my systems is that you're sending a referer, but you don't seem to actually have connected to the first page. You're practically saying "Well met again" to a server that is seeing you for the first time. You might have saved a cookie from that first encounter, and the cookie has now been invalidated (actually been marked "evil") by some other action. At least in the beginning, always replicate the visiting sequence from a clean slate.
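The clean-slate sequence described above could be sketched roughly as follows. This is not a recipe for getting past any particular site: buildVisitSequence() is a hypothetical helper, the URLs are placeholders, and the only point is that the first request carries no Referer and starts with an empty cookie jar, while the second one refers back to a page that was actually visited.

```php
<?php
// Hypothetical helper: build the cURL option sets for a two-step visit.
function buildVisitSequence(string $landingUrl, string $targetUrl, string $cookieFile): array
{
    $common = [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies from here
        CURLOPT_COOKIEJAR      => $cookieFile, // and write them back on close
    ];
    // The + operator keeps the integer option keys intact (array_merge would renumber them).
    return [
        // First contact: no Referer, because no earlier page exists yet.
        $common + [CURLOPT_URL => $landingUrl],
        // Second request: now a Referer is plausible.
        $common + [CURLOPT_URL => $targetUrl, CURLOPT_REFERER => $landingUrl],
    ];
}

// A fresh, empty cookie file, so nothing stale is replayed.
$cookieFile = tempnam(sys_get_temp_dir(), 'ck');
$steps = buildVisitSequence('https://www.example.com/', 'https://www.example.com/page', $cookieFile);
echo count($steps), " requests planned\n";

// Usage sketch (performs real requests, so it is left commented out):
// $ch = curl_init();
// foreach ($steps as $options) {
//     curl_setopt_array($ch, $options);
//     $body = curl_exec($ch);
// }
// curl_close($ch);
```

Comparing the verbose output of each step with the captured browser traffic then shows which headers and cookies still differ.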
You might try and adapt this answer, also cURL-based. Always verify actual traffic using a MitM SSL-decoding proxy.
Now, the real answer - what do you need that information for? Can you get it somewhere else? Can you ask for it explicitly, maybe reach an agreement with the source site?
I'm trying to make a payment in my Adyen test environment with cURL, but for some reason I keep getting a 401 Unauthorized response. I have checked the credentials of the Web Service User a dozen times and I'm sure they are correct. When I try the official Adyen PHP API library (https://github.com/Adyen/adyen-php-api-library) I get the same result. I have also tried creating a new Web Service User, but without success. Does anyone have an idea what I'm doing wrong?
The request code:
<?php
$request = array(
"merchantAccount" => "MyWebsite",
"amount" => array(
"currency" => "EUR",
"value" => "199"
),
"reference" => "TEST-PAYMENT-" . date("Y-m-d-H:i:s"),
"shopperIP" => "2.207.255.255",
"shopperReference" => "YourReference",
"billingAddress" => array(
"street" => "Simon Carmiggeltstraat",
"postalCode" => "1011DJ",
"city" => "Amsterdam",
"houseNumberOrName" => "6-60",
"stateOrProvince" => "NH",
"country" => "NL"
),
"card" => array(
"expiryMonth" => "08",
"expiryYear" => "2018",
"holderName" => "Test Card Holder",
"number" => "4111111111111111",
"cvc" => "737"
),
);
$json = json_encode($request);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://pal-test.adyen.com/pal/servlet/Payment/v25/authorise");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC );
curl_setopt($ch, CURLOPT_USERPWD, "xxxx:xxxx");
curl_setopt($ch, CURLOPT_POST, count($request));
curl_setopt($ch, CURLOPT_POSTFIELDS, $json);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER,array("Content-type: application/json"));
// things I tried
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.52 Safari/537.17');
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
$result = curl_exec($ch);
?>
The $result variable returns an empty string.
Response:
* Trying 91.212.42.153...
* Connected to pal-test.adyen.com (91.212.42.153) port 443 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
* subject: C=NL; ST=Noord-Holland; L=Amsterdam; O=Adyen B.V.; CN=*.adyen.com
* start date: Jun 14 00:00:00 2016 GMT
* expire date: Aug 13 23:59:59 2018 GMT
* issuer: C=US; O=thawte, Inc.; CN=thawte SSL CA - G2
* SSL certificate verify ok.
* Server auth using Basic with user 'xxxxx'
> POST /pal/servlet/Payment/v25/authorise HTTP/1.1
Host: pal-test.adyen.com
Authorization: Basic xxxxxx
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.52 Safari/537.17
Accept: */*
Content-type: application/json
Content-Length: 465
* upload completely sent off: 465 out of 465 bytes
< HTTP/1.1 401 Unauthorized
< Date: Mon, 02 Apr 2018 19:58:11 GMT
< Server: Apache
< Set-Cookie: JSESSIONID=47E667BF9B585DC3BDF40F8D58493E23.test103e; Path=/pal; Secure; HttpOnly
* Authentication problem. Ignoring this.
< WWW-Authenticate: BASIC realm="Adyen PAL Service Authentication"
< Content-Length: 0
< Content-Type: text/plain; charset=UTF-8
<
* Connection #0 to host pal-test.adyen.com left intact
A 401 means authentication failed: you are not using the correct combination of user and password.
You have the option to generate a password for general API usage or for POS payments. If you intend to use this API user for the general API, make sure you use "Generate Password" and not "Generate POS Password".
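One way to double-check which credentials were actually sent is to decode the Authorization header from the verbose log, since Basic authentication is just the base64 encoding of user:password. A small sketch, assuming a hypothetical helper decodeBasicAuth() and placeholder credentials (not a real account):

```php
<?php
// Hypothetical helper: split an "Authorization: Basic ..." value back into user and password.
function decodeBasicAuth(string $headerValue): ?array
{
    if (!preg_match('/^Basic\s+(\S+)$/i', trim($headerValue), $m)) {
        return null; // not a Basic credential
    }
    $decoded = base64_decode($m[1], true);
    if ($decoded === false || strpos($decoded, ':') === false) {
        return null; // not valid base64, or no user:password separator
    }
    // Only the first colon separates user from password; passwords may contain colons.
    [$user, $password] = explode(':', $decoded, 2);
    return ['user' => $user, 'password' => $password];
}

// Placeholder credentials built the same way cURL builds them from CURLOPT_USERPWD.
$sent = 'Basic ' . base64_encode('ws@Company.Example:s3cret:with:colons');
print_r(decodeBasicAuth($sent));
```

Stray whitespace or a truncated password from copy and paste shows up immediately in the decoded values.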
Status: 401 with errorCode: 000 is a classic error when one of the following may be incorrect:
Use correct merchantAccount Name NOT companyAccount Name
API Key - Preferably regenerate the API Key and use the copy button in the customer area to copy it
Environment set to LIVE/TEST
OK, it works now. The strange thing is that I didn't change anything. My guess is that Adyen was having trouble on their side. I'll give them a call the next time something similar happens.