Reduce Bandwidth usage in cURL PHP - php

I'm using basic cURL requests to fetch webpages in PHP, but these pages are large and my bandwidth is limited.
Is there a way to reduce cURL's data usage, for example by using compression? I have also heard that Brotli compresses best, but I'm not sure how to use it.

$headers[] = "Accept-Encoding: gzip"; // tell the server you accept gzip
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // make curl_exec() return the body instead of printing it
curl_setopt($ch,CURLOPT_ENCODING , "gzip"); // tells curl to gunzip it automatically
$data = curl_exec($ch);
I have not tried this with Brotli; support depends on your libcurl version and build options (libcurl gained Brotli support in 7.57.0), and you didn't tell us which version you run.
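Whether Brotli is usable can be checked at runtime. The following is only a sketch: curlHasBrotli() is a hypothetical helper name, and it assumes a PHP build where the curl extension defines CURL_VERSION_BROTLI (older builds do not, in which case the helper reports false).

```php
<?php
// Hypothetical helper: does this libcurl build advertise Brotli support?
function curlHasBrotli(int $features): bool
{
    // CURL_VERSION_BROTLI is only defined on newer PHP/libcurl combinations.
    if (!defined('CURL_VERSION_BROTLI')) {
        return false;
    }
    return ($features & CURL_VERSION_BROTLI) !== 0;
}

if (function_exists('curl_version')) {
    $info = curl_version();
    echo curlHasBrotli($info['features']) ? "brotli available\n" : "no brotli\n";
}
```

If you would rather not check at all, setting CURLOPT_ENCODING to an empty string lets curl offer every encoding it was built with.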

Related

Copy vs Curl to save external file at my server

Which way is fastest to save an external file to my server? Why, and how?
Using Curl :
$fp = fopen($local_file, 'w+');
$ch = curl_init($remote_file);
curl_setopt($ch, CURLOPT_TIMEOUT, 50);
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_ENCODING, "");
curl_exec($ch);
curl_close($ch);
fclose($fp);
Using Copy:
copy($extFile, "report.csv");
It mostly depends on the protocol (for a local file, copy() would be faster), but since you say "remote file", curl will probably be faster. You're using CURLOPT_ENCODING and CURLOPT_FOLLOWLOCATION, so I guess the file is transferred over HTTP, where curl is generally much faster than copy() for at least two reasons:
1: PHP's fopen HTTP wrappers don't use compression, but by setting CURLOPT_ENCODING to an empty string you tell curl to use compression if possible. (It depends on how libcurl was compiled, but gzip and deflate support are usually compiled in.)
2: copy() keeps reading from the socket until the remote server closes the connection, which may be much later than when the file has been completely downloaded. curl, by contrast, reads only as many bytes as the Content-Length HTTP header announces and then closes the connection itself, which is often much faster than stalling on read() until the remote server closes the connection.
But the only way to know for sure is, of course, to try it and see:
$starttime=microtime(true);
$fp = fopen($local_file, 'w+');
$ch = curl_init($remote_file);
curl_setopt($ch, CURLOPT_TIMEOUT, 50);
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_ENCODING, "");
curl_exec($ch);
curl_close($ch);
fclose($fp);
echo "used ".(microtime(true)-$starttime)." seconds.\n";
vs
$starttime=microtime(true);
copy($extFile, "report.csv");
echo "used ".(microtime(true)-$starttime)." seconds.\n";
This gives you roughly microsecond precision (IEEE 754 double-precision floating point probably distorts it slightly, but not enough to matter).
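To avoid repeating the timing lines for each variant, the measurement can be wrapped in a small function. This is a minimal sketch; timeIt() is a hypothetical name, and the usleep() call is only a stand-in for the curl or copy() download being measured.

```php
<?php
// Hypothetical helper: run a callable and return the elapsed wall-clock seconds.
function timeIt(callable $task): float
{
    $start = microtime(true);
    $task();
    return microtime(true) - $start;
}

// Stand-in workload; replace the closure body with the download under test.
$elapsed = timeIt(function () {
    usleep(1000);
});
printf("used %.6f seconds.\n", $elapsed);
```

Running each variant several times and comparing the averages smooths out network jitter.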

PHP cURL wrong content-length provided by the source

I'm downloading an image using cURL:
https://cdni.rt.com/deutsch/images/2018.04/article/5ac34e500d0403503d8b4568.jpg
When I save this image manually from the browser to my local PC, the system shows its size as 139,880 bytes.
When I download it using cURL, the file seems to be damaged and is not recognized as a valid image.
Its size when downloaded using cURL is 139,845 bytes, which is smaller than the manually saved file.
Digging further, I found that the server returns this content length in the response headers:
content-length: 139845
This length is identical to what cURL downloaded, so I suspect that cURL ends the transfer once it reaches the (possibly wrong) length claimed by the server.
Is there any way to make cURL download the complete file even if the Content-Length header is wrong?
Used code:
//curl ini
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER,0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_TIMEOUT,20);
curl_setopt($ch, CURLOPT_REFERER, 'http://www.bing.com/');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.8');
curl_setopt($ch, CURLOPT_MAXREDIRS, 5); // Good leeway for redirections.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // Many login forms redirect at least once.
curl_setopt($ch, CURLOPT_COOKIEJAR , "cookie.txt");
//curl get
$x='error';
$url='https://cdni.rt.com/deutsch/images/2018.04/article/5ac34e500d0403503d8b4568.jpg';
curl_setopt($ch, CURLOPT_HTTPGET, 1);
curl_setopt($ch, CURLOPT_URL, trim($url));
$exec=curl_exec($ch);
$x=curl_error($ch);
$fp = fopen('test.jpg','x');
fwrite($fp, $exec);
fclose($fp);
The server has a buggy implementation of the Accept-Encoding compressed-transfer mechanism.
The response is ALWAYS gzip-compressed, but the server only tells the client so when the request carries an Accept-Encoding: gzip header. When the server doesn't say that the body is gzipped, the client won't decompress it before saving it, hence your corrupted download. Tell curl to offer gzip compression by setting CURLOPT_ENCODING:
curl_setopt($ch,CURLOPT_ENCODING,'gzip');
The server will then tell curl that the response is gzip-compressed, and curl will decompress it for you before handing it to PHP.
You should probably tell the server admin about this; it's a serious bug in their web server, and it corrupts downloads.
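For files that were already saved in this undeclared-gzip state, a fallback is to look for the gzip magic bytes and decompress by hand. This is a sketch under assumptions: maybeGunzip() is a hypothetical helper, PHP's zlib extension must be available, and the demo uses locally generated data rather than the real image.

```php
<?php
// Hypothetical helper: gunzip a body only if it starts with the gzip magic bytes.
function maybeGunzip(string $body): string
{
    // Every gzip stream begins with the two bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) !== 0) {
        return $body;
    }
    // gzdecode() returns false (and warns) on corrupt input; keep the raw bytes then.
    $decoded = @gzdecode($body);
    return $decoded === false ? $body : $decoded;
}

// Demo with locally compressed data instead of a real download.
$raw = gzencode('fake image bytes');
echo strlen($raw), ' bytes compressed, ', strlen(maybeGunzip($raw)), " bytes after decoding\n";
```

Setting CURLOPT_ENCODING as described above is still the cleaner fix, because curl then handles the decoding itself.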
libcurl has an option for that called CURLOPT_IGNORE_CONTENT_LENGTH. Unfortunately it is not natively supported in PHP, but you can trick PHP into setting the option anyway by using the correct magic number (which, at least on my system, is 136):
if(!defined('CURLOPT_IGNORE_CONTENT_LENGTH')){
define('CURLOPT_IGNORE_CONTENT_LENGTH',136);
}
if(!curl_setopt($ch,CURLOPT_IGNORE_CONTENT_LENGTH,1)){
throw new \RuntimeException('failed to set CURLOPT_IGNORE_CONTENT_LENGTH! - '.curl_errno($ch).': '.curl_error($ch));
}
You can find the correct number for your system by compiling and running the following C++ code:
#include <iostream>
#include <curl/curl.h>
int main(){
std::cout << CURLOPT_IGNORE_CONTENT_LENGTH << std::endl;
}
But it's probably 136.
Lastly, a pro tip: file_get_contents() ignores the Content-Length header altogether and just keeps downloading until the server closes the connection (which is potentially much slower than curl). Also, you should probably contact the server operator and let them know that something is wrong with the server.

send multiple .crt certificate with curl in php

I want to send two .crt certificates with curl in php.
I am using this code.
$firstcalldata = "csv file data";
$target_url = "www.example.com";
$ch = curl_init($target_url);
curl_setopt($ch, CURLOPT_VERBOSE, '1');
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_CAINFO,"C:/Users/admin/Desktop/CERT/PSCERT.pem");
curl_setopt($ch, CURLOPT_CAINFO, getcwd() . "C:/Users/admin/Desktop/CERT/PSCERT-C.crt");
curl_setopt($ch,CURLOPT_POST, 1);
curl_setopt($ch,CURLOPT_POSTFIELDS, $firstcalldata);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/csv'));
$result = curl_exec($ch);
echo $result;
curl_close($ch);
I am also sending a CSV file to $target_url.
But every time I get 403 - Forbidden: Access is denied.
You're not really saying what you want to accomplish; you're just asking a very weird question with no answer. Your example doesn't send the .crt at all; it uses it to verify the server's certificate. And CAINFO is meant to point to a file holding a full CA bundle (like one of those you can get from curl's caextract page).
But you disable both VERIFYPEER and VERIFYHOST, so you don't actually need any CA cert!?
What exactly do you want to do? Note also that getting a 403 back means you are already communicating fine over the TLS layer.
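If the goal turns out to be authenticating to the server with a client certificate (an assumption; the question does not say), the relevant options are CURLOPT_SSLCERT and CURLOPT_SSLKEY rather than CURLOPT_CAINFO. The sketch below uses a hypothetical helper, clientCertOptions(), and placeholder file paths; libcurl presents one client certificate per connection.

```php
<?php
// Hypothetical helper: options for presenting ONE client certificate while
// still verifying the server against a CA bundle.
function clientCertOptions(string $certPath, string $keyPath, string $caBundlePath): array
{
    return [
        CURLOPT_SSLCERT        => $certPath,     // certificate sent to the server
        CURLOPT_SSLKEY         => $keyPath,      // private key for that certificate
        CURLOPT_CAINFO         => $caBundlePath, // bundle used to verify the server
        CURLOPT_SSL_VERIFYPEER => true,
        CURLOPT_SSL_VERIFYHOST => 2,
    ];
}

// All three paths are placeholders, not files from the question.
$options = clientCertOptions('/path/to/client.pem', '/path/to/client.key', '/path/to/cacert.pem');
$ch = curl_init('https://www.example.com/');
curl_setopt_array($ch, $options);
// curl_exec($ch) would perform the request; omitted because the paths are placeholders.
```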

PHP cURL sending to port 8080

Today I am trying to make a curl call to a site that listens on port 8080. However, calls like this are sent to port 80 instead of 8080:
$ch = curl_init();
curl_setopt($ch, CURLOPT_PORT, 8080);
curl_setopt($ch, CURLOPT_URL, 'http://somesite.tld:8080');
curl_setopt($ch, CURLOPT_POST, count($data));
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$target_response = curl_exec($ch);
curl_close($ch);
Am I missing something here?
Just a note, I've had this issue before because the web host I was using was blocking outbound ports aside from the standard 80 and 443 (HTTPS). So you might want to ask them or do general tests. In fact, some hosts often even block outbound on port 80 for security.
Simple cURL GET request (I also added JSON headers, in case you need them):
<?php
$chTest = curl_init();
curl_setopt($chTest, CURLOPT_URL, "http://example.com:8080/?query=Hello+World");
curl_setopt($chTest, CURLOPT_HTTPHEADER, array('Content-Type: application/json; charset=utf-8', 'Accept: application/json'));
curl_setopt($chTest, CURLOPT_HEADER, true);
curl_setopt($chTest, CURLOPT_RETURNTRANSFER, true);
$curl_res_test = curl_exec($chTest);
$json_res_test = explode("\r\n", $curl_res_test);
$codeTest = curl_getinfo($chTest, CURLINFO_HTTP_CODE);
curl_close($chTest);
?>
Example POST request:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://example.com:8080');
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json; charset=utf-8', 'Accept: application/json'));
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, '{"Hello" : "World", "Foo": "World"}');
// Set timeout to close connection. Just for not waiting long.
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$curl_res = curl_exec($ch);
?>
The charset parameter is not required, and neither are the other headers; the important one is Content-Type.
For the examples I used the JSON MIME type; you may use whatever you want. Take a look at this link:
http://en.wikipedia.org/wiki/Internet_media_type
Make sure that the php_curl.dll extension is enabled in your PHP installation, and also that the ports are open on the target server.
Hope this helps.
Have you tried to SSH into the server that runs this and check the iptables rules, and/or tried connecting to the target server via telnet?
telnet somesite.tld 8080
If nothing else, it will help troubleshoot your problem and rule out network issues.
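When there is no shell access, roughly the same reachability check can be made from PHP itself. This is a minimal sketch; canConnect() is a hypothetical helper built on fsockopen(), and the demo probes 127.0.0.1 as a stand-in for the real target host.

```php
<?php
// Hypothetical helper: true if a TCP connection to $host:$port opens within $timeout seconds.
function canConnect(string $host, int $port, float $timeout = 3.0): bool
{
    $errno = 0;
    $errstr = '';
    // @ suppresses the warning fsockopen() raises on failure; the result is reported via the return value.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

// Replace 127.0.0.1 with the host you are actually trying to reach.
var_dump(canConnect('127.0.0.1', 8080, 1.0));
```

A false result from the web server combined with a successful telnet from your own machine points at an outbound firewall rule on the host.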
First check that you can connect using something other than PHP's horrible cURL syntax:
Chrome's Postman is easy to use if you're on your local machine;
otherwise (for Linux users) try
curl somesite:8080
wget -qO- somesite:8080
Once you've established that you can connect, you can go about the horrible business of using PHP cURL. There are wrappers, but I've found them flaky.
Both curl and wget can very easily be configured to use GET and POST methods. It's not advisable, but for more than one complicated operation I've given up trying to configure PHP's library correctly and simply dropped to the command line.
THERE ARE SECURITY IMPLICATIONS. You need to take great care to ensure that anything you give it, particularly if it comes from a form or an external source, is appropriately escaped.
<?
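// Strip out every non-alphanumeric character for security.
// The pattern needs delimiters: without them, '[' and ']' are treated as the delimiters and nothing is stripped.
$stringToSend = preg_replace('/[^a-zA-Z0-9]/', '', $stringToSend);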
$return = shell_exec("curl -X POST -d $stringToSend http://example.com/path/to/resource");
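Stripping characters changes the payload, so an alternative is to quote each argument with escapeshellarg() before it reaches the shell. The sketch below uses a hypothetical helper, buildCurlCommand(), and only prints the command instead of running it; the quoting shown assumes a POSIX shell.

```php
<?php
// Hypothetical helper: build a curl command line with every argument shell-quoted.
function buildCurlCommand(string $payload, string $url): string
{
    // escapeshellarg() wraps the value in quotes so the shell sees exactly one argument.
    return 'curl -X POST -d ' . escapeshellarg($payload) . ' ' . escapeshellarg($url);
}

// The payload below would be dangerous if interpolated unquoted.
$command = buildCurlCommand('name=x; rm -rf /tmp/demo', 'http://example.com/path/to/resource');
echo $command, "\n";
// $return = shell_exec($command); // left commented out in this sketch
```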
Your web server is misconfigured. The code you provided works for me.
Also, your code can be simpler. Just put the URI into the init call and drop the CURLOPT_PORT and CURLOPT_URL lines:
$ch = curl_init('http://somesite.tld:8080');
Are you sure that the server you intend to reach isn't firewalled? And that SELinux is disabled?
Maybe you can use
--local-port [-num]
Set a preferred number or range of local port numbers to use for the connection(s). Note that port numbers by nature are a scarce resource that will be busy at times so setting this range to something too narrow might cause unnecessary connection setup failures. (Added in 7.15.2)
Source: http://curl.haxx.se/docs/manpage.html#--ftp-pasv
// Use PHP cURL to make a request to port 8080 on a server
$ch = curl_init();
curl_setopt($ch, CURLOPT_PORT, 8080);
curl_setopt($ch, CURLOPT_URL, 'http://somesite.tld:8080');
curl_setopt($ch, CURLOPT_POST, count($data));
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$target_response = curl_exec($ch);
curl_close($ch);
try this option for curl
curl_setopt($ch, CURLOPT_PROXY,"localhost:8080");
or use this
curl_setopt($ch, CURLOPT_PROXY,"yourservername:8080");
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "your-domain-name");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
?>

PHP, curl, and raw headers

When using the PHP curl functions, is there any way to see the exact raw headers that curl is sending to the server?
You can use curl_getinfo:
Before the call
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
After
$headers = curl_getinfo($ch, CURLINFO_HEADER_OUT);
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
curl_exec($ch);
var_dump(curl_getinfo($ch,CURLINFO_HEADER_OUT));
?>
Only available as of PHP 5.1.3.
http://php.net/manual/en/function.curl-getinfo.php
You can verify that they are the same by using your console and hitting
curl http://example.com/ -I
or
curl --trace-ascii /file.txt http://example.com/
AFAIK, the PHP/CURL binding still lacks proper support for CURLOPT_DEBUGFUNCTION which is a callback from libcurl that can provide all those details.
That's the primary reason why I recommend people to work out HTTP scripting things with the curl command line tool and its --trace-ascii option FIRST, then translate that into a PHP function.
Be sure to set the CURLINFO_HEADER_OUT option before executing the request and calling curl_getinfo:
curl_setopt($c, CURLINFO_HEADER_OUT, true);
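The value returned for CURLINFO_HEADER_OUT is a single raw string. If you want to inspect individual headers, it can be split into parts. This is a sketch: parseRawHeaders() is a hypothetical helper, and the sample string is hand-written to resemble a request rather than captured from a real one.

```php
<?php
// Hypothetical helper: split a raw request-header string into the request line
// and a name => value map. Repeated header names keep the last value.
function parseRawHeaders(string $raw): array
{
    $lines = preg_split('/\r\n|\n/', trim($raw));
    $requestLine = array_shift($lines);
    $headers = [];
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip blank or malformed lines
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[trim($name)] = trim($value);
    }
    return ['request_line' => $requestLine, 'headers' => $headers];
}

// Hand-written sample shaped like the string curl_getinfo($ch, CURLINFO_HEADER_OUT) returns.
$sample = "GET / HTTP/1.1\r\nHost: www.example.com\r\nAccept: */*\r\n\r\n";
print_r(parseRawHeaders($sample));
```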
