file_get_contents (and wget) very slow - PHP

I'm using the Google text-to-speech API, but for some reason it's really slow when I connect to it via PHP or the command line.
I'm doing this:
$this->mp3data = file_get_contents("http://translate.google.com/translate_tts?tl=en&q={$text}");
Where $text is just a urlencoded string.
I've also tried it via wget on the command line (note the URL must be quoted, or the shell treats & as a background operator and drops the q parameter):
wget "http://translate.google.com/translate_tts?tl=en&q=test"
Either way takes about 20 seconds or more. Via PHP it does eventually get the contents and write them to a new file on my server as intended. Via wget the connection times out.
However, if I just go to that url in the browser, it's pretty much instant.
Could anyone shed any light on why this might be occurring?
Thanks.

It's due to how Google handles robots. You need to spoof the User-Agent header to pretend to be a regular browser.
Some info on how to go about this would be here:
https://duckduckgo.com/?q=php%20curl%20spoof%20user%20agent
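If you'd rather keep file_get_contents() than switch to cURL, the same User-Agent trick works through a stream context. A minimal sketch (the helper name and UA string below are only illustrative, not from the question):

```php
// Minimal sketch: send a browser-like User-Agent with file_get_contents()
// via a stream context, so no cURL extension is required.
// The UA string is just an example value.
function fetch_with_user_agent($url) {
    $context = stream_context_create([
        'http' => [
            'header' => "User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/115.0\r\n",
        ],
    ]);
    return file_get_contents($url, false, $context);
}
```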

Managed to sort this out now; this is what I ended up doing, and now it only takes a few seconds:
$header=array("Content-Type: audio/mpeg");
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $uri);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
$this->mp3data = curl_exec($ch);
curl_close($ch);

Related

Suddenly access denied: cURL in PHP

I used the following function to get access to an API (live working example)
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.halteverbotszonen.com/api/numbers');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
$output = curl_exec($ch);
$status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
For the past few days (I can't tell exactly when it started), the cURL call has been returning a 403 error. Accessing https://www.halteverbotszonen.com/api/numbers directly still works. I have not changed anything on either of the two servers. What could possibly cause this, and where could I look into it (are there any logs for this)?
I have a second API where the same thing happens (direct access works, but not via a cURL call).
It's the same hoster; could they have changed something that blocks incoming cURL calls?
Any hint appreciated.
- Maybe due to HTTPS vs. HTTP?
- Maybe a different configuration inside your Apache/PHP?
- Maybe the remote server banned your IP?
In most cases, when something works and then suddenly stops working one day, it's a software update problem (like a config file), a network problem (like a changed IP), or a remote problem (like the server itself). I guess :D
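To narrow down which of those it is, it helps to look at the full 403 response rather than just the status code. A hypothetical debugging sketch (the helper name is made up): keep the response headers in the output; a header added by a firewall or bot filter would point at the hoster rather than your code.

```php
// Hypothetical debugging helper: fetch a URL and return both the HTTP
// status and the raw response (headers included), so you can see what
// the server objects to when it answers 403.
function probe($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true);   // keep response headers in the output
    $raw = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return [$status, $raw];
}
```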

Using cURL in PHP on my http://localhost server works fine, but using cURL on https://localhost will not output any CSS

I am using cURL on my server to fetch www.yelp.com, but when the script runs under https://localhost, none of the CSS stylesheets are applied. I have tried:
CURLOPT_SSL_VERIFYPEER, FALSE
But the issue is not fetching the page; it's that my browser (Chrome) does not seem to apply any CSS formatting. Any ideas?
For example, running the code below from http://localhost gives a well-formatted page. Running the same code from https://localhost gives a page without CSS.
<?php
$url="http://www.yelp.com/";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch,CURLOPT_TIMEOUT,10);
$cl = curl_exec($ch);
curl_close($ch);
echo $cl;
exit;
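One plausible cause (an assumption, since the question doesn't show the response): when the proxy page itself is served over HTTPS, Chrome blocks the fetched page's plain http:// stylesheet links as mixed content. A sketch that rewrites those links to protocol-relative URLs before echoing (the helper name is made up):

```php
// Sketch: rewrite http:// asset references to protocol-relative ones so the
// browser fetches them over whichever scheme the hosting page uses,
// avoiding mixed-content blocking on https://localhost.
function make_assets_protocol_relative($html) {
    return str_replace(
        ['href="http://', 'src="http://'],
        ['href="//',      'src="//'],
        $html
    );
}
```

Then `echo make_assets_protocol_relative($cl);` instead of `echo $cl;`.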

There is something wrong with my cURL script

I'm using a cURL script to recreate the process of a user hitting send on a form. I would like everything to run in the background, but the form is never sent when this script executes.
$ch = curl_init();
curl_setopt($ch, CURLOPT_POST,1);
curl_setopt($ch, CURLOPT_POSTFIELDS,
"&shopId=".$ID."&encodedMessage=".urlencode($encodedMessage)."&signature=".$signature);
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_USERAGENT,
"Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1" );
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
$result = curl_exec($ch);
curl_close($ch);
Is it perhaps not executing?
Tested, working great.
Did you set the variable $url right?
cURL is not a good option to emulate a fork.
As soon as your current script stops, the cURL request will be killed, and the target URL will probably stop executing its own script as well.
You could use cron or a similar external process to run the cURL requests; it would run without interruption, and it would also make it easy to queue up parallel requests.
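If cron is overkill, a rough "fire and forget" compromise is a sub-second timeout, so the calling page returns quickly. Note the remote script can still be cut off mid-run unless it calls ignore_user_abort(true) on its side. A sketch (the helper name is illustrative):

```php
// "Fire and forget" sketch: POST the fields but only wait 200 ms for a reply.
// CURLOPT_NOSIGNAL is required for sub-second timeouts on most builds.
function fire_and_forget($url, array $fields) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_NOSIGNAL, true);
    curl_setopt($ch, CURLOPT_TIMEOUT_MS, 200);
    curl_exec($ch);   // the (likely timed-out) result is deliberately ignored
    curl_close($ch);
}
```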

PHP cURl: Can I check if my user agent is working?

I'm writing a cURL script, but how can I check that the user agent is set and passed properly when it visits the website?
$ckfile = '/tmp/cookies.txt';
$useragent= "Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_0_1 like Mac OS X; en-us) AppleWebKit/528.18 (KHTML, like Gecko) Mobile/7A400";
$ch = curl_init ("http://website.com");
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent); // set user agent
curl_setopt ($ch, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
$output = curl_exec ($ch);
curl_close($ch);
Just make a PHP page like this on your own server and point your script at your own URL:
var_dump($_SERVER);
and check the HTTP_USER_AGENT string.
You can also achieve the same things by looking at the Apache logs.
But I am pretty sure curl is setting the User-Agent string like it should ;-)
You'll find the Firefox extension LiveHTTPHeaders helpful for seeing exactly what happens to the headers during a normal browsing session:
http://livehttpheaders.mozdev.org/
This will improve your understanding of how your target server responds, and it even shows whether the server redirects your request internally.
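cURL itself can also report the request headers it actually sent, which answers the question without any helper page. A sketch (the function name is made up):

```php
// Sketch: record the outgoing request headers with CURLINFO_HEADER_OUT and
// read them back after the transfer -- the User-Agent line appears there.
function sent_headers($url, $userAgent) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLINFO_HEADER_OUT, true);   // ask cURL to keep what it sends
    curl_exec($ch);
    $out = curl_getinfo($ch, CURLINFO_HEADER_OUT);
    curl_close($ch);
    return $out;
}
```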

How do I download files of big sizes from somewhere on the web to the web server with PHP?

How do I download files of big sizes from somewhere on the web to the web server with PHP? Also, what should be allowed on the server in order to make this happen? Thanks.
Could this do a good job?
<?php
ini_set('max_execution_time', 0);
$the_link = $_GET['url'];
$ch = curl_init($the_link);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322;)");
curl_setopt($ch, CURLOPT_COOKIEJAR, '/tmp/cookies.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, '/tmp/cookies.txt');
$the_file = curl_exec($ch);
curl_close($ch);
$hdl = fopen("file", 'w');
fwrite($hdl, $the_file);
fclose($hdl);
?>
Use cURL for that. PHP must have cURL support, and you obviously need a writable filesystem.
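For genuinely large files, buffering the whole body in a string (as the snippet above does) can exhaust memory. A sketch that streams straight to disk with CURLOPT_FILE instead; the function name and paths are illustrative, and in real use the URL coming from $_GET should be validated first:

```php
// Sketch: stream a download directly to disk so memory use stays flat
// regardless of file size. Returns true on success, false on a cURL error.
function download_to($url, $destPath) {
    $fp = fopen($destPath, 'w');
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);             // write body straight to $fp
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);
    return $ok !== false;
}
```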
