Unable to download remote mp3 file to server using PHP - php

I am trying to download a pronunciation file (approx. 8 KB) to my server using server-side PHP. Taking a cue from a number of threads discussing this issue, I tried the following:
$numwrd = str_word_count($wrd);
if($numwrd == 1){
$html = file_get_html("http://www.dictionaryapi.com/api/v1/references/spanish/xml/" . rawurlencode($wrd) . "?key=" . rawurlencode('6d4d41f9-c28f-4544-9bb3-1b4708d1a4d1'));
$sn = $html->find('sound');
if($sn[0] != ""){
$foldername = findsub($sn[0]->plaintext);
$filename = explode(".", $sn[0], 2)[0];
$audiofn = $foldername . $filename . '.mp3';
$soundurl = 'http://media.merriam-webster.com/audio/prons/es/me/mp3/' . $foldername . '/' . $filename . '.mp3';
$path = 'amit.mp3';
$headers = getHeaders($soundurl);
if ($headers['http_code'] === 200 and $headers['download_content_length'] < 1024*1024) {
if (download($url, $path)){
return $audiofn . " " . $soundurl;
}
}
}
else { return "not found"; }
}
else { return "not found"; }
function getHeaders($url)
{
$ch = curl_init($url);
curl_setopt( $ch, CURLOPT_NOBODY, true );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, false );
curl_setopt( $ch, CURLOPT_HEADER, false );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_MAXREDIRS, 3 );
curl_exec( $ch );
$headers = curl_getinfo( $ch );
curl_close( $ch );
return $headers;
}
function download($url, $path)
{
# open file to write
$fp = fopen ($path, 'w+');
# start curl
$ch = curl_init();
curl_setopt( $ch, CURLOPT_URL, $url );
# set return transfer to false
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, false );
curl_setopt( $ch, CURLOPT_BINARYTRANSFER, true );
curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, false );
# increase timeout to download big file
curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, 10 );
# write data to local file
curl_setopt( $ch, CURLOPT_FILE, $fp );
# execute curl
curl_exec( $ch );
# close curl
curl_close( $ch );
# close local file
fclose( $fp );
if (filesize($path) > 0) return true;
}
This didn't work, so I tried again with file_get_contents. That method creates the file, but with zero bytes. The values in $foldername, $filename, $audiofn, and $soundurl are evaluating correctly, and all these variables have been tested. I can download the file manually by browsing to the URL, right-clicking in the browser, and choosing "download file as...". What could be wrecking my PHP?
P.S.: I just tried a modified function using cURL and this failed too:
function down($url, $target){//feeding it $soundurl and $path values
set_time_limit(0);
$file = fopen(dirname(__FILE__) . $target, 'w+');
$curl = curl_init($url);
curl_setopt_array($curl, [
CURLOPT_URL => $url,
CURLOPT_BINARYTRANSFER => 1,
CURLOPT_RETURNTRANSFER => 1,
CURLOPT_FILE => $file,
CURLOPT_TIMEOUT => 50,
CURLOPT_USERAGENT => 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)'
]);
$response = curl_exec($curl);
if($response === false) {
throw new \Exception('Curl error: ' . curl_error($curl));
}
$response;
}

Finally got it to work! This is what fixed it (line 7):
$filename = explode(".", $sn[0]->plaintext, 2)[0];
I had to add ->plaintext because without it, the value returned to $filename was an XML tag instead of the text inside that tag. Since the following line of code takes this value as input, the URL being requested for the download was corrupted:
$soundurl = 'http://media.merriam-webster.com/audio/prons/es/me/mp3/' . $foldername . '/' . $filename . '.mp3';
Now the file downloads successfully because the URL is being formed correctly.
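To illustrate the difference, here is a minimal sketch with a made-up element string. The real markup returned by the API may look different, and sound_base_name is a hypothetical helper, not part of the original code. It shows how taking the part before the first dot behaves on raw markup versus on the tag's text:

```php
<?php
// Hypothetical helper: everything before the first dot, e.g. "casa01.wav" -> "casa01".
function sound_base_name(string $value): string
{
    return explode('.', trim($value), 2)[0];
}

// Assumed sample markup, only for illustration.
$markup = '<sound>casa01.wav</sound>';
$text   = strip_tags($markup);           // roughly what ->plaintext provides

$fromMarkup = sound_base_name($markup);  // still carries the opening tag
$fromText   = sound_base_name($text);    // clean base name

echo $fromMarkup, "\n", $fromText, "\n";
```

Printing the final URL before requesting it is a cheap way to catch this kind of problem early.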

Related

PHP's fopen() can't get url, but curl can

A while back, I wrote a little utility function that takes inPath and outPath, opens both and copies from one to the other using fread() and fwrite(). allow_url_fopen is enabled.
Well, I've got a url that I'm trying to get the contents of, and fopen() doesn't get any data, but if I use curl to do the same, it works.
The url in question is: http://www.deltagroup.com/Feeds/images.php?lid=116582497&id=1
fopen version:
$in = @fopen( $inPath, "rb" );
$out = @fopen( $outPath, "wb" );
if( !$in || !$out )
{
echo 0;
}
while( $chunk = fread( $in, 8192 ) )
{
fwrite( $out, $chunk, 8192 );
}
fclose( $in );
fclose( $out );
if( file_exists($outPath) )
{
echo 1;
}
else
{
echo 0;
}
curl version:
$opt = "curl -o " . $outPath . " " . $inPath;
$res = `$opt`;
if( file_exists($outPath) )
{
echo 1;
}
else
{
echo 0;
}
Any idea why this would happen?
Even using PHP's cURL functions, I was unable to download the file until I added a CURLOPT_USERAGENT string. Nothing in the response indicated that it was required (no errors, nothing other than an HTTP 200).
Final code:
$out = @fopen( $outPath, "wb" );
if( !$out )
{
echo 0;
}
$ch = curl_init();
curl_setopt( $ch, CURLOPT_URL, $inPath );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt( $ch, CURLOPT_FILE, $out );
curl_setopt( $ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13');
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, 15 );
curl_setopt( $ch, CURLOPT_TIMEOUT, 18000 );
$data = curl_exec( $ch );
curl_close( $ch );
fclose( $out );
if( file_exists($outPath) )
{
echo 1;
}
else
{
echo 0;
}
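If the server only responds to clients that send a User-Agent header, the fopen() approach can probably be kept as well, because PHP's HTTP stream wrapper accepts a user_agent option through a stream context. A minimal sketch, assuming the missing header was the only problem; build_http_context and the agent string are made up for illustration:

```php
<?php
// Hypothetical helper: build a stream context that sends a User-Agent header.
function build_http_context(string $userAgent, float $timeoutSeconds = 15.0)
{
    return stream_context_create([
        'http' => [
            'user_agent'      => $userAgent,       // sent as the User-Agent request header
            'follow_location' => 1,                // follow redirects
            'timeout'         => $timeoutSeconds,  // read timeout in seconds
        ],
    ]);
}

$context = build_http_context('Mozilla/5.0 (compatible; ExampleFetcher/1.0)');

// Usage (needs allow_url_fopen and network access, so it is left commented out):
// $in = fopen($inPath, 'rb', false, $context);
```

The rest of the fread()/fwrite() loop can stay unchanged, since only the way the input stream is opened differs.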

How do I get youtube video direct link file size info using php curl?

I want to display the size of a YouTube video given its direct link. I have a link to an mp4 file that is 98 MB in size; browsing to the link downloads the file. I want to display that size without downloading it.
Direct URL:
$url ='https://r4---sn-a8au-p5qs.googlevideo.com/videoplayback?signature=9216B94F5EA16F023DE3D34C6881F8AFA20E1EBA.4B56920C4B2BAB848AC847E0EB4C1E01FB01685C&itag=22&ratebypass=yes&expire=1424904527&id=o-AA8OyXA6gxAGHzkwqf94rIO0LTrhD8iHuUc9lMI9ED76&pl=46&fexp=905657%2C907263%2C916942%2C923382%2C927622%2C934601%2C934954%2C9406984%2C943917%2C947225%2C947240%2C947601%2C948124%2C951703%2C952302%2C952605%2C952612%2C952901%2C955301%2C957201%2C959701&mm=31&ipbits=0&dur=1366.192&mt=1424882845&ms=au&key=yt5&upn=SIiUMnjm5o0&source=youtube&sparams=dur%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Cmm%2Cms%2Cmv%2Cpl%2Cratebypass%2Crequiressl%2Csource%2Cupn%2Cexpire&mv=m&initcwndbps=1567500&ip=2001%3A4802%3A7805%3A101%3Abe76%3A4eff%3Afe20%3A31e4&sver=3&requiressl=yes&title=PHP%3A+Create+Your+Own+MVC+%28Part+1%29';
The PHP cURL code I am using to get the size:
function retrieve_remote_file_size($url){
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, TRUE);
curl_setopt($ch, CURLOPT_NOBODY, TRUE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$data = curl_exec($ch);
$size = curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD);
curl_close($ch);
return $size;
}
It is giving me 1419. How do I get the correct file size?
getID3 supports video formats. See: http://getid3.sourceforge.net/
$getID3 = new getID3;
$file = $getID3->analyze($filename);
echo("Filesize: ".$file['filesize']." bytes<br />");
Note: You must include the getID3 classes before this will work! See the above link.
If you have the ability to modify the PHP installation on your server, a PHP extension for this purpose is ffmpeg-php. See: http://ffmpeg-php.sourceforge.net/
edit:
Found something about this here:
Here's the best way (that I've found) to get the size of a remote file. Note that HEAD requests don't get the actual body of the request, they just retrieve the headers. So making a HEAD request to a resource that is 100MB will take the same amount of time as a HEAD request to a resource that is 1KB.
<?php
function curl_get_file_size( $url ) {
// Assume failure.
$result = -1;
$curl = curl_init( $url );
// Issue a HEAD request and follow any redirects.
curl_setopt( $curl, CURLOPT_NOBODY, true );
curl_setopt( $curl, CURLOPT_HEADER, true );
curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $curl, CURLOPT_USERAGENT, get_user_agent_string() );
$data = curl_exec( $curl );
curl_close( $curl );
if( $data ) {
$content_length = "unknown";
$status = "unknown";
if( preg_match( "/^HTTP\/1\.[01] (\d\d\d)/", $data, $matches ) ) {
$status = (int)$matches[1];
}
if( preg_match( "/Content-Length: (\d+)/", $data, $matches ) ) {
$content_length = (int)$matches[1];
}
// http://en.wikipedia.org/wiki/List_of_HTTP_status_codes
if( $status == 200 || ($status > 300 && $status <= 308) ) {
$result = $content_length;
}
}
return $result;
}
?>
Usage:
$file_size = curl_get_file_size( "http://stackoverflow.com/questions/2602612/php-remote-file-size-without-downloading-file" );
The small size you get is the length of the redirect response, not of the video. Add the following cURL option:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, True);
I expected it to be enabled by default, but it wasn't for me; in PHP's cURL it is off unless you set it explicitly.
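With CURLOPT_FOLLOWLOCATION and CURLOPT_HEADER both enabled, the string returned by curl_exec() contains one header block per response in the redirect chain, so a regex that takes the first Content-Length can still pick up the redirect's length. A minimal sketch of reading the last block instead; final_content_length is a hypothetical helper and the header text is invented sample data:

```php
<?php
// Hypothetical helper: Content-Length of the last response in a redirect chain, or -1.
function final_content_length(string $rawHeaders): int
{
    // Header blocks are separated by a blank line; the last one belongs to the final response.
    $blocks = preg_split('/\r?\n\r?\n/', trim($rawHeaders));
    $last   = end($blocks);

    if (preg_match('/^Content-Length:\s*(\d+)/mi', $last, $matches)) {
        return (int) $matches[1];
    }
    return -1;
}

// Invented sample: a redirect followed by the real file.
$sample = "HTTP/1.1 302 Found\r\nLocation: http://example.com/b\r\nContent-Length: 1419\r\n\r\n"
        . "HTTP/1.1 200 OK\r\nContent-Type: video/mp4\r\nContent-Length: 102760448\r\n\r\n";

echo final_content_length($sample), "\n";
```

Alternatively, curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD) reports the length for the last transfer once redirects are followed.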

Get remote file size from HTTPS url

Until now I have used the following function to get the size of files from a URL. It works perfectly well if the URL is http but fails when it is https. Can anyone update it to work with https?
<?php
/**
* Returns the size of a file without downloading it, or -1 if the file
* size could not be determined.
*
* @param $url - The location of the remote file to download. Cannot
* be null or empty.
*
* @return The size of the file referenced by $url, or -1 if the size
* could not be determined.
*/
function curl_get_file_size( $url ) {
// Assume failure.
$result = -1;
$curl = curl_init( $url );
// Issue a HEAD request and follow any redirects.
curl_setopt( $curl, CURLOPT_NOBODY, true );
curl_setopt( $curl, CURLOPT_HEADER, true );
curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $curl, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT'] );
$data = curl_exec( $curl );
curl_close( $curl );
if( $data ) {
$content_length = "unknown";
$status = "unknown";
if( preg_match( "/^HTTP\/1\.[01] (\d\d\d)/", $data, $matches ) ) {
$status = (int)$matches[1];
}
if( preg_match( "/Content-Length: (\d+)/", $data, $matches ) ) {
$content_length = (int)$matches[1];
}
// http://en.wikipedia.org/wiki/List_of_HTTP_status_codes
if( $status == 200 || ($status > 300 && $status <= 308) ) {
$result = $content_length;
}
}
return $result;
}
?>
This is not my code btw and the full credit goes to NebuSoft
Source:
PHP: Remote file size without downloading file
Regards
You need to add the below cURL parameter to your existing set.
curl_setopt( $curl, CURLOPT_SSL_VERIFYPEER, 0);
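Adding this stops cURL from verifying the peer's certificate, so the HTTPS request goes through even when the certificate chain cannot be validated locally. Note that it also removes the protection against man-in-the-middle attacks.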
Your code does not set cURL's certificate verification options, which is what matters for the HTTPS connection.
Just add this to your parameters:
curl_setopt( $curl, CURLOPT_SSL_VERIFYPEER, false);
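Disabling verification is the quick fix. An alternative, sketched here under the assumption that the failure comes from cURL not finding a CA bundle, is to keep verification on and point cURL at one. The bundle path and the function name https_head_options are placeholders, not something from the original code:

```php
<?php
// Hypothetical helper: options for a HEAD request over HTTPS with verification enabled.
function https_head_options(string $caBundlePath): array
{
    return [
        CURLOPT_NOBODY         => true,          // HEAD request, no body
        CURLOPT_RETURNTRANSFER => true,          // do not echo the response
        CURLOPT_FOLLOWLOCATION => true,          // follow redirects
        CURLOPT_SSL_VERIFYPEER => true,          // check the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,             // check that the host name matches the certificate
        CURLOPT_CAINFO         => $caBundlePath, // placeholder path to a PEM bundle
    ];
}

// Usage (requires the cURL extension and network access):
// $curl = curl_init($url);
// curl_setopt_array($curl, https_head_options('/path/to/cacert.pem'));
// curl_exec($curl);
// $size = curl_getinfo($curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD);
// curl_close($curl);
```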
Try strlen(file_get_contents($url)); it returns the length of the content, although it has to download the whole file to do so.
Use this one:
<?php
$remoteFile = $url;//remote url
$ch = curl_init($remoteFile);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); //not necessary unless the file redirects (like the PHP example we're using here)
$data = curl_exec($ch);
curl_close($ch);
if ($data === false) {
echo 'cURL failed';
exit;
}
$contentLength = 'unknown';
$status = 'unknown';
if (preg_match('/^HTTP\/1\.[01] (\d\d\d)/', $data, $matches)) {
$status = (int)$matches[1];
}
if (preg_match('/Content-Length: (\d+)/', $data, $matches)) {
$contentLength = (int)$matches[1];
}
echo 'HTTP Status: ' . $status . "\n";
echo 'Content-Length: ' . $contentLength;
?>
Refer: http://in3.php.net/filesize
function remotefileSize($url) {
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_MAXREDIRS, 3);
curl_exec($ch);
$filesize = curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD);
curl_close($ch);
if ($filesize) return $filesize;
}
echo remotefileSize('http://www.google.com/images/srpr/logo4w.png');
//Output
//19978 bytes

cURL not working but file_put_contents() does?

I have a code in cURL that should copy an image from a URL to my server:
$curl = curl_init( $url );
$file = fopen( $imageURL , 'wb' );
curl_setopt( $curl , CURLOPT_FILE , $file );
curl_setopt( $curl , CURLOPT_HEADER , true );
curl_setopt( $curl , CURLOPT_FOLLOWLOCATION , true );
curl_exec( $curl );
curl_close( $curl );
fclose( $file );
It doesn't work correctly, but file_put_contents() does. Is there something wrong with my cURL code?
There are multiple solutions; cURL probably isn't the best.
$remote_img = 'http://www.somwhere.com/images/image.jpg';
$img = imagecreatefromjpeg($remote_img);
// imagejpeg() needs a file path, not just a directory
$path = 'images/image.jpg';
imagejpeg($img, $path);
Would work nicely, but if you are set on cURL, try this:
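// curl_init() expects the URL string, not the GD image
$fullpath = 'images/image.jpg'; // example target path
$ch = curl_init ($remote_img);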
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER,1);
$rawdata=curl_exec($ch);
curl_close ($ch);
if(file_exists($fullpath)){
unlink($fullpath);
}
$fp = fopen($fullpath,'x');
fwrite($fp, $rawdata);
fclose($fp);
That should work as well.
Best of luck!
Don't set CURLOPT_HEADER to true. That includes the response headers in the output, so your image file will contain the response headers followed by the image data. Remove that line or set it to false.
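If the headers are needed as well, one option is to collect the whole response as a string and cut it at the header size that cURL reports (CURLINFO_HEADER_SIZE). A minimal sketch of the splitting step with invented response data; split_response is a hypothetical helper:

```php
<?php
// Hypothetical helper: separate response headers from the body at a known offset.
function split_response(string $response, int $headerSize): array
{
    return [
        'headers' => substr($response, 0, $headerSize),
        'body'    => substr($response, $headerSize),
    ];
}

// Invented sample data standing in for what curl_exec() would return.
$headers = "HTTP/1.1 200 OK\r\nContent-Type: image/jpeg\r\n\r\n";
$body    = "\xFF\xD8\xFF\xE0fake-jpeg-bytes";

$parts = split_response($headers . $body, strlen($headers));

// Only $parts['body'] should be written to the image file.
```

With cURL, the offset would come from curl_getinfo($curl, CURLINFO_HEADER_SIZE) after a curl_exec() call made with CURLOPT_RETURNTRANSFER and CURLOPT_HEADER enabled.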

How to download a copy of the html from a website?

How do I download a copy of the HTML from a website that has language detection (e.g. Google, YouTube) and redirection? I have tried file_get_contents but it is too limited.
I am trying to use cURL in PHP to get the HTML from www.google.com, but it detects that I am from the UK and sends me a 302 redirect to www.google.co.uk.
I have tried many different things with no joy. Is this possible? Websites like www.markosweb.com do it.
My code:
$ch = curl_init( "http://www.google.com/" );
// $userAgent = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0)";
// $userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';
$userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';
$header = array(
"Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5",
"Accept-Language: en-US,us;q=0.7,en-us;q=0.5,en;q=0.3",
"Accept-Charset: windows-1251,utf-8;q=0.7,*;q=0.7",
"Keep-Alive: 300");
curl_setopt($ch,CURLOPT_RETURNTRANSFER,TRUE); //TRUE to return the transfer as a string of the return value of curl_exec() instead of outputting it out directly.
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,5); //The number of seconds to wait while trying to connect.
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent); //The contents of the "User-Agent: " header to be used in a HTTP request.
curl_setopt($ch, CURLOPT_FAILONERROR, TRUE); //To fail silently if the HTTP code returned is greater than or equal to 400.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); //To follow any "Location: " header that the server sends as part of the HTTP header.
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE); //To automatically set the Referer: field in requests where it follows a Location: redirect.
curl_setopt($ch, CURLOPT_TIMEOUT, 10); //The maximum number of seconds to allow cURL functions to execute.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, 0);
$content = curl_exec( $ch );
$err = curl_errno( $ch );
$errmsg = curl_error( $ch );
$header = curl_getinfo( $ch );
curl_close( $ch );
$header['errno'] = $err;
$header['errmsg'] = $errmsg;
$header['content'] = $content;
return $header;
I have tried changing the user agent to lots of things, and tried with and without header details. I managed to get something when I used the header "Accept-Language: ru-ru,ru;q=0.7,en-us;q=0.5,en;q=0.3", but the page was in Russian or something similar.
Thanks for your help.
Carl
Try this proxy script:
// Change these configuration options if needed, see above descriptions for info.
$enable_jsonp = false;
$enable_native = false;
$valid_url_regex = '/.*/';
// ############################################################################
$url = $_GET['url'];
if ( !$url ) {
// Passed url not specified.
$contents = 'ERROR: url not specified';
$status = array( 'http_code' => 'ERROR' );
} else if ( !preg_match( $valid_url_regex, $url ) ) {
// Passed url doesn't match $valid_url_regex.
$contents = 'ERROR: invalid url';
$status = array( 'http_code' => 'ERROR' );
} else {
$ch = curl_init( $url );
if ( strtolower($_SERVER['REQUEST_METHOD']) == 'post' ) {
curl_setopt( $ch, CURLOPT_POST, true );
curl_setopt( $ch, CURLOPT_POSTFIELDS, $_POST );
}
if ( $_GET['send_cookies'] ) {
$cookie = array();
foreach ( $_COOKIE as $key => $value ) {
$cookie[] = $key . '=' . $value;
}
if ( $_GET['send_session'] ) {
$cookie[] = SID;
}
$cookie = implode( '; ', $cookie );
curl_setopt( $ch, CURLOPT_COOKIE, $cookie );
}
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_HEADER, true );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $ch, CURLOPT_USERAGENT, $_GET['user_agent'] ? $_GET['user_agent'] : $_SERVER['HTTP_USER_AGENT'] );
list( $header, $contents ) = preg_split( '/([\r\n][\r\n])\\1/', curl_exec( $ch ), 2 );
$status = curl_getinfo( $ch );
curl_close( $ch );
}
// Split header text into an array.
$header_text = preg_split( '/[\r\n]+/', $header );
if ( $_GET['mode'] == 'native' ) {
if ( !$enable_native ) {
$contents = 'ERROR: invalid mode';
$status = array( 'http_code' => 'ERROR' );
}
// Propagate headers to response.
foreach ( $header_text as $header ) {
if ( preg_match( '/^(?:Content-Type|Content-Language|Set-Cookie):/i', $header ) ) {
header( $header );
}
}
print $contents;
} else {
// $data will be serialized into JSON data.
$data = array();
// Propagate all HTTP headers into the JSON data object.
if ( $_GET['full_headers'] ) {
$data['headers'] = array();
foreach ( $header_text as $header ) {
preg_match( '/^(.+?):\s+(.*)$/', $header, $matches );
if ( $matches ) {
$data['headers'][ $matches[1] ] = $matches[2];
}
}
}
// Propagate all cURL request / response info to the JSON data object.
if ( $_GET['full_status'] ) {
$data['status'] = $status;
} else {
$data['status'] = array();
$data['status']['http_code'] = $status['http_code'];
}
// Set the JSON data object contents, decoding it from JSON if possible.
$decoded_json = json_decode( $contents );
$data['contents'] = $decoded_json ? $decoded_json : $contents;
// Generate appropriate content-type header.
$is_xhr = strtolower($_SERVER['HTTP_X_REQUESTED_WITH']) == 'xmlhttprequest';
header( 'Content-type: application/' . ( $is_xhr ? 'json' : 'x-javascript' ) );
// Get JSONP callback.
$jsonp_callback = $enable_jsonp && isset($_GET['callback']) ? $_GET['callback'] : null;
// Generate JSON/JSONP string
$json = json_encode( $data );
print $jsonp_callback ? "$jsonp_callback($json)" : $json;
}
Make sure to perform a request like this:
http://example.com/script?url=http://whateverurl.com/
This PHP script returns the result as JSON.
From there, you can parse it using jQuery.
For example, I use this jQuery code:
<script type="text/javascript">
$(document).ready(function(){
var url='+++++URL WHICH THE PHP PROXY SCRIPT IS IN++++++';
$(window).load(function(){
$.getJSON(url,function(json){
$("#resu").append(""+json.contents+"");
});
});
});
</script>
Edit: This script is not a true proxy in the sense that it does not fake an IP address. Sorry for the confusion.
