Downloading File from a URL using PHP script


Hi, I want to download about 250 files from a URL; they are in a sequence. I am almost done. The only problem is the structure of my URL:
http://lee.kias.re.kr/~newton/sann/out/201409/<id>/SEQUENCE1.prsa
Here id is sequential, but the file name "SEQUENCE1.prsa" follows the pattern "SEQUENCE?.prsa".
Is there any way to specify this file name pattern in my code? There are other files in each folder, but only one with the ".prsa" extension.
Code:
<?php
// Source URL pattern
//$sourceURLOriginal = "http://www.somewebsite.com/document{x}.pdf";
$sourceURLOriginal = " http://lee.kias.re.kr/~newton/sann/out/201409/{x}/**SEQUENCE?.prsa**";
// Destination folder
$destinationFolder = "C:\\Users\\hp\\Downloads\\SOP\\ppi\\RSAdata";
// Destination file name pattern
$destinationFileNameOriginal = "doc{x}.txt";
// Start number
$start = 7043;
// End number
$end = 7045;
$n=1;
// From start to end
for ($i=$start; $i<=$end; $i++) {
// Replace source URL parameter with number
$sourceURL = str_replace("{x}", $i, $sourceURLOriginal);
// Destination file name
$destinationFile = $destinationFolder . "\\" .
str_replace("{x}", $i, $destinationFileNameOriginal);
// Read from URL, write to file
file_put_contents($destinationFile,
file_get_contents($sourceURL)
);
// Output progress
echo "File #$i complete\n";
}
?>
It works if I specify the full URL directly!
Error:
Warning: file_get_contents( http://lee.kias.re.kr/~newton/sann/out/201409/7043/SEQUENCE?.prsa): failed to open stream: Invalid argument in C:\xampp\htdocs\SOP\download.php on line 37
File #7043 complete
It creates the files, but they are empty!
Downloading the whole folder (each one is named with a sequential id) would also work. But how do I download an entire folder?

It is possible that file_get_contents() cannot open URLs on your server (for example, when allow_url_fopen is disabled).
Try this code:
function url_get_contents($Url) {
    if (!function_exists('curl_init')) {
        die('CURL is not installed!');
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $Url);
    // Return the response body as a string instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $output = curl_exec($ch); // false on failure
    curl_close($ch);
    return $output;
}

Here you go.
I didn't test the whole file_get_contents / file_put_contents part, but if you say it creates the files (albeit blank), then I assume it still works here...
Everything else works fine. I left a var_dump() in so you can see what the return value looks like.
I did what I suggested in my comment: open the folder, parse the file list, and grab the file name you need.
Also, I don't know if you read my original comments, but $sourceURLOriginal has an extra space at the beginning, which might have been causing the problem.
<?php
$start = 7043;
$end = 7045;
$sourceURLOriginal = "http://lee.kias.re.kr/~newton/sann/out/201409/";
$destinationFolder = 'C:\Users\hp\Downloads\SOP\ppi\RSAdata';
for ($i = $start; $i <= $end; $i++) {
    // Fetch the directory listing for this id
    $contents = file_get_contents($sourceURLOriginal . $i);
    if ($contents === false) continue;
    // Collect every href in the listing
    preg_match_all("|href=[\"'](.*?)[\"']|", $contents, $hrefs);
    if (empty($hrefs[1])) continue;
    // Skip the first five links (presumably sort and parent-directory links, not files)
    unset($hrefs[1][0], $hrefs[1][1], $hrefs[1][2], $hrefs[1][3], $hrefs[1][4]);
    $file_list = array_values($hrefs[1]);
    var_dump($file_list);
    // Find the first link that contains 'prsa'
    $needed_file = null;
    foreach ($file_list as $index => $file) {
        if (strpos($file, 'prsa') !== false) {
            $needed_file = $index;
            break;
        }
    }
    if ($needed_file === null) continue; // no .prsa file in this folder
    file_put_contents($destinationFolder . '\doc' . $i . '.txt',
        file_get_contents($sourceURLOriginal . $i . '/' . $file_list[$needed_file])
    );
}
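
The "parse the file list, grab the file name" step can also be tried on its own, without touching the network. The following is only a minimal sketch: find_link_by_extension() is a hypothetical helper (not part of PHP or of the code above), and the listing string is made-up sample data rather than the real output of that server. It matches on the file extension instead of skipping a fixed number of links.

```php
<?php
// Hypothetical helper: return the first href in $html whose target has the
// given file extension, or null when there is none.
function find_link_by_extension(string $html, string $ext): ?string
{
    if (!preg_match_all('/href=["\']([^"\']+)["\']/i', $html, $m)) {
        return null;
    }
    foreach ($m[1] as $href) {
        // Ignore any query string, e.g. the "?C=N;O=D" sort links of a listing
        $path = explode('?', $href, 2)[0];
        if (strtolower(pathinfo($path, PATHINFO_EXTENSION)) === strtolower($ext)) {
            return $href;
        }
    }
    return null;
}

// Made-up directory listing, for illustration only
$listing = '<a href="?C=N;O=D">Name</a> <a href="/~newton/">Parent</a> '
         . '<a href="SEQUENCE7.prsa">SEQUENCE7.prsa</a> <a href="log.txt">log.txt</a>';

echo find_link_by_extension($listing, 'prsa'), "\n"; // prints SEQUENCE7.prsa
```

The returned name could then be appended to the folder URL in the same way as $file_list[$needed_file] above.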

Related

Question mark on filename using fopen

I am having a problem saving a file using fopen. For some reason the saved file has a question mark at the end of its name.
I am attempting to retrieve a list of files from a remote server and download them to my server.
This is the part of my code that does the job :
$arrlength = count($reports);
for ($x = 0; $x < $arrlength; ++$x) {
$report = $reports[$x];
$thefilepath = returnfilename($report);
echo 'the filepath : '.$thefilepath;
echo '<br>';
$thefilename = basename($thefilepath).PHP_EOL;
echo 'the filename : '.$thefilename;
echo '<br>';
$localfile = 'incoming/'.$thefilename;
echo 'local file to save : '.$localfile;
echo '<br>';
curl_setopt($ch, CURLOPT_URL, $thefilepath);
$out = curl_exec($ch);
$fp = fopen($localfile, 'w');
fwrite($fp, $out);
fclose($fp);
}
This script returns the following (I've hidden the actual addresses - retain the spaces etc):
the filepath : https://example.com.com/xxx/xxx.xlsx
the filename : xxx.xlsx
local file to save : incoming/xxx.xlsx
When on my server I do a ls I get :
-rw-r--r-- 1 www-data www-data 29408 May 17 23:01 xxx.xlsx?
There is nothing wrong with the file, when I remove the ? I can retrieve it as normal and open it.
What is this, and how can I stop it from being added at the end?

The string you're using to name the file has a non-printable character at the end, and ls is telling you that there is something there, even if you would normally be unable to see it. Strip the character from the string before using it.
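
In the code from the question, the extra character appears to come from the PHP_EOL that is appended to basename(): the newline becomes part of the file name, and ls shows it as "?". A minimal sketch of the fix, assuming the file names never legitimately start or end with whitespace (the URL below is a placeholder):

```php
<?php
// Reproduce the problem: appending PHP_EOL puts a newline into the file name.
$thefilepath = 'https://example.com/xxx/xxx.xlsx'; // placeholder URL
$dirty = basename($thefilepath) . PHP_EOL;

// trim() strips leading/trailing whitespace, including "\n" and "\r"
$clean = trim($dirty);

var_dump($dirty === $clean); // bool(false): the two names differ
var_dump($clean);            // string(8) "xxx.xlsx"
```

If the line break is only wanted for the on-screen output, it can be added in the echo statement instead of being stored in $thefilename.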

Unable to fread output of a remote php file

I am using the output of a php file on a remote server, to show content on my own web-site. I do not have access to modify files on the remote server.
The remote PHP file outputs JavaScript like this:
document.write('<p>some text</p>');
If I enter the url in a browser I get the correct output. E.g:
https://www.remote_server.com/files/the.php?param1=12
I can show the output of the remote file on my website like this:
<script type="text/javascript" src="https://www.remote_server.com/files/the.php?param1=12"></script>
But I would like to filter the output a bit before showing it.
Therefore I implemented a php file with this code:
function getRemoteOutput(){
$file = fopen("https://www.remote_server.com/files/the.php?param1=12","r");
$output = fread($file,1024);
fclose($file);
return $output;
}
When I call this function fopen() returns a valid handle, but fread() returns an empty string.
I have tried using file_get_contents() instead, but get the same result.
Is what I am trying to do possible?
Is it possible for the remote server to allow me to read the file via the browser, but block access from a php file?

Your variable $output holds only the first 1024 bytes of the response (headers, maybe?).
You need a loop that runs until the end of the file and concatenates each chunk to get the entire remote file.
PHP reference: feof
You can learn a lot more from the PHP documentation for the fread function.
PHP reference: fread
<?php
echo getRemoteOutput();
function getRemoteOutput(){
    $file = fopen("http://php.net/manual/en/function.fread.php","r");
    $output = "";
    while (!feof($file)){ // while not at the End Of File
        $output.= fread($file,1024); // reads up to 1024 bytes at a time and appends them to the string
    }
    fclose($file); // close the handle before returning
    return $output;
}
?>
In regards to your questions:
Is what I am trying to do possible?
Yes this is possible.
Is it possible for the remote server to allow me to read the file via
the browser, but block access from a php file?
I doubt it.
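
The difference between a single fread() and a read loop can be checked without a remote server. The sketch below uses an in-memory php://temp stream holding 5000 bytes as a stand-in for the remote file; read_all() is a hypothetical helper name, not a PHP built-in.

```php
<?php
// Read a stream to the end in fixed-size chunks.
function read_all($handle, int $chunkSize = 1024): string
{
    $output = '';
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false) {
            break; // read error
        }
        $output .= $chunk;
    }
    return $output;
}

// Stand-in for the remote file: 5000 bytes in an in-memory stream
$handle = fopen('php://temp', 'w+b');
fwrite($handle, str_repeat('x', 5000));
rewind($handle);

$first = fread($handle, 1024); // a single read: at most 1024 bytes
rewind($handle);
$all = read_all($handle);      // the loop: everything
fclose($handle);

echo strlen($first), ' ', strlen($all), "\n"; // prints 1024 5000
```

PHP's built-in stream_get_contents($handle) performs the same "read until the end" job in a single call.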

I contacted the support team for the site I was trying to connect to. They told me that they do prevent access from php files.
So that seems to be the reason for my problems, and apparently I just cannot do what I tried to do.
For what it's worth, here is the code I used to test the various methods to read file output:
<?php
//$remotefile = 'http://www.xencomsoftware.net/configurator/tracker/ip.php';
$remotefile = "http://php.net/manual/en/function.fread.php";
function getList1(){
global $remotefile;
$output = file_get_contents($remotefile);
return htmlentities($output);
}
function getList2(){
global $remotefile;
$file = fopen($remotefile,"r");
$output = "";
while (!feof($file)){ // while not the End Of File
$output.= fread($file,1024); //reads 1024 bytes at a time and appends to the variable as a string.
}
fclose($file);
return htmlentities($output);
}
function getList3(){
global $remotefile;
$ch = curl_init(); // create curl resource
curl_setopt($ch, CURLOPT_URL, $remotefile); // set url
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); //return the transfer as a string
$output = curl_exec($ch); // $output contains the output string
curl_close($ch); // close curl resource to free up system resources
return htmlentities($output);
}
function getList4(){
global $remotefile;
$output = ''; // stays empty if the request fails
// HttpRequest is provided by the pecl_http extension (1.x API)
$r = new HttpRequest($remotefile, HttpRequest::METH_GET);
try {
$r->send();
if ($r->getResponseCode() == 200) {
$output = $r->getResponseBody();
}
} catch (Exception $e) {
echo 'Caught exception: ', $e->getMessage(), "\n";
}
return htmlentities($output);
}
function dumpList($ix, $list){
$len = strlen($list);
echo "<p><b>--- getList$ix() ---</b></p>";
echo "<div>Length: $len</div>";
for ($i = 0 ; $i < 10 ; $i++) {
echo "$i: $list[$i] <br>";
}
// echo "<p>$list</p>";
}
dumpList(1, getList1()); // doesn't work! You cannot include/require a remote file.
dumpList(2, getList2());
dumpList(3, getList3());
dumpList(4, getList4());
?>

Fixing permissions for writing cUrl output to a text file

I am making a web application that grabs one page from the internet using cUrl and updates another page accordingly. I have been doing this by saving the HTML from cUrl and then parsing it on the other page. The issue is: I can't figure out what permissions I should use for the text file. I don't have it saved in my /public/ html folder, since I don't want any of the website's users to be able to see it. I only want them to be able to see the way it's parsed on the site.
Here is the cUrl code:
$perfidlist = "" ;
$sourcefile = "../templates/textfilefromsite.txt";
$trackerfile = "../templates/trackerfile.txt";
//CURL REQUEST 1 OF 2
$ch = curl_init("http://www.website.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
<more cUrl options omitted>
ob_start();
$curl2 = curl_exec($ch);
ob_end_clean();
curl_close($ch);
//WRITING FILES
$out = fopen($sourcefile, "r");
$oldfiletext = fread($out, filesize($sourcefile));
fclose($out);
$runcode = 1 ;
And the part where I save the text file:
/*only writing a file if the site has changed*/
if (strcmp($oldfiletext, $curl2) !==0)
{
$out = fopen($sourcefile, "w");
fwrite($out, $curl2);
fclose($out);
$tracker = fopen($trackerfile, "a+");
fwrite($tracker, date('Y/m/d H:i:s')."\n");
fclose($tracker);
$runcode = 1 ;
}
I am receiving an error at that last '$out = fopen($sourcefile, "w");' part that says:
Warning: fopen(../templates/textfilefromsite.txt): failed to open stream: Permission denied in /usr/share/nginx/templates/homedir.php on line 72
Any ideas?

The issue was with file/folder permissions. I ended up changing the permissions for the file to '666' meaning '-rw-rw-rw-' and it worked.
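
To see which permission bits a file actually has from PHP's point of view, something like the following can help. It is only a sketch for a Unix-like server: octal_perms() is a hypothetical helper, and a temporary file stands in for textfilefromsite.txt. Note that 0666 lets every user on the machine write to the file; if you control ownership, making the web server's user the owner of the file and using a tighter mode may be preferable.

```php
<?php
// Hypothetical helper: report the permission bits of a path as an octal string.
function octal_perms(string $path): string
{
    clearstatcache(true, $path); // avoid stale cached stat results
    return sprintf('%04o', fileperms($path) & 0777);
}

$file = tempnam(sys_get_temp_dir(), 'perm'); // stand-in for textfilefromsite.txt
chmod($file, 0666);                          // -rw-rw-rw-
echo octal_perms($file), "\n";               // prints 0666
var_dump(is_writable($file));                // can the current user write to it?
unlink($file);
```

Running is_writable() inside the web application (not from the command line) shows the result for the user the web server runs as, which is the one that matters for fopen($sourcefile, "w").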

PHP / Curl To Randomize Text Between Two Tags

Below is my working code that pulls a text file from a remote location and inserts into the html body of a page a specific line. The code works just fine as it is now. I want to do an addition to the code however and have it randomize the line that it gets. Here is what I'm wanting to do.
The text file that is being pulled will have a varied amount of lines. Only one line is chosen via the echo $lines[0]; which tells which line to get. The line will be formatted like this..
<p>This is a line of text domain 1. This is a line of text.</p><p>This is a line of text domain 2. This is a line of text.</p><p>This is a line of text domain 3. This is a line of text.</p>
All of that would be one line and pulled into the html of the page. The above example would display 3 paragraphs of text with links in the order above.
What I would like to do is have that line of text randomize between the <p>..</p> So for instance if I put the below code on Site A the output would be in order of domain 1 then domain 2 and then domain 3. If I put the code on Site B I would like it to be domain 3 and then domain 1 and then domain 2. To display them in random order, not the exact order for each time I put the code on a site.
I don't know if there would need to be some sort of cache on the site I have the code on to remember which random order to display in. That is what I want. I do not want a random order on each page load.
I hope this makes sense. If not please tell me so I can try and explain it better. Here is my working code as of now. Can anyone help me get this working? Thank you very much for your help.
<?php
function url_get_contents ($url) {
if (function_exists('curl_exec')){
$conn = curl_init($url);
curl_setopt($conn, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($conn, CURLOPT_FRESH_CONNECT, true);
curl_setopt($conn, CURLOPT_RETURNTRANSFER, 1);
$url_get_contents_data = (curl_exec($conn));
curl_close($conn);
}elseif(function_exists('file_get_contents')){
$url_get_contents_data = file_get_contents($url);
}elseif(function_exists('fopen') && function_exists('stream_get_contents')){
$handle = fopen ($url, "r");
$url_get_contents_data = stream_get_contents($handle);
}else{
$url_get_contents_data = false;
}
return $url_get_contents_data;
}
?>
<?php
$data = url_get_contents("http://mydomain.com/mytextfile.txt");
if($data){
$lines = explode("\n", $data);
echo $lines[0];
}
?>

Try This
$str = '<p>This is a line of text domain 1. This is a line of text.</p><p>This is a line of text domain 2. This is a line of text.</p><p>This is a line of text domain 3. This is a line of text.</p>';
preg_match_all('%(<p[^>]*>.*?</p>)%i', $str, $match);
$count = 0;
$used = array();
while ($count < 3) {
$index = rand(0, 2);
if (!isset($used[$index])) {
$used[$index] = 1;
echo $match[0][$index];
$count++;
}
}
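
If the order should differ from site to site but stay the same on every page load, one option that needs no cache is to derive the order from something constant per site. This is a sketch under that assumption: stable_shuffle() is a hypothetical helper, and $siteKey is any per-site string you choose (for example the host name). It sorts the paragraphs by a hash instead of using a random number generator.

```php
<?php
// Order items by a hash of (site key + item). The result looks shuffled,
// usually differs between keys, and is identical on every page load.
function stable_shuffle(array $items, string $siteKey): array
{
    usort($items, function ($a, $b) use ($siteKey) {
        return strcmp(md5($siteKey . $a), md5($siteKey . $b));
    });
    return $items;
}

$line = '<p>Text for domain 1.</p><p>Text for domain 2.</p><p>Text for domain 3.</p>';
preg_match_all('%<p[^>]*>.*?</p>%is', $line, $match);

// 'site-a.example' is a placeholder for the per-site key
echo implode('', stable_shuffle($match[0], 'site-a.example')), "\n";
```

With only three paragraphs there are just six possible orders, so two sites can still end up with the same one.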

I think I understand what you are asking, but if not, please let me know and I will adjust.
Basically, I count the lines in the array you exploded and use that count as the upper bound for a random index. Once I have a random number, I access that element of the array. Array indexes start at 0, so if I generate the number 5, it grabs the sixth line.
$lines = explode("\n", $data);
$line_count = count($lines) - 1; // highest valid index
for ($i = 0; $i < 3; $i++) {
    print "<p>".$lines[get_random_line($line_count)]."</p>";
}
function get_random_line($line_count) {
    // mt_rand() seeds itself, so no mt_srand() call is needed
    return mt_rand(0, $line_count);
}

Without modifying your code too much and without getting into storing values in databases, using flat file storage you can do something like the following:
Create a file called "count.txt" and place it in the same location as your php file.
<?php
function url_get_contents ($url) {
if (function_exists('curl_exec')){
$conn = curl_init($url);
curl_setopt($conn, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($conn, CURLOPT_FRESH_CONNECT, true);
curl_setopt($conn, CURLOPT_RETURNTRANSFER, 1);
$url_get_contents_data = (curl_exec($conn));
curl_close($conn);
}elseif(function_exists('file_get_contents')){
$url_get_contents_data = file_get_contents($url);
}elseif(function_exists('fopen') && function_exists('stream_get_contents')){
$handle = fopen ($url, "r");
$url_get_contents_data = stream_get_contents($handle);
}else{
$url_get_contents_data = false;
}
return $url_get_contents_data;
}
$data = url_get_contents("http://mydomain.com/mytextfile.txt");
$count=0;
$fp=fopen('count.txt','r');//Open count.txt for reading
if($fp){
$stored=fread($fp,4);//Read the stored count (4=no. bytes to read)
$count=($stored===false || $stored==='') ? 0 : (int)$stored+1;//Increment it, or start at 0 if the file is empty
fclose($fp); //Close file
}
if($data){
$lines=explode("\n",$data);
if($count>=count($lines)){$count=0;}//Reset $count once it is past the last line
echo $lines[$count];
$fp=fopen('count.txt','w'); //Another fopen, simply to truncate the file
fwrite($fp,(string)$count); //Store the $count just displayed
fclose($fp); //Close file
}
?>

It sounds like you're really looking for a way to show unique content, or to give the appearance of updated content, on your HTML page. The following has been extremely useful for me, and I'm sure others will like it as well, even though it is a bit different from what you are trying to do.
This grabs nested Spintax from a text file, spins the content, and displays it in your page. Your page needs to be a .php file; there is a way to make this work on an HTML page, which is what I use it for.
Spintax example: {cat|Dog|Mouse}. It works for sentence spins, rotating images, spinning HTML code, and so on.
<?php
function spin($s){
preg_match('#\{(.+?)\}#is',$s,$m);
if(empty($m)) return $s;
$t = $m[1];
if(strpos($t,'{')!==false){
$t = substr($t, strrpos($t,'{') + 1);
}
$parts = explode("|", $t);
$s = preg_replace("+\{".preg_quote($t)."\}+is",
$parts[array_rand($parts)], $s, 1);
return spin($s);
}
$file = "http://www.yourwebsite/Data.txt";
$f = fopen($file, "r");
while ( $line = fgets($f, 1000) ) {
echo spin($line);
}
?>

PHP: Download a file from web to local machine

I have searched the web for 2 days and can not find the answer.
I am trying to create a routine which displays the files on a site I control, and allows the user to download a selected file to a local drive.
I am using the code below. When I uncomment the echo statements, it displays the correct source and destination directories, the correct file size and the echo after the fclose displays TRUE.
When I echo the source file ($data), it displays the correct content.
The $FileName variable contains the correct filename, which is either .doc/.docx or .pdf. I have tested both and neither saves anything into the destination directory, or anywhere else on my machine.
The source path ($path) is behind a login, but I am already logged in.
Any thoughts on why this is failing to write the file?
Thanks,
Hank
Code:
$path = "https://.../Reports/ReportDetails/$FileName";
/* echo "Downloading: $path"; */
$data = file_get_contents($path); /* echo "$data"; */
$dest = "C:\MyScans\\".$FileName; /* echo "<br />$dest"; */
$fp = fopen($dest,'wb');
if ( $fp === FALSE ) echo "<br />Error in fopen";
$result = fwrite($fp,$data);
if ( $result === FALSE ) echo "<br />Can not write to $dest";
/* else echo "<br />$result bytes written"; */
$result = fclose($fp); /* echo "<br />Close: $result"; */

I think (!) that you're a bit confused.
You mentioned
allows the user to download a selected file to a local drive.
But the path "C:\MyScans\\".$FileName is a path on the web server, not a path on the user's own computer.
After you do whatever is needed to retrieve the desired file from the remote website, either:
Create a file from it and redirect the user to that file using header('Location: /path/to/file.txt');
or send the following header before outputting the file's contents:
header('Content-Disposition: attachment; filename="file.txt"');
This prompts the browser to download the file instead of displaying it, which is probably what you want.
Note: I have used the extension txt, but you can use any extension.
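
As a rough illustration of the header approach, here is a sketch that only builds the header lines so it can be run from the command line. build_download_headers() is a hypothetical helper, the file path is an invented example, and application/octet-stream is a generic content type chosen for the sketch.

```php
<?php
// Hypothetical helper: header lines for serving data as a download.
function build_download_headers(string $fileName, int $length): array
{
    // Keep only the base name and drop quotes so the header stays well formed
    $safeName = str_replace('"', '', basename($fileName));
    return [
        'Content-Type: application/octet-stream',
        'Content-Disposition: attachment; filename="' . $safeName . '"',
        'Content-Length: ' . $length,
    ];
}

$data = 'example file contents'; // stand-in for what file_get_contents($path) returned
foreach (build_download_headers('Reports/ReportDetails/report.pdf', strlen($data)) as $h) {
    // In a web request you would call header($h) here, then echo $data
    echo $h, "\n";
}
```

In a real page the headers must be sent before any other output, and the browser (not the PHP script) decides where on the user's machine the file is saved.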

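You can use PHP cURL:
<?php
$url = 'http://www.example.com/a-large-file.zip';
$path = '/path/to/a-large-file.zip';
$fp = fopen($path, 'wb'); // binary-safe write mode
if ($fp === false) {
    die("Cannot open $path for writing");
}
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_FILE, $fp); // stream the response body straight into the file
$ok = curl_exec($ch); // true on success, false on failure
curl_close($ch);
fclose($fp);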
