Check files on remote FTP server for duplicate content with PHP - php

I've written a script that transfers local files into a folder structure on a remote FTP server with PHP. I'm currently using ftp_connect() to connect to the remote server and ftp_put() to transfer the file, in this case, a CSV file.
My question is, how would one verify that a file's contents (on the remote FTP server) are not a duplicate of the local file's contents? Is there any way to parse the contents of the remote file in question, as well as a local version and then compare them using a PHP function?
I have tried comparing the size of the local file using filesize() with the size of the remote file using ftp_size(). However, when two files contain different data but the same number of characters, this produces a false positive for duplication, because both files are the same number of bytes.
Please note, the FTP in question is not under my control, so I can't put any scripts on the remote server.
Update
Thanks to both Mark and gafreax, here is the final working code:
$temp_path = $conf['absolute_installation_path'] . 'data/temp/ftp_temp.csv';
$temp_local_file = fopen($temp_path, 'w');
// Download the remote file into the temporary local file.
if ( ftp_fget($connection, $temp_local_file, $remote_filename, FTP_ASCII) ) {
    fclose($temp_local_file); // close the handle before reading the file back
    $temp_local_stream = file_get_contents($temp_path);
    $local_stream = file_get_contents($conf['absolute_installation_path'] . $local_filename);
    $temp_hash = md5($temp_local_stream);
    $local_hash = md5($local_stream);
    // Identical hashes mean the remote file duplicates the local one.
    $remote_file_duplicate = ($temp_hash === $local_hash);
}

You can use a hashing function such as md5 on both files and check whether the two generated hashes match.
For example:
$a = file_get_contents('a_local');
$b = file_get_contents('b_local');
$a_hash = md5($a);
$b_hash = md5($b);
if ($a_hash !== $b_hash)
    echo "Files differ";
else
    echo "Files are the same";
Comparing md5 hashes rather than the raw contents avoids problems when the files contain unusual data.
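As a minimal sketch of the hash comparison above, the following uses md5_file(), which hashes a file without first loading it into a string. The helper name filesHaveSameContent() and the sample CSV rows are hypothetical, and the sketch assumes the remote file has already been downloaded to a local path.

```php
<?php
// Hypothetical helper: true when both files have byte-identical content.
function filesHaveSameContent(string $pathA, string $pathB): bool
{
    clearstatcache(); // make sure filesize() does not return a cached value

    // Files of different sizes can never match, so skip hashing.
    if (filesize($pathA) !== filesize($pathB)) {
        return false;
    }
    // md5_file() hashes the file without loading it into a string first.
    return md5_file($pathA) === md5_file($pathB);
}

// Demo: two temporary files with the same size but different content.
$a = tempnam(sys_get_temp_dir(), 'cmp');
$b = tempnam(sys_get_temp_dir(), 'cmp');
file_put_contents($a, "id,name\n1,foo\n");
file_put_contents($b, "id,name\n1,bar\n");

$sameSize = filesize($a) === filesize($b);
$sameContent = filesHaveSameContent($a, $b);

echo 'Same size: ', var_export($sameSize, true), "\n";
echo 'Same content: ', var_export($sameContent, true), "\n";

unlink($a);
unlink($b);
```

The size check is only a shortcut: equal sizes still require the hash comparison, which is exactly the false positive described in the question.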

You could also compare the last modified time of each file. You'd upload the local file only if it is more recent than the remote one. See filemtime and ftp_mdtm. Both of those return a UNIX timestamp you can easily compare. This is faster than getting the file contents and calculating a hash.
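The timestamp check could be sketched as follows. The helper name shouldUpload() is hypothetical, and $connection, $localFile and $remoteFile in the commented usage are assumed to exist already. ftp_mdtm() returns -1 when the server cannot report a modification time, so the sketch treats that case as "upload".

```php
<?php
// Hypothetical helper: decide whether to upload based on modification times.
// $remoteMtime is the value returned by ftp_mdtm(), which is -1 on failure
// (for example, when the remote file does not exist yet).
function shouldUpload(int $localMtime, int $remoteMtime): bool
{
    if ($remoteMtime === -1) {
        return true; // no usable remote timestamp, so upload to be safe
    }
    return $localMtime > $remoteMtime;
}

// Usage sketch (assumes $connection is an open, logged-in FTP connection):
// if (shouldUpload(filemtime($localFile), ftp_mdtm($connection, $remoteFile))) {
//     ftp_put($connection, $remoteFile, $localFile, FTP_BINARY);
// }
```

Note that a newer timestamp only says the local file changed more recently; it does not prove that the contents differ, so this trades accuracy for speed.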

Related

PHP replace a row in csv works fine on my localhost but does not replace the row when uploaded to cpanel?

Hello, I am relatively new to PHP and I was trying to replace a row in a CSV file. I didn't find an optimal solution, so I put together a script (a workaround) that suits my needs for the time being, until I have a better understanding of PHP.
I tested it on my localhost using XAMPP and everything worked fine: it replaced the row as intended. But when I uploaded the files to my cPanel hosting, it stopped replacing the row and instead just writes the row on a new line.
This is my code:
$fileName = 'Usecase.csv'; // This is the CSV file
$tempName = 'temp.csv';
$inFile = fopen($fileName, 'r');
$outFile = fopen($tempName, 'w');
while (($line = fgetcsv($inFile)) !== FALSE)
{
    // If the first column matches $fin, replace the whole row with the
    // new values, which are later written into the CSV file.
    if ($line[0] == "$fin")
    {
        $line = explode(",", "$tempstr10");
        // $asd is defined and set to 0 at the top of the script; it is used later in the code.
        $asd = $asd + 1;
    }
    fputcsv($outFile, $line);
}
fclose($inFile);
fclose($outFile);
unlink($fileName);
rename($tempName, $fileName);
// If $asd is still 0, the code above did not replace anything, which means
// the value wasn't present in the file. This avoids writing the same string again.
if ($asd == 0 && filesize("Usecase.csv") > 0)
{ file_put_contents("Usecase.csv", "$tempstr10\r\n", FILE_APPEND | LOCK_EX); }
if ($asd == 0 && filesize("Usecase.csv") == 0)
{ file_put_contents("Usecase.csv", "$tempstr10\r\n", FILE_APPEND | LOCK_EX); }
As I mentioned above, it works on localhost but not on cPanel. Can someone point out whether something is wrong with the code, or whether it is something else?
Thank you.
The most likely problem is that your local version of PHP or your local configuration of PHP is different from what is on the server.
For example, fopen is a feature that can be disabled on some shared servers.
You can check this by creating a PHP file with the following contents:
<?php phpinfo();
Then visit that PHP file in your browser. Do this for both your local dev environment and your cPanel server to compare the configuration to identify the differences that may be contributing to the differing behavior.
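As a quicker check than reading the whole phpinfo() page, a short script can report whether the functions the CSV code relies on appear in the disable_functions setting. This is only a sketch: the helper name isDisabled() is hypothetical, and the list of functions checked is an assumption based on the code in the question.

```php
<?php
// Hypothetical helper: is $name listed in a disable_functions value?
function isDisabled(string $name, string $disableFunctions): bool
{
    $disabled = array_map('trim', explode(',', $disableFunctions));
    return in_array($name, $disabled, true);
}

// disable_functions is a comma-separated list set in php.ini.
$list = (string) ini_get('disable_functions');

foreach (['fopen', 'unlink', 'rename', 'file_put_contents'] as $fn) {
    echo $fn, ': ', isDisabled($fn, $list) ? 'disabled' : 'available', "\n";
}
```

Running it on both environments gives two short outputs that are easy to compare side by side.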
You should also check the error logs. They can be found in multiple different places depending on how your hosting provider has things configured. If you can't find them, you'll need to ask your hosting provider to know for sure where the error logs are.
Typical locations are:
The "Errors" icon in cPanel
A file named "error_log" in one of the folders of your site. Via ssh or the Terminal icon in cPanel you can use this command to find those files: find $PWD -name error_log
If your server is configured to use PHP-FPM, the php error log is located at ~/logs/yourdomain_tld.php.error.log
You should also consider turning on error reporting for the script by putting this at the very top. Use it only temporarily while you are actively debugging the application; leaving this kind of debugging output on could expose details about your application and invite additional security risks.
<?php
ini_set('display_errors', 1);
ini_set('display_startup_errors', 1);
error_reporting(E_ALL);
// ... Your code here ...
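Since the behavior differs between environments, it may also help to check the return value of each file operation so that a failure is reported instead of silently ignored. The following is a sketch under that assumption, not a drop-in replacement: the function name replaceCsvRow() and the ".tmp" suffix are hypothetical.

```php
<?php
// Hypothetical helper: replace every row whose first column equals $key.
// Returns the number of rows replaced; throws if a file operation fails.
function replaceCsvRow(string $fileName, string $key, array $newRow): int
{
    $tempName = $fileName . '.tmp';
    $in = fopen($fileName, 'r');
    $out = fopen($tempName, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException("Cannot open $fileName or $tempName");
    }

    $replaced = 0;
    // Delimiter, enclosure and escape are passed explicitly.
    while (($line = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        if (isset($line[0]) && $line[0] === $key) {
            $line = $newRow;
            $replaced++;
        }
        fputcsv($out, $line, ',', '"', '\\');
    }
    fclose($in);
    fclose($out);

    // rename() replaces the original file with the rewritten copy.
    if (!rename($tempName, $fileName)) {
        throw new RuntimeException("Cannot replace $fileName");
    }
    return $replaced;
}

// Usage sketch: append the row only when nothing was replaced.
// if (replaceCsvRow('Usecase.csv', $fin, explode(',', $tempstr10)) === 0) {
//     file_put_contents('Usecase.csv', "$tempstr10\r\n", FILE_APPEND | LOCK_EX);
// }
```

If the temporary file cannot be created on the server (for example because of directory permissions), the exception makes that visible immediately.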

ftp listing and download file in current date

I have the following case.
I have a remote server that contains a large number of generated transaction files (.txt) from 2015 until now. I must download the new files every day, in real time. For now I use PHP to download them, but I don't think my method is efficient. First I list all the files, then I read each file's attributes, such as the modification date. This makes my program slow and takes a very long time.
This is my code (I'm using PHP with Yii2):
public function actionDownloadfile(){
    // This line takes a long time to execute.
    $contents = Yii::$app->ftpFs->listContents('/backup', ['timestamp','path','basename']);
    var_dump($contents);
    foreach ($contents as $value) {
        // Only download files modified today.
        if (date('Y-m-d', $value['timestamp']) == date('Y-m-d')) {
            echo "[".date('Y-m-d H:i:s')."] : Downloading file ".$value['basename']."\n";
            $isi = Yii::$app->ftpFs->read($value['path']);
            $dirOut = Yii::$app->params['out'];
            $fileoutgoing = $dirOut."/".$value['basename'];
            $file = fopen($fileoutgoing, "w");
            fwrite($file, $isi);
            fclose($file);
        }
    }
}
My question:
Is it possible to list and download only the files on the FTP server that are from the current date, without listing them all first?
Any solution using either PHP or a shell script is OK.
Thank you so much.

Executing file via file_get_contents on remote host

I have a script and I don't know why and how it works - one reason for that is I found contradicting information about file_get_contents.
I have three (internal) webservers - all set up the same way, running the same software.
I need to count the number of files in one specific folder on each server (in order to get the number of users logged into a certain application).
For the local server my file counting PHP script is called by a simple include and for the two remote servers I use file_get_contents.
In both cases I refer to the same PHP file. That works - I get the correct number of files for the folder on each server.
I have sometimes read that file_get_contents only returns the file content and does not execute the file. In my case the file is executed and I get the correct number of files, so I'm a bit confused about why my scripts actually work.
My scripts were saved on one server. I want to be more flexible and be able to call the scripts from each server. Therefore I created a new virtual directory on a network folder and moved the script files there, the virtual folder has the same set up on each server. I had to change my script slightly to get the same result again. Instead of a return $num I now have echo $num. If I use return I won't get a result, if I use echo the correct number of files is given. I would prefer to receive the result via return - but I don't know why this doesn't work anymore in the new context.
script which shows the number of files:
function getUserNum($basis_url_server, $url_vaw_scripte, $script_number_users)
{
    $serverName = strtoupper($_SERVER['SERVER_NAME']);
    //local server
    if (strpos(strtoupper($basis_url_server), $serverName) !== false)
    {
        $numUsers = (include($script_number_users));
    }
    //remote server
    else
    {
        $path = $basis_url_server.$url_vaw_scripte.$script_number_users;
        $numUsers = file_get_contents($path);
        //include($path);
    }
    return $numUsers;
}
echo getUserNum($basis_url_server1, $url_vaw_scripte, $script_number_users)."($label_server1)";
echo getUserNum($basis_url_server2, $url_vaw_scripte, $script_number_users)."($label_server2)";
echo getUserNum($basis_url_server3, $url_vaw_scripte, $script_number_users)."($label_server3)";
script for counting the files (referred to as $script_number_users above):
<?php
// The included file only contains $pfadSessionRepository = "E:\Repository\Session"
include dirname(__DIR__).'/vaw_settings.php';
$fi = new FilesystemIterator($pfadSessionRepository, FilesystemIterator::SKIP_DOTS);
$number = (iterator_count($fi)-1)/2;
//return $number;
echo $number;
?>
file_get_contents() performs an HTTP GET request when given a URL and reads a file when given a filesystem path. It behaves like two different functions behind one call.
You are actually building a primitive REST web service rather than loading the files as you thought: the remote files are executed, and you get the same output you would see if you loaded them in a browser.
file_get_contents() returns the raw content of a local file. For remote files it returns what the web server delivers. If the web server executes the script in the file, you get the output of that script. If the web server doesn't execute the script (due to a misconfiguration, for example), you get the raw content of the remote script.
In your case I'd remove the include branch and fetch all scripts over HTTP. That reduces the complexity, and the overhead of calling one of the three scripts via HTTP instead of loading it directly is negligible.
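The distinction described above can be seen locally with a small sketch. It writes a throwaway script to a temporary file (the file and its contents are made up for the demonstration), then reads it with file_get_contents() and runs it with include.

```php
<?php
// Write a tiny script to a temporary file for the demonstration.
$script = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($script, '<?php echo "printed"; return 42;');

// Filesystem path: the source code comes back as a plain string.
$source = file_get_contents($script);

// include: the script runs; its output is captured and its return value kept.
ob_start();
$returned = include $script;
$printed = ob_get_clean();

unlink($script);

echo "file_get_contents gave: $source\n";
echo "include returned: $returned\n";
echo "include printed: $printed\n";
```

This also suggests why return stops working over HTTP: a web server delivers only what the script prints, so a value passed back with return never reaches file_get_contents(), while echo output does.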

PHP FTP code fails, even though file is downloaded successfully

I'm using the following code to download a file via FTP in PHP:
$fp = fopen($local_file, 'w+');
$conn_id = ftp_connect($host);
$login_result = ftp_login($conn_id, $user, $pass);
$ret = ftp_nb_fget($conn_id, $fp, $remote_file, FTP_BINARY);
while ($ret == FTP_MOREDATA) {
    $ret = ftp_nb_continue($conn_id);
}
if ($ret != FTP_FINISHED) {
    echo "<span style='color:red;'><b>There was an error downloading the file!</b></span><br>";
    logThis("log.txt", date('h:i:sa'), "ERROR DOWNLOADING FILE!");
    exit();
}
fclose($fp);
<<php code continues below this....>>
This code seems to be working fine. The file is downloaded, and the MD5 hash of the file matches the hash of the file on the other server before it was downloaded. So the download does complete.
In any event, using that code above, even with the file successfully downloading, it's hitting the code inside of the if ($ret != FTP_FINISHED) condition.
If the file downloads fine, why is FTP_FINISHED not true?
EDIT
When I check the value of $ret after the WHILE loop, the times the script completes fine $ret=1 and the times the script fails $ret=0
However, there are times when the script fails because $ret=0 when the file is actually downloaded properly, which can be confirmed with a MD5 comparison.
Also, 0 and 1 are not values that I expected these commands to return. The official PHP documentation gives three possible return values: FTP_FAILED, FTP_FINISHED, or FTP_MOREDATA.
I have thought of one solution. Since the file does get downloaded correctly, as determined by an MD5 check from the original source (which we do have), I could modify the code this way:
if ($ret != FTP_FINISHED) {
    $localMD5 = md5_file($local_file);
    if ($localMD5 != $remoteMD5) {
        echo "<span style='color:red;'><b>There was an error downloading the file!</b></span><br>";
        logThis("log.txt", date('h:i:sa'), "ERROR DOWNLOADING FILE!");
        exit();
    }
}
In most cases the script completes as expected, so this block of code never gets run. However, in cases where the error above occurs and this code is run, it could verify the MD5 hash of the file, and only run the error code if it doesn't match the MD5 of the original source file. If the MD5's match then the download was successful anyways, so the error code shouldn't run
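The fallback described above could be wrapped in a small function, sketched here. The name downloadSucceeded() is hypothetical. The FTP status constants are plain integers, and to my knowledge FTP_FAILED is 0 and FTP_FINISHED is 1, which would match the observed values of $ret; the sketch falls back to 1 only so that it can run without the FTP extension loaded.

```php
<?php
// Use the real constant when the FTP extension is loaded; the fallback
// value 1 is an assumption taken from the PHP documentation.
$ftpFinished = defined('FTP_FINISHED') ? FTP_FINISHED : 1;

// Hypothetical helper: a download counts as successful when the transfer
// reported FTP_FINISHED, or when the local file matches the known MD5.
function downloadSucceeded(int $ret, int $finished, string $localFile, string $expectedMd5): bool
{
    if ($ret === $finished) {
        return true;
    }
    // The transfer reported a failure: verify the file content instead.
    return is_file($localFile) && md5_file($localFile) === $expectedMd5;
}

// Usage sketch (assumes $ret, $local_file and $remoteMD5 from the code above,
// and that fclose($fp) was called before hashing the file):
// if (!downloadSucceeded($ret, $ftpFinished, $local_file, $remoteMD5)) {
//     logThis("log.txt", date('h:i:sa'), "ERROR DOWNLOADING FILE!");
//     exit();
// }
```

Closing the file handle before calling md5_file() matters here, so that the hash is taken over the fully written file.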
Edited:
My first solution wasn't correct. After checking your comments below, I must say your code is correct, and the problem is probably in the "upload_max_filesize" and "post_max_size" values.
See here "Upload large files to FTP with PHP" and mainly here: "Changing upload_max_filesize on PHP"
So, the proposed solution is to add this to the .htaccess file:
php_value upload_max_filesize 2G
php_value post_max_size 2G
or, if the server is yours (dedicated), set them in php.ini (you'll need to restart the server so the changes take effect).
You may also find the post_max_size documentation on php.net useful. I found this part particularly interesting:
If the size of post data is greater than post_max_size, the $_POST and $_FILES superglobals are empty. This can be tracked in various ways, e.g. by passing the $_GET variable to the script processing the data, i.e. <form action="edit.php?processed=1">, and then checking if $_GET['processed'] is set.

Go through files in folder at another server PHP

I've got two servers. One for the files and one for the website.
I figured out how to upload the files to that server but now I need to show the thumbnails on the website.
Is there a way to go through the folder /files on the file server and display a list of those files on the website using PHP?
I searched for a while now but can't find the answer.
I tried using scandir([URL]) but that didn't work.
I'm embarrassed to say this, but I found my answer in another post:
PHP directory list from remote server
function get_text($filename) {
    $content = ''; // initialize so the .= below does not use an undefined variable
    $fp_load = fopen("$filename", "rb");
    if ( $fp_load ) {
        while ( !feof($fp_load) ) {
            $content .= fgets($fp_load, 8192);
        }
        fclose($fp_load);
        return $content;
    }
}
$matches = array();
// Collect the target of every link in the remote directory listing page.
preg_match_all("/(a href\=\")([^\?\"]*)(\")/i", get_text('http://www.xxxxx.com/my/cool/remote/dir'), $matches);
foreach($matches[2] as $match) {
    echo $match . '<br>';
}
scandir will not work on any server other than your own. If you want to keep the files on a separate server, your best bet is to have one PHP file on the website and one PHP file on the file server. The PHP file on the file server prints the file data, and the PHP file on the website reads that output over HTTP. Example:
Webserver:
<?php
$filedata = file_get_contents("url to file handler php");
?>
Fileserver:
<?php
echo "info you want webserver to read";
?>
This can also be customized to your needs with POST and GET requests.
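One way to flesh out the two-script approach is to exchange the file list as JSON. The sketch below is an assumption about how that could look: the function names listFilesAsJson() and parseFileList() and the URL in the commented usage are hypothetical.

```php
<?php
// File server side (hypothetical): list the regular files in $dir as JSON.
function listFilesAsJson(string $dir): string
{
    $names = [];
    $entries = scandir($dir) ?: []; // scandir() returns false on failure
    foreach ($entries as $entry) {
        // Skip ".", ".." and subdirectories; keep regular files only.
        if (is_file($dir . DIRECTORY_SEPARATOR . $entry)) {
            $names[] = $entry;
        }
    }
    return json_encode($names);
}

// Web server side (hypothetical): decode the response body into an array.
function parseFileList(string $json): array
{
    $names = json_decode($json, true);
    return is_array($names) ? $names : [];
}

// Usage sketch.
// On the file server, in a script reachable over HTTP:
//     header('Content-Type: application/json');
//     echo listFilesAsJson(__DIR__ . '/files');
// On the web server (placeholder URL):
//     $files = parseFileList(file_get_contents('http://fileserver.example/list.php'));
```

Returning structured data instead of scraping an HTML directory listing keeps the web server independent of how the file server renders its pages.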
I used the following method:
I created a script which goes through all the files on the file server.
$fileList = glob($dir."*.*");
This is only possible if the script is actually on the file server. It would be rather strange to go through files on another server without having access to it.
There is a way to do this without having access (read my other answer), but it is very slow and not very practical.
I know I said that I didn't have access, but I had. I just wanted to know all the possibilities.
