FTP Issues Downloading Large Directory - php

When I use FireFTP (or other FTP clients, for that matter) to download large directories, the download gets messed up. It seems to run endlessly: it shows a nearly complete percentage, then jumps back to a percentage much farther from completion. So with large directories I usually have to SSH into the host, zip or tar the directory, and download the archive instead. Is there a reason for this, and/or a solution?

Configure the client to use passive mode. Then try again.

Related

PHP FPM download speed on a backup system

I'm building a backup system for a company and I need to understand why I can't get better download speeds using PHP.
The files are on the web server and I need to bring them over to the backup server. The problem is that with wget I can download them to the backup server at 50 Mbps (the network limit), but with PHP's file_put_contents I only get about 2 Mbps for a single file, and when I try to download about 50 files at the same time they drop to around 50 kbps each.
Since I'm downloading about 50 TB of content, with each file around 800 MB to 1.2 GB, it would take months this way.
I'm using NGINX with PHP-FPM and the configs are fine everywhere: no limits, no timeouts, etc.
The code I'm using is basically this example, except that I also record the bytes downloaded in MySQL:
https://www.php.net/manual/en/function.stream-notification-callback.php
Could this problem be related to file_put_contents performance? Is there a way to get better download speeds?
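One thing worth checking: `file_put_contents($dest, file_get_contents($url))` buffers the entire file in memory before writing, which can hurt throughput badly for files this large. A minimal sketch of a chunked streaming copy instead (the function name is made up for illustration):

```php
<?php
// Stream a source to a destination in chunks instead of buffering the whole
// file in memory the way file_get_contents() + file_put_contents() does.
// Returns the number of bytes copied.
function streamDownload(string $url, string $dest): int
{
    $src = fopen($url, 'rb');
    $dst = fopen($dest, 'wb');

    // stream_copy_to_stream() copies in internal chunks, so memory use
    // stays flat regardless of file size.
    $bytes = stream_copy_to_stream($src, $dst);

    fclose($src);
    fclose($dst);
    return $bytes;
}
```

Usage would be something like `streamDownload('http://webserver.example/backup-0001.tar', '/backups/backup-0001.tar');` (hostname and paths hypothetical). This is only a sketch; it won't fix PHP-FPM-level throttling by itself, but it removes the per-file memory buffering as a variable.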

Is there an Apache equivalent of nginx mod_zip?

I have been working on a web app where I need to let the user select a number of files and download them as a zip archive. I am working with lots of data, so storing the zip file in memory or on disk isn't an option.
I am currently using Apache and haven't found any way to dynamically create and stream zip files to a client. One thing I did find is nginx's mod_zip, which seems to do exactly what I want.
What would be an Apache equivalent to mod_zip, or another solution to dynamically zip and stream zip files (without using disk space or loading the whole file in memory)?
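In the absence of an Apache module, one workaround is to do it on the PHP side: pipe the output of the Info-ZIP `zip` binary straight to the client, so the archive is produced on the fly and never lands on disk or fully in memory. A rough sketch, assuming `zip` is installed (function name and filename are illustrative):

```php
<?php
// Stream a zip of $dir to the client as it is created, using the Info-ZIP
// `zip` binary writing the archive to stdout ("-"). Nothing is buffered
// on disk or held whole in memory.
function streamZipOfDirectory(string $dir, string $downloadName = 'files.zip'): void
{
    header('Content-Type: application/zip');
    header('Content-Disposition: attachment; filename="' . $downloadName . '"');

    // -r: recurse into the directory, -q: quiet, "-": write to stdout.
    $proc = popen('cd ' . escapeshellarg($dir) . ' && zip -r -q - .', 'r');
    fpassthru($proc); // relay the stream to the client as it is produced
    pclose($proc);
}
```

Compression is CPU work per request; if that becomes a bottleneck, `zip -0` stores files without compressing them.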

Uploading common required file via FTP without affecting website?

Let's say I have a file common.php that is used by many pages on my website. Now I want to update the file via FTP, so there will be a second or two during which the file is unavailable or only partially uploaded.
During that time, require('common.php') fails with an error, so the website does not load properly.
How do I solve cases like this?
Thanks!
You can upload the file under a different name and rename it only after the upload completes. That minimizes the downtime.
Some clients can even do this automatically, which reduces the downtime further.
For example, the WinSCP SFTP/FTP client supports this, though only over the SFTP protocol, if that's an option for you.
In WinSCP preferences, enable Transfer to temporary filename for All files.
WinSCP will then upload all files with a temporary .filepart extension, overwriting the target file only after the upload finishes.
(I'm the author of WinSCP)
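If you script the upload yourself instead of using a GUI client, the same temporary-name trick is straightforward with PHP's ftp_* functions. A sketch (function name, host, credentials, and paths are placeholders; note that the FTP protocol does not strictly guarantee the rename is atomic, but on typical Unix servers it is effectively instantaneous):

```php
<?php
// Upload to a temporary name, then swap it into place with a rename, so
// require('common.php') never sees a half-written file.
function uploadAtomically($ftp, string $localFile, string $remoteFile): bool
{
    $tmp = $remoteFile . '.filepart';
    if (!ftp_put($ftp, $tmp, $localFile, FTP_BINARY)) {
        return false;
    }
    return ftp_rename($ftp, $tmp, $remoteFile);
}

// Usage sketch (hypothetical credentials):
// $ftp = ftp_connect('ftp.example.com');
// ftp_login($ftp, 'user', 'password');
// ftp_pasv($ftp, true);
// uploadAtomically($ftp, 'common.php', '/htdocs/common.php');
```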

PHP creating .nfs00000 files?

I have a PHP app which works fine for me, both on a test system and on a production system.
But another user of my app wrote to me that it creates a lot of .nfs00000* files on his system and slows down the loading of the page.
My app does not create any files on the filesystem; all data is stored in MySQL. So I was really surprised by this. But that user removed my PHP app from his website and the problem disappeared.
I will be honest: I know nothing about .nfs00000* files and was not able to find anything useful about them. Can someone explain what they are, why they are created, and whether I can do anything to avoid creating them?
Thanx, Honza
Maybe this can help:
Under Linux/Unix, if you remove a file that a currently running process still has open, the file isn't really removed. Once the process closes the file, the OS removes the file handle and frees up the disk blocks.

This process is complicated slightly when the open-and-removed file is on an NFS-mounted filesystem. Since the process that has the file open is running on one machine (such as a workstation in your office or lab) and the files are on the file server, there has to be some way for the two machines to communicate information about this file. The way NFS does this is with the .nfsNNNN files. If you try to remove one of these files while it is still open, it will just reappear with a different number. So, in order to remove the file completely, you must kill the process that has it open.
If you want to know what process has this file open, you can use 'lsof .nfs1234'. Note, however, this will only work on the machine where the processes that has the file open is running. So, if your process is running on one machine (eg. bobac) and you run the lsof on some other burrow machine (eg. silo or prairiedog), you won't see anything.
(Source)
If your app is deleting or modifying files that are still open, that could be the cause of the problem.

PHP: How do I avoid reading partial files that are pushed to me with FTP?

Files are being pushed to my server via FTP. I process them with PHP code in a Drupal module. O/S is Ubuntu and the FTP server is vsftp.
At regular intervals I will check for new files, process them with SimpleXML and move them to a "Done" folder. How do I avoid processing a partially uploaded file?
vsftpd has lock_upload_files set to yes by default. I thought of attempting to move the files first, expecting the move to fail on a file that is still being uploaded. That doesn't seem to happen, at least on the command line. If I start uploading a large file and move it, it just keeps growing in the new location. I guess the directory entry is not locked.
Should I try fopen with mode 'a' or 'r+' just to see if it succeeds before attempting to load into SimpleXML or is there a better way to do this? I guess I could just detect SimpleXML load failing but... that seems messy.
I don't have control of the sender. They won't do an upload and rename.
Thanks
Using the lock_upload_files configuration option of vsftpd leads to locking files with the fcntl() function. This places advisory lock(s) on uploaded file(s) which are in progress. Other programs don't need to consider advisory locks, and mv for example does not. Advisory locks are in general just an advice for programs that care about such locks.
You need another command line tool like lockrun which respects advisory locks.
Note: lockrun must be compiled with the WAIT_AND_LOCK(fd) macro using the lockf() function rather than flock(), in order to work with locks set by fcntl() under Linux. When lockrun is compiled to use lockf(), it will cooperate with the locks set by vsftpd.
With these pieces (lockrun, mv, lock_upload_files) you can build a shell script or similar that moves files one by one, checking beforehand whether a file is locked and holding an advisory lock on it while it is moved. If a file is locked by vsftpd, lockrun can skip the call to mv so that running uploads are skipped.
If locking doesn't work, I don't know of a solution as clean/simple as you'd like. You could make an educated guess by not processing files whose last modified time (which you can get with filemtime()) is within the past x minutes.
If you want a higher degree of confidence than that, you could check and store each file's size (using filesize()) in a simple database, and every x minutes check new size against its old size. If the size hasn't changed in x minutes, you can assume nothing more is being sent.
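That size-and-mtime stability check might look like the sketch below (the function name is made up, and the wait time is a guess you would tune, not a guarantee):

```php
<?php
// A file is treated as fully uploaded only if neither its size nor its
// modification time has changed across two observations taken
// $waitSeconds apart. A slow or stalled upload can still fool this.
function looksFullyUploaded(string $path, int $waitSeconds = 60): bool
{
    clearstatcache(true, $path);
    $size  = filesize($path);
    $mtime = filemtime($path);

    sleep($waitSeconds);

    clearstatcache(true, $path);
    return filesize($path) === $size && filemtime($path) === $mtime;
}
```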
The lsof linux command lists opened files on your system. I suggest executing it with shell_exec() from PHP and parsing the output to see what files are still being used by your FTP server.
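A minimal version of that check, assuming lsof is installed (`-t` makes it print only the PIDs of processes holding the file open; the function name is illustrative):

```php
<?php
// Returns true if some process still has $path open, according to lsof.
// An empty result (and a non-zero exit) means no process has it open.
function isFileOpen(string $path): bool
{
    $out = shell_exec('lsof -t ' . escapeshellarg($path) . ' 2>/dev/null');
    return trim((string) $out) !== '';
}
```

Note that spawning lsof once per file is slow; for a large upload directory you would run it once against the whole directory and parse the combined output instead.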
Picking up on the previous answer, you could copy the file and then compare the sizes of the copy and the original at a fixed interval.
If the sizes match, the upload is done: delete the copy and work with the file.
If the sizes do not match, copy the file again and repeat.
Here's another idea: create a super (but hopefully not root) FTP user that can access some or all of the upload directories. Instead of your PHP code reading uploaded files right off the disk, make it connect to the local FTP server and download files. This way vsftpd handles the locking for you (assuming you leave lock_upload_files enabled). You'll only be able to download a file once vsftp releases the exclusive/write lock (once writing is complete).
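A sketch of that approach with PHP's FTP functions (function name, host, credentials, and paths are placeholders). Because vsftpd holds the write lock until the upload finishes, the download should not hand you a partially written file:

```php
<?php
// Fetch uploads through the local FTP server instead of reading the disk
// directly, so vsftpd's lock_upload_files locking is respected for us.
function fetchViaLocalFtp(string $remoteFile, string $localFile): bool
{
    $ftp = ftp_connect('127.0.0.1'); // the local vsftpd instance
    if ($ftp === false) {
        return false;
    }
    // Hypothetical processing account with read access to the upload dirs.
    $ok = ftp_login($ftp, 'processor', 'secret')
        && ftp_pasv($ftp, true)
        && ftp_get($ftp, $localFile, $remoteFile, FTP_BINARY);
    ftp_close($ftp);
    return $ok;
}
```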
You mentioned trying flock in your comment (and how it fails). It does indeed seem painful to try to match whatever locking vsftpd is doing, but dio_fcntl might be worth a shot.
I guess you've solved your problem years ago, but still:
If you use some pattern to find the files you need, you can ask the party uploading the files to use a different name and rename each file once its upload has completed.
You should also check out the HiddenStores directive in ProFTPD; more info here:
http://www.proftpd.org/docs/directives/linked/config_ref_HiddenStores.html
