I need to delete log files, but there is a possibility that a file is being accessed at the moment the delete call is made. By "being accessed" I mean a process is either reading from or writing to the file. In such cases I need to skip the file instead of deleting it. My server is Linux and PHP is running on Apache.
What I am looking for is something similar to (in pseudo-code):
<?php
$path = "path_to_log_file";
$log_file = "app.log";
if (!being_accessed($path . $log_file)) {
    unlink($path . $log_file);
}
?>
Now my question is: how can I define being_accessed? I know there might not be a language function to do this directly in PHP. I am thinking about using a combination of things like the last access time (maybe?) and flock (but that is only useful when the accessing application has flock-ed the file).
Any suggestions/insights welcome...
In general you will not be able to find that out without administration rights (i.e. being able to run tools like lsof to see if your file is listed). But if your scripts are running on a Linux/Unix server (which is the case for most hosts), you do not need to bother, because the filesystem takes care of this for you.

For example, say you have a 1 GB file and someone is downloading it. It is safe for you to delete the file (with unlink() or any other way) even if the download has only just started, and it will not interfere with the download. The filesystem knows the file is still open (some process holds a handle to it), so it only removes the directory entry: once you list the folder contents you will no longer see the file, but the disk space stays occupied (if the file is big enough you can verify this with df), and whoever holds a handle can still use it. Once all processes close their handles, the file is physically removed from the media and the disk space is freed.

So just unlink when needed. If you are bothered about the warning unlink() may raise (which can be the case on Windows), prepend the call with the @ operator (@unlink()) to suppress any warning the call may raise at runtime.
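The behaviour described above is easy to demonstrate. This is a minimal sketch, assuming a Linux filesystem; the path and variable names are illustrative:

```php
<?php
// Sketch: unlink() removes the directory entry, but a handle that is
// already open keeps working until it is closed.
$path = "/tmp/demo_unlink.log";
file_put_contents($path, "first line\nsecond line\n");

$fh = fopen($path, "r");           // a "downloader" opens the file
unlink($path);                     // we delete it while it is open

$stillListed = file_exists($path); // false: the directory entry is gone
$firstLine   = fgets($fh);         // "first line\n": the handle still reads fine
fclose($fh);                       // only now is the disk space actually freed
```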
You'd simply change your code this way (if you are doing it repetitively):
<?php
$path = "path_to_log_file";
$log_file = "app.log";
@unlink($path . $log_file);
Notice the @ to avoid getting an error in case the file is not deletable, and the lack of a closing tag (closing tags are a common source of errors and should be avoided).
Related
So, as described in the question itself, I want to replace the file from which a zip archive is opened, i.e. the script that is overwriting files with their new versions.
In case my question is still not clear: I want to fetch a zip file from a server, unzip it using the ZipArchive class, and then overwrite every file in the zip at the destination location. The problem is that the PHP file doing all this will itself be overwritten.
Will PHP raise an error, or will the process complete as intended?
On Linux, files are not usually locked (see https://unix.stackexchange.com/questions/147392/what-is-advisory-locking-on-files-that-unix-systems-typically-employs), so you can do whatever you want with that file. PHP works with the file in memory, so you can overwrite it during its execution.
But if you run the script multiple times while the first run is still in progress, it might load an incomplete version and throw an error, so it might be wise to make sure that won't happen (using locks), or to try a more atomic approach.
Windows locks files so I assume you won't be able to extract files the same way there.
Hey, so I'm trying to clean up my code a bit, and I just need to know: how important is the fclose function in PHP? By this I mean... well, I've been told that you always need to fclose a file when you're done with it. This leads me to believe that if a file stays open too long then it gets corrupt in some way?
I don't know, but the thing is: I've got an if statement that, if the condition is true, opens the file(s) and writes to them. Would it be just as efficient to open all the files for writing/reading at the beginning of the script, and then just include the instruction to actually write to them if the conditional is true?
And while we're on the topic: if I want to read line by line from a file, I'll simply use the $array = file("filename") shortcut I learned here. Is there a shortcut for writing to a file as well, without having to go through all the fopen stuff? Can I take a file and make it an array, line by line, and by changing that array, change the file? Is there any way to do that?
Thanks!
if a file stays open too long then it gets corrupt in some way?
I think PHP is smart enough to garbage-collect your open file handles when you are finished using them. I don't think the file will be corrupted if you don't close it, unless you write to it unintentionally.
Would it be just as efficient to open all the files for writing/reading at the beginning of the script
I'm not sure you should worry about efficiency unless you need to. If your code is working and is maintainable, I wouldn't change where you open files.
Is there a shortcut for writing to a file as well, without having to go through all the fopen stuff?
You can use file_put_contents(..) as a shortcut to write to files.
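As a sketch of that shortcut (the path here is illustrative), file_put_contents() mirrors file() on the write side, with a flag for appending:

```php
<?php
// Sketch: file_put_contents() as the write-side counterpart of file().
$file = "/tmp/demo_fpc.txt";

file_put_contents($file, "line 1\n");              // create/overwrite in one call
file_put_contents($file, "line 2\n", FILE_APPEND); // append instead of overwrite

$lines = file($file);  // ["line 1\n", "line 2\n"]
```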
The number of files that a process can have open at a given time is limited by the operating system. If you open many files and never close them eventually you'll run out of your quota and you can't open any more.
On the other hand, if you open the file for writing, until you close the file you have no guarantee that what you have written is safely on the disk.
The simple explanation is: until you fclose() the file, you have no guarantee that what you wrote with fwrite() is actually there. The operating system may keep the content in a buffer while waiting for access to the hard disk. If your script finishes without closing the file, that data can simply be lost.
Now, in the majority of cases this doesn't actually happen, but if you want to be sure, fclose().
Can I take a file and make it an array, line by line, and by changing that array, change the file?
You could make your own array class (implementing the ArrayAccess interface) that loads every line of the file, then have its offsetSet and offsetUnset methods rewrite the file every time you call them.
But I doubt it would be performant to rewrite everything every time you make a change.
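A minimal sketch of that idea, with a hypothetical FileLines class (the class name and flush strategy are assumptions, not from the original answer):

```php
<?php
// Sketch: a file-backed "array" that rewrites the whole file on every
// change. Simple, but O(file size) per write, as the answer warns.
class FileLines implements ArrayAccess
{
    private string $path;
    private array $lines;

    public function __construct(string $path)
    {
        $this->path  = $path;
        $this->lines = file_exists($path)
            ? file($path, FILE_IGNORE_NEW_LINES)
            : [];
    }

    public function offsetExists(mixed $i): bool { return isset($this->lines[$i]); }
    public function offsetGet(mixed $i): mixed   { return $this->lines[$i] ?? null; }

    public function offsetSet(mixed $i, mixed $value): void
    {
        if ($i === null) { $this->lines[] = $value; }   // $f[] = ... appends
        else             { $this->lines[$i] = $value; }
        $this->flush();                                 // rewrite on every change
    }

    public function offsetUnset(mixed $i): void
    {
        unset($this->lines[$i]);
        $this->lines = array_values($this->lines);      // keep indexes dense
        $this->flush();
    }

    private function flush(): void
    {
        file_put_contents($this->path, implode("\n", $this->lines) . "\n");
    }
}
```

Usage would be `$f = new FileLines('/tmp/notes.txt'); $f[] = 'first'; $f[0] = 'changed';`, and every statement rewrites the file, which is exactly the performance concern above.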
This leads me to believe that if a file stays open too long then it gets corrupt in some way?
No, it doesn't corrupt the file. It just uses up resources (opening or keeping a file handle open does take some time, memory and overhead) and you risk making other scripts that want to open the same file wait. The actual file handle will automatically be closed once your script ends, but it's a good idea to explicitly close it as soon as you're done with it. That goes for everything really: if you don't need it anymore, clean it up.
Would it be just as efficient to open all the files for writing/reading at the beginning of the script, and then just include the instruction to actually write to them if the conditional is true??
No, see above. Opening file handles isn't free, so don't do it unless you need to.
if I want to read line by line from a file I'll simply use the $array = file("filename") shortcut I learned here
That's nice, but be aware that this reads the entire file into memory at once. For small files that hardly matters, but for larger files it means that a) your script will stop while the entire file is being read from the disk and that b) you need to have enough memory available to store the file in. PHP scripts are usually rather memory constrained, since there are typically many instances running in parallel.
Is there a shortcut for writing to a file as well, without having to go through all the fopen stuff?
file_put_contents
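For the memory concern raised above, the usual alternative to file() is to stream line by line with fgets(); a minimal sketch (the sample file is illustrative):

```php
<?php
// Sketch: reading line by line with fgets(), so memory use stays flat
// regardless of file size (unlike file(), which loads it all at once).
$path = "/tmp/demo_stream.txt";
file_put_contents($path, "a\nb\nc\n");

$fh = fopen($path, "r");
$count = 0;
while (($line = fgets($fh)) !== false) {
    $count++;              // process one line at a time
}
fclose($fh);               // $count is now 3
```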
I have a daemon that opens a file and writes to it throughout operation (typically for many days at a time). In order to support log rotation, I want to be able to identify when the file the handle refers to is in a new location from the original.
Is this possible? fstat() doesn't give me anything useful for this situation.
My current solution is, in the log-writing function, testing the existence of the log file and if it's not there, closing the old handle and opening a new handle. This works, but is a hack and has limitations. In my case, our systems group uses a tool for log rotation that requires them to touch the file after rotating it out, which causes my daemon to continue thinking that its file handle points to the correct place.
Here's a thought. It's not portable, I'm not totally sure if it works or is reliable, and it makes me cringe a little, but you can probably use readlink on /proc/%d/fd/%d, where the first %d is the result of getpid(), and the second is your file descriptor.
There are some caveats here, though. First, the whole "get path + do something with that path" approach will have a race condition in the face of a rename happening concurrently. Also, your log file could have other links. I'm not sure what the behavior is for the links in /proc in the face of a rename, either.
You can simply re-acquire your file handle periodically (with mode a), for example every 24 hours. That allows you to continue logging despite the moronic and buggy log rotation utility (buggy because there is an inevitable race condition between renaming the file and re-touching it).
fstat gives you an inode number, which will change when the log is rotated.
See http://php.net/manual/en/function.fstat.php and http://www.php.net/manual/en/function.lstat.php
You can compare the inode number from fstat with the inode number from lstat; if they are different, reopen.
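A minimal sketch of that inode comparison (the function name and the append mode are assumptions):

```php
<?php
// Sketch: reopen the log when the path no longer refers to the inode
// our handle is writing to, i.e. after the log has been rotated.
function reopen_if_rotated($fh, string $path)
{
    clearstatcache();                 // avoid PHP's per-path stat cache
    $open = fstat($fh);               // inode of the handle we hold
    $disk = @lstat($path);            // inode currently at that path

    if ($disk === false || $disk['ino'] !== $open['ino']) {
        fclose($fh);                  // the old file was rotated away
        $fh = fopen($path, 'a');      // reacquire under the original name
    }
    return $fh;
}
```

The daemon would call this from its log-writing function; when the inodes match, the handle is returned unchanged.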
The standard way of handling this for Unix daemons in the past has been to catch SIGHUP and use it as a signal to reopen the log file, and have the log rotation script send SIGHUP.
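In PHP that classic pattern can be sketched with the pcntl extension (CLI SAPI only; the log path, variable names, and the logrotate postrotate line are assumptions):

```php
<?php
// Sketch: catch SIGHUP and reopen the log under its original name.
// The rotation script would send the signal after rotating, e.g.
// postrotate: kill -HUP `cat /var/run/mydaemon.pid`
$logPath = "/tmp/demo_daemon.log";
$log = fopen($logPath, 'a');
$rotations = 0;

pcntl_signal(SIGHUP, function () use (&$log, &$rotations, $logPath) {
    fclose($log);                     // drop the rotated-away file
    $log = fopen($logPath, 'a');      // reopen under the original name
    $rotations++;
});

// In the daemon's main loop, deliver any pending signals:
pcntl_signal_dispatch();
```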
I have to write a script in PHP which will dynamically replace some files on the server from time to time. An easy thing, but the problem is that I want to avoid the situation where a user requests a file while it is being replaced. They could get an incomplete file, or even an error.
The best solution I can think of is to block access to the site during the replacement, e.g. by setting up an .htaccess that redirects all requests to a page announcing a short break. But the .htaccess file normally already exists, so the server might read an incomplete .htaccess file as well.
Is there any way to solve this?
Edit: Thank you so much for all the answers, guys. You are brilliant.
#ircmaxell Your idea sounds great to me. I read what the folks at PHP.net wrote, and I'm not sure I understand it all correctly.
So tell me: if I follow all the steps you wrote and add apc.file_update_protection to my php.ini, there will be no way for a user to get an incomplete file at any time? There will always be exactly one, correct file? Are you 100% sure?
It is very important to me, because these replacements will happen very often and there is a big chance of a file being requested during the rename.
Here's something that's easy, and will work on any local filesystem on linux:
Upload (or write) the file to a temporary filename
Move the file (using the mv command, either over FTP or on the command line, etc., or the rename function in PHP) to overwrite the existing one.
When you execute the mv command, it basically deletes the old file pointer and writes the new one. Since it's done at the filesystem level, it's an atomic operation, so a client can't get an old file...
APC recommends doing this to prevent these very issues from cropping up...
Also note that you could use rsync to do it as well (since it basically does this behind the scenes)...
Doesn't this work already? I never tested for this specifically but I've done what you're doing and that problem never showed up.
It seems like an easy thing for an operating system to do:
Upload / write to a temporary file
When writing is done, block access to the original file (make the request for the file wait)
Delete the file, rename the temporary one and remove any locks
I'm fairly sure this is what an OS should do for copying. If you're writing the file contents yourself with PHP you'll just have to do this yourself...
Try Rails-less Capistrano, or the method it uses:
in a directory you have two things:
A folder containing folders, each subfolder being a release
A soft link to the current release folder
When you upload the new file, upload it into a new release folder. Check that no one is currently using the current release (this might be a little tricky; assuming you don't have a crazy number of users, you could probably do it with a DB entry), and then rewrite the symlink to point to the newest release.
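The symlink cut-over itself can be sketched like this (paths are illustrative; the temp-link trick is an assumption about how you would make the switch atomic):

```php
<?php
// Sketch: build the new link under a temporary name, then rename() it
// over the old one. rename() replaces an existing link atomically,
// while a plain symlink() call would fail if "current" already existed.
$releases = "/tmp/releases";
@mkdir("$releases/v2", 0755, true);         // upload the new release here
file_put_contents("$releases/v2/index.html", "v2");

@unlink("$releases/current.tmp");           // clean up any stale temp link
symlink("$releases/v2", "$releases/current.tmp");
rename("$releases/current.tmp", "$releases/current");
```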
Maybe try it like this:
delete the file and save its path
ln -nfs pathtosorrypage.html movedfilepath (so requests for the old path get the sorry page)
upload file to some temporary folder on the server
remove symlink
mv newfile movedfilepath
Option 1: If you have a lot of users and this replacing is done not so frequent, you can set up a maintenance on the site (block access) and have no one log in after a certain time, and finally cut off everyone who is logged in when you're about to do the replacement.
Option 2: If the file replacing is done frequently (in which case you shouldn't do maintenance every day), handle it in code. Keep two copies of the same file (in the same folder if you want). Then, when you're about to replace the file, have your code serve the copy while you replace the original. You can do it with a simple IF.
Pseudo-code:
if (replaceTime - 15 <= currentTime && currentTime <= replaceTime + 15) {
    // allows 30 seconds for another script to bring in the new image into 'myImage.jpg'
    echo '<img src="/myFiles/myOldImage.jpg" />';
} else {
    echo '<img src="/myFiles/myImage.jpg" />';
}
No need to update any database or manually move/copy/rename a file.
After replaceTime + 15 has passed:
copyFileTo("myImage.jpg","myOldImage.jpg");
// Now you have the copy ready for the next time to replace
I have an upload form created in PHP on my website where people can upload a zip file. The zip file is then extracted and all the file locations are added to a database. The upload form is for people to upload pictures only; obviously, with the files being inside the zip folder, I can't check what files are being uploaded until the archive has been extracted. I need a piece of code which will delete all the files which aren't image formats (.png, .jpeg, etc.). I'm really worried about people being able to upload malicious PHP files, which is a big security risk! I also need to be aware of people changing the extensions of PHP files to try to get around this security feature.
This is the original script I used http://net.tutsplus.com/videos/screencasts/how-to-open-zip-files-with-php/
This is the code which actually extracts the .zip file:
function openZip($file_to_open) {
    global $target;
    $zip = new ZipArchive();
    $x = $zip->open($file_to_open);
    if ($x === true) {
        $zip->extractTo($target);
        $zip->close();
        unlink($file_to_open);
    } else {
        die("There was a problem. Please try again!");
    }
}
Thanks, Ben.
I'm really worried about people being able to upload malicious php files, big security risk!
Tip of the iceberg!
I also need to be aware of people changing the extensions of php files trying to get around this security feature.
Generally changing the extensions will stop PHP from interpreting those files as scripts. But that's not the only problem. There are more things than ‘...php’ that can damage the server-side; ‘.htaccess’ and files with the X bit set are the obvious ones, but by no means all you have to worry about. Even ignoring the server-side stuff, there's a huge client-side problem.
For example if someone can upload an ‘.html’ file, they can include a <script> tag in it that hijacks a third-party user's session, and deletes all their uploaded files or changes their password or something. This is a classic cross-site-scripting (XSS) attack.
Plus, thanks to the ‘content-sniffing’ behaviours of some browsers (primarily IE), a file that is uploaded as ‘.gif’ can actually contain malicious HTML such as this. If IE sees telltales like (but not limited to) ‘<html>’ near the start of the file it can ignore the served ‘Content-Type’ and display as HTML, resulting in XSS.
Plus, it's possible to craft a file that is both a valid image your image parser will accept and contains embedded HTML. There are various possible outcomes depending on the exact version of the user's browser and the exact format of the image file (JPEGs in particular have a very variable set of possible header formats). There are mitigations coming in IE8, but that's no use for now, and you have to wonder why they can't simply stop doing content-sniffing instead of burdening us with shonky non-standard extensions to HTTP headers that should have Just Worked in the first place.
I'm falling into a rant again. I'll stop. Tactics for serving user-supplied images securely:
1: Never store a file on your server's filesystem using a filename taken from user input. This prevents bugs as well as attacks: different filesystems have different rules about what characters are allowable where in a filename, and it's much more difficult than you might think to ‘sanitise’ filenames.
Even if you took something very restrictive like “only ASCII letters”, you still have to worry about too-long, too-short, and reserved names: try to save a file with as innocuous a name as “com1.txt” on a Windows server and watch your app go down. Think you know all the weird foibles of path names of every filesystem on which your app might run? Confident?
Instead, store file details (such as name and media-type) in the database, and use the primary key as a name in your filestore (eg. “74293.dat”). You then need a way to serve them with different apparent filenames, such as a downloader script spitting the file out, a downloader script doing a web server internal redirect, or URL rewriting.
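A minimal sketch of point 1 (the table schema, column names, and paths here are assumptions, not from the original answer):

```php
<?php
// Sketch: the user's filename goes into the database as data only; the
// file on disk is named after the row's primary key, so even a name
// like "../../etc/passwd" or "evil.php" is inert.
function store_upload(PDO $pdo, string $storeDir, string $userName, string $mime, string $bytes): int
{
    $stmt = $pdo->prepare('INSERT INTO uploads (name, mime) VALUES (?, ?)');
    $stmt->execute([$userName, $mime]);      // hostile name is just a string here
    $id = (int) $pdo->lastInsertId();

    file_put_contents("$storeDir/$id.dat", $bytes);  // disk name is just the key
    return $id;
}
```

A downloader script would then look the row up by id and emit `header('Content-Disposition: ...')` with the stored display name before `readfile()`-ing the `.dat` file.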
2: Be very, very careful using ZipArchive. There have been traversal vulnerabilities in extractTo of the same sort that have affected most naive path-based ZIP extractors. In addition, you lay yourself open to attack from ZIP bombs. Best to avoid any danger of bad filenames, by stepping through each file entry in the archive (eg. using zip_read/zip_entry_*) and checking its details before manually unpacking its stream to a file with known-good name and mode flags, that you generated without the archive's help. Ignore the folder paths inside the ZIP.
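A sketch of that manual-unpacking approach, using ZipArchive's per-entry API (a modern equivalent of zip_read/zip_entry_*); the function name, the allowed-extension list, and the size cap are assumptions:

```php
<?php
// Sketch: never let the archive choose paths. Read each entry's bytes
// with getFromIndex() and write them under names WE generate, ignoring
// the folder paths inside the ZIP entirely.
function safe_extract(string $zipPath, string $destDir, int $maxBytes = 10000000): array
{
    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        throw new RuntimeException('Cannot open archive');
    }

    $written = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $stat = $zip->statIndex($i);
        if ($stat['size'] > $maxBytes) {
            continue;                           // crude zip-bomb guard
        }
        $ext = strtolower(pathinfo($stat['name'], PATHINFO_EXTENSION));
        if (!in_array($ext, ['png', 'jpg', 'jpeg', 'gif'], true)) {
            continue;                           // images only, per the question
        }
        $bytes = $zip->getFromIndex($i);        // entry contents, no paths involved
        $out   = $destDir . '/' . $i . '.' . $ext;  // our name, not the archive's
        file_put_contents($out, $bytes);
        $written[] = $out;
    }
    $zip->close();
    return $written;
}
```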
3: If you can load an image file and save it back out again, especially if you process it in some way in between (such as to resize/thumbnail it, or add a watermark) you can be reasonably certain that the results will be clean. Theoretically it might be possible to make an image that targeted a particular image compressor, so that when it was compressed the results would also look like HTML, but that seems like a very difficult attack to me.
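A sketch of that load-and-save-back step using the GD extension (the function name, output path, and quality setting are assumptions):

```php
<?php
// Sketch: decode and re-encode the upload with GD so the stored bytes
// are freshly generated, discarding any embedded HTML along the way.
function reencode_jpeg(string $inBytes, string $outPath, int $quality = 90): bool
{
    $img = @imagecreatefromstring($inBytes);    // fails on non-images
    if ($img === false) {
        return false;                           // reject: not a decodable image
    }
    $ok = imagejpeg($img, $outPath, $quality);  // write clean, regenerated bytes
    imagedestroy($img);
    return $ok;
}
```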
4: If you can get away with serving all your images as downloads (ie. using ‘Content-Disposition: attachment’ in a downloader script), you're probably safe. But that might be too much of an inconvenience for users. This can work in tandem with (3), though, serving smaller, processed images inline and having the original higher-quality images available as a download only.
5: If you must serve unaltered images inline, you can remove the cross-site-scripting risk by serving them from a different domain. For example use ‘images.example.com’ for untrusted images and ‘www.example.com’ for the main site that holds all the logic. Make sure that cookies are limited to only the correct virtual host, and that the virtual hosts are set up so they cannot respond on anything but their proper names (see also: DNS rebinding attacks). This is what many webmail services do.
In summary, user-submitted media content is a problem.
In summary of the summary, AAAARRRRRRRGGGGHHH.
ETA re comment:
at the top you mentioned about 'files with the X bit set', what do you mean by that?
I can't speak for ZipArchive.extractTo() as I haven't tested it, but many extractors, when asked to dump files out of an archive, will recreate [some of] the Unix file mode flags associated with each file (if the archive was created on a Unix and so actually has mode flags). This can cause you permissions problems if, say, owner read permission is missing. But it can also be a security problem if your server is CGI-enabled: an X bit can allow the file to be interpreted as a script and passed to any script interpreter listed in the hashbang on the first line.
I thought .htaccess had to be in the main root directory, is this not the case?
Depends how Apache is set up, in particular the AllowOverride directive. It is common for general-purpose hosts to AllowOverride on any directory.
what would happen if someone still uploaded a file like ../var/www/wr_dir/evil.php?
I would expect the leading ‘..’ would be discarded, that's what other tools that have suffered the same vulnerability have done.
But I still wouldn't trust extractTo() against hostile input, there are too many weird little filename/directory-tree things that can go wrong — especially if you're expecting ever to run on Windows servers. zip_read() gives you much greater control over the dearchiving process, and hence the attacker much less.
First you should forbid every file that doesn't have a proper image file extension. After that, you can use the getimagesize function to check whether the files are regular image files.
Furthermore, you should be aware that some image formats allow comments and other meta information. This could be used to carry malicious code such as JavaScript, which some browsers will execute under certain circumstances (see Risky MIME sniffing in Internet Explorer).
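Both checks together can be sketched like this (the function name and the allowed-extension list are assumptions; as a later answer notes, getimagesize() alone is not sufficient, since valid images can still carry embedded payloads):

```php
<?php
// Sketch: extension whitelist first, then getimagesize() to confirm
// the bytes actually decode as an image.
function looks_like_image(string $path): bool
{
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    if (!in_array($ext, ['png', 'jpg', 'jpeg', 'gif'], true)) {
        return false;                          // wrong extension: reject outright
    }
    return @getimagesize($path) !== false;     // bytes must parse as an image
}
```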
You should probably not rely just on the filename extension, then. Try passing each file through an image library to validate that its really an image, also.
I don't see the risk in having renamed php files in your DB...
As long as you're not evaluating them as PHP files (or at all, for that matter), they can't do too much harm, and since there's no .php extension the php engine won't touch them.
I guess you could also search the files for <?php...
Also: assume the worst about the files uploaded to your machine. Rename the folder into which you're saving them "viruses" and treat it accordingly. Don't make it public, don't give any file launch permissions (especially for the PHP user), etc.
You might also want to consider doing mime type detection with the following library:
http://ca.php.net/manual/en/ref.fileinfo.php
Right now you are relying on your hard drive space for extraction. You can check file headers to determine what kind of files they are; there are probably libraries for that.
Off-topic: isn't it better to let the user select a couple of images instead of uploading a zip file? Better for people who don't know what zip is (yes, they exist).
If you set php to only parse files ending with .php, then you can just rename a file from somename.php to somename.php.jpeg and you are safe.
If you really want to delete the files, there is a zip library available to php. You could use it to check the names and extensions of all the files inside the zip archive uploaded, and if it contains a php file, give the user an error message.
Personally, I'd add something to the Apache config to make sure that it served PHP files as text from the location the files are uploaded to, so you're safe, and can allow other file types to be uploaded in the future.
Be aware of this: Passing Malicious PHP Through getimagesize().
"inject PHP through image functions that attempt to insure that images are safe by using the getimagesize() function"
Read more here: http://ha.ckers.org/blog/20070604/passing-malicious-php-through-getimagesize/
Better yet, for user logos use Gravatar, like Stack Overflow does ;)
Use getimagesize function.
Full procedure:
1.) Extract the extension of the uploaded file and compare it against the allowed extensions.
2.) Create a random string for renaming the uploaded file. One idea is md5(session_id() . microtime()); it is very unlikely to be duplicated, and if your server is fast enough to process two requests within a microsecond, append an incrementing counter to the string as well.
Now move the file.
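The procedure above can be sketched as follows (the function name and the allowed-extension list are assumptions; on modern PHP, random_bytes() is a stronger choice than md5(session_id() . microtime()) since it cannot collide through fast clocks alone):

```php
<?php
// Sketch: whitelist the extension, generate a random name, move the
// file. For real HTTP uploads, use move_uploaded_file() instead of
// rename() so PHP verifies the source is a genuine upload.
function handle_upload(string $tmpPath, string $origName, string $destDir): ?string
{
    $allowed = ['png', 'jpg', 'jpeg', 'gif'];
    $ext = strtolower(pathinfo($origName, PATHINFO_EXTENSION));
    if (!in_array($ext, $allowed, true)) {
        return null;                           // step 1: extension not allowed
    }

    // Step 2: random, collision-resistant name; original name is discarded.
    $name = bin2hex(random_bytes(16)) . '.' . $ext;

    // Step 3: move the file into place.
    rename($tmpPath, "$destDir/$name");
    return "$destDir/$name";
}
```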
A tip
Disable PHP file processing in the upload directory; that will always protect you from server-side execution attacks. If possible, put your rules in the root .htaccess (or in the httpd config file) and disable .htaccess overrides from there; that solves most of these problems.