Search server for newly added files with PHP

I'm trying to write a cronjob in php to search a server for newly uploaded files. Every night the server adds a csv file, which must be pulled to my local server, and inserted into my database. I can read the csv file, insert it into the database, and everything else on my end, except figure out how to scan the directory for the new file every night. Does anybody have any general suggestions for going about this?

Algorithm:
Scan the directory, order files by date, and record the date of the most recent file
On subsequent scans, compare the date of the newest file with the recorded date
If the date is newer, a new file has been uploaded.
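The steps above can be sketched roughly as follows. This is only an illustration, not a definitive implementation: the function name findNewerFiles is hypothetical, and the sketch assumes the caller stores the recorded timestamp somewhere (a small file, for instance) between cron runs.

```php
<?php
// Hypothetical helper: returns the CSV files in $dir whose modification
// time is later than $lastSeen, as [path => mtime], oldest first.
function findNewerFiles(string $dir, int $lastSeen): array
{
    $newer = [];
    foreach (glob(rtrim($dir, '/') . '/*.csv') ?: [] as $path) {
        $mtime = filemtime($path);
        if ($mtime !== false && $mtime > $lastSeen) {
            $newer[$path] = $mtime;
        }
    }
    asort($newer); // sort by mtime, so the last entry is the newest file
    return $newer;
}
```

After each scan, the largest mtime in the result would become the new recorded date for the next run.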

While you can technically do it with bare PHP, I'd go for the find command:
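// -print0 terminates every name with "\0", so drop the empty trailing element
$files = array_filter(explode("\0", (string) shell_exec("find /path/to/dir -mtime -1 -iname '*.csv' -print0")));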

Using scandir, make an array $allFiles that contains all the files currently in the directory. Keep another array $oldFiles containing the files that were already there at the previous scan. Then array_diff($allFiles, $oldFiles) will yield an array containing only the new files.
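A minimal sketch of this approach, assuming the previous listing is saved as JSON in a state file kept outside the scanned directory. The function names and the state-file idea are assumptions added for illustration; arrow functions need PHP 7.4 or later.

```php
<?php
// Hypothetical helper: lists regular files in $dir, without "." and "..".
function listFiles(string $dir): array
{
    $names = array_diff(scandir($dir) ?: [], ['.', '..']);
    return array_values(array_filter($names, fn($n) => is_file("$dir/$n")));
}

// Returns the names present now but absent from the previous scan,
// then saves the current listing for the next run.
function detectNewFiles(string $dir, string $stateFile): array
{
    $oldFiles = is_file($stateFile)
        ? (json_decode(file_get_contents($stateFile), true) ?: [])
        : [];
    $allFiles = listFiles($dir);
    $newFiles = array_values(array_diff($allFiles, $oldFiles));
    file_put_contents($stateFile, json_encode($allFiles));
    return $newFiles;
}
```

On the very first run there is no state file, so every file in the directory is reported as new.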

Related

find out when a file's contents were last modified

I wrote a CMS script made of many folders and files and I want to find a way to track when I last modified any of the files. I wrote a recursive directory/file check that finds the latest modified file and gives me the date and time however my issue is this: every time that I as much as copy a file to the server, or rename a file, even if I didn't make any modifications at all to any of the files, the newly copied file or renamed file now has today's date and therefore my script shows that there was a modification made today even if I haven't made changes in weeks.
How can I circumvent that?
I am using filemtime()
Is there a way with PHP to know when the file was ACTUALLY last modified (ie when the code in a file was worked on the last time)?
Thanks
I found a way to do it and wanted to post the answer:
$test = new SplFileInfo('path/to/file');
echo $test->getMTime();
echo date('Y-m-d',$test->getMTime());
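SplFileInfo::getMTime() returns the time the file's contents were last modified, which is the same mtime value that filemtime() reports. A rename alone does not change it, but copying a file to the server usually creates a new file with a fresh mtime unless the transfer tool preserves timestamps.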

phpWord file growth issue when adding images

Ok, here's the scenario. I need to generate about 200 MS Word documents based off of data collected and stored in my database. Generating the word docs with or without photos is a user option. After Word doc generation I then want to create a zip file of all the files generated. The zip part is done, the word file generation for the most part is done.
When the user chooses to generate the reports without the photos, the site queries the database and returns about 200 records for the report. In a foreach loop I run the PHPWord code to generate the files and write them to a temp folder, and after the loop I run code to zip them all up and delete the temp files. Works great. BUT, when the option to generate the reports WITH photos is selected, it starts generating the Word docs, but the file sizes increase every time a file is created. The first file is 70k, the second is 140k, the third is 210k and so on, where each file should be only 70k. The only difference between the two operations is the inclusion of the addImage commands with the table cells, like so:
$table->addCell()->addImage('photos/thumb_image.jpg', $imageStyle);
Help please!
Call Media::resetElements() between loop iterations; it has been available since 0.10.0.

Using PHP to update file after a new copy is uploaded

So I'm trying to see if something like this is possible WITHOUT using database.
A file is uploaded to the server /files/file1.html
PHP is tracking the upload time by checking last update time in database
If the file (file1.html) has been updated since the last DB time, PHP makes changes; Otherwise, no changes are made
Basically, for a text simulation game (basketball), it outputs HTML files for rosters/stats/standings/etc. and I'd like to be able to insert each team's Logo at the top (which the outputted files don't do). Obviously, it would need to be done often as the outputted files are uploaded to the server daily. I don't want to have to go through each team's roster manually inserting images at the top.
Don't have an example as the league hasn't started.
I've been thinking of just creating a button on the league's website (not created yet) that when pushed would update the pages, but I'm hoping to have PHP do it by itself.
Yes, you could simply let PHP check the file's modification time (the point in time when the file was last written on the server, not when the picture itself was made). Check http://php.net/manual/en/function.filemtime.php and you should be done within 30 minutes ;)
Quick and dirty, untested code:
$filename = 'somefile.txt';
$stampfile = 'last_run.txt'; // hypothetical file holding the timestamp of the previous check
$last_run = is_file($stampfile) ? (int) file_get_contents($stampfile) : 0;
if (filemtime($filename) > $last_run) {
    // the file was uploaded since the last check: overwrite it (maybe check for an existing file etc. first)
}
file_put_contents($stampfile, time()); // remember this check for the next run
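To do this without a database, the value to compare against can be kept in a small file. A minimal sketch, assuming a hypothetical stamp file next to each uploaded page that stores the mtime the page had right after it was last processed (the function names are made up for illustration):

```php
<?php
// Hypothetical helper: true if $file's mtime differs from the one
// recorded in $stampFile (or if nothing has been recorded yet).
function needsProcessing(string $file, string $stampFile): bool
{
    $seen = is_file($stampFile) ? (int) file_get_contents($stampFile) : 0;
    clearstatcache(true, $file); // avoid a stale cached mtime
    return filemtime($file) !== $seen;
}

// Records the file's current mtime; call this after rewriting the file.
function markProcessed(string $file, string $stampFile): void
{
    clearstatcache(true, $file);
    file_put_contents($stampFile, (string) filemtime($file));
}
```

Storing the file's own mtime rather than the current time means that rewriting the page (to insert the logo, say) does not make it look freshly uploaded on the next run, as long as markProcessed() is called after the rewrite.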

How to use PHP to detect new files in a folder?

I will be sending new files over from one computer to another computer. How do I make PHP auto detect new/updated files in the folders and enter the information inside the files into mysql database?
Get all files you already know from the database
loop through the directory with http://www.php.net/manual/de/function.readdir.php
if the file is known, do nothing
if the file is not known, add it to the database
In the end, delete all files no longer in the directory
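The list above could be sketched like this. To keep the example self-contained, a plain array of file names stands in for the database table; the function name syncKnownFiles is hypothetical, and the actual INSERT and DELETE queries are left to the caller.

```php
<?php
// $known stands in for the file names already stored in the database.
// Returns [names to add to the database, names to delete from it].
function syncKnownFiles(string $dir, array $known): array
{
    $handle = opendir($dir);
    if ($handle === false) {
        throw new RuntimeException("Cannot open directory: $dir");
    }
    $present = [];
    while (($entry = readdir($handle)) !== false) {
        if ($entry === '.' || $entry === '..' || !is_file("$dir/$entry")) {
            continue; // skip directory entries and anything that is not a file
        }
        $present[] = $entry;
    }
    closedir($handle);

    $toAdd = array_values(array_diff($present, $known));    // not known yet
    $toDelete = array_values(array_diff($known, $present)); // no longer on disk
    sort($toAdd);
    sort($toDelete);
    return [$toAdd, $toDelete];
}
```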
I would pick a set-up where new files and old files are kept in separate directories.
But if you have no choice, you could check the modification date and match it with your last directory iteration. (Use filemtime for this).
Don't forget to do some database checking when you process an image though.
Save the timestamp of the last check, and on the next check look at the file info and compare the creation date. Better yet, because you store the file contents in a database, check the time the file was modified using filemtime().
You can't, at least not automatically: PHP works as a preprocessor and has an execution time limit (set in the configuration). If you need to do the processing with PHP, make a PHP script that outputs a web page that uses a meta redirect to itself. Inside the script, loop over the files and query the database for each file's name and modification time: if both match, there is nothing to do; if the file name exists but the modification time differs, it's an update; otherwise it's a new file.

Structurizing files without db

Basically, I have a simple form that users use to upload files. Files should be stored under the /files/ directory, in subdirectories that split them almost equally, e.g. /files/sub1/sub2/file1.txt
I also need to avoid storing duplicate files (by filename).
I have my own solution: calculate the sha1 of the filename, take the first 5 characters (abcde, for example) and put the file in /files/a/b/c/d/e/. This works well, but leads to situations where one folder contains 4k files and another 6k. Is there any way to make the file counts closer to each other? The maximum number of files can be 10k or 10kk.
Thanks for any help.
P.S. Maybe I explained something wrong, so once again :) The task is simple: you have only HTML and PHP (without any db) and a files directory where you should store only the uploaded files, without any data of your own. You should develop a script that stores uploads in the files directory without storing duplicates (by filename) and splits the uploaded files across subdirectories by file count (the counts in each directory should be as close to each other as possible).
I have no idea why you want it that way. But if you REALLY have to do it this way, I would suggest you set a limit on how many bytes are stored in each folder. Every time you have to save data, you open a log containing
the current subdirectory
the total number of bytes written to that directory
If necessary, you create a new subdirectory (you could use the current timestamp as its name because it won't repeat) and reset the byte count.
Then you save the file and increment the byte count by the number of bytes written.
I highly doubt it is worth the work, but I do not really know why you want to distribute the files that way.
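The byte-count log described above might look roughly like this. It is a sketch under stated assumptions: the log is represented as an in-memory array (reading and writing it to disk is left out), and chooseSubdir, $limit and $newName are hypothetical names introduced for the example.

```php
<?php
// $log = ['sub' => current subdirectory, 'bytes' => bytes written to it].
// Returns the updated log; $log['sub'] is where the upload should be saved.
function chooseSubdir(array $log, int $size, int $limit, callable $newName): array
{
    // Start a new subdirectory if none exists yet or the upload would not fit.
    if ($log['sub'] === '' || $log['bytes'] + $size > $limit) {
        $log = ['sub' => $newName(), 'bytes' => 0];
    }
    $log['bytes'] += $size; // account for the file about to be saved
    return $log;
}
```

Following the answer's suggestion, $newName could be something like fn() => (string) time(), so that each new subdirectory is named after the current timestamp.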
