Remove files which have no filename duplicates - PHP

For each document (.pdf, .txt, .docx, etc.) I also have a corresponding JSON file with the same filename.
Example:
file1.json,
file1.pdf,
file2.json,
file2.txt,
filex.json,
filex.pdf,
But I also have some JSON files that are not accompanied by a corresponding document.
I want to delete all JSON files that have no corresponding document. I'm really stuck because I can't find a proper solution to my problem.
I know how to use scandir(), get the filename and extension from pathinfo(), etc., but the issue is that for each JSON file I find in the directory I have to run another foreach over that directory, excluding all JSON files, to see whether the same filename exists so that I can decide whether to delete it. (This is how I think I would solve it.)
The problem here is performance: there are millions of files, and for each JSON file I have to run a foreach over millions of files.
Can anyone guide me to a better solution?
Thank you!
Edit: Since no one will help without first seeing a piece of code (and this approach on Stack Overflow is definitely wrong), here is what I'm trying:
<?php
$dir = "2000/";
$files = scandir($dir);
foreach ($files as $file) {
    $fullName = pathinfo($file);
    if ($fullName['extension'] === 'json') {
        if (!in_array($fullName['filename'].'.pdf', $files)){
            unlink($dir.$file);
        }
    }
}
As you can see, I can search for only one type of document (.pdf in this case). I want to search for every extension except .json, and I don't want to run a foreach/in_array() for each JSON file; I want to achieve all of this in a single foreach.

Maybe you should approach it another way? I mean: iterate through all the JSON files and, for each one, look for other files with the same base name; if none is found, remove the JSON file.
It would look as follows:
$dir = "2000/";
// glob() already returns each path prefixed with $dir
foreach (glob($dir . "*.json") as $path) {
    $file = new \SplFileInfo($path);
    // Count every file sharing the base name; exactly 1 means only the .json itself exists
    if (count(glob($dir . $file->getBasename('.' . $file->getExtension()) . ".*")) === 1) {
        unlink($path);
    }
}
Manual
PHP: SplFileInfo
PHP: glob
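Since the question stresses millions of files, it may also be worth sketching a single-pass variant: read the directory once, remember the base names of all non-JSON files as array keys, and then keep only the JSON files whose base name was never seen. The function name findOrphanJson() is made up for this illustration, and the sketch assumes all base names fit in memory; it only returns the orphaned paths so you can inspect them before deleting anything.

```php
<?php
// Hypothetical helper (not from the original answer): lists .json files in $dir
// that have no sibling file with the same base name and a different extension.
function findOrphanJson(string $dir): array
{
    $documents = []; // base name => true, for every non-JSON file
    $jsonFiles = []; // base name => full path of the JSON file

    foreach (new DirectoryIterator($dir) as $entry) {
        if (!$entry->isFile()) {
            continue; // skips ".", ".." and subdirectories
        }
        $base = pathinfo($entry->getFilename(), PATHINFO_FILENAME);
        if (strtolower($entry->getExtension()) === 'json') {
            $jsonFiles[$base] = $entry->getPathname();
        } else {
            $documents[$base] = true;
        }
    }

    // Key lookups are hash-based, so no nested loop over the directory is needed.
    return array_values(array_diff_key($jsonFiles, $documents));
}
```

Usage would be something like foreach (findOrphanJson("2000/") as $path) { unlink($path); }, ideally after a dry run that only prints the paths.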

Related

ftp_get() file with partial filename(not a wildcard)

I need to get a file based on the second half of its filename with PHP.
The structure of the filename will always be NAME_123456789.dat, where the number is a unique tracking_id.
The name will be John, Mel, Bronson, etc.
To explain the process: a person enters their tracking_id in a search bar; the script takes that value and uses it to search a specific directory on the FTP server. Because the tracking_id is unique, the search should return only one result, hence ftp_get(), right?
Any help is greatly appreciated.
Given the relatively small directory size (100+ files, from the comments above), you should be OK first using ftp_nlist() to list all the files and then searching for and downloading the file you want.
$search = '_' . $trackingId . '.dat';
$searchLen = strlen($search);
$dir = '.'; // example directory
$files = ftp_nlist($connection, $dir) ?: []; // ftp_nlist() returns false on failure
foreach ($files as $file) {
    // check if $file ends with $search
    if (strrpos($file, $search) === strlen($file) - $searchLen) {
        // found it, download it (the transfer mode argument is required before PHP 7.3)
        ftp_get($connection, 'some/local/file/path', $dir . '/' . $file, FTP_BINARY);
        break; // the tracking ID is unique, so stop at the first match
    }
}
Better and more future-proof options can be found in Michael Berkowski's comment above...
How many files do you expect to be operating in the directory at any given time? If it is a small number, listing the contents via ftp may work suitably. If it is many thousands of files, you might want to store some sort of text manifest file to read from, or index them in a database.
These do hinge on how and when the files are uploaded to the FTP server, though, so given that we don't know anything about that, I cannot provide any specific solutions.
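If you want to keep the matching logic separate from the FTP calls (which also makes it easy to reuse with a manifest file later), a small helper along these lines might do. The name findByTrackingId() is hypothetical, and the sketch assumes the list may contain either bare file names or paths, since servers differ in what ftp_nlist() returns.

```php
<?php
// Hypothetical helper: returns the first listed name whose file part ends with
// "_<trackingId>.dat", or null when nothing matches.
function findByTrackingId(array $names, string $trackingId): ?string
{
    $suffix = '_' . $trackingId . '.dat';
    $suffixLen = strlen($suffix);

    foreach ($names as $name) {
        $base = basename($name); // tolerate entries such as "./Mel_123.dat"
        if (strlen($base) >= $suffixLen
            && substr_compare($base, $suffix, -$suffixLen) === 0) {
            return $name;
        }
    }
    return null;
}
```

The leading underscore in the suffix is what prevents a partial ID such as 23456789 from matching NAME_123456789.dat.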

Iterate thru possible file extension to get the proper extension

I have a folder with many files in various formats, e.g. .jpg, .png, .pdf, .doc, etc. The files are on a remote server. I have a JSON file with a list of filenames and their locations, but the extensions are missing.
I want to rebuild the JSON file and add the proper extension to each filename. How can I do this with PHP? Can anyone give me any ideas on how to iterate through the possible extensions to get the right filename + extension on the server?
E.g. I have a URL like this: http://www.somesite.com/filename. I know that on the server the file is a PDF, but how can I do this programmatically for many files, which may have different types, and rename the URL?
Any ideas?
Use a for loop to iterate over all currently known file extensions, execute a GET request to $url . $extension, and see whether the server returns a file.
If the server returns a file, you can break out of the loop.
You can nest two loops to do this for all known URLs.
Example
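$files = [
    "http://test.com/test",
    "http://test.com/test2",
];
$extensions = [
    ".jpg",
    ".docx",
];
$found = [];
foreach ($files as $file) {
    foreach ($extensions as $extension) {
        // One possible check (an assumption, not the only way): request the headers
        // and treat a 200 status line as "the file exists". Redirects are not handled.
        $headers = @get_headers($file . $extension);
        if ($headers !== false && preg_match('#^HTTP/\S+\s+200\b#', $headers[0])) {
            // Store the full URL wherever you need it
            $found[$file] = $file . $extension;
            break;
        }
    }
}
This example uses a foreach loop.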
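To get from the nested loops to a rebuilt JSON file, one option is to wrap them in a function that takes the existence check as a callable, so the HTTP part can be swapped out or faked while testing. This is only a sketch: resolveExtensions() and the $exists parameter are names invented here, and how you actually probe the server is up to you.

```php
<?php
// Hypothetical helper: maps each extension-less URL to the first candidate URL
// for which $exists() returns true, or to null when no extension matches.
function resolveExtensions(array $urls, array $extensions, callable $exists): array
{
    $resolved = [];
    foreach ($urls as $url) {
        $resolved[$url] = null; // stays null when no candidate extension matches
        foreach ($extensions as $extension) {
            if ($exists($url . $extension)) {
                $resolved[$url] = $url . $extension;
                break; // first hit wins, no need to try the remaining extensions
            }
        }
    }
    return $resolved;
}
```

The result can then be written back with json_encode($resolved, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES). Keep in mind that this sends up to one request per URL per extension, so ordering $extensions by likelihood helps.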

get a unknown file name in different dir, php

Here is my directory structure:
C:\xampp\htdocs\..
C:\download\20150923abc.xls //abc is a random value
How can I attach the file 20150923abc.xls in PHP?
Also, how do I change the filename after I get it?
Thanks.
Use glob() to find all files of type .xls; you can then use the returned file names as you wish. This sidesteps the issue of not knowing the specific file name.
$files = glob("c:/download/*.xls");
This will produce an array of all .xls files with their full file paths. If you wish to rename or attach these files, you can do so using the array returned by glob():
rename($files[0], "c:/download/somenewname.xls");
etc. Read more at:
PHP Glob Function
EDIT:
From Comment below:
foreach (glob($old_folder . "*.xls") as $filename)
{
    $names = explode('/', $filename);
    $just_file_name = end($names);
    echo $just_file_name . "----\n";
    $new_folder = dirname(__FILE__) . "\\prm\\att\\";
    //rename_win($old_folder, $new_folder);
    rename($filename, $new_folder . $just_file_name); // <== this line changed.
}
unset($filename);
To fix the above code in your comment, you need to change the incorrect variables referenced (there was no array $files[0]) to the ones used in the foreach loop.
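Because the known part of the name in the question is the date prefix, the glob pattern can be narrowed to it rather than matching every .xls file. A minimal sketch, assuming the prefix is always a Ymd date and using a made-up helper name findByDatePrefix():

```php
<?php
// Hypothetical helper: returns the paths in $dir whose file name starts with
// $datePrefix and ends with the given extension.
function findByDatePrefix(string $dir, string $datePrefix, string $extension = 'xls'): array
{
    $pattern = rtrim($dir, '/\\') . '/' . $datePrefix . '*.' . $extension;
    return glob($pattern) ?: []; // glob() can return false on error
}
```

For example, findByDatePrefix('c:/download', '20150923') would be expected to contain the path of 20150923abc.xls, and basename() on each result gives the file name without the explode()/end() step.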

Trying to echo contents of multiple text files while sorting the output by the file name - PHP

I'm not a developer, but I'm the default developer at work now. : ) Over the last few weeks I've found a lot of my answers here and at other sites, but this latest problem has me confused beyond belief. I KNOW it's a simple answer, but I'm not asking Google the right questions.
First... I have to use text files, as I don't have access to a database (things are locked down TIGHT where I work).
Anyway, I need to look into a directory for text files stored there, open each file and display a small amount of text, while making sure the text I display is sorted by the file name.
I'm CLOSE, I know it... I finally managed to figure out sorting, and I know how to read into a directory and display the contents of the files, but I'm having a heck of a time merging those two concepts together.
Can anyone provide a bit of help? With the script as it is now, I echo the sorted file names with no problem. My line of code that I thought would read the contents of a file and then display it is only echoing the line breaks, but not the contents of the files. This is the code I've got so far - it's just test code so I can get the functionality working.
<?php
$dirFiles = array();
if ($handle = opendir('./event-titles')) {
    while (false !== ($file = readdir($handle))) {
        if ($file != "." && $file != "..") {
            $dirFiles[] = $file;
        }
    }
    closedir($handle);
}
sort($dirFiles);
foreach($dirFiles as $file)
{
    $fileContents = file_get_contents($file);//////// This is what's not working
    echo $file."<br>".$fileContents."<br/><br/>";
}
?>
Help? : )
Dave
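$files = scandir('./event-titles') will return an array of filenames in filename-sorted order (including the "." and ".." entries, which need to be skipped). You can then do
foreach($files as $file)
{
    if ($file === '.' || $file === '..') {
        continue; // skip the directory entries that scandir() includes
    }
    $fileContents = file_get_contents('./event-titles/'.$file);
    echo $file."<br/>".$fileContents."<br/><br/>";
}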
Note that I use the directory name in the file_get_contents call, as the filename by itself will cause file_get_contents to look in the current directory, not the directory you were specifying in scandir.
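If the file names contain numbers (event2.txt, event10.txt), plain alphabetical order puts event10 before event2. A possible variation, assuming the titles live in .txt files, is to collect the contents keyed by file name and sort the keys naturally; readTitles() is just an illustrative name.

```php
<?php
// Hypothetical helper: returns [file name => contents] for every .txt file in
// $dir, ordered by file name using natural, case-insensitive sorting.
function readTitles(string $dir): array
{
    $titles = [];
    $paths = glob(rtrim($dir, '/') . '/*.txt') ?: [];
    foreach ($paths as $path) {
        // glob() returns the path including the directory, so it can be read directly
        $titles[basename($path)] = file_get_contents($path);
    }
    ksort($titles, SORT_NATURAL | SORT_FLAG_CASE);
    return $titles;
}
```

When echoing the contents into a page, wrapping them in htmlspecialchars() avoids surprises if a file contains markup.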

Searching for specific file extensions in a folder/directory (PHP)

I'm trying to design a program in PHP that would allow me to find files with specific file extensions (example .jpg, .shp etc) in a known directory which consists of multiple folders.
Sample code, documentation or information about what methods I will be required to use will be much appreciated.
glob is pretty easy:
<?php
foreach (glob("*.txt") as $filename) {
    echo "$filename size " . filesize($filename) . "\n";
}
?>
There are a few suggestions for recursive descent at the readdir page.
Take a look at PHP's SPL DirectoryIterator.
I believe PHP's glob() function is exactly what you are looking for:
http://php.net/manual/en/function.glob.php
Use readdir to get a list of files, and fnmatch to work out if it matches your required filename pattern. Do all this inside a function, and call your function when you find directories. Ask another question if you get stuck implementing this (or comment if you really have no idea where to start).
glob will get you all the files in a given directory, but not the sub directories. If you need that too, you will need to: 10. get recursive, 20. goto 10.
Here's a sketch:
function getFiles($pattern, $dir) {
    // $dir must end with a slash, e.g. "images/"
    $files = glob($dir . $pattern) ?: [];
    // "*" does not match "." or "..", so they are never returned here
    $folders = glob($dir . '*', GLOB_ONLYDIR) ?: [];
    foreach ($folders as $folder) {
        // array_merge() appends; "+" would drop entries whose numeric keys collide
        $files = array_merge($files, getFiles($pattern, $folder . '/'));
    }
    return $files;
}
The above may still need to be tweaked for your case (for example, symbolic links that point back to a parent directory could send it to infinite loop town).
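For completeness, the SPL iterators mentioned earlier can do the recursion for you. The following is a rough sketch rather than a drop-in solution: findByExtensions() is a name chosen for this example, and extensions are passed without the leading dot.

```php
<?php
// Hypothetical helper: walks $dir recursively and returns the paths of all
// files whose extension (case-insensitive, without the dot) is in $extensions.
function findByExtensions(string $dir, array $extensions): array
{
    $wanted = array_map('strtolower', $extensions);
    $matches = [];

    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        // Each $file is an SplFileInfo; SKIP_DOTS already hides "." and ".."
        if ($file->isFile() && in_array(strtolower($file->getExtension()), $wanted, true)) {
            $matches[] = $file->getPathname();
        }
    }
    sort($matches);
    return $matches;
}
```

A call such as findByExtensions('/some/known/dir', ['jpg', 'shp']) would then cover all subfolders in one go.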
