I have a bunch of uniquely named images with different extensions. If I have one of the unique names but don't know the extension (it's an image extension), how can I find the extension as fast as possible? I've seen other people do this by trying every possible file extension on that file name, but it seems too slow to try to load 6 different combinations before bringing up the original image.
Does anyone know an easier way?
You could use glob for this. It might not be the best solution, but it is simple. From the manual:
The glob() function searches for all the pathnames matching pattern
according to the rules used by the libc glob() function, which is
similar to the rules used by common shells.
$files = glob('filenamewithoutextension.*');
if (count($files) > 0) {
    $file = $files[0]; // There may be more than one match, but we only want the first one
}
After getting the filename you can use pathinfo to get the specific extension.
$extension = pathinfo($file, PATHINFO_EXTENSION);
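The two steps above can be combined into one small function. This is only a sketch: the name findExtension and the idea of passing the directory as a parameter are assumptions for illustration, not part of the original answer. Note that glob() treats characters such as * and [ in the name as wildcards, so it assumes plain file names.

```php
<?php
// Hypothetical helper: returns the extension of the first file in $dir
// named "$name.<something>", or null when nothing matches.
function findExtension(string $dir, string $name): ?string
{
    // glob() returns an array of matching paths (empty when nothing matches)
    $matches = glob(rtrim($dir, '/') . '/' . $name . '.*');
    if (empty($matches)) { // also covers a false return on error
        return null;
    }
    // PATHINFO_EXTENSION gives the part after the last dot, without the dot
    return pathinfo($matches[0], PATHINFO_EXTENSION);
}
```

A call such as findExtension('/path/to/images', 'photo42') (both values made up) would then cost one directory scan instead of one lookup per candidate extension.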
I'm trying to extract some files out of a tar.gz file.
But the filename seems to cause problems:
xxx.some-random-number.tar.gz
When I use \PharData::isValidPharFilename('xxx.some-random-number.tar.gz', false) the function returns false. When I omit the first part (i.e. \PharData::isValidPharFilename('some-random-number.tar.gz', false)) it returns true.
I can't use different filenames as they are provided by a third-party service, and I don't want to rename them on the fly either (tedious).
Any ideas how to solve this?
I believe the extension needs to be phar, tar or zip. I just answered a similar question here where I provided a bit more detail.
I'm trying to group a bunch of files together based on RecipeID and StepID. Instead of storing all of the filenames in a table I've decided to just use glob to get the images for the requested recipe. I feel like this will be more efficient and less data handling. Keeping in mind the directory will eventually contain many thousands of images. If I'm wrong about this then the below question is not necessary lol
So let's say I have RecipeID #5 (nachos, mmmm) and it has 3 preparation steps. The naming convention I've decided on would be as such:
5_1_getchips.jpg
5_2_laycheese.jpg
5_2_laytomatos.jpg
5_2_laysalsa.jpg
5_3_bake.jpg
5_finishednachos.jpg
5_morefinishedproduct.jpg
The files may be generated by a camera, so DSC###.jpg...or the person may have actually named each picture as I have above. Multiple images can exist per step. I'm not sure how I'll handle dupe filenames, but I feel that's out of scope.
I want to get all of the "5_" images...but filter them by all the ones that DON'T have any step # (grouped in one DIV), and then get all the ones that DO have a step (grouped in their respective DIVs).
I'm thinking of something like
foreach ( glob( $IMAGES_RECIPE . $RecipeID . "_*.*") as $image)
and then using substr to filter out the step #, but I'm concerned about getting the logic right: what if the original filename already has _#_ in it for some reason? Maybe I need a strict naming convention that always includes _0_ if the image doesn't belong to a step.
Thoughts?
Globbing through 1000s of files will never be faster than having those files indexed in a database (of whatever type) and executing a database query for them. That's what databases are meant for.
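Whichever storage is chosen, the step grouping described in the question could be sketched as below. The function name groupByStep and the 'general' key are made up for illustration, and the sketch does nothing about the performance point above. It also does not remove the ambiguity the asker worries about: a camera file that happens to be named like 5_2012_trip.jpg would still be read as step 2012.

```php
<?php
// Hypothetical helper: groups names of the form "<recipeId>_<step>_<rest>"
// by step number; names of the form "<recipeId>_<rest>" without a numeric
// step are collected under the key 'general'.
function groupByStep(array $fileNames, int $recipeId): array
{
    $groups = [];
    foreach ($fileNames as $name) {
        if (preg_match('/^' . $recipeId . '_(\d+)_/', $name, $m)) {
            // Name carries a step number directly after the recipe ID
            $groups[(int) $m[1]][] = $name;
        } elseif (strpos($name, $recipeId . '_') === 0) {
            // Belongs to the recipe but to no particular step
            $groups['general'][] = $name;
        }
    }
    return $groups;
}
```

In a real script the names would probably come from glob() passed through basename(); each resulting group could then be rendered into its own DIV.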
I had a similar issue with 15,000 mp3 songs.
On the Windows command line I used dir to list them:
dir *.mp3 /b /s > mp3.bat
Then I used a regex search and replace in Notepad++ to prefix and append text to each file name, turning every line into a rename statement, and ran mp3.bat.
Something like this might work for you in PHP:
Use preg_replace() with a regex to extract the digits
Create one or more logic tables that map the digits to the words for the new file names
Apply the new filename with rename()
Here is some simplified and UNTESTED Example code to show what I am suggesting.
Example Logic Table:
$translation[x][y][z] = "phrase";
$translation[x][y][z] = "phrase";
$translation[x][y][z] = "phrase";
$translation[x][y][z] = "phrase";
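$folder = '/home/user/public_html/recipies/';
$dir = opendir($folder);
$files = [];
while (false !== ($found = readdir($dir))) {
    // PATHINFO_EXTENSION returns the extension without the leading dot
    if (pathinfo($found, PATHINFO_EXTENSION) == 'jpg') {
        $files[] = $found;
    }
}
closedir($dir);
foreach ($files as $filename) {
    // Pull each of the three digits out of a camera name such as DSC123.jpg
    $digit1 = preg_replace('/DSC(\d)\d\d\.jpg/', '$1', $filename);
    $digit2 = preg_replace('/DSC\d(\d)\d\.jpg/', '$1', $filename);
    $digit3 = preg_replace('/DSC\d\d(\d)\.jpg/', '$1', $filename);
    // Look up the phrase for these digits and keep the .jpg extension
    $newName = $translation[$digit1][$digit2][$digit3] . '.jpg';
    rename($folder . $filename, $folder . $newName);
}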
Possible Duplicate:
How to extract a file extension in PHP?
Get the file extension (basename?)
Trying to learn from other people's code, I see a lot of methods to strip a filename of its extension, but most of them seem too localized, as they assume a certain condition. For example:
This will assume only 3-character extension (like .txt, .jpg, .pdf)
substr($fileName, 0, -4);
or
substr($fileName, 0, strrpos($fileName, '.'));
But this can cause problems with longer extensions like .jpeg, .tiff or .html, or with extensions of only 2 characters like .js or .pl
(browsing this list shows some file names can have only 1 character, and some as many as 10 (!) )
some other methods i have seen rely on the point (.)
for example :
return key(explode(".", $filename));
Can cause problems with filenames like 20121029.my.file.name.txt.jpg
same here :
return preg_replace('/\.[^.]*$/', '', $filename);
some people use the pathinfo($file) and / or basename() (is it ALWAYS safe ?? )
basename($filename);
and many many other methods ..
so my question has several parts :
What is the best way to "strip" a file extension (including the point)?
What is the best way to "get" the file extension (without the point) and/or check it?
Will PHP's own functions (basename) recognize ALL extensions, regardless of how exotic they might be or how the filename is constructed?
What influence, if any, does the OS have on the matter (Windows, Linux, Unix...)?
all those small sub-questions , which i would like to have an answer to can be summed-up in an overall single question :
Is there a bullet-proof , overall, always-work, fail-proof , best-practice , über_function that will work under all and any condition ??
EDIT I - another file extension list
Quoting from the duplicate question's top answer:
$ext = pathinfo($filename, PATHINFO_EXTENSION);
This is the best available way to go. It's built into PHP, and the best you can do. I know of no cases where it doesn't work.
One exception would be a file extension that itself contains a dot (.). But no sane person would introduce a file extension like that, because it would break everywhere and it would break the implicit convention.
for example in a file 20121021.my.file.name.txt.tar.gz - tar.gz would be the extension..
Nope, it's much simpler - and maybe that is the root of your worries. The extension of 20121021.my.file.name.txt.tar.gz is .gz. It is a gzipped .gz file for all intents and purposes. Only when you unzip it, it becomes a .tar file. Until then, the .tar in the file name is meaningless and serves only as information for the gunzip tool. There is no file extension named .tar.gz.
That said, detecting the file extension will not help you determine whether a file is actually of the type it claims. But I'm sure you know that, just putting this here for future readers.
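As a small illustration of the points above, here is how pathinfo() splits a few names. The wrapper splitExtension is a made-up name used only for this sketch; it always cuts at the last dot, so it reports gz rather than tar.gz.

```php
<?php
// Hypothetical helper: splits a file name at its last dot.
function splitExtension(string $fileName): array
{
    $info = pathinfo($fileName);
    return [
        // Everything before the last dot (directory part removed)
        'name'      => $info['filename'],
        // pathinfo() omits the 'extension' key when the name has no dot
        'extension' => $info['extension'] ?? '',
    ];
}

$parts = splitExtension('20121021.my.file.name.txt.tar.gz');
// $parts['name']      is '20121021.my.file.name.txt.tar'
// $parts['extension'] is 'gz'
```

The ?? '' fallback is a design choice for this sketch, so that callers can compare the extension without first checking whether the key exists.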
I'd like to be able to select a file by just giving its name (without extension). For example, I might have a variable $id holding 12. I want to be able to select a file named after that ID, say 12.png, from a directory, but it may have any one of a number of file extensions, listed below:
.swf
.png
.gif
.jpg
There is only one occurrence of each ID. I could use a loop and file_exists(), but is there a better way?
Thanks,
James
$matches = glob("12.*");
would return an array with all the matching filenames in the current directory. glob() works much the same as wildcard matching at the shell prompt.
Take a look at glob. Unfortunately, the exact semantics of the $pattern parameter is not described in the manual. But it seems your problem can be solved using this function.
Quick question to OP here:
What is the file extension of this file: somefile.tar.gz? Is it .gz or .tar.gz? :) I ask because most would answer this question as .tar.gz...
http://php.net/glob
The documentation page on glob() has this example:
<?php
foreach (glob("*.txt") as $filename) {
echo "$filename size " . filesize($filename) . "\n";
}
?>
But to be honest, I don't understand how this can work.
The array produced by glob("*.txt") will be traversed, but where does this array come from? Is glob() reading a directory? I don't see that anywhere in the code. Glob() looks for all matches to *.txt
But where do you set where the glob() function should look for these strings?
Without any directory specified glob() would act on the current working directory (often the same directory as the script, but not always).
To make it more useful, use a full path such as glob("/var/log/*.log"). Admittedly the PHP documentation doesn't make the behaviour clear, but glob() is a C library function, which is where it originates from.
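To make the point about the working directory concrete, here is a minimal sketch in which the directory is always part of the pattern, so the result does not depend on getcwd(). The helper name listMatching is an assumption for illustration only.

```php
<?php
// Hypothetical helper: lists the file names in $dir that match $pattern,
// independent of the current working directory.
function listMatching(string $dir, string $pattern): array
{
    $paths = glob(rtrim($dir, '/') . '/' . $pattern);
    if ($paths === false) { // glob() can return false on error
        return [];
    }
    // Each result carries the directory prefix; basename() strips it
    return array_map('basename', $paths);
}
```

Called as listMatching('/var/log', '*.log'), it looks only in /var/log; the equivalent glob('*.log') would instead look wherever the script's current working directory happens to be.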
Something useful I've discovered with glob(), if you want to traverse a directory, for example, for images, but want to match more than one file extension, examine this code.
$images = glob($imagesDir . '*' . '.{jpg,jpeg,png,gif}', GLOB_BRACE);
The GLOB_BRACE flag makes the braces sort of work like the (a|b) regex.
One small caveat is that you need to list them out, so you can't use regex syntax such as jpe?g to match jpg or jpeg.
Yes, glob reads the directory. Therefore, if you are looking to match files in a specific directory, then the argument you supply to glob() should be specific enough to point out the directory (ie "/my/dir/*.png"). Otherwise, I believe that it will search for files in the 'current' directory.
Note that on some systems filenames can be case-sensitive so "*.png" may not find files ending in ".PNG".
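One way around the case-sensitivity issue is to glob for everything and compare the extension in lower case. This is a sketch under assumptions: the helper name imagesIn is invented here, and the extension list is expected to be given in lower case. It avoids GLOB_BRACE, which is not available on every platform, at the cost of reading every entry in the directory.

```php
<?php
// Hypothetical helper: returns the files in $dir whose extension is in
// $extensions (lower case), compared case-insensitively.
function imagesIn(string $dir, array $extensions): array
{
    $found = [];
    // "?: []" turns a false return from glob() into an empty list
    foreach (glob(rtrim($dir, '/') . '/*') ?: [] as $path) {
        $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (is_file($path) && in_array($ext, $extensions, true)) {
            $found[] = $path;
        }
    }
    return $found;
}
```

With imagesIn($imagesDir, ['jpg', 'jpeg', 'png', 'gif']), a file named photo.PNG would be picked up as well as photo.png.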
A general overview of its purpose can be found here. Its functionality in PHP is based on that of the libc glob function whose rationale can be read at http://web.archive.org/web/20071219090708/http://www.isc.org/sources/devel/func/glob.txt .