Filenames with spaces - php

I am running PHP locally (no webserver involved at all). I am having trouble accessing a file with spaces in the path.
My bare bones case is
$indexFile = "file:///Users/username/Documents/My Folder/test.txt";
echo file_exists($indexFile);
or:
$indexFile = "file:///Users/username/Documents/My%20Folder/test.txt";
echo file_exists($indexFile);
AFAICT this latter case is a well-formed file scheme URI. It's exactly what appears in the browser location field if I drag the file in there.
URIs without spaces don't have any problem. Unfortunately I am not at liberty to change "My Folder" to "MyFolder", and besides, I want to find a solution.
I tried urlencode, rawurlencode, and escapeshellarg, and I tried replacing %20 with a backslash-escaped space ("My\ Folder"), but none of this works. I've also hunted through Google and Stack Overflow, but while there are many suggestions, the question remains unanswered:
How to access an arbitrary file (on the local host) using a path which contains spaces?

This looks like a Mac OS X file system, so a plain path without the file:// scheme should work:
$indexFile = "/Users/username/Documents/My Folder/test.txt";
echo file_exists($indexFile);

Remove the leading file:// from the first code snippet and it will work:
$indexFile = "file:///Users/username/Documents/My Folder/test.txt";
echo file_exists(str_replace("file://", "", $indexFile));
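Building on the two answers above, here is a minimal sketch of a helper that turns a file:// URI into a plain path before handing it to file_exists. The name fileUriToPath is made up for illustration, and the sketch assumes an empty-host URI of the form file:///... (it does not handle file://localhost/...). It also percent-decodes the path, so it should only be applied to strings that really are URIs.

```php
<?php
// Hypothetical helper (not part of PHP): convert a file:// URI to a local path.
function fileUriToPath(string $uri): string
{
    $prefix = 'file://';
    // Strip the scheme only when it is really at the start of the string.
    if (strncmp($uri, $prefix, strlen($prefix)) === 0) {
        $uri = substr($uri, strlen($prefix));
    }
    // Turn %20 and other percent-escapes back into literal characters.
    return rawurldecode($uri);
}

$path = fileUriToPath('file:///Users/username/Documents/My%20Folder/test.txt');
echo $path, PHP_EOL; // /Users/username/Documents/My Folder/test.txt
var_dump(file_exists($path));
```

Unlike the str_replace version, this only touches a prefix at the very start of the string, so a path that happens to contain "file://" further in is left alone.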

Related

Argument error when running mkvextract with PHP

No matter how hard I try, mkvextract doesn't work properly. I'm aware that there is a problem with the file path, but after hundreds of attempts I still have not succeeded. How can I run this correctly?
shell_exec("mkvextract tracks /home/movies/R-12/X-1 ÇĞŞZ.mkv");
or
$filename = "/home/movies/R-12/X-1 ÇĞŞZ.mkv";
echo shell_exec("mkvextract tracks \"$filename\"");
I am aware that the file path cannot be accessed because of the special characters.
There may be several issues:
A file read permission issue: the file exists, but PHP (and the mkvextract it runs) doesn't have permission to open it. In the rest of my answer I assume this is not happening, because you haven't added any error message containing the word permission or access to your question.
A shell argument escaping issue: correctly passing a command argument containing whitespace and/or shell metacharacters (e.g. ", \, $). I address this with escapeshellarg below.
A filename encoding issue: correctly specifying non-ASCII characters in filenames. I address this with mb_convert_encoding below.
For testing purposes, make a copy of the input file to /home/movies/t.mkv, and then try echo shell_exec("mkvextract tracks /home/movies/t.mkv").
If that works, then rename the copy to /home/movies/t t.mkv, and then try echo shell_exec("mkvextract tracks " . escapeshellarg("/home/movies/t t.mkv")). Without the escapeshellarg call, it wouldn't work, because the filename contains a space.
If that works, then the problem is with non-ASCII characters in the filename. To investigate it further, examine the output of var_dump(scandir("/home/movies/R-12")), and see how the letters with accents appear there. Pass it the same way to shell_exec. Don't forget about escapeshellarg.
If that works, use encoding conversion (with mb_convert_encoding) for the remaining filenames. You may want to ask a separate question about that, specifying the output of var_dump(scandir("/home/movies/R-12")) and var_dump("X-1 ÇĞŞZ.mkv") in your question.
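The escaping step above can be sketched roughly as follows. The helper name buildMkvextractCommand is hypothetical, and the setlocale call is an assumption about the system: the locale name en_US.UTF-8 may not be installed on your machine. It is there because escapeshellarg may drop non-ASCII bytes when LC_CTYPE is not a UTF-8 locale.

```php
<?php
// Hypothetical helper: build the mkvextract command line for one input file.
function buildMkvextractCommand(string $path): string
{
    // escapeshellarg() quotes the whole path so the shell sees one argument.
    return 'mkvextract tracks ' . escapeshellarg($path);
}

// Assumption: a UTF-8 locale with this name exists on the system.
// With a non-UTF-8 LC_CTYPE, escapeshellarg() may drop non-ASCII bytes.
setlocale(LC_CTYPE, 'en_US.UTF-8');

// Print the command first to inspect it, then pass it to shell_exec().
echo buildMkvextractCommand('/home/movies/t t.mkv'), PHP_EOL;
```

Printing the command before running it makes it easy to see whether the accented letters survived the escaping.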
$filename = "/home/movies/R-12/X-1 ÇĞŞZ.mkv";
echo shell_exec("sudo mkvextract tracks \"$filename\"");
I guess the whole problem was that I had not added sudo :)

PHP preg_match on own computer doesn't work

I have this code:
$success = preg_match('/(.+(駅前)?駅) (\(([^線]+線)\) )?((([^線 ]+) )?(\d+[分時])?)/u', $m, $matches);
Example input text is
大正駅 (JR大阪環状線) バス 20分
This regex works on https://regex101.com/ and the code works on http://sandbox.onlinephpfunctions.com/. However, when I run the PHP code on my own computer, it never gives me a match. $matches is an empty array, and $success is 0. Yes, the exact same code. I have verified that the regex is correct (using first link) and that the code itself works (using second link). However, it still refuses to work on my own PC.
OS is Arch Linux, running PHP 7.3.11, system locale is ja_JP.UTF-8 (which I don't think matters, but just in case)
Does anyone see anything wrong with the code?
So I was able to find the problem.
First, I tried just the one-liner commented by Nick (3v4l.org/o4ADM) on my PC, and it works. (Of course it should. PHP can't be broken.)
So I figured that it must be the data I'm feeding preg_match that is broken.
Normal prints and echos were in vain--$m is always how it should be. Then I considered AD7six's comment,
Check that the bytes for 駅 etc. are actually the same
so I looked carefully to check that the characters are all Japanese and no Chinese variants are there. And it's all Japanese, it's fine.
So what could it be?
I used PHP's file_put_contents to dump the variable to a file, then typed the same text manually with my Japanese keyboard and saved it to another file. I opened Meld (a diff tool), compared the two texts, and voilà: the spaces in the text use a different code point than the usual half-width space (0x20). They are 0xA0 instead, which is apparently a "no-break space". What the heck.
Fortunately, a simple $m = str_replace("\u{00A0}", " ", $m) did the trick.
Thanks to everyone for leading me to the right answer!
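For anyone hitting the same thing, a quicker check than a diff tool is to dump the bytes with bin2hex. This is only a small sketch with a made-up two-letter sample string; normalizeSpaces is a hypothetical helper name wrapping the same str_replace as above.

```php
<?php
// Hypothetical helper: replace no-break spaces (U+00A0) with ordinary spaces.
function normalizeSpaces(string $text): string
{
    return str_replace("\u{00A0}", ' ', $text);
}

$m = "a\u{00A0}b";                          // looks like "a b" when printed
echo bin2hex($m), PHP_EOL;                  // 61c2a062 (c2a0 is U+00A0 in UTF-8)
echo bin2hex(normalizeSpaces($m)), PHP_EOL; // 612062   (20 is a plain space)
```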

Opening an encoded file with PHP

I am opening a file on the server with PHP. The file seems ordinary. It opens in Notepad and Textedit on a PC. Even PHP can display it without any issue in a web browser when we echo out.
But when I try searching it with strpos(), it can’t find anything except single characters. If I search for a string with 2 or more characters, it doesn’t find anything.
I have tried encoding it to UTF-8, and it is detected as ASCII, so everything seems right there.
I have also isolated the part of the file that I am trying to read down to only 250 characters. They all look fine on the screen.
But strpos can’t find it. I’ve run tests on every part of my code and I believe everything is fine with my code. The problem I believe derives from that the characters I see on the screen are not exactly matching what those characters really are.
My last resort is to write a function which converts each character into an integer array (if that’s even possible), and then convert all that back to a string. This way, we’ll know 100% that the characters we see are real.
Hoping that somebody has a better approach or perhaps an idea for something I missed?
I'll post the code below:
$content = file_get_contents($file->getPathname()); // get the file contents
$content = substr($content, 30, 300); // reduce the large file to just the first few lines
$content = htmlspecialchars($content); // try to remove any special characters from the file
$content = iconv('ASCII', 'UTF-8//IGNORE', $content); // encode to a friendly format
$string = "JobName"; // this is the string i'm searching for
if (strpos($content, $string) !== false) {
echo "bingo";
}
else {
echo " not found ";
}
Just to be clear, the file I'm opening is generated by a PC program that stores its data in .DAT format. Like I said, I can see and read the content very easily using any program, including PHP, but when I try to search, it's as if it doesn't recognize the content at all.
I am not aware of how to upload a file on StackOverflow, but if someone can tell me how to do it then I will gladly post the file itself.
Thank you very much for your help, ARKASCHA. I was able to find an online hex editor, and when I looked at the bytes, it turned out there is a NUL character between every pair of visible characters in this file. That's probably why I couldn't see it in a regular view. I just had to run an additional function to remove the NUL characters from the file, and then it works as it's supposed to. Thanks again.
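A NUL byte after every character is what ASCII text looks like when it is stored as UTF-16, so one guess (not confirmed for this particular .DAT file) is that the program writes UTF-16LE. The sketch below uses a hand-built sample string rather than the real file and shows both the NUL-stripping fix and an encoding conversion with iconv.

```php
<?php
// "JobName" as it would be stored in UTF-16LE: a NUL byte after each letter.
$raw = "J\0o\0b\0N\0a\0m\0e\0";

var_dump(strpos($raw, 'JobName'));     // bool(false): the NUL bytes get in the way
echo bin2hex($raw), PHP_EOL;           // 4a006f0062004e0061006d006500

// Option 1 (what the asker did): drop the NUL bytes.
$stripped = str_replace("\0", '', $raw);

// Option 2 (assumes the data really is UTF-16LE): convert the encoding.
$converted = iconv('UTF-16LE', 'UTF-8', $raw);

var_dump(strpos($stripped, 'JobName'));  // int(0)
var_dump(strpos($converted, 'JobName')); // int(0)
```

Stripping NUL bytes is enough for plain ASCII content. If the file can contain non-ASCII characters, converting the encoding is the safer of the two options.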

Load file name with spaces from FTP

I'm trying to load a xml file from an external ftp server. Sadly the filename contains a space between two words.
Homepage-Filename Statistics-170210.xml
I'm able to load the file with simplexml_load_file if there is an underscore or a dash instead of the space. For example:
Homepage-Filename_Statistics-170210.xml
simplexml_load_file('ftp://username:password@ftp.domain.com/Homepage_Filename_Statistics-170210.xml');
But I'm not able to change the file name, so I have to load the file with spacing inside.
I have tried to replace the space with %20 or backslash / , but it isn't working either. For example:
Homepage_Filename%20Statistics-170210.xml
or
Homepage_Filename/ Statistics-170210.xml
Does anyone have an idea how to load a file like this?
Thanks!
I can also reproduce this problem with the simplexml_load_file.
Interestingly the file_get_contents works:
$url = 'ftp://username:password@ftp.domain.com/Homepage_Filename/ Statistics-170210.xml';
simplexml_load_string(file_get_contents($url));
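Since file_get_contents returns false on failure, it may be worth checking the result before parsing. Below is a rough sketch of that workaround with error checks; loadXmlViaFileGetContents is a hypothetical name, and the URL in the usage comment is a placeholder.

```php
<?php
// Hypothetical helper: fetch the raw XML first, then parse it from a string.
function loadXmlViaFileGetContents(string $url): ?SimpleXMLElement
{
    // @ hides the warning; the false return value is checked instead.
    $xml = @file_get_contents($url);
    if ($xml === false) {
        return null; // download (or local read) failed
    }
    $doc = simplexml_load_string($xml);
    return $doc === false ? null : $doc; // null when the XML is malformed
}

// Usage (placeholder credentials and host):
// $doc = loadXmlViaFileGetContents('ftp://username:password@ftp.domain.com/Homepage-Filename Statistics-170210.xml');
// if ($doc === null) { /* handle the error */ }
```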

PHP: imagecreatefromjpeg($url) doesn't work if $url contains spaces?

I use a script to get an image from another server and store it in the db, the problem is that when the url has a space in it, the function grabs nothing.
I tried to encode the url and to simply replace all spaces with %20 but with no success.
I'm running out of options. If any of you could give me some ideas, it would be great!
Thanks!
$thumb=imagecreatefromjpeg('http://www.dummysite.ca/imageone.jpg'); //->WORKS
$thumb=imagecreatefromjpeg('http://www.dummysite.ca/image one.jpg'); //->DOESN'T WORK
EDIT: more info: I'm running a CentOS machine, php 5.2.17
EDIT: found the answer. Replacing spaces with %20 actually WORKS, but I was foolish and only replaced them before the imagecreatefromjpeg call; it turns out getimagesize needs the encoded URL as well.
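In other words, encode the URL once and reuse the encoded value everywhere. A minimal sketch, where encodeSpaces is a hypothetical helper name and the remote calls are left commented out because they need the GD extension and a reachable server:

```php
<?php
// Hypothetical helper: percent-encode literal spaces in an otherwise valid URL.
function encodeSpaces(string $url): string
{
    return str_replace(' ', '%20', $url);
}

$url = encodeSpaces('http://www.dummysite.ca/image one.jpg');
echo $url, PHP_EOL; // http://www.dummysite.ca/image%20one.jpg

// Use the same encoded URL for every remote call:
// $size  = getimagesize($url);
// $thumb = imagecreatefromjpeg($url);
```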
I would do everything in my power to keep spaces out of filenames. At whatever point the file enters your server, it should be renamed to something with underscores. Personally, for file uploads I rename every file to a combination of the timestamp and the uploader's IP address. Grabbing from another server could use the same logic. If you need to keep the original filename, just save it as a text string associated with the DB entry.
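As a rough illustration of that renaming scheme, something like the following could work. The function name storageName, the separator characters, and the decision to keep the lower-cased extension are all my own assumptions, not part of the answer above.

```php
<?php
// Hypothetical helper: derive a space-free storage name from timestamp + IP.
function storageName(string $originalName, string $uploaderIp, int $timestamp): string
{
    // Keep only a lower-cased alphanumeric extension from the original name.
    $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    $ext = preg_replace('/[^a-z0-9]/', '', $ext);
    // Replace dots (IPv4) and colons (IPv6) so the IP is filename-friendly.
    $ip = preg_replace('/[^0-9a-fA-F]/', '-', $uploaderIp);
    return $timestamp . '_' . $ip . ($ext !== '' ? '.' . $ext : '');
}

echo storageName('image one.JPG', '203.0.113.7', 1486700000), PHP_EOL;
// 1486700000_203-0-113-7.jpg
```

Two uploads from the same address within the same second would collide under this scheme, so a real implementation would probably add a random or sequential suffix.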
