CakePHP response cannot read UTF-8 file name - php

I want to let users download a file only after a login check, so I wrote this function in my controller:
// Function to check login and download News PDF file
public function download(){
if($this->Auth->user()){
// Get the news file path from newsId
$pNewsObj = ClassRegistry::init('PublicNews');
$news = $pNewsObj->findById($newsId);
$filePath = ROOT.DS.APP_DIR.DS.'webroot/upload_news'.DS.$news['PublicNews']['reference'];
// Check if file exists
if(!file_exists($filePath)){
return $this->redirect('/404/index.php');
}
$this->response->charset('UTF-8');
//$this->response->type('pdf');
$this->response->file('webroot/upload_news'.DS.$news['PublicNews']['reference'], array('download' => true, 'name' => $news['PublicNews']['reference']));
//$this->response->download($news['PublicNews']['reference']);
return $this->response;
}else{
return $this->redirect(array('controller'=> 'users', 'action' => 'login'));
}
}
This works as required for English file names.
PROBLEM: when the file name contains UTF-8 characters, e.g. テスト.pdf ("Test.pdf" in Japanese), CakePHP throws an error.
My client wants the downloaded file name to be the same as the uploaded one, so I can't change the file name to English.

If you need to find out the character encoding, you can use mb_detect_encoding(), provided the input text is long enough for detection to be reliable.
My guess is that your client uploads files with SJIS names. SJIS is still very common in Japan because Windows adopted it for the Japanese language.
I tested your code in my local environment. Cake's File class does not seem to handle SJIS correctly, so you cannot use Response::file(). Here is alternative code:
public function download($newsId = null){
if($this->Auth->user()){
// Get the news file path from newsId
$pNewsObj = ClassRegistry::init('PublicNews');
$news = $pNewsObj->findById($newsId);
if (!$news) {
throw new NotFoundException();
}
$fileName = mb_convert_encoding($news['PublicNews']['reference'], 'SJIS-win', 'UTF-8');
// Directory traversal protection
if (strpos($fileName, '..') !== false) {
throw new ForbiddenException();
}
$filePath = WWW_ROOT . 'upload_news' . DS . $fileName;
if (!is_readable($filePath)) {
throw new NotFoundException();
}
if (function_exists('mime_content_type')) {
$type = mime_content_type($filePath);
$this->response->type( $type );
} else {
// TODO: If Finfo extension is not loaded, you need to detect content type here;
}
$this->response->download( $fileName );
$this->response->body( file_get_contents($filePath) );
return $this->response;
}else{
return $this->redirect(array('controller'=> 'users', 'action' => 'login'));
}
}
However, I recommend converting SJIS to UTF-8 before saving anything to your database or your disk. SJIS is difficult to handle without detailed knowledge of it, because the second byte of an SJIS character can be an ASCII character. The backslash (\) is the most dangerous one. For example, 表 (955C) contains a backslash (5C = backslash). These are not rare cases: 表 means "table" or "appearance" in Japanese, 十 means "10", and 能 means "skill", and all three contain a backslash.
Unlike UTF-8 byte sequences, SJIS byte sequences are not safe to pass to byte-oriented string functions: explode() can split a character in the middle, and strpos() can return a wrong result.
Does your client connect to your server directly via FTP or SCP? If not, it would be better to convert SJIS to UTF-8 before saving, and to convert UTF-8 back to SJIS before returning the file to your client.
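A minimal sketch of the backslash problem described above, assuming the mbstring extension is loaded and the script itself is saved as UTF-8. It converts 表 to SJIS and compares a byte-oriented search with an encoding-aware one; the variable names are illustrative only.

```php
<?php
// Convert the UTF-8 character 表 to Windows-style Shift_JIS (SJIS-win).
$sjis = mb_convert_encoding('表', 'SJIS-win', 'UTF-8');

// Hex dump of the SJIS bytes: 0x95 followed by 0x5C (the ASCII backslash).
$hex = bin2hex($sjis);

// Byte-oriented search: finds a "backslash" inside the two-byte character.
$bytePos = strpos($sjis, '\\');

// Encoding-aware search: treats both bytes as one character, so no match.
$charPos = mb_strpos($sjis, '\\', 0, 'SJIS-win');

var_dump($hex, $bytePos, $charPos);
```

The same string stored as UTF-8 has no byte below 0x80, which is why byte-oriented functions stay safe once the name has been converted.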

If you like, you can change the file name before the file is uploaded, so that this error cannot happen at download time.
public function change_file_name($fileName= '') {
$ext = pathinfo($fileName, PATHINFO_EXTENSION);
$fileName = 'file_'.time().".".$ext;
$exFileName = strtolower(substr($fileName,strrpos($fileName,".") + 1));
$sampleFileName = str_replace('.'.$exFileName,'', $fileName);
$name = Sanitize::paranoid($sampleFileName,array('_'));
$fileRename = $name.'.'.$exFileName;
return $fileRename;
}
Call this function before uploading the file:
$return_file_name = $this->change_file_name($file_name);
if($this->moveUploadedFile($tmp_name,WEBSITE_PROFILE_ROOT_PATH.$return_file_name)){
$saveData['profile_image'] = $return_file_name;
}
I know this is not a proper answer for your case. For files that are already stored, you could write a similar function that fetches the records from the database, renames every saved file, and updates the database accordingly.

Some more information about your client's specifications would help greatly, but Tom Scott found base64 to be the simplest method of making Unicode characters work correctly in PHP.
Depending on how crucial the preservation of filenames in storage is, a solution could be to encode the filenames in base64 when files are uploaded, and reverse the encoding on download. You can then know that you are dealing with ASCII, which should be much more likely to work correctly.
Standard base64 output can contain / characters, which are not allowed in file names, so you may need to replace them (for example with %2F) to make it work.
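The idea above could be sketched as follows, using the URL-safe base64 alphabet instead of %2F so that the stored name never contains a slash. This is an illustration under assumptions, not a tested part of the original answer: encode_filename() and decode_filename() are hypothetical helper names, and the nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helpers (names are illustrative, not from any library).

// Encode an arbitrary file name into ASCII that is safe inside a path:
// standard base64, then "+" and "/" swapped for "-" and "_", padding dropped.
function encode_filename(string $name): string
{
    return rtrim(strtr(base64_encode($name), '+/', '-_'), '=');
}

// Reverse encode_filename(); returns null if the input is not valid.
function decode_filename(string $encoded): ?string
{
    $b64 = strtr($encoded, '-_', '+/');
    // Restore the "=" padding that encode_filename() removed.
    $b64 .= str_repeat('=', (4 - strlen($b64) % 4) % 4);
    $name = base64_decode($b64, true);
    return $name === false ? null : $name;
}

$stored = encode_filename('テスト.pdf');   // ASCII-only name for the disk
$original = decode_filename($stored);       // original name for the download
```

Keep in mind that base64 makes names about a third longer, so very long original names can exceed the file system's name length limit.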
Hope this helps,
Issa Chanzi

Related

Check if string is valid filename

I have a PHP script (PHP 5) that saves the URL parameter $_GET['file'] to the variable $file. Is there a way to check that this variable is a valid file name (for example hello.txt and not /../otherdir/secret.txt)? Without checking $file, an attacker could use /../ to reach my parent folder.
Have a look at PHP's basename() function; it returns the file name component of a path, as in this example:
$file = '../../abc.txt';
echo basename($file); //output: abc.txt
Note: basename() extracts the file name from the path string regardless of whether the file physically exists. Use file_exists() to verify that it does.
POSIX "fully portable filenames" use only these characters: A-Z a-z 0-9 . _ -
The regex below validates a name against a slightly wider set. Its character class contains:
\/ - forward slash (only if you need to validate a path rather than a filename)
\w - equivalent to [0-9a-zA-Z_]
\-, . and a space - dash, dot, and space
$filename = '../../test.jpg';
if (preg_match('/^[\/\w\-. ]+$/', $filename))
echo 'VALID FILENAME';
else
echo 'INVALID FILENAME';
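If you want to ensure it has no path (no forward slash), change the regex to '/^[\w\-. ]+$/'. Note that the first pattern accepts '../../test.jpg', because every character in it is allowed, so it does not protect against directory traversal on its own.
Instead of checking for valid characters, why not look for the characters you don't want? File names are also limited to 255 bytes on most file systems: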
function valid_filename(string $filename)
{
if (strlen($filename) > 255) { // no mb_* since we check bytes
return false;
}
$invalidCharacters = '|\'\\?*&<";:>+[]=/';
if (false !== strpbrk($filename, $invalidCharacters)) {
return false;
}
return true;
}
valid_filename('hello'); // true
valid_filename('hello.php'); // true
valid_filename('foo:bar.php'); // false
valid_filename('foo/bar'); // false
Adapt $invalidCharacters according to your needs/OS.
Source: https://www.cyberciti.biz/faq/linuxunix-rules-for-naming-file-and-directory-names/
Would that work?
http://php.net/manual/en/function.file-exists.php
E.g, in your case,
if(file_exists(str_replace("../", "", $file))){
// valid
}
else{
// invalid
}
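Files can be in subfolders but not in parent folders. Be aware that a single str_replace() pass can be bypassed: '....//' turns into '../' after the replacement.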
However, if you're just interested in the filename,
if(file_exists(pathinfo($file, PATHINFO_BASENAME))){
// valid
}
else{
// invalid
}
should work.
I would like to combine kamal pal's and Pancake_M0nster's answers into something simple:
if(file_exists(basename($file))){
// valid
}
else{
// invalid
}
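The checks above test the name only. A complementary approach, sketched here under assumptions, is to resolve the full path with realpath() and confirm it still lies inside the directory you intend to serve from. resolve_inside() is a hypothetical helper name, and the nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the real path of $file if it is a regular
// file inside $baseDir (subfolders allowed), or null otherwise.
function resolve_inside(string $baseDir, string $file): ?string
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null; // base directory does not exist
    }
    // realpath() resolves "..", "." and symlinks; it fails for missing paths.
    $path = realpath($base . DIRECTORY_SEPARATOR . $file);
    if ($path === false || !is_file($path)) {
        return null;
    }
    // Accept only paths that still start with the base directory.
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strncmp($path, $prefix, strlen($prefix)) === 0 ? $path : null;
}
```

A call such as resolve_inside(__DIR__ . '/files', $_GET['file']) would then return null for both '../secret.txt' and names that do not exist, so one check covers validity and existence.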

base64 decode mixed results

I have the following PHP code, and it works fine most of the time. I send an image from a mobile device to this script, which decodes it and writes it to a file on the server. I am 99.9% sure the data is base64-encoded every time.
<?php
header('Access-Control-Allow-Origin: *');
header('Content-Type: image/jpeg');
$data = ($_POST['imageData']);
define('UPLOAD_DIR', 'images/');
$img = str_replace('data:image/jpeg;base64,', '', $data);
$data = base64_decode($img);
$file = UPLOAD_DIR . uniqid() . '.jpg';
file_put_contents($file, $data);
echo ('{"imgUrl" : "' . $file . '"}');
?>
The script then returns the image URL so it can be added to a database.
The problem is that most of the time the data decodes into a .jpg file, but at other times into a text file. I cannot see why, as it seems random. I have noticed that sometimes the data arrives in $_POST and at other times $_POST is NULL. So I looked at using:
$data = json_decode(file_get_contents('php://input'));
But again, the result seems inconsistent. So I added a fallback such as:
$data = ($_POST['imageData']);
if($data == NULL) {
$data = json_decode(file_get_contents('php://input'));
}
Is there any reason I should be aware of why the code sometimes works and sometimes does not?
I know this question is old, but it is one of the first results when searching for this topic, so here is a pointer to a proper answer for anyone who lands here.
See the question "PHP - get base64 img string decode and save as jpg (resulting empty image)".
Also check the condition you're using, because
if ($data === NULL)
can behave differently from
if ($data == NULL)
The loose comparison (==) is also true for an empty string, 0, false, or an empty array, whereas the strict one (===) is true only for NULL. Also, you're saving the base64 string incorrectly to an image file.
Check that link and let me know if it helped.
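One likely reason for the behaviour in the question is that PHP fills $_POST only for form-encoded or multipart requests, so a client that sometimes sends a JSON body leaves $_POST empty. The sketch below handles both cases and decodes strictly. It is an illustration under assumptions: extract_image_bytes() is a hypothetical helper name, the "imageData" key is taken from the question, and the nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: pull the image bytes out of either a form post
// ($post) or a raw JSON request body ($rawBody). Returns null on failure.
function extract_image_bytes(array $post, string $rawBody): ?string
{
    $payload = $post['imageData'] ?? null;
    if ($payload === null) {
        // A JSON body never reaches $_POST; read and decode it by hand.
        $json = json_decode($rawBody, true);
        $payload = is_array($json) ? ($json['imageData'] ?? null) : null;
    }
    if (!is_string($payload) || $payload === '') {
        return null;
    }
    // Drop an optional "data:image/...;base64," prefix of any image type.
    $b64 = preg_replace('#^data:image/[\w.+-]+;base64,#i', '', $payload);
    // Form decoding can turn "+" into spaces; undo that before decoding.
    $b64 = str_replace(' ', '+', $b64);
    $bytes = base64_decode($b64, true); // strict: false on invalid input
    return ($bytes === false || $bytes === '') ? null : $bytes;
}
```

In the upload script this would be called as extract_image_bytes($_POST, file_get_contents('php://input')), and the file would be written only when the result is not null.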

upload images through php using unique file names

I am currently writing a mobile app with the help of PhoneGap. One of the features I would like this app to have is the ability to capture an image and upload it to a remote server.
The image capture and upload/email portion already works in a compiled APK, but my PHP currently names the images "image[random number from 10 to 20]". The problem is that the numbers can repeat, so images can be overwritten. I thought about using rand() with a range from 0 to getrandmax(), but I feel there would still be a chance of overwriting a file. I need every image to be uploaded with a unique name, no matter what, so the PHP script would have to check what the server already has and write the image under a name that is not taken.
Any ideas other than rand()?
I was also thinking about naming each image img + date + time + 5 random characters (letters and numbers). An image taken with the app at 4:37 am on March 20, 2013 would then be named something like "img_03-20-13_4-37am_e4r29.jpg" on the server. I think that might work (unless there's a better way), but I am fairly new to PHP and don't know how to write something like that.
My PHP is as follows:
print_r($_FILES);
$new_image_name = "image".rand(10, 20).".jpg";
move_uploaded_file($_FILES["file"]["tmp_name"], "/home/virtual/domain.com/public_html/upload/".$new_image_name);
Any help is appreciated...
Thanks in advance!
Also, Please let me know if there is any further info I may be leaving out...
You may want to consider PHP's uniqid() function.
With it, the code you suggested would look like this:
$new_image_name = 'image_' . date('Y-m-d-H-i-s') . '_' . uniqid() . '.jpg';
// do some checks to make sure the file you have is an image and if you can trust it
move_uploaded_file($_FILES["file"]["tmp_name"], "/home/virtual/domain.com/public_html/upload/".$new_image_name);
Also keep in mind that your server's random functions are pseudo-random, not truly random. Try random.org if you need true randomness.
UPD: To use random.org from your code, you have to make API requests to their servers. The documentation is available here: www.random.org/clients/http/.
An example call: random.org/integers/?num=1&min=1&max=1000000000&col=1&base=10&format=plain&rnd=new. You can change min, max, and the other parameters as described in the documentation.
In PHP you can make a GET request to a remote server with file_get_contents(), the cURL library, or even sockets. On shared hosting, check that outgoing connections are available and enabled for your account.
$random_int = file_get_contents('http://www.random.org/integers/?num=1&min=1&max=1000000000&col=1&base=10&format=plain&rnd=new');
var_dump($random_int);
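If the goal is simply a name that is very unlikely to collide, a local alternative to a remote request is PHP's random_bytes(), which draws from the operating system's secure random source. A minimal sketch, assuming PHP 7 or later; unique_image_name() is a hypothetical helper name and the name layout is only one possible choice.

```php
<?php
// Hypothetical helper: timestamp for readability plus 16 hex characters
// taken from the operating system's secure random source.
function unique_image_name(string $extension = 'jpg'): string
{
    return sprintf(
        'img_%s_%s.%s',
        date('Ymd-His'),
        bin2hex(random_bytes(8)),
        $extension
    );
}

$new_image_name = unique_image_name();
```

The result can be passed to move_uploaded_file() exactly like $new_image_name in the code above.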
You should use tempnam() to generate a unique file name:
// $baseDirectory Defines where the uploaded file will go to
// $prefix The first part of your file name, e.g. "image"
$destinationFileName = tempnam($baseDirectory, $prefix);
Add the extension after moving the uploaded file, i.e.:
// Assuming $_FILES['file']['error'] == 0 (no errors)
if (move_uploaded_file($_FILES['file']['tmp_name'], $destinationFileName)) {
// use extension from uploaded file
$fileExtension = '.' . pathinfo($_FILES['file']['name'], PATHINFO_EXTENSION);
// or fix the extension yourself
// $fileExtension = ".jpg";
rename($destinationFileName, $destinationFileName . $fileExtension);
} else {
// tempnam() created a new file, but moving the uploaded file failed
unlink($destinationFileName); // remove temporary file
}
Have you considered using md5_file()?
It names each file after a hash of its contents, so different files get different names and you don't have to worry about duplicates. Note, however, that it returns the same string for files with identical contents.
Here is another method:
do {
$filename = DIR_UPLOAD_PATH . '/' . make_string(10) . '-' . make_string(10) . '-' . make_string(10) . '-' . make_string(10);
} while(is_file($filename));
return $filename;
/**
* Make random string
*
* @param integer $length
* @param string $allowed_chars
* @return string
*/
function make_string($length = 10, $allowed_chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890') {
$allowed_chars_len = strlen($allowed_chars);
if($allowed_chars_len == 1) {
return str_pad('', $length, $allowed_chars);
} else {
$result = '';
while(strlen($result) < $length) {
$result .= substr($allowed_chars, rand(0, $allowed_chars_len - 1), 1); // last valid index is length - 1
} // while
return $result;
} // if
} // make_string
The following function creates a unique name before the image is uploaded.
// Upload file with unique name
if ( ! function_exists('getUniqueFilename'))
{
function getUniqueFilename($file)
{
if(is_array($file) and $file['name'] != '')
{
// getting file extension
$fnarr = explode(".", $file['name']);
$file_extension = strtolower($fnarr[count($fnarr)-1]);
// getting unique file name
$file_name = substr(md5($file['name'].time()), 5, 15).".".$file_extension;
return $file_name;
} // ends for is_array check
else
{
return '';
} // else ends
} // ends
}

PHP/regex : Script to create filenames with dashes instead of spaces

I want to amend a PHP script I'm using in WordPress (the Auto Featured Image plugin).
The problem is that this script creates filenames for thumbnails based on the URLs of the image.
That sounds great until you get a filename with spaces and the thumbnail is something like this%20Thumbnail.jpg and when the browser goes to http://www.whatever.com/this%20Thumbnail.jpg it converts the %20 to a space and there is no filename on the server by that name (with spaces).
To fix this, I think I need to change the following line in such a way that $imageURL is filtered to convert %20 to spaces. Sound right?
Here is the code. Perhaps you can tell me if I'm barking up the wrong tree.
Thank you!
<?php
static function create_post_attachment_from_url($imageUrl = null)
{
if(is_null($imageUrl)) return null;
// get file name
$filename = substr($imageUrl, (strrpos($imageUrl, '/'))+1);
if (!(($uploads = wp_upload_dir(current_time('mysql')) ) && false === $uploads['error'])) {
return null;
}
// Generate unique file name
$filename = wp_unique_filename( $uploads['path'], $filename );
?>
Edited to a more appropriate and complete answer:
static function create_post_attachment_from_url($imageUrl = null)
{
if(is_null($imageUrl)) return null;
// get the original filename from the URL
$filename = substr($imageUrl, (strrpos($imageUrl, '/'))+1);
// this bit is not relevant to the question, but we'll leave it in
if (!(($uploads = wp_upload_dir(current_time('mysql')) ) && false === $uploads['error'])) {
return null;
}
// Sanitize the filename we extracted from the URL
// Replace any %-escaped character with a dash
$filename = preg_replace('/%[a-fA-F0-9]{2}/', '-', $filename);
// Let Wordpress further modify the filename if it may clash with
// an existing one in the same directory
$filename = wp_unique_filename( $uploads['path'], $filename );
// ...
}
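It is better to replace the spaces in the image name with underscores or hyphens using a regexp.
$string = "Google%20%20%20Search%20Amit%20Singhal";
$string = preg_replace('/(?:%20)+/', ' ', $string);
This regex replaces each run of one or more %20 sequences with a single space (' '); pass '-' or '_' as the replacement to get hyphens or underscores instead. PHP's preg functions have no /g modifier, because preg_replace() already replaces every match.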
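Putting the pieces from this thread together, a small sketch of deriving a dash-separated file name from an image URL. It decodes every percent-escape first and only then replaces whitespace, so %20 is handled like a real space. filename_from_url() is a hypothetical helper name; inside WordPress, the built-in sanitize_file_name() may already be sufficient.

```php
<?php
// Hypothetical helper: take the last path segment of a URL, decode its
// percent-escapes, and replace runs of whitespace with a single dash.
function filename_from_url(string $imageUrl): string
{
    // Path part only, without scheme, host, or query string.
    $path = (string) parse_url($imageUrl, PHP_URL_PATH);
    // Decode first so an encoded slash cannot survive basename().
    $name = basename(rawurldecode($path));
    return preg_replace('/\s+/', '-', trim($name));
}

$filename = filename_from_url('http://www.whatever.com/this%20Thumbnail.jpg');
```

The returned name can then be handed to wp_unique_filename() as in the code above.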

PHP regular expression to match a filepath

Can someone please help me with this preg_match
if (preg_match('~[^A-Za-z0-9_\./\]~', $filepath))
// Show Error message.
I need to match a possible filepath. So I need to check for double slashes, etc. Valid file path strings should look like this only:
mydir/aFile.php
or
mydir/another_dir/anyfile.js
So a slash at the beginning of this string should be checked also. Please help.
Thanks :)
EDIT:
Also, guys, this path is being read from within a text file. It is not a filepath on the system. So hopefully it should be able to support all systems in this case.
RE-EDIT:
Sorry, but the string can also look like this too:
myfile.php, or myfile.js, or myfile.anything
How do I allow strings like this as well?? I apologize for not being too specific on this before...
Please note that there are many kinds of valid file paths.
For example:
"./"
"../"
"........" (yes this can be a file's name)
"file/file.txt"
"file/file"
"file.txt"
"file/.././/file/file/file"
"/file/.././/file/file/.file" (UNIX)
"C:\Windows\" (Windows)
"C:\Windows\asd/asd" (Windows, php accepts this)
"file/.././/file/file/file!@#$"
"file/.././/file/file/file!@#.php.php.php.pdf.php"
All of these file paths are valid, and I can't think of a simple regex that handles them perfectly.
Let's assume it's just a UNIX path for now. This is what I think should work for most cases:
preg_match('/^[^*?"<>|:]*$/',$path)
It accepts only strings that contain none of *, ?, ", <, >, | and : (remove the : from the class for Windows paths). These are the characters that Windows does not allow in a file name, along with / and \.
If it's Windows, you should replace the path's \ with / and then explode it and check whether it's absolute. Here is one example that works on both UNIX and Windows.
function is_filepath($path)
{
$path = trim($path);
if(preg_match('/^[^*?"<>|:]*$/',$path)) return true; // good to go
if(!defined('WINDOWS_SERVER'))
{
$tmp = dirname(__FILE__);
if (strpos($tmp, '/', 0)!==false) define('WINDOWS_SERVER', false);
else define('WINDOWS_SERVER', true);
}
/*first, we need to check if the system is windows*/
if(WINDOWS_SERVER)
{
if(strpos($path, ":") == 1 && preg_match('/[a-zA-Z]/', $path[0])) // check if it's something like C:\
{
$tmp = substr($path,2);
$bool = preg_match('/^[^*?"<>|:]*$/',$tmp);
return ($bool == 1); // so that it will return only true and false
}
return false;
}
//else // else is not needed
return false; // that t
}
You can do:
if(preg_match('#^(\w+/){1,2}\w+\.\w+$#',$path)) {
// valid path.
}else{
// invalid path
}
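The pattern above requires one or two directories, so it rejects the bare file names (myfile.php) added in the question's re-edit. A variant that makes the directory part optional could look like the sketch below. It is an assumption-laden illustration: is_relative_filepath() is a hypothetical helper name, and the rule of exactly one dot in the file name is carried over from the pattern above.

```php
<?php
// Hypothetical helper: true for "file.ext" or "dir/sub/file.ext".
// Rejects a leading slash, empty segments ("//"), "." or ".." segments,
// and any character outside A-Z a-z 0-9 _ within a segment.
function is_relative_filepath(string $path): bool
{
    // (?:\w+/)* allows zero or more directories; \z anchors at the very
    // end of the string, so a trailing newline is not accepted.
    return preg_match('#^(?:\w+/)*\w+\.\w+\z#', $path) === 1;
}

var_dump(is_relative_filepath('mydir/another_dir/anyfile.js'));
```

Names with more than one dot, such as a.min.js, are rejected by this pattern; widen the last part to \w+(?:\.\w+)+ if those should pass.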
