How to set the encoding after using move_uploaded_file? - php

I am saving a file using move_uploaded_file($file['tmp_name'], $save_path . $FileName); but when the file name I choose is in Arabic, the file is saved with strange characters like this: ÒíäÈ.pdf.
So when I try to open the uploaded file later under its real name, it says the file was not found. What should I do?

You can use transliteration on any string you want to convert to Latin-like characters. My very custom transliteration code looks like this:
// convert a utf8-encoded string into latin representative
function transliterate_string($params=array())
{
// PARAMS: "string", "language"
// 0) fill-in "native" chars by language and their "latin" representative
$SP_trans = array(
"ae"=>array("native"=>"ة,بْ,...","latin"=>"a,b,..."),
// ...other langs if you want
);
// 1) break "native" & "latin" strings
$nc = explode(",",$SP_trans[ $params["language"] ]["native"]);
$lc = explode(",",$SP_trans[ $params["language"] ]["latin"]);
// 2) convert to lower first
$string = mb_strtolower($params["string"],"utf-8");
// 3) loop each character
mb_internal_encoding("UTF-8");
$out = array(); // initialise so an empty input string does not break implode()
for($x=0,$sz=mb_strlen($string);$x<$sz;$x++) {
$char = mb_substr($string,$x,1);
$index = array_search($char,$nc);
$out[$x] = ($index===FALSE ? $char : $lc[$index]);
}
return trim(implode("",$out));
}
The function just scans a string and converts each character of a specific character
set into Latin in a custom way. Then you can safely save the file under a Latin name.

It would be better to rename the image using a timestamp:
$imgname = time().'.'.'jpg';
$imgtmpname=$_FILES['file']['tmp_name'];
$fullpath= $path.$imgname;
$filename = $imgname;
move_uploaded_file($imgtmpname,$fullpath);
Also store $imgname in the database so the image can be fetched by this name. This also avoids most name conflicts between images, since the timestamp changes every second.
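A minimal sketch of the timestamp approach, with two assumptions that are not in the answer above: the original extension is kept instead of hard-coding jpg, and a few random bytes are appended because time() only has one-second resolution, so two uploads in the same second would otherwise get the same name. The function name make_upload_name is hypothetical, and random_bytes() needs PHP 7 or later.

```php
<?php
// Hypothetical helper: build an ASCII-only, collision-resistant file name.
function make_upload_name($originalName)
{
    // Keep only the extension of the client-supplied name, lowercased.
    $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    // Allow only simple alphanumeric extensions; fall back to "bin" otherwise.
    if (!preg_match('/^[a-z0-9]{1,10}$/', $ext)) {
        $ext = 'bin';
    }
    // time() changes once per second, so add random hex to avoid same-second clashes.
    return time() . '_' . bin2hex(random_bytes(4)) . '.' . $ext;
}

// Usage (assumes $_FILES['file'] and $path exist as in the answer above):
// $imgname = make_upload_name($_FILES['file']['name']);
// move_uploaded_file($_FILES['file']['tmp_name'], $path . $imgname);
// Store $imgname together with the original Arabic name in the database.
```

Because the stored name is plain ASCII, the file system encoding no longer matters; the Arabic name only lives in the database and is used for display.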

Related

How to change filename's encode type?

I want to change my filenames' encoding from UTF-8 to Big5, and this is what I have so far:
$path = "stu_resume/104206002_87";
$result =iconv("utf-8", "big5", $path);
echo $result;
echo mb_detect_encoding($result);
Within the folder 104206002_87 there are 2 files, 104206002_87_履歷 and 104206002_87_自傳. After the code above is executed, nothing in the folder has changed. Does anyone know how to solve the problem? Thanks a lot.
iconv() doesn't modify files. It just converts a string. In this case, the string it's converting is "stu_resume/104206002_87" -- since this string only contains ASCII characters, nothing changes when it's converted from UTF-8 to Big5.
If you want to rename the files in the directory with that name, you will need to do so explicitly, e.g.
$iter = new DirectoryIterator("stu_resume/104206002_87");
foreach ($iter as $file) {
if (!$file->isDot()) {
$old_name = $file->getPathname();
$new_name = iconv("utf-8", "big5", $old_name);
rename($old_name, $new_name);
}
}
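The point about ASCII-only strings can be checked with a short sketch. It assumes the iconv extension on the system supports Big5; the Chinese name is written as UTF-8 byte escapes so the example does not depend on the encoding of the script file.

```php
<?php
// The directory path from the question contains only ASCII characters.
$path = "stu_resume/104206002_87";
$converted = iconv("UTF-8", "BIG5", $path);
// ASCII bytes are identical in UTF-8 and Big5, so the string is unchanged.
var_dump($converted === $path);

// A name with Chinese characters (UTF-8 bytes for 履歷) does change.
$name = "\xE5\xB1\xA5\xE6\xAD\xB7";
$big5 = iconv("UTF-8", "BIG5", $name);
var_dump($big5 !== $name);
```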

Responsive file manager v9 uploading arabic file's name issue

I am using Responsive File Manager (RFM) v9 as a TinyMCE plugin; the TinyMCE version is 4.7.4 and the PHP version is 5.5. I am trying to fix a problem with uploaded files that have Arabic names: RFM does not keep the correct names for them.
The images I chose for testing are named "vvv", "اختبار" and "اختبار - Copy", all of them 'jpg'. After I upload the files that have Arabic names, the result looks like this:
اختبار.jpg ===> ط§ط®طھط¨ط§ط±.jpg
اختبار - Copy.jpg ==> ط§ط®طھط¨ط§ط± - Copy.jpg
However, mb_internal_encoding in config.php is set to UTF-8.
I tried using iconv to convert from utf-8 to cp1256 in UploadHandler.php, line 1097, like this:
move_uploaded_file($uploaded_file, iconv("utf-8", "cp1256",$file_path));
instead of
move_uploaded_file($uploaded_file, $file_path);
This allowed the files to be uploaded with their Arabic names, but in the RFM browser they appeared as ?????? and ????? - Copy with no thumbnails, even though the thumb folder contained the images. The image اختبار.jpg was also not uploaded correctly and ended up corrupted. Only English file names work fine.
I went through all the PHP files, tried base64_encode, and tried changing the encoding in config.php, but nothing worked.
Does anyone have any idea how to fix that?
The reason you're getting "??????" and "?????" is that you also have to change the collation of your database, for example to UTF8 General CI. Then save the file name in the database as-is (without iconv()), and use iconv() only on the file name you pass when moving the file.
You don't want to mess with UploadHandler.php. All of the preprocessing of the upload happens in upload.php, including massaging the filename in the function fix_filename in utils.php. By the time it gets to UploadHandler, the filename has already been modified so iconv and friends won't work. Take a look at fix_filename and try manipulating the string there:
/**
* Cleanup filename
*
* @param string $str
* @param bool $transliteration
* @param bool $convert_spaces
* @param string $replace_with
* @param bool $is_folder
*
* @return string
*/
function fix_filename($str, $config, $is_folder = false)
{
if ($config['convert_spaces'])
{
$str = str_replace(' ', $config['replace_with'], $str);
}
if ($config['transliteration'])
{
if (!mb_detect_encoding($str, 'UTF-8', true))
{
$str = utf8_encode($str);
}
if (function_exists('transliterator_transliterate'))
{
$str = transliterator_transliterate('Any-Latin; Latin-ASCII', $str);
}
else
{
$str = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $str);
}
$str = preg_replace("/[^a-zA-Z0-9\.\[\]_| -]/", '', $str);
}
$str = str_replace(array( '"', "'", "/", "\\" ), "", $str);
$str = strip_tags($str);
// Empty or incorrectly transliterated filename.
// Here is a point: a good file UNKNOWN_LANGUAGE.jpg could become .jpg in previous code.
// So we add that default 'file' name to fix that issue.
if (strpos($str, '.') === 0 && $is_folder === false)
{
$str = 'file' . $str;
}
return trim($str);
}
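The garbled names in the question look like UTF-8 bytes being displayed as Windows-1256. That is an assumption based on the sample output, not something the asker confirmed. A short sketch that reproduces the symptom, assuming the iconv extension supports CP1256:

```php
<?php
// UTF-8 bytes of the Arabic word from the question (اختبار).
$utf8 = "\xD8\xA7\xD8\xAE\xD8\xAA\xD8\xA8\xD8\xA7\xD8\xB1";

// Pretend those bytes are Windows-1256 and convert them "to UTF-8" again.
// This kind of double interpretation produces a garbled name.
$garbled = iconv("CP1256", "UTF-8", $utf8);
echo $garbled, ".jpg\n";

// Reversing the step recovers the original bytes.
$restored = iconv("UTF-8", "CP1256", $garbled);
var_dump($restored === $utf8);
```

If the round trip succeeds, the name was never lost, only decoded with the wrong character set somewhere between upload and display.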

PHP: UTF8_decode needed with filter for ASCII values 126-160; proposed solution

I previously began exploring this problem here. Here is the true problem, and a proposed solution:
Filenames containing characters with values between 32 and 255 pose a problem for utf8_encode(). Specifically, it doesn't handle the character values between 126 and 160 (inclusive) correctly. While filenames with those characters may be written to a database, passing those filenames to a function in PHP code will produce error messages stating the file cannot be found, etc.
I discovered this when trying to pass a filename with the offending characters to getimagesize().
What is needed for utf8_encode is a filter to EXCLUDE the conversion of the values between 126 and 160 (inclusive), while INCLUDING the conversion of all other characters (or any character, characters, or character ranges of the user's desire; mine is for the ranges stated, for the reason provided).
The solution I devised requires two functions, listed below, and their application that follows:
// With thanks to Mark Baker for this function, posted elsewhere on StackOverflow
function _unichr($o) {
if (function_exists('mb_convert_encoding')) {
return mb_convert_encoding('&#'.intval($o).';', 'UTF-8', 'HTML-ENTITIES');
} else {
return chr(intval($o));
}
}
// For each character where value is inclusively between 126 and 160,
// write out the _unichr of the character, else write out the UTF8_encode of the character
function smart_utf8_encode($source) {
$text_array = str_split($source, 1);
$new_string = '';
foreach ($text_array as $character) {
$value = ord($character);
if ((126 <= $value) && ($value <= 160)) {
$new_string .= _unichr($character);
} else {
$new_string .= utf8_encode($character);
}
}
return $new_string;
}
$file_name = "abcdefghijklmnopqrstuvxyz~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”–—˜™š›œžŸ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ.jpg";
// This MUST be done first:
$file_name = iconv('UTF-8', 'WINDOWS-1252', $file_name);
// Next, smart_utf8_encode the variable (from encoding.inc.php):
$file_name = smart_utf8_encode($file_name);
// Now the file name may be passed to getimagesize(), etc.
$getimagesize = getimagesize($file_name);
If only PHP7 (6 is being skipped in the numbering, yes?) would include a filter on utf8_encode() to exclude certain character values, none of this would be necessary.
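For context, utf8_encode() treats its input as ISO-8859-1, where bytes 128 to 159 are control codes rather than the printable Windows-1252 characters (€, ‚, ƒ and so on). The sketch below shows the difference for a single byte. It uses mb_convert_encoding(), which performs the same ISO-8859-1 conversion when asked to, so that it also runs on PHP versions where utf8_encode() is deprecated. Whether converting from Windows-1252 directly solves the getimagesize() problem above is an assumption that depends on the file system.

```php
<?php
$byte = "\x80"; // the euro sign in Windows-1252

// What utf8_encode() does: read the byte as ISO-8859-1 (U+0080, a control code).
$asLatin1 = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-1');

// Reading the same byte as Windows-1252 gives U+20AC, the euro sign.
$asCp1252 = mb_convert_encoding($byte, 'UTF-8', 'Windows-1252');

echo bin2hex($asLatin1), "\n"; // c280
echo bin2hex($asCp1252), "\n"; // e282ac
```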

php file_put_contents asian character filename encoding

I'm trying to get this script to scrape images off of Wikipedia. What good is free licensed media if you can't get it? Original script is here.
If you put this
http://upload.wikimedia.org/wikipedia/commons/2/26/%E7%9A%84-bw.png
in firefox, it will immediately be transformed into
http://upload.wikimedia.org/wikipedia/commons/2/26/的-bw.png
so that when you save the image, it's saved as 的-bw.png
Simple enough eh? Now how to get php to do that? Just guessing, I tried utf8_decode($fileName) .. but getting the wrong Chinese characters.
$src= "http://upload.wikimedia.org/wikipedia/commons/2/26/%E7%9A%84-bw.png";
$pngData = file_get_contents($src);
$fileName = basename($src);
file_put_contents($fileName, $pngData);
Any help appreciated, as I really have no idea where to go from here.
Have you tried urldecode()?
<?php
$url = 'http://upload.wikimedia.org/wikipedia/commons/2/26/%E7%9A%84-bw.png';
$parts = explode('/', $url);
$title = $parts[count($parts)-1]; //get last section
$title = urldecode($title);
?>
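Put together with the code from the question, the decoding step could look like the sketch below. It uses rawurldecode() instead of urldecode(), a substitution made here because rawurldecode() leaves a literal "+" alone, which suits URL paths. The download itself is left out so the example runs without network access.

```php
<?php
$src = 'http://upload.wikimedia.org/wikipedia/commons/2/26/%E7%9A%84-bw.png';

// Take only the path part of the URL, then its last segment.
$path = parse_url($src, PHP_URL_PATH);
$segments = explode('/', $path);
$encodedName = end($segments);

// rawurldecode() turns %E7%9A%84 back into the UTF-8 bytes for 的.
$fileName = rawurldecode($encodedName);

echo $fileName, "\n";
// file_put_contents($fileName, $pngData); // as in the question
```

Whether the saved name then displays correctly depends on the file system and the PHP version.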
Squirrelmail contains a nice function in the sources to convert unicode to entities:
<?php
function charset_decode_utf_8 ($string) {
/* Only do the slow convert if there are 8-bit characters */
/* avoid using 0xA0 (\240) in ereg ranges. RH73 does not like that */
if (! ereg("[\200-\237]", $string) and ! ereg("[\241-\377]", $string))
return $string;
// decode three byte unicode characters
$string = preg_replace("/([\340-\357])([\200-\277])([\200-\277])/e",
"'&#'.((ord('\\1')-224)*4096 + (ord('\\2')-128)*64 + (ord('\\3')-128)).';'",
$string);
// decode two byte unicode characters
$string = preg_replace("/([\300-\337])([\200-\277])/e",
"'&#'.((ord('\\1')-192)*64+(ord('\\2')-128)).';'",
$string);
return $string;
}
?>
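Note that ereg() and the /e modifier of preg_replace() were removed in PHP 7, so the function above only runs on older versions. A sketch of the same arithmetic using preg_match() and preg_replace_callback() follows. The name charset_decode_utf_8_modern is hypothetical, and this is an adaptation, not SquirrelMail's own code. Like the original, it does not handle four-byte sequences.

```php
<?php
// Sketch: same arithmetic as the function above, without ereg() or /e.
function charset_decode_utf_8_modern($string)
{
    // Only do the slow convert if there are 8-bit characters.
    if (!preg_match('/[\x80-\xFF]/', $string)) {
        return $string;
    }
    // Decode three-byte sequences into numeric entities.
    $string = preg_replace_callback(
        '/([\xE0-\xEF])([\x80-\xBF])([\x80-\xBF])/',
        function ($m) {
            return '&#' . ((ord($m[1]) - 224) * 4096
                + (ord($m[2]) - 128) * 64
                + (ord($m[3]) - 128)) . ';';
        },
        $string
    );
    // Decode two-byte sequences into numeric entities.
    $string = preg_replace_callback(
        '/([\xC0-\xDF])([\x80-\xBF])/',
        function ($m) {
            return '&#' . ((ord($m[1]) - 192) * 64 + (ord($m[2]) - 128)) . ';';
        },
        $string
    );
    return $string;
}

echo charset_decode_utf_8_modern("\xE7\x9A\x84-bw.png"), "\n"; // &#30340;-bw.png
```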

How to extract only part of string in PHP?

I have the following string and I want to extract image123.jpg.
..here_can_be_any_length "and_here_any_length/image123.jpg" and_here_also_any_length
image123 can be any length (newimage123456 etc) and with extension of jpg, jpeg, gif or png.
I assume I need to use preg_match, but I am not really sure, and I would like to know how to code it or whether there are other functions I can use.
You can use:
if(preg_match('#".*?\/(.*?)"#',$str,$matches)) {
$filename = $matches[1];
}
Alternatively you can extract the entire path between the double quotes using preg_match and then extract the filename from the path using the function basename:
if(preg_match('#"(.*?)"#',$str,$matches)) {
$path = $matches[1]; // extract the entire path.
$filename = basename ($path); // extract file name from path.
}
What about something like this :
$str = '..here_can_be_any_length "and_here_any_length/image123.jpg" and_here_also_any_length';
$m = array();
if (preg_match('#".*?/([^\.]+\.(jpg|jpeg|gif|png))"#', $str, $m)) {
var_dump($m[1]);
}
Which, here, will give you :
string(12) "image123.jpg"
I suppose the pattern could be a bit simpler -- you could not check the extension, for instance, and accept any kind of file ; but not sure it would suit your needs.
Basically, here, the pattern :
starts with a "
takes any number of characters until a / : .*?/
then takes any number of characters that are not a . : [^\.]+
then checks for a dot : \.
then comes the extension -- one of those you decided to allow : (jpg|jpeg|gif|png)
and, finally, the end of pattern, another "
And the whole portion of the pattern that corresponds to the filename is surrounded by (), so it's captured -- returned in $m[1]
$string = '..here_can_be_any_length "and_here_any_length/image123.jpg" and_here_also_any_length';
$data = explode('"',$string);
$basename = basename($data[1]);
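The approaches above can be combined into one helper that extracts the quoted path, takes its basename, and checks the extension against the list from the question. This is only a sketch; the function name extract_image_name is hypothetical, and it assumes the path is the first double-quoted part of the string.

```php
<?php
// Hypothetical helper combining the quoted-path match with basename().
function extract_image_name($str)
{
    // Capture everything between the first pair of double quotes.
    if (!preg_match('#"([^"]*)"#', $str, $m)) {
        return null;
    }
    $name = basename($m[1]);
    // Accept only the extensions listed in the question.
    $ext = strtolower(pathinfo($name, PATHINFO_EXTENSION));
    return in_array($ext, array('jpg', 'jpeg', 'gif', 'png'), true) ? $name : null;
}

$str = '..here_can_be_any_length "and_here_any_length/image123.jpg" and_here_also_any_length';
var_dump(extract_image_name($str)); // string(12) "image123.jpg"
```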
