Special characters in URL filename cause problems - PHP

I have the following file: www.mywebsite.com/upload/server/php/files/foto/test/Aston_Martin_DBS_V12_coupé_(rear)_b-w.jpg
This file was uploaded through a script, and it exists on the server.
However, because of the special character in the URL (é), I am running into problems.
The filename on the server is Aston_Martin_DBS_V12_coup%C3%A9_(rear)_b-w.jpg, which is correct. However, my browser (Chrome) somehow requests this page as ISO-8859-1 instead of UTF-8.
As a result, I get a 404.
I am using the jQuery File Upload plugin.

I deleted my previous answer and wrote a new one:
Websites usually do not serve files with non-standard characters in their names. Such characters are normally removed, or replaced with similar standard characters (Polish ą becomes a, ś becomes s). For example, I rename files manually, or, when there are a lot of files, I use a bash or PHP script that removes or replaces those characters in the filenames on the server.
Anyway, if you HAVE TO keep the original filenames, you have to decode them from ISO-8859-1 and encode them as UTF-8.
Take a look at the PHP code fragment here:
how to serve HTTP files with special characters
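The renaming approach described above could look roughly like the sketch below. The function name sanitizeFilename and the character map are my own assumptions for illustration (they are not from the linked answer), and the map only covers a handful of letters. It also assumes the script file and the incoming filename are both UTF-8.

```php
<?php
// Hypothetical helper (not from the thread): maps a few accented letters to
// ASCII and replaces everything else outside [A-Za-z0-9._-] with "_".
function sanitizeFilename(string $name): string
{
    // Small illustrative map; extend it for the languages you need.
    $map = [
        'ą' => 'a', 'ś' => 's', 'é' => 'e', 'è' => 'e',
        'Ą' => 'A', 'Ś' => 'S', 'É' => 'E', 'È' => 'E',
    ];
    $name = strtr($name, $map);

    // Replace every remaining run of unsafe bytes with a single underscore.
    $name = preg_replace('/[^A-Za-z0-9._-]+/', '_', $name);

    // Collapse repeated underscores and trim them from both ends.
    $name = preg_replace('/_+/', '_', $name);
    return trim($name, '_');
}

echo sanitizeFilename('Aston_Martin_DBS_V12_coupé_(rear)_b-w.jpg'), "\n";
```

Doing this once at upload time means the stored name and the URL are plain ASCII, so the browser's encoding choice no longer matters.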

Some special characters cause problems in URLs when they appear in a filename, for example:
+, #, %, &
For files that are accessed through a URL, use filenames that do not contain these characters, for example:
str_replace(array(" ", "&", "'", "+", "#", "%"), "-", $filename)
With those characters replaced, it works fine.

If the filename itself contains % character codes, you need to encode those percent signs in your URL. Try accessing Aston_Martin_DBS_V12_coup%25C3%25A9_(rear)_b-w.jpg

Related

Filename with percent in URL

I have a directory that contains several PDFs, and a script that sets the access path and everything else,
Download
however, I get a 404 error for files whose names contain special characters (comma and percent).
File with special characters:
http://exodocientifica.com.br//_fispq/ALARANJADO%20DE%20METILA%200,2%.pdf
Other files, without special characters, are accessed normally.
File without special characters:
http://exodocientifica.com.br//_fispq/ALARANJADO%20DE%20METILA.pdf
I have already tried functions like urlencode and htmlentities, and did not get any results.
You should encode % as %25, since % is a special character in URLs. Try using this URL instead:
http://exodocientifica.com.br//_fispq/ALARANJADO%20DE%20METILA%200,2%25.pdf
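If the link is generated by a script, one way to get this encoding automatically is to run each path segment through rawurlencode() before putting it in the href. This is only a sketch: encodePath is a made-up helper name, and it assumes you start from the raw filename as stored on disk, not from an already encoded URL.

```php
<?php
// Hypothetical helper (not from the thread): percent-encodes every segment
// of a relative path while leaving the "/" separators intact.
function encodePath(string $path): string
{
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

// Raw name on disk: spaces, a comma and a literal percent sign.
echo encodePath('_fispq/ALARANJADO DE METILA 0,2%.pdf'), "\n";
```

Note that rawurlencode() also encodes the comma (as %2C), which is harmless. Encoding the whole URL in one call would break it, because the slashes and the scheme would be escaped too.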

Can't access static files with URL-encoded characters in their names

On my CDN, I have some old files with names that were created using the rawurlencode function. E.g. one of the files has this name:
Cat%20presentation.pdf
Now, when I try to read this file, I get a "File not found" error:
GET cdn.example.com/documents/Cat%20presentation.pdf
I believe that's because of the encoded space character in the name - %20. For the browser (and the CDN) what I'm asking for is this:
Cat presentation.pdf
while the "%20" part is actually in the filename.
Is there any way I can get around this and access the file?
You need to URL-encode the "%" character into "%25":
GET cdn.example.com/documents/Cat%2520presentation.pdf
But if I were you I'd just fix their filenames instead.
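If you go the renaming route and have filesystem access to the directory the CDN pulls from (an assumption, since the question does not say), a one-off cleanup could look something like this. decodeFilenames is a hypothetical name, and you would want to try it on a copy first.

```php
<?php
// Hypothetical one-off cleanup (not from the thread): renames files whose
// names contain literal percent-escapes, e.g. "Cat%20presentation.pdf",
// to their decoded form. Returns a map of old name => new name.
function decodeFilenames(string $dir): array
{
    $renamed = [];
    foreach (scandir($dir) as $name) {
        $decoded = rawurldecode($name);
        $source = $dir . DIRECTORY_SEPARATOR . $name;
        $target = $dir . DIRECTORY_SEPARATOR . $decoded;

        // Skip directories, names that do not change, decoded names that
        // contain a path separator, and names that would overwrite a file.
        if (!is_file($source)
            || $decoded === $name
            || strpbrk($decoded, "/\\") !== false
            || file_exists($target)
        ) {
            continue;
        }
        if (rename($source, $target)) {
            $renamed[$name] = $decoded;
        }
    }
    return $renamed;
}

// Usage (the path is a placeholder):
// print_r(decodeFilenames('/path/to/documents'));
```

After that, the file is stored as "Cat presentation.pdf" and the normal URL with %20 resolves to it.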

Bug with php file converted from ansi to utf-8

I have a few PHP script files encoded in ANSI. Now that I have converted my website to HTML5, I need everything in UTF-8, so that accents in these files are displayed correctly without any PHP conversion through iconv(). I used Notepad++ to set the encoding of my scripts to UTF-8 and saved the files. Most are fine and accents are displayed correctly, but the main script now blocks everything: the server returns only a white page, without any error message, even with ini_set('error_reporting', 'E_ALL')!
When I change the encoding back to ANSI in Notepad++ and save the file without any other change, it works again (except that the accents are not displayed correctly without iconv()).
I also tried using a PHP script to change the encoding with ...$file = iconv('ISO-8859-1','UTF-8', $file);... but the result is exactly the same!
I wrote a short PHP script to look for high character values, but the highest values seem to be the usual French accents like é, è, etc., which are also present in other files and pose no problem. I also removed other special characters, without any effect.
The problem is that the file is large, more than 4500 lines, and I'm not sure how to proceed to correct this. Has anyone had this problem, or does anyone have an idea?
The issue was with the "£" (pound) character, which I used a lot as a delimiter in preg_match("£(...)£", "...", $string) and preg_replace calls.
For some reason these characters were not accepted after the conversion. I had to replace all of them, and only then did it work in UTF-8. Apparently they are no longer a problem now that the file is converted, so I can use them again.
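A possible explanation, which the thread does not confirm: PHP reads the regex delimiter as a single byte, while "£" takes two bytes in UTF-8 (one byte in ANSI/ISO-8859-1), so a "£" delimiter can stop being a valid delimiter once the file is re-saved. A minimal sketch of a safer pattern, assuming the script is saved as UTF-8, is to use an ASCII delimiter such as "#" and add the "u" modifier. The sample string is made up.

```php
<?php
// Assumes this file is saved as UTF-8.
// "#" is a single byte in ANSI and UTF-8 alike, so it stays a valid
// delimiter after re-encoding; "u" tells PCRE to treat the pattern and
// the subject as UTF-8, so "£" inside the pattern is matched as one character.
$subject = 'prix: 12£ ou 15£';

preg_match_all('#(\d+)£#u', $subject, $matches);

print_r($matches[1]);
```

With this style, the pound sign can still appear in the pattern body; only its role as delimiter is taken over by an ASCII character.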

Arabic characters and UTF-8 in aria2

I use aria2 to download over XML-RPC, and when I start a download like this in PHP:
$client->aria2_addUri( array($url), array("dir"=>'/home/amir/دانلود') );
it creates a folder named شسÛب instead of دانلود. I posted about this in the aria2 forums, and they said aria2 has no problem as long as the string is sent to it as UTF-8.
So I set a UTF-8 header and converted the string to UTF-8, but it still does not work:
header('Content-type:application/json; charset=utf-8');
$dir_on_server = mb_convert_encoding($dir_on_server, 'UTF-8');
What do you think?
Try accessing the file or folder via the browser.
You can do that by writing a .htaccess file with the content "Options Indexes" so that your folders are listed. (I can even access them via HTTP.)
I created multiple files and folders with a script in which the GET value "file" or "folder" determines the name of the file or folder, and I tried it with Japanese and Arabic characters. Although they are not shown correctly over FTP (in my case the file names appear only as "?????"), they are displayed correctly when read by a script.
The problem might be the program you use to access your FTP. WinSCP, for example, has UTF-8 set to "auto" by default, so forcing it might work. (I have to admit that it does not work on my side; maybe my Linux server does not support UTF-8 file names, which could also be a problem for you.)
PS:
Also make sure your PHP file is encoded (saved) as UTF-8 without BOM, since you are using a constant UTF-8 string.
EDIT:
Also, if you still intend to use mb_convert_encoding, you had better add the optional parameter "from_encoding".
I tested this with Japanese in a SHIFT-JIS encoded file:
$text = "A strange string to pass, maybe with some 日本語の characters.";
echo mb_convert_encoding($text, 'UTF-8');
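and it is not displayed correctly although my browser is set to UTF-8, so you cannot rely on the source encoding being guessed correctly. (Without from_encoding, the function assumes the internal encoding instead of detecting the real one.)
Specifying the source encoding explicitly, for example, works for me: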
$text = "A strange string to pass, maybe with some 日本語の characters.";
echo mb_convert_encoding($text, 'UTF-8', 'SJIS'); //from SJIS(SHIFT-JIS)
This little script is handy for finding the right from_encoding value for your Arabic characters:
http://www.php.net/manual/de/function.mb-convert-encoding.php#97902
But converting is not necessary if the file is already in UTF-8; it only makes sense if it is in some Arabic encoding, so I don't think this brings you any closer to the solution.
EDIT2:
I tried a different FTP program: FileZilla correctly displays my files and folders, including those with Japanese names and the Arabic one. (I was using WinSCP 4.3.4 before.)
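To avoid converting a string that is already UTF-8 (which would garble it), the from_encoding advice above can be combined with a validity check. This is just a sketch: toUtf8 is a hypothetical helper, it needs the mbstring extension, and the fallback encoding is something you have to know or guess yourself. ISO-8859-6 below is only a placeholder for "some Arabic encoding".

```php
<?php
// Hypothetical helper (not from the thread): returns $text as UTF-8.
// If the input is already valid UTF-8 it is returned unchanged; otherwise
// it is converted from $fallbackEncoding, which the caller must supply.
function toUtf8(string $text, string $fallbackEncoding): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, converting again would corrupt it
    }
    return mb_convert_encoding($text, 'UTF-8', $fallbackEncoding);
}

// Assumes this file is saved as UTF-8 without BOM, so the literal below
// is already UTF-8 and passes through untouched.
$dir = toUtf8('/home/amir/دانلود', 'ISO-8859-6');
echo $dir, "\n";
```

If the string comes out unchanged here and the folder name is still wrong, the problem is more likely in how the value is transported or displayed than in the PHP string itself.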

Image src with special characters

I have a problem with my images not being displayed when their names contain a # or % symbol.
I am using PHP to read a directory and display all images, but any with those symbols just have broken links. The images upload to the server fine but won't display.
I think you'll need to write a function that replaces the % and # characters with their corresponding URL-encoded sequences; you can find a reference here:
http://www.w3schools.com/tags/ref_urlencode.asp
You should compare the output of your script with the directory listing you get for that directory in your browser; then you will see the correct mapping for your special characters.
Are you uploading the images via PHP? Then you could map special characters to spaces or dashes. I don't think having special characters in file names is a good idea.
The best option is to strip out all those characters before uploading the images. (If you are using FTP, rename your files. If you are using PHP to upload, write a function that does it.)
Otherwise you will need to escape the symbols: take a look at urlencode.
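For the escaping route, here is a rough sketch of how the tag could be built. imgTag and the "images" directory are made-up names. It uses rawurlencode() rather than urlencode(), because urlencode() turns spaces into "+", which is meant for query strings and not for paths.

```php
<?php
// Hypothetical helper (not from the thread): builds an <img> tag for a
// file that lives in the web-accessible directory $webDir.
function imgTag(string $webDir, string $filename): string
{
    // rawurlencode() turns "#" into %23 and "%" into %25, so the browser
    // does not read them as a fragment marker or an escape sequence.
    $src = rtrim($webDir, '/') . '/' . rawurlencode($filename);

    // htmlspecialchars() protects the attribute values in the HTML itself.
    return sprintf(
        '<img src="%s" alt="%s">',
        htmlspecialchars($src, ENT_QUOTES, 'UTF-8'),
        htmlspecialchars($filename, ENT_QUOTES, 'UTF-8')
    );
}

echo imgTag('images', 'sale 50% #1.jpg'), "\n";
```

In a directory loop you would call this once per entry returned by scandir() or glob(), passing only the file name and not the full path.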
