Issues with encoding - working on one webserver but not another - php

I'm completely lost at the moment. I've been using a PHP script which reads in Hebrew characters from two different sources: a CSV file (UTF-8) and a website (UTF-16). But since I moved web hosts, the characters from the CSV file are not showing up on the page.
The identical file runs perfectly on my previous web host (HostGator), but since I moved to a VPS it's not printing the characters from the CSV file.
When I read from the website the page still shows the characters, but the characters from the CSV file don't show up.
I read in the CSV file using array_map('str_getcsv', file('tAreas.csv'));. On the previous web host the elements would be in UTF-8; on the VPS the elements are ASCII (not sure what's causing the difference or how to rectify it).
I assume it's something to do with the configuration of my web server, though I'm not entirely sure what needs to be done. Any help would be appreciated.
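
A first diagnostic step could be to look at the raw bytes PHP reads, before str_getcsv touches them. The sketch below is only an illustration: the helper name describeCsvBytes is hypothetical, it assumes the mbstring extension is loaded, and it uses an in-memory sample in place of tAreas.csv. If the raw bytes turn out to be valid UTF-8 on both hosts, the difference is more likely to sit in the server configuration (for example the locale, which the CSV functions take into account as far as I know, or default_charset) than in the file itself.

```php
<?php
// Hypothetical helper: summarize the raw bytes of a CSV before parsing.
// Assumes the mbstring extension is available.
function describeCsvBytes(string $raw): array
{
    return [
        // A UTF-8 byte order mark at the start of the data.
        'has_bom'    => strncmp($raw, "\xEF\xBB\xBF", 3) === 0,
        // True when every byte sequence is well-formed UTF-8.
        'valid_utf8' => mb_check_encoding($raw, 'UTF-8'),
        // True when no byte is above 0x7F, i.e. the data is plain ASCII.
        'pure_ascii' => preg_match('/[\x80-\xFF]/', $raw) === 0,
    ];
}

// In-memory stand-in for file_get_contents('tAreas.csv'):
// a Hebrew word encoded as UTF-8, followed by a numeric field.
$sample = "\xD7\xA9\xD7\x9C\xD7\x95\xD7\x9D,1\n";
$info   = describeCsvBytes($sample);
```

On each host the same helper could be fed file_get_contents('tAreas.csv') and the two results compared with var_dump.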

Related

Moved wordpress : broken uploaded files with accents

I moved my WordPress website from my old server to my new one. Everything works, except that every uploaded file (PDFs, for example) with accents (é, à, and so on) in the file name has a broken URL. So I would have to rename every file with accents and change the links that point to these files.
The images are showing though. It really seems to be only affecting files with accents in the filename.
Any solutions for this? I just wanted to move the wordpress website to my new server. It worked, but I would have to rename all these uploaded files.
Thanks!
I had a similar problem. I don't have a 100% correct answer, but I want to help.
I remember it was a problem with the FTP program (I don't remember which one, maybe WinSCP?).
Anyway, that program did not work correctly with the server and changed the accented characters. You can try the transfer with another program or from the server shell.
Second, we knew which characters were wrong, and my friend wrote a plugin to change them, but if your problem goes the other way (plain character to accent) this can't work...
Find where your problem is; possibly you can fix it with a plugin like this: https://wordpress.org/plugins/clean-image-filenames/
One last important thing: check whether your database uses the correct encoding.
Sorry for my English ;)

PHP and UTF-8 (and Windows as well)

Here is the situation:
I'm using UTF-8 to input Japanese characters into a MySQL database through a PHP form. When I do this from my PC it works perfectly and the script records the characters correctly in the DB, but from other PCs the script inputs raw symbols. I've declared everything regarding UTF-8: header, meta tag, etc. I'm sure this is not a PHP/SQL issue (because it works perfectly from one PC) but something in the Windows configuration that I cannot understand.
Does anyone know anything about this issue?

PHP file losing formatting after FTP upload

I am using WinSCP to transfer files to an FTP site. I have a situation currently where one specific file within a folder loses all of its formatting when it is uploaded causing the PHP file to no longer work.
All other PHP files within the folder work correctly when uploaded.
I can't understand why just one file could be affected in this way. Can anyone shed any light on the situation?
The file was probably transferred via ASCII mode, which will modify the encoding and the line endings of the file.
As you have not stated what exactly you mean by "losing formatting", it's difficult to answer. Anyway:
As per src's answer, if the line endings change due to an ASCII/text mode transfer, the converted file can look as if it lost its formatting when opened in an editor that does not support the target line endings. That hardly explains why only one file is affected, though. Although WinSCP can technically choose a different transfer mode, for example based on file size or modification timestamp, if configured to do so, I doubt you set that up. Also note that WinSCP defaults to binary transfer mode. It would help if you stated which transfer mode you actually use with WinSCP. The definitive source for this information is the WinSCP session log file. Sharing the relevant part of the log file would also help with the investigation.
Another possibility is that the affected source file was created with different line endings in the first place (for example, in a different editor than the one you usually use). In that case the problem would have nothing to do with the transfer mode or WinSCP, and the difference is possibly revealed only after you open the files on the remote side in a third editor that supports only one of the line ending formats.
Though in both of these cases, the file should still work in PHP, as PHP supports both Unix and Windows line endings. Possibly the source file has such a strange format that during ASCII/text mode transfer, the server got confused and converted the file incorrectly. But that's just a wild guess.
Again, we need more information to help you.
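
To narrow down which of the two possibilities above applies, the line endings of the local and the uploaded copy could be counted and compared. This is a minimal sketch, not something from the thread: the function name countLineEndings is hypothetical, and the sample string stands in for the contents of the affected file.

```php
<?php
// Hypothetical helper: count each style of line ending in a string.
function countLineEndings(string $data): array
{
    $crlf = substr_count($data, "\r\n");        // Windows style
    $lf   = substr_count($data, "\n") - $crlf;  // Unix style (LF not preceded by CR)
    $cr   = substr_count($data, "\r") - $crlf;  // lone CR (classic Mac style)

    return ['crlf' => $crlf, 'lf' => $lf, 'cr' => $cr];
}

// Stand-in for file_get_contents() on the affected file:
// one CRLF, one LF, and one lone CR.
$counts = countLineEndings("a\r\nb\nc\rd");
```

If the local copy reports only one style and the uploaded copy reports another (or a mix), the transfer changed the file; if the local copy is already mixed, the file was created that way.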

No images when switching Wordpress site hosts

Here's my problem :
I switched a Wordpress site from HostGator to MediaTemple. Since the domain name stays the same, I backed up and re-imported the database, downloaded and re-uploaded the site content without issues.
The first time, everything worked well except that in place of the images, I could only see question marks. Opening an image in a new tab would show "Not found". I went into the FTP and realized that the file names were in French, with accents in them like "é" and "à", and that in the process of downloading the files to my Mac (using Coda) and re-uploading them to the server, the accents were all replaced by weird characters...
I tried to manually rename them; it did not work.
I tried to do it using different FTP apps; it did not work.
I tried using Windows to do it; it did not work.
By playing with Coda's preferences I managed to change the encoding and re-upload the files to the server while keeping all the accents, but it still didn't work...
The database is in UTF-8, and I tried multiple collations like utf8_bin and utf8_general_ci, but that didn't work either...
I am pretty sure it is a character encoding issue, since there are one or two images working on the site and they have no accents in their names, but I really don't know where to look anymore.
I have switched multiple WordPress websites and never had this problem before. Could somebody point me in the right direction, please?
In WordPress all the links are saved in the database and are not hard-coded in the HTML files. So when you're moving your website from one host to another, you must find and replace the previous host's links with the new host's links in the database file exported from the previous host before importing it into the new host.
Go to the following links for details.
How to Move WordPress From Local Server to Live Site
Moving WordPress - Wordpress Codex
Wish you good luck.
Ask your new hosting service to chown your files to your new account, which may solve the problem.
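
One thing that may be worth checking for the accented file names in the question: the same visible name can be stored as different byte sequences, and each one produces a different URL. The sketch below only illustrates that; the file name and the function name urlPathFor are made up, and whether this is the actual cause on the poster's server is an assumption. A name that went through macOS is often in decomposed form ("e" plus a combining accent), while the link stored in the database may use the precomposed form.

```php
<?php
// Three byte-level spellings of the same visible name "café.pdf" (made-up example).
$precomposed = "caf\xC3\xA9.pdf";   // UTF-8, single code point U+00E9
$decomposed  = "cafe\xCC\x81.pdf";  // UTF-8, "e" + combining acute U+0301
$latin1      = "caf\xE9.pdf";       // ISO-8859-1 byte for "é"

// Hypothetical helper: the percent-encoded form a browser would request.
function urlPathFor(string $filename): string
{
    return rawurlencode($filename);
}

$urls = array_map('urlPathFor', [$precomposed, $decomposed, $latin1]);
```

Comparing bin2hex() of a name returned by scandir() on the server with bin2hex() of the name stored in the database would show whether the two actually match byte for byte.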

Migrate web-pages from different char-sets to UTF-8

For the last few years I have used Notepad++ on Win XP SP2.
As I have just seen, Notepad++ is set to encode new files as "ANSI" in "Windows Format". So basically all files on my hard disk should be ANSI files, but I'm not sure.
Most .html files have a charset tag such as "text/html; charset=iso-8859-1", but some have none.
Other files, especially text files (for example keyword lists), I stored with the Firefox XPCOM system, and I don't know how they are currently encoded.
On Server-side I have Apache with PHP and MySql.
For Upload I used Filezilla.
Now the problem is: I want to use Japanese characters (or Arabic, etc.). This only partly works.
I can get my self-made Firefox application to consistently write or read UTF-8. But I can't check every time which encoding each of the old files uses.
Having just read Joel Spolsky's old article about UTF-8 strengthens my view that I simply have to get my whole system changed to UTF-8 as much as possible.
Once I have it running that way locally on my hard disk, I could just re-upload everything to the server.
So: how do I get all my local files converted to UTF-8?
And: is it possible at all to have Win XP SP2 use UTF-8 consistently everywhere? Or do I have to check with every program, or even worse with every file, that the right encoding is used?
What about files I get, for example, in e-mails or via a USB stick, or that I download in zip files? (Or a thousand other possibilities.)
Update:
1.-4. went OK so far. I tried first with BOM, but without seems to be better.
So to 5.) Something I have to change there too. I changed as in 3.) the charset in the html-template-file, and the text coming from the template is displayed correctly. But the text coming from MySql/Php shows the UnknownChar-sign at some places currently, i.e. where there should be Umlaute äöü.
I have changed all collations for text fields in the MySql-Database via phpmyadmin to "utf8_unicode_ci", but that didn't do the trick.
Is it a php-issue, or do I only have to convert somehow the data in the MySql-Database once?
The beauty of UTF-8 is that it's a superset of ASCII, so if your HTML and PHP files only contain ASCII characters (i.e. English text and programming/HTML syntax), you don't need to convert them at all. You can leave most of your files unchanged.
Should you find a few exceptions that you want to convert manually, you can open them in Notepad++ and choose 'Encoding' - 'Convert to UTF-8 (No BOM)'.
Yes, you do need to change or add the <meta> charset tag in all the HTML files to make sure the browser renders your files as UTF-8.
In Notepad++ you can set new files to always be created as 'UTF-8 (No BOM), Unix'. Also, tick "Apply to ANSI files" so old files can be correctly saved in the new encoding. I suggest the Unix format because even though you are working on a Windows machine, web servers usually run Linux/BSD, so that format is the native form (keeping files in their native form is especially important when you are using a version control system).
Migrating a live site with a database is a different issue. Data in MySQL comes with its own encoding, and from your question I cannot tell whether you need to convert it or how. I would need more specifics on that (if you need to).
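
For converting many files at once, the "leave ASCII alone, convert the exceptions" idea above can be sketched in PHP. This is an illustration under stated assumptions, not a definitive tool: the function name toUtf8 is hypothetical, the mbstring extension is assumed, and the legacy encoding is assumed to be ISO-8859-1 because that is the charset the question mentions. Bytes that are already valid UTF-8 (which includes pure ASCII) are returned unchanged.

```php
<?php
// Hypothetical helper: return UTF-8 bytes for data that is either already
// UTF-8/ASCII or in an assumed legacy single-byte encoding.
function toUtf8(string $bytes, string $assumedLegacy = 'ISO-8859-1'): string
{
    // Pure ASCII and well-formed UTF-8 both pass this check, so they are kept as-is.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    // Otherwise treat the bytes as the assumed legacy encoding and convert.
    return mb_convert_encoding($bytes, 'UTF-8', $assumedLegacy);
}

// "äöü" as ISO-8859-1 bytes becomes the two-byte UTF-8 sequences.
$converted = toUtf8("\xE4\xF6\xFC");
```

The check is a heuristic: a legacy file whose bytes happen to form valid UTF-8 would be left unconverted, so it seems safer to keep backups and spot-check the results. For a whole directory, each file could be read with file_get_contents, passed through the helper, and written back with file_put_contents.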
