CSV to PHP file - Issue with multi language words - php

I am reading a CSV file into PHP. The file reads properly except for one thing: words in non-English languages.
The array returns symbols instead of the actual words.
For example: LBL_Hello Hello 你好
The CSV file contains the line above. I read it into a PHP array using the code below:
<?php
$file = file_get_contents("test.csv");
$data = (array_map("str_getcsv", preg_split('/\r*\n+|\r+/', $file)));
echo "<pre>";
print_r(($data));
exit;
?>
Output:
Array
(
[0] => Array
(
[0] => LBL_Hello
[1] => Hello
[2] => ���
)
[1] => Array
(
[0] =>
)
)
Here, the Chinese word "你好" shows up as "���". The same happens with words in every other non-English language.
What am I missing? How can I get the exact words when reading the CSV into PHP?
Thanks in advance.

Try sending a charset header at the top of the file (before any other output), so the browser decodes the page as UTF-8:
header('Content-Type: text/html; charset=UTF-8');
Sample code to read data containing non-English characters from a CSV:
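<?php
// Tell the browser the output is UTF-8; this must run before any output.
header('Content-Type: text/html; charset=UTF-8');
$handle = fopen("test.csv", "r");
$dataArr = [];
echo "<pre>";
// Read one CSV row per iteration until fgetcsv() returns false.
while (($data = fgetcsv($handle, 1000, ",")) !== false) {
    $dataArr[] = $data;
}
fclose($handle);
print_r($dataArr);
echo "</pre>";
exit;
?>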
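If the header alone does not help, the file itself may not be stored as UTF-8. The following is a minimal sketch under that assumption, not a definitive fix: it needs the mbstring extension, `csvToUtf8Rows` is a hypothetical helper name, and the fallback encoding `GB18030` is only a guess for a file containing Chinese text (the question does not say how the file was saved).

```php
<?php
// Hypothetical helper: parse CSV text into rows, converting it to UTF-8 first if needed.
function csvToUtf8Rows(string $contents, string $fallbackEncoding): array
{
    // If the bytes are not valid UTF-8, assume they are in $fallbackEncoding (an assumption).
    if (!mb_check_encoding($contents, 'UTF-8')) {
        $contents = mb_convert_encoding($contents, 'UTF-8', $fallbackEncoding);
    }
    // Drop a UTF-8 byte order mark if one is present.
    if (strncmp($contents, "\xEF\xBB\xBF", 3) === 0) {
        $contents = substr($contents, 3);
    }
    $rows = [];
    // Simple line split; quoted fields that contain newlines are not handled here.
    foreach (preg_split('/\r\n|\r|\n/', $contents) as $line) {
        if ($line === '') {
            continue; // skip blank lines, such as the one after a trailing newline
        }
        $rows[] = str_getcsv($line, ',', '"', '\\');
    }
    return $rows;
}

// Usage (test.csv is the file from the question; GB18030 is only a guess):
// header('Content-Type: text/html; charset=UTF-8');
// print_r(csvToUtf8Rows(file_get_contents('test.csv'), 'GB18030'));
```

Skipping blank lines also avoids the empty second row shown in the question's output.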

Related

How to remove unwanted ";" before explode php [duplicate]

I am trying to explode a text file into one array per line. The file is served from a URL in this format:
Ville:Montr&eacute;al; Fichier:montreal.txt
Ville:Qu&eacute;bec; Fichier:quebec.txt
The problem is that the separator ";" also appears in other parts of the string.
The wanted result is:
[0] => Ville:Québec [1] => Fichier:quebec.txt
[0] => Ville:Montréal [1] => Fichier:montreal.txt
I am using this code:
<?php
$tabCities = file('redacted');
//$oneLine = utf8_decode($tabCities[0]);
$oneLine = $tabCities[0];
$arrayLine = explode(";", $oneLine);
print_r($oneLine);
print_r($arrayLine);
?>
It outputs
Ville:Montréal; Fichier:montreal.txt Array ( [0] => Ville:Montré [1] => al [2] => Fichier:montreal.txt )
utf8_decode does not help. Is there any other function or strategy I can try?
You are looking for html_entity_decode.
It converts the HTML entity (&eacute;) into the Unicode character é, so the entity's ";" is gone before you call explode.
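A minimal sketch of that approach, assuming each line as served contains the named entity `&eacute;` and uses ";" only as the field separator otherwise. `splitCityLine` is a hypothetical helper name, not something from the question.

```php
<?php
// Hypothetical helper: decode HTML entities first, then split on ";".
function splitCityLine(string $line): array
{
    // Turn entities such as &eacute; into real UTF-8 characters.
    $decoded = html_entity_decode($line, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Split on the separator and trim spaces and the trailing newline.
    return array_map('trim', explode(';', $decoded));
}

print_r(splitCityLine("Ville:Montr&eacute;al; Fichier:montreal.txt\n"));
```

Decoding before splitting matters: once the entity is a single character, the only ";" left in the line is the real separator.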

PHP List all files in a directory with Unicode Filenames

I have these JSON files in my directory, and the filenames contain Unicode characters.
"Spanish - Estoy leyendo este archivo.json"
"Malayalam - ഞാൻ ഈ ഫയൽ വായിക്കുന്നു.json"
"Greek - Διαβάζω αυτό το αρχείο.json"
"Japanese - このファイルを読んでいます.json"
"English - I am reading this file.json"
I'm trying to list all files using scandir function with the below snippet.
<?php
$dir = './';
$files = scandir($dir);
echo "<pre>";
print_r($files);
echo "</pre>";
?>
But instead of the actual filenames, I get the following:
Array
(
[0] => .
[1] => ..
[2] => English - I am reading this file.json
[3] => Greek - ??a�??? a?t? t? a??e??.json
[4] => Japanese - ?????????????.json
[5] => Malayalam - ??? ? ??? ????????????.json
[6] => Spanish - Estoy leyendo este archivo.json
[7] => index.php
)
What is the correct way to achieve this?
It seems the web server is not declaring the UTF-8 character set in its response header (check the Content-Type response header in your browser's developer tools, Network tab).
So even if the PHP variable contains the correct filename, the browser does not expect UTF-8 and displays the characters incorrectly. Try setting the character set explicitly in PHP:
<?php
header('Content-type: text/html; charset=utf-8');
$dir = './';
$files = scandir($dir);
echo "<pre>";
print_r($files);
echo "</pre>";
?>
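The premise that the PHP variable already holds the correct filename can be checked before changing any headers. The following is a minimal sketch that assumes the mbstring extension is available; `inspectNames` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: report whether each name is valid UTF-8 and show its raw bytes.
function inspectNames(array $names): array
{
    $report = [];
    foreach ($names as $name) {
        $report[] = [
            'name'       => $name,
            'valid_utf8' => mb_check_encoding($name, 'UTF-8'),
            'hex'        => bin2hex($name), // raw bytes, two hex digits per byte
        ];
    }
    return $report;
}

// Usage with the directory from the question:
// print_r(inspectNames(scandir('./')));
```

If the hex dump already contains literal question marks (byte `3f`) where the non-Latin characters should be, the names were damaged before they reached the browser, and a header change alone would not be expected to restore them.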

Can't replace this unicode (�) to " " (space) on csv upload to PHP

I have Excel data converted from XLSX to CSV, which I need to upload to my site. The data looks like this in the CSV but changes after the upload.
// On Excel (CSV)
Row Description
1 Enjoy this life without drugs
2 Life is so short, so enjoy it
After uploading it to the site and inserting it into MySQL, it looks like this:
// On MySQL
Row Description
1 Enjoy?this life without?drugs
2 Life is?so?short, so?enjoy it
// On PHP (echo loop)
Row Description
1 Enjoy�this life without�drugs
2 Life is�so�short, so�enjoy it
I checked my CSV: the characters that changed into ? and � are just spaces. So I tried to replace them, but all of these attempts failed:
// $the_string = one line of the Description text.
1. str_replace("�", " ", $the_string);
2. str_replace("&#65533", " ", $the_string);
3. str_replace("&#xfffd", " ", $the_string);
4. str_replace("?", " ", $the_string);
But if I test it on its own with <?php str_replace("�", " ", "a�b"); ?>, it works.
I don't know where the mistake is.
This is my source code :
public function upload()
{
    $config = array(
        "upload_path" => "./uploads/",
        "allowed_types" => "csv"
    );
    $this->load->library("upload", $config);
    $this->load->helper("file");
    $this->upload->initialize($config);
    $upload = $this->upload->data();
    $file = base_url()."uploads/{$upload['file_name']}";
    // First pass: count the lines in the uploaded file.
    $file_handle = fopen($file, "r");
    $check_line = 0;
    while ( ! feof($file_handle))
    {
        $line_of_text = fgetcsv($file_handle, 1024);
        $check_line++;
    }
    fclose($file_handle);
    if ($check_line > 1)
    {
        // Second pass: read from the newly opened handle and insert each description.
        $file_handle2 = fopen($file, "r");
        while ( ! feof($file_handle2))
        {
            $line_of_text = fgetcsv($file_handle2, 1024);
            $description = $line_of_text[1];
            $this->model->insert_description($description);
        }
        fclose($file_handle2);
    }
}
Try preg_replace():
preg_replace('/\x{FFFD}/u', ' ', $the_string);
Note: this replaces the � character with a space, but ONLY if the real U+FFFD character is stored in the string.
The � character is also what gets displayed in place of any byte sequence that is not valid in the encoding being used; in that case the string holds the raw bytes rather than U+FFFD, and the pattern will not match them.
To remove all ASCII control characters and all non-ASCII bytes (this also strips legitimate multibyte UTF-8 characters), use this:
preg_replace('/[\x00-\x1F\x7F-\xFF]/', '', $the_string);
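When the string holds raw bytes rather than U+FFFD, converting it to UTF-8 first gives the pattern something to match. The following is a minimal sketch that assumes the CSV was saved by Excel in Windows-1252 and that the changed "spaces" are non-breaking spaces (byte 0xA0). Neither assumption is confirmed by the question, and `normalizeSpaces` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: make the text valid UTF-8, then turn odd spaces into plain spaces.
function normalizeSpaces(string $raw): string
{
    // If the bytes are not valid UTF-8, assume Windows-1252 (an assumption about the CSV).
    $utf8 = mb_check_encoding($raw, 'UTF-8')
        ? $raw
        : mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');
    // Replace non-breaking spaces (U+00A0) and replacement characters (U+FFFD) with a space.
    return preg_replace('/[\x{00A0}\x{FFFD}]/u', ' ', $utf8) ?? $utf8;
}

// A line as it might arrive from the CSV, with 0xA0 bytes in place of some spaces.
echo normalizeSpaces("Enjoy\xA0this life without\xA0drugs"), "\n";
```

Applying the helper to `$description` before the insert would keep the stray bytes out of MySQL as well.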

PHP get base64 decoding xml array and encode it to pdf

I need to take the base64-encoded data contained in an XML array, decode it, and output it as a PDF.
This is the array:
<response>
<xmlArray>
<blabla>TG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFtZXQsIGNvbnNlY3RldHVyIGFkaXBpc2NpbmcgZWxpdC4gTnVuYyB2ZW5lbmF0aXMsIGp
1c3RvIHV0IGF1Y3RvciBzZW1wZXIsIHB1cnVzIGxlY3R1cyBlbGVtZW50dW0gbGliZXJvLCBhYyBwZWxsZW50ZXNxdWUgYW50ZSBqdXN0b
yBldCB0dXJwaXMuIE51bmMgYmliZW5kdW0gZWdlc3RhcyBkb2xvciB2b2x1dHBhdCBlZ2VzdGFzLiBTdXNwZW5kaXNzZSBkYXBpYnVzIHN
lbSBvZGlvLCBpbiBmYXVjaWJ1cyBsZWN0dXMgcHVsdmluYXIgdml0YWUuIE5hbSBtYXR0aXMgZXVpc21vZCBhdWd1ZSwgZWdldCBmaW5pY
nVzIGxlbyBoZW5kcmVyaXQgbmVjLiBDbGFzcyBhcHRlbnQgdGFjaXRpIHNvY2lvc3F1IGFkIGxpdG9yYSB0b3JxdWVudCBwZXIgY29udWJ
pYSBub3N0cmEsIHBlciBpbmNlcHRvcyBoaW1lbmFlb3MuIE51bmMgaWQgbnVuYyBsZWN0dXMuIFBlbGxlbnRlc3F1ZSBsYWN1cyB1cm5hL
CB2aXZlcnJhIGNvbnZhbGxpcyBlZmZpY2l0dXIgdmVsLCBhbGlxdWFtIHNpdCBhbWV0IGRpYW0uIEluIGEgYWxpcXVldCBtYXNzYS4gU2V
udCwgdGVtcHVzIGhlbmRyZXJpdCBlbmltIGZhdWNpYnVzLiBEb25lYyBtYXR0aXMgZWxpdCBub24gbWFzc2EgaW50ZXJkdW0gZmF1Y2lid
XMuIEFlbmVhbiBub24gbWF1cmlzIGluIHVybmEgbWFsZXN1YWRhIGx1Y3R1cy4=
</blabla>
</xmlArray>
</response>
My attempt up until now is
<?php
// 1. Read the <blabla> element from the parsed XML
$blabla2 = $xml->xmlArray->blabla;
// Remove carriage returns, newlines, and tabs from the base64 text
$blabla3 = str_replace(array("\n", "\t", "\r"), '', $blabla2);
// 2. Decode the base64 text and send it as a PDF
header('Content-type: application/pdf');
$blabla4 = base64_decode($blabla3);
echo $blabla4;
?>
The result is a PDF, but not the way I expected, as it shows this line:
%PDF-1.410obj<</Title(þÿ)/Creator(þÿwkhtmltopdf0.12.2.4) ...
Could you tell me how to show the PDF properly?
Thanks a lot!
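One thing that can be checked independently of the display problem is whether the decoded bytes really form a PDF. The following is a minimal sketch, not a confirmed fix: `decodePdfPayload` is a hypothetical helper name, and it only assumes that a PDF starts with the `%PDF-` signature.

```php
<?php
// Hypothetical helper: strip all whitespace, decode strictly, and check the PDF signature.
function decodePdfPayload(string $encoded): ?string
{
    // Remove every whitespace character, including the spaces that str_replace() above leaves in.
    $clean = preg_replace('/\s+/', '', $encoded);
    // Strict mode returns false if characters outside the base64 alphabet remain.
    $binary = base64_decode($clean, true);
    if ($binary === false || strncmp($binary, '%PDF-', 5) !== 0) {
        return null; // not valid base64, or not a PDF
    }
    return $binary;
}

// Usage with the element from the question:
// $pdf = decodePdfPayload((string) $xml->xmlArray->blabla);
```

If the helper returns a string, it can be sent with header('Content-Type: application/pdf') followed by echo, provided nothing else has been output before the header.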

PHP preg_match_all - how to get a content from HTML?

$contents contains an HTML document:
$contents = curl_exec ($ch);
I need to get the content from:
<span class="Menu1">Artur €2000</span>
It is repeated several times, so I want to save the matches into an array.
I tried to do it this way:
preg_match_all('<span class=\"Menu1\">(.*?)</span>#si',$contents,$wynik2);
But I've got an error
Warning: preg_match_all() [function.preg-match-all]: Unknown modifier '('
Can you help me, please?
EDIT: $contents = curl_exec ($ch);
SOLVED: The error was caused by invalid HTML on the cURLed website:
<span class="Menu1">Content</tr>
instead of:
<span class="Menu1">Content</span>
I didn't expect that someone could write invalid HTML. Thank you for the help!
You forgot the first delimiter (#):
$contents = '<span class="Menu1">Artur $2000</span> somehtml <span class="Menu1">Mark $1000</span>';
preg_match_all('#<span class="Menu1">(.*?)</span>#si', $contents, $wynik2);
print_r($wynik2);
/*
Array
(
[0] => Array
(
[0] => <span class="Menu1">Artur $2000</span>
[1] => <span class="Menu1">Mark $1000</span>
)
[1] => Array
(
[0] => Artur $2000
[1] => Mark $1000
)
)
*/
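You should put a delimiter such as "|" at the start and the end of your regular expression. Do not add the U modifier here: it inverts greediness, so the lazy (.*?) would become greedy:
preg_match_all("|<span class=\"Menu1\">(.*?)</span>|", $contents, $wynik2);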
