Processing csv file as UTF-8 - php

I'm trying to figure out how to process a CSV file with UTF-8 encoding. I have tried several things, such as calling utf8_encode() and sending this header:
header('Content-Type: text/html; charset=UTF-8');
But nothing seems to work.
The code is:
<?php
include 'head.php';
$csv = array_map("str_getcsv", file("translations/dk.csv"));
foreach ($csv as $line){
$translate["dk"][ $line[0] ] = $line[1];
}if ($line[1] != NULL){
$line[0] = $line[1];
}
echo $line[0];
fclose($csv);
?>
How do I echo the output with UTF-8 encoding?

When you display it in a browser, you should also use valid HTML and set the meta charset to UTF-8:
<?php
include 'head.php';
?>
<!DOCTYPE html>
<html lang="da">
<head>
<meta charset="utf-8"/>
</head>
<body>
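<?php
$csv = array_map("str_getcsv", file("translations/dk.csv"));
foreach ($csv as $line) {
    $translate["dk"][$line[0]] = $line[1];
}
// $line still holds the last row here; prefer its translation if there is one
if ($line[1] != NULL) {
    $line[0] = $line[1];
}
echo $line[0];
// file() returns an array, not a file handle, so there is nothing to fclose()
?>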
</body>
</html>
Or using text/plain instead of text/html can help:
header('Content-Type: text/plain; charset=UTF-8');
Hope that helps.

Based on what you described, it looks like the file isn't in UTF-8. It's probably in ISO-8859-1, but you are trying to display it as if it were UTF-8, which is why you see strange blocky symbols.
You have two options. You can convert the file entries to UTF-8 with:
foreach ($csv as $line)
$translate["dk"][$line[0]] = utf8_encode($line[1]);
Or declare the file's real encoding to the browser so it displays correctly:
header('Content-Type: text/html; charset=ISO-8859-1');
Since the W3C recommends UTF-8 as the default encoding for the web, the first option should be preferred.
Alternatively, you can convert the entire file to UTF-8 in your favorite text editor and save it that way, so you don't have to convert it on every request.
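As a minimal sketch of the first option: utf8_encode() is deprecated as of PHP 8.2, so the conversion below uses mb_convert_encoding() instead. It assumes the file really is ISO-8859-1 and has two columns (key, translation); the helper name loadTranslations is hypothetical.

```php
<?php
// Hypothetical helper: build a key => translation map from CSV lines,
// converting each line from an assumed source encoding to UTF-8 first.
function loadTranslations(array $lines, string $sourceEncoding = 'ISO-8859-1'): array
{
    $translations = [];
    foreach ($lines as $rawLine) {
        $utf8Line = mb_convert_encoding($rawLine, 'UTF-8', $sourceEncoding);
        $fields = str_getcsv(rtrim($utf8Line, "\r\n"), ',', '"', '\\');
        // Skip blank lines and rows without a translation column.
        if (isset($fields[0], $fields[1])) {
            $translations[$fields[0]] = $fields[1];
        }
    }
    return $translations;
}

// Usage with the file from the question:
// $translate['dk'] = loadTranslations(file('translations/dk.csv'));
```

If the file is later re-saved as UTF-8, the conversion has to be removed again, otherwise the text would be encoded twice.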

Related

Export HTML table to CSV file

I am making a script which gets a table from your mail and puts it into a CSV file.
This is the code I use to transfer my html table to CSV
$html = str_get_html($outputstr);
// For Excel
header('Content-type: application/ms-excel');
// Download File
header('Content-Disposition: attachment; filename=sample.csv');
$fp = fopen("php://output", "w");
// Take out empty lines
foreach($html->find('tr') as $element) {
$td = array();
foreach( $element->find('th') as $row) {
$td [] = $row->plaintext;
}
foreach( $element->find('td') as $row) {
$td [] = $row->plaintext;
}
fputcsv($fp, $td);
}
fclose($fp);
The only problem is that when I open the CSV file, some of the empty columns contain a strange character.
Because of that, I cannot read the file with my PHP script to export it to a database:
fgetcsv($handle, 1000, "\t");
How can I fix this problem?
Do I fix this by modifying the code on the part where I create the CSV file or where I read the CSV file when I'm transferring it to a MySQL database?
When I use an online html to CSV converter it works fine and I am not facing this issue then.
If there is any code needed then I'd love to share it.
Any help would be appreciated.
Have you tried setting your charset to UTF-8? Additionally, you're not setting this up as a CSV with your header, instead it is an Excel file.
header("content-type:application/csv;charset=UTF-8");

PHP issue with diacritics

I have this code for reading files from a folder:
<?php
$directory = "Dokumenty/rozne";
$a = array_diff(scandir($directory), array('..', '.'));
$i = 1;
foreach($a as $key => $name){
$link = "http://mana.fara.sk/Dokumenty/rozne/" . $name;
echo "<p>$i: <a href='$link' >$name</a></p><br>";
$i++;
}
?>
but on the webpage the diacritics are displayed incorrectly. Here is an example:
Pamiatkovy���� vyskum.docx
Can you help me solve this problem? In the head I have <meta charset="UTF-8"> and the html lang is lang="sk-SK".
THX
That's probably because scandir returns a non-UTF-8 string. You should either rename your files using the right encoding, or convert the string's encoding to UTF-8. On Windows, the names are probably in ISO-8859-1 or Windows-1252.
So, you can try:
$name = iconv('Windows-1252', 'UTF-8', $name);
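A minimal sketch of that conversion wrapped in a helper, assuming the names really come back as Windows-1252 (for Slovak file names the source may instead be Windows-1250, so adjust the encoding to match). The helper name toDisplayName is hypothetical; it also HTML-escapes the result so that characters such as & are safe to print.

```php
<?php
// Hypothetical helper: convert a raw directory entry to UTF-8 and escape it for HTML.
// $sourceEncoding is an assumption about what scandir() returned.
function toDisplayName(string $rawName, string $sourceEncoding = 'Windows-1252'): string
{
    $name = iconv($sourceEncoding, 'UTF-8', $rawName);
    if ($name === false) {
        // Conversion failed; fall back to the raw bytes rather than dropping the entry.
        $name = $rawName;
    }
    // ENT_SUBSTITUTE replaces invalid byte sequences instead of returning an empty string.
    return htmlspecialchars($name, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Usage inside the loop from the question:
// echo "<p>$i: <a href='$link'>" . toDisplayName($name) . "</a></p><br>";
```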

Export email from mysql with php, but export all my php page

I'd like to export my contact emails, which are stored in a MySQL database, with a script.
I need to export the emails to a CSV file.
But when the page reloads, the file is downloaded and it contains my whole PHP page!
<?php
if(IsSet($_POST['export_test'])){
// output headers so that the file is downloaded rather than displayed
header('Content-Type: text/csv; charset=utf-8');
header('Content-Disposition: attachment; filename=data.csv');
// create a file pointer connected to the output stream
$output = fopen('php://output', 'w');
// output the column headings
fputcsv($output, array('E-mail'));
// fetch the data
$string = "SELECT Email FROM address";
$query = mysql_query($string);
// loop over the rows, outputting them
while ($row = mysql_fetch_assoc($query)) fputcsv($output, $row);
}
?>
In data.csv I can see my whole page:
(<!DOCTYPE html>
<html lang="en" ng-app>
<head>
<title>Test</title>
<link href='http://fonts.googleapis.com/css?family=Open+Sans:400,700,300' rel='stylesheet' type='text/css'>
<link rel="shortcut icon" href="icon/advancedsettings.png" type="image/x-icon" />.....)
thank you
You open the CSV file and write some content before executing the query, so how would the emails end up in the output? Try this code; it should give the result you need:
<?php
$conn = mysqli_connect("localhost", "root", "", "table_name"); // connection to db
if (isset($_POST['export_test'])) {
    $sql = "select email from address"; // select query
    $res = mysqli_query($conn, $sql);
    $filename = 'email.csv'; // create csv file if it doesn't exist
    $fp = fopen($filename, "w"); // open the csv file to write
    while ($row = mysqli_fetch_array($res)) {
        // quote the value and double any embedded quotes
        $line = '"' . str_replace('"', '""', $row['email']) . '"' . "\n";
        fputs($fp, $line); // put the line into csv
    }
    fclose($fp);
    header('Content-Type: text/csv; charset=utf-8'); // to download the email.csv
    header('Content-Disposition: attachment; filename=email.csv');
    readfile($filename); // send the file contents to the browser
    exit; // stop so the rest of the page is not appended to the download
}
?>
I solved this problem :)
I put my script in an external PHP file, and it works!
Before, my script was at the top of my page and did not work.
Thank you!
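The behaviour described here is consistent with PHP sending everything the script outputs, including any HTML below the export block, into the download. A minimal sketch of an export that writes the rows to a stream and then stops; writeEmailCsv is a hypothetical helper, and the addresses are assumed to have been fetched from the database already.

```php
<?php
// Hypothetical helper: write a header row plus one e-mail per line to any stream.
function writeEmailCsv($stream, array $emails): void
{
    fputcsv($stream, ['E-mail'], ',', '"', '\\');
    foreach ($emails as $email) {
        fputcsv($stream, [$email], ',', '"', '\\');
    }
}

// Usage in the page, before any HTML is output:
// if (isset($_POST['export_test'])) {
//     header('Content-Type: text/csv; charset=utf-8');
//     header('Content-Disposition: attachment; filename=data.csv');
//     $output = fopen('php://output', 'w');
//     writeEmailCsv($output, $emails); // $emails: array of strings from the query
//     fclose($output);
//     exit; // stop here so the rest of the page is not appended to the file
// }
```

Moving the export into its own file, as the asker did, has the same effect as the exit call: nothing else is printed after the CSV rows.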

Why when writing html file with PHP stalled at '&'?

I have an issue about writing HTML file with PHP.
I have this function :
function openHTML ($file) {
if (file_exists($file)) {
$handle = fopen($file, "r");
$output = fread($handle, filesize($file));
fclose($handle);
return $output; // output file text
}else{
return "This file is not exists";
}
}
function saveHTML ($file,$string) {
$a = fopen($file, 'w');
fputs($a,stripslashes($string));
fclose($a);
return "success";
}
Reading with openHTML() works fine. Unfortunately, the file I opened with openHTML() contains some & characters, and when I save it with saveHTML(), the saved text stops at the & character.
Example (update):
I open a blank file.html with openHTML() and type the following:
<html>
<head>
</head>
<body>
This is my login page & user page
</body>
</html>
After I save with code saveHTML():
<html>
<head>
</head>
<body>
This is my login page
The rest of the code is missing; the output stops at the &.
I have tried fputs, utf8_encode, fwrite, and file_put_contents. It is still not solved.
Try
$output = htmlentities($output);
on the string you're going to save.
It will convert all applicable characters to their HTML entities.
In this case your & will be changed to
&amp;
Lone ampersands are invalid in a PCDATA section. Use &amp; instead.
Try to use fwrite() instead of fputs().
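A minimal sketch of the escaping suggested above. It assumes only the text content should be escaped, since running htmlentities() over a whole document would also escape its tags; htmlspecialchars() is used here because it touches only the characters that are special in HTML. The helper name escapeText is hypothetical.

```php
<?php
// Hypothetical helper: escape plain text before placing it inside an HTML document.
function escapeText(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$body = escapeText('This is my login page & user page');
$html = "<html>\n<head>\n</head>\n<body>\n$body\n</body>\n</html>\n";

// $html can now be passed to saveHTML() from the question:
// saveHTML('file.html', $html);
```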

UTF-8 problems while reading CSV file with fgetcsv

I try to read a CSV and echo the content. But the content displays the characters wrong.
Mäx Müstermänn -> MÃ¤x MÃ¼stermÃ¤nn
Encoding of the CSV file is UTF-8 without BOM (checked with Notepad++).
This is the content of the CSV file:
"Mäx";"Müstermänn"
My PHP script
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
<?php
$handle = fopen ("specialchars.csv","r");
echo '<table border="1"><tr><td>First name</td><td>Last name</td></tr><tr>';
while ($data = fgetcsv ($handle, 1000, ";")) {
$num = count ($data);
for ($c=0; $c < $num; $c++) {
// output data
echo "<td>$data[$c]</td>";
}
echo "</tr><tr>";
}
?>
</body>
</html>
I tried to use setlocale(LC_ALL, 'de_DE.utf8'); as suggested here, without success. The content is still displayed incorrectly.
What am I missing?
Edit:
An echo mb_detect_encoding($data[$c],'UTF-8'); gives me UTF-8 UTF-8.
echo file_get_contents("specialchars.csv"); gives me "Mäx";"Müstermänn".
And
print_r(str_getcsv(reset(explode("\n", file_get_contents("specialchars.csv"))), ';'))
gives me
Array ( [0] => Mäx [1] => Müstermänn )
What does it mean?
Try this:
<?php
$handle = fopen ("specialchars.csv","r");
echo '<table border="1"><tr><td>First name</td><td>Last name</td></tr><tr>';
while ($data = fgetcsv ($handle, 1000, ";")) {
$data = array_map("utf8_encode", $data); //added
$num = count ($data);
for ($c=0; $c < $num; $c++) {
// output data
echo "<td>$data[$c]</td>";
}
echo "</tr><tr>";
}
?>
Encountered similar problem: parsing CSV file with special characters like é, è, ö etc ...
The following worked fine for me:
To represent the characters correctly on the html page, the header was needed :
header('Content-Type: text/html; charset=UTF-8');
In order to parse every character correctly, I used:
utf8_encode(fgets($file));
Don't forget to use the multibyte string functions in all subsequent string operations, like:
mb_strtolower($value, 'UTF-8');
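One caveat with utf8_encode(): it assumes ISO-8859-1 input, so applying it to text that is already UTF-8 encodes it a second time. A minimal sketch of a guarded conversion, assuming the only non-UTF-8 input is ISO-8859-1; the helper name ensureUtf8 is hypothetical.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the value is not already valid UTF-8.
function ensureUtf8(?string $value, string $fallbackEncoding = 'ISO-8859-1'): string
{
    $value = (string) $value; // fgetcsv() yields null for a blank line
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($value, 'UTF-8', $fallbackEncoding);
}

// Usage inside an fgetcsv loop:
// $data = array_map('ensureUtf8', $data);
```

The check is only a heuristic: some ISO-8859-1 byte sequences happen to be valid UTF-8 as well, so knowing the real source encoding is still the more reliable approach.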
In my case the source file has windows-1250 encoding and iconv prints tons of notices about illegal characters in input string...
So this solution helped me a lot:
/**
* getting CSV array with UTF-8 encoding
*
* @param resource &$handle
* @param integer $length
* @param string $separator
*
* @return array|false
*/
private function fgetcsvUTF8(&$handle, $length, $separator = ';')
{
if (($buffer = fgets($handle, $length)) !== false)
{
$buffer = $this->autoUTF($buffer);
return str_getcsv($buffer, $separator);
}
return false;
}
/**
* automatic conversion of windows-1250 and iso-8859-2 into a utf-8 string
*
* @param string $s
*
* @return string
*/
private function autoUTF($s)
{
// detect UTF-8
if (preg_match('#[\x80-\x{1FF}\x{2000}-\x{3FFF}]#u', $s))
return $s;
// detect WINDOWS-1250
if (preg_match('#[\x7F-\x9F\xBC]#', $s))
return iconv('WINDOWS-1250', 'UTF-8', $s);
// assume ISO-8859-2
return iconv('ISO-8859-2', 'UTF-8', $s);
}
Response to @manvel's answer - use str_getcsv instead of explode - because of cases like this:
some;nice;value;"and;here;comes;combinated;value";and;some;others
explode will explode string into parts:
some
nice
value
"and
here
comes
combinated
value"
and
some
others
but str_getcsv will explode string into parts:
some
nice
value
and;here;comes;combinated;value
and
some
others
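The difference can be checked with a short script. This is a minimal sketch using the sample line above; the enclosure and escape arguments are passed explicitly and are otherwise the usual defaults.

```php
<?php
$line = 'some;nice;value;"and;here;comes;combinated;value";and;some;others';

// explode() splits on every semicolon, including those inside the quotes.
$naive = explode(';', $line);

// str_getcsv() honours the quoted field and keeps it in one piece.
$parsed = str_getcsv($line, ';', '"', '\\');

echo count($naive), "\n";  // 11
echo count($parsed), "\n"; // 7
echo $parsed[3], "\n";     // and;here;comes;combinated;value
```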
Try putting this into the top of your file (before any other output):
<?php
header('Content-Type: text/html; charset=UTF-8');
?>
The problem is that the function reports the data as UTF-8 (you can check this with mb_detect_encoding) but does not convert it, so the characters are treated as UTF-8 even though they are not. Therefore, it is necessary to convert from the original encoding (Windows-1251, also known as CP1251) using iconv. Since fgetcsv returns an array, I suggest writing a custom function:
[Sorry for my english]
function customfgetcsv(&$handle, $length, $separator = ';'){
if (($buffer = fgets($handle, $length)) !== false) {
return explode($separator, iconv("CP1251", "UTF-8", $buffer));
}
return false;
}
Now I got it working (after removing the header command). I think the problem was that the PHP file itself was encoded in ISO-8859-1. I set it to UTF-8 without BOM. I thought I had already done that, but perhaps I undid it by accident.
Furthermore, I used SET NAMES 'utf8' for the database. Now it is also correct in the database.
