How to get FGETCSV to read entire Japanese cell - php

I have a large study with about 50 questions and 70,000 entries, so manually editing or using pivot tables just won't really work; I need to upload the data into a database. I can't get the Japanese characters to be read with any accuracy while using fgetcsv(). I've tried setting the locale to UTF-8 and SJIS, but neither one seems to want to read all of the Japanese characters. I read somewhere this might be a bug, but I don't know.
The data looks like this:
Q-004 必須回答 あなたは、以下のどちらにお住まいですか? S/A
1 北海道 Hokkaido
2 青森県 Aomori
3 岩手県 Iwate
4 宮城県 Miyagi
5 秋田県 Akita
Here is my code:
setlocale(LC_ALL, 'ja_JP.SJIS');
$fp = fopen($_POST["filename"],'r') or die("can't open file");
$csv_line = fgetcsv($fp,1024);
$query = "";
$count = 0;
$question = false;
while ($csv_line = fgetcsv($fp, 1024)) {
    if (!$question && strpos($csv_line[0], "Q-") !== false)
    {
        echo "Found a question: " . $csv_line[2] . "<br>";
        $question = true;
    }
    else if ($question && strlen($csv_line[0]) == 0)
    {
        echo "<hr>";
        $question = false;
    }
    else if ($question && intval($csv_line[0]) > 0)
    {
        echo $csv_line[0] . " has value " . $csv_line[2] . " - " . $csv_line[3] . "<br>";
    }
    $count++;
}
echo "$count records read successfully";
fclose($fp) or die("can't close file");
Here is the result:
Found a question: A以下のどちらにお住まいですか?
1 has value k海道 - Hokkaido
2 has value X県 - Aomori
3 has value - Iwate
4 has value {城県 - Miyagi
5 has value H田県 - Akita

When it comes to reading a CSV in PHP, I would say... don't do it, and use an SQL database instead, wherein you can set a collation such as ujis_japanese_ci in MySQL.
You should be able to easily import your CSV into a MySQL database using phpMyAdmin, if that is what you have, and then render the data from the MySQL database instead of reading a CSV file.
It is a work-around, granted, but my general experience is that CSV + foreign/special characters == problems.
I believe it is at least worth a try. Good luck!
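A minimal sketch of that route, going straight from the CSV into MySQL instead of parsing it with fgetcsv(). The connection details, table name, column names, and the assumption that the export is Shift-JIS are all placeholders; adjust them to match your actual survey file:
// Sketch only: create a table with a Japanese-aware collation and let MySQL import the CSV itself.
$conn = mysqli_init();
mysqli_options($conn, MYSQLI_OPT_LOCAL_INFILE, true); // LOAD DATA LOCAL must also be enabled server-side
mysqli_real_connect($conn, 'localhost', 'user', 'password', 'survey');

mysqli_query($conn, "CREATE TABLE IF NOT EXISTS answers (
    code     VARCHAR(16),
    label_ja VARCHAR(255),
    label_en VARCHAR(255)
) CHARACTER SET ujis COLLATE ujis_japanese_ci");

// CHARACTER SET sjis tells MySQL how to read the file; it converts to the column charset on the way in.
mysqli_query($conn, "LOAD DATA LOCAL INFILE 'survey.csv'
    INTO TABLE answers
    CHARACTER SET sjis
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\\n'") or die(mysqli_error($conn));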

Related

Writing into csv with php problems

I have a problem. I'm trying to get some data from a database into a .csv file.
$fn=fopen($path.$filename, "w");
$addstring = file_get_contents($path.$filename);
$addstring = 'Azonosito;Datum;Ido;Leiras;IP-cim;allomasnév;MAC-cim;Felhasznalonev;Tranzakcioazonosito;Lekerdezes eredmenye;Vizsgalat ideje;Korrelacios azonosito;DHCID;';
/*$addstring .= "\n";*/
$sql="select * from dhcpertekeles.dhcpk";
$result =mysqli_query($conn, $sql);
if ($result = mysqli_query($conn, $sql))
{
    while ($row = mysqli_fetch_row($result))
    {
        $addstring .= "\n" . $row[0] . ";" . $row[1] . ";" . $row[2] . ";" . $row[3] . ";" . $row[4] . ";" . $row[5] . ";" . $row[6] . ";" . $row[7] . ";" . $row[8] . ";" . $row[9] . ";" . $row[10] . ";" . $row[11] . ";" . $row[12] . ";";
    }
}
/*file_put_contents($path.$filename, $addstring);*/
fwrite($fn, $addstring);
fclose($fn);
The data is in the following format:
The first addstring contains the column names, and has no issues
the second (addstring .=) contains the data:
ID($row[0]), Date($row[1]), Time($row[2]), Description($row[3]), IP($row[4]), Computer name($row[5]), MAC($row[6]), User($row[7])(empty), Transactionid($row[8]), query result($row[9]), query time($row[10]), correlation id($row[11])(empty), DHCID($row[12])(empty)
It is basically daily DHCP server data, uploaded to a database. Now, the code works, it does write everything I want to the CSV, but there are 2 problems.
1. For some inexplicable reason, the code inserts an empty row into the CSV between the rows that contain the data. Removing $row[12] fixes this. I tried removing special characters, converting spaces into something visible, and even converting the empty string into something visible, yet nothing actually worked. I even tried file_put_contents (same for the second problem) instead of fwrite, but nothing; the same thing keeps happening. If I remove \n it will work, but from the 2nd row onwards everything is misplaced to the right by 1 column.
2. For some reason, the last 2 characters are removed from the CSV. The string that is to be inserted into the CSV still contains those 2 characters before it is written to the file. I tried both fwrite and file_put_contents.
As for the .csv format, the data columns are divided by ; and the rows by \n.
I also tried reading the file with both LibreOffice and Excel, thinking it might be Excel that was mangling the output, but no.
Try using the fputcsv() function. I didn't test the following code, but I think it should work.
$file = fopen($path . $filename, 'w');
$header = array(
    'Azonosito',
    'Datum',
    'Ido',
    'Leiras',
    'IP-cim',
    'allomasnév',
    'MAC-cim',
    'Felhasznalonev',
    'Tranzakcioazonosito',
    'Lekerdezes eredmenye',
    'Vizsgalat ideje',
    'Korrelacios azonosito',
    'DHCID'
);
fputcsv($file, $header, ';');

$sql = "select * from dhcpertekeles.dhcpk";
if ($result = mysqli_query($conn, $sql)) {
    while ($row = mysqli_fetch_row($result)) {
        fputcsv($file, $row, ';');
    }
}
fclose($file);
The $addstring = file_get_contents($path.$filename) line doesn't do anything, because you're overwriting that variable on the next line.
To remove the extra row caused by $row[12], did you try removing the \n AND the \r with something like:
$row[12] = strtr($row[12], array("\n"=>'', "\r"=>''));
You can also check which ASCII characters you are receiving in $row[12] with this function, taken from the PHP site:
function AsciiToInt($char) {
    $success = "";
    if (strlen($char) == 1)
        return "char(" . ord($char) . ")";
    else {
        for ($i = 0; $i < strlen($char); $i++) {
            if ($i == strlen($char) - 1)
                $success = $success . ord($char[$i]);
            else
                $success = $success . ord($char[$i]) . ",";
        }
        return "char(" . $success . ")";
    }
}
Another possibility is that the database is returning UTF-8 or UTF-16 and you're losing some characters in the text file.
Try checking that with the mb_detect_encoding() function.
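A quick sketch of that check, reusing $row[12] from the loop above (the candidate-encoding list is just a guess; extend it to whatever your data could contain):
// Inspect what encoding the driver is actually handing back for the problem column.
$encoding = mb_detect_encoding($row[12], array('UTF-8', 'UTF-16', 'ISO-8859-1'), true);
echo 'Detected encoding: ' . ($encoding !== false ? $encoding : 'unknown') . '<br/>';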

php+odbc writing to files limitation

I have a function in PHP that reads a table over ODBC (from an IBM AS400) and writes it to a text file on a daily basis. It works fine until the file grows past roughly 1 GB; then it just stops at some row and doesn't write everything.
function write_data_to_txt($table_new, $query)
{
    global $path_data;
    global $odbc_db, $date2;

    if (!($odbc_rs = odbc_exec($odbc_db, $query))) die("Error executing query $query");
    $num_cols = odbc_num_fields($odbc_rs);

    $path_folder = $path_data . $table_new . "/";
    if (!file_exists($path_folder)) mkdir($path_folder, 0777);
    $filename1 = $path_folder . $table_new . "_" . $date2 . ".txt";

    $comma = "|";
    $newline = chr(13) . chr(10);
    $handle = fopen($filename1, "w+");

    if (is_writable($filename1)) {
        $ctr = 0;
        while (odbc_fetch_row($odbc_rs))
        {
            // function for writing all fields
            // for ($i = 1; $i <= $num_cols; $i++)
            // {
            //     $data = odbc_result($odbc_rs, $i);
            //     if (!fwrite($handle, $data) || !fwrite($handle, $comma)) {
            //         print "Cannot write to file ($filename1)";
            //         exit;
            //     }
            // }
            // end of function writing all fields
            $data = odbc_result($odbc_rs, 1);
            fwrite($handle, $ctr . $comma . $data . $newline);
            $ctr++;
        }
        echo "Write Success. Row = $ctr <br><br>";
    }
    else
    {
        echo "Write Failed<br><br>";
    }
    fclose($handle);
}
No errors, just the success message, but it should be 3,690,498 rows (and still increasing); I only got roughly 3,670,009 rows.
My query is an ordinary select like:
select field1 , field2, field3 , field4, fieldetc from table1
What I tried and what I assume:
I thought it was an fwrite limitation, so I tried not writing every field (just $ctr and the first column), but it still stopped at the same row, so I assume it's not about exceeding an fwrite limit.
I tried reducing the number of fields I select and then it completes! So I assumed there is some limitation on ODBC.
I tried the same ODBC data source from SQL Server, selected all fields, and it gave me the complete rows. So I assume it's not an ODBC limitation.
I even tried on a 64-bit machine, but it was even worse; it returned only roughly 3,145,812 rows. So I assume it's not about 32/64-bit infrastructure.
I tried increasing memory_limit in php.ini to 1024 MB, but that didn't work either.
Does anyone know if I need to set something in my PHP-to-ODBC connection?

SELECT DISTINCT still showing duplicate result

I have an HTML select form filled by an SQL query using SELECT DISTINCT...
The idea is not to show duplicate values from the database, and it's almost working, but in some cases it gives a problem. To fill the SQL columns I'm using PHP to read a TXT file with delimiters and the explode function. If I have 10 duplicate rows in the TXT file, my HTML shows 2 options instead of only 1, and I notice that one option matches 9 of the entries from the database, while the other matches 1 entry that is always the last line of the TXT file.
In short: the last line of the TXT file always shows up as a duplicate in the HTML select form.
Checking the database, everything looks OK; I really don't know why it always duplicates on the last one.
I mention the PHP that makes the SQL entry because I'm not sure if the problem is in the PHP that contains the HTML select or in the PHP that fills the database... I believe the problem is in the PHP with the HTML select, since I'm looking at the database and everything is OK. The SQL query in this PHP is like this:
<td class="formstyle"><select name="basenamelst" class="formstyle" id="basenamelst">
<option value="Any">Any</option>
<?
$sql = mysql_query("SELECT DISTINCT basename FROM dumpsbase WHERE sold=0");
while ($row = mysql_fetch_assoc($sql))
{
    if ($row['basename'] == "")
    {
        echo '<option value="'.htmlspecialchars($row['basename'], ENT_QUOTES, 'UTF-8').'">unknOwn</option>';
    }
    else
    {
        echo '<option value="'.htmlspecialchars($row['basename'], ENT_QUOTES, 'UTF-8').'">'.htmlspecialchars($row['basename'], ENT_QUOTES, 'UTF-8').'</option>';
    }
}
?>
</select>
Remember: if I upload 10 duplicate rows to the database, it shows 2 in the select: one with 9 entries, and another with 1 entry (always the last line of my TXT file)...
Okay, many people told me to trim() the columns and it still shows duplicates... So I came to the conclusion that I have some issue while loading the TXT into the database. Here is the code where I get the values to put in the database:
$file = fopen($targetpath, "r") or exit("Unable to open uploaded file!");
while (!feof($file))
{
    $line = fgets($file);
    $details = explode(" | ", $line);
    foreach ($details as &$value) // clean each field
    {
        $value = mysql_real_escape_string($value);
        if ($value == "")
        {
            $value = "NONE";
        }
    }
    unset($value);
    mysql_query("INSERT INTO dumpsbase VALUES('NULL', '$details[0]', '$details[1]', '$details[2]', '$details[3]', '$details[4]', '0', '$price', 'NONE', now(), 'NONE', 'NONE')") or die("Uploading Error!");
It sounds to me like the error is when you are populating the table from the file, and that one of the values is ending up subtly different to the others.
The fact that it's the last line that differs makes me wonder if there are newline characters being included in each value (except that last line).
If this is the case, you should be able to correct it by running trim() or similar in your DB.
[Edit] Ideally, you want to do this as early as possible, i.e. correct the data rather than remembering it's wrong when you access it. If you can't find why the initial import is messing it up, you could correct the data immediately afterwards with UPDATE dumpsbase SET basename = TRIM(basename)
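For example, a small tweak to the import loop from the question (sketch only) strips the stray \r and \n before the values ever reach MySQL:
foreach ($details as &$value) // clean each field
{
    $value = trim($value); // drop \r, \n and surrounding spaces first
    $value = mysql_real_escape_string($value);
    if ($value == "")
    {
        $value = "NONE";
    }
}
unset($value);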
Try changing your query to the following:
SELECT DISTINCT TRIM(basename) FROM dumpsbase WHERE sold=0
Hope this helps.

MySQL to Excel charset issue

I have a database set up which accepts user registrations, their details, etc. I'm looking to export the database to an Excel file using PHP.
The problem I am having is that some of the entrants have entered foreign characters, such as Turkish, which have been written into the database 'incorrectly' - as far as I have ascertained, the charset was likely set up incorrectly when the database was first made.
I have written my code to export the database into Excel (below), but I cannot get the Excel document to display correctly regardless of how I try to encode the data.
<?php
require_once('../php/db.php');
header("Content-type: application/octet-stream");
header("Content-Disposition: attachment; filename=Download.xls");
header("Pragma: no-cache");
header("Expires: 0");
$query = "SELECT * FROM users";
$result = mysqli_query($link, $query);
if ($result) {
    $count = mysqli_num_rows($result);
    for ($i = 0; $i < $count; $i++) {
        $field = mysqli_fetch_field($result);
        $header .= $field->name . "\t";
        while ($row = mysqli_fetch_row($result)) {
            $line = '';
            foreach ($row as $value) {
                if ((!isset($value)) OR ($value == "")) {
                    $value = "\t";
                } else {
                    $value = str_replace('"', '""', $value);
                    $value = '"' . $value . '"' . "\t";
                }
                $line .= $value;
            }
            $data .= trim($line) . "\n";
        }
        $data = str_replace("\r", "", $data);
        if ($data == "") {
            $data = "\n(0) Records Found!\n";
        }
    }
print mb_convert_encoding("$header\n$data", 'UTF-16LE', 'UTF-8');
} else die(mysqli_error());
?>
When I do this, it comes up with an error when opening it, saying that Excel doesn't recognise the file type. It opens the document, but it has drawn boxes around all the Turkish characters it has tried to write.
I'm no PHP expert; this is just information I've kind of pieced together.
Can anyone give me a hand?
Much appreciated
Moz
First of all, you appear to be creating a tab-delimited text file and then returning it to the browser with the MIME-type application/octet-stream and the file extension .xls. Excel might work out that's tab-delimited (but it sounds from your error as though it doesn't), but in any case you really should use the text/tab-separated-values MIME type and .txt file extension so that everything knows exactly what the data is.
Secondly, to create tab-delimited files, you'd be very wise to export the data directly from MySQL (using SELECT ... INTO OUTFILE), as all manner of pain can arise with escaping delimiters and such when you try to cook it yourself. For example:
SELECT * FROM users INTO OUTFILE '/tmp/users.txt' FIELDS TERMINATED BY '\t'
Then you would merely need to read the contents of that file to the browser using readfile().
If you absolutely must create the delimited file from within PHP, consider using its fputcsv() function (you can still specify that you wish to use a tab delimiter).
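A rough sketch of that, reusing the $link connection from the question and streaming the tab-delimited output straight to the browser (header handling simplified, not tested against your data):
header('Content-Type: text/tab-separated-values; charset=UTF-8');
header('Content-Disposition: attachment; filename=Download.txt');

$out = fopen('php://output', 'w');
$result = mysqli_query($link, 'SELECT * FROM users');

// Column names first, taken from the result metadata.
$names = array();
foreach (mysqli_fetch_fields($result) as $field) {
    $names[] = $field->name;
}
fputcsv($out, $names, "\t");

// Then one tab-delimited line per row; fputcsv handles quoting and escaping.
while ($row = mysqli_fetch_row($result)) {
    fputcsv($out, $row, "\t");
}
fclose($out);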
Always use the .txt file extension rather than .csv even if your file is comma-separated as some versions of Excel assume that all files with the .csv extension are encoded using Windows-1252.
As far as character encodings go, you will need to inspect the contents of your database to determine whether data is stored correctly or not: the best way to do this is to SELECT HEX(column) ... in order that you can inspect the underlying bytes. Once that has been determined, you can UPDATE the records if conversions are required.
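For instance, a throwaway sketch like the following lets you compare the stored text with its raw bytes; the column name 'name' is a placeholder for whichever field holds the Turkish data:
$result = mysqli_query($link, "SELECT name, HEX(name) AS raw FROM users LIMIT 10");
while ($row = mysqli_fetch_assoc($result)) {
    echo $row['name'] . ' => ' . $row['raw'] . '<br/>';
}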

Import large file on MySQL DB

I want to run about 50,000 MySQL 'insert' queries against a MySQL DB.
For this I have 2 options:
1- Directly import the (.sql) file:
The following error occurs:
" You probably tried to upload too large file. Please refer to documentation for ways to workaround this limit. "
2- Use PHP code to insert these queries in chunks from the (.sql) file.
Here is my code:
<?php
// Configure DB
include "config.php";
// Get file data
$file = file('country.txt');
// Set pointers & position variables
$position = 0;
$eof = 0;
while ($eof < sizeof($file))
{
    for ($i = $position; $i < ($position + 2); $i++)
    {
        if ($i < sizeof($file))
        {
            $flag = mysql_query($file[$i]);
            if (isset($flag))
            {
                echo "Insert Successfully<br />";
                $position++;
            }
            else
            {
                echo mysql_error() . "<br>\n";
            }
        }
        else
        {
            echo "<br />End of File";
            break;
        }
    }
    $eof++;
}
?>
But a memory size error occurs even though I have extended the memory limit from 128M to 256M and even 512M.
So I think that if I could load a limited number of rows from the (.sql) file, like 1000 at a time, and execute the MySQL queries, it might import all the records from the file to the DB.
But I don't have any idea how to handle the file position from start to end, or how to update the start and end positions so that it will not fetch the previously fetched rows from the .sql file.
Here is the code you need, now prettified! =D
<?php
include('config.php');
$file = @fopen('country.txt', 'r');
if ($file)
{
    while (!feof($file))
    {
        $line = trim(fgets($file));
        $flag = mysql_query($line);
        if ($flag)
        {
            echo 'Insert Successfully<br />';
        }
        else
        {
            echo mysql_error() . '<br/>';
        }
        flush();
    }
    fclose($file);
}
echo '<br />End of File';
?>
Basically it's a less greedy version of your code: instead of loading the whole file into memory, it reads and executes small chunks (one-liners) of SQL statements.
Instead of loading the entire file into memory, which is what's done when using the file function, a possible solution would be to read it line by line, using a combination of fopen, fgets, and fclose -- the idea being to read only what you need, deal with the lines you have, and only then read the next couple of lines.
Additionally, you might want to take a look at this answer: Best practice: Import mySQL file in PHP; split queries
There is no accepted answer yet, but some of the given answers might already help you...
Use the command-line client; it is far more efficient and should easily handle 50K inserts:
mysql -uUser -p <db_name> < dump.sql
I recently read about inserting lots of queries into a database too quickly. The article suggested using the sleep() (or usleep()) function to delay a few seconds between queries so as not to overload the MySQL server.
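On top of the line-by-line loop above, that pacing idea could look something like this (the 0.2-second pause is an arbitrary value, not taken from the article):
while (!feof($file)) {
    $line = trim(fgets($file));
    if ($line !== '') {
        mysql_query($line) or print(mysql_error() . '<br/>');
    }
    usleep(200000); // pause 0.2 s between statements to ease the load on the server
}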
