PHP writes strange characters to a txt file

Hi everyone, when I execute this code to write to a file:
$fileTXT = 'prodotti.txt';
$newfileTXT = 'prodotti_2'.date("d-m-Y_h_m_s").'.txt';
if (!copy($fileTXT, $newfileTXT)) {
echo "Impossibile continuare, impossibile creare file TXT.";
exit;
}
$towriteinfile = "";
$fp = fopen($path . $filename, "r") or die("Couldn't open $filename");
$fpTXT = fopen($newfileTXT, 'w') or die("Couldn't open $newfileTXT");
while (!feof($fp)) {
$line = fgets($fp, 1024);
$arr = explode("\t", $line);
$arr[7] = '<img src="http://link/imgHigh/' . $arr[7] . '.jpg" />;';
echo "Prodotto: ".$arr[4]."<br>";
foreach ($arr as $fields) {
fwrite($fpTXT, $fields.";");
}
fwrite($fpTXT, "\n");
}
fclose($fpTXT);
fclose($fp);
I get this result in the txt file:
175;13563;desc;01;category;..............c etc etc.....
mercato.㰻浩⁧牳㵣栢瑴㩰⼯睷⹷獯畣慬楴挮浯椯⽴慣⽴浩䡧杩⽨　㄀⸀㄀　　⸀砀砀 漀欀ഀ਀樮杰•㸯㬻
The HTML code for the image is written as Chinese characters. Why?

Do you want to add content from $filename to the end of $newfileTXT?
If so, you should change:
$fpTXT = fopen($newfileTXT, 'w') or die("Couldn't open $newfileTXT");
to
$fpTXT = fopen($newfileTXT, 'a') or die("Couldn't open $newfileTXT");

The file is probably interpreted as Unicode (probably UTF-8). In Unicode encodings, a character can consist of multiple bytes. fgets($fp, 1024) reads at most 1023 bytes at a time, so a read can end with half a character, leaving the other half at the start of the next read. When you then insert new characters in between, the bytes pair up into different Unicode sequences, and the text becomes a complete mess.

I resolved the problem by passing every line through this function:
function cleanString($string){
$string = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $string);
return $string;
}
My old string contained binary characters. I cleaned the string and now everything is OK.
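As a hedged illustration of what this pattern does, the sketch below runs cleanString() on a made-up field. Because the character class covers every byte from \x00 to \x1F and from \x80 to \xFF, it also removes tabs, newlines, and all bytes of non-ASCII letters, not just stray binary data. The sample value and the gentler pattern are assumptions for illustration, not part of the original answer.

```php
<?php
// Same function as in the answer above.
function cleanString($string) {
    return preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $string);
}

// Hypothetical field: "caffè" in UTF-8 followed by a stray NUL byte.
$field = "caff\xC3\xA8\x00";

// The NUL byte and both bytes of "è" are removed.
$cleaned = cleanString($field);

// Alternative sketch: drop only control bytes (keeping tab, LF, CR)
// and leave multi-byte UTF-8 sequences intact.
$gentler = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $field);

var_dump($cleaned, $gentler);
```

If the input is really UTF-16, converting it with mb_convert_encoding() before splitting it into fields may preserve more of the data than stripping bytes.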

Related

utf-16le to UTF-8

I am using PHP in the OS X terminal to open a file generated on Windows.
I confirmed the file is UTF-16LE encoded:
$ file --mime myfile.ini
myfile.ini: text/plain; charset=utf-16le
Now I convert it to UTF-8 with this script:
while ($line = fgets($handle)) {
$line = rtrim($line);
$line = mb_convert_encoding($line,"UTF-8","UTF-16LE");
var_dump($line);
}
Somehow it shows corruption like this:
string(63) "䘀爀漀洀䐀愀琀攀㴀㈀　㄀㄀⸀　㄀⸀　㄀ഀ਀"
How can I get the correct encoding?
When I don't use mb_convert_encoding:
while ($line = fgets($handle)) {
$line = rtrim($line);
// $line = mb_convert_encoding($line,"UTF-8","UTF-16LE");
var_dump($line);
if (preg_match('/Optimization/',$line)){print "hit";}
}
var_dump shows a strange result. Why 28?
string(28) "Optimization=0"
and preg_match also doesn't hit.

You could try doing this:
while ($line = fgets($handle)) {
$line = rtrim($line);
$line = iconv(mb_detect_encoding($line, mb_detect_order(), true), "UTF-8", $line);
var_dump($line);
}

fgets() cannot detect line endings reliably if the stream isn't in an ASCII-compatible encoding. Similarly, when rtrim() looks for e.g. \n ('LINE FEED (LF)' (U+000A)) it expects a single 0x0A byte, but in UTF-16LE the character is encoded as the two bytes 0x0A 0x00. Bad things can happen.
I suggest you read the file in chunks that are a multiple of 4 bytes, so you won't split individual characters, and forget about line endings until you've successfully re-encoded the file:
$output = '';
while (!feof($handle)) {
    // fread() does not stop at line endings, unlike fgets()
    $chunk = fread($handle, 4 * 4096);
    $output .= mb_convert_encoding($chunk, "UTF-8", "UTF-16LE");
}
var_dump(bin2hex($output));
Ideally, save the output to a file so you can use a text editor or hexadecimal editor to inspect the result.

In the end I used UTF-16BE instead of UTF-16LE, and it shows the correct strings.
My problem was solved.
$line = mb_convert_encoding($line,"UTF-8","UTF-16BE");
However, I don't know why it works,
even though the file command says this file is utf-16le:
$ file --mime myfile.ini
myfile.ini: text/plain; charset=utf-16le
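One plausible explanation, offered as a sketch rather than a confirmed diagnosis: fgets() stops right after the 0x0A byte of a UTF-16LE line feed, so the 0x00 byte that completes that character ends up at the start of the next line. Every later line is then shifted by one byte and looks like UTF-16BE once rtrim() has stripped the trailing "\n" and "\0" bytes. The sample data below is made up, and the mbstring extension is assumed to be available.

```php
<?php
// Hypothetical sample data: two lines encoded as UTF-16LE with LF endings.
$utf16 = mb_convert_encoding("A=1\nB=2\n", 'UTF-16LE', 'UTF-8');

$handle = fopen('php://memory', 'w+');
fwrite($handle, $utf16);
rewind($handle);

$first  = fgets($handle); // ends right after the 0x0A byte of the first "\n"
$second = fgets($handle); // begins with the 0x00 byte that belonged to that "\n"
fclose($handle);

echo bin2hex($first), "\n";  // 41003d0031000a
echo bin2hex($second), "\n"; // 0042003d0032000a

// rtrim() strips trailing "\n" and "\0" bytes, leaving 00 42 00 3d 00 32,
// which happens to be valid UTF-16BE for "B=2".
$shifted = mb_convert_encoding(rtrim($second), 'UTF-8', 'UTF-16BE');

// Converting the whole buffer first and splitting afterwards avoids the shift.
$lines = explode("\n", rtrim(mb_convert_encoding($utf16, 'UTF-8', 'UTF-16LE')));

var_dump($shifted, $lines);
```

Under this reading, only the lines after the first one line up as UTF-16BE, so converting the whole file before splitting it into lines is the more robust route.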

Excel utf8 encoding (BOM is not working)

I just want to export data in CSV format and open it in Excel. This method writes one row into the file.
public function writeRow(array $row)
{
$str = $this->rowToStr($row);
$encodedStr = mb_convert_encoding($str, 'UTF-16LE', 'UTF-8');
$ret = fwrite($this->_getFilePointer('w+'), $encodedStr);
/* According to http://php.net/fwrite the fwrite() function
should return false on error. However not writing the full
string (which may occur e.g. when disk is full) is not considered
as an error. Therefore both conditions are necessary. */
if (($ret === false) || (($ret === 0) && (strlen($str) > 0))) {
throw new Exception("Cannot open file $this",
Exception::WRITE_ERROR, NULL, 'writeError');
}
}
Then I try to write a row:
$csvFile->writeRow(array(chr(0xEF) . chr(0xBB) . chr(0xBF)));
$csvHeaders = array('ID', 'Email', 'Variabilní symbol', 'Jméno', 'Příjmení',
'Stav', 'Zaregistrován', 'Zaregistrován do');
$csvFile->writeRow($csvHeaders);
And the result is:
ID,"Email","Variabilní symbol","Jméno","PYíjmení","Stav","Zaregistrován","Zaregistrován do"
Only a few letters are wrong (mb_convert_encoding fixes most of them).
I have also tried the traditional way:
// Open file pointer to standard output
$fp = fopen($filePath, 'w');
// Add BOM to fix UTF-8 in Excel
fputs($fp, $bom = (chr(0xEF) . chr(0xBB) . chr(0xBF)));
fclose($fp);
And the result was the same.

The BOM you've mentioned is for UTF-8, but your data is UTF-16LE. Therefore you should use a different BOM:
$bom = chr(0xFF) . chr(0xFE);
Or in your code:
$fp = fopen($filePath, 'w');
fputs($fp, chr(0xFF) . chr(0xFE));
// Add lines here...
fclose($fp);
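A minimal sketch of how the pieces might fit together: the UTF-16LE BOM is written once as raw bytes, and only the row text goes through mb_convert_encoding(). The in-memory stream and the sample header row are assumptions for illustration; the question's writeRow() and rowToStr() methods are not reproduced here.

```php
<?php
// An in-memory stream stands in for the real CSV file (assumption).
$fp = fopen('php://memory', 'w+');

// Write the UTF-16LE byte order mark once, as raw bytes, before any row.
fwrite($fp, "\xFF\xFE");

// Hypothetical header row, written in UTF-8 in the source file.
$row = "ID,Příjmení\r\n";
fwrite($fp, mb_convert_encoding($row, 'UTF-16LE', 'UTF-8'));

rewind($fp);
$bytes = stream_get_contents($fp);
fclose($fp);

echo bin2hex(substr($bytes, 0, 2)), "\n"; // fffe
```

Keeping the BOM out of the row-writing method avoids quoting or re-encoding it along with the data.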

including leading whitespaces when using fgets in php

I am using PHP to read a simple text file with the fgets() command:
$file = fopen("filename.txt", "r") or exit('oops');
$data = "";
while(!feof($file)) {
$data .= fgets($file) . '<br>';
}
fclose($file);
The text file has leading white spaces before the first character of each line. The fgets() is not grabbing the white spaces. Any idea why? I made sure not to use trim() on the variable. I tried this, but the leading white spaces still don't appear:
$data = str_replace(" ", " ", $data);
Not sure where to go from here.
Thanks in advance,
Doug
UPDATE:
The text appears correctly if I dump it into a textarea but not if I ECHO it to the webpage.

fgets() does grab the whitespace. I don't know exactly what you are doing with the $data variable, but if you simply display it on an HTML page you won't see the whitespace, because HTML collapses runs of whitespace into a single space when rendering. Try this code to read and show your file content:
$file = fopen('file.txt', 'r') or exit('error');
$data = '';
while(!feof($file))
{
$data .= '<pre>' . fgets($file) . '</pre><br>';
}
fclose($file);
echo $data;
The PRE tag displays $data with its whitespace and line breaks preserved. Any HTML markup inside it is still parsed, though.
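A variation on the same idea, as a sketch: collect the lines first, escape them with htmlspecialchars() so that any markup in the file is shown literally, and wrap the whole text in a single pre element. The sample lines are hypothetical stand-ins for what fgets() would return.

```php
<?php
// Hypothetical lines as fgets() might return them, leading spaces included.
$lines = ["    indented line\n", "  <b>less</b> indented\n"];

// Escape first so markup from the file is displayed literally,
// then wrap the whole text once in <pre> to keep the whitespace.
$html = '<pre>' . htmlspecialchars(implode('', $lines)) . '</pre>';

echo $html;
```

A single pre element keeps the file's own line breaks, so no extra br tags are needed.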

Try it with:
$data = preg_replace('/\s+/', ' ', $data);
fgets should not trim whitespaces.

Try reading the file using file_get_contents; it successfully reads the whitespace at the beginning of the file.
$data = file_get_contents("xyz.txt");
$data = str_replace(" ","~",$data);
echo $data;
Hope this helps

I currently have the same requirement and found that some characters are written as a tab character.
What I did was:
$tabChar = ' ';
$regularChar = ' ';
$lines = '';
$file = fopen('/file.txt', 'r');
while ($line = fgets($file)) {
    $l = str_replace("\t", $tabChar, $line);
    // keep working on $l so the tab replacement is not lost
    $l = str_replace(" ", $regularChar, $l);
    // ...
    // replacing can be done till it matches your needs
    $lines .= $l; // maybe append <br /> if necessary
}
fclose($file);
$result = '<pre>' . $lines . '</pre>';
This one worked for me, maybe it helps you too :-).

PHP Write and Read from Text File

I have an issue with writing and reading to text file.
I have to first write from a text file to another text file some values which I need to read again. Below are the code snippets:
Write to text file:
$fp = @fopen("text1.txt", "r");
$fh = @fopen("text2.txt", 'a+');
if ($fp) {
//for each line in file
while(!feof($fp)) {
//push lines into array
$thisline = fgets($fp);
$thisline1 = trim($thisline);
$stringData = $thisline1. "\r\n";
fwrite($fh, $stringData);
fwrite($fh, "test");
}
}
fclose($fp);
fclose($fh);
Read from the written textfile
$page = join("",file("text2.txt"));
$kw = explode("\n", $page);
for($i=0;$i<count($kw);$i++){
echo rtrim($kw[$i]);
}
But, if I am not mistaken, because of the "\r\n" I used to insert the newline, there are issues when I read the file back, and I need to pass the values from only the even lines to a function to perform other operations.
How do I resolve this issue? Basically, I need to write certain values to a text file and then read only the values from the even lines.

I'm not sure whether you have issues with the even line numbers or with reading the file back in.
Here is the solution for the even line numbers.
$page = join("",file("text2.txt"));
$kw = explode("\n", $page);
for($i=0;$i<count($kw);$i++){
$myValue = rtrim($kw[$i]);
if($i % 2 == 0)
{
echo $myValue;
}
}
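Whether "even" means index 0, 2, 4 or the 2nd, 4th, 6th line is not clear from the question. The sketch below assumes the latter (1-based line numbers) and uses file() so that the "\r\n" endings are handled in one place. The temporary file and its contents are made up for illustration.

```php
<?php
// Hypothetical input file with Windows line endings.
$path = tempnam(sys_get_temp_dir(), 'kw');
file_put_contents($path, "one\r\ntwo\r\nthree\r\nfour\r\n");

// file() returns one array entry per line.
$lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

$even = [];
foreach ($lines as $i => $line) {
    // $i is 0-based, so ($i + 1) is the human line number.
    if (($i + 1) % 2 === 0) {
        $even[] = rtrim($line, "\r\n"); // strip any leftover CR or LF
    }
}

unlink($path);
print_r($even);
```

Switching the condition to `$i % 2 === 0` selects the other half of the lines instead.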

fputcsv and newline codes

I'm using fputcsv in PHP to output a comma-delimited file of a database query. When opening the file in gedit in Ubuntu, it looks correct: each record has a line break (no visible line break characters, but you can tell each record is separated, and opening it in OpenOffice spreadsheet allows me to view the file correctly).
However, we're sending these files on to a client on Windows, and on their systems, the file comes in as one big, long line. Opening it in Excel, it doesn't recognize multiple lines at all.
I've read several questions on here that are pretty similar, including this one, which includes a link to the really informative Great Newline Schism explanation.
Unfortunately, we can't just tell our clients to open the files in a "smarter" editor. They need to be able to open them in Excel. Is there any programmatic way to ensure that the correct newline characters are added so the file can be opened in a spreadsheet program on any OS?
I'm already using a custom function to force quotes around all values, since fputcsv is selective about it. I've tried doing something like this:
function my_fputcsv($handle, $fieldsarray, $delimiter = "~", $enclosure ='"'){
$glue = $enclosure . $delimiter . $enclosure;
return fwrite($handle, $enclosure . implode($glue,$fieldsarray) . $enclosure."\r\n");
}
But when the file is opened in a Windows text editor, it still shows up as a single long line.

// Writes an array to an open CSV file with a custom end of line.
//
// $fp: a seekable file pointer. Most file pointers are seekable,
// but some are not. example: fopen('php://output', 'w') is not seekable.
// $eol: probably one of "\r\n", "\n", or for super old macs: "\r"
function fputcsv_eol($fp, $array, $eol) {
fputcsv($fp, $array);
if("\n" != $eol && 0 === fseek($fp, -1, SEEK_CUR)) {
fwrite($fp, $eol);
}
}

This is an improved version of @John Douthat's great answer, preserving the possibility of using custom delimiters and enclosures and returning fputcsv's original output:
function fputcsv_eol($handle, $array, $delimiter = ',', $enclosure = '"', $eol = "\n") {
$return = fputcsv($handle, $array, $delimiter, $enclosure);
if($return !== FALSE && "\n" != $eol && 0 === fseek($handle, -1, SEEK_CUR)) {
fwrite($handle, $eol);
}
return $return;
}

In PHP versions before 8.1, fputcsv writes only \n and this cannot be customized. This makes the function worthless in a Microsoft environment, although some packages will also detect the Linux newline.
Still, the benefits of fputcsv kept me digging for a solution that replaces the newline character just before sending to the file. This can be done by streaming the fputcsv output to the built-in php://temp stream first, then adapting the newline character(s) to whatever you want, and then saving to the file. Like this:
function getcsvline($list, $seperator, $enclosure, $newline = "" ){
$fp = fopen('php://temp', 'r+');
fputcsv($fp, $list, $seperator, $enclosure );
rewind($fp);
$line = fgets($fp);
if( $newline and $newline != "\n" ) {
if( $line[strlen($line)-2] != "\r" and $line[strlen($line)-1] == "\n") {
$line = substr_replace($line,"",-1) . $newline;
} else {
// return the line as is (literal string)
//die( 'original csv line is already \r\n style' );
}
}
return $line;
}
/* to call the function with the array $row and save to file with filehandle $fp */
$line = getcsvline( $row, ",", "\"", "\r\n" );
fwrite( $fp, $line);
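On newer PHP versions this workaround may be unnecessary: as of PHP 8.1, fputcsv() accepts an optional eol argument after the escape character. The sketch below assumes PHP 8.1 or later; the in-memory stream and the sample rows are hypothetical.

```php
<?php
// Requires PHP 8.1 or later (assumption about the runtime).
$fp = fopen('php://memory', 'w+');

// Arguments: stream, fields, separator, enclosure, escape, eol.
fputcsv($fp, ['id', 'full name'], ',', '"', '\\', "\r\n");
fputcsv($fp, ['1', 'Ada Lovelace'], ',', '"', '\\', "\r\n");

rewind($fp);
$csv = stream_get_contents($fp);
fclose($fp);

echo bin2hex(substr($csv, -2)), "\n"; // 0d0a
```

Because the line ending is written by fputcsv() itself, this also works on streams that cannot seek, such as php://output.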

As webbiedave pointed out (thx!) probably the cleanest way is to use a stream filter.
It is a bit more complex than other solutions, but even works on streams that are not editable after writing to them (like a download using $handle = fopen('php://output', 'w'); )
Here is my approach:
class StreamFilterNewlines extends php_user_filter {
function filter($in, $out, &$consumed, $closing) {
while ( $bucket = stream_bucket_make_writeable($in) ) {
// the lookbehind also catches a \n at the very start of the data and consecutive \n\n
$bucket->data = preg_replace('/(?<!\r)\n/', "\r\n", $bucket->data);
$consumed += $bucket->datalen;
stream_bucket_append($out, $bucket);
}
return PSFS_PASS_ON;
}
}
stream_filter_register("newlines", "StreamFilterNewlines");
stream_filter_append($handle, "newlines");
fputcsv($handle, $list, $seperator, $enclosure);
...

Alternatively, you can output in native Unix format (\n only), then run unix2dos on the resulting file to convert to \r\n in the appropriate places. Just be careful that your data contains no \n characters. Also, I see you are using a default separator of ~; try a default separator of \t.

I've been dealing with a similar situation. Here's a solution I've found that outputs CSV files with Windows-friendly line endings.
http://www.php.net/manual/en/function.fputcsv.php#90883
I wasn't able to use the fseek-based answers since I'm trying to stream a file to the client and can't use fseek.

Windows needs \r\n (carriage return followed by line feed) as the line break in order to show separate lines.

I did eventually get an answer over at experts-exchange; here's what worked:
function my_fputcsv($handle, $fieldsarray, $delimiter = "~", $enclosure ='"'){
$glue = $enclosure . $delimiter . $enclosure;
return fwrite($handle, $enclosure . implode($glue,$fieldsarray) . $enclosure.PHP_EOL);
}
to be used in place of standard fputcsv.
