I need to export some data using PHP and for each line I'm adding a \r\n. When I open the exported data file that I downloaded, I see that the \r\n is interpreted as [LF] in Notepad.
But the application in which I open the file doesn't read the [LF] as a new line.
Now if I do a [CR][LF] in Notepad, the application can read the [CR][LF] as a new line.
How can I echo the equivalent of [CR][LF] with PHP?
It's as simple as doing:
echo "\r\n";
(note the double quotes)
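To see why the double quotes matter, here is a minimal sketch (the variable names are only illustrative). PHP expands escape sequences such as \r and \n inside double-quoted strings; inside single quotes they stay as literal backslash characters.

```php
<?php
// Single quotes: backslash sequences are NOT expanded (4 literal characters).
$literal = '\r\n';
// Double quotes: expands to the two bytes CR (0x0D) and LF (0x0A).
$crlf = "\r\n";

echo strlen($literal), "\n";   // 4
echo strlen($crlf), "\n";      // 2
echo bin2hex($crlf), "\n";     // 0d0a
```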
Do echo PHP_EOL;
That way it'll always output the newline sequence that's native to the system PHP is running on, since not all OSes use the same newline convention. Note that this is the server's convention, so on a Linux host it is a plain \n, not \r\n. (More information: PHP global constants.)
Problem solved: the string was passed in a POST parameter, and the \r was being removed along the way.
You just have to do str_replace("\n", "\r\n", $...);
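One caveat with a plain replacement of \n: if some lines already end in \r\n, they end up as \r\r\n. A minimal sketch that normalizes any mix of line endings to CR LF in a single pass could look like this (to_crlf is a hypothetical helper name, not something from the original posts):

```php
<?php
// Hypothetical helper: convert any mix of CRLF, lone CR, or lone LF to CRLF.
function to_crlf($text)
{
    // The alternation tries "\r\n" first, so an existing CRLF is matched
    // as one unit and is not doubled.
    return preg_replace('/\r\n|\r|\n/', "\r\n", $text);
}

$mixed = "a\nb\r\nc\rd";
echo bin2hex(to_crlf($mixed)), "\n";   // 610d0a620d0a630d0a64
```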
I need to convert a CSV file exported from Mac Excel 2011 to an importable format recognized by a CMS (the solution should not depend on the CMS, although the target is the Drupal Feeds module).
In order to do this currently I need to perform the following operations in Vim:
:%s/\r/\r/g
:w ++enc=utf8
Which basically means:
1. Convert carriage returns to some sort of universal format. Initially, as Excel exports them, the carriage return character is represented by ^M; the Vim command :%s/\r/\r/g converts them all to a format the CMS recognizes as a carriage return.
2. Convert the character encoding to UTF8. As exported initially, the character set is ASCII Extended or something similar.
Ideally this process will be triggered when the file is uploaded as part of the import, which means PHP will trigger it, if that has any bearing on the solution. For now I feel more comfortable handling this as a shell script or something similar, but of course PHP solutions are welcome if I can figure out how to hook them into Drupal 7 Feeds.
Some untested code:
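#!/usr/bin/env php
<?php
$replacements = array(
    // Adjust destination char to your liking
    "\r\n" => "\n",
    "\r"   => "\n",
    "\n"   => "\n",
);
// No risk of splitting characters: the input is single-byte.
// A CR LF pair can still straddle two chunks, so a trailing CR is
// held back and prepended to the next chunk.
$carry = '';
while (!feof(STDIN)) {
    $chunk = fread(STDIN, 10240);
    if ($chunk === false) {
        break;
    }
    $chunk = $carry . $chunk;
    $carry = '';
    if (substr($chunk, -1) === "\r" && !feof(STDIN)) {
        $carry = "\r";
        $chunk = substr($chunk, 0, -1);
    }
    // Normalize line feeds
    $chunk = strtr($chunk, $replacements);
    // Convert to UTF-8 (adjust source encoding to your needs)
    fwrite(STDOUT, iconv('CP1252', 'UTF-8', $chunk));
}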
Usage:
./fix-csv < input.csv > output.csv
I am generating a CSV file, but the people who process these files tell me it needs to be in ASCII format. How do I go about doing that?
This is what I have to generate the file:
$filename = '/logs/'.date('Ymd').'.txt';
$myfile = fopen($filename,'a');
fwrite($myfile, $data);
fclose($myfile);
This file generates fine and opens fine...everything is ok to the naked eye but they said it needs to be in ascii format...
Output of file:
"","932-4","Mike","Tanner","","1234 Testing Lane","","Los Angeles","CA","90066","","(993)857-7727","","","","SALE","","","V","4111111111111111","01/14","AXLW","","ZENC","","","REG","","511.80","","07/21/11","932-359","D1234","4","","1","","","","","","","Tanner","Mike","","1234 Testing Lane","","CA","Los Angeles","90066","","CC","","","","Y","100.00","","100.00","","","","","","","","Y","11.8","info#info.com","359","001","001","(993)857-7727","(993)857-7727","","","","","","","","","","","","","","","","","","","","222","","","","","","","","","","","","","",
Anyone?
Thanks...
I'm going to play Carnac the Magnificent and say that you're just using a line-feed (ascii 10, aka \n) to terminate each line. I'll bet they want carriage-return plus line-feed (ascii 13,10). Just a wild guess. :)
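ANSI = Windows-1252, so probably: $data = iconv("windows-1252", "ASCII//TRANSLIT", $data); (without //TRANSLIT or //IGNORE, iconv fails at the first character that has no ASCII equivalent).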
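Both guesses can be checked before the file is sent. The following is only a sketch, assuming $data holds the full CSV text; the two helper names are hypothetical. It reports whether any byte falls outside 7-bit ASCII and whether every line break is a CR LF pair.

```php
<?php
// Hypothetical helper: true when every byte is 7-bit ASCII (0x00-0x7F).
function is_pure_ascii($data)
{
    return preg_match('/[^\x00-\x7F]/', $data) === 0;
}

// Hypothetical helper: true when there is no LF without a preceding CR
// and no CR without a following LF.
function uses_crlf_only($data)
{
    return preg_match('/(?<!\r)\n|\r(?!\n)/', $data) === 0;
}

$data = "\"932-4\",\"Mike\",\"Tanner\"\n";  // sample row ending in a bare LF
var_dump(is_pure_ascii($data));   // bool(true)
var_dump(uses_crlf_only($data));  // bool(false)
```

If the second check fails, appending "\r\n" instead of "\n" when building each row is the likely fix.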
I am trying to search and replace special characters in strings that I am parsing from a CSV file. When I open the text file with vim, it shows me the character as <95>. I can't for the life of me figure out what character this is so that I can use preg_replace on it. Any help would be appreciated.
Thanks,
Chris Edwards
0x95 is probably supposed to represent the character U+2022 Bullet (•), encoded in Windows code page 1252. You can get rid of it in a byte string using:
$line= str_replace("\x95", '', $line);
or you can use iconv to convert the character set of the data from cp1252 to utf8 (or whatever other encoding you want), if you've got a CSV parser that can read non-ASCII characters reliably. Otherwise, you probably want to remove all non-ASCII characters, eg with:
$line= preg_replace("/[\x80-\xFF]/", '', $line);
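As a rough sketch of the first two options side by side (the sample string is made up, and the iconv extension is assumed to be available): removing the byte shortens the string, while converting from cp1252 turns 0x95 into the three-byte UTF-8 sequence for U+2022.

```php
<?php
// Sample byte string as it might come out of a cp1252 CSV (made-up content).
$line = "Item \x95 Widget";

// Option 1: drop the byte entirely.
$stripped = str_replace("\x95", '', $line);

// Option 2: re-encode cp1252 as UTF-8, so 0x95 becomes U+2022 (bytes E2 80 A2).
$utf8 = iconv('CP1252', 'UTF-8', $line);

echo $stripped, "\n";          // Item  Widget
echo bin2hex($utf8), "\n";
```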
If your CSV parser is fgetcsv(), you've got problems. Theoretically you should be able to do this as a preprocessing step on a string before passing it to str_getcsv() (PHP 5.3) instead. Unfortunately this also means you have to read the file and split it row-by-row yourself, and this is not trivial to do given that quoted CSV values may contain newlines. By the time you've written the code to handle that properly, you've pretty much written a CSV parser. So what you actually have to do is read the file into a string, do your pre-processing changes, write it back out to a temporary file, and have fgetcsv() read that.
The alternative would be to post-process each string returned by fgetcsv() individually. But that's also unpredictable, because PHP mangles the input by decoding it using the system default encoding instead of just giving you the damned bytes. And the default encoding outside of Windows is usually UTF-8, which won't read a 0x95 byte on its own as that'd be an invalid byte sequence. And whilst you could try to work around that using setlocale() to change the system default encoding, that is pretty bad practice which won't play nicely with any other apps you've got running that depend on system locale.
In summary, PHP's built-in CSV parsing stuff is pretty crap.
Following Bobince's suggestion, the following worked for me:
analyse_file() -> http://www.php.net/manual/en/function.fgetcsv.php#101238
function file_get_contents_utf8($fn) {
    $content = file_get_contents($fn);
    return mb_convert_encoding($content, 'UTF-8', mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
}
if ($_FILES['file']['error'] != UPLOAD_ERR_NO_FILE) {
    foreach ($_FILES as $file) {
        $n = $file['name'];
        $s = $file['size'];
        $filename = $file['tmp_name'];
        ini_set('auto_detect_line_endings', TRUE); // in case Mac csv (deprecated as of PHP 8.1)
        // dealing with fgetcsv() special chars
        // read the file into a string, do your pre-processing changes
        // write it back out to a temporary file, and have fgetcsv() read that.
        $contents = file_get_contents_utf8($filename); // separate name: $file is the loop variable
        $tempFile = tempnam(sys_get_temp_dir(), '');
        $handle = fopen($tempFile, "w+");
        fwrite($handle, $contents);
        fseek($handle, 0); // rewind so fgetcsv() starts at the first row
        $filename = $tempFile;
        // END -- dealing with fgetcsv() special chars
        $Array = analyse_file($filename, 10);
        $csvDelim = $Array['delimiter']['value'];
        while (($data = fgetcsv($handle, 1000, $csvDelim)) !== FALSE) {
            // process the csv file
        }
        fclose($handle);
        unlink($tempFile); // remove the temporary copy
    } // end foreach
}
For example, when I create a new file:
$message = "Hello!";
$fh = fopen('index.html', 'w');
fwrite($fh, $message);
fclose($fh);
How can I set its encoding (utf-8 or shift-jis or euc-jp) and line breaks (LF or CR+LF or CR) in PHP?
The encoding of a string literal should match the encoding of the source file. To convert between encodings you could use iconv.
$utf8=iconv("ISO-8859-1", "UTF-8", $message);
Line breaks are entirely up to you. You could use the PHP_EOL constant, or if you think you might need to vary the type of line break, store the desired sequence in a variable and configure it at runtime.
To add carriage returns and linefeeds use the special characters \r and \n. So:
$message = "Hello!\r\n";
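Putting both answers together, a minimal sketch might look like the following. The helper name write_text_file and its parameters are hypothetical, and the input strings are assumed to be UTF-8.

```php
<?php
// Hypothetical helper: write $lines to $path using the given line break and
// target encoding. Assumes the input strings are UTF-8.
function write_text_file($path, array $lines, $eol = "\r\n", $encoding = 'UTF-8')
{
    $text = implode($eol, $lines) . $eol;
    if ($encoding !== 'UTF-8') {
        $text = iconv('UTF-8', $encoding, $text);
        if ($text === false) {
            throw new RuntimeException("Cannot convert to $encoding");
        }
    }
    return file_put_contents($path, $text);
}

$path = tempnam(sys_get_temp_dir(), 'eol');
write_text_file($path, array('Hello!', 'Goodbye!'), "\r\n", 'UTF-8');
echo bin2hex(file_get_contents($path)), "\n";
unlink($path);
```

Pass "\n" or "\r" as the third argument for the other line-break styles. An encoding name such as 'SHIFT_JIS' or 'EUC-JP' can be passed as the fourth argument, provided the iconv library on your system supports it, which is an assumption worth checking.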