writing special characters to txt file - php

I am reading some data from a remote file, and everything works up to the point where I write some specific lines to a text file.
The problem is that when I write something like Girl's Goldtone 'X' CZ Ring, it becomes Girl&apos;s Goldtone &apos;X&apos; CZ Ring in the text file.
How do I write to the text file so that it keeps the text as written above and shows the actual characters rather than character codes?
Here is a sample of my code:
$content_to_write = '<li class="category-top"><span class="top-span"><a class="category-top" href="'.$linktext.'.html">'.$productName.'</a></span></li>'."\r\n";
fwrite($fp, $content_to_write);
$linktext = "Girls-Goldtone-X-CZ-Ring";
$productName = "Girl's Goldtone 'X' CZ Ring";
var_dump
string '<li class="category-top"><span class="top-span"><a class="category-top" href="Stellar-Steed-Gallery-wrapped-Canvas-Art.html">&apos;Stellar Steed&apos; Gallery-wrapped Canvas Art</a></span></li>
' (length=195)
Code
$productName =$linktext;
$linktext = str_replace(" ", "-", $linktext);
$delChar = substr($linktext, -1);
if($delChar == '.')
{
$linktext = substr($linktext, 0, -1);
}
$linktext = removeRepeated($linktext);
$linktext = remove_invalid_char($linktext);
$productName = html_entity_decode($productName);
$content_to_write = '<li class="category-top"><span class="top-span"><a class="category-top" href="'.$linktext.'.html">'.$productName.'</a></span></li>'."\r\n";
var_dump($content_to_write);
fwrite($fp, utf8_encode($content_to_write));

Are you reading the data from a remote file and then writing it to a text file? I agree with the comment above: it's an encoding issue. Try the following code:
$file = file_get_contents("messages.txt");
$file = mb_convert_encoding($file, 'HTML-ENTITIES', "UTF-8");
echo $file;
Echo the response to your browser and check it. If it looks right, write the response to your text file. Make sure the text file is UTF-8 encoded.
Also see: Write Special characters in a file.

fwrite is binary-safe, meaning it doesn't do any encoding stuff but just writes whatever you feed it directly to the file. It looks like the $productName variable you're writing is already entity-encoded before writing. Try running html_entity_decode over the variable first.
Note that html_entity_decode doesn't touch single quotes (&apos;) by default; you'll have to set the ENT_QUOTES flag in the second parameter. You might also want to explicitly specify an encoding in the third parameter.
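As a minimal sketch of the decoding step described above: the sample string is taken from the question, and the flag combination is an assumption about what the input needs. On PHP 5.4 and later, &apos; appears to be recognized only for the XML, XHTML, and HTML5 document types, so ENT_QUOTES is combined with ENT_HTML5 here.

```php
<?php
// Entity-encoded product name, as it arrives from the remote file in the question.
$productName = 'Girl&apos;s Goldtone &apos;X&apos; CZ Ring';

// ENT_QUOTES decodes both quote styles; ENT_HTML5 makes &apos; a known entity.
// 'UTF-8' is an assumed target encoding.
$decoded = html_entity_decode($productName, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, "\n"; // Girl's Goldtone 'X' CZ Ring
```

The decoded value can then be concatenated into $content_to_write and passed to fwrite unchanged, since fwrite does no encoding of its own.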

Related

Trying to read a CSV file containing Thai characters using PHP, but after reading it the characters are changed to unidentified characters

I have a CSV file that has data like this:
Sub District District
A Hi อาฮี Tha Li District ท่าลี่
A Phon อาโพน Buachet District บัวเชด
When I tried to read it using PHP code, following this SO question:
<?php
//set internal encoding to utf8
mb_internal_encoding('utf8');
$fileContent = file_get_contents('thai_unicode.csv');
//convert content from unicode to utf
$fileContentUtf = mb_convert_encoding($fileContent, 'utf8', 'unicode');
echo "parse utf8 string:\n";
var_dump(str_getcsv($fileContentUtf, ';'));
But it didn't work at all. Someone please let me know what I am doing wrong here.
Thanks in advance.
There are 2 issues with your code:
Your code applies str_getcsv to the whole file contents (instead of to individual lines).
Your code example uses the delimiter ";", but there is no such symbol in your input file.
Your data is either in a fixed-field-length format (which is not actually a CSV file) or in a tab-delimited CSV format.
If it is tab-delimited, there are 2 ways to read your file:
$lines = file('thai_unicode.csv');
foreach($lines as $line){
$data = str_getcsv($line,"\t");
echo "sub_district: ". $data[0].", district: ".$data[1]."\n";
}
or
$f = fopen('thai_unicode.csv',"r");
while($data = fgetcsv($f,0,"\t")){
echo "sub_district: ". $data[0].", district: ".$data[1]."\n";
}
fclose($f);
If instead your data uses fixed-length fields, you need to split each line yourself, because PHP's CSV-related functions are not suitable for this purpose.
So you will end up with something like this:
$f = fopen('thai_unicode.csv',"r");
while($line = fgets($f)){
$sub_district = mb_substr($line,0,20);
$district = mb_substr($line,20);
echo "sub_district: $sub_district, district: $district\n";
}
fclose($f);

How to format I/O data from script

I was using a script to exclude a list of words from another list of keywords. I would like to change the format of the output. (I found the script on this website and have made some modifications.)
Example:
Phrase from outcome: my word
I would like to add quotes: "my word"
I was thinking that I should put the outcome in new-file.txt and then rewrite it, but I do not understand how to capture the result. Please give me some tips. It's my first script :)
Here is the code:
<?php
$myfile = fopen("newfile1.txt", "w") or die("Unable to open file!");
// Open a file to write the changes - test
$file = file_get_contents("test-action-write-a-doc-small.txt");
// In small.txt there are words that will be excluded from the big list
$searchstrings = file_get_contents("test-action-write-a-doc-full.txt");
// From this list the script is excluding the words that are in small.txt
$breakstrings = explode(',',$searchstrings);
foreach ($breakstrings as $values){
if(!strpos($file, $values)) {
echo $values." = Not found;\n";
}
else {
echo $values." = Found; \n";
}
}
echo "<h1>Outcome:</h1>";
foreach ($breakstrings as $values){
if(!strpos($file, $values)) {
echo $values."\n";
}
}
fwrite($myfile, $values); // write the result in newfile1.txt - test
// a loop is missing?
fclose($myfile); // close newfile1.txt - test
?>
There is also a little mistake in the script. It works fine; however, before entering the list of words in test-action-write-a-doc-full.txt and in test-action-write-a-doc-small.txt, I have to put a line break on the first line, otherwise it does not find the first word.
Example:
In test-action-write-a-doc-small.txt words:
pick, lol, file, cool,
In test-action-write-a-doc-full.txt words:
pick, bad, computer, lol, break, file.
Outcome:
Pick = Not found -- here is the mistake.
It happens if I do not put a line break on the first line of the .txt file
lol = Found
file = Found
Thanks in advance for any help! :)
You can collect the accepted words in an array, and then glue all those array elements into one text, which you then write to the file. Like this:
echo "<h1>Outcome:</h1>";
// Build an array with accepted words
$keepWords = array();
foreach ($breakstrings as $values){
// remove white space surrounding word
$values = trim($values);
// compare with false, and skip empty strings
if ($values !== "" and false === strpos($file, $values)) {
// Add word to end of array, you can add quotes if you want
$keepWords[] = '"' . $values . '"';
}
}
// Glue all words together with commas
$keepText = implode(",", $keepWords);
// Write that to file
fwrite($myfile, $keepText);
Note that you should not write !strpos(..) but false === strpos(..) as explained in the docs.
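A small sketch of why the strict comparison matters. The helper name isMissing is hypothetical, and the sample string is the small word list from the question. Because "pick" sits at offset 0, this may also be what causes the "first word not found" symptom described in the question.

```php
<?php
// Hypothetical helper: true when $needle does not occur anywhere in $haystack.
function isMissing(string $haystack, string $needle): bool
{
    // strpos returns the match offset (possibly 0) or false, so compare strictly.
    return strpos($haystack, $needle) === false;
}

$small = "pick, lol, file, cool,";

var_dump(strpos($small, "pick"));    // int(0): found at the very start
var_dump(!strpos($small, "pick"));   // bool(true): 0 is falsy, so the loose test reports "not found"
var_dump(isMissing($small, "pick")); // bool(false): the strict test sees the match
var_dump(isMissing($small, "bad"));  // bool(true)
```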
Note also that this method of searching in $file may give unexpected results. For instance, if you have "misery" in your $file string, then the word "is" (if separated by commas in the original file) will be refused, because it is found inside $file as a substring. You might want to review this.
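One possible way around the substring problem, sketched under the assumption that both files are plain comma-separated word lists: split both into arrays and compare whole words instead of searching inside one long string. The function names parseWordList and excludeWords are hypothetical.

```php
<?php
// Hypothetical helper: split a comma-separated list into trimmed, non-empty words.
function parseWordList(string $csv): array
{
    $words = array_map('trim', explode(',', $csv));
    $words = array_filter($words, function ($word) {
        return $word !== '';
    });
    return array_values($words);
}

// Hypothetical helper: words from $fullList that do not appear as whole words in $smallList.
function excludeWords(string $fullList, string $smallList): array
{
    return array_values(array_diff(parseWordList($fullList), parseWordList($smallList)));
}

// "is" is kept even though "misery" contains it, because whole words are compared.
print_r(excludeWords("is, bad, pick", "misery, pick"));
```

The resulting array can be quoted and joined with implode in the same way as $keepWords above.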
Concerning the second problem
The fact that it does not work without first adding a line-break in your file leads me to think it is related to the Byte-Order Mark (BOM) that appears in the beginning of many UTF-8 encoded files. The problem and possible solutions are discussed here and elsewhere.
If indeed it is this problem, there are two solutions I would propose:
Use your text editor to save the file as UTF-8, but without BOM. For instance, notepad++ has this possibility in the encoding menu.
Or, add this to your code:
function removeBOM($str = "") {
if (substr($str, 0,3) == pack("CCC",0xef,0xbb,0xbf)) {
$str = substr($str, 3);
}
return $str;
}
and then wrap all your file_get_contents calls with that function, like this:
$file = removeBOM(file_get_contents("test-action-write-a-doc-small.txt"));
// In small.txt there are words that will be excluded from the big list
$searchstrings = removeBOM(file_get_contents("test-action-write-a-doc-full.txt"));
// From this list the script is excluding the words that are in small.txt
This will strip these funny bytes from the start of the string taken from the file.

Writing to a file adds weird content at the end of the line

I am working on a program that parses text files uploaded by a user and then saves the parsed XML file on the server. However, when I write the XML file I get the text &#13; at the end of each line. This text is not in my original text file. I didn't even notice it until I opened the new XML file to verify that it was writing all of the content. Has anyone run into this before, and if so, can you tell me if it's due to the way I'm creating and writing my file?
fileUpload.php - These 3 lines occur when the user uploads the file.
$fileName = basename($_FILES['fileaddress']['name']);
$fileContents = file_get_contents($_FILES['fileaddress']['tmp_name']);
$xml = $parser->parseUnformattedText($fileContents);
$parsedFileName = pathinfo($fileName, PATHINFO_FILENAME) . ".xml";
file_put_contents($parsedFileName, $xml);
parser.php
function parseUnformattedText($inputText, $bookName = "")
{
//create book, clause, text nodes
$book = new SimpleXmlElement("<book></book>");
$book->addAttribute("bookName", $bookName);
$conj = $book->addChild("conj", "X");
$clause = $book->addChild("clause");
$trimmedText = $this->trimNewLines($inputText);
$trimmedText = $this->trimSpaces($inputText);
$text = $clause->addChild("text", $trimmedText);
$this->addChapterVerse($text, "", "");
//make list of pconj's for beginning of file
$pconjs = $this->getPconjList();
//convert the xml to string
$xml = $book->asXml();
//combine the list of pconj's and xml string
$xml = "$pconjs\n$xml";
return $xml;
}
Input text file
1:1 X
it seemed good to me also,
X
having had perfect understanding of all things from the very first
to write you an orderly account, [most] excellent Theophilius
and
1:4
that
you may know the certainty of those things in which you were instructed
1:5 X
There was in the days of Herod, the king of Judea and a certain priest named Zacharias
X
his wife[was] of the daughters of Aaron
and
her name [was] Elizabeth.
1:8 So
it was,
that
while he was serving as priest 1:9 before God in the order of his division,
1:10 and
the whole multitude of the people was praying outside at the hour of incense
but
therefore
it was done.
Going off of Seroczynski's answer, I was able to create a function that removed any carriage returns from the text. The XML output looked fine after that. Here's the function I used to fix the issue:
function trimCarriageReturns($text)
{
$textOut = str_replace("\r", "\n", $text);
$textOut = str_replace("\n\n", "\n", $textOut);
return $textOut;
}
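A variant of the same idea, offered as a sketch rather than as the poster's method: replacing "\r\n" as a unit before replacing lone "\r" characters avoids creating doubled newlines, so intentional blank lines in the input survive. The function name normalizeLineEndings is hypothetical.

```php
<?php
// Hypothetical helper: convert Windows (\r\n) and old Mac (\r) line endings to \n.
function normalizeLineEndings(string $text): string
{
    // "\r\n" is listed first so it becomes one "\n" rather than two.
    return str_replace(["\r\n", "\r"], "\n", $text);
}

echo json_encode(normalizeLineEndings("1:1 X\r\nit seemed good\r\n\r\nand\r")), "\n";
```

It would be applied to the uploaded text before it is handed to parseUnformattedText(), so no carriage return reaches the XML text nodes.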
&#13; is the numeric character reference for the carriage return (\r) in a \r\n line ending, which doesn't seem to come out correctly from parseUnformattedText().
Try $xml = nl2br($parser->parseUnformattedText($fileContents));

PHP Write Posted Data to File

I'm relatively new to PHP and I'm trying to get a small script running. I have a VB .net program that posts data using the following function.
Public Sub PHPPost(ByVal User As String, ByVal Score As String)
Dim postData As String = "user=" & User & "&" & "score=" & Score
Dim encoding As New UTF8Encoding
Dim byteData As Byte() = encoding.GetBytes(postData)
Dim postReq As HttpWebRequest = DirectCast(WebRequest.Create("http://myphpscript"), HttpWebRequest)
postReq.Method = "POST"
postReq.KeepAlive = True
postReq.ContentType = "application/x-www-form-urlencoded"
postReq.ContentLength = byteData.Length
Dim postReqStream As Stream = postReq.GetRequestStream()
postReqStream.Write(byteData, 0, byteData.Length)
postReqStream.Close()
End Sub
Where "myphpscript" is actually the full URL to the PHP script. Basically I'm trying to POST the "User" variable and the "Score" variable to the PHP script. The script I've tried is as follows:
<?php
$File = "scores.rtf";
$f = fopen($File,'a');
$name = $_POST["name"];
$score = $_POST["score"];
fwrite($f,"\n$name $score");
fclose($f);
?>
The "scores.rtf" does not change. Any help would be appreciated. Thanks ahead of time, I'm new to PHP.
Ensure that your script is receiving the POST variables.
http://php.net/manual/en/function.file-put-contents.php
You can try file_put_contents; it combines fopen, fwrite, and fclose.
It would be wise to use something like isset/empty to check that there is something to write before writing.
<?php
$file = 'scores.rtf';
// Open the file to get existing content
$current = file_get_contents($file);
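// Debug step: append a dump of everything received in $_POST
// (print_r needs true as its second argument to return the text instead of printing it)
$current .= print_r($_POST, true);
// Once confirmed, remove the line above and use the one below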
$current .= $_POST['name'] . ' ' . $_POST['score'] . "\n";
// Write the contents back to the file
file_put_contents($file, $current);
?>
Also, totally overlooked the RTF part, definitely look into what Mahan mentioned. I'd suggest the above if you have no need for that specific file type.
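A minimal sketch of the isset check suggested above. Note that the VB.NET code in the question posts the fields "user" and "score", while the PHP script reads $_POST["name"], so that key mismatch is worth ruling out first. The helper name buildScoreLine and the scores.txt target are assumptions for illustration.

```php
<?php
// Hypothetical helper: build the line to append, or return null when a field is missing.
function buildScoreLine(array $post): ?string
{
    // The VB.NET client sends "user" and "score", so read exactly those keys.
    if (!isset($post['user'], $post['score'])) {
        return null;
    }
    // Replace line breaks so one request cannot inject extra lines (an added precaution).
    $user  = str_replace(["\r", "\n"], ' ', (string) $post['user']);
    $score = str_replace(["\r", "\n"], ' ', (string) $post['score']);
    return "$user $score\n";
}

$line = buildScoreLine($_POST);
if ($line !== null) {
    // scores.txt is an assumed plain-text target; FILE_APPEND keeps earlier entries.
    file_put_contents('scores.txt', $line, FILE_APPEND | LOCK_EX);
}
```

If nothing is written, dumping $_POST as shown earlier should reveal which keys actually arrived.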
The "scores.rtf" does not change.
Well, RTF files are handled differently, because an RTF file is not really a pure text file: it contains metadata and tags that control how the text is displayed. Please take the time to read the following sources:
http://www.webdev-tuts.com/generate-rtf-file-using-php.html
http://b-l-w.de/phprtf_en.php
http://paggard.com/projects/doc.generator/doc_generator_help.html
If you just want a normal text file, you can use the code below. Instead of fwrite(), use file_put_contents(), with FILE_APPEND so that existing scores are kept:
file_put_contents("scores.txt", "\n$name $score", FILE_APPEND);

php file() function creates quotation marks and commas

I'm trying to write some php-code that takes $_GET-data as an input and saves it into a csv-file.
When running the code more than once my csv-file looks like this:
Date,Time,Temperature,"Air Humidity","Soil Humidity",Light,"Wind Direction","Wind Speed",Rain
2013-03-16,16:24:27,12,80,40,82,255,10,0
"2013-03-16,16:24:26,12,80,40,82,255,10,0
","""2013-03-16,16:24:26,12,80,40,82,255,10,0
",""",""""""2013-03-16,16:24:25,12,80,40,82,255,10,0
",""","""""",""""
",""",""""""
","""
"
As you can see, the program adds quotation marks and commas into my data that I don't want. This is apparently done by 'file("weather_data.csv")' but I don't know how to disable or work around this.
This is my code for now:
<?php
// Save received data into variables:
$temperature = $_GET["t"];
$airHumidity = $_GET["ha"];
$soilHumidity = $_GET["hs"];
$light = $_GET["l"];
$windDir = $_GET["wd"];
$windSpeed = $_GET["ws"];
$rain = $_GET["r"];
// Arrays for the column descriptor (first line in the csv-file) and the recent data:
$columnDescriptor = array("Date","Time","Temperature","Air Humidity","Soil Humidity","Light","Wind Direction","Wind Speed","Rain");
$recentData = array(date("Y-m-d"),date("H:i:s"),$temperature,$airHumidity,$soilHumidity,$light,$windDir,$windSpeed,$rain);
$fileContents = file("weather_data.csv");
array_shift($fileContents); // removes the first element of $fileContents (the old header line)
$file = fopen("weather_data.csv","w");
fputcsv($file,$columnDescriptor);
fputcsv($file,$recentData);
fputcsv($file,$fileContents);
fclose($file);
?>
$fileContents is read as an array of strings, one entry per line of the CSV file, but the actual CSV data is not parsed. The last fputcsv treats that whole array as the fields of a single row and escapes it (adding quotes and commas). You need to add the old file contents ($fileContents) to your file with fwrite instead of fputcsv. Because file() keeps the newline at the end of each line, join the lines with an empty string:
fwrite($file, implode("", $fileContents));
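Putting the pieces together, a sketch of the whole rewrite step. The function name prependCsvRow is hypothetical, and the delimiter, enclosure, and escape arguments are simply PHP's long-standing defaults, passed explicitly. Since file() keeps the trailing newline of every line unless FILE_IGNORE_NEW_LINES is given, the old lines are joined with an empty string here.

```php
<?php
// Hypothetical helper: rewrite a CSV file as header, newest row, then the older rows.
function prependCsvRow(string $path, array $header, array $row): void
{
    // Each element of $oldLines is one raw line of CSV text, newline included.
    $oldLines = is_file($path) ? file($path) : [];
    array_shift($oldLines); // drop the old header line (no effect on an empty array)

    $handle = fopen($path, 'w');
    // The header and the new row are arrays of fields, so fputcsv is the right tool.
    fputcsv($handle, $header, ',', '"', '\\');
    fputcsv($handle, $row, ',', '"', '\\');
    // The old lines are already CSV text, so write them back untouched.
    fwrite($handle, implode('', $oldLines));
    fclose($handle);
}
```

With this shape, repeated calls leave the older rows byte-for-byte unchanged instead of wrapping them in more quotes each time.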
