Can't replace this unicode (�) to " " (space) on csv upload to PHP - php

I have Excel data converted from XLSX to CSV that I need to upload to my site. The data looks like this in the CSV, but it changes after the upload.
// On Excel (CSV)
Row Description
1 Enjoy this life without drugs
2 Life is so short, so enjoy it
After uploading it to the site and inserting it into MySQL, it looks like this.
// On MySQL
Row Description
1 Enjoy?this life without?drugs
2 Life is?so?short, so?enjoy it
// On PHP (echo loop)
Row Description
1 Enjoy�this life without�drugs
2 Life is�so�short, so�enjoy it
I checked my CSV: it is only the spaces that were changed into ? and �. So I tried to replace them, but every attempt failed:
// $the_string = Line of text Description.
1. str_replace("�", " ", $the_string);
2. str_replace("&#65533", " ", $the_string);
3. str_replace("&#xfffd", " ", $the_string);
4. str_replace("?", " ", $the_string);
But if I test it on its own with <?php str_replace("�", " ", "a�b"); ?>, it works.
I don't know where the mistake is.
This is my source code:
public function upload()
{
    $config = array(
        "upload_path"   => "./uploads/",
        "allowed_types" => "csv"
    );
    $this->load->library("upload", $config);
    $this->load->helper("file");
    $this->upload->initialize($config);
    $upload = $this->upload->data();
    $file = base_url()."uploads/{$upload['file_name']}";
    // First pass: count the lines
    $file_handle = fopen($file, "r");
    $check_line = 0;
    while ( ! feof($file_handle))
    {
        $line_of_text = fgetcsv($file_handle, 1024);
        $check_line++;
    }
    fclose($file_handle);
    if ($check_line > 1)
    {
        // Second pass: insert each description
        $file_handle2 = fopen($file, "r");
        while ( ! feof($file_handle2))
        {
            $line_of_text = fgetcsv($file_handle2, 1024);
            $description = $line_of_text[1];
            $this->model->insert_description($description);
        }
        fclose($file_handle2);
    }
}

Try with preg_replace():
preg_replace('/\x{FFFD}/u', ' ', $the_string);
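Attention: this replaces the � character with a space, but ONLY if the string really contains that character (U+FFFD).
The � character may also be displayed in place of any byte sequence that is not valid in the encoding in use, in this case by PHP. The string then holds the raw invalid bytes rather than U+FFFD, and with the /u modifier preg_replace() returns NULL for a subject that is not valid UTF-8.
To remove all control characters and all bytes outside the ASCII range (which also strips valid multibyte UTF-8 characters), use this: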
preg_replace('/[\x00-\x1F\x7F-\xFF]/', '', $the_string);
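If the goal is to turn the odd characters back into plain spaces rather than delete them, a conversion step before the replacement may help. The following is only a sketch under an assumption the question does not confirm: that Excel saved the CSV as Windows-1252 and that the "spaces" are non-breaking spaces (byte 0xA0). The function name normalize_spaces is made up for illustration, and the mbstring extension must be enabled.

```php
<?php
// Assumption: input that is not valid UTF-8 is Windows-1252,
// where the single byte 0xA0 is a non-breaking space.
function normalize_spaces(string $raw): string
{
    // Convert to UTF-8 only when the bytes are not already valid UTF-8.
    if (!mb_check_encoding($raw, 'UTF-8')) {
        $raw = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');
    }
    // In UTF-8 the non-breaking space U+00A0 is the byte pair C2 A0;
    // swap it for an ordinary space.
    return str_replace("\xC2\xA0", ' ', $raw);
}

// A Windows-1252 string with 0xA0 where the spaces should be.
echo normalize_spaces("Enjoy\xA0this life without\xA0drugs"), "\n";
```

Applied to $line_of_text[1] before the insert, this would store ordinary spaces in MySQL, provided the assumption about the file's encoding holds.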

Related

How to replace all occurrences of string one by one (loop)

I have a string with x occurrences of some substring. I need to replace every occurrence one by one (not all in one step), and each time I replace one, I need to save that substring into a variable before it is replaced...
Specifically, I have an input with text and three images in base64. I need to cut out the base64 code of the images (replace it with ''), save each image's base64 code into a new file, and echo only the input text without the base64 code of the images...
I tried for loops, while loops, functions... but every time only the first occurrence of the substring is replaced and not the others.
I think I must save $original_string on every loop iteration, because in my code it always starts from the whole $original_string from the input. But I don't know how to do that...
$img_count = substr_count($original_string, "data:image");
for ($i = 0; $i < $img_count; $i++) {
    $img_name = get_string_between($original_string, 'data-filename="', '.');
    $img_base64 = get_string_between($original_string, 'src="', '"');
    $original_string = str_replace($img_base64, '', $original_string);
    $myfile = fopen("../img/clanky/" . $id_noveho_clanku . "/" . $img_name . ".txt", "w") or die("Unable to open file!");
    $txt = $img_base64;
    fwrite($myfile, $txt);
    fclose($myfile);
}
while (strpos($original_string, "data:image") !== false) {
    $img_name = get_string_between($original_string, 'data-filename="', '.');
    $img_base64 = get_string_between($original_string, 'src="', '"');
    $original_string = str_replace($img_base64, '', $original_string);
    $myfile = fopen("../img/clanky/" . $id_noveho_clanku . "/" . $img_name . ".txt", "w") or die("Unable to open file!");
    $txt = $img_base64;
    fwrite($myfile, $txt);
    fclose($myfile);
}
I want 3 files with the base64 code of the images, and the text from the input without this base64. But I get only one file, and the text is missing only the first occurrence.
EDIT:
So, I found out this...
I uploaded the image "výstřižek.jpg". On my website the images render correctly, without errors. In the admin panel of my web hosting I see this name displayed correctly in the file list, but when I connect to the host server with FileZilla FTP and open the file tree in that client, I see "VýstÃÂiþek3.jpg". But why??
I can't post images here, so I uploaded screenshots to Google Drive; please view them for a better understanding of the actual problem.
https://drive.google.com/open?id=1OQn-L1xePo6o1I6U7mmvZbxiEOBhPLvp
Yes, your code always returns the first result because it always matches from the beginning of the string. For it to work, you should either remove the part you've already processed or use an offset from which to start the search in each iteration: start with 0 and, before removing the data, use its position as the next offset. You would then have to pass the offset to your get_string_between function to limit the search.
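The offset idea could look roughly like the sketch below. The question does not show get_string_between, so find_between and extract_all are hypothetical stand-ins written for illustration; they only demonstrate carrying a search offset from one iteration to the next.

```php
<?php
// Hypothetical helper: returns [substring, start position], or null if not found.
function find_between(string $haystack, string $start, string $end, int $offset = 0): ?array
{
    $from = strpos($haystack, $start, $offset);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null;
    }
    return [substr($haystack, $from, $to - $from), $from];
}

// Collects every src value one by one and blanks it in the returned string.
function extract_all(string $html): array
{
    $found = [];
    $offset = 0;
    while (($hit = find_between($html, 'src="', '"', $offset)) !== null) {
        [$value, $pos] = $hit;
        $found[] = $value; // save the value before it is removed
        $html = substr_replace($html, '', $pos, strlen($value));
        $offset = $pos + 1; // continue after the closing quote
    }
    return [$html, $found];
}

[$clean, $found] = extract_all('<img src="AAA"><img src="BBB">');
print_r($found);
```

Each saved value in $found could then be written to its own file inside the loop instead of being collected in an array.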
But since your original string is in fact HTML, I'd recommend using an HTML parsing library to avoid headaches with attributes being in a different order, possible escaping issues, etc. This solution uses the DOM extension, which should be enabled by default.
$doc = new DOMDocument();
// Treat the string as a fragment and do not add a DOCTYPE or mandatory tags
$doc->loadHTML(mb_convert_encoding($original_string, 'HTML-ENTITIES', 'UTF-8'), LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
// Get all images
$imgs = $doc->getElementsByTagName('img');
foreach ($imgs as $img) {
    $src = $img->getAttribute('src');
    $name = $img->getAttribute('data-filename');
    // Check if it contains base64 data and a filename
    if (false !== mb_strpos($src, 'data:image') && "" !== $name) {
        $data = mb_substr($src, mb_strpos($src, ',') + 1); // Data start
        // str_replace is used because PHP has no built-in mb_str_replace
        $img->setAttribute('src', str_replace($data, '', $src)); // Remove data
        // mb_strrpos keeps the offset in characters, matching mb_substr
        $name = mb_substr($name, 0, mb_strrpos($name, '.'));
        // Save the image to disk
        $path = "../img/clanky/{$id_noveho_clanku}/{$name}.txt";
        file_put_contents($path, $data);
    }
}
// Get the HTML with the data removed
$parsed = $doc->saveHTML($doc);
Demo on 3v4l.org

How to extract string line by line from one text file to another text file

I have a text file of phone numbers like below:
+2348089219281 +2348081231580 +2347088911847 +2347082645764 +2348121718153 +2348126315930 +2348023646683.
I want to extract each number, strip +234 from it and replace it with 0, then add the text "Names" . "\t" in front of the modified number.
Then I want to insert this new string into a new text file (line by line).
This is what I get in the new text file with the code I wrote:
Names 00urce id #3
Names 00urce id #3
Here's my code:
$this_crap_file = fopen($old_file_name, "r");
$total_number_lines_for_this_crap_file = count($this_crap_file);
while(!feof($this_crap_file))
{
    $the_new_writing = fopen($new_file_name, "a");
    $the_string = substr_replace($this_crap_file, "0", 0, 4);
    $new_string = "Names" . "\t" . 0 . $the_string . "\n";
    fwrite($the_new_writing, $new_string);
}
fclose($this_crap_file);
No space between fopen and the parenthesis? Sorry, I don't see the relevance of that statement.
Assuming your input file only has one phone number per line, and that they all start with '+234', you could use a regular expression to pick out just the part you want to put in the new file, like this:
$this_crap_file = fopen($old_file_name, "r");
$the_new_writing = fopen($new_file_name, "a");
while (($line = fgets($this_crap_file)) !== false)
{
    // Skip lines that do not contain a +234 number
    if (preg_match('/\+234(\d+)/', $line, $matches)) {
        // Replace the +234 prefix with a leading 0, as the question asks
        $new_string = "Names\t0" . $matches[1] . "\n";
        fwrite($the_new_writing, $new_string);
    }
}
fclose($the_new_writing);
fclose($this_crap_file);
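If the numbers are actually separated by spaces on a single line, as the sample in the question appears to show, preg_match_all() can collect all of them in one pass. This is a sketch under that assumption; format_numbers is a made-up name for illustration.

```php
<?php
// Builds one "Names<TAB>0..." line per +234 number found anywhere in $text.
function format_numbers(string $text): string
{
    // $matches[1] holds the digits that follow each +234 prefix.
    preg_match_all('/\+234(\d+)/', $text, $matches);
    $lines = '';
    foreach ($matches[1] as $rest) {
        $lines .= "Names\t0" . $rest . "\n";
    }
    return $lines;
}

echo format_numbers('+2348089219281 +2348081231580 +2347088911847');
```

The result could be written in one call, for example with file_put_contents($new_file_name, format_numbers(file_get_contents($old_file_name))).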

Using tabs spacing in php for text file

I'm reading a CSV file and printing its data to 2 .txt files. The output in the text files is as follows:
John
Georgina,Sinclair,408999703657,cheque,"First National Bank",Fourways,275.00,12/01/2012
Toby,Henderson,401255489873,cheque,"First National Bank",Edenvale,181.03,12/13/2012
Here is my code:
$file_handle = fopen("debitorders.csv", "r") or die("can't open debitorders.csv");
$absaFile = fopen("ABSA.txt", "w") or die("can't open ABSA.txt");
$firstNationalBankFile = fopen("First National Bank.txt", "w") or die("can't open First National Bank.txt");
while (!feof($file_handle) ) {
    $debitorders = fgetcsv($file_handle, 1024, ",");
    if ($debitorders[4] == "ABSA"){
        print_r ($debitorders[4] . "<br />");
        fputcsv($absaFile, $debitorders);
        $ABSA_bank = "ABSA";
        fopen("ABSA.txt", "a");
        file_put_contents('ABSA.txt', $ABSA_bank, FILE_APPEND);
    }
    if ($debitorders[4] == "First National Bank"){
        $FNB_bank = "First National Bank";
        print_r ($debitorders[4] . "<br />");
        fputcsv($firstNationalBankFile, $debitorders);
        $FNB_bank = "First National Bank";
        fopen("First National Bank.txt", "a");
        file_put_contents('First National Bank.txt', $FNB_bank, FILE_APPEND);
    }
}
fclose($file_handle);
fclose($absaFile);
fclose($firstNationalBankFile);
How can I put tab spaces in the output file instead of commas, so that the output instead looks like this:
John
GeorginaSinclair 408999703657 cheque First National Bank Fourways 275.0012/01/2012
TobyHenderson 401255489873 cheque First National Bank Edenvale 181.0312/13/2012
Any help would be appreciated. Thank You
I did something similar a while back although it was the other way around. I replaced tabs with commas. Here is the code I used:
preg_replace('/[ ]{2,}|[\t]/', ',', $string);
The code above replaces each run of two or more spaces, or each tab, with one comma. For the reverse direction, maybe try it this way (note that it also replaces commas inside quoted fields):
preg_replace('/,/', "\t", $string);
As per the PHP manual:
int fputcsv ( resource $handle , array $fields [, string $delimiter = "," [, string $enclosure = '"' ]] )
So you can try
fputcsv($absaFile, $debitorders ,"\t");
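To see what the tab delimiter produces without touching the real files, the row can be written to an in-memory stream. This is a small sketch; to_tsv_line is a made-up helper name, and the enclosure and escape arguments are passed explicitly so the call does not depend on their defaults.

```php
<?php
// Formats one row with fputcsv() using a tab as the delimiter.
function to_tsv_line(array $fields): string
{
    $handle = fopen('php://memory', 'w+');
    fputcsv($handle, $fields, "\t", '"', '\\');
    rewind($handle);
    $line = stream_get_contents($handle);
    fclose($handle);
    return $line;
}

echo to_tsv_line(['Toby', 'Henderson', '401255489873', 'cheque', 'First National Bank', 'Edenvale']);
```

fputcsv() still wraps fields that contain spaces in quotes, just as "First National Bank" is quoted in the comma-separated output above. If the quotes are unwanted, implode("\t", $fields) . "\n" is a simpler option for data that contains no tabs or line breaks.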

Removing spaces from a text file using PHP

I need to add items from a text file to a MySQL database. To do that I'm trying to use PHP to read the file and insert everything into the database. The problem is that the text file has 2 items per line, separated by spaces, something like this:
teste.txt
ITEM1 ITEM2
ITEM323 ITEM4
ITEM54 ITEM6
ITEM34234 ITEM8
I'm trying to split on the spaces using explode, but because the number of spaces is random, I cannot do that. This is my code:
$handle = @fopen("teste.txt", "r"); //read line one by one
while (!feof($handle)) // Loop til end of file.
{
    $buffer = fgets($handle, 4096); // Read a line.
    list($a,$b)=explode(" ",$buffer);//Separate string by Spaces
    //values.=($a,$b);// save values and use insert query at last or
    echo $a;
    echo "<br>";
    echo $b . "<br>"; // NEVER echoes anything
}
What should I do?
You can add a step that collapses multiple spaces into one:
$buffer = preg_replace('/\s+/',' ',$buffer);
Right before exploding it:
$buffer = fgets($handle, 4096); // Read a line.
$buffer = preg_replace('/\s+/',' ',$buffer);
list($a,$b)=explode(" ",$buffer);//Separate string by Spaces
You can use preg_replace() to perform a regular-expression replacement:
$buffer = preg_replace("/\ +/", " ", fgets($handle, 4096));
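Another option, not taken from the answers above, is to split directly on runs of whitespace with preg_split(), which avoids the separate replace step. A minimal sketch, with split_items as a made-up name:

```php
<?php
// Splits a line into its items, whatever the amount of whitespace between them.
function split_items(string $line): array
{
    // trim() drops the trailing line break; PREG_SPLIT_NO_EMPTY skips empty pieces.
    return preg_split('/\s+/', trim($line), -1, PREG_SPLIT_NO_EMPTY);
}

print_r(split_items("ITEM323      ITEM4\n"));
```

Inside the loop this could replace the explode() call, for example: list($a, $b) = split_items($buffer); for lines that really contain two items.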

PHP: Opening .txt file, reading line by line, removing (YEAR)

I am looking for a suggestion on this:
I have a text file called movies.txt with about 900 lines, each containing one movie name. However, I would like to remove the year the movie was released using PHP (which I am new to).
The format is basically:
A Nous la Liberte (1932)
About Schmidt (2002)
Absence of Malice (1981)
Adam's Rib (1949)
Adaptation (2002)
The Adjuster (1991)
The Adventures of Robin Hood (1938)
Affliction (1998)
The African Queen (1952)
So I am looking for a way to open the text file, read it line by line and remove the (YEAR) values, along with the space before the (YEAR).
Then I would like to save it as newmovies.txt.
It would be great if you could show me and explain a solution that works for my needs. I am still very new to PHP (started a week ago), so it's all still magic to me.
You can read a file line by line using the file() function. Then foreach over the result and cut each line off at the opening parenthesis.
For example
foreach (file($fn) as $line) {
$output[] = strtok($line, "(");
}
You may need to trim the extra space and add the line breaks again.
A regex might be simpler, and it also asserts some structure instead of blindly cutting things off:
$text = file_get_contents($fn);
$text = preg_replace('/\s*\(\d+\)/m', '', $text);
# \s* is for spaces and \d+ is a placeholder for numbers
Then save that back.
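Put together, the regex approach might look like the sketch below. The file names movies.txt and newmovies.txt come from the question; strip_years is a made-up name, and the pattern assumes the year is always four digits in parentheses at the end of a line.

```php
<?php
// Removes a trailing " (YEAR)" from every line of $text.
function strip_years(string $text): string
{
    // [ \t]*    spaces or tabs before the parenthesis
    // \(\d{4}\) a four-digit year in parentheses
    // (?=\r?$)  only at the end of a line (the m modifier makes $ match per line)
    return preg_replace('/[ \t]*\(\d{4}\)(?=\r?$)/m', '', $text);
}

// Only touch the files when the input actually exists.
if (is_readable('movies.txt')) {
    file_put_contents('newmovies.txt', strip_years(file_get_contents('movies.txt')));
}
```

Because the year must sit at the end of a line, a title that contains parentheses elsewhere is left alone.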
<?php
$toWrite = "";
$handle = @fopen("input.txt", "r");
if ($handle) {
    while (($buffer = fgets($handle, 4096)) !== false) {
        // Remove the year and the space before it; $buffer keeps its own line break
        $toWrite .= preg_replace('/\s*\(\d+\)/', '', $buffer);
    }
    if (!feof($handle)) {
        echo "Error: unexpected fgets() fail\n";
    }
    fclose($handle);
    file_put_contents("input.txt", $toWrite);
}
?>
