How to "clean" a string read from file() [duplicate] - php

This is driving me crazy. I am trying to parse a CSV file and I'm seeing some very strange behavior.
Here is the CSV:
action;id;nom;sites;heures;jours
i;;"un nom a la con";1200|128;;1|1|1|1|1|1|1
Now the php code
$required_fields = array('id','nom','sites','heures','jours');
if (($handle = fopen($filename, "r")) !== FALSE)
{
    $cols = 0;
    while (($row = fgetcsv($handle, 1000, ";")) !== FALSE)
    {
        $row = array_map('trim',$row);
        // Identify headers
        if(!isset($headers))
        {
            $cols = count($row);
            for($i=0;$i<$cols;$i++) $headers[strtolower($row[$i])] = $i;
            foreach($required_fields as $val) if(!isset($headers[$val])) break 2;
            $headers = array_flip($headers);
            print_r($headers);
        }
        elseif(count($row) >= 4)
        {
            $temp = array();
            for($i=0;$i<$cols;$i++)
            {
                if(isset($headers[$i]))
                {
                    $temp[$headers[$i]] = $row[$i];
                }
            }
            print_r($temp);
            print_r($temp['action']);
            var_dump(array_key_exists('action',$temp));
            die();
        }
    }
}
And the output
Array
(
[0] => action
[1] => id
[2] => nom
[3] => sites
[4] => heures
[5] => jours
)
Array
(
[action] => i
[id] =>
[nom] => un nom a la con
[sites] => 1200|128
[heures] =>
[jours] => 1|1|1|1|1|1|1
)
<b>Notice</b>: Undefined index: action in <b>index.php</b> on line <b>110</b>
bool(false)
The key "action" exists in $temp, but $temp['action'] raises an Undefined index notice and array_key_exists returns false. I've tried a different key name, with the same result. There is absolutely no problem with the other keys.
What's wrong here?
PS: line 110 is the print_r($temp['action']);
EDIT 1
If I add another empty field at the beginning of each line of the CSV, action displays correctly:
;action;id;nom;sites;heures;jours
;i;;"un nom a la con";1200|128;;1|1|1|1|1|1|1

Probably there is some special character at the beginning of the first line and trim isn't removing it.
Try to remove every non-word character this way:
// Identify headers
if(!isset($headers))
{
    for($i=0;$i<$cols;$i++)
    {
        $headers[preg_replace("/[^\w\d]/","",strtolower($row[$i]))] = $i;
        ....
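Before stripping anything, it can help to confirm that hidden bytes really are present. The sketch below is only an illustration: it assumes the file was saved as UTF-8 with a byte order mark, and the sample string is made up rather than taken from the asker's file.

```php
<?php
// Hypothetical first header cell as fgetcsv would return it from a file
// saved as "UTF-8 with BOM": three invisible bytes, then the visible text.
$firstHeader = "\xEF\xBB\xBF" . "action";

// print_r() shows only "action", but a byte dump reveals the prefix.
echo bin2hex($firstHeader), "\n";   // efbbbf616374696f6e
echo strlen($firstHeader), "\n";    // 9, not 6

// trim() strips only whitespace and NUL bytes by default, so the BOM survives.
$trimmed = trim($firstHeader);
var_dump($trimmed === 'action');    // bool(false)

// Removing the BOM explicitly restores the expected key.
$clean = preg_replace('/^\xEF\xBB\xBF/', '', $firstHeader);
var_dump($clean === 'action');      // bool(true)
```

If the dump shows other bytes in front of the header, the same approach applies with a different pattern.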

If your CSV file is in UTF-8 encoding,
make sure that it's UTF-8 and not UTF-8-BOM.
(you can check that in Notepad++, Encoding menu)

I had the same problem with CSV files generated in MS Excel using UTF-8 encoding. Adding the following code to where you read the CSV solves the issue:
$handle = fopen($file, 'r');
// ...
$bom = pack('CCC', 0xef, 0xbb, 0xbf);
if (0 !== strcmp(fread($handle, 3), $bom)) {
    fseek($handle, 0);
}
// ...
This checks for the presence of a UTF-8 byte order mark (BOM). If there is one, the file pointer is left just past it; otherwise the pointer is moved back to the start of the file. It is not a generic solution, since there are other types of BOM, but you can adjust it as needed.
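As a rough idea of what "adjusting it" for other BOMs might look like, here is a minimal sketch. skipBom() is a hypothetical helper, not a built-in function, and the table covers only three common marks; an in-memory stream stands in for the real file.

```php
<?php
/**
 * Skip a byte order mark at the start of an open stream, if present.
 * Returns a label for the BOM found, or null when there is none.
 * skipBom() is a hypothetical helper, not part of PHP.
 */
function skipBom($handle): ?string
{
    // Only three common BOMs are listed; extend the table as needed.
    $boms = [
        'UTF-8'    => "\xEF\xBB\xBF",
        'UTF-16BE' => "\xFE\xFF",
        'UTF-16LE' => "\xFF\xFE",
    ];

    $start = ftell($handle);
    $head  = (string) fread($handle, 3);

    foreach ($boms as $name => $bom) {
        if (strncmp($head, $bom, strlen($bom)) === 0) {
            // Leave the pointer just after the BOM.
            fseek($handle, $start + strlen($bom));
            return $name;
        }
    }

    // No BOM: go back to where we started.
    fseek($handle, $start);
    return null;
}

// Usage with an in-memory stream standing in for the CSV file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "\xEF\xBB\xBFaction;id\ni;1\n");
rewind($handle);

$found   = skipBom($handle);
$headers = fgetcsv($handle, 0, ';', '"', '\\');
fclose($handle);

var_dump($found);   // string(5) "UTF-8"
print_r($headers);  // action, id
```

Note that detecting a UTF-16 BOM only tells you the encoding; the contents would still need converting to UTF-8 before fgetcsv can parse them sensibly.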

Sorry I am posting on an old thread, but thought my answer could add to ones already provided here...
I'm working with a Vagrant guest VM (Ubuntu 16.04) from a Windows 10 host. When I first came across this bug (in my case, seeding a database table using Laravel and a CSV file), @ojovirtual's answer immediately made sense, since there can be formatting issues between Windows and Linux.
@ojovirtual's answer didn't quite work for me, so I ended up doing touch new_csv_file.csv through Bash and pasting the contents of the 'problematic' CSV file (which was originally created on my Windows 10 host) into this newly created one. This definitely fixed my issue. It would have been good to learn and debug some more, but I just wanted to get my particular task completed.

I struggled with this issue for a few hours only to realize that the issue was being caused by a null key in the array. Please ensure that none of the keys has a null value.

I struggled with this issue until I realised that my chunk of code was being run twice.
On the first run the index was present and my array was printed out properly; on the second run the index was not present and the notice was triggered. That left me wondering "why is my obviously existing and properly printed array triggering an 'Undefined index' notice?" :)
Maybe this will help somebody.

Related

Why does the character "x" change to "Ã—" when placed into a .csv file after running my PHP script? [duplicate]

This question already has answers here:
UTF-8 all the way through
I wrote a PHP script that connects to a distributor's server, downloads several inventory files, and creates a massive .csv file to import into WooCommerce. Everything works except for one thing: when I look at the exported .csv file, the "x" character in my "caliber" column is always converted to the string "Ã—".
updateInventoryFunctions.php:
function fixCalibers($number, $attributesList) {
    $calibers = array (
        # More calibers...
        "9×23mm Winchester" => '9X23'
    );
    $pos = array_search($attributesList[$number], $calibers);
    if ($pos !== false) {
        $attributesList[$number] = $pos;
        return $attributesList[$number];
    } elseif ($attributesList[$number] == "40SW") {
        $attributesList[$number] = '.40 S&W';
        return $attributesList[$number];
    } # More conditionals...
}
updateInventory.php:
# Code that connects to the distributor's server, downloads files, and places headers into the .csv file.
if (($localHandle = fopen("current_rsr_inventory.csv", "w")) !== false) {
    # Code that defines arrays for future fixes and creates a multidimensional array of attributes...
    foreach ($tempInventoryFile as &$line) {
        $line = explode(";", $line);
        # Code that fixes several inconsistencies from the distributor...
        $matchingKey = array_search($line[0], $skuList);
        $attributesList = array();
        if ($matchingKey !== false) {
            # Code that fixes more inconsistencies...
            if ($attributesList[18] === "" || $attributesList[18] === null) {
                array_splice($attributesList, 18, 1);
                include_once "updateInventoryFunctions.php";
                $attributesList[17] = fixCalibers(17, $attributesList);
            } # More conditionals...
            # Code that fixes more inconsistencies...
            foreach ($attributesList as $attribute) {
                $line[] = $attribute;
            } // End foreach.
        } // End if.
        fputcsv($localHandle, $line);
    } // End foreach.
} // End if.
# Code that closes files and displays success message...
The caliber "9×23mm Winchester" is displayed as "9Ã—23mm Winchester" in the .csv file. I've tried placing single quotes around the array key and escaping the character "x". There are multiple instances of this mysterious switch.
Thanks in advance for any help!
This is an encoding issue. The character "×" is stored as UTF-8 but is being interpreted as ISO-8859-1. Specifying the output encoding as UTF-8, for example with header('Content-Type: text/html; charset=utf-8');, or manually selecting the encoding in your browser, will solve this issue.
"×" is U+00D7, which UTF-8 encodes as the two bytes C3 97, and byte C3 in ISO-8859-1 is A with tilde, "Ã".
Try putting this header at the top of your script:
header('Content-Type: text/html; charset=utf-8');
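To see the byte-level picture behind this answer, the following sketch reproduces the garbling on purpose. It assumes the mbstring extension is available; the choice of Windows-1252 as a second candidate encoding is my own addition, since many viewers use it where ISO-8859-1 is declared.

```php
<?php
// The multiplication sign U+00D7, written as its two UTF-8 bytes so the
// example does not depend on how this source file itself is saved.
$times = "\xC3\x97";

echo bin2hex($times), "\n"; // c397

// Misreading those two bytes as single-byte characters and re-encoding
// them as UTF-8 reproduces the garbling (requires the mbstring extension).
$asLatin1 = mb_convert_encoding($times, 'UTF-8', 'ISO-8859-1');
$asCp1252 = mb_convert_encoding($times, 'UTF-8', 'Windows-1252');

echo bin2hex($asLatin1), "\n"; // c383c297
echo bin2hex($asCp1252), "\n"; // c383e28094
```

In both cases the first byte, C3, turns into "Ã" (UTF-8 bytes c3 83); only the second character differs between the two single-byte encodings.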

php - for loop repeating itself / going out of sequence

I'm very new to PHP, making errors and learning as I go. Please be gentle! :)
I want to access some data from Blizzard.com's API. For this particular data set, there isn't a single block of JSON data; rather, each object has its own URL. I estimate that there are approximately 150000 objects, but I don't know the start or end points of the number range, so I have to start at 1 and work past the highest number I know (269065).
To get the data, I need to access each object's data via a JSON file, which I read, get the contents of, and drop into a text file (this could be written as an insert into a SQL db too, as I'm able to do this if it's the text file that's the issue). But to be honest, I would love to get to the bottom of why this is happening as much as anything!
I wasn't going to try to run ~250000 iterations in a for loop; I thought I'd try something I considered small: 2000.
The for loop starts with $a as 1, uses $a as part of the URL, loads and decodes the JSON, and checks whether the first field (ID) in the object is set. If it is, it writes a few fields to data.txt; if it isn't, it just writes $a to data.txt (so I know it's a null for other purposes not outlined here).
Simple! Or so I thought. After approximately 183 iterations, the data written to the text file goes awry, as seen in the quote below. It is out of sequence and starts at 1 again, then goes back to 184, ad nauseam. The script then seems to be stuck in some kind of infinite loop, outputting in a random order until I close the page 10-20 minutes later.
I have obviously made a big mistake, but I have no idea what I have done wrong to cause this. During my attempts I have rewritten the code with new variable names, so a new text does not conflict with code that could be running in memory.
I've tried resetting variables to blank at the end of the loop in case something was being reused that was causing a problem.
If anyone could point out any errors in my code, or suggest something for me to look into to handle bigger loops, that would be brilliant. I am assuming my issue may be a timeout or memory problem, but I don't know where to start and was hoping I'd find some suggestions here.
If it's relevant, I am using 000webhostapp.com as my host provider for now, until I get some paid-for hosting.
1 ... 182 183 1 184 2 3 185 4 186 5 187 6 188 7 189 190 8 191
for ($a = 1; $a <= 2000; $a++) {
    $json = "https://eu.api.battle.net/wow/recipe/".$a."?locale=en_GB&<MYPRIVATEAPIKEY>";
    $contents = file_get_contents($json);
    $data = json_decode($contents,true);
    if (isset($data['id'])) {
        $file = fopen("data.txt","a");
        fwrite($file,$data['id'].",'".$data['name']."'\n");
        fclose($file);
    } else {
        $file = fopen("data.txt","a");
        fwrite($file,$a."\n");
        fclose($file);
    }
}
The content of the file I'm trying to access is
{"id":33994,"name":"Precise Strikes","profession":"Enchanting","icon":"spell_holy_greaterheal"}
I scrapped the original plan and wrote this instead. Thank you again to everyone who took the time out of their day to help and offer suggestions!
$b = $mysqli->query("SELECT id FROM `static_recipes` order by id desc LIMIT 1;")->fetch_object()->id;
if (empty($b)) {$b=1;};
$count = $b+101;
$write = [];
for ($a = $b+1; $a < $count; $a++) {
    $json = "https://eu.api.battle.net/wow/recipe/".$a."?locale=en_GB&apikey=";
    // @ suppresses the warning raised when the request for this ID fails
    $contents = @file_get_contents($json);
    $data = json_decode($contents,true);
    if (isset($data['id'])) {
        $write [] = "(".$data['id'].",'".addslashes($data['name'])."','".addslashes($data['profession'])."','".addslashes($data['icon'])."')";
    } else {
        $write [] = "(".$a.",'a','a','a'".")";
    }
}
$SQL = ('INSERT INTO `static_recipes` (id, name, profession, icon) VALUES '.implode(',', $write));
$mysqli->query($SQL);
$mysqli->close();
$write = [];
for ($a = 1; $a <= 2000; $a++) {
    $json = "https://eu.api.battle.net/wow/".$a."?locale=en_GB&<MYPRIVATEAPIKEY>";
    $contents = file_get_contents($json);
    $data = json_decode($contents,true);
    if (isset($data['id'])) {
        $write [] = $data['id'].",'".$data['name']."'\n";
    } else {
        $write [] = $a."\n";
    }
}
$file = fopen("data.txt","a");
fwrite($file, implode('', $write));
fclose($file);
Also, why do you assume that the same ID isn't returned by several "https://eu.api.battle.net/wow/[N]" URLs?
Also, regarding "I wasn't going to try and run ~250000": if you do, think about curl_multi_init(): http://php.net/manual/en/function.curl-multi-init.php
I can't really see anything obviously wrong with your code, though I can't run it as I don't have the JSON.
It's possible that there is some kind of race condition, since you're opening and closing the same file hundreds of times very quickly.
File operations might seem atomic, but they are not necessarily so. Here's an interesting SO thread:
Does PHP wait for filesystem operations (like file_put_contents) to complete before moving on?
As others have suggested, maybe just open the file before you enter the loop, then close it when the loop finishes.
I'd try that first and see if it helps.
There's nothing in your original code that would cause that sort of behaviour. PHP will not arbitrarily change the value of a variable. You are opening this file in append mode, are you certain that you're not looking at old data? Maybe output some debug messages as you process the data. It's likely you'd run up against some rate limiting on the API server, so putting a pause in there somewhere may improve reliability.
The only substantive change I'd suggest to your code is opening the file once and closing it when you're done.
$file = fopen("data_1_2000.txt", "w");
for ($a = 1; $a <= 2000; $a++) {
    $json = "https://eu.api.battle.net/wow/recipe/$a?locale=en_GB&<MYPRIVATEAPIKEY>";
    $contents = file_get_contents($json);
    $data = json_decode($contents, true);
    if (!empty($data['id'])) {
        $data["name"] = str_replace("'", "\\'", $data["name"]);
        $record = "$data[id],'$data[name]'";
    } else {
        $record = $a;
    }
    fwrite($file, "$record\n");
    sleep(1);
    echo "$a "; if ($a % 50 === 0) echo "\n";
}
fclose($file);
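The point about append mode can be checked in isolation, without the API. The sketch below is an assumption-laden illustration: runOnce() is a hypothetical stand-in for one execution of the script, and the two calls are sequential, whereas overlapping requests on a real host would interleave their lines rather than simply follow each other.

```php
<?php
// Hypothetical helper: one "run" of a script that logs the numbers 1..3.
function runOnce(string $path, string $mode): void
{
    $file = fopen($path, $mode);
    for ($a = 1; $a <= 3; $a++) {
        fwrite($file, "$a\n");
    }
    fclose($file);
}

$path = tempnam(sys_get_temp_dir(), 'run');

// Two runs in append mode: the second run's output is added to the first.
runOnce($path, 'a');
runOnce($path, 'a');
$appended = file_get_contents($path);

// Two runs in write mode: each run truncates the file before writing.
runOnce($path, 'w');
runOnce($path, 'w');
$truncated = file_get_contents($path);

unlink($path);

echo trim(str_replace("\n", ' ', $appended)), "\n";  // 1 2 3 1 2 3
echo trim(str_replace("\n", ' ', $truncated)), "\n"; // 1 2 3
```

A file that restarts at 1 partway through is therefore consistent with a second run writing to the same file, not with the loop variable changing on its own.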

fgetcsv leaving quotes on first item

I am having problems reading a CSV file where the values are encapsulated in quotes.
The first line of my CSV file are headers and they look like the following:
"Header 1","Header 2","Header 3","Header 4","Header 5"
When using fgetcsv, the first header retains the surrounding quotes.
while (($row = fgetcsv($file, 6000, ',')) !== false)
{
    echo '<pre>';
    print_r($row);
    echo '</pre>';
    exit;
}
This outputs the following to the page
Array
(
[0] => "Header 1"
[1] => Header 2
[2] => Header 3
[3] => Header 4
[4] => Header 5
)
Does anyone have any advice on how to make sure the quotes are not included in the first array item?
Thanks
As Karsten Koop stated in a comment, it's probably due to a UTF-8 BOM. Since PHP does not handle this for you, you'll need to get rid of those bytes before opening the CSV file for reading.
For example, using a function like this:
public static function removeUtf8Bom($fileUri){
    $content = file_get_contents($fileUri);
    $content = str_replace("\xEF\xBB\xBF",'',$content);
    file_put_contents($fileUri,$content);
}
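Why the BOM leaves the quotes in place can be shown on a single line of text. This is a small sketch using str_getcsv on a made-up header line; the explanation in the comments (a field is only treated as quoted when it starts with the enclosure character) is my reading of the behaviour, not something stated in the thread.

```php
<?php
$line = "\"Header 1\",\"Header 2\"";
$bom  = "\xEF\xBB\xBF";
$raw  = $bom . $line;

// Without a BOM the enclosure is recognised and removed.
$plain = str_getcsv($line, ',', '"', '\\');

// With a BOM the first field no longer starts with a quote, so the
// parser does not treat it as an enclosed field.
$withBom = str_getcsv($raw, ',', '"', '\\');

// Stripping the BOM from the front of the line (and only there) fixes it.
$stripped = strncmp($raw, $bom, 3) === 0 ? substr($raw, 3) : $raw;
$fixed    = str_getcsv($stripped, ',', '"', '\\');

var_dump($plain[0]);                  // string(8) "Header 1"
var_dump($withBom[0] === 'Header 1'); // bool(false)
var_dump($fixed[0]);                  // string(8) "Header 1"
```

Unlike the str_replace approach, this removes the three bytes only at the very start of the data and does not rewrite the file on disk.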

How do I put a .csv file correctly into an array in PHP?

So I'd like to make a basic login/register page. I got a CSV file which roughly looks like this:
a, b
r,d
login, pass
I am already able to correctly add new combinations to the file. But if I want to put the CSV into an array so that I can check if the username/password combination is true, I only get the first row in the array, so [0] = "a" and [1] = "b". There are similar questions on this site on how to put a csv into an array, but with every solution this problem comes up. How do I get the other elements in the array, too?
Edit: as suggested, the code I used:
$database = fopen("database.csv", "r");
$data = fgetcsv($database, 1000, ",");
print_r($data);
This returns: Array ( [0] => q [1] => w )
Exact data:
q,w
g,h
o,p
t,y
c,d
o,p
o,p
a,b
Hope you can help me.
You can see from the documentation that fgetcsv returns just one line from the file pointer, and NULL or FALSE if it was unable to get another line.
You should put your code in a while loop to get all of the CSV rows.
$credentials = array();
$database = fopen("database.csv", "r");
while (is_array($data = fgetcsv($database, 1000, ','))) {
    $credentials[] = $data;
}
fclose($database);
var_dump($credentials); // This contains all of the credentials.
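Since the goal in the question is to check a username/password combination, the same loop can stop as soon as a match is found. The following is a minimal sketch: credentialsMatch() is a hypothetical helper, the two-column layout is taken from the sample data, and a temporary file stands in for database.csv.

```php
<?php
/**
 * Return true when $login/$pass matches a row of the CSV file.
 * credentialsMatch() is a hypothetical helper; the two-column layout
 * (login, password) is taken from the sample data in the question.
 */
function credentialsMatch(string $path, string $login, string $pass): bool
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return false;
    }

    $found = false;
    while (($row = fgetcsv($handle, 1000, ',', '"', '\\')) !== false) {
        // Skip blank lines, which fgetcsv reports as a single null field.
        if (count($row) < 2) {
            continue;
        }
        // trim() tolerates rows written as "a, b" with a space after the comma.
        if (trim($row[0]) === $login && trim($row[1]) === $pass) {
            $found = true;
            break;
        }
    }
    fclose($handle);

    return $found;
}

// Usage with a temporary file standing in for database.csv.
$path = tempnam(sys_get_temp_dir(), 'db');
file_put_contents($path, "q,w\ng,h\n\no,p\nt,y\n");

$good = credentialsMatch($path, 'g', 'h');
$bad  = credentialsMatch($path, 'g', 'w');
unlink($path);

var_dump($good); // bool(true)
var_dump($bad);  // bool(false)
```

For anything beyond an exercise, the stored value would normally be the output of password_hash() and the comparison would use password_verify() instead of a plain string match.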

