CSV encoding in PHP

Here's the problem: I need to post a .csv file from one server to another.
I do this by reading the contents of the .csv file and sending them with cURL as POST data.
This works without problems.
But when I try to parse the data and store it in a database table, the trouble begins.
I have all the values in an array; if I print this array it displays correctly.
But if I echo a single value from that array I get all kinds of weird characters.
My best guess is that it has something to do with the encoding of the CSV file, but I have no idea how to fix that.
Here's the function I use to parse the CSV data:
public function parseCsv($data)
{
$quote = '"';
$newline = "\n";
$seperator = ';';
$dbQuote = $quote . $quote;
// Clean up file
$data = trim($data);
$data = str_replace("\r\n", $newline, $data);
$data = str_replace($dbQuote,'"', $data);
$data = str_replace(',",', ',,', $data);
$data .= $seperator;
$inquotes = false;
$startPoint = $row = $cellNo = 0;
for($i=0; $i<strlen($data); $i++) {
$char = $data[$i];
if ($char == $quote) {
if ($inquotes) $inquotes = false;
else $inquotes = true;
}
if (($char == $seperator or $char == $newline) and !$inquotes) {
$cell = substr($data,$startPoint,$i-$startPoint);
$cell = str_replace($quote,'',$cell);
$cell = str_replace('"',$quote,$cell);
$result[$row][$this->csvMap[$cellNo]] = $this->_parseValue($cellNo, $cell);
++$cellNo;
$startPoint = $i + 1;
if ($char == $newline) {
$cellNo = 0;
++$row;
}
}
}
return $result;
}
Any help is appreciated!
EDIT:
OK, after some more trial and error I found out that only the very first value of the first row has some extra characters. If I echo that value, everything I output after it gets messed up.
So I tried to change the encoding. Now if I echo the value it's all good, but I have a new problem: it's a string and I need an int:
echo $val; // output: 7655, but messes up everything output after it
$val = mb_convert_encoding($val, "UTF-8");
echo $val; // output: 7655
echo intval($val); // output: 0
EDIT:
expected output:
7655Array ( [kenmerk] => ÿþ7655 [status] => 205 [status_date] => 1991-12-30 [dob] => 1936-09-04 ) succes
messed up output
7655牁慲੹ਨ††歛湥敭歲⁝㸽@㟾㘀㔀㔀਀††獛慴畴嵳㴠‾㈀㤀㔀਀††獛慴畴彳慤整⁝㸽 201ⴱ㄀㈀ⴀ30 †嬠潤嵢㴠‾㄀㤀㘀㘀-08〭㐀਀਩畳捣獥
I first echo the element 'kenmerk', then I print the array.
As you can see, the element 'kenmerk' in the array has some extra characters.
Converting the data to UTF-8 like so:
$data = mb_convert_encoding($data, "UTF-8");
eliminates the messed-up output and removes the 'ÿþ' (an incorrectly interpreted BOM?), but I still can't convert the values to an int.
EDIT:
OK, I sort of found a solution,
but as I have no idea why it works I'd appreciate any info:
var_dump((int) $val); // output: 0
var_dump((int) strip_tags($val)); // output: 7655

You need to remove ÿþ from ÿþ7655. intval() and an (int) cast ($val = (int) $val;) return 0 when the string does not start with a number, because they only read the leading numeric characters. For example, 765ÿþ5 returns 765.
Regarding your first problem, I also recommend reading this answer: PHP messing with HTML Charset Encoding
I hope that it will give you more clarity about what you are struggling with.
I would also make your stripping process more robust, so that it matches, for example, 7655 instead of ÿþ7655.
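The two characters ÿþ are the bytes FF FE, which is the byte order mark of a UTF-16 little-endian file. A minimal sketch of handling that case follows. It assumes the posted file really is UTF-16LE (the question does not confirm this), and decodeUtf16Csv() is a hypothetical helper name, not part of the original code:

```php
<?php
// Hypothetical helper: decode a payload that is ASSUMED to be UTF-16LE
// (guessed from the "ÿþ" = FF FE byte order mark) into UTF-8.
function decodeUtf16Csv($raw)
{
    // Drop the 2-byte UTF-16LE byte order mark if it is present.
    if (substr($raw, 0, 2) === "\xFF\xFE") {
        $raw = substr($raw, 2);
    }
    // Name the source encoding explicitly instead of relying on the default.
    return mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE');
}

// Simulated input: BOM followed by "7655;205" encoded as UTF-16LE.
$raw = "\xFF\xFE" . mb_convert_encoding('7655;205', 'UTF-16LE', 'UTF-8');

$data    = decodeUtf16Csv($raw);
$fields  = explode(';', $data);
$kenmerk = (int) $fields[0]; // the cast now sees plain digits
```

If the decoding is done once on the whole payload before parseCsv() runs, the individual cells should no longer need any per-value clean-up.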

Related

Warning: substr_count(): Empty substring

I'm suddenly getting an awful lot of errors saying "Empty substring" referring to line 8
$score3 = substr_count($name_only, $text);
I have no idea what the issue is; this is a search function. Is it caused by an empty submission in the search box?
I thought it might be, so I made changes with JS and HTML so that it isn't possible to submit the search form blank or with just whitespace, but the error continues.
This is my PHP. Does anything stand out as the source of the issue to anyone with better knowledge than I have?
function search_now($images){
global $text, $scores, $scores2, $scores3, $do_search;
$images2 = array();
foreach ($images as $key => $value) {
$name_only = $value['name'];
similar_text($text, $name_only, $score);
$score2 = substr_compare($name_only, $text, 0);
$score3 = substr_count($name_only, $text);
if($do_search){
if($score<20)
continue;
}
$images2[$key] = $value;
$images2[$key]['score'] = $score;
$images2[$key]['score2'] = $score2;
$images2[$key]['score3'] = $score3;
//$scores[$key] = ($do_search)? $score : $key;
$scores[$key] = $score;
$scores2[$key] = $score2;
$scores3[$key] = $score3;
}
return $images2;
}
That error message is triggered when the second argument for substr_count() is an empty string. If you dump $text I imagine you will find it's an empty string.
Not sure how your snippet relates to the rest of your code but you could include a check in your function...
if ($text == '') {
// handle scoring differently
}
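One way to make that check reusable is a small wrapper. This is only a sketch: safe_substr_count() is a hypothetical name, and treating a blank or whitespace-only search term as "zero matches" is an assumption about what the search should do. Client-side validation alone cannot guarantee that $text is non-empty, so the check belongs on the server as well.

```php
<?php
// Hypothetical wrapper: count occurrences, treating an empty or
// whitespace-only search term as "no matches" instead of an error.
function safe_substr_count($haystack, $needle)
{
    $needle = trim((string) $needle);
    if ($needle === '') {
        return 0; // substr_count() refuses an empty needle
    }
    return substr_count($haystack, $needle);
}

$score3 = safe_substr_count('banana', 'an');
```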

How to normalise CSV content in PHP?

Problem:
I'm looking for a PHP function to easily and efficiently normalise CSV content in a string (not in a file). I have made a function for that and provide it in an answer, because it is a possible solution. Unfortunately it doesn't work when the separator is included in incoming string values.
Can anyone provide a better solution?
Why not use fputcsv / fgetcsv?
Because:
it requires at least PHP 5.1.0 (which is sometimes not available)
it can only read from files, not from a string, and sometimes the input is not a file (e.g. if you fetch the CSV from an email)
putting the content into a temporary file might be impossible due to security policies
Why / what kind of normalisation?
Normalise so that the encloser encloses every field, because the encloser can be optional and can differ per line and per field. This can happen when implementing unclean or incomplete specifications and/or when using CSV content from different sources, programs or developers.
Example function call:
$csvContent = "'a a',\"b\",c,1, 2 ,3 \n a a,'bb',cc, 1, 2, 3 ";
echo "BEFORE:\n$csvContent\n";
normaliseCSV($csvContent);
echo "AFTER:\n$csvContent\n";
Output:
BEFORE:
'a a',"b",c,1, 2 ,3
a a,'bb',cc, 1, 2, 3
AFTER:
"a a","b","c","1","2","3"
"a a","bb","cc","1","2","3"
To specifically address your concern regarding f*csv working only with files:
Since PHP 5.3 there's str_getcsv.
For at least PHP >= 5.1 (and I really hope that's the oldest you'll have to deal with these days), you can use stream wrappers:
$buffer = fopen('php://memory', 'r+');
fwrite($buffer, $string);
rewind($buffer);
fgetcsv($buffer) ..
Or obviously the reverse if you want to use fputcsv.
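Put together, the stream approach could look roughly like the sketch below. normaliseCsvViaStream() is a hypothetical name, and the sketch assumes a single enclosure character (it does not handle the mixed ' and " quoting from the question). The explicit escape argument needs PHP 5.3 or later; drop it on older versions.

```php
<?php
// Hypothetical helper: re-quote every field of a CSV string by reading it
// through an in-memory stream, so no temporary file is needed.
function normaliseCsvViaStream($csv, $delimiter = ',', $enclosure = '"')
{
    $in = fopen('php://memory', 'r+');
    fwrite($in, $csv);
    rewind($in);

    $lines = array();
    while (($row = fgetcsv($in, 0, $delimiter, $enclosure, '\\')) !== false) {
        if ($row === array(null)) {
            continue; // fgetcsv() returns array(null) for a blank line
        }
        $cells = array();
        foreach ($row as $field) {
            // Double any embedded enclosure, then wrap the trimmed field.
            $escaped = str_replace($enclosure, $enclosure . $enclosure, trim($field));
            $cells[] = $enclosure . $escaped . $enclosure;
        }
        $lines[] = implode($delimiter, $cells);
    }
    fclose($in);

    return implode("\n", $lines);
}

$normalised = normaliseCsvViaStream("a,\"d,e\", f \n1,2,3");
```

Because fgetcsv() does the splitting, a separator inside an enclosed field (the "d,e" above) stays in one field.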
This is a possible solution. But it doesn't consider the case that the separator (,) might be included in incoming strings.
function normaliseCSV(&$csv,$lineseperator = "\n", $fieldseperator = ',', $encloser = '"')
{
$csvArray = explode ($lineseperator,$csv);
foreach ($csvArray as &$line)
{
$lineArray = explode ($fieldseperator,$line);
foreach ($lineArray as &$field)
{
$field = $encloser.trim($field,"\0\t\n\x0B\r \"'").$encloser;
}
$line = implode ($fieldseperator,$lineArray);
}
$csv = implode ($lineseperator,$csvArray);
}
It is a simple chain of explode -> explode -> trim -> implode -> implode .
Although I agree with @deceze that you could expect at least 5.1 these days, I'm sure there are some internal company servers somewhere that nobody wants to update.
I altered your method to be able to use field and line separators between double quotes, or in your case the $encloser value.
<?php
/*
In regards to the specs on http://tools.ietf.org/html/rfc4180 I use the following rules:
- "Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes."
- "If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote."
Exception:
Even though the specs says use double quotes, I 'm using your $encloser variable
*/
echo normaliseCSV('a,b,\'c\',"d,e","f","g""h""i","""j"""' . "\n" . "\"k\nl\nm\"");
function normaliseCSV($csv,$lineseperator = "\n", $fieldseperator = ',', $encloser = '"')
{
//We need 3 temporary replacement values:
//escaped (doubled/tripled) quotes, line separator, field separator
$keys = array();
while (count($keys)<3) {
$tmp = "##".md5(rand().rand().microtime())."##";
if (strpos($csv, $tmp)===false) {
$keys[] = $tmp;
}
}
//first we exchange "" (double $encloser) and """ to make sure its not exploded
$csv = str_replace($encloser.$encloser.$encloser, $keys[0], $csv);
$csv = str_replace($encloser.$encloser, $keys[0], $csv);
//Explode on $encloser
//Every odd index is within quotes
//Exchange line and field seperators for something not used.
$content = explode($encloser,$csv);
$len = count($content);
if ($len>1) {
for ($x=1;$x<$len;$x=$x+2) {
$content[$x] = str_replace($lineseperator,$keys[1], $content[$x]);
$content[$x] = str_replace($fieldseperator,$keys[2], $content[$x]);
}
}
$csv = implode('',$content);
$csvArray = explode ($lineseperator,$csv);
foreach ($csvArray as &$line)
{
$lineArray = explode ($fieldseperator,$line);
foreach ($lineArray as &$field)
{
$val = trim($field,"\0\t\n\x0B\r '");
//put back the exchanged values
$val = str_replace($keys[0],$encloser.$encloser,$val);
$val = str_replace($keys[1],$lineseperator,$val);
$val = str_replace($keys[2],$fieldseperator,$val);
$val = $encloser.$val.$encloser;
$field = $val;
}
$line = implode ($fieldseperator,$lineArray);
}
$csv = implode ($lineseperator,$csvArray);
return $csv;
}
?>
Output would be:
"a","b","c","d,e","f","g""h""i","""j"""
"k
l
m"
Codepad example
When I first read this question I wasn't sure whether it should be solved at all, since pre-5.1 environments should have died out long ago. Despite that, it is a tough question, so we should think about which approach to take; my guess is that it should be a char-by-char examination.
I have separated the logic into three main scenarios:
A: CHAR is a separator
B: CHAR is a quotation mark
C: CHAR is a value
The result is this class (with logging included):
<?php
Class CSVParser
{
#basic requirements
public $input;
public $separator;
public $currentQuote;
public $insideQuote;
public $result;
public $field;
public $quotation = array();
public $parsedArray = array();
# for logging purposes only
public $logging = TRUE;
public $log = array();
function __construct($input, $separator, $quotation=array())
{
$this->separator = $separator;
$this->input = $input;
$this->quotation = $quotation;
}
/**
* The main idea is to go through the input string char by char;
* when a complete field is detected it is quoted accordingly and added to an array
*/
public function parse()
{
for($i = 0; $i < strlen($this->input); $i++){
$this->processStream($i);
}
foreach($this->parsedArray as $value)
{
if(!is_null($value))
$this->result .= '"'.addslashes($value).'",';
}
return rtrim($this->result, ',');
}
private function processStream($i)
{
#A case (its a separator)
if($this->input[$i]===$this->separator){
$this->log("A", $this->input[$i]);
if($this->insideQuote){
$this->field .= $this->input[$i];
}else
{
$this->saveField($this->field);
$this->field = NULL;
}
}
#B case (it's a quote)
if(in_array($this->input[$i], $this->quotation)){
$this->log("B", $this->input[$i]);
if(!$this->insideQuote){
$this->insideQuote = TRUE;
$this->currentQuote = $this->input[$i];
}
else{
if($this->currentQuote===$this->input[$i]){
$this->insideQuote = FALSE;
$this->currentQuote ='';
$this->saveField($this->field);
$this->field = NULL;
}else{
$this->field .= $this->input[$i];
}
}
}
#C case (its a value :-) )
if(!in_array($this->input[$i], array_merge(array($this->separator), $this->quotation))){
$this->log("C", $this->input[$i]);
$this->field .= $this->input[$i];
}
}
private function saveField($field)
{
$this->parsedArray[] = $field;
}
private function log($type, $value)
{
if($this->logging){
$this->log[] = "CASE ".$type." WITH ".$value." AS VALUE";
}
}
}
and example of how to use it would be:
$original = 'a,"ab",\'ab\'';
$test = new CSVParser($original, ',', array('"', "'"));
echo "<PRE>ORIGINAL: ".$original."</PRE>";
echo "<PRE>PARSED: ".$test->parse()."</PRE>";
echo "<pre>";
print_r($test->log);
echo "</pre>";
and here are the results:
ORIGINAL: a,"ab",'ab'
PARSED: "a","ab","ab"
Array
(
[0] => CASE C WITH a AS VALUE
[1] => CASE A WITH , AS VALUE
[2] => CASE B WITH " AS VALUE
[3] => CASE C WITH a AS VALUE
[4] => CASE C WITH b AS VALUE
[5] => CASE B WITH " AS VALUE
[6] => CASE A WITH , AS VALUE
[7] => CASE B WITH ' AS VALUE
[8] => CASE C WITH a AS VALUE
[9] => CASE C WITH b AS VALUE
[10] => CASE B WITH ' AS VALUE
)
I might have made mistakes since I only spent 25 minutes on it, so any comments are appreciated and I will edit accordingly.

Convert CSV to JSON using PHP

I am trying to convert CSV file to JSON using PHP.
Here is my code
<?php
date_default_timezone_set('UTC');
$today = date("n_j"); // Today is 1/23/2015 -> $today = 1_23
$file_name = $today.'.CSV'; // My file name is 1_23.csv
$file_path = 'C:\\Users\\bheng\\Desktop\\qb\\'.$file_name;
$file_handle = fopen($file_path, "r");
$result = array();
if ($file_handle !== FALSE) {
$column_headers = fgetcsv($file_handle);
foreach($column_headers as $header) {
$result[$header] = array();
}
while (($data = fgetcsv($file_handle)) !== FALSE) {
$i = 0;
foreach($result as &$column) {
$column[] = $data[$i++];
}
}
fclose($file_handle);
}
// print_r($result); // I see all data(s) except the header
$json = json_encode($result);
echo $json;
?>
print_r($result); // I see all data(s)
Then I call json_encode($result); and try to display it, but nothing shows on the screen at all. All I see is a blank screen and no error message.
Am I doing anything wrong? Can someone help me?
Added Result of print_r($result);
Array (
[Inventory] => Array (
[0] => bs-0468R(20ug)
[1] => bs-1338R(1ml)
[2] => bs-1557G(no bsa)
[3] => bs-3295R(no BSA)
[4] => bs-0730R-Cy5"
[5] => bs-3889R-PE-Cy7"
[6] => 11033R
[7] => 1554R-A647
[8] => 4667
[9] => ABIN731018
[10] => Anti-DBNL protein
.... more ....
Try like this:
$file="1_23.csv";
$csv= file_get_contents($file);
$array = array_map("str_getcsv", explode("\n", $csv));
$json = json_encode($array);
print_r($json);
data.csv
Game,Skill
Treasure Hunter,pilipala
Rocket Launcher,bibobibo
Rocket Engine,hehehohoho
To convert with column name, this is how I do it.
csv2json.php
<?php
if (($handle = fopen("data.csv", "r")) !== FALSE) {
$csvs = [];
while(! feof($handle)) {
$csvs[] = fgetcsv($handle);
}
$datas = [];
$column_names = [];
foreach ($csvs[0] as $single_csv) {
$column_names[] = $single_csv;
}
foreach ($csvs as $key => $csv) {
if ($key === 0) {
continue;
}
foreach ($column_names as $column_key => $column_name) {
$datas[$key-1][$column_name] = $csv[$column_key];
}
}
$json = json_encode($datas);
fclose($handle);
print_r($json);
}
The output result
[
{
"Game": "Treasure Hunter",
"Skill": "pilipala"
},
{
"Game": "Rocket Launcher",
"Skill": "bibobibo"
},
{
"Game": "Rocket Engine",
"Skill": "hehehohoho"
}
]
You can try this way too.
<?php
function csvtojson($file,$delimiter)
{
if (($handle = fopen($file, "r")) === false)
{
die("can't open the file.");
}
$csv_headers = fgetcsv($handle, 4000, $delimiter);
$csv_json = array();
while ($row = fgetcsv($handle, 4000, $delimiter))
{
$csv_json[] = array_combine($csv_headers, $row);
}
fclose($handle);
return json_encode($csv_json);
}
$jsonresult = csvtojson("./doc.csv", ",");
echo $jsonresult;
I ran into a similar problem, I ended up using this to recursively convert the data to UTF-8 on an array before encoding to JSON.
function utf8_converter($array)
{
array_walk_recursive($array, function(&$item, $key){
if(!mb_detect_encoding($item, 'utf-8', true)){
$item = utf8_encode($item);
}
});
return $array;
}
From:
http://nazcalabs.com/blog/convert-php-array-to-utf8-recursively/
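One likely reason for a blank screen with no error is that json_encode() returns false when the array contains bytes that are not valid UTF-8, and echo false prints nothing. The sketch below checks the return value and converts the data before retrying. to_utf8_recursive() is a hypothetical helper, and ISO-8859-1 as the source encoding is an assumption (it is the same one utf8_encode() makes); mb_convert_encoding() is used because utf8_encode() is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: convert every string in a nested array to UTF-8.
// ASSUMPTION: strings that are not valid UTF-8 are ISO-8859-1.
function to_utf8_recursive(array $data, $from = 'ISO-8859-1')
{
    array_walk_recursive($data, function (&$item) use ($from) {
        if (is_string($item) && !mb_check_encoding($item, 'UTF-8')) {
            $item = mb_convert_encoding($item, 'UTF-8', $from);
        }
    });
    return $data;
}

// "\xE9" is "é" in ISO-8859-1 and an invalid byte sequence in UTF-8.
$rows = array('Inventory' => array("caf\xE9", 'bs-0468R(20ug)'));

$json  = json_encode($rows);
$error = null;
if ($json === false) {
    // Keep the reason so it can be logged instead of silently printing nothing.
    $error = json_last_error_msg();
    $json  = json_encode(to_utf8_recursive($rows));
}
echo $json;
```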
This issue is pretty old by now, but hoping this helps someone, as it seemed like the simplest example I found, and I know this is a pretty common thing devs might need to do as a beginner, and lots of answers gloss over the magic.
$file = storage_path('app/public/waitlist_users_test.csv'); //--> laravel helper, but you can use any path here
function csv_to_json($file)
{
// file() loads each line of the file as an array value, then array_map uses the 'str_getcsv' callback to parse each line into an array of fields
$csv = array_map('str_getcsv', file($file));
// array_walk - "walks" through each item of the array and applies the call back function. the & in "&row" means that alterations to $row actually change the original $csv array, rather than treating it as immutable (*sort of immutable...)
array_walk($csv, function(&$row) use ($csv) {
// array_combine takes the header row ($csv[0]) and uses it as array keys for each column in the row
$row = array_combine($csv[0], $row);
});
array_shift($csv); # removes now very redundant column header --> contains {'col_1':'col_1', 'col_2':'col_2'...}
$json = json_encode($csv);
return $json;
}
There's a lot of magic going on with these functions that accept callbacks, which didn't seem to be explained thoroughly above. I'm self-taught and have been programming for years, and I find that how callbacks work is often glossed over, so I'll dive in a little for the array_map('str_getcsv', file($file)) call. If you pass the name of a function you've written, or of a built-in PHP function, as a string, the calling function (in this case array_map) takes each element it is evaluating (in this case, each element of the array) and passes it to the callback, without you having to pass a variable explicitly. It's super helpful once you get the hang of it, but because it's rarely explained, beginners often know that it works without understanding why.
For more information, see the PHP manual pages for str_getcsv, array_walk, array_map and callables/callbacks.
As @MoonCactus noted, the file() function reads the file directly into an array of lines, which saves memory compared with reading it into one string and then exploding it.
Also, some other posts use explode(). Why not use explode() instead of str_getcsv() to parse rows? Because explode() does not correctly handle enclosed parts of a string or escaped characters.
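A small illustration of the difference, using a made-up row whose second field contains the delimiter inside quotes:

```php
<?php
// Illustrative row (not from the original data): the quoted field contains a comma.
$line = 'bs-0468R,"Anti-DBNL, protein",20';

// explode() knows nothing about quoting and cuts the quoted field in two.
$naive = explode(',', $line);

// str_getcsv() respects the enclosure and returns three fields.
$parsed = str_getcsv($line, ',', '"', '\\');

print_r($naive);
print_r($parsed);
```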
Hope somebody finds this helpful!
If you are converting a dynamic CSV file, you can pass the URL through a parameter (url=http://example.com/some.csv) and it will show you the most up-to-date version:
<?php
// Lets the browser and tools such as Postman know it's JSON
header( "Content-Type: application/json" );
// Get CSV source through the 'url' parameter
if ( isset( $_GET['url'] ) ) {
$csv = explode( "\n", file_get_contents( $_GET['url'] ) );
$index = str_getcsv( array_shift( $csv ) );
$json = array_map(
function ( $e ) use ( $index ) {
return array_combine( $index, str_getcsv( $e ) );
}, $csv
);
}
else {
$json = "Please set the path to your CSV by using the '?url=' query string.";
}
// Output JSON
echo json_encode( $json );
Alternate solution that uses a similar method to @Whirlwind's solution but returns a more standard JSON result (with named fields for each object/record):
// takes a string of CSV data and returns a JSON representing an array of objects (one object per row)
function convert_csv_to_json($csv_data){
$flat_array = array_map("str_getcsv", explode("\n", $csv_data));
// take the first array item to use for the final object's property labels
$columns = $flat_array[0];
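$obj = array();
// the -1 skips the last row, assuming the data ends with a newline
for ($i=1; $i<count($flat_array)-1; $i++){
// create the row object first; assigning a property to an undefined value is an error in PHP 8
$obj[$i] = new stdClass();
foreach ($columns as $column_index => $column){
$obj[$i]->$column = $flat_array[$i][$column_index];
}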
}
$json = json_encode($obj);
return $json; // or just return $obj if that's a more useful return value
}
The accepted answer uses file_get_contents() to read the entire file as a string in memory, and then explode() it to make it an array.
But it can be made faster, smaller in memory, and more useful:
function ReadCsv($fn)
{
$lines= file($fn); // read file directly as an array of lines
array_pop($lines); // you can remove the last empty line (if required)
$json= json_encode(array_map("str_getcsv", $lines), JSON_NUMERIC_CHECK);
print_r($json);
}
Nb: I used JSON_NUMERIC_CHECK here to avoid numbers being double quoted into strings. It also reduces the output size and it usually helps javascript on the other side (e.g. to compute or plot the data). Beware of phone numbers though!
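The phone number caveat can be seen in a short sketch with made-up values: any string that looks numeric is converted, so a leading zero is lost.

```php
<?php
// Illustrative data: 'qty' should become a number, 'phone' should not.
$row = array('qty' => '12', 'phone' => '0612345678');

$checked = json_encode($row, JSON_NUMERIC_CHECK); // numeric-looking strings become numbers
$plain   = json_encode($row);                     // everything stays a string

echo $checked, "\n", $plain, "\n";
```

If some columns must stay strings, one option is to encode without the flag and cast only the columns that are known to be numeric.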
I liked @ian-d-miller's solution for converting the data into a key / value style format, but I kept running into issues with his code.
Here's what worked for me:
function convert_CSV_to_JSON($csv_data){
// convert csv data to an array
$data = array_map("str_getcsv", explode("\n", $csv_data));
// use the first row as column headers
$columns = $data[0];
// create array to hold our converted data
$json = [];
// iterate through each row in the data
foreach ($data as $row_index => $row_data) {
// skip the first row, since it's the headers
if($row_index === 0) continue;
// make sure we establish each new row as an array
$json[$row_index] = [];
// iterate through each column in the row
foreach ($row_data as $column_index => $column_value) {
// get the key for each entry
$label = $columns[$column_index];
// add this column's value to this row's index / column's key
$json[$row_index][$label] = $column_value;
}
}
// bam
return $json;
}
Usage:
// as is
$json = convert_CSV_to_JSON($csv);
// encoded
$json = json_encode($json);
Something that i've made for myself and may be useful for others :)
This will convert CSV into JSON array with objects (key => value pair).
function csv2json($a, $e = true) {
$b = ["\r\n","\r","\n",];
foreach ($b as $c => $d) {
$a = explode($d, $a);
$a = isset($b[$c + 1]) ? implode($b[$c + 1], $a) : implode(PHP_EOL, $a);
}
// Convert to CSV
$a = array_map("str_getcsv", explode(PHP_EOL, $a));
// Get the first part of the array as the keys
$a = [
"keys" => array_shift($a),
"rows" => $a,
"row" => null,
];
// Define JSON
$b = [];
foreach ($a["rows"] as $a["row"]) {
$a["row"] = [ "csv" => $a["row"], "json" => (object)[], ];
for ($c = 0; $c < count($a["row"]["csv"]); $c++) {
$a["row"]["csv"][$c] = [@json_decode($a["row"]["csv"][$c]),$a["row"]["csv"][$c]];
// Switch from string to booleans, numbers and others
$a["row"]["csv"][$c] = isset($a["row"]["csv"][$c][0]) ? $a["row"]["csv"][$c][0] : $a["row"]["csv"][$c][1];
// Push it back
$a["row"]["json"]->{$a["keys"][$c]} = $a["row"]["csv"][$c];
}
$a["row"] = $a["row"]["json"];
$b[] = $a["row"];
unset($a["row"]);
}
// $e will be "return"
$e = $e ? json_encode($b) : $b;
// Unset useless variables
unset($a, $b, $c, $d);
return $e;
}
How to use it?
If you want to return the JSON as a string, leave the second parameter at its default.
If you want to return the data as an array of objects, set the second parameter to false.
Examples:
$csv = "name,age,gender
John Doe,35,male
Jane Doe,32,female";
echo csv2json($csv, true); // Or without the second parameter, just csv2json($csv)
The example above returns a JSON string like this:
[{"name":"John Doe","age":35,"gender":"male"},{"name":"Jane Doe","age":32,"gender":"female"}]
and the example below:
var_dump(csv2json($csv, false));
will return a JSON array with these objects:
array(2) {
[0]=>
object(stdClass)#1 (3) {
["name"]=>
string(8) "John Doe"
["age"]=>
int(35)
["gender"]=>
string(4) "male"
}
[1]=>
object(stdClass)#2 (3) {
["name"]=>
string(8) "Jane Doe"
["age"]=>
int(32)
["gender"]=>
string(6) "female"
}
}
public function CsvToJson($fileContent){
//Convert CSV To Json and Return
$all_rows = array();
$newhead =array();
//Extract csv data to array on \n
$array = explode("\n",$fileContent);
//Extract csv header to array on 0 Index
$header = explode(",",$array[0]);
//Remove Header Row From Main Data Array
array_shift($array);
//Extract All Arrays To Saperate Orders
foreach($array as $arr){
$sliced = explode(",",$arr);
array_push($all_rows,$sliced);
}
//Rows are already split into fields above, so no second pass over $all_rows is needed
//Remove \r From Header Elements
foreach($header as $key=>$value){
$sliced = str_replace ("\r", "", $value);
array_push($newhead,$sliced);
}
//COMBINE Header as KEY And Row Element As Value
$arrrr = array();
foreach($all_rows as $row) {
//Remove Last Element of ROW if it is \r (Break given in csv file for next row)
$count= count($row);
if ($row[$count-1] == "\r") {
array_splice($row, count($row) - 1, 1);
}
//CHECK IF HADER COUNT == ROW COUNT
if (count($header) == count($row)) {
array_push($arrrr,array_combine($newhead,$row));
}
}
//CONVERT ARRAY TO JSON
$json = json_encode($arrrr);
//Remove backslasesh from json key and and value to remove \r
$clean = stripslashes($json);
//CONVERT ARRAY TO JSON AGAIN FOR EXPORT
$jsonagain = json_encode($clean);
return $jsonagain;
}

php json_encode big array

I am trying to use json_encode on a big array, and the result returns nothing (yes, I checked that it is UTF-8). When I started to investigate this issue I found that the problem arises when the string becomes longer than 65536 characters.
So when my array has 1245 elements, the string from json_encode has a length of 65493 (string(65493)), but when I add just one more element the string becomes longer than 65536 and json_encode fails to output anything.
I thought that the problem is because of memory limit, but when I checked my php.ini I see that it is -1.
Any idea what can be a problem?
Basically I am doing something like this:
$arr = array();
for($i =0; $i<9000; $i++){
$arr[] = array(
'name' => 'test',
'str' => md5($i)
);
}
echo '<pre>'.json_encode($arr).'</pre>';
P.S. Sorry guys, I found the problem, thanks to a person with an unreprintable name :-) (thank you, Lawrence).
<pre> is the culprit: for some reason it does not print the string in my browser, but the string is there.
Lawrence, if you want, you can write it up as an answer and I will accept it, because you were the reason I figured this out.
Just to remove confusion about this question: the answer has already been found and it is in the question.
There is nothing wrong with the json_encode function. It works correctly for every input. There is no limitation other than your memory and how much of it you give to your script.
The problem was with the browser's handling of the <pre> tag. If you put too big a string inside this tag, the browser does not display anything. So the way out is to output the result without the <pre> tag.
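A quick way to tell an encoding failure from a display problem is to inspect the result before printing it. This sketch reuses the array from the question and only checks the return value and length:

```php
<?php
// Same test data as in the question.
$arr = array();
for ($i = 0; $i < 9000; $i++) {
    $arr[] = array(
        'name' => 'test',
        'str'  => md5((string) $i),
    );
}

$json = json_encode($arr);
if ($json === false) {
    // A real encoding failure would show up here, not as an empty page.
    error_log('json_encode failed: ' . json_last_error_msg());
}

// The encoded string is far longer than 65536 bytes and still complete.
$length = strlen($json);
echo "encoded length: $length\n";
```

In a web context, sending the string with a Content-Type: application/json header instead of wrapping it in <pre> avoids depending on how the browser renders very long text.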
I had the same problem and the array was so big that increasing the memory limit didn't solve my problem. Had to write my own jsonEncode()-method to overcome this:
/**
* Alternative to json_encode() to handle big arrays
* Regular json_encode would return NULL due to memory issues.
* @param $arr
* @return string
*/
private function jsonEncode($arr) {
$str = '{';
$count = count($arr);
$current = 0;
foreach ($arr as $key => $value) {
$str .= sprintf('"%s":', $this->sanitizeForJSON($key));
if (is_array($value)) {
$str .= '[';
foreach ($value as &$val) {
$val = $this->sanitizeForJSON($val);
}
$str .= '"' . implode('","', $value) . '"';
$str .= ']';
} else {
$str .= sprintf('"%s"', $this->sanitizeForJSON($value));
}
$current ++;
if ($current < $count) {
$str .= ',';
}
}
$str.= '}';
return $str;
}
/**
* @param string $str
* @return string
*/
private function sanitizeForJSON($str)
{
// Strip all slashes:
$str = stripslashes($str);
// Only escape double quotes:
$str = str_replace('"', '\"', $str);
return $str;
}
Please try this,
$arr = array();
for($i =0; $i<3000; $i++){
$arr[] = array(
'name' => 'test',
'str' => md5($i)
);
}
$contentArr = str_split(json_encode($arr), 65536);
foreach ($contentArr as $part) {
echo $part;
}
It can also occur if the array exceeds the memory limit; you can try changing memory_limit in php.ini, for example:
memory_limit=256M
In my case I found that the array (derived from my database) contained strings with special characters, so I made sure to convert them to UTF-8 before calling json_encode().

Scraping a plain text file with no HTML?

I have the following data in a plain text file:
1. Value
Location : Value
Owner: Value
Architect: Value
2. Value
Location : Value
Owner: Value
Architect: Value
... upto 200+ ...
The numbering and the word Value changes for each segment.
Now I need to insert this data in to a MySQL database.
Do you have a suggestion on how can I traverse and scrape it so I can get the value of the text beside the number, and the value of "location", "owner", "architect" ?
Seems hard to do with DOM scraping class since there is no HTML tags present.
If the data is consistently structured, you can use fscanf to scan it from the file.
/* Notice the newlines at the end! */
$format = <<<FORMAT
%d. %s
Location : %s
Owner: %s
Architect: %s
FORMAT;
$file = fopen('file.txt', 'r');
while ($data = fscanf($file, $format)) {
list($number, $title, $location, $owner, $architect) = $data;
// Insert the data to database here
}
fclose($file);
More about fscanf in docs.
If every block has the same structure, you could do this with the file() function: http://nl.php.net/manual/en/function.file.php
$data = file('path/to/file.txt');
With this every row is an item in the array, and you could loop through it.
for ($i = 0; $i<count($data); $i+=5){
$valuerow = $data[$i];
$locationrow = $data[$i+1];
$ownerrow = $data[$i+2];
$architectrow = $data[$i+3];
// strip the data you don't want here, and insert it into the database.
}
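The stripping step could be sketched as below. Both function names are hypothetical, and the code assumes every record is exactly four non-blank lines in the order shown in the question; blank lines between blocks are ignored rather than counted.

```php
<?php
// Hypothetical helper: return the text after the first colon, or '' if there is none.
function value_after_colon($line)
{
    $parts = explode(':', $line, 2);
    return isset($parts[1]) ? trim($parts[1]) : '';
}

// Hypothetical helper: turn the lines of the file into one array per record.
// ASSUMPTION: each record is 4 non-blank lines (title, location, owner, architect).
function parse_blocks(array $lines)
{
    // Trim each line and drop blank ones so separators between blocks do not matter.
    $clean = array();
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line !== '') {
            $clean[] = $line;
        }
    }

    $records = array();
    foreach (array_chunk($clean, 4) as $chunk) {
        if (count($chunk) < 4) {
            continue; // incomplete trailing block
        }
        // "12. Some title": number before the first dot, title after it
        $head = explode('.', $chunk[0], 2);
        $records[] = array(
            'id'        => (int) $head[0],
            'title'     => isset($head[1]) ? trim($head[1]) : '',
            'location'  => value_after_colon($chunk[1]),
            'owner'     => value_after_colon($chunk[2]),
            'architect' => value_after_colon($chunk[3]),
        );
    }
    return $records;
}

// Made-up sample; with a real file this would be $lines = file('path/to/file.txt');
$lines = explode("\n", "1. Tower\nLocation : Oslo\nOwner: Ann\nArchitect: Bo\n\n2. Bridge\nLocation : Rome\nOwner: Cy\nArchitect: Di\n");
$records = parse_blocks($lines);
print_r($records);
```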
That will work with a very simple, stateful, line-oriented parser. On every line you accumulate parsed data into an array(). When something tells you that you're on a new record, you dump what you have parsed and start again.
Line-oriented parsers have a great property: they require little memory and, most importantly, constant memory. They can process gigabytes of data without breaking a sweat. I manage a bunch of production servers, and there's nothing worse than scripts that slurp whole files into memory (and then stuff arrays with parsed content, which requires more than twice the original file size in memory).
This works and is mostly unbreakable:
<?php
$in_name = 'in.txt';
$in = fopen($in_name, 'r') or die();
function dump_record($r) {
print_r($r);
}
$current = array();
while ($line = fgets($in)) {
/* Skip empty lines (any number of whitespaces is 'empty' */
if (preg_match('/^\s*$/', $line)) continue;
/* Search for '123. <value> ' stanzas */
if (preg_match('/^(\d+)\.\s+(.*)\s*$/', $line, $start)) {
/* If we already parsed a record, this is the time to dump it */
if (!empty($current)) dump_record($current);
/* Let's start the new record */
$current = array( 'id' => $start[1] );
}
else if (preg_match('/^(.*):\s+(.*)\s*/', $line, $keyval)) {
/* Otherwise parse a plain 'key: value' stanza */
$current[ $keyval[1] ] = $keyval[2];
}
else {
error_log("parsing error: '$line'");
}
}
/* Don't forget to dump the last parsed record, situation
* we only detect at EOF (end of file) */
if (!empty($current)) dump_record($current);
fclose($in);
?>
Obviously you'll need something suited to your taste in the dump_record function, like printing a correctly formatted INSERT SQL statement.
This will give you what you want,
$array = explode("\n\n", $txt);
foreach($array as $key=>$value) {
$id_pattern = '#'.($key+1).'. (.*?)\n#';
preg_match($id_pattern, $value, $id);
$location_pattern = '#Location \: (.*?)\n#';
preg_match($location_pattern, $value, $location);
$owner_pattern = '#Owner\: (.*?)\n#';
preg_match($owner_pattern, $value, $owner);
$architect_pattern = '#Architect\: (.*)#'; // greedy: a lazy (.*?) at the end of the pattern would match an empty string
preg_match($architect_pattern, $value, $architect);
$id = $id[1];
$location = $location[1];
$owner = $owner[1];
$architect = $architect[1];
mysql_query("INSERT INTO table (id, location, owner, architect) VALUES ('".$id."', '".$location."', '".$owner."', '".$architect."')");
//Change MYSQL query
}
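Agreed with Topener's solution. Here's an example if each block is 4 lines plus a blank line:
$data = file('path/to/file.txt');
$id = 0;
$parsedData = array();
foreach ($data as $n => $row) {
// first line of each block, e.g. "1. Value": the cast reads the leading number
if (($n % 5) == 0) $id = (int) $row;
else if (strpos($row, ':') !== false) {
// split "Owner: Value" into label and value; blank separator lines are skipped
list($label, $value) = explode(':', $row, 2);
$parsedData[$id][trim($label)] = trim($value);
}
}
The structure will be convenient to use, for MySQL or anything else.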
Good luck!
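// groups: 1 = number, 2 = title, 3 = location, 4 = owner, 5 = architect (rest of the line)
preg_match_all("/(\d+)\.(.*?)\sLocation\s*\:\s*(.*?)\sOwner\s*\:\s*(.*?)\sArchitect\s*\:\s*([^\r\n]*)/i",$txt,$m);
$matched = array();
foreach($m[1] as $k => $v) {
// $k is the match index, $v is the record number
$matched[$v] = array(
"title" => trim($m[2][$k]),
"location" => trim($m[3][$k]),
"owner" => trim($m[4][$k]),
"architect" => trim($m[5][$k])
);
}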
