I have a pretty simple test case to write for a function that parses the data of an uploaded CSV file. Google suggests THIS library, which can mock the real file system in unit tests, but I would like to do it in some simpler way if possible.
I covered the basic part of the method, but code coverage seems to stop at the fopen call:
function fn_smdl_parse_csv($file, $max_line_size = 65536, $delimiter = ',')
{
$all_data = false;
if (!empty($file) && file_exists($file)) {
if ($f = fopen($file, 'r')) {
while (!feof($f)) {
$data = fgetcsv($f, $max_line_size, $delimiter);
if (!empty($data)) {
$all_data[] = $data;
}
}
fclose($f);
}
}
return $all_data;
}
I tried with:
public function testFnSmdlParseCsv()
{
$result = fn_smdl_parse_csv('file.csv');
self::assertFalse($result);
}
It is not much, but I cannot find a way to go further. I want to enter and cover the second if() statement. Is there a way to do it with ordinary mocks, without using external libraries?
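One possible approach that needs no mocking library, sketched here on the assumption that the test environment may write to the system temp directory: create a real temporary CSV file, pass its path to the function, and delete the file afterwards. The helper name make_temp_csv is made up for illustration, and the function under test is copied in only so the snippet runs on its own.

```php
<?php
// Copy of the function under test, included only to keep the sketch self-contained.
function fn_smdl_parse_csv($file, $max_line_size = 65536, $delimiter = ',')
{
    $all_data = false;
    if (!empty($file) && file_exists($file)) {
        if ($f = fopen($file, 'r')) {
            while (!feof($f)) {
                $data = fgetcsv($f, $max_line_size, $delimiter);
                if (!empty($data)) {
                    $all_data[] = $data;
                }
            }
            fclose($f);
        }
    }
    return $all_data;
}

// Hypothetical helper: writes CSV text to a temp file and returns its path.
function make_temp_csv($contents)
{
    $path = tempnam(sys_get_temp_dir(), 'csv');
    file_put_contents($path, $contents);
    return $path;
}

// A file that exists, so both if() branches and the loop are executed.
$path = make_temp_csv("id,name\n1,John\n");
$result = fn_smdl_parse_csv($path);
unlink($path);

// A path that does not exist still returns false, as in the original test.
$missing = fn_smdl_parse_csv(sys_get_temp_dir() . '/no-such-file-for-this-test.csv');
```

Inside a PHPUnit test method the same steps would apply, with assertSame() on $result in place of inspecting it by hand.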
Related
I know that yield can be used to create a data iterator, e.g. to read data from a CSV file.
function csv_generator($file) {
$handle = fopen($file,"r");
while (!feof($handle)) {
yield fgetcsv($handle);
}
fclose($handle);
}
But the Generator::send() method suggests that I can do the same for sequential writing, instead of reading.
E.g. I want to use the thing like this:
function csv_output_generator($file) {
$handle = fopen($file, 'w');
while (null !== $row = yield) {
fputcsv($handle, $row);
}
fclose($handle);
}
$output_generator = csv_output_generator($file);
$output_generator->send($rows[0]);
$output_generator->send($rows[1]);
$output_generator->send($rows[2]);
// Close the output generator.
$output_generator->send(null);
The above will work, I think.
But $output_generator->send(null); for closing seems wrong, or not ideal. It means that I can never send a literal null. Which is ok for csv writing, but maybe there is a use case for sending null.
Is there any "best practice" for using php generators for sequential writing?
I'm not saying this is a marvelous idea, but if you're talking semantics, this 'feels' great.
Check against a class: pass in an object of a particular class to terminate the generator. Like:
// should probably use namespacing here.
class GeneratorUtilClose {}
class GeneratorUtil {
public static function close() {
return new GeneratorUtilClose;
}
}
function csv_output_generator($file) {
$handle = fopen($file, 'w');
while (!(($row = yield) instanceof GeneratorUtilClose)) {
fputcsv($handle, $row);
}
fclose($handle);
}
$output_generator = csv_output_generator($file);
$output_generator->send($rows[0]);
$output_generator->send(GeneratorUtil::close());
Added a little factory in here for extra semantic sugar.
Not ideal either, but this works without creating any other class:
function csv_output_generator($file) {
$handle = fopen($file, 'w');
try {
while ($row = yield) {
fputcsv($handle, $row);
}
} catch (ClosedGeneratorException $e) {
// closing generator
}
fclose($handle);
}
$output_generator = csv_output_generator($file);
$output_generator->send($rows[0]);
$output_generator->send($rows[1]);
$output_generator->send($rows[2]);
// Close the output generator.
$output_generator->throw(new ClosedGeneratorException());
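A variation on the same idea, offered as a sketch rather than a settled best practice: wrap the loop in try/finally and let the generator go out of scope instead of sending a sentinel or throwing. As far as I know, PHP runs a pending finally block when a suspended generator is destroyed, so the handle is closed without any special "close" value. The name csv_row_writer and the temp file are illustrative assumptions.

```php
<?php
// Hypothetical name; writes each row received via send() to $file.
function csv_row_writer($file)
{
    $handle = fopen($file, 'w');
    try {
        while (true) {
            $row = yield;   // receives the value passed to send()
            fputcsv($handle, $row, ',', '"', '\\');
        }
    } finally {
        // Runs when the generator is destroyed while suspended at yield.
        fclose($handle);
    }
}

$path = tempnam(sys_get_temp_dir(), 'csv');

$writer = csv_row_writer($path);
$writer->send(array('id', 'name'));
$writer->send(array('1', 'John'));
unset($writer);   // destroying the generator triggers the finally block

$written = file_get_contents($path);
unlink($path);
```

The trade-off is that closing becomes implicit: the file is only closed once the last reference to the generator is gone.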
I have a large CSV that I am writing a PHP CLI script to import into an existing PHP application, using the application's ORM to manage relationships between fields (trust me, I looked at a SQL CSV import; it's easier this way). The issue at hand is that either the CSV data is malformed or my fgetcsv() call is wrong.
This is an example data set I'm aiming to import:
id,fname,lname,email\n
1,John,Public,johnqpublic#mailinator.com\n
1,Jane,Public,janeqpublic#mailinator.com\n
And the CSV import code is pretty much taken from the PHP docs on fgetcsv():
function import_users($filepath) {
$row = 0;
$linesExecuted = 0;
if(($file = fopen($filepath, 'r')) !== false) {
$header = fgetcsv($file); //Loads line 1
while(($data = fgetcsv($file, 0, ",")) !== false) {
$userObj = user_record_generate_stdclass($data);
//A future method actually pipes the data through via the ORM
$row++;
}
fclose($file);
} else {
echo "It's going horribly wrong";
}
echo $row." records imported.";
}
The resulting output of this method is pretty much just a % sign, which is baffling. Am I overlooking something?
I have several files to parse (with PHP) in order to insert their respective content in different database tables.
First point: the client gave me 6 files. Five are CSV with comma-separated values; the last one does not come from the same database and its content is tab-delimited.
I built a FileParser that uses SplFileObject to execute a method on each line of the file content (basically, create an Entity from each dataset and persist it to the database, with Symfony2 and Doctrine2).
But I cannot manage to parse the tab-delimited text file with SplFileObject; it does not split the content into lines as I expect it to.
// In my controller context
$parser = new MyAmazingFileParser();
$parser->parse($filename, $delimitor, function ($data) use ($em) {
$e = new Entity();
$e->setSomething($data[0]);
// [...]
$em->persist($e);
});
// In my parser
public function parse($filename, $delimitor = ',', $run = null) {
if (is_callable($run)) {
$handle = new SplFileObject($filename);
$infos = new SplFileInfo($filename);
if ($infos->getExtension() === 'csv') {
// Everything is going well here
$handle->setCsvControl(',');
$handle->setFlags(SplFileObject::DROP_NEW_LINE + SplFileObject::READ_AHEAD + SplFileObject::SKIP_EMPTY + SplFileObject::READ_CSV);
foreach (new LimitIterator($handle, 1) as $data) {
$result = $run($data);
}
} else {
// Why does the Iterator-way does not work ?
$handle->setCsvControl("\t");
// I have tried with all the possible flags combinations, without success...
foreach (new LimitIterator($handle, 1) as $data) {
// It always only gets the first line...
$result = $run($data);
}
// And the old-memory-killing-dirty-way works ?
$fd = fopen($filename, 'r');
$contents = fread($fd, filesize($filename));
foreach (explode("\t", $contents) as $line) {
// Get all the line as I want... But it's dirty and memory-expensive !
$result = $run($line);
}
}
}
}
It is probably related to the horrible formatting of my client's file, but after a long discussion with them, they really cannot provide another format, for some acceptable reasons (constraints on their side), unfortunately.
The file is currently 49459 lines long, so I really think memory matters at this step. I have to make the SplFileObject approach work, but I do not know how.
An extract of the file can be found here :
Data-extract-hosted
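For comparison, here is a minimal sketch of reading a tab-delimited file through SplFileObject. It assumes the lines end in ordinary newlines and the fields are separated by single tab characters, which may not hold for the real file (I have only the description above to go on). One thing worth checking against the parser as posted: the non-CSV branch sets the delimiter but does not call setFlags(), and without SplFileObject::READ_CSV the iterator yields raw strings rather than arrays. The sample data and variable names are made up.

```php
<?php
// Sample data written to a temp file so the sketch runs on its own.
$path = tempnam(sys_get_temp_dir(), 'tsv');
file_put_contents($path, "id\tname\n1\tJohn\n2\tJane\n");

$file = new SplFileObject($path);
$file->setFlags(
    SplFileObject::READ_CSV
    | SplFileObject::READ_AHEAD
    | SplFileObject::SKIP_EMPTY
    | SplFileObject::DROP_NEW_LINE
);
// Tab as delimiter; enclosure and escape are passed explicitly.
$file->setCsvControl("\t", '"', '\\');

$header = null;
$rows = array();
foreach ($file as $data) {
    // Defensive guard in case a blank line still comes through.
    if ($data === false || $data === array(null)) {
        continue;
    }
    if ($header === null) {
        $header = $data;   // first line holds the column names
        continue;
    }
    $rows[] = $data;       // one array of fields per line, read lazily
}

$file = null;              // release the file handle before deleting
unlink($path);
```

Because the file is read one line at a time, memory use stays flat regardless of how many lines it has.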
I'm trying to stick with the "Fat Model" approach in programming my CakePHP app and am currently stumped in my attempts to get a function working in one of my models. Here's a slightly simplified version of my code.
See the part where it says "This is where the trouble starts" and a bunch of lines are commented out. I can't figure out how to save the headers info to the model instance, because I'm using "findByID" which returns an array instead of an object.
I found another thread (CakePHP 2.0 Object not Array) discussing the array vs. object issue. It seems that's part of the design of Cake 2.X, though I don't really understand why. Anyway I don't want a workaround as is suggested in that other thread, but rather I want to understand how to do this correctly:
// app/Model/Datareportdoc.php
App::uses('AppModel', 'Model');
class Datareportdoc extends AppModel {
public function parseData($id) {
$results = ""; // init
$filepath = ""; // init
$reportdoc = $this->findById($id);
$filepath = $reportdoc['Datareportdoc']['filepath'];
$headers = array(); // init
// Parse and save the report data
if (($handle = fopen($filepath, "r")) !== FALSE) {
$row = 0;
while (($datarow = fgetcsv($handle, 1000, ",")) !== FALSE) {
$num = count($datarow);
// If it's a header row, create the headers array
if ($row == 0) {
$results .= "HEADER ROW\n";
$dataheaders = implode(",", $datarow);
// *** THIS IS WHERE THE TROUBLE STARTS ***
//$this->Datareportdoc->id = $id; // NO: "Indirect modification of overloaded property"
//$this->Datareportdoc->set('dataheaders', $dataheaders); // No, because we have an array, not an object :-(
//$this->Datareportdoc->dataheaders = $dataheaders;
//$this->Datareportdoc->save(); // NO: Error: Call to a member function save() on a non-object
// Also tried this, which also doesn't work
//$reportdoc['Datareportdoc']['dataheaders'] = $dataheaders;
//$reportdoc->save();
$results .= "Datareportdoc dataheaders: ".$dataheaders."\n";
} else {
$data = array();
$i = 0;
foreach ($headers as $header) {
$data[$header] = $datarow[$i];
$i++;
}
$results .= "Got data for insert:\n".print_r($data, true)."\n\n\n";
// #TODO: Insert the data...
}
$row++;
}
fclose($handle);
} else {
$results .= "fopen failed :-(<br />\n";
//echo "fopen failed :-(<br />\n";
//exit();
}
return $results;
}
}
So give this a shot:
Instead of this:
//$this->Datareportdoc->id = $id;
//$this->Datareportdoc->set('dataheaders', $dataheaders); // No, because we have an array, not an object :-(
//$this->Datareportdoc->dataheaders = $dataheaders;
//$this->Datareportdoc->save();
Use this:
$this->id = $id;
$this->save(array('dataheaders' => $dataheaders)); // save() takes field => value data, not a bare string
It works every time in the model when done this way.
(Background: the CakePHP 2.x cookbook talks a little about active record, but it isn't really active record if you're used to other platforms, at least not in the sense I understand the term. 3.0 has an ORM with active record; 2.x does not.)
I seem to be in a catch-22 with a small app I'm developing in PHP on Google App Engine using Quercus;
I have a remote csv-file which I can download & store in a string
To parse that string I'd ideally use str_getcsv, but Quercus doesn't have that function yet
Quercus does seem to know fgetcsv, but that function expects a file handle which I don't have (and I can't make a new one as GAE doesn't allow files to be created)
Anyone got an idea of how to solve this without having to dismiss the built-in PHP csv-parser functions and write my own parser instead?
I think the simplest solution really is to write your own parser. It's a piece of cake anyway and will teach you more regex. It makes no sense that Quercus offers no CSV-string-to-array parser, so writing your own is totally justified. Just make sure it's not too slow ;)
You might be able to create a new stream wrapper using stream_wrapper_register.
Here's an example from the manual which reads global variables: http://www.php.net/manual/en/stream.streamwrapper.example-1.php
You could then use it like a normal file handle:
$csvStr = '...';
$fp = fopen('var://csvStr', 'r+');
while ($row = fgetcsv($fp)) {
// ...
}
fclose($fp);
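For reference, a minimal read-only wrapper along those lines might look like the sketch below. The class name VarStream is made up, it is much smaller than the manual's example, and I have not checked whether Quercus supports stream_wrapper_register, so treat that part as an assumption.

```php
<?php
// Minimal read-only stream wrapper that exposes a global string as a file.
class VarStream
{
    public $context;     // populated by PHP when a stream context is used
    private $data = '';
    private $pos = 0;

    public function stream_open($path, $mode, $options, &$opened_path)
    {
        // "var://csvStr" -> "csvStr"
        $name = substr($path, strlen('var://'));
        if (!isset($GLOBALS[$name])) {
            return false;
        }
        $this->data = (string) $GLOBALS[$name];
        $this->pos = 0;
        return true;
    }

    public function stream_read($count)
    {
        $chunk = substr($this->data, $this->pos, $count);
        $this->pos += strlen($chunk);
        return $chunk;
    }

    public function stream_eof()
    {
        return $this->pos >= strlen($this->data);
    }
}

stream_wrapper_register('var', 'VarStream');

// Set through $GLOBALS so the wrapper can find it from any scope.
$GLOBALS['csvStr'] = "id,name\n1,John\n2,Jane\n";

$fp = fopen('var://csvStr', 'r');
$rows = array();
while (($row = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
    $rows[] = $row;
}
fclose($fp);
```

Comparing against false explicitly, rather than relying on truthiness, keeps the loop from ending early on a row that happens to evaluate as empty.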
This is a simple manual parser I wrote, with example input covering qualified (quoted) fields, non-qualified fields, and escaped characters. It can be used for both the header and the data rows, and it includes a function that turns each row into an associative (key/value) array.
//example data
$fields = strparser('"first","second","third","fourth","fifth","sixth","seventh"');
print_r(makeAssocArray($fields, strparser('"asdf","bla\"1","bl,ah2","bl,ah\"3",123,34.234,"k;jsdfj ;alsjf;"')));
//do something like this, where $lines holds the CSV lines and the first one is the header
$fields = strparser(array_shift($lines));
foreach ($lines as $line)
$data = makeAssocArray($fields, strparser($line));
function strparser($string, $div = ",", $qual = "\"", $esc = "\\") {
$buff = "";
$data = array();
$isQual = false; //the result will be a qualifier
$inQual = false; //currently parsing inside qualifier
//iterate through the string byte by byte
for ($i = 0; $i < strlen($string); $i++) {
switch ($string[$i]) {
case $esc:
//add next byte to buffer and skip it
$buff .= $string[$i+1];
$i++;
break;
case $qual:
//see if this is escaped qualifier
if (!$inQual) {
$isQual = true;
$inQual = true;
break;
} else {
$inQual = false; //done parsing qualifier
break;
}
case $div:
if (!$inQual) {
$data[] = $buff; //add value to data
$buff = ""; //reset buffer
break;
}
default:
$buff .= $string[$i];
}
}
//get last item as it doesnt have a divider
$data[] = $buff;
return $data;
}
function makeAssocArray($fields, $data) {
foreach ($fields as $key => $field)
$array[$field] = $data[$key];
return $array;
}
If quick and dirty is acceptable, I would just use exec (http://php.net/manual/en/function.exec.php) to pass the data out and parse it with sed and awk (http://shop.oreilly.com/product/9781565922259.do). I know you wanted to use the PHP parser; I've tried that before and failed, simply because it's not vocal about its errors.
Hope this helps.
Good luck.
You might be able to use fopen with php://temp or php://memory (php.net) to get it to work. What you would do is open either php://temp or php://memory, write to it, then rewind it (php.net), and then pass it to fgetcsv. I didn't test this, but it might work.
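A minimal sketch of that idea, run on stock PHP only; whether Quercus on GAE supports php://memory is something I have not verified. The sample CSV string is made up.

```php
<?php
// CSV text as it might arrive from the remote download.
$csv = "id,name\n1,\"Public, John\"\n";

// Open an in-memory stream, write the string into it, then rewind to the start.
$fp = fopen('php://memory', 'r+');
fwrite($fp, $csv);
rewind($fp);

// From here on it behaves like any other file handle.
$rows = array();
while (($row = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
    $rows[] = $row;
}
fclose($fp);
```

php://temp works the same way but spills to a temporary file once the data grows past a size threshold, which may matter on a platform that forbids file creation.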