Read multiple csv files in folder - php

I need some help.
What I need is a script that opens and reads all the .csv files in the folder 'csv/files' and then runs the code inside the "if" for each one. When I had only one file it worked fine. I put together the script below, but it does not work and no error message shows up either.
Can somebody look at my code and tell me what I am doing wrong?
<?php
foreach (glob("*.csv") as $filename) {
echo $filename."<br />";
if (($handle = fopen($filename, "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, ";")) !== FALSE) {
$url = $data[0];
$path = $data[1];
$ch = curl_init($url);
$fp = fopen($path, 'wb');
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
fclose($fp);
}
fclose($handle);
}
}
?>

This is a prime candidate for multi-threading, and here's some code to do it:
<?php
class WebWorker extends Worker {
public function run() {}
}
class WebTask extends Stackable {
public function __construct($input, $output) {
$this->input = $input;
$this->output = $output;
$this->copied = 0;
}
public function run() {
$data = file_get_contents($this->input);
if ($data) {
file_put_contents(
$this->output, $data);
$this->copied = strlen($data);
}
}
public $input;
public $output;
public $copied;
}
class WebPool {
public function __construct($max) {
$this->max = $max;
$this->workers = [];
}
public function submit(WebTask $task) {
$random = rand(0, $this->max - 1); // indexes 0..max-1, so at most $max workers are created
if (isset($this->workers[$random])) {
return $this->workers[$random]
->stack($task);
} else {
$this->workers[$random] = new WebWorker();
$this->workers[$random]
->start();
return $this->workers[$random]
->stack($task);
}
}
public function shutdown() {
foreach ($this->workers as $worker)
$worker->shutdown();
}
protected $max;
protected $workers;
}
$pool = new WebPool(8);
$work = [];
$start = microtime(true);
foreach (glob("csv/*.csv") as $file) {
$handle = fopen($file, "r");
if ($handle) {
while (($line = fgetcsv($handle, 0, ";"))) {
$wid = count($work);
$work[$wid] = new WebTask(
$line[0], $line[1]);
$pool->submit($work[$wid]);
}
fclose($handle); // release the CSV file once all its lines are submitted
}
}
$pool->shutdown();
$runtime = microtime(true) - $start;
$total = 0;
foreach ($work as $job) {
printf(
"[%s] %s -> %s %.3f kB\n",
$job->copied ? "OK" : "FAIL",
$job->input,
$job->output,
$job->copied/1024);
$total += $job->copied;
}
printf(
"[TOTAL] %.3f kB in %.3f seconds\n",
$total/1024, $runtime);
?>
This creates a pool with a capped number of worker threads. The main thread reads through a directory of semicolon-separated CSV files in which each line is input;output, and submits one task per line to the pool. Each task reads the input and writes the output asynchronously, while the main thread continues reading CSV files.
I have used the simplest input/output calls, file_get_contents and file_put_contents, so that you can see how it works without cURL.
The worker is chosen at random when a task is submitted to the pool. This may not be desirable; it is possible to detect whether a worker is busy, but that would complicate the example.
Further reading:
https://gist.github.com/krakjoe/6437782
http://php.net/pthreads
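One likely reason the question's script finds nothing is that glob("*.csv") is resolved against the current working directory rather than 'csv/files'. Below is a minimal single-threaded sketch of the reading step only, assuming the directory is passed in as $dir. The name readCsvRows is a hypothetical helper, not part of PHP, and the download step is left out.

```php
<?php
// Hypothetical helper: collect every row from every .csv file in $dir.
function readCsvRows($dir, $delimiter = ';')
{
    $rows = array();
    // Spell out the directory in the pattern; a bare "*.csv" only
    // matches files in the current working directory.
    $files = glob(rtrim($dir, '/') . '/*.csv');
    if ($files === false) {
        return $rows;
    }
    foreach ($files as $filename) {
        $handle = fopen($filename, 'r');
        if ($handle === false) {
            continue; // skip files that cannot be opened
        }
        while (($data = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
            if ($data === array(null)) {
                continue; // fgetcsv() reports a blank line as array(null)
            }
            $rows[] = $data;
        }
        fclose($handle);
    }
    return $rows;
}
```

Each returned row is an array such as array($url, $path), which can then be handed to the cURL code from the question or to the pool above.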

Related

handle multiple request accessing a json file

I have a JSON file stored as text, with one record per line. Sample data:
{"msisdn":"xxxxxxxxxx","productID":"YYYYYYYY","subdate":"2018-09-28 16:30:35","Status":"1"}
{"msisdn":"xxxxxxxxxx","productID":"YYYYYYYY","subdate":"2018-09-28 16:30:35","Status":"1"}
and I have PHP code that checks the JSON file for an existing msisdn:
class JSONObject implements JsonSerializable
{
public function __construct($json = false)
{
if ($json)
$this->set(json_decode($json, true));
}
public function set($data)
{
foreach ($data AS $key => $value) {
if (is_array($value)) {
$sub = new JSONObject;
$sub->set($value);
$value = $sub;
}
$this->{$key} = $value;
}
}
public function jsonSerialize()
{
return (object) get_object_vars($this);
}
}
function checkmsisdnallreadyexists($file,$msisdn)
{
if (is_file($file)) {
if (($handle = fopen($file, 'r'))) {
while (!feof($handle)) {
$line = trim(fgets($handle));
$jsonString = json_encode(json_decode($line));
// Here's the sweetness.
//$class = new JSONObject($jsonString);
$class = new JSONObject($jsonString);
if($class->msisdn == $msisdn)
{
$date1=date_create($class->subdate);
$date2=date_create(date('Y-m-d H:i:s'));
$diff=date_diff($date1,$date2);
if($diff->format('%a') < 31)
{
fclose($handle);
return true;
}
}
}
fclose($handle);
}
}
return false;
}
Everything works fine initially, but once the JSON file holds more than 30,000 records we get read timeouts. The server handles a huge number of requests, approximately 200k per hour, so this hurts the efficiency of the whole process.
Can anyone provide a solution or an alternative method?
Note: I can't use a database here.
You can use file() instead of fopen() and fclose():
function checkmsisdnallreadyexists($msisdn){
    // file() returns the lines as an array; the flags drop newlines and empty lines
    $file_array = file('give file path', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach($file_array as $arr){
        $msisdn_array = json_decode($arr, true);
        $msisdn_value = $msisdn_array['msisdn'];
        if($msisdn_value == $msisdn) {
            $date1 = date_create($msisdn_array['subdate']);
            $date2 = date_create(date('Y-m-d H:i:s'));
            $diff = date_diff($date1, $date2);
            if($diff->format('%a') < 31) {
                return true;
            }
        }
    }
    return false; // no recent subscription found for this msisdn
}
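Since most lines will not belong to the msisdn being looked up, one possible refinement is to reject them with a plain substring test before decoding any JSON. This is only a sketch under assumptions: the function name and the $days parameter are made up for illustration, the records are assumed to be written exactly like the sample data, and the 31-day rule is approximated with a timestamp comparison instead of date_diff().

```php
<?php
// Hypothetical variant of the check above; name and $days are illustrative.
function msisdnSubscribedRecently($file, $msisdn, $days = 31)
{
    $handle = @fopen($file, 'r');
    if ($handle === false) {
        return false;
    }
    // Assumes records are written exactly like the sample: "msisdn":"<value>"
    $needle = '"msisdn":"' . $msisdn . '"';
    $cutoff = time() - $days * 86400;
    $found = false;
    while (($line = fgets($handle)) !== false) {
        if (strpos($line, $needle) === false) {
            continue; // cheap substring test, no JSON parsing for non-matches
        }
        $row = json_decode($line, true);
        if (!is_array($row) || !isset($row['msisdn'], $row['subdate'])) {
            continue; // malformed line
        }
        $ts = strtotime($row['subdate']);
        if ((string) $row['msisdn'] === (string) $msisdn
            && $ts !== false && $ts >= $cutoff) {
            $found = true;
            break;
        }
    }
    fclose($handle);
    return $found;
}
```

This still scans the file from the top on every request, so it only reduces the per-line cost; it does not change how the work grows with file size.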

Generator function not executed PHP

In the following code, I don't get 'handled' in my output. I can see that the file handle is a resource, the file gets opened, and the constructor of FqReader is called; I checked all that. But when FqReader::getread() is executed I don't see any output and the returned array is empty. The first while loop also does not get executed when I put while(1) instead of the logic test that is in the code now.
<?php
class FastqFile {
function __construct($filename) {
if (substr($filename, -3, 3) == '.gz') {
$this->handle = gzopen($filename, 'r');
return $this->handle;
}
else
$this->handle = fopen($filename, 'r');
return $this->handle;
}
}
class FqReader {
function __construct($file_handle) {
$this->handle = $file_handle;
}
function getread() {
while ($header = fgets($this->handle) !== false) {
echo "handled";
$bases = fgets($this->handle);
$plus = fgets($this->handle);
$scores = fgets($this->handle);
yield array($header, $plus, $scores);
}
}
}
$filename = $argv[1];
$file_handle = new FastqFile($filename);
var_dump($file_handle);
$reader = new FqReader($file_handle);
var_dump($reader->getread());
It outputs:
object(FastqFile)#1 (1) {
["handle"]=>
resource(5) of type (stream)
}
object(Generator)#3 (0) {
}
$file_handle is a FastqFile instance. You then pass that object to fgets(), but you need to pass that object's handle to fgets(). For instance:
class FqReader {
function __construct($file_handle) {
$this->handle = $file_handle->handle;
}
function getread() {
while (($header = fgets($this->handle)) !== false) { // parentheses: assign the line first, then compare
echo "handled";
$bases = fgets($this->handle);
$plus = fgets($this->handle);
$scores = fgets($this->handle);
yield array($header, $bases, $scores); // yield the sequence line rather than the separator line
}
}
}
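The error was hidden because getread() is a generator: calling it only creates a Generator object, and the body, including the failing fgets() call, does not run until you iterate over it.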
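A small illustration of that lazy behaviour, independent of the FASTQ code. The function numbers() and the bodyStarted flag are hypothetical names used only for this sketch.

```php
<?php
// Hypothetical generator used only to illustrate lazy execution.
function numbers()
{
    $GLOBALS['bodyStarted'] = true; // runs on first iteration, not on the call
    yield 1;
    yield 2;
}

$GLOBALS['bodyStarted'] = false;
$gen = numbers();                            // only creates a Generator object
$startedAfterCall = $GLOBALS['bodyStarted']; // still false at this point

$values = array();
foreach ($gen as $n) {                       // iterating is what runs the body
    $values[] = $n;
}
$startedAfterLoop = $GLOBALS['bodyStarted']; // now true
```

This is why var_dump($reader->getread()) prints an empty Generator object and never reaches the echo inside the loop.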
Exactly, this works like a charm:
(using a function to open file, not a class)
function openfq($filename)
{
if (substr($filename, -3, 3) == '.gz') {
$handle = gzopen($filename, 'r');
return $handle;
}
else
$handle = fopen($filename, 'r');
return $handle;
}
class FqReader {
function __construct($file_handle) {
$this->handle = $file_handle;
}
function getread() {
while (($header = fgets($this->handle)) !== false) {
echo "handled";
$bases = fgets($this->handle);
$plus = fgets($this->handle);
$scores = fgets($this->handle);
yield array($header, $bases, $scores);
}
}
}
$filename = $argv[1];
$file_handle = openfq($filename);
var_dump($file_handle);
$reader = new FqReader($file_handle);
var_dump($reader->getread());
foreach($reader->getread() as $read) {
var_dump($read);
}

PHP extract data from binary file with structure

I have a file with the structure shown in the image. I want to extract its data into an array:
function get_data($file, $number)
{
if(!$fp = fopen ($file, 'rb')) return 0;
$fsize = filesize($file);
if(!$data = fread ($fp, $fsize)) return 0;
$data_format=
'#100/'.
'smember_id/'.
'cmember_name_length/'.
'a' . $member_name_length . 'member_name/'.
'C100other_data/';
$data = unpack ($data_format, $data);
fclose($file);
return $data;
}
How can I get $member_name_length from the file? I want to create a function that, given $number from the user, returns an array with the $number-th data block.
Thank you.
Since you have variable-length data blocks, you can only read them sequentially, so in order to read the n-th block you need to read all the blocks before it first:
function readDataBlock($f) {
$data = unpack('nmember_id', fread ($f, 2)); // I assume member_id is n, not s
if ($data['member_id'] == 0xFFFF) {
throw new \Exception('End of file');
}
$data = array_merge($data, unpack('Cmember_name_length', fread ($f, 1))); //again, it must be C, not c, as I can't imagine negative length.
$data = array_merge($data, unpack('a*member_name', fread ($f, $data['member_name_length']))); // be sure you understand how a differs from A
return array_merge($data, unpack('C100other_data', fread ($f, 100))); // are you sure C100 is what you want here?
}
function get_data($file, $number)
{
if(!$fp = fopen ($file, 'rb')) return 0;
fread ($fp, 100); //skip header
for($n = 0; $n <= $number; $n++) {
$data = readDataBlock($fp); // read single member
}
fclose($fp);
return $data; //return the n-th member
}
If the file is small enough to fit into memory, it might be better to read it once and return n-th member from memory:
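$fp = fopen($file, 'rb'); // $file: path to the data file, as in get_data() above
fread($fp, 100); // skip header
$data = [];
while(true) {
try {
$data[] = readDataBlock($fp);
} catch(\Exception $e) {
break;
}
}
fclose($fp);
function get_data(&$data, $number)
{
return $data[$number];
}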
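To make the "read the length first, then the name" idea concrete, here is a self-contained sketch that runs against an in-memory stream. The record layout (2-byte big-endian id, 1-byte name length, name bytes) is a simplified assumption for illustration, not the real file format, and readRecord / readNthRecord are hypothetical names.

```php
<?php
// Assumed layout per record: uint16 big-endian id, uint8 name length, name bytes.
function readRecord($fp)
{
    $head = fread($fp, 3);
    if ($head === false || strlen($head) < 3) {
        return null; // end of stream
    }
    $rec = unpack('nid/Clen', $head);
    // The length just read decides how many bytes the name occupies.
    $rec['name'] = $rec['len'] > 0 ? fread($fp, $rec['len']) : '';
    return $rec;
}

// Returns the record at zero-based position $number, or null if there is none.
function readNthRecord($fp, $number)
{
    $rec = null;
    for ($n = 0; $n <= $number; $n++) {
        $rec = readRecord($fp); // earlier records must be read to find the next one
        if ($rec === null) {
            return null;
        }
    }
    return $rec;
}

// Build two sample records in memory and fetch the second one.
$fp = fopen('php://memory', 'w+b');
fwrite($fp, pack('nC', 1, 5) . 'alice');
fwrite($fp, pack('nC', 2, 3) . 'bob');
rewind($fp);
$second = readNthRecord($fp, 1);
fclose($fp);
```

Returning null at the end of the stream is a design choice of this sketch; the answer above signals the end with an exception on a 0xFFFF marker instead.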

PHP Library to Parse Mobi

Is there any freely available library for PHP which parses a .mobi file to get the:
Author
Title
Publisher
Cover
Edit:
To everyone who thinks this is an exact duplicate of Does a PHP Library Exist to Work with PRC/MOBI Files, you're obviously too lazy to read the questions.
That asker wants to know how to generate .mobi files using a PHP library. I want to know how to break apart, or parse, already created .mobi files to get certain information. Therefore, the solution to that question, phpMobi will not work because it is a script to generate .mobi files from HTML, not to parse .mobi files.
A very very very lame example, but if you get desperate, you may try something like this:
$data = file_get_contents("A Young Girl's Diary - Freud, Sigmund.mobi");
$chunk = mb_substr($data, mb_strpos($data, 'EXTH'), 512);
$chunks = explode("\x00", $chunk);
array_shift($chunks);
$chunks = array_filter($chunks, function($str){return preg_match('#([A-Z])#', $str) && mb_strlen($str) > 2;});
$chunks = array_combine(array('author', 'publisher', 'title'), $chunks);
print_r($chunks);
Output:
Array
(
[author] => Freud, Sigmund
[publisher] => Webarto
[title] => A Young Girl's Diary
)
File used: http://freekindlebooks.org/Freud/752-h.mobi (edited Publisher metadata with Calibre)
File parsing is not even remotely an easy or fun thing to do. Just take a look at this: http://code.google.com/p/xee/source/browse/XeePhotoshopLoader.m?r=a70d7396356997114b548f4ab2cbd49badd7d285#107
What you should be doing is reading byte by byte, but because there is no detailed documentation, I'm afraid that won't be an easy job.
P.S. I haven't tried to fetch cover photo.
If someone is still interested here's a sample of mobi metadata reading:
class palmDOCHeader
{
public $Compression = 0;
public $TextLength = 0;
public $Records = 0;
public $RecordSize = 0;
}
class palmHeader
{
public $Records = array();
}
class palmRecord
{
public $Offset = 0;
public $Attributes = 0;
public $Id = 0;
}
class mobiHeader
{
public $Length = 0;
public $Type = 0;
public $Encoding = 0;
public $Id = 0;
public $FileVersion = 0;
}
class exthHeader
{
public $Length = 0;
public $Records = array();
}
class exthRecord
{
public $Type = 0;
public $Length = 0;
public $Data = "";
}
class mobi {
protected $mobiHeader;
protected $exthHeader;
public function __construct($file){
$handle = fopen($file, "rb"); // binary mode, so bytes are read unchanged on every platform
if ($handle){
fseek($handle, 60, SEEK_SET);
$content = fread($handle, 8);
if ($content != "BOOKMOBI"){
echo "Invalid file format";
fclose($handle);
return;
}
// Palm Database
echo "\nPalm database:\n";
$palmHeader = new palmHeader();
fseek($handle, 0, SEEK_SET);
$name = fread($handle, 32);
echo "Name: ".$name."\n";
fseek($handle, 76, SEEK_SET);
$content = fread($handle, 2);
$records = hexdec(bin2hex($content));
echo "Records: ".$records."\n";
fseek($handle, 78, SEEK_SET);
for ($i=0; $i<$records; $i++){
$record = new palmRecord();
$content = fread($handle, 4);
$record->Offset = hexdec(bin2hex($content));
$content = fread($handle, 1);
$record->Attributes = hexdec(bin2hex($content));
$content = fread($handle, 3);
$record->Id = hexdec(bin2hex($content));
array_push($palmHeader->Records, $record);
echo "Record ".$i." offset: ".$record->Offset." attributes: ".$record->Attributes." id : ".$record->Id."\n";
}
// PalmDOC Header
$palmDOCHeader = new palmDOCHeader();
fseek($handle, $palmHeader->Records[0]->Offset, SEEK_SET);
$content = fread($handle, 2);
$palmDOCHeader->Compression = hexdec(bin2hex($content));
$content = fread($handle, 2);
$content = fread($handle, 4);
$palmDOCHeader->TextLength = hexdec(bin2hex($content));
$content = fread($handle, 2);
$palmDOCHeader->Records = hexdec(bin2hex($content));
$content = fread($handle, 2);
$palmDOCHeader->RecordSize = hexdec(bin2hex($content));
$content = fread($handle, 4);
echo "\nPalmDOC Header:\n";
echo "Compression:".$palmDOCHeader->Compression."\n";
echo "TextLength:".$palmDOCHeader->TextLength."\n";
echo "Records:".$palmDOCHeader->Records."\n";
echo "RecordSize:".$palmDOCHeader->RecordSize."\n";
// MOBI Header
$mobiStart = ftell($handle);
$content = fread($handle, 4);
if ($content == "MOBI"){
$this->mobiHeader = new mobiHeader();
echo "\nMOBI header:\n";
$content = fread($handle, 4);
$this->mobiHeader->Length = hexdec(bin2hex($content));
$content = fread($handle, 4);
$this->mobiHeader->Type = hexdec(bin2hex($content));
$content = fread($handle, 4);
$this->mobiHeader->Encoding = hexdec(bin2hex($content));
$content = fread($handle, 4);
$this->mobiHeader->Id = hexdec(bin2hex($content));
echo "Header length: ".$this->mobiHeader->Length."\n";
echo "Type: ".$this->mobiHeader->Type."\n";
echo "Encoding: ".$this->mobiHeader->Encoding."\n";
echo "Id: ".$this->mobiHeader->Id."\n";
fseek($handle, $mobiStart+$this->mobiHeader->Length, SEEK_SET);
$content = fread($handle, 4);
if ($content == "EXTH"){
$this->exthHeader = new exthHeader();
echo "\nEXTH header:\n";
$content = fread($handle, 4);
$this->exthHeader->Length = hexdec(bin2hex($content));
$content = fread($handle, 4);
$records = hexdec(bin2hex($content));
echo "Records: ".$records."\n";
for ($i=0; $i<$records; $i++){
$record = new exthRecord();
$content = fread($handle, 4);
$record->Type = hexdec(bin2hex($content));
$content = fread($handle, 4);
$record->Length = hexdec(bin2hex($content));
$record->Data = fread($handle, $record->Length - 8);
array_push($this->exthHeader->Records, $record);
echo "Record ".$i." type: ".$record->Type." length: ".$record->Length."\n";
echo " data: ".$record->Data."\n";
}
}
}
fclose($handle);
}
}
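protected function GetRecord($type)
{
// No EXTH header was parsed (invalid file or no EXTH block): nothing to search
if (!$this->exthHeader)
return NULL;
foreach ($this->exthHeader->Records as $record){
if ($record->Type == $type)
return $record;
}
return NULL;
}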
protected function GetRecordData($type)
{
$record = $this->GetRecord($type);
if ($record)
return $record->Data;
return "";
}
public function Title()
{
return $this->GetRecordData(503);
}
public function Author()
{
return $this->GetRecordData(100);
}
public function Isbn()
{
return $this->GetRecordData(104);
}
public function Subject()
{
return $this->GetRecordData(105);
}
public function Publisher()
{
return $this->GetRecordData(101);
}
}
$mobi = new mobi("test.mobi");
echo "\nTitle: ".$mobi->Title();
echo "\nAuthor: ".$mobi->Author();
echo "\nIsbn: ".$mobi->Isbn();
echo "\nSubject: ".$mobi->Subject();
echo "\nPublisher: ".$mobi->Publisher();
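The hexdec(bin2hex(...)) idiom used throughout the class decodes big-endian unsigned integers. As an alternative sketch, unpack() can do the same for the 1, 2 and 4 byte fields; readUInt is a hypothetical helper name, and the sample bytes below are made up for the demonstration.

```php
<?php
// Hypothetical helper: read $bytes bytes and decode them as a big-endian
// unsigned integer. Returns null on a short read.
function readUInt($handle, $bytes)
{
    $raw = fread($handle, $bytes);
    if ($raw === false || strlen($raw) !== $bytes) {
        return null;
    }
    switch ($bytes) {
        case 1: $v = unpack('C', $raw); break; // unsigned char
        case 2: $v = unpack('n', $raw); break; // unsigned 16-bit, big-endian
        case 4: $v = unpack('N', $raw); break; // unsigned 32-bit, big-endian
        default: return hexdec(bin2hex($raw)); // other widths, e.g. the 3-byte id
    }
    return $v[1]; // unnamed unpack() fields are indexed from 1
}

// Demonstration on an in-memory stream with made-up bytes.
$h = fopen('php://memory', 'w+b');
fwrite($h, "\x00\x00\x01\x00" . "\x12\x34" . "\x00\x00\x2a");
rewind($h);
$a = readUInt($h, 4);
$b = readUInt($h, 2);
$c = readUInt($h, 3);
fclose($h);
```

The short-read check also gives a single place to detect truncated files, which the inline fread() calls above do not do.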
I had the same issue and didn't find any PHP parsers, so I had to write my own (unfortunately I can't disclose my code). Here is a good resource about the .mobi structure: http://wiki.mobileread.com/wiki/MOBI

Is there a way to access a string as a filehandle in php?

I'm on a server where I'm limited to PHP 5.2.6 which means str_getcsv is not available to me. I'm using, instead fgetcsv which requires "A valid file pointer to a file successfully opened by fopen(), popen(), or fsockopen()." to operate on.
My question is this: is there a way to access a string as a file handle?
My other option is to write the string out to a text file, open it with fopen() and then use fgetcsv, but I'm hoping there's a way to do this directly, like in Perl.
If you take a look in the user notes on the manual page for str_getcsv, you'll find this note from daniel, which proposes this function (quoting) :
<?php
if (!function_exists('str_getcsv')) {
function str_getcsv($input, $delimiter = ",", $enclosure = '"', $escape = "\\") {
$fiveMBs = 5 * 1024 * 1024;
$fp = fopen("php://temp/maxmemory:$fiveMBs", 'r+');
fputs($fp, $input);
rewind($fp);
$data = fgetcsv($fp, 1000, $delimiter, $enclosure); // $escape only got added in 5.3.0
fclose($fp);
return $data;
}
}
?>
It seems to be doing exactly what you asked for : it uses a stream, which points to a temporary filehandle in memory, to use fgetcsv on it.
See PHP input/output streams for the documentation about, amongst others, the php://temp stream wrapper.
Of course, you should test that it works OK for you -- but, at least, this should give you an idea of how to achieve this ;-)
I'm horrified that no one has answered with this solution:
<?php
$string = "I tried, honestly!";
$fp = fopen('data://text/plain,' . $string,'r');
echo stream_get_contents($fp);
#fputcsv($fp, .......);
?>
And a memory-hungry but more complete solution:
<?php
class StringStream
{
private $Variable = NULL;
protected $fp = 0;
final public function __construct(&$String, $Mode = 'r')
{
$this->Variable = &$String; // keep a reference to the caller's string
switch($Mode)
{
case 'r':
case 'r+':
$this->fp = fopen('php://memory','r+');
fwrite($this->fp, strval($String));
rewind($this->fp);
break;
case 'a':
case 'a+':
$this->fp = fopen('php://memory','r+');
fwrite($this->fp, strval($String));
break;
default:
$this->fp = fopen('php://memory',$Mode);
}
}
final public function flush()
{
# Update variable; offset 0 reads the whole stream, not just the unread tail
$this->Variable = stream_get_contents($this->fp, -1, 0);
}
final public function __destruct()
{
# Update variable on destruction;
$this->Variable = stream_get_contents($this->fp, -1, 0);
}
public function __get($name)
{
switch($name)
{
case 'fp': return $this->fp;
default: trigger_error('Undefined property: ('.$name.').');
}
return NULL;
}
}
$string = 'Some bad-ass string';
$stream = new StringStream($string);
echo stream_get_contents($stream->fp);
#fputcsv($stream->fp, .......);
?>
To answer your general question, yes you can treat a variable as a file stream.
http://www.php.net/manual/en/function.stream-context-create.php
The following is a copy and paste from a few different comments on the PHP manual (so I cannot vouch for how production ready it is):
<?php
class VariableStream {
private $position;
private $varname;
public function stream_open($path, $mode, $options, &$opened_path) {
$url = parse_url($path);
$this->varname = $url["host"];
$this->position = 0;
return true;
}
public function stream_read($count) {
$p=&$this->position;
$ret = substr($GLOBALS[$this->varname], $p, $count);
$p += strlen($ret);
return $ret;
}
public function stream_write($data){
$v=&$GLOBALS[$this->varname];
$l=strlen($data);
$p=&$this->position;
$v = substr($v, 0, $p) . $data . substr($v, $p += $l);
return $l;
}
public function stream_tell() {
return $this->position;
}
public function stream_eof() {
return $this->position >= strlen($GLOBALS[$this->varname]);
}
public function stream_seek($offset, $whence) {
$l=strlen($GLOBALS[$this->varname]); // call-time pass-by-reference is a fatal error since PHP 5.4
$p=&$this->position;
switch ($whence) {
case SEEK_SET: $newPos = $offset; break;
case SEEK_CUR: $newPos = $p + $offset; break;
case SEEK_END: $newPos = $l + $offset; break;
default: return false;
}
$ret = ($newPos >=0 && $newPos <=$l);
if ($ret) $p=$newPos;
return $ret;
}
}
stream_wrapper_register("var", "VariableStream");
$csv = "foo,bar\ntest,1,2,3\n";
$row = 1;
if (($handle = fopen("var://csv", "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
$num = count($data);
echo "<p> $num fields in line $row: <br /></p>\n";
$row++;
for ($c=0; $c < $num; $c++) {
echo $data[$c] . "<br />\n";
}
}
fclose($handle);
}
?>
Of course, for your particular example, there are simpler stream methods that can be used.
You can use stream handles such as php://memory to achieve what you're after. Just open, fwrite, rewind, and you should be able to use fgetcsv.
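A minimal sketch of those steps (open, fwrite, rewind, then fgetcsv). The function name parseCsvString is hypothetical, and $escape is left out of the fgetcsv() call because, as noted above, it only exists from PHP 5.3.0.

```php
<?php
// Hypothetical helper: parse a CSV string by routing it through php://memory.
function parseCsvString($input, $delimiter = ',')
{
    $fp = fopen('php://memory', 'r+');
    fwrite($fp, $input); // copy the string into the in-memory stream
    rewind($fp);         // move back to the start before reading
    $rows = array();
    while (($data = fgetcsv($fp, 0, $delimiter, '"')) !== false) {
        $rows[] = $data;
    }
    fclose($fp);
    return $rows;
}

$rows = parseCsvString("foo,bar\ntest,1,2,3\n");
```

Unlike the single-row str_getcsv replacement quoted earlier, this loops until the end of the stream, so a multi-line string yields one array per line.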
Unfortunately, that is not possible. You cannot treat a string as if it's a stream from a file. You would indeed have to first write the string to a file, and then open said file using fopen.
And now for the obvious part, have you considered upgrading?
