CSV file to flat array with materialized path - php

I have a CSV file which contains a list of files and directories:
Depth;Directory;
0;bin
1;basename
1;bash
1;cat
1;cgclassify
1;cgcreate
0;etc
1;aliases
1;audit
2;auditd.conf
2;audit.rules
0;home
....
Each line depends on the ones above it: the depth value determines which earlier line is its parent.
I would like to build an array like this one so that I can store it in my MongoDB collection using Materialized Paths:
$directories = array(
array('_id' => null,
'name' => "auditd.conf",
'path' => "etc,audit,auditd.conf"),
array(....)
);
I don't know how to go about this.
Any ideas?
Edit 1:
I'm not really working with directories - it's an example, so I cannot use FileSystems functions or FileIterators.
Edit 2:
From this CSV file, I'm able to create a JSON nested array:
function nestedarray($row){
list($id, $depth, $cmd) = $row;
$arr = &$tree_map;
while($depth--) {
end($arr );
$arr = &$arr [key($arr )];
}
$arr [$cmd] = null;
}
But I'm not sure it's the best way to proceed...

This should do the trick, I think (it worked in my test, at least, with your data). Note that this code doesn't do much error checking and expects the input data to be in proper order (i.e. starting with level 0 and no holes).
<?php
$input = explode("\n",file_get_contents($argv[1]));
array_shift($input);
$data = array();
foreach($input as $dir)
{
if(count($parts = str_getcsv($dir, ';')) < 2)
{
continue;
}
if($parts[0] == 0)
{
$last = array('_id' => null,
'name' => $parts[1],
'path' => $parts[1]);
$levels = array($last);
$data[] = $last;
}
else
{
$last = array('_id' => null,
'name' => $parts[1],
'path' => $levels[$parts[0] - 1]['path'] . ',' . $parts[1]);
$levels[$parts[0]] = $last;
$data[] = $last;
}
}
print_r($data);
?>
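Since the code above does little error checking, here is a minimal sketch of the same idea with a check for "holes" in the depth sequence. The function name buildMaterializedPaths() is hypothetical, and the sketch assumes plain `depth;name` lines without quoted fields, so it splits with explode() rather than a CSV parser.

```php
<?php
// Hypothetical helper: converts "depth;name" lines into materialized-path records.
// Assumes fields are not quoted, so a plain explode() on ';' is enough.
function buildMaterializedPaths(array $lines): array
{
    $stack = [];   // $stack[$d] holds the name of the current ancestor at depth $d
    $records = [];

    foreach ($lines as $line) {
        $parts = explode(';', trim($line));
        // Skip the header row, blank lines, and anything without a numeric depth.
        if (count($parts) < 2 || !ctype_digit($parts[0])) {
            continue;
        }
        $depth = (int) $parts[0];
        $name  = $parts[1];

        // A row may go at most one level deeper than the previous row.
        if ($depth > count($stack)) {
            throw new UnexpectedValueException("Depth $depth has no parent: $name");
        }

        // Drop everything at this depth or deeper, then push the current name.
        $stack   = array_slice($stack, 0, $depth);
        $stack[] = $name;

        $records[] = [
            '_id'  => null,
            'name' => $name,
            'path' => implode(',', $stack),
        ];
    }
    return $records;
}
```

Calling it with file('/path/to/your/csv_file.csv') as the argument should give the same records as the loop above, while failing loudly on input that skips a level.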

The "best" way to go would be to not store your data in CSV format, as it's the Wrong Tool For The Job.
That said, here you go:
<?php
// Read the file without trailing newlines and without blank lines
$lines = file('/path/to/your/csv_file.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$directories = array();
$path = array();
foreach ($lines as $line) {
    $fields = str_getcsv($line, ';');
    // Skip headers and such
    if (count($fields) < 2 || !ctype_digit((string) $fields[0])) {
        continue;
    }
    list($depth, $dir) = $fields;
    // Keep only the ancestors above this depth. This covers siblings,
    // a return to depth 0, and a jump back up by any number of levels.
    $path = array_slice($path, 0, (int) $depth);
    // Push the current directory onto the path stack
    $path[] = $dir;
    $directories[] = array(
        '_id' => NULL,
        'name' => $dir,
        'path' => implode(',', $path)
    );
}
var_dump($directories);
Edit:
For what it's worth, once you have the desired nested structure in PHP, it would probably be a good idea to use json_encode(), serialize(), or some other format to store it to disk again, and get rid of the CSV file. Then you can just use json_decode() or unserialize() to get it back in PHP array format whenever you need it again.
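As a rough illustration of that round trip, the sketch below writes a small array to a JSON file and reads it back. The sample records and the file location are made up for the example.

```php
<?php
// Sample records in the shape built above (illustrative data only).
$directories = [
    ['_id' => null, 'name' => 'etc',   'path' => 'etc'],
    ['_id' => null, 'name' => 'audit', 'path' => 'etc,audit'],
];

// Hypothetical location; any writable path works.
$file = sys_get_temp_dir() . '/directories.json';

// Write the array as JSON ...
file_put_contents($file, json_encode($directories, JSON_PRETTY_PRINT));

// ... and read it back as nested PHP arrays (true = associative arrays, not objects).
$restored = json_decode(file_get_contents($file), true);

// The restored array has the same keys, values and types as the original.
var_dump($restored === $directories);
```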

Related

PHP - Compare array and output found values [duplicate]

I want to separate a PHP array when they have a common prefix.
$data = ['status.1', 'status.2', 'status.3',
'country.244', 'country.24', 'country.845',
'pm.4', 'pm.9', 'pm.6'];
I want each of them in separate variables like $status, $countries, $pms which will contain:
$status = [1,2,3];
$country = [244, 24, 845]
$pms = [4,9,6]
My Current code is taking 1.5 seconds to group them:
$statuses = [];
$countries = [];
$pms = [];
$start = microtime(true);
foreach($data as $item){
if(strpos($item, 'status.') !== false){
$statuses[]= substr($item,7);
}
if(strpos($item, 'country.') !== false){
$countries[]= substr($item,8);
}
if(strpos($item, 'pm.') !== false){
$pms[]= substr($item,3);
}
}
$time_elapsed_secs = microtime(true) - $start;
print_r($time_elapsed_secs);
Is there a faster way to do this?
This gives you results for any prefix dynamically: first explode on the delimiter, then append the value to the result array under its key.
To split the result into separate variables, you can use extract().
Consider the following code:
$data = array('status.1','status.2','status.3', 'country.244', 'country.24', 'country.845', 'pm.4','pm.9', 'pm.6');
$res = array();
foreach($data as $elem) {
list($key,$val) = explode(".", $elem, 2);
$res[$key][] = $val;
}
extract($res); // this will separate to var with the prefix name
echo "Status is: " . print_r($status, true); // with true, print_r() returns the dump of ["1","2","3"] instead of printing it
This snippet took less than 0.001 seconds...
Thanks @mickmackusa for the simplification
Add continue to each of the ifs, so that once one of them matches, the others are not evaluated. It isn't really needed in the last one, since the loop starts again at that point anyway. This should save a tiny bit of time, but I doubt it'll be as much as you probably want to save.
foreach($data as $item){
if(strpos($item, 'status.') !== false){
$statuses[]= substr($item,7);
continue;
}
if(strpos($item, 'country.') !== false){
$countries[]= substr($item,8);
continue;
}
if(strpos($item, 'pm.') !== false){
$pms[]= substr($item,3);
continue;
}
}
I'd use explode() to split them.
Something like this:
$arr = array("status" => [],"country" => [],"pm" => []);
foreach($data as $item){
list($key,$val) = explode(".",$item);
$arr[$key][] = $val;
}
extract($arr); // taken from David's answer
It's also much more readable code (in my opinion).
___ EDIT ____
As @DavidWinder commented, this is not dynamic and will not result in different variables; look at his answer for the most complete solution to your question.
Use explode(). It's also a good idea to pass the $limit parameter, both for performance and to avoid wrong behavior when a value contains another '.'.
$arr = [];
foreach($data as $item){
list($key,$val) = explode('.', $item, 2);
if (!$key || !$val) continue;
$arr[$key][] = $val;
}
var_dump($arr);
If it was me I would do it like so...
<?php
$data = array ('status.1', 'status.2', 'status.3',
'country.244', 'country.24', 'country.845',
'pm.4', 'pm.9', 'pm.6');
$out = array ();
foreach ( $data AS $value )
{
$value = explode ( '.', $value );
$out[$value[0]][] = $value[1];
}
print_r ( $out );
?>
I'm not sure whether this will boost performance, but you could rearrange your array so that each row has a heading and its corresponding value, and then use array_column() to pull out the group you want.
This is an example of how you could group your data in such a way. (PHP 7.1+)
$groupedData = array_map(function($arg) {
[$key, $val] = explode('.', $arg); // before PHP 7.1, use list($key, $val) = explode(...)
return array($key => $val);
}, $data);
Then, you can pull out all of the country Id's like so:
$countries = array_column($groupedData, 'country');
You can push data into their respective groups while destructuring. The only iterated function call is explode().
Creating individual variables for each group is a design flaw / mismanagement of array data.
Code:
$result = [];
foreach ($data as $value) {
[$prefix, $result[$prefix][]] = explode('.', $value, 2);
}
var_export($result);
Output:
array (
'status' =>
array (
0 => '1',
1 => '2',
2 => '3',
),
'country' =>
array (
0 => '244',
1 => '24',
2 => '845',
),
'pm' =>
array (
0 => '4',
1 => '9',
2 => '6',
),
)
Use sscanf() if you want to directly/explicitly cast the numeric values as integers.
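A minimal sketch of the sscanf() approach, assuming every item has the form `prefix.number`; items that do not match are skipped. This is one possible way to write it, not necessarily what the linked demo showed.

```php
<?php
$data = ['status.1', 'status.2', 'status.3',
         'country.244', 'country.24', 'country.845',
         'pm.4', 'pm.9', 'pm.6'];

$result = [];
foreach ($data as $value) {
    // %[^.] reads everything up to the first dot; %d reads the number as an int.
    // With only two arguments, sscanf() returns the parsed values as an array.
    $parsed = sscanf($value, '%[^.].%d');
    if (!is_array($parsed) || $parsed[1] === null) {
        continue; // not in "prefix.number" form
    }
    [$prefix, $id] = $parsed;
    $result[$prefix][] = $id;
}
var_export($result);
```

The grouped values are integers here rather than numeric strings, which matters if they are later compared strictly or encoded as JSON.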

How to make a tree from a flat array - php

I am trying to represent the whole array returned from an Amazon S3 bucket as a tree structure that one can browse.
An example of the array follows:
$files[0] = 'container/798/';
$files[1] = 'container/798/logo.png';
$files[2] = 'container/798/test folder/';
$files[3] = 'container/798/test folder/another folder/';
$files[4] = 'container/798/test folder/another folder/again test/';
$files[5] = 'container/798/test folder/another folder/test me/';
$files[6] = 'container/798/test two/';
$files[7] = 'container/798/test two/logo2.png';
And this is what I am trying to achieve:
http://i.stack.imgur.com/HBjvE.png
So far I have only managed to tell files and folders apart, but not to place them on different levels with a parent-child relation. The array shown above resides in $keys['files']. The code follows:
$keys = json_decode($result,true);
$folders = array();
$files = array();
$i =0;
foreach ($keys['files'] as $key){
if(endsWith($key, "/")){
$exploded = explode('container/'.$_SESSION['id_user'].'/',$key);
if(!empty($exploded[1]))
$folders[$i]['name'] = substr($exploded[1],0,-1);
}
else{
$exploded = explode('container/'.$_SESSION['id_user'].'/',$key);
$files[$i]['name'] = $exploded[1];
$files[$i]['size'] = "";
$files[$i]['date'] = "";
$files[$i]['preview_icon'] = "";
$files[$i]['dimensions'] = "";
$files[$i]['url'] = "";
}
$i++;
}
This code is only meant to show what I have tried; it is neither complete nor accurate. I don't know how to approach the logic that would give me the hierarchy shown in the picture. Any help would be greatly appreciated.
I don't know if this is the 'correct' way to do this, but if you want to make a recursive structure, then the easy way is to use a recursive function:
$root = array('name'=>'/', 'children' => array(), 'href'=>'');
function store_file($filename, &$parent){
if(empty($filename)) return;
$matches = array();
if(preg_match('|^([^/]+)/(.*)$|', $filename, $matches)){
$nextdir = $matches[1];
if(!isset($parent['children'][$nextdir])){
$parent['children'][$nextdir] = array('name' => $nextdir,
'children' => array(),
'href' => $parent['href'] . '/' . $nextdir);
}
store_file($matches[2], $parent['children'][$nextdir]);
} else {
$parent['children'][$filename] = array('name' => $filename,
'size' => '...',
'href' => $parent['href'] . '/' . $filename);
}
}
foreach($files as $file){
store_file($file, $root);
}
Now, every element of $root['children'] is an associative array that has either information about a file or its own children array.
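For comparison, here is an iterative sketch of the same idea that walks each path segment by reference instead of recursing. The function name buildTree() is hypothetical, and unlike the function above it treats a path as a folder only when it ends in a slash or has further segments below it.

```php
<?php
// Hypothetical iterative variant: build the tree by walking path segments by reference.
function buildTree(array $paths): array
{
    $root = ['name' => '/', 'href' => '', 'children' => []];

    foreach ($paths as $path) {
        $isFolder = substr($path, -1) === '/';
        // Split into segments and drop the empty one left by a trailing slash.
        $segments = array_values(array_filter(explode('/', $path), 'strlen'));
        $node = &$root;

        foreach ($segments as $i => $segment) {
            $isLast = ($i === count($segments) - 1);
            if (!isset($node['children'][$segment])) {
                $entry = ['name' => $segment, 'href' => $node['href'] . '/' . $segment];
                // Folders (and every intermediate segment) get a children list.
                if (!$isLast || $isFolder) {
                    $entry['children'] = [];
                }
                $node['children'][$segment] = $entry;
            }
            $node = &$node['children'][$segment];
        }
        unset($node); // break the reference before handling the next path
    }
    return $root;
}

$files = [
    'container/798/',
    'container/798/logo.png',
    'container/798/test folder/',
    'container/798/test folder/another folder/',
    'container/798/test folder/another folder/again test/',
    'container/798/test folder/another folder/test me/',
    'container/798/test two/',
    'container/798/test two/logo2.png',
];
$tree = buildTree($files);
print_r($tree);
```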

How to get Contents of text files as value in foreach loop in glob function in php?

I am developing a search engine based on the vector space model. I successfully computed tf-idf with an associative array of data defined directly in the code. Now I want the data to come from a directory that contains several folders, each holding a number of text files with dummy data. I have tried a lot but am stuck at one point with the glob function: I want each .txt file as the key and its contents as the value in the foreach loop over glob's results. Below is my code.
Tf-idf With Associative Array Data
$collection = array(
1 => 'this string is a short string but a good string',
2 => 'this one isn\'t quite like the rest but is here',
3 => 'this is a different short string that\' not as short'
);
$dictionary = array();
$docCount = array();
foreach($collection as $docID => $doc) {
$terms = explode(' ', $doc);
$docCount[$docID] = count($terms);
foreach($terms as $term) {
if(!isset($dictionary[$term])) {
$dictionary[$term] = array('df' => 0, 'postings' => array());
}
if(!isset($dictionary[$term]['postings'][$docID])) {
$dictionary[$term]['df']++;
$dictionary[$term]['postings'][$docID] = array('tf' => 0);
}
$dictionary[$term]['postings'][$docID]['tf']++;
}
}
$temp = array('docCount' => $docCount, 'dictionary' => $dictionary);
As you can see, in the first foreach loop $docID is the key and $doc is the contents (value) of each entry in the collection array. But I don't know how to implement exactly the same thing when the files are read from a directory. See the code below.
Tf-idf With .txt Files and its contents read from directory
foreach (glob("C:\\wamp\\www\\Web-info\\documents\\awd_1990_00\\*.txt") as $file) {
$file_handle = fopen($file, "r");
//echo $file;
$dictionary = array();
$docCount = array();
foreach($file as $docID=> $value) {
echo $value;
$terms = explode(' ', $doc);
$docCount[$docID] = count($terms);
foreach($terms as $term) {
if(!isset($dictionary[$term])) {
$dictionary[$term] = array('df' => 0, 'postings' => array());
}
if(!isset($dictionary[$term]['postings'][$docID])) {
$dictionary[$term]['df']++;
$dictionary[$term]['postings'][$docID] = array('tf' => 0);
}
$dictionary[$term]['postings'][$docID]['tf']++;
}
}
}
$temp = array('docCount' => $docCount, 'dictionary' => $dictionary);
This gives me an error on the inner foreach loop: "Invalid argument supplied for foreach()". As I mentioned earlier, I want each .txt file as the key and its contents as the value in that loop. Can anybody please tell me how to do this? Thanks in advance.
If you want to treat the entire file as one value, you can use file_get_contents() to read the file into a string:
$dictionary = array();
$docCount = array();
foreach (glob("C:\\wamp\\www\\Web-info\\documents\\awd_1990_00\\*.txt") as $docID) {
$value = file_get_contents($docID);
...
}
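Putting the pieces together, a sketch of the whole flow could look like the following. The helper names loadCollection() and buildIndex() are hypothetical; the indexing loop is the one from the question, except that it splits on any whitespace (an assumption made here so that line breaks inside a file do not glue words together).

```php
<?php
// Hypothetical helper: maps each .txt file path in $dir to its contents.
function loadCollection(string $dir): array
{
    $collection = [];
    $pattern = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . '*.txt';
    // glob() can return false on error, so fall back to an empty list.
    foreach (glob($pattern) ?: [] as $file) {
        $contents = file_get_contents($file);
        if ($contents !== false) {
            $collection[$file] = $contents;
        }
    }
    return $collection;
}

// The indexing loop from the question, wrapped in a function.
function buildIndex(array $collection): array
{
    $dictionary = [];
    $docCount = [];
    foreach ($collection as $docID => $doc) {
        // Split on runs of whitespace (spaces, tabs, newlines) and drop empty pieces.
        $terms = preg_split('/\s+/', $doc, -1, PREG_SPLIT_NO_EMPTY);
        $docCount[$docID] = count($terms);
        foreach ($terms as $term) {
            if (!isset($dictionary[$term])) {
                $dictionary[$term] = ['df' => 0, 'postings' => []];
            }
            if (!isset($dictionary[$term]['postings'][$docID])) {
                $dictionary[$term]['df']++;
                $dictionary[$term]['postings'][$docID] = ['tf' => 0];
            }
            $dictionary[$term]['postings'][$docID]['tf']++;
        }
    }
    return ['docCount' => $docCount, 'dictionary' => $dictionary];
}
```

Usage would then be something like `$temp = buildIndex(loadCollection('C:\\wamp\\www\\Web-info\\documents\\awd_1990_00'));`, with the file path serving as the document ID.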

Convert array to an .ini file

I need to parse an .ini file into an array, and later change the values of the array and export it to the same .ini file.
I managed to read the file, but didn’t find any simple way to write it back.
Any suggestions?
Sample .ini file:
1 = 0;
2 = 1372240157; // timestamp.
To write the .ini file back, you need to create your own function, because PHP offers no built-in functions for .ini files other than those for reading (see http://php.net/manual/pl/function.parse-ini-file.php).
A function that turns a multidimensional array into a string compatible with .ini syntax might look like this:
function arr2ini(array $a, array $parent = array())
{
$out = '';
foreach ($a as $k => $v)
{
if (is_array($v))
{
//subsection case
//merge all the sections into one array...
$sec = array_merge((array) $parent, (array) $k);
//add section information to the output
$out .= '[' . join('.', $sec) . ']' . PHP_EOL;
//recursively traverse deeper
$out .= arr2ini($v, $sec);
}
else
{
//plain key->value case
$out .= "$k=$v" . PHP_EOL;
}
}
return $out;
}
You can test it like this:
$x = [
'section1' => [
'key1' => 'value1',
'key2' => 'value2',
'subsection' => [
'subkey' => 'subvalue',
'further' => ['a' => 5],
'further2' => ['b' => -5]]]];
echo arr2ini($x);
(Note that the short array syntax is available only since PHP 5.4.)
Also note that it doesn't preserve comments such as the ones in the sample file from your question. There is no easy way to retain them when software (as opposed to a human) writes the file back.
I've made significant changes to the function provided by rr- (many thanks for the kick-start!)
I was unhappy with the way multidimensional properties are handled in that version. I took the example ini file from the php documentation page for parse_ini_file and got a result which included the keys third_section.phpversion and third_section.urls - not what I expected.
I tried using a RecursiveArrayIterator for unlimited nesting, but unfortunately, a header with key-value pairs under it is the maximum limit of recursion that parse_ini_string will process before choking on an error message.
So I started from scratch, added some curveballs as the fourth and last items, and ended up with this:
$test = array(
'first_section' => array(
'one' => 1,
'five' => 5,
'animal' => "Dodo bird",
),
'second_section' => array(
'path' => "/usr/local/bin",
'URL' => "http://www.example.com/username",
),
'third_section' => array(
'phpversion' => array(5.0, 5.1, 5.2, 5.3),
'urls' => array(
'svn' => "http://svn.php.net",
'git' => "http://git.php.net",
),
),
'fourth_section' => array(
7.0, 7.1, 7.2, 7.3,
),
'last_item' => 23,
);
echo '<pre>';
print_r($test);
echo '<hr>';
$ini = build_ini_string($test);
echo $ini;
echo '<hr>';
print_r( parse_ini_string($ini, true) );
function build_ini_string(array $a) {
$out = '';
$sectionless = '';
foreach($a as $rootkey => $rootvalue){
if(is_array($rootvalue)){
// find out if the root-level item is an indexed or associative array
$indexed_root = array_keys($rootvalue) == range(0, count($rootvalue) - 1);
// associative arrays at the root level have a section heading
if(!$indexed_root) $out .= PHP_EOL."[$rootkey]".PHP_EOL;
// loop through items under a section heading
foreach($rootvalue as $key => $value){
if(is_array($value)){
// indexed arrays under a section heading will have their key omitted
$indexed_item = array_keys($value) == range(0, count($value) - 1);
foreach($value as $subkey=>$subvalue){
// omit subkey for indexed arrays
if($indexed_item) $subkey = "";
// add this line under the section heading
$out .= "{$key}[$subkey] = $subvalue" . PHP_EOL;
}
}else{
if($indexed_root){
// root level indexed array becomes sectionless
$sectionless .= "{$rootkey}[] = $value" . PHP_EOL;
}else{
// plain values within root level sections
$out .= "$key = $value" . PHP_EOL;
}
}
}
}else{
// root level sectionless values
$sectionless .= "$rootkey = $rootvalue" . PHP_EOL;
}
}
return $sectionless.$out;
}
My input and output arrays match (functionally, anyway) and my ini file looks like this:
fourth_section[] = 7
fourth_section[] = 7.1
fourth_section[] = 7.2
fourth_section[] = 7.3
last_item = 23
[first_section]
one = 1
five = 5
animal = Dodo bird
[second_section]
path = /usr/local/bin
URL = http://www.example.com/username
[third_section]
phpversion[] = 5
phpversion[] = 5.1
phpversion[] = 5.2
phpversion[] = 5.3
urls[svn] = http://svn.php.net
urls[git] = http://git.php.net
I know it may be a little overkill, but I really needed this function in two of my own projects. Now I can read an ini file, make changes and save it.
The answer by rr- works; I made one change, in the else branch:
//plain key->value case
$out .= "$k=$v" . PHP_EOL;
change it to
//plain key->value case
$out .= "$k=\"$v\"" . PHP_EOL;
With double quotes around the value, you can store larger values in the INI file; otherwise the parse_ini_* functions will have trouble with them.
http://missioncriticallabs.com/blog/2009/08/double-quotation-marks-in-php-ini-files/
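To illustrate why quoting helps, here is a small sketch that quotes string values and round-trips them through parse_ini_string(). The helper names iniValue() and flatArrayToIni() are hypothetical. The sketch handles only flat arrays and deliberately rejects values that themselves contain double quotes, since escaping rules are not covered here; values containing `$` are also not considered.

```php
<?php
// Hypothetical helper: formats one value for an .ini line.
function iniValue($value): string
{
    if (is_bool($value)) {
        return $value ? 'true' : 'false';
    }
    if (is_int($value) || is_float($value)) {
        return (string) $value;
    }
    $value = (string) $value;
    if (strpos($value, '"') !== false) {
        // This sketch refuses rather than guessing at escaping rules.
        throw new InvalidArgumentException('Values containing double quotes are not supported here');
    }
    return '"' . $value . '"';
}

// Hypothetical helper: writes a flat key => value array as .ini lines.
function flatArrayToIni(array $values): string
{
    $out = '';
    foreach ($values as $key => $value) {
        $out .= $key . ' = ' . iniValue($value) . PHP_EOL;
    }
    return $out;
}

// Unquoted, the text after ';' would be read as a comment; quoted, it survives.
$ini = flatArrayToIni(['animal' => 'Dodo bird', 'note' => 'yes; really', 'count' => 3]);
echo $ini;
print_r(parse_ini_string($ini));
```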
This is my enhanced version of the answer by rr- (thanks to him). My function is part of a class in the Laravel ecosystem, so it uses Arr::isAssoc, which simply detects whether the given array is associative.
private function arrayToConfig(array $array, array $parent = []): string
{
$returnValue = '';
foreach ($array as $key => $value)
{
if (is_array($value)) // Subsection case
{
// Merge all the sections into one array
if (is_int($key)) $key++;
$subSection = array_merge($parent, (array)$key);
// Add section information to the output
if (Arr::isAssoc($value))
{
if (count($subSection) > 1) $returnValue .= PHP_EOL;
$returnValue .= '[' . implode(':', $subSection) . ']' . PHP_EOL;
}
// Recursively traverse deeper
$returnValue .= $this->arrayToConfig($value, $subSection);
$returnValue .= PHP_EOL;
}
elseif (isset($value)) $returnValue .= "$key=" . (is_bool($value) ? var_export($value, true) : $value) . PHP_EOL; // Plain key->value case
}
return count($parent) ? $returnValue : rtrim($returnValue) . PHP_EOL;
}
What about using PHP's built-in functions? http://php.net/manual/en/function.parse-ini-file.php
