Recursively crawl data folder and create multidimensional array - php

Here is my situation: my project folder contains a 'data' folder that holds .json files. These .json files are organized in nested folders.
Something like:
/data
    /content
        /data1.json
        /data2.json
    /project
        /data3.json
I'd like to create a function that recursively crawls through the data folder and stores the contents of all .json files in one multidimensional array, which would make it relatively easy to add static data for use in my project. So the expected result should be:
$data = array(
'content' => array(
'data1' => <data-from-data1.json>,
'data2' => <data-from-data2.json>
),
'project' => array(
'data3' => <data-from-data3.json>
)
);
UPDATE
I have tried the following, but this only returns the first level:
$data = array();
$directoryArray = scandir('./data');
foreach($directoryArray as $key => $value) {
$data[$key] = $value;
}
Is there a neat way to achieve this?

You can use RecursiveIteratorIterator. Skip the . and .. entries; the loop in the script below then walks every remaining subdirectory and file.
// strip the file extension from a filename
function removeExtension($filename){
    return preg_replace('/\\.[^.\\s]{3,4}$/', '', $filename);
}
$startpath = 'data';
$ritit = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($startpath), RecursiveIteratorIterator::CHILD_FIRST);
$result = [];
foreach ($ritit as $splFileInfo) {
    // skip the . and .. entries
    if ($splFileInfo->getFilename() == '.') continue;
    if ($splFileInfo->getFilename() == '..') continue;
    if ($splFileInfo->isDir()){
        $path = [removeExtension($splFileInfo->getFilename()) => []];
    }else{
        // file_get_contents() only needs the full path of the file
        $path = [removeExtension($splFileInfo->getFilename()) => json_decode(file_get_contents($splFileInfo->getPathname()))];
    }
    // wrap the entry in its parent directories, innermost first
    for ($depth = $ritit->getDepth() - 1; $depth >= 0; $depth--) {
        $path = [$ritit->getSubIterator($depth)->current()->getFilename() => $path];
    }
    $result = array_merge_recursive($result, $path);
}
print_r($result);
My json files contain:
data1.json: {"foo": "foo"}
data2.json: {"bar": "bar"}
data3.json: {"foobar": "foobar"}
The result is:
Array
(
[content] => Array
(
[data1] => stdClass Object
(
[foo] => foo
)
[data2] => stdClass Object
(
[bar] => bar
)
)
[project] => Array
(
[data3] => stdClass Object
(
[foobar] => foobar
)
)
)

You do not really have to use RecursiveIteratorIterator. As a programmer you should always know how to deal with recursive data structures, be it XML content, a folder tree or something else. You can write a recursive function to handle such tasks.
Recursive functions are functions that call themselves to work through data with multiple layers or dimensions.
For example, the scanFolder function below processes the contents of a directory, and it calls itself whenever it encounters a subdirectory.
function scanFolder($path)
{
    echo "scanning dir: '$path'" . PHP_EOL;
    $contents = array_diff(scandir($path), ['.', '..']);
    $result = [];
    foreach ($contents as $item) {
        $fullPath = $path . DIRECTORY_SEPARATOR . $item;
        echo "processing '$fullPath'" . PHP_EOL;
        if (is_dir($fullPath)) {
            // folder: recurse and store its contents under the folder name
            $result[$item] = scanFolder($fullPath);
        } elseif (pathinfo($item, PATHINFO_EXTENSION) === 'json') {
            // json file: store the decoded data under the file name without its extension
            $result[pathinfo($item, PATHINFO_FILENAME)] = json_decode(file_get_contents($fullPath));
        }
    }
    return $result;
}
IMO, this is a cleaner and more expressive way to accomplish this task and I wonder what others have to say about this statement.

I think you can use RecursiveDirectoryIterator; the PHP manual has documentation for this class.
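A minimal sketch of what that could look like, assuming the layout from the question. The function name loadJsonTree and the throwaway demo directory are made up for illustration; hasChildren(), getChildren(), getExtension() and getBasename() are the standard SPL methods. Passing true to json_decode returns arrays instead of stdClass objects, which may or may not be what you want.

```php
<?php
// Hypothetical helper (not from the answers above): build a nested array
// from a directory of .json files using RecursiveDirectoryIterator.
function loadJsonTree(RecursiveDirectoryIterator $dir): array
{
    $tree = [];
    foreach ($dir as $entry) {
        if ($dir->hasChildren()) {
            // Subdirectory: recurse into its own iterator.
            $tree[$entry->getFilename()] = loadJsonTree($dir->getChildren());
        } elseif ($entry->getExtension() === 'json') {
            // File: key by name without extension, decode to an array.
            $tree[$entry->getBasename('.json')] =
                json_decode(file_get_contents($entry->getPathname()), true);
        }
    }
    return $tree;
}

// Demo on a throwaway directory mirroring the layout in the question.
$root = sys_get_temp_dir() . '/json_tree_' . uniqid();
mkdir($root . '/content', 0777, true);
mkdir($root . '/project', 0777, true);
file_put_contents($root . '/content/data1.json', '{"foo": "foo"}');
file_put_contents($root . '/content/data2.json', '{"bar": "bar"}');
file_put_contents($root . '/project/data3.json', '{"foobar": "foobar"}');

$data = loadJsonTree(
    new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
);
ksort($data); // directory order is not guaranteed, so sort for display
print_r($data);
```

SKIP_DOTS removes the need to filter . and .. by hand, and files without a .json extension are simply ignored.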

Related

How to create hierarchy in array?

I know I'm not describing my question well, but I want to create a "nested array", as you can see:
folder/ -> folder/file.txt, folder/folder2/ -> folder/folder2/file.txt, folder/folder2/folder3/ -> etc
but instead, I get:
E:\wamp\www\index.php:31:
array (size=3)
'folder/' =>
array (size=1)
0 => string 'folder/file.txt' (length=15)
'folder/folder2/' =>
array (size=1)
0 => string 'folder/folder2/file.txt' (length=23)
'folder/folder2/folder3/' =>
array (size=1)
0 => string 'folder/folder2/folder3/file.txt' (length=31)
My code is:
$array = [
'folder/',
'folder/folder2/folder3/',
'folder/folder2/',
'folder/folder2/folder3/file.txt',
'folder/folder2/file.txt',
'folder/file.txt'
];
sort($array);
$array = array_flip($array);
function recursive_dir_nested($a) {
foreach ($a as $k => $v) {
if (preg_match("/\/$/", $k)) {
$a[$k] = [];
}
if (preg_match("/\/[^\/]+$/", $k)) {
$nk = preg_replace("/\/[^\/]+$/", "/", $k);
if (array_key_exists($nk, $a)) {
$a[$nk][] = $k;
unset($a[$k]);
} else {
recursive_dir_nested($a);
}
}
}
return $a;
}
I know I'm doing something wrong, but I'm not sure what... How can I solve this?
Not sure if using regexes is the best way to go. This builds on another answer - PHP - Make multi-dimensional associative array from a delimited string, but adds in the idea of using an array of entries. The one thing to note is that when adding new entries, if the element isn't currently an array, it is turned into an array so that it can contain multiple entries (the if ( !is_array($current) ) { part).
It uses each string and builds the folder hierarchy from that, saving the last part as the file name to be added specifically to the folder element...
$array = [
'folder/',
'folder/folder2/folder3/',
'folder/folder2/',
'folder/folder2/folder3/file.txt',
'folder/folder2/file.txt',
'folder/file.txt'
];
sort($array);
$output = [];
foreach ( $array as $entry ) {
$split = explode("/", $entry);
$current = &$output;
$file = array_pop($split);
foreach ( $split as $level ) {
if ( !isset($current[$level]) ){
if ( !is_array($current) ) {
$current = [ $current ];
}
$current[$level] = [];
}
$current = &$current[$level];
}
if ( !empty($file) ) {
$current = $file;
}
}
print_r($output);
This gives you...
Array
(
[folder] => Array
(
[0] => file.txt
[folder2] => Array
(
[0] => file.txt
[folder3] => file.txt
)
)
)
You can nest arrays in PHP. You might also want to use keys for the names of the directories:
$array = [
'folder' => [
'folder2' => [
'folder3' => [
'file.txt'
],
'file.txt'
],
'file.txt'
]
];
You could check each item with is_array() to see whether it is itself an array, and treat it as a string if it isn't.
See here for more info: php.net/manual/en/language.types.array.php
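As a rough sketch of that is_array() check, here is one way to walk such a nested array and print it as an indented tree. The function name renderTree is invented for this example, and it assumes that array values are directories (keyed by name) and everything else is a file name.

```php
<?php
// Hypothetical helper: turn a nested array into indented lines.
function renderTree(array $tree, int $depth = 0): array
{
    $lines = [];
    foreach ($tree as $key => $item) {
        $indent = str_repeat('  ', $depth);
        if (is_array($item)) {
            // An array value is a directory; its key is the directory name.
            $lines[] = $indent . $key . '/';
            $lines = array_merge($lines, renderTree($item, $depth + 1));
        } else {
            // Anything else is treated as a file name.
            $lines[] = $indent . $item;
        }
    }
    return $lines;
}

$array = [
    'folder' => [
        'folder2' => [
            'folder3' => ['file.txt'],
            'file.txt',
        ],
        'file.txt',
    ],
];

echo implode("\n", renderTree($array)), "\n";
```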

Convert single array to multidimensional array in php

I have a problem I have been trying to solve for some hours now, but I simply can't find a solution.
I have a single array of paths:
$singleArray = array(
'/Web',
'/Web/Test1',
'/Web/Test2',
'/Web/Test2/Subfolder',
'/Web/Test3',
'/Public'
);
From that array I want to create a multidimensional array, which keeps the keys but puts subfolders inside the correct parent folders. Later I want to loop over the new array to create a folder tree (but that's not a problem).
The new array should look like this:
$multiArray = array(
'/Web'=>array(
'/Web/Test1'=>array(),
'/Web/Test2'=>array(
'/Web/Test2/Subfolder'=>array()
),
'/Web/Test3'=>array()
),
'/Public'=>array()
);
The code below builds the array you want. The key to solving the problem is to take a reference to the current level of the array on every iteration.
<?php
$singleArray = array(
'/Web',
'/Web/Test1',
'/Web/Test2',
'/Web/Test2/Subfolder',
'/Web/Test3',
'/Public'
);
$multiArray = array();
foreach ($singleArray as $path) {
$parts = explode('/', trim($path, '/'));
$section = &$multiArray;
$sectionName = '';
foreach ($parts as $part) {
$sectionName .= '/' . $part;
if (array_key_exists($sectionName, $section) === false) {
$section[$sectionName] = array();
}
$section = &$section[$sectionName];
}
}
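The same reference-walking idea can be wrapped in a function, sketched here under the assumption that the paths always use / as the separator. The name pathsToTree is made up for this example. Because every path starts again from the root, the input does not need to be sorted.

```php
<?php
// Hypothetical wrapper around the reference-walking loop.
function pathsToTree(array $paths): array
{
    $tree = [];
    foreach ($paths as $path) {
        $section = &$tree;          // start at the root for every path
        $sectionName = '';
        foreach (explode('/', trim($path, '/')) as $part) {
            $sectionName .= '/' . $part;
            if (!isset($section[$sectionName])) {
                $section[$sectionName] = [];
            }
            $section = &$section[$sectionName];   // descend one level
        }
        unset($section);            // break the reference before the next path
    }
    return $tree;
}

$singleArray = array(
    '/Web',
    '/Web/Test1',
    '/Web/Test2',
    '/Web/Test2/Subfolder',
    '/Web/Test3',
    '/Public'
);
print_r(pathsToTree($singleArray));
```

The unset() is not strictly required here, since the reference is rebound at the top of the loop, but it avoids leaving a dangling reference around.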
Got this working! Great challenge!
First I sort the array by number of folders, so that the first to be processed are those with fewest folders (in the root).
Then the function iterates through each of the array items and each folder in that item, comparing it to the existing items in the array and, if it exists, placing it inside of that item as a multidimensional array.
This is written for up to two subfolders - /root/sub1/sub2 - but it's quite straightforward, so it is simple to extend it for deeper nesting.
This sample code also prints out the before/after arrays:
$singleArray = array(
'/Web',
'/Web/Test1',
'/Web/Test2',
'/Web/Test2/Subfolder',
'/Web/Test3',
'/Public'
);
echo "<pre>";
print_r($singleArray);
$multiArray = array();
//first sort array by how many folders there are so that root folders are processed first
usort($singleArray, function($a, $b) {
$a_folders = explode("/", $a);
$b_folders = explode("/", $b);
$a_num = count($a_folders); //number of folders in first
$b_num = count($b_folders); //number of folders in second
if($a_num > $b_num) return -1;
elseif($a_num < $b_num) return 1;
else return 0;
});
//foreach in array
foreach($singleArray as $item){
//get names of folders
$folders = explode("/", $item);
//if the first folder exists
if(in_array($folders[0], $multiArray)){
$key1 = array_search($folders[0], $multiArray);
//repeat for subfolder #1
if(in_array($folders[1], $multiArray[$key1])){
$key2 = array_search($folders[1], $multiArray[$key1]);
//repeat for subfolder #2
if(in_array($folders[2], $multiArray[$key1][$key2])){
$key3 = array_search($folders[2], $multiArray[$key1][$key2]);
array_push($multiArray[$key1][$key2][$key3], $item);
} else array_push($multiArray[$key1][$key2], $item);
} else array_push($multiArray[$key1], $item);
} else array_push($multiArray, $item);
}
//reverse the array so that it looks nice
$multiArray = array_reverse($multiArray);
print_r($multiArray);
This will output:
Array
(
[0] => /Web
[1] => /Web/Test1
[2] => /Web/Test2
[3] => /Web/Test2/Subfolder
[4] => /Web/Test3
[5] => /Public
)
Array
(
[0] => /Web
[1] => /Public
[2] => /Web/Test1
[3] => /Web/Test2
[4] => /Web/Test3
[5] => /Web/Test2/Subfolder
)

Make a nested array of subdirectories using RecursiveDirectoryIterator

I have the following directory structure:
test
directory_in_test
directory_in_directory_in_test
directory2_in_test
directory_in_directory2_in_test
abc.php
index.php
I am trying to make a function that will give a multidimensional array of sub-directories. Required output something like :
[directories] => Array(
[test] => Array(
[directory_in_test] => Array(
[directory_in_directory_in_test] => null
)
[directory2_in_test] => Array(
[directory_in_directory2_in_test] => null
)
)
)
I have tried to use RecursiveIteratorIterator with RecursiveDirectoryIterator, but it gives a one-level array of directories and files, which is far from my requirement. Here is the code and the result I get:
Code
<?php
public function findDirectories($path = '', $like = '')
{
$path = (is_dir($path)) ? $path : getcwd();
$directories = array();
$iterator = new \RecursiveIteratorIterator(new \RecursiveDirectoryIterator($path));
foreach ($iterator as $directory) {
if($directory->isDir())
$directories[] = $directory->getPathName();
}
return $directories;
}
Result on printing $directories
Array
(
[0] => D:\xampp\htdocs\raheelwp\file-resolver\tests\.
[1] => D:\xampp\htdocs\raheelwp\file-resolver\tests\..
[2] => D:\xampp\htdocs\raheelwp\file-resolver\tests\directory2_in_test\.
[3] => D:\xampp\htdocs\raheelwp\file-resolver\tests\directory2_in_test\..
[4] => D:\xampp\htdocs\raheelwp\file-resolver\tests\directory2_in_test\directory_in_directory2_in_test\.
[5] => D:\xampp\htdocs\raheelwp\file-resolver\tests\directory2_in_test\directory_in_directory2_in_test\..
[6] => D:\xampp\htdocs\raheelwp\file-resolver\tests\directory_in_test\.
[7] => D:\xampp\htdocs\raheelwp\file-resolver\tests\directory_in_test\..
[8] => D:\xampp\htdocs\raheelwp\file-resolver\tests\directory_in_test\direcotry_in_directory_in_test\.
[9] => D:\xampp\htdocs\raheelwp\file-resolver\tests\directory_in_test\direcotry_in_directory_in_test\..
)
<?php
$it = new RecursiveDirectoryIterator(".", RecursiveDirectoryIterator::SKIP_DOTS);
$it = new RecursiveIteratorIterator($it);
$files = new RecursiveArrayIterator(array());
foreach ($it as $fi) {
    // start at the root node for every file
    $node = $files;
    $dirs = explode('/', $fi->getPath());
    foreach ($dirs as $path) {
        // create the node for this directory if needed, then descend into it
        if (!isset($node[$path])) {
            $node[$path] = new RecursiveArrayIterator(array());
        }
        $node = $node[$path];
    }
    $node[$fi->getFilename()] = $fi->getFilename();
}
$a = array();
createArray($a, $files);
print_r($a);
function createArray(&$a, $it) {
    foreach ($it as $k => $tmp) {
        if (is_string($tmp)) {
            // a string is a file name
            $a[] = $tmp;
        } else {
            // anything else is a directory node: recurse into it
            $a[$k] = array();
            createArray($a[$k], $tmp);
        }
    }
}
The code is fairly simple, and split in two parts even though it could easily be created in just one part. The first part will split the directories into separate RecursiveArrayIterators, so you keep the "iterator" capabilities to do all kind of other stuff with it. This is often useful when you are using the SPL iterators to begin with.
The second part, the createArray function basically uses an array reference to point to the "current" directory. Since it will be a multidimensional array, we do not have to worry about "where" in the array we actually are (it could be the 1st level, it might as well be the 100th level if your directory structure goes that deep). It just checks if the given element is a string, if so, it's a file, otherwise it's a directory so we recursively call the createArray again.
There might be easier solutions, but I reckon most of them use a basic array-reference system nevertheless.
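Since the question asked for directories only, with null for a directory that has no subdirectories, here is a rough sketch of that variant. Note that it swaps in plain recursion over FilesystemIterator rather than RecursiveIteratorIterator; the function name directoryTree and the temporary demo directory are invented for the example, and the placement of index.php directly under test is an assumption.

```php
<?php
// Hypothetical helper: nested array of subdirectories only; a directory
// without subdirectories maps to null.
function directoryTree(string $path): ?array
{
    $tree = [];
    $it = new FilesystemIterator($path, FilesystemIterator::SKIP_DOTS);
    foreach ($it as $entry) {
        if ($entry->isDir()) {
            // Recurse using the full path of the subdirectory.
            $tree[$entry->getFilename()] = directoryTree($entry->getPathname());
        }
    }
    ksort($tree); // directory order is not guaranteed
    return $tree === [] ? null : $tree;
}

// Demo on a throwaway copy of the structure from the question.
$root = sys_get_temp_dir() . '/dir_tree_' . uniqid();
mkdir($root . '/test/directory_in_test/directory_in_directory_in_test', 0777, true);
mkdir($root . '/test/directory2_in_test/directory_in_directory2_in_test', 0777, true);
touch($root . '/test/index.php'); // files are ignored

$directories = directoryTree($root);
print_r($directories);
```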

PHP Array append values

I am having a problem appending a few options to an array of modules. I am using Opencart and trying to extend a module by adding an image. To do this and ensure that the code will not break anything in the future I wanted to add to the array instead of replace it.
This is the code I have so far:
if (isset($this->request->post['special_module'])) {
$modules = $this->request->post['special_module'];
} elseif ($this->config->get('special_module')) {
$modules = $this->config->get('special_module');
}
$this->load->model('tool/image');
foreach ($modules as $module) {
if (isset($module['image']) && file_exists(DIR_IMAGE . $module['image'])) {
$image = $module['image'];
} else {
$image = 'no_image.jpg';
}
array_push($module, array(
'image' => $image,
'thumb' => $this->model_tool_image->resize($image, 100, 100)
));
}
print_r($modules);exit;
$this->data['modules'] = $modules;
Print Array, no image or thumb:
Array
(
[0] => Array
(
[image_width] => 307
[image_height] => 234
[layout_id] => 1
[position] => column_right
[status] => 1
[sort_order] => 1
)
)
When I do array_push do I need to assign this back to the array?
$module is overwritten by the foreach() loop on every iteration. So your push is basically a no-op, because foreach replaces the previous $module (the one you pushed to) with the next value coming out of $modules. You'd need something more like this:
foreach($modules as &$module) {
...
$module['image'] = $image;
$module['thumb'] = ...;
}
The & before $module in the foreach turns it into a reference, so any modifications to $module within the loop will modify the original element in $modules, rather than a copy which would get trashed on every iteration.
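To make the difference concrete, here is a small self-contained sketch. The $modules data and the 'thumb-of-' string are placeholders, not Opencart code. It also shows the key-based alternative, and the unset() that is usually recommended after a by-reference loop.

```php
<?php
$modules = [
    ['layout_id' => 1, 'status' => 1],
    ['layout_id' => 2, 'status' => 0],
];

// By value: $module is a copy, so $modules is left untouched.
foreach ($modules as $module) {
    $module['image'] = 'no_image.jpg';
}
$unchanged = !isset($modules[0]['image']);

// By reference: writes go through to the elements of $modules.
foreach ($modules as &$module) {
    $module['image'] = 'no_image.jpg';
}
unset($module); // drop the dangling reference to the last element

// Alternative without references: write back through the key.
foreach ($modules as $key => $module) {
    $modules[$key]['thumb'] = 'thumb-of-' . $module['image'];
}

var_dump($unchanged);
print_r($modules);
```

Without the unset(), $module would still point at the last element of $modules, and a later loop reusing that variable name could overwrite it by accident.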
$module, in your foreach loop is a copy of the contents. You will need to access it by reference, or push back into the actual array $modules.
Try modifying the foreach signature to the following:
foreach ($modules as &$module) {
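Try using array_merge instead of array_push. Note that array_merge returns the merged array instead of modifying its argument, so the result has to be assigned back (with $module taken by reference, as above, for the change to reach $modules):
$module = array_merge($module, array(
'image' => $image,
'thumb' => $this->model_tool_image->resize($image, 100, 100)
));
edit:
Also, going by the print_r output, the correct call would be array_merge($module[0], array(...));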

Extract leaf nodes of multi-dimensional array in PHP

Suppose I have an array in PHP that looks like this
array
(
array(0)
(
array(0)
(
.
.
.
)
.
.
array(10)
(
..
)
)
.
.
.
array(n)
(
array(0)
(
)
)
)
And I need all the leaf elements of this multi-dimensional array in a linear array. How should I go about doing this without resorting to recursion, such as this?
function getChild($element)
{
    foreach($element as $e)
    {
        if (is_array($e))
        {
            getChild($e);
        }
    }
}
Note: the code snippet above is horribly incomplete.
Update: example of array
Array
(
[0] => Array
(
[0] => Array
(
[0] => Seller Object
(
[credits:private] => 5000000
[balance:private] => 4998970
[queueid:private] => 0
[sellerid:private] => 2
[dateTime:private] => 2009-07-25 17:53:10
)
)
)
...snipped.
[2] => Array
(
[0] => Array
(
[0] => Seller Object
(
[credits:private] => 10000000
[balance:private] => 9997940
[queueid:private] => 135
[sellerid:private] => 234
[dateTime:private] => 2009-07-14 23:36:00
)
)
....snipped....
)
)
Actually, there is a single function that will do the trick, check the manual page at: http://php.net/manual/en/function.array-walk-recursive.php
Quick snippet adapted from the page:
$data = array('test' => array('deeper' => array('last' => 'foo'), 'bar'), 'baz');
var_dump($data);
function printValue($value, $key, $userData)
{
//echo "$value\n";
$userData[] = $value;
}
$result = new ArrayObject();
array_walk_recursive($data, 'printValue', $result);
var_dump($result);
You could use iterators, for example:
$result = array();
foreach(new RecursiveIteratorIterator(new RecursiveArrayIterator($array), RecursiveIteratorIterator::LEAVES_ONLY) as $value) {
$result[] = $value;
}
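One caveat with the iterator approach, as far as I understand the manual: by default RecursiveArrayIterator treats objects as having children too, so leaves that are objects (like the Seller objects in the question) get descended into instead of collected. A sketch using the CHILD_ARRAYS_ONLY flag follows; the Seller class here is a simplified stand-in with a public property, not the real one.

```php
<?php
// Stand-in for the Seller objects in the question (hypothetical class).
class Seller
{
    public $sellerid;
    public function __construct($sellerid) { $this->sellerid = $sellerid; }
}

$array = [
    [[new Seller(2)]],
    [[new Seller(7)], [new Seller(234)]],
];

// CHILD_ARRAYS_ONLY tells the iterator to recurse into arrays only,
// so each Seller object is yielded as a leaf.
$it = new RecursiveIteratorIterator(
    new RecursiveArrayIterator($array, RecursiveArrayIterator::CHILD_ARRAYS_ONLY),
    RecursiveIteratorIterator::LEAVES_ONLY
);

// false = discard keys, otherwise equal keys from different branches collide.
$result = iterator_to_array($it, false);

foreach ($result as $seller) {
    echo $seller->sellerid, "\n";
}
```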
Use a stack:
<?php
$data = array(array(array("foo"),"bar"),"baz");
$results = array();
$process = $data;
while (count($process) > 0) {
$current = array_pop($process);
if (is_array($current)) {
// Using a loop for clarity. You could use array_merge() here.
foreach ($current as $item) {
// As an optimization you could add "flat" items directly to the results array here.
array_push($process, $item);
}
} else {
array_push($results, $current);
}
}
print_r($results);
Output:
Array
(
[0] => baz
[1] => bar
[2] => foo
)
This should be more memory efficient than the recursive approach. Despite the fact that we do a lot of array manipulation here, PHP has copy-on-write semantics so the actual zvals of the real data won't be duplicated in memory.
Try this:
function getLeafs($element) {
$leafs = array();
foreach ($element as $e) {
if (is_array($e)) {
$leafs = array_merge($leafs, getLeafs($e));
} else {
$leafs[] = $e;
}
}
return $leafs;
}
Edit: Apparently you don’t want a recursive solution. So here’s an iterative solution that uses a stack:
function getLeafs($element) {
    $stack = array($element);
    $leafs = array();
    // loop on the stack size so that falsy leaves (0, '', null) and empty arrays do not end the loop early
    while (!empty($stack)) {
        $item = array_pop($stack);
        if (is_array($item)) {
            // push children in reverse so that they are popped in their original order
            foreach (array_reverse($item) as $e) {
                $stack[] = $e;
            }
        } else {
            $leafs[] = $item;
        }
    }
    return $leafs;
}
I just ran into the same issue and used another method that has not been mentioned. The accepted answer requires the ArrayObject class to work properly. It can also be done with a plain array and the use keyword in an anonymous function (PHP >= 5.3):
<?php
$data = array(
array(1,2,3,4,5),
array(6,7,8,9,0),
);
$result = array();
array_walk_recursive($data, function($v) use (&$result) { # by reference
$result[] = $v;
});
var_dump($result);
There is no flatten function that returns the leaves directly. You have to use recursion to check whether each array has further array children, and only move an element to the flat result array once you reach the bottom.
