PHP: Count total lines in all files under a given folder

I want to count the total number of lines across all the files in a folder. The following PHP call counts the lines of a single file only; what is the way to get the total for a whole folder?
$lines = count(file($file));
Thank you!

You could iterate over the directory, count the lines in each file, and sum the counts. Note that file() loads the whole file into memory, so a very large file can push your script past the configured memory limit.
If you can run external commands, there is a one-line solution. (If you are on Windows, skip this.)
$total = (int) exec("find " . escapeshellarg($dir_path) . " -type f -exec wc -l {} \\; | awk '{total += \$1} END {print total}'");
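The first suggestion above (walk the directory and sum per-file counts without loading whole files) could be sketched as follows. This is only an illustration: the function names count_lines_streaming() and count_lines_in_folder() are made up for this example, it looks at a single folder level only, and it counts a final line that lacks a trailing newline, so totals can differ slightly from wc -l.

```php
<?php
// Hypothetical helper: reads one line at a time, so memory use stays small.
function count_lines_streaming(string $path): int
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return 0;
    }
    $count = 0;
    while (fgets($handle) !== false) {
        $count++;
    }
    fclose($handle);
    return $count;
}

// Hypothetical helper: sums the line counts of the regular files directly inside $dir.
function count_lines_in_folder(string $dir): int
{
    $entries = scandir($dir);
    if ($entries === false) {
        return 0;
    }
    $total = 0;
    foreach ($entries as $entry) {
        $path = $dir . DIRECTORY_SEPARATOR . $entry;
        if (is_file($path)) {
            $total += count_lines_streaming($path);
        }
    }
    return $total;
}
```

Calling count_lines_in_folder('some/dir/path') would then return the total for that folder.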

Same approach as salathe's answer, except that this one prints only the total number of lines; it runs without errors on PHP 7.
$files = new RecursiveIteratorIterator(new RecursiveDirectoryIterator(__DIR__));
$lines = 0;
foreach ($files as $fileinfo) {
if (!$fileinfo->isFile()) {
continue;
}
$read = $fileinfo->openFile();
$read->setFlags(SplFileObject::READ_AHEAD);
$lines += iterator_count($read) - 1; // -1 gives the same number as "wc -l"
}
echo ("Found :$lines");

Something like this perhaps:
<?php
$line_count = 0;
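$dir = 'some/dir/path';
if ($handle = opendir($dir)) {
while (false !== ($entry = readdir($handle))) {
// readdir() returns bare names, so prefix the directory before testing
$path = $dir . '/' . $entry;
if (is_file($path)) {
$line_count += count(file($path));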
}
}
closedir($handle);
}
var_dump($line_count);
?>

Check out the Standard PHP Library (aka SPL) for DirectoryIterator:
$dir = new DirectoryIterator('/path/to/dir');
$x = 0;
foreach ($dir as $file) {
// isImage() is not a PHP built-in; it stands for your own check
$x += (isImage($file)) ? 1 : 0;
}
(FYI, there is also iterator_count(), but you would need to filter out entries such as . and .. first anyway.)
Or try this (see http://www.brightcherry.co.uk/scribbles/php-count-files-in-a-directory/):
$directory = "../images/team/harry/";
if (glob($directory . "*.jpg") != false)
{
$filecount = count(glob($directory . "*.jpg"));
echo $filecount;
}
else
{
echo 0;
}

A very basic example of counting the lines might look something like the following, which gives the same numbers as xdazz's answer.
<?php
$files = new RecursiveIteratorIterator(new RecursiveDirectoryIterator(__DIR__));
$lines = $fileCount = 0; // separate counter, so the $files iterator is not overwritten
foreach ($files as $fileinfo) {
if (!$fileinfo->isFile()) {
continue;
}
$fileCount++;
$read = $fileinfo->openFile();
$read->setFlags(SplFileObject::READ_AHEAD);
$lines += iterator_count($read) - 1; // -1 gives the same number as "wc -l"
}
printf("Found %d lines in %d files.", $lines, $fileCount);
See also
RecursiveDirectoryIterator
SplFileInfo
SplFileObject
RecursiveIteratorIterator
iterator_count()

Related

Sorting files in an array by the occurrences of a word in them, PHP

I'm making a search bar that searches the files in a directory for a given word. I want the matching files added to an array, ordered from the file with the most occurrences of the word to the one with the fewest.
I'm working in PHP; this is my code:
<?php
if(isset($_POST['busqueda'])){
$variable = utf8_encode($_POST['busqueda']);
}
$Array1 = array();
foreach(glob("*.txt") as $filename) {
$contents = file_get_contents($filename);
if (strpos($contents, $variable)){
$Array1[] = $filename;
}
}
I don't know exactly how to do it. I think I should use substr_count(file_get_contents($Array1[$position1])) or something like that, but I'm unsure how to do the sorting. Can someone help me?
print_r($Array1);
for($var1=0; $var1<sizeof($Array1); $var1++){
echo "times on the file: ".$Array1[$var1]."<br>";
echo substr_count(file_get_contents($Array1[$var1]));
}
?>
You can use substr_count() itself to get the counts, then arsort() to sort the array by count, highest first.
$Array1 = array();
foreach (glob("*.txt") as $filename) {
$contents = file_get_contents($filename);
if ( ($count = substr_count($contents, $variable)) ) {
$Array1[$filename] = $count;
}
}
arsort($Array1) ;
print_r($Array1);
foreach ($Array1 as $file => $count) {
echo "times on the file($file): $count <br>";
}
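To see what the two calls do in isolation, here is a small sketch that uses in-memory strings instead of files. The file names and texts are invented sample data. Note that substr_count() matches substrings, so "the" is also counted inside "other".

```php
<?php
// Invented sample data standing in for file contents.
$documents = array(
    'a.txt' => 'the cat sat',
    'b.txt' => 'the other one and the last',
    'c.txt' => 'nothing here',
);
$word = 'the';

$counts = array();
foreach ($documents as $name => $text) {
    $count = substr_count($text, $word);
    if ($count > 0) {
        // Only keep documents that contain the word at least once.
        $counts[$name] = $count;
    }
}

// arsort() sorts by value, highest first, and keeps the file-name keys.
arsort($counts);
print_r($counts);
```

Here b.txt ends up first with 3 matches (two whole words plus the one inside "other"), followed by a.txt with 1; c.txt is left out.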
Bash (available on at least Linux and Mac operating systems) makes it extremely easy to accomplish your task, because you can call commands through PHP's exec function, assuming it is not disabled by an administrator. If you're on Windows, then this will probably not work, but most people are using Linux for a production environment, so I thought this answer would be worthy of posting.
The following function is taken from CodeIgniter's file helper and only serves to fetch an array of filenames from a specified directory. If you don't need a function like this because you are getting your filenames from somewhere else, just note that this function can include the full file path for each file, and that's why I used it.
function get_filenames($source_dir, $include_path = FALSE, $_recursion = FALSE)
{
static $_filedata = array();
if ($fp = @opendir($source_dir))
{
// reset the array and make sure $source_dir has a trailing slash on the initial call
if ($_recursion === FALSE)
{
$_filedata = array();
$source_dir = rtrim(realpath($source_dir), DIRECTORY_SEPARATOR).DIRECTORY_SEPARATOR;
}
while (FALSE !== ($file = readdir($fp)))
{
if (@is_dir($source_dir.$file) && strncmp($file, '.', 1) !== 0)
{
get_filenames($source_dir.$file.DIRECTORY_SEPARATOR, $include_path, TRUE);
}
elseif (strncmp($file, '.', 1) !== 0)
{
$_filedata[] = ($include_path == TRUE) ? $source_dir.$file : $file;
}
}
return $_filedata;
}
else
{
return FALSE;
}
}
Now that I can fetch an array of filenames easily, I'd do this:
/**
* Here you can see that I am searching
* all of the files in the script-library
* directory for the word "the"
*/
$searchWord = 'the';
$directory = '/var/www/htdocs/script-library';
$filenames = get_filenames(
$directory,
TRUE
);
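$counts = array();
foreach( $filenames as $file )
{
// escapeshellarg() keeps odd file names and search terms from breaking the command
$counts[$file] = (int) exec("tr ' ' '\\n' < " . escapeshellarg($file) . " | grep " . escapeshellarg($searchWord) . " | wc -l");
}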
arsort( $counts );
echo '<pre>';
print_r( $counts );
echo '</pre>';
For a good explanation of how that works, see: https://unix.stackexchange.com/questions/2244/how-do-i-count-the-number-of-occurrences-of-a-word-in-a-text-file-with-the-comma
I tested this code locally and it works great.

How to count number of .txt files in a directory in PHP?

I'm working on an IMDB-style website and need to dynamically find the number of reviews for a movie. The reviews are stored as /moviefiles/moviename/review[*].txt, where [*] is the review's number. Basically, I need to return, as an integer, how many of those files exist in that directory. How do I do this?
Thanks.
Use PHP's DirectoryIterator or FilesystemIterator:
$directory = new DirectoryIterator(__DIR__);
$num = 0;
foreach ($directory as $fileinfo) {
if ($fileinfo->isFile()) {
if($fileinfo->getExtension() == 'txt')
$num++;
}
}
Take a look at the glob() function: http://php.net/manual/de/function.glob.php
You can then use sizeof() to count how many files there are.
First, use glob() to get an array of matching files, then use count() to get its length; that length is the file count.
Simplified:
$txtFileCount = count( glob('/moviefiles/moviename/review*.txt') );
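One caveat with the one-liner: glob() returns false on error (the PHP manual also notes that some systems cannot distinguish an empty match from an error), and count(false) throws a TypeError in PHP 8. A defensive sketch, where count_matching_files() is a name made up for this example:

```php
<?php
// Hypothetical helper: number of paths matching $pattern, or 0 if glob() fails.
function count_matching_files(string $pattern): int
{
    $matches = glob($pattern);
    return $matches === false ? 0 : count($matches);
}
```

With the layout from the question, count_matching_files('/moviefiles/moviename/review*.txt') would give the review count as an integer.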
You can use this PHP code to get the number of text files in a folder:
<div id="header">
<?php
// integer starts at 0 before counting
$i = 0;
$dir = 'folder-path';
if ($handle = opendir($dir)) {
while (($file = readdir($handle)) !== false){
if (!in_array($file, array('.', '..')) && !is_dir($dir.'/'.$file))
{
// compare the real extension so names like "a.b.txt" or "README" are handled
if (pathinfo($file, PATHINFO_EXTENSION) == "txt")
$i++;
}
}
}
// prints out how many were in the directory
echo "There were $i files";
?>
</div>
This is very simple code that works well. :)
$files = glob('yourfolder/*.{txt}', GLOB_BRACE);
foreach($files as $file) {
// do your work with $file here; count($files) gives the number of .txt files
}

What is the fastest way to get the number of items in an iterator?

I have a folder structure with more than 300,000 files and folders in it.
When I iterate over this structure to get all files of a certain type, it takes about half a second to complete:
$path = "/path/to/folder";
$folder = new RecursiveDirectoryIterator($path);
$iterator = new RecursiveIteratorIterator($folder);
$files = new RegexIterator($iterator, '/^.+\.jpg$/i', RecursiveRegexIterator::GET_MATCH);
But getting the length of the resulting iterator ($files) takes about ten seconds:
$n = iterator_count($files);
Is there a faster way to get the number of items in the iterator? Maybe during iteration?
These all take around 10 seconds, too:
1. find (commandline)
find /path/to/folder -type f | wc -l
2. scandir (PHP, source)
$intFileCount = countFilesInDirectory($path, array('jpg'));
echo $intFileCount;
function countFilesInDirectory($strPathToDirectory, $arrExtensions) {
$intFileCount = 0;
$arrFiles = scandir($strPathToDirectory);
foreach($arrFiles as $strFile) {
if('.' != $strFile && '..' != $strFile) {
if(is_dir("$strPathToDirectory/$strFile") && is_readable("$strPathToDirectory/$strFile")) {
$intFileCount += countFilesInDirectory("$strPathToDirectory/$strFile", $arrExtensions);
} else {
$arrFileInfo = pathinfo($strFile);
if(in_array($arrFileInfo['extension'], $arrExtensions)) {
$intFileCount += 1;
}
}
}
}
return $intFileCount;
}
Note:
I'm counting the seconds in my head. So there might be some slight difference between methods, but it is not so large as to be relevant to me.
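On the "maybe during iteration?" idea: a sketch that counts inside the same foreach pass that would process the files, so the tree is walked only once. Whether this is actually faster is untested here; with this many entries the filesystem walk itself may dominate. count_by_extension() is a name made up for this example, and it compares extensions instead of using the regex.

```php
<?php
// Hypothetical helper: walks $path once and counts files with the given extension.
function count_by_extension(string $path, string $extension): int
{
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS)
    );
    $count = 0;
    foreach ($iterator as $fileinfo) {
        // Case-insensitive comparison, similar to the /i flag in the regex.
        if ($fileinfo->isFile() && strcasecmp($fileinfo->getExtension(), $extension) === 0) {
            $count++;
            // ...process $fileinfo here instead of iterating a second time
        }
    }
    return $count;
}
```

For the folder in the question this would be called as count_by_extension("/path/to/folder", "jpg").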

File loop in PHP

I wrote some code that should create one new text file each time I execute the PHP file.
The idea is to check the existing files with a specific name pattern, then create a text file whose number is the previous one plus 1.
For example, if there is a file called filetext0.txt, my code should create a file called filetext1.txt, and so on.
Please help me find the error in my code:
<?php
for ($i=0; $i=1000; $i=$i+1)
{
$handle = fopen("filetext".$i.".txt","r");
if ($handle) {
fclose($handle);
$s=$i+1
$handlex = fopen("filetext".$s.".txt","w+");
fclose($handlex);
break
}
}
?>
First of all, you should use file_exists() for the existence check.
Then, your problem is the missing semicolons (';') at the end of some lines. Check the error messages on your web pages next time ;)
And finally, your code creates a file for each file it finds, not just one.
I suggest this code:
$i = 0;
while(true) {
$filename = "filetext".$i.".txt";
if(! file_exists($filename)) {
touch($filename);
break;
}
$i++;
}
You do not have to open each and every file to check whether it exists. You should use PHP's directory functions.
// the maximum number
$maxnum = 0;
$d = dir(".");
while (false !== ($entry = $d->read())) {
if (preg_match ('/filetext([0-9]+)\.txt/', $entry, $matches)) {
if ($matches[1] > $maxnum) {
$maxnum = $matches[1];
}
}
}
$d->close();
echo ("The biggest number is: " . $maxnum);
// increment maxnum
$maxnum++;
// creating the file
touch ("filetext" . $maxnum . ".txt");
You need a ; after each statement.
$fileNames = glob('filetext*.txt');
$latestNumber = -1;
foreach($fileNames as $fileName) {
list($fileNumber) = sscanf($fileName,'filetext%d.txt');
$latestNumber = max($latestNumber,$fileNumber);
}
if ($latestNumber > -1) {
$fileName = 'filetext'.($latestNumber+1).'.txt';
touch($fileName);
}
Leaving aside the syntax errors, the algorithm you are using does not scale well. A better solution would be a searching method something like:
function find_next($stub)
{
    // Gallop-then-bisect search for the first unused number.
    // Assumes the files are numbered contiguously, starting at 0.
    if (!file_exists($stub . '0')) {
        return 0;
    }
    $low = 0;  // highest number known to exist
    $high = 1; // candidate for the first missing number
    while (file_exists($stub . $high)) {
        $low = $high;
        $high *= 2; // double the step until we overshoot
    }
    // Narrow the gap between an existing and a missing number.
    while ($high - $low > 1) {
        $mid = (int) (($low + $high) / 2);
        if (file_exists($stub . $mid)) {
            $low = $mid;
        } else {
            $high = $mid;
        }
    }
    return $high;
}
(NB I'm just typing this stuff - the above may be a bit buggy).
But it'd be a lot better to store a sequence number and increment it each time you add a file.
However do beware that storing transactional data for a multi-user system using files with PHP is a very bad idea.
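The "store a sequence number" idea could be sketched like this. It is an illustration under assumptions, not a vetted implementation: next_sequence() and the counter file are made up for this example, and flock() is advisory, so it only helps if every writer goes through the same function.

```php
<?php
// Hypothetical helper: returns the next number and remembers it in $counterFile.
function next_sequence(string $counterFile): int
{
    $fp = fopen($counterFile, 'c+'); // create if missing, do not truncate
    if ($fp === false) {
        throw new RuntimeException("Cannot open $counterFile");
    }
    flock($fp, LOCK_EX);                        // wait until we hold the lock
    $current = (int) stream_get_contents($fp);  // an empty file reads as 0
    $next = $current + 1;
    ftruncate($fp, 0);                          // replace the old value
    rewind($fp);
    fwrite($fp, (string) $next);
    fflush($fp);
    flock($fp, LOCK_UN);
    fclose($fp);
    return $next;
}
```

A new file could then be created with touch('filetext' . next_sequence('filetext.seq') . '.txt'), where filetext.seq is an arbitrary name for the counter file.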
Yes, use the file_exists() function to find the next name for the file.
The code above was missing a brace in its if condition.
Here is code for your problem:
$i = 0;
while(true) {
$myfile = "myfile".$i.".txt";
if(!file_exists($myfile)) {
$fh = fopen($myfile, 'w');
fclose($fh);
break;
}
$i++;
}

PHP - How do I open files and read them then write new ones with "x" lines per file?

I posted this question here before but there were no responses. I may have done something wrong, so here it is again with more details.
The files in the directory are named 1.txt, 2.txt, 3.txt, etc. The snippet below enters that directory, opens and reads all the *.txt files, removes the duplicates, and creates one file with all the unique contents (names, in this case).
$files = glob($dirname."/*.txt"); //matches all text files
$lines = array();
foreach($files as $file)
{
$lines = array_merge($lines, file($file, FILE_SKIP_EMPTY_LINES | FILE_IGNORE_NEW_LINES));
}
$lines = array_unique($lines);
file_put_contents($dirname."/allofthem.txt", implode("\n", $lines));
}
The above works great for me, thanks to the great help here at Stack Overflow.
But I want to take it one step further.
Instead of one big duplicate-free "allofthem.txt" file, how can I modify the above code to create files with a maximum of 500 lines each from the new data?
They need to go into a new directory, e.g. $dirname."/done/".$i.".txt". I have tried counting in the loop, but my efforts did not work and ended up a mile long.
I also tried pushing 500 lines into an array, moving on to another array, and saving that way. No luck. I am just not "getting" it.
Again, this beginner needs some expert assistance. Thanks in advance.
Once you have your array of lines as per your code, you can break it into chunks of 500 lines using array_chunk, and then write each chunk to its own file:
// ... from your code
$lines = array_unique($lines);
$counter = 1;
foreach (array_chunk($lines, 500) as $chunk)
{
file_put_contents($dirname . "/done/" . $counter . ".txt", implode("\n", $chunk));
$counter++;
}
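Two details worth checking with this approach: array_chunk() puts any remainder in a shorter final chunk, and file_put_contents() does not create the done/ directory for you. A sketch with both handled, where write_chunks() is a name made up for this example:

```php
<?php
// Hypothetical helper: writes $lines to $dir/1.txt, $dir/2.txt, ... and returns the file count.
function write_chunks(array $lines, string $dir, int $size): int
{
    // Create the target directory (and any missing parents) on first use.
    if (!is_dir($dir) && !mkdir($dir, 0777, true)) {
        throw new RuntimeException("Cannot create $dir");
    }
    $counter = 0;
    foreach (array_chunk($lines, $size) as $chunk) {
        $counter++;
        file_put_contents($dir . '/' . $counter . '.txt', implode("\n", $chunk));
    }
    return $counter;
}
```

In the question's terms this would be called as write_chunks($lines, $dirname . "/done", 500) after the array_unique() step.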
This function will get you somewhere!
function files_identical($fn1, $fn2) {
if(filetype($fn1) !== filetype($fn2))
return FALSE;
if(filesize($fn1) !== filesize($fn2))
return FALSE;
if(!$fp1 = fopen($fn1, 'rb'))
return FALSE;
if(!$fp2 = fopen($fn2, 'rb')) {
fclose($fp1);
return FALSE;
}
$same = TRUE;
while (!feof($fp1) and !feof($fp2))
if(fread($fp1, 4096) !== fread($fp2, 4096)) {
$same = FALSE;
break;
}
if(feof($fp1) !== feof($fp2))
$same = FALSE;
fclose($fp1);
fclose($fp2);
return $same;
}
Src: http://www.php.net/manual/en/function.md5-file.php#94494
$files = glob($dirname."/*.txt"); //matches all text files
$lines = array();
foreach($files as $file)
{
$lines = array_merge($lines, file($file, FILE_SKIP_EMPTY_LINES | FILE_IGNORE_NEW_LINES));
}
$lines = array_unique($lines);
$lines_per_file = 500;
// round up so a final partial chunk still gets a file, without creating an empty extra one
$files = (int) ceil(count($lines) / $lines_per_file);
for($i = 0; $i < $files; $i++) {
$write = array_slice($lines, $lines_per_file * $i, $lines_per_file);
file_put_contents($dirname."/done/".$i.".txt", implode("\n", $write));
}
