I'm building a script which will open a saved text file, export the contents to an array and then dump the contents in a database. So far I've been able to get the file upload working quite happily and can also open said file.
The trouble I'm having is that the contents of the file are variable: they have a fixed structure, but the contents change every time. Each "section" of the file is separated by a blank line.
I've used php's file() to get an array ... I'm not sure if there's a way to then split that array up every time it comes across a blank line?
$file = $target_path;
$data = file($file) or die('Could not read file!');
Example output:
[0] => domain.com
[1] => # Files to be checked
[2] => /www/06.php
[3] => /www/08.php
[4] =>
[5] => domain2.com
[6] => # Files to be checked
[7] => /cgi-bin/cache.txt
[8] => /cgi-bin/log.txt
[9] =>
[10] => domain3.com
[11] => # Files to be checked
[12] => /www/Content.js
[13] =>
I know that fields 0 and 1 of each section will be constant: they will always be a domain name followed by that hash line. The lines thereafter could number anywhere between 1 and 1000.
I've looked at array_chunk(), which is close to what I want, but it splits on a fixed count. What I'd like is something that splits on a specified value (like a new line, a comma or something of that sort).
Lastly, apologies if this has been answered previously. I've searched the usual places a few times for potential solutions.
Hope you can help :)
Foxed
I think what you're looking for is preg_split. If you just split on a carriage return, you might miss lines that just have spaces or tabs.
$output = array(...);//what you just posted
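$string_output = implode('', $output); // file() keeps the line endings, so this rebuilds the original text
// split wherever a line break is followed by one or more blank or whitespace-only lines
$array_with_only_populated_lines = preg_split('`\R(?:[ \t]*\R)+`', trim($string_output));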
You could just do something like this. It reads the file in line by line rather than using file(), which uses less memory and might be important if you use larger files.
$handle = fopen('blah', 'r');
$blocks = array();
$currentBlock = array();
while (!feof($handle)) {
$line = fgets($handle);
if (trim($line) == '') {
if ($currentBlock) {
$blocks[] = $currentBlock;
$currentBlock = array();
}
} else {
$currentBlock[] = $line;
}
}
fclose($handle);
// add the last block if anything is left
if ($currentBlock) {
$blocks[] = $currentBlock;
}
print_r($blocks);
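Have you tried explode("\n\n", file_get_contents($file))? Note that split() was removed in PHP 7, that '\n\n' in single quotes is not two newlines, and that $file holds only the path, so the contents have to be read first.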
You could do it by splitting first on the blank line and then on new lines, e.g.:
$file = $target_path;
$fileData = file_get_contents($file) or die('Could not read file!');
$parts = explode("\n\n", $fileData);
$data = array();
foreach ($parts as $part) {
$data[] = explode("\n", $part);
}
You could also use preg_split() in place of the first explode(), with a regex that splits on lines containing only whitespace (e.g. \s+).
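The preg_split() idea can be sketched roughly as follows. This is only an illustration under some assumptions: the helper name splitIntoSections() is made up for the example, and it takes the raw file contents as one string (for instance from file_get_contents()).

```php
<?php
// Hypothetical helper: split raw file contents into sections on blank
// (or whitespace-only) lines, then split each section into its lines.
function splitIntoSections(string $contents): array
{
    // A separator is a line break followed by one or more lines that
    // contain nothing but spaces or tabs. \R matches \n, \r\n and \r.
    $sections = preg_split('/\R(?:[ \t]*\R)+/', trim($contents));

    $result = array();
    foreach ($sections as $section) {
        $result[] = preg_split('/\R/', $section);
    }
    return $result;
}
```

Each element of the result is then one section, with the domain at index 0 and the hash line at index 1. An empty input yields a single section holding one empty string, so you may want to check for that case.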
I would use the function preg_grep() to reduce the resulting array:
$array = preg_grep('/[^\s]/', $array);
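Since the question says each section always starts with a domain name and a hash line, another option is to group the output of file() by domain before inserting into the database. A minimal sketch, assuming the layout shown in the example output; groupByDomain() is a hypothetical name, not an existing PHP function:

```php
<?php
// Hypothetical helper: group the output of file() into domain => list of paths.
function groupByDomain(array $lines): array
{
    $groups = array();
    $domain = null;

    foreach ($lines as $line) {
        $line = trim($line);

        if ($line === '') {
            // A blank line closes the current section.
            $domain = null;
        } elseif ($domain === null) {
            // The first non-blank line of a section is the domain name.
            $domain = $line;
            $groups[$domain] = array();
        } elseif ($line[0] !== '#') {
            // Skip the "# Files to be checked" line and keep the paths.
            $groups[$domain][] = $line;
        }
    }
    return $groups;
}
```

Called as groupByDomain(file($target_path)), this should give one array key per domain, which maps fairly directly onto database rows.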
Related
I am trying to concatenate words from a file with words from another file. However when I run the script I get a full output of the first file, then the output of the second file, then I see that the execution does not complete so I am stuck in an infinite loop. This is my code:
include 'passgen.txt';
include 'mycharset.txt';
$lines=file('passgen.txt');
$additions=file('mycharset.txt');
foreach($lines as $line){
foreach($additions as $addition){
$newPasswords=$line . $addition;
}
}
file_put_contents('newPasswords.txt', print_r($newPasswords, true));
passgen.txt content example:
stack
5tack
St4ck
...
mycharset.txt content example:
1
1!
2
2!
Expected results of what I am trying to achieve:
stack1
stack1!
stack2
stack2!
5tack1
5tack1!
...
EDIT:
adding full code from Jay answer:
#!/usr/bin/php
<?php
include 'passgen.txt';
include 'mycharset.txt';
$lines=file('passgen.txt');
$additions=file('mycharset.txt');
foreach($lines as $start) {
foreach($additions as $end) {
file_put_contents('newPasswords2.txt', $start.$end ."\r\n", FILE_APPEND);
}
}
?>
SAMPLE OUTPUT from Jay answer:
St4ck
6!3
St4ck
6!4
St4ck
6!5
I tried removing the \r\n, but it still does not append 6!5 to the word in the desired format below:
St4ck6!4
St4ck6!5
...
You are just creating a line, so you will not get an array as $newPasswords is overwritten on each iteration. What I did was place the concatenated words into an array ($word_array). You can then loop through the array easily and place into a text file:
EDIT
Added the trim() function to account for any whitespace characters in the text files we may not be aware of:
$file1 = ['stack','5tack','St4ck'];
$file2 = ['1','1!','2'];
$word_array = array();
foreach($file1 as $start) {
foreach($file2 as $end) {
$word_array[] = trim($start).trim($end);
}
}
print_r($word_array);
Returns:
Array
(
[0] => stack1
[1] => stack1!
[2] => stack2
[3] => 5tack1
[4] => 5tack1!
[5] => 5tack2
[6] => St4ck1
[7] => St4ck1!
[8] => St4ck2
)
Now you can put these in your text file like this:
foreach($word_array as $word) {
file_put_contents('newPasswords.txt', $word."\r\n", FILE_APPEND); // append; without the flag each call overwrites the file
}
Having said that I caution you against using this for password generation for any reason. You're essentially creating a rainbow table based on your comment:
I am using a weak password finder and I need a custom list of password to compare if hashes are weak.
You'd be better off providing the users with a password strength indicator that would encourage them to create strong passwords.
Shortening the process...
You could shorten the process entirely by writing to the file during the loop, which would require no arrays:
foreach($file1 as $start) {
foreach($file2 as $end) {
file_put_contents('newPasswords.txt', trim($start).trim($end) ."\r\n", FILE_APPEND);
}
}
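For context on why trim() matters here: file() keeps the line ending on every element unless FILE_IGNORE_NEW_LINES is passed, which is why the sample output in the question shows the addition on a separate line. Also, include on a plain text file simply prints it, which would explain the full dump of both files. A minimal sketch of the whole thing, with combineLines() as a made-up helper name:

```php
<?php
// Hypothetical helper: every entry of $starts combined with every entry of $ends.
function combineLines(array $starts, array $ends): array
{
    $combined = array();
    foreach ($starts as $start) {
        foreach ($ends as $end) {
            // trim() removes stray line endings and surrounding whitespace.
            $combined[] = trim($start) . trim($end);
        }
    }
    return $combined;
}

// Usage, with the file names from the question (no include lines needed):
// $starts = file('passgen.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
// $ends   = file('mycharset.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
// file_put_contents('newPasswords.txt', implode(PHP_EOL, combineLines($starts, $ends)) . PHP_EOL);
```

Writing the file once with implode() avoids reopening it for every word, at the cost of holding the whole list in memory.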
You can use the fgets() function to read from a file line by line. Then you can concatenate that.
So something like:
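$lines = fopen('passgen.txt', 'r'); // fgets() needs a file handle, not an array
$additions = file('mycharset.txt', FILE_IGNORE_NEW_LINES);
$count = 0;
while (($word = fgets($lines, 4096)) !== false) {
    // strip the line ending before appending; use '' once the additions run out
    $word = rtrim($word, "\r\n") . ($additions[$count] ?? '');
    echo $word, "\n";
    $count++;
}
fclose($lines);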
The data contained in the text file (actually a .dat) looks like:
LIN*1234*UP*abcde*33*0*EA
LIN*5678*UP*fghij*33*0*EA
LIN*9101*UP*klmno*33*23*EA
There are actually over 500,000 such lines in the file.
This is what I'm using now:
//retrieve file once
$file = file_get_contents('/data.dat');
$file = explode('LIN', $file);
...some code
foreach ($list as $item) { //an array containing 10 items
foreach($file as $line) { //checking if these items are on huge list
$info = explode('*', $line);
if ($line[3] == $item[0]) {
...do stuff...
break; //stop checking if found
}
}
}
The problem is that it runs far too slowly, about 1.5 seconds for each iteration. I separately confirmed that the '...do stuff...' part is not what is impacting speed; rather, it's the search for the correct item.
How can I speed this up? Thank you.
If each item is on its own line, instead of loading the whole thing in memory, it might be better to use fgets() instead:
$f = fopen('text.txt', 'rt');
while (!feof($f)) {
$line = rtrim(fgets($f), "\r\n");
$info = explode('*', $line);
// etc.
}
fclose($f);
PHP file streams are buffered (~8kB), so it should be decent in terms of performance.
The other piece of logic can be rewritten like this (instead of iterating the file multiple times):
if (in_array($info[3], $items)) // look up $info[3] inside the array of 10 things
Or, if $items is suitably indexed:
if (isset($items[$info[3]])) { ... }
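Putting the two ideas together (one pass over the file, plus an indexed lookup) might look something like the sketch below. It assumes one record per line, that the value to match sits in column 3 as in the question, and that the wanted values are strings or integers; findItems() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: scan the file once and return the matching rows,
// keyed by the value found in column 3.
function findItems(string $path, array $wanted): array
{
    // Flip the list into keys so each check is a hash lookup
    // instead of a scan through the list.
    $lookup = array_flip($wanted);
    $found = array();

    $f = fopen($path, 'r');
    if ($f === false) {
        return $found;
    }
    while (($line = fgets($f)) !== false) {
        $info = explode('*', rtrim($line, "\r\n"));
        if (isset($info[3]) && isset($lookup[$info[3]])) {
            $found[$info[3]] = $info;
            // Stop early once every wanted item has been seen.
            if (count($found) === count($lookup)) {
                break;
            }
        }
    }
    fclose($f);
    return $found;
}
```

With 10 wanted items and 500,000 lines, this reads each line at most once rather than up to ten times.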
file_get_contents loads the whole file into memory as a single string, and then your code acts on it. Adapting this sample code from the official PHP fgets documentation should work better:
$handle = @fopen("test.txt", "r");
if ($handle) {
while (($buffer = fgets($handle, 4096)) !== false) {
$file_data = explode('LIN', $buffer);
foreach($file_data as $line) {
$info = explode('*', $line);
$info = array_filter($info);
if (!empty($info)) {
echo '<pre>';
print_r($info);
echo '</pre>';
}
}
}
if (!feof($handle)) {
echo "Error: unexpected fgets() fail\n";
}
fclose($handle);
}
The output of the above code using your data is:
Array
(
[1] => 1234
[2] => UP
[3] => abcde
[4] => 33
[6] => EA
)
Array
(
[1] => 5678
[2] => UP
[3] => fghij
[4] => 33
[6] => EA
)
Array
(
[1] => 9101
[2] => UP
[3] => klmno
[4] => 33
[5] => 23
[6] => EA
)
But still unclear about your missing code since the line that states:
foreach ($list as $item) { //an array containing 10 items
That seems to be another real choke point.
When you call file_get_contents, it loads the whole file into memory, so you can imagine how resource-intensive the process may be. Not to mention you have a nested loop, which is O(n × m) in the sizes of the two lists.
You can either split the file if possible or use fopen, fgets and fclose to read them line by line.
If I were you, I’d use another language like C++ or Go if I really needed the speed.
So I have two files, formatted like this:
First file
adam 20 male
ben 21 male
Second file
adam blonde
adam white
ben blonde
What I would like to do, is use the instance of adam in the first file, and search for it in the second file and print out the attributes.
Data is separated by a tab ("\t"), so this is what I have so far.
$firstFile = fopen("file1", "rb"); //opens first file
$i=0;
$k=0;
while (!feof($firstFile) ) { //feof = while not end of file
$firstFileRow = fgets($firstFile); //fgets gets line
$parts = explode("\t", $firstFileRow); //splits line into 3 strings using tab delimiter
$secondFile= fopen("file2", "rb");
$countRow = count($secondFile); //count rows in second file
while ($i<= $countRow){ //while the file still has rows to search
$row = fgets($firstFile); //gets whole row
$parts2 = explode("\t", $row);
if ($parts[0] ==$parts2[0]){
print $parts[0]. " has " . $parts2[1]. "<br>" ; //prints out the 3 parts
$i++;
}
}
}
I can't figure out how to loop through the second file, get each row, and compare it to the first file.
You have a typo in the inner loop: you are reading from the first file when you should be reading from the second. In addition, after exiting the inner loop you need to rewind the second file's pointer back to the beginning.
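A rough sketch of the loop with both fixes applied, assuming both files exist, are readable and are tab-separated as described; matchAttributes() is a hypothetical name, and it returns the lines instead of printing them so they are easy to check:

```php
<?php
// Hypothetical helper: for each name in the first file, list the attributes
// found for that name in the second file.
function matchAttributes(string $firstPath, string $secondPath): array
{
    $out = array();
    $first = fopen($firstPath, 'rb');
    $second = fopen($secondPath, 'rb'); // opened once, outside the loop

    while (($row = fgets($first)) !== false) {
        $parts = explode("\t", rtrim($row, "\r\n"));
        if ($parts[0] === '') {
            continue; // skip blank lines
        }
        rewind($second); // start the second file from the top again
        while (($row2 = fgets($second)) !== false) {
            $parts2 = explode("\t", rtrim($row2, "\r\n"));
            if (isset($parts2[1]) && $parts2[0] === $parts[0]) {
                $out[] = $parts[0] . ' has ' . $parts2[1];
            }
        }
    }
    fclose($first);
    fclose($second);
    return $out;
}
```

Looping on the return value of fgets() also removes the need for the row counter, since count() does not work on a file handle.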
How about this:
function file2array($filename) {
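$file = file($filename, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES); // strip line endings so the last attribute has no trailing newline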
$result = array();
foreach ($file as $line) {
$attributes = explode("\t", $line);
foreach (array_slice($attributes, 1) as $attribute)
$result[$attributes[0]][] = $attribute;
}
return $result;
}
$a1 = file2array("file1");
$a2 = file2array("file2");
print_r(array_merge_recursive($a1, $a2));
It will output the following:
Array (
[adam] => Array (
[0] => 20
[1] => male
[2] => blonde
[3] => white
)
[ben] => Array (
[0] => 21
[1] => male
[2] => blonde
)
)
However, this one reads both files in one piece and will crash if they are large (>100MB). On the other hand, 90% of all PHP programs have this problem, since file() is popular :-)
I have an array in php that contains all the lines of a text files (each line being one value of the array). My text file had blank lines so the array has blank lines too. I wanted to search the array for a certain value like this:
$array = array();
$lines = file("textfile.txt"); //file in to an array
foreach ($lines as $line)
{
if (stripos($line, "$$") !== false)
{
$array[] = str_replace("$$", "", $line);
}
}
The code above is searching for a $$ and replacing it with a blank. The text file holds a line with a $$1 or any number and I want it to find all instances of that line, which it is doing.
My problem is that I want it to find the next 5 lines that aren't blank after finding the $$(number) and put them into a multi dimensional array. The multidimensional array looking similar to this (the program is a test in case you are wondering why the array is named the way it is):
$test = array(
array('question' => 'What is the answer', 'ansa' => "answera", 'ansb' => "answerb", 'ansc' => "answerc", 'ansd' => "answerd"), // $test[1]
array('question' => 'What is the answer', 'ansa' => "answera", 'ansb' => "answerb", 'ansc' => "answerc", 'ansd' => "answerd"), // $test[2]
);
The next five lines after the $$(number) are a question and four answers that need to go into the array. My code with regexp and searching isn't working, so I discarded it.
you can try something like this...
<?php
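$lines = file('text.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES); // file into an array; array_filter() alone would keep blank lines, because each still holds its line ending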
$questions = array();
// find your starts and pull out questions
foreach ($lines as $k=>$line)
{
if (stripos($line, "$$") !== false)
{
$questions[] = array_slice($lines, $k + 1, 5); // the five lines after the $$ marker
}
}
// dump
var_dump($questions);
See php manual for array_slice
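To get from the slices to the keyed layout shown in the question, array_combine() can attach the key names. A minimal sketch, assuming each $$ marker sits at the start of its own line and is followed by exactly one question and four answers; parseQuestions() is a hypothetical helper name:

```php
<?php
// Hypothetical helper: turn the lines of the test file into question arrays.
function parseQuestions(array $lines): array
{
    // Drop blank lines and re-index so that positions are contiguous.
    $lines = array_values(array_filter(array_map('trim', $lines), 'strlen'));
    $keys = array('question', 'ansa', 'ansb', 'ansc', 'ansd');
    $test = array();

    foreach ($lines as $k => $line) {
        if (strpos($line, '$$') === 0) {
            // The five non-blank lines after the marker.
            $chunk = array_slice($lines, $k + 1, 5);
            if (count($chunk) === 5) {
                $test[] = array_combine($keys, $chunk);
            }
        }
    }
    return $test;
}
```

It could be called as parseQuestions(file('textfile.txt')). Incomplete trailing blocks are skipped rather than padded.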
Have you looked at preg_replace_callback?
Something along these lines should work:
<?php
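function replace_callback($matches) {
var_dump($matches); // $matches[1] holds the five lines that follow the marker (empty lines are skipped)
return $matches[0]; // hand the match back unchanged
}
preg_replace_callback('/\$\$[0-9]+\s+((?:[^\r\n]+(?:\R+|\z)){5})/', 'replace_callback', file_get_contents('textfile.txt'));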
?>
I have 12 XML files from which I am extracting one CSV file. From that CSV I am extracting column 1 and appending the values to a tt.txt file.
Now I need to extract the values from this .txt file every time data is written to it.
But the problem is, when I use
$contents = fread ($fd,filesize ($filename));
fclose ($fd);
$delimiter = ',' ;
$splitcontents = explode($delimiter, $contents);
it reads only from the first value of the file every time tt.txt is appended to!
I hope you understand the problem. What I need is for $contents to hold only the new data that was appended; instead, it reads from the start of the file every time.
Is there a way to achieve this, or does PHP fail?
The task is: extract from the TXT file -> perform computations -> write into a new TXT file. The problem is that I can't read from a middle value to a new value; PHP always reads from the start of a file.
I think you need to store the last file position.
Call filesize() to get the current length and read the file. Later, check whether filesize() is different (or maybe you know this some other way), use fseek() to move the cursor in the file, then read from there.
E.g.:
$previousLength = 0;
// your loop when you're calling your new read function
clearstatcache(); // filesize() results are cached, so clear the cache before asking again
$length = filesize($filename);
fseek($fd,$previousLength);
$contents = fread($fd,$length - $previousLength);
$previousLength = $length;
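Wrapped up as a function, the same idea could look like the sketch below. It assumes the file only ever grows (appends, no truncation) and that the caller stores the returned offset between calls; readAppended() is a made-up name for the example.

```php
<?php
// Hypothetical helper: return whatever was appended to $filename since
// $offset, together with the new offset to remember for the next call.
function readAppended(string $filename, int $offset): array
{
    clearstatcache(true, $filename); // filesize() results are cached
    $length = filesize($filename);

    if ($length <= $offset) {
        return array('', $offset); // nothing new
    }

    $fd = fopen($filename, 'rb');
    fseek($fd, $offset); // jump to where the last read stopped
    $contents = fread($fd, $length - $offset);
    fclose($fd);

    return array($contents, $length);
}
```

A caller would do something like list($contents, $offset) = readAppended('tt.txt', $offset); and then explode $contents on the delimiter as before. If the offset has to survive between separate script runs, it needs to be saved somewhere, such as a small file.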
It is only reading the first field because PHP does not automatically assume that a newline character (\n) means a new record; you have to handle this yourself.
Using what you already have, I would do the following:
$contents = fread($fd, filesize($filename));
fclose($fd);
/* Now, split up $contents by newline, turning this into an array, where each element
* is, in effect, a new line in the CSV file. */
$contents = explode("\n", $contents);
/* Now, explode each element in the array, into itself. */
foreach ($contents as &$c) {
$c = explode(",", $c);
}
unset($c); // break the reference left over from the foreach
In the future, if you want to go line-by-line, as you run the risk of hogging too many resources by reading the entire file in, use fgets().
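A line-by-line version with fgets() might look like this. It is only a sketch: readRows() is a hypothetical name, it assumes simple comma-separated values with no quoted fields, and it uses a generator so that only one line is held in memory at a time.

```php
<?php
// Hypothetical helper: yield one array of fields per non-blank line.
function readRows(string $filename)
{
    $fd = fopen($filename, 'r');
    if ($fd === false) {
        return;
    }
    try {
        while (($line = fgets($fd)) !== false) {
            $line = rtrim($line, "\r\n");
            if ($line === '') {
                continue; // ignore blank lines
            }
            yield explode(',', $line);
        }
    } finally {
        fclose($fd); // runs even if the caller stops iterating early
    }
}
```

Usage would be along the lines of foreach (readRows('tt.txt') as $fields) { ... }, with each $fields being the array for one line.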
I'm not great at arrays, but it sounds to me like you need an associative array. I'm doing a similar thing with the following code:
$lines = explode("\n", $contents);
foreach ($lines as $line) {
$parts = explode(',', $line);
if (count($parts) >= 7) { // indexes 0 through 6 are used below
$posts = array();
$posts[] = array('name' => $parts[3],'email' => $parts[4],'phone' => $parts[5],'link' => $parts[6],'month' => $parts[0],'day' => $parts[1],'year' => $parts[2]); }
foreach ($posts as $post):
$post = array_filter(array_map('trim', $post));