Reading certain lines of a textfile PHP - php

I have to write a parser for a txt file with structure like this:
exampleOfSomething: 95428, anotherExample: 129, youNeedThis: 491,\n
anotherExample: 30219, exampleOfSomething: 4998, youNeedThis: 492,
But there is one major problem: as in the example, the fields don't always come in the same order. Sometimes I get "youNeedThis" before "anotherExample", etc., but the structure
{variable}: {value},
is always the same. I know what I'm looking for (e.g. I want to read only the value of "anotherExample"). When I get this number, I want to write it to a txt file, one value per line:
129
30219
So far I have managed to write every number from the file on a separate line, but I need to filter them down to only the ones I'm looking for. Is there a way of filtering them without having to do something like this:
$c = 0;
if (fread($file, 1) == "a" && $c == 0) $c++;
if (fread($file, 1) == "n" && $c == 1) $c++;
if (fread($file, 1) == "o" && $c == 2) $c++;
// And here after I check if this is correct line, I take the number and write the rest of it to output.txt

Discover regular expressions.
preg_match_all('/anotherExample\:\s*([0-9]+)/sm', file_get_contents('input.txt'), $rgMatches);
file_put_contents('output.txt', join(PHP_EOL, $rgMatches[1]));
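To see what the pattern captures, here is a minimal sketch that runs a similar expression on an in-memory copy of the sample data instead of input.txt. The variable names $sample and $values are illustrative only, and the \b word boundary is an addition that is not part of the answer above.

```php
<?php
// Sample data from the question, held in a string so the sketch needs no files.
$sample = "exampleOfSomething: 95428, anotherExample: 129, youNeedThis: 491,\n"
        . "anotherExample: 30219, exampleOfSomething: 4998, youNeedThis: 492,";

// \b keeps the key from matching inside a longer name; (\d+) captures the number.
preg_match_all('/\banotherExample:\s*(\d+)/', $sample, $matches);

// $matches[1] holds the contents of the first capture group, in input order.
$values = $matches[1];
echo implode(PHP_EOL, $values), PHP_EOL;
```

Writing implode(PHP_EOL, $values) to a file with file_put_contents() would give the one-value-per-line output the question asks for.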

How about something like this:
<?php
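$data = file_get_contents($filename);
$entries = explode(",", $data);
foreach ($entries as $entry) {
    // Entries after the first one start with a space or a newline, so trim first.
    $entry = trim($entry);
    if (strpos($entry, "anotherExample") === 0) {
        // Split the entry into label and value, then print the value.
        list($label, $value) = explode(":", $entry, 2);
        echo trim($value), PHP_EOL;
    }
}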
?>
You'll probably want to do something a little more robust than just an explode to get $entries, something like preg_split.
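One possible shape for the preg_split variant is sketched below. The helper name valuesForKey and the sample string are assumptions made for illustration; the sketch also assumes that keys and values never contain commas.

```php
<?php
// Hypothetical helper: collect every value stored under $wantedKey.
function valuesForKey(string $data, string $wantedKey): array
{
    // Split on commas plus any surrounding whitespace or newlines; drop empty pieces.
    $entries = preg_split('/\s*,\s*/', trim($data), -1, PREG_SPLIT_NO_EMPTY);
    $values = [];
    foreach ($entries as $entry) {
        // A limit of 2 keeps any later colon inside the value part.
        $parts = explode(':', $entry, 2);
        if (count($parts) === 2 && trim($parts[0]) === $wantedKey) {
            $values[] = trim($parts[1]);
        }
    }
    return $values;
}

$sample = "exampleOfSomething: 95428, anotherExample: 129, youNeedThis: 491,\n"
        . "anotherExample: 30219, exampleOfSomething: 4998, youNeedThis: 492,";
print_r(valuesForKey($sample, 'anotherExample'));
```

Comparing the whole trimmed key, rather than testing a prefix with strpos, avoids accidental matches on longer keys that merely start with the same text.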

I've solved it with this:
$fileHandlerInput = file_get_contents($fileNameInput);
$rows = explode (",", $fileHandlerInput);
foreach($rows as $row) {
$output = explode(":", $row);
if (preg_match($txtTemplate, trim($output[0]))) {
fwrite($fileHandlerOutput[0], trim($output[1])."\r");
}
}
It's neither the most efficient nor the neatest solution, but it works. Both answers helped me figure this out.

Related

php Compare two text files and output NON matching records

I have two text files of data. The first file has 30 lines that match the 30 lines in the second file, but it also has two additional lines that are added when the operator uploads the file to the directory. I want to find the non-matching lines and output them so they can be used in the same script for a mailout.
I am trying to use this code, which outputs the contents of the two files to screen.
<?php
if ($file1 = fopen(".data1.txt", "r")) {
while(!feof($file1)) { $textperline = fgets($file1);
echo $textperline;
echo "<br>";}
if ($file2 = fopen(".data.txt", "r")) {
while(!feof($file2)) {$textperline1 = fgets($file2);
echo $textperline1;
echo "<br>";}
fclose($file1);
fclose($file2);
}}
?>
But it outputs the whole list of data. Can anyone help with listing only the NON-matching lines?
attached output of the two files from my code
I want to output only lines that are in file2 but not in file1
My suggestion would be to read each file into an array (one line = one element) and then use array_diff to compare them. Unless you have millions of lines, this approach is the easiest.
To reuse your code, this is how you can read the 2 files into two arrays
$list1 = [];
$list2 = [];
if ($file1 = fopen(".data1.txt", "r")) {
    // fgets() returns false at end of file, which ends the loop without adding an empty entry.
    while (($line = fgets($file1)) !== false) {
        $list1[] = trim($line);
    }
    fclose($file1);
}
if ($file2 = fopen(".data.txt", "r")) {
    while (($line = fgets($file2)) !== false) {
        $list2[] = trim($line);
    }
    fclose($file2);
}
If the files are small and you can read them in one go, you can also use a simplified syntax.
$list1 = explode(PHP_EOL, file_get_contents(".data1.txt"));
$list2 = explode(PHP_EOL, file_get_contents(".data.txt"));
Then, no matter which method you chose, you can compare them as follows
$comparison = array_diff($list2, $list1);
foreach ($comparison as $line) {
echo $line."<br />";
}
This will only output the lines of the second array that are not present in the first one.
Make sure that the array with the additional lines is the first argument of array_diff.
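Because the argument order matters, a small self-contained sketch may help. The arrays below are made-up stand-ins for the two files, not the asker's data.

```php
<?php
// In-memory stand-ins for the two files; with real files,
// file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) would produce similar arrays.
$list1 = ['alice', 'bob', 'carol'];
$list2 = ['alice', 'bob', 'carol', 'dave', 'erin'];

// array_diff keeps the values of its first argument that appear in none of the others.
$onlyInList2 = array_diff($list2, $list1);
$onlyInList1 = array_diff($list1, $list2);

// Keys are preserved from the first array, so reindex before relying on positions.
print_r(array_values($onlyInList2));
```

Swapping the arguments gives an empty result here, which is the usual symptom of passing the shorter list first.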
ASSUMPTION
Neither file is huge, so you can read the whole content into memory at once. Under this assumption, you can put the following code at the top:
$file1 = "./data1.txt";
$file2 = "./data2.txt";
$linesOfFile1 = file($file1);
$linesOfFile2 = file($file2);
$newLinesInFile2 = [];
There are a couple cases, which you did not mention in your question.
CASE 1
New lines are only appended to the second file, file2. The solution for this case is the easiest one:
$numberOfRowsFile1 = count($linesOfFile1);
$numberOfRowsFile2 = count($linesOfFile2);
if($numberOfRowsFile2 > $numberOfRowsFile1)
{
$newLinesInFile2 = array_slice($linesOfFile2, $numberOfRowsFile1);
}
CASE 2
The lines with the same content may have different position in each file. Duplicate lines within the same file are ignored.
Furthermore the case sensitivity may play a role. That's why the content of each line should be hashed to make a simpler comparison. For both case sensitive and insensitive comparison the following function is needed:
function buildHashedMap($array, &$hashedMap, $caseSensitive = true)
{
foreach($array as $line)
{
$line = !$caseSensitive ? strtolower($line) : $line;
$hash = md5($line);
$hashedMap[$hash] = $line;
}
}
Case sensitive comparison
$hashedLinesFile1 = [];
buildHashedMap($linesOfFile1, $hashedLinesFile1);
$hashedLinesFile2 = [];
buildHashedMap($linesOfFile2, $hashedLinesFile2);
$newLinesInFile2 = array_diff_key($hashedLinesFile2, $hashedLinesFile1);
Case INSENSITIVE comparison
$caseSensitive = false;
$hashedLinesFile1 = [];
buildHashedMap($linesOfFile1, $hashedLinesFile1, $caseSensitive);
$hashedLinesFile2 = [];
buildHashedMap($linesOfFile2, $hashedLinesFile2, $caseSensitive);
$newLinesInFile2 = array_diff_key($hashedLinesFile2, $hashedLinesFile1);
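As a variation on the same idea, the sketch below uses the normalized line itself as the lookup key instead of an MD5 hash. This is a swapped-in technique, not the code above, and the helper name newLines is hypothetical. It also strips line endings, since file() keeps them by default.

```php
<?php
// Hypothetical helper: lines of $new that do not occur in $old.
function newLines(array $old, array $new, bool $caseSensitive = true): array
{
    $normalize = function (string $line) use ($caseSensitive): string {
        $line = rtrim($line, "\r\n");          // file() keeps line endings by default
        return $caseSensitive ? $line : strtolower($line);
    };
    // Build a lookup set keyed by the normalized content of the old file.
    $seen = array_fill_keys(array_map($normalize, $old), true);
    $result = [];
    foreach ($new as $line) {
        $key = $normalize($line);
        if (!isset($seen[$key])) {
            $seen[$key] = true;                // also skips duplicates within $new
            $result[] = rtrim($line, "\r\n");
        }
    }
    return $result;
}

print_r(newLines(["a\n", "B\n"], ["b\n", "c\n", "a\n"], false));
```

Using the line as the key avoids hashing altogether, at the cost of holding the full text of every old line in the lookup array.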

what is wrong with this code, I am using PHP 8

The loop never stops, and it always prints, not only when $i equals 8.
$file = file_get_contents ($fileUrl);
$i = 0;
while ($line = explode ("\r\n", $file)) {
if ($i == 8) {
print_r ($line);
exit ();
}
$i++;
}
By the way, I need to use file_get_contents because I am using DOM. I wrote that code because I need the data in line number 8. Is there a better way to get a specific line?
It is infinite because explode always explodes the entire file string and it never fails. You can read it into an array, but this is only useful without the exit if you are doing things with other lines in the file:
$i = 0;
foreach(file($fileUrl) as $line) {
    if ($i == 8) { // actually the ninth line
        print_r ($line);
    }
    $i++;
}
Or read it as you are and get the proper line:
$lines = explode("\r\n", $file);
print_r($lines[8]); // actually the ninth line
You're not looping through the lines. You're setting $line to an array of all the lines, not a specific line. And you're setting it to the same thing every time through the loop, so the while condition will never change.
However, the loop should stop when $i == 8 because of the exit() call. It will then print all the lines with print_r().
If you want line 8, just index the array.
$lines = explode("\r\n", $file);
if (count($lines) >= 9) {
print_r($lines[8]);
}
FYI, you can also use file() to read a file and split it into lines:
$lines = file($fileUrl, FILE_IGNORE_NEW_LINES);
If you want the line at index 8 (the ninth line, since the array is zero-based), you can simply do this:
$line = explode ("\r\n", $file)[8];
print_r($line);
No loop is needed.
As for your question about the infinite loop:
$line = explode ("\r\n", $file)
is an assignment whose value is the array returned by explode. That array is never empty, so the condition is always truthy.
To go through the lines one by one, use foreach like this:
foreach(explode ("\r\n", $file) as $line){
// TO DO
}
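The answers above can be folded into one small helper. This is only a sketch: the name nthLine is hypothetical, and it splits on \R rather than "\r\n" so that it also copes with files that use plain \n line endings.

```php
<?php
// Hypothetical helper: return line number $n (1-based) of $text, or null if absent.
function nthLine(string $text, int $n): ?string
{
    // \R matches \r\n, \n and \r, so the helper does not depend on the file's origin.
    $lines = preg_split('/\R/', $text);
    return $lines[$n - 1] ?? null;
}

$text = "one\r\ntwo\nthree";
echo nthLine($text, 2), PHP_EOL;   // prints "two"
var_dump(nthLine($text, 9));       // NULL, because there is no ninth line
```

Returning null for a missing line plays the same role as the count() check shown earlier, but keeps it inside the helper.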

Check if any of array element is less than 2 characters length within a loop in PHP [duplicate]

I'm currently creating a search engine for a database and realized that numbers or short words would mess up the search.
The engine splits the sentence into words, and each word is put into an array called $searches.
An example of a search that creates a mess would look something like:
database info 3
Each word will be looked for everywhere and shown if it matches either search 1 and 2, 1 and 3, or 2 and 3. The problem is, that by searching "3", it creates a mess. Because there are so many things containing just a single character.
So I wonder how I can check the length of all content in an array, and loop this correctly.
This is the code I have so far (Not fully):
$search = $_GET['q']; //Search string
$count = 0; //Counter
$searches = explode(' ', $search); //Creates the array
for ($i = 0; $i < count($searches); $i++) { //Loop for each element in array
if (strlen($searches[$i]) <= 2) { //if it finds an element less than 2 characters
$keyword = $searches[$i]; //remember key word
foreach (glob('database/*/*/*.txt') as $path) { //Look through database
$title = basename($path, ".txt").PHP_EOL; //Get file instead of path
for ($q=0; $q < count($searches); $q++) { //Another loop in case keyword comes last (which it does)
if (($searches[$q] != $keyword) && (strripos($title,$keyword) != false) && (strripos($title,$searches[$q]) != false)) { //check if if keyword is not equal to itself while searching and tries to find a file with a combination of keyword and $searches[$q].
echo $title . '<br>'; //Gives output
$count += 1;
}
}
}
}
}
if($count == 0) {
echo 'NOTHING'; //Nothing
}
It seems like it won't output any files that include both words; in fact, it shows no files at all. The double loop just makes checking the array length complicated. Any clue how to get this to work?
Well, the first bit you have to take care of is:
$search = $_GET['q'];
Unless you are 100% sure your input is safe, you should sanitize it.
Then I would check:
if (strlen($searches[$i]) <= 2) {
As written, the code inside this condition runs only for words of at most 2 characters.
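If the aim is to treat very short terms differently from the rest, one option is to partition the search terms up front instead of checking lengths inside nested loops. The sketch below assumes a minimum length of 3, which is an arbitrary choice, and uses a hard-coded string in place of $_GET['q'].

```php
<?php
// Hypothetical minimum length; terms shorter than this count as "short".
const MIN_TERM_LENGTH = 3;

$search = 'database info 3';   // stands in for $_GET['q']

// Split on any run of whitespace and drop empty pieces from repeated spaces.
$terms = preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY);

// Terms long enough to be searched on their own.
$longTerms = array_values(array_filter(
    $terms,
    fn(string $term): bool => strlen($term) >= MIN_TERM_LENGTH
));

// Everything else, to be used only in combination with a long term.
$shortTerms = array_values(array_diff($terms, $longTerms));

print_r($longTerms);
print_r($shortTerms);
```

With the two lists separated, the file search only needs one loop over the long terms and an optional extra check against the short ones.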
The reason I couldn't get this to work was that this line was written incorrectly:
if (($searches[$q] != $keyword) && (strripos($title,$keyword) != false) && (strripos($title,$searches[$q]) != false)) {
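I had used != instead of !==. strripos() returns 0 when the match is at the very start of the title, and 0 != false evaluates to false, so those matches were discarded. Changing the code to this works: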
if (($searches[$q] != $keyword) && (strripos($title,$keyword) !== false) && (strripos($title,$searches[$q]) !== false)) {
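The difference between the two comparisons is easy to reproduce in isolation. The title string below is made up for the demonstration, and str_contains() assumes PHP 8 or later.

```php
<?php
$title = 'database info 3';

// strripos() returns the offset of the last match, or false when there is none.
$pos = strripos($title, 'database');   // match at offset 0

var_dump($pos != false);    // bool(false): 0 is loosely equal to false
var_dump($pos !== false);   // bool(true): 0 is an int, not the boolean false

// On PHP 8, str_contains() avoids the pitfall for case-sensitive checks;
// for a case-insensitive check, stripos(...) !== false remains the usual idiom.
var_dump(str_contains($title, 'info'));  // bool(true)
```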

Extracting meaningful data from this complicated string in PHP

I'm receiving some structured data for my PHP application, but the format is somewhat unpredictable and difficult to deal with. I don't get a say in the initial format of the data. What I get is a string (sample given below).
[9484,'Víctor Valdés',8,[[['accurate_pass',[15]],['touches',[42]],['saves',[4]],['total_pass',[24]],['good_high_claim',[2]],['formation_place',[1]]]],1,'GK',1,0,0,'GK',31,183,78],[1320,'Carles Puyol',7.76,[[['accurate_pass',[50]],['touches',[75]],['aerial_won',[3]],['total_pass',[55]],['total_tackle',[1]],['formation_place',[6]]]],2,'DC',5,0,0,'D(CLR)',35,178,80],[5780,'Dani Alves',8.21,[[['accurate_pass',[58]],['touches',[99]],['total_scoring_att',[1]],['total_pass',[66]],['total_tackle',[6]],['aerial_lost',[1]],['fouls',[4]],['formation_place',[2]]]],2,'DR',22,0,0,'D(CR)',30,173,64],[83686,'Marc Bartra',8.31,[[['accurate_pass',[64]],['touches',[88]],['won_contest',[1]],['total_scoring_att',[1]],['aerial_won',[1]],['total_pass',[66]],['total_tackle',[5]],['aerial_lost',[1]],['fouls',[1]],['formation_place',[5]]]],2,'DC',15,0,0,'D(C)',22,181,70],[13471,'Adriano',6.72,[[['accurate_pass',[16]],['touches',[28]],['aerial_won',[2]],['total_pass',[18]],['total_tackle',[1]],['formation_place',[3]]]],2,'DL',21,1,31,'D(CLR),M(LR)',29,172,67]
The above is data for 5 football players. This is what I need to get:
[9484,'Víctor Valdés',8,[[['accurate_pass',[15]],['touches',[42]],['saves',[4]],['total_pass',[24]],['good_high_claim',[2]],['formation_place',[1]]]],1,'GK',1,0,0,'GK',31,183,78]
[1320,'Carles Puyol',7.76,[[['accurate_pass',[50]],['touches',[75]],['aerial_won',[3]],['total_pass',[55]],['total_tackle',[1]],['formation_place',[6]]]],2,'DC',5,0,0,'D(CLR)',35,178,80]
[5780,'Dani Alves',8.21,[[['accurate_pass',[58]],['touches',[99]],['total_scoring_att',[1]],['total_pass',[66]],['total_tackle',[6]],['aerial_lost',[1]],['fouls',[4]],['formation_place',[2]]]],2,'DR',22,0,0,'D(CR)',30,173,64]
[83686,'Marc Bartra',8.31,[[['accurate_pass',[64]],['touches',[88]],['won_contest',[1]],['total_scoring_att',[1]],['aerial_won',[1]],['total_pass',[66]],['total_tackle',[5]],['aerial_lost',[1]],['fouls',[1]],['formation_place',[5]]]],2,'DC',15,0,0,'D(C)',22,181,70]
[13471,'Adriano',6.72,[[['accurate_pass',[16]],['touches',[28]],['aerial_won',[2]],['total_pass',[18]],['total_tackle',[1]],['formation_place',[3]]]],2,'DL',21,1,31,'D(CLR),M(LR)',29,172,67]
Now, what I've done manually in the above example I need to do reliably with PHP. As you see, each player has a set of data. In order to split the big string into individual players, I can't just explode it by "],[" because that substring appears within each player's data too an unpredictable number of times.
Each player has a certain number of statistics (accurate_pass, touches etc) but they don't all have the same statistics. For instance, player #1 has "saves" and the others don't. Player #4 has "won_contest" and the others don't. There is no way to know who will have which stats. That means I can't just count commas until the new player or something similar.
Each player has a number before his name, but that number has an unpredictable number of digits and there's no way to discern it from other numbers which may appear in the string.
What I see as a constant occurrence for all players is the last bit: before the last closed bracket there are always 3 integers divided by commas. This type of substring (INT,INT,INT]) doesn't seem to appear in any other situation. Maybe this could be of some use?
A "hard" way to do this is parenthesis counting (less common in PHP, more common in text parsing languages)...
<?php
$str = "[9484,'Víctor Valdés',8,[[['accurate_pass',[15]],['touches',[42]],['saves',[4]],['total_pass',[24]],['good_high_claim',[2]],['formation_place',[1]]]],1,'GK',1,0,0,'GK',31,183,78],[1320,'Carles Puyol',7.76,[[['accurate_pass',[50]],['touches',[75]],['aerial_won',[3]],['total_pass',[55]],['total_tackle',[1]],['formation_place',[6]]]],2,'DC',5,0,0,'D(CLR)',35,178,80],[5780,'Dani Alves',8.21,[[['accurate_pass',[58]],['touches',[99]],['total_scoring_att',[1]],['total_pass',[66]],['total_tackle',[6]],['aerial_lost',[1]],['fouls',[4]],['formation_place',[2]]]],2,'DR',22,0,0,'D(CR)',30,173,64],[83686,'Marc Bartra',8.31,[[['accurate_pass',[64]],['touches',[88]],['won_contest',[1]],['total_scoring_att',[1]],['aerial_won',[1]],['total_pass',[66]],['total_tackle',[5]],['aerial_lost',[1]],['fouls',[1]],['formation_place',[5]]]],2,'DC',15,0,0,'D(C)',22,181,70],[13471,'Adriano',6.72,[[['accurate_pass',[16]],['touches',[28]],['aerial_won',[2]],['total_pass',[18]],['total_tackle',[1]],['formation_place',[3]]]],2,'DL',21,1,31,'D(CLR),M(LR)',29,172,67]";
$line = ',';
$paren_count = 0;
$lines = array();
for($i=0; $i<strlen($str); $i++)
{
$line.= $str[$i];
if($str[$i] == '[') $paren_count++;
elseif($str[$i] == ']')
{
$paren_count--;
if($paren_count == 0)
{
$lines[] = substr($line,1);
$line = '';
}
}
}
print_r($lines);
?>
It looks like @Boundless's answer is correct: you can use json_decode, but you need to do a couple of things to the string first so that it becomes valid JSON.
This worked for me:
<?php
$str = "[9484,'Víctor Valdés',8,[[['accurate_pass',[15]],['touches',[42]],['saves',[4]],['total_pass',[24]],['good_high_claim',[2]],['formation_place',[1]]]],1,'GK',1,0,0,'GK',31,183,78],[1320,'Carles Puyol',7.76,[[['accurate_pass',[50]],['touches',[75]],['aerial_won',[3]],['total_pass',[55]],['total_tackle',[1]],['formation_place',[6]]]],2,'DC',5,0,0,'D(CLR)',35,178,80],[5780,'Dani Alves',8.21,[[['accurate_pass',[58]],['touches',[99]],['total_scoring_att',[1]],['total_pass',[66]],['total_tackle',[6]],['aerial_lost',[1]],['fouls',[4]],['formation_place',[2]]]],2,'DR',22,0,0,'D(CR)',30,173,64],[83686,'Marc Bartra',8.31,[[['accurate_pass',[64]],['touches',[88]],['won_contest',[1]],['total_scoring_att',[1]],['aerial_won',[1]],['total_pass',[66]],['total_tackle',[5]],['aerial_lost',[1]],['fouls',[1]],['formation_place',[5]]]],2,'DC',15,0,0,'D(C)',22,181,70],[13471,'Adriano',6.72,[[['accurate_pass',[16]],['touches',[28]],['aerial_won',[2]],['total_pass',[18]],['total_tackle',[1]],['formation_place',[3]]]],2,'DL',21,1,31,'D(CLR),M(LR)',29,172,67]";
$str = '[' . $str . ']';
$str = str_replace('\'','"', $str);
//convert string to array
$arr = json_decode($str);
//now it's a php array so you can access any value
//echo '<pre>';
//print_r( $arr );
//echo '</pre>';
echo $arr[0][1]; //prints "Víctor Valdés"
?>
Your string looks like JSON but it is not valid JSON so json_decode() will not work.
Your specific case could be converted to valid JSON by wrapping the string in a pair of [] and replacing the single quotes with double quotes:
$string = str_replace("'", '"', $your_string);
var_dump(json_decode('[' . $string . ']'));
Of course the best solution would be to make sure that valid JSON is supplied because this will break easily if your text strings contain for example double quotes.
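A slightly fuller sketch of the quote-swapping approach, with an error check, might look like this. The $raw string is a shortened two-player sample in the same shape as the question's data, and the code assumes that no value contains a double quote or an apostrophe.

```php
<?php
// Shortened sample in the same shape as the question's data (two players only).
$raw = "[9484,'Víctor Valdés',8,[[['touches',[42]],['saves',[4]]]],1,'GK',1,0,0,'GK',31,183,78],"
     . "[1320,'Carles Puyol',7.76,[[['touches',[75]],['aerial_won',[3]]]],2,'DC',5,0,0,'D(CLR)',35,178,80]";

// Assumption: no value contains a double quote or an apostrophe, so swapping quotes is safe.
$json = '[' . str_replace("'", '"', $raw) . ']';
$players = json_decode($json, true);
if ($players === null) {
    // json_last_error_msg() describes why decoding failed.
    throw new RuntimeException('Invalid data: ' . json_last_error_msg());
}

foreach ($players as $player) {
    // Index 3 holds [[ [statName, [value]], ... ]]; flatten it into name => value.
    $stats = [];
    foreach ($player[3][0] as [$statName, $statValues]) {
        $stats[$statName] = $statValues[0];
    }
    echo $player[1], ': ', json_encode($stats), PHP_EOL;
}
```

Once decoded, each player is an ordinary PHP array, so there is no need to split the original string by hand.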
Try parsing as json, then pulling out what you want. Assuming that the data comes in blocks of 4 you can try:
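$arr = json_decode($str);
// Collect the groups in a separate array so the loop does not modify what it iterates over.
$groups = array();
for ($i = 0; $i < count($arr) - 3; $i += 4)
{
    $groups[] = array($arr[$i], $arr[$i + 1], $arr[$i + 2], $arr[$i + 3]);
}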
Why not count the [ in a loop? Here's a quick untested loop that could get you started.
$output = array('');
$brackets = 0;
$index = 0;
foreach (str_split($input) as $ch) {
if ($ch == '[') {
$brackets++;
}
$output[$index] .= $ch;
if ($ch == ']') {
$brackets--;
if ($brackets === 0) {
$index++;
$output[$index] = '';
}
}
}
Not very elegant though...
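A variant of the bracket-counting idea is sketched below. It ignores brackets that occur inside single-quoted strings and skips the commas between groups. The function name splitTopLevelGroups is hypothetical, and the sketch assumes quotes are never escaped inside values.

```php
<?php
// Hypothetical helper: split "[...],[...]" into its top-level bracketed groups.
function splitTopLevelGroups(string $input): array
{
    $groups = [];
    $current = '';
    $depth = 0;
    $inQuote = false;
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $ch = $input[$i];
        if ($ch === "'") {
            $inQuote = !$inQuote;          // assumes quotes are never escaped
        } elseif (!$inQuote && $ch === '[') {
            $depth++;
        } elseif (!$inQuote && $ch === ']') {
            $depth--;
        }
        if ($depth > 0 || $ch === ']') {
            $current .= $ch;               // skips the commas between groups
        }
        if ($depth === 0 && $current !== '') {
            $groups[] = $current;          // a top-level group just closed
            $current = '';
        }
    }
    return $groups;
}

print_r(splitTopLevelGroups("[1,'a]b',[2]],[3]"));
```

A name containing an apostrophe would confuse the quote tracking, so this stays a fallback for input that cannot be turned into valid JSON.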

problem with PHP reading CSV files

I'm trying to read data from a .csv file to output it on a webpage as text.
It's the first time I'm doing this and I've run into a nasty little problem.
My .csv file (which gets opened by Excel by default) has multiple rows, and I read the entire thing as one long string,
like this:
$contents = file_get_contents("files/data.csv");
In this example file I made, there are 2 lines.
Paul    Blueberryroad 85    us    Flashlight,Bag    November 20, 2008, 4:39 pm
Hellen    Blueberryroad 85    us    lens13mm,Flashlight,Bag,ExtraBatteries    November 20, 2008, 16:41:32
But the string read by PHP is this:
Paul;Blueberryroad 85;us;Flashlight,Bag;November 20, 2008, 4:39 pmHellen;Blueberryroad 85;us;lens13mm,Flashlight,Bag,ExtraBatteries;November 20, 2008, 16:41:32
I'm splitting this with:
list($name[], $street[], $country[], $accessories[], $orderdate[]) = split(";",$contents);
What I want is for $name[] to contain "Paul" and "Hellen" as its contents. And the other arrays to receive the values of their respective columns.
Instead I get only Paul and the content of $orderdate[] is
November 20, 2008, 4:39 pmHellen
So all the rows are concatenated. Can someone show me how i can achieve what I need?
EDIT: solution found, just one weird thing remaining:
I've solved it now by using this piece of code:
$fo = fopen("files/users.csv", "rb+");
while(!feof($fo)) {
$contents[] = fgetcsv($fo,0,';');
}
fclose($fo);
For some reason, although my CSV file only has 2 rows, it returns 2 arrays and 1 boolean. The first 2 are my data arrays and the boolean is 0.
You are better off using fgetcsv() which is aware of CSV file structure and has designated options for handling CSV files. Alternatively, you can use str_getcsv() on the contents of the file instead.
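For the str_getcsv() route, a minimal sketch could look like the following. The $contents string stands in for the result of file_get_contents(), and the code assumes that no field contains a line break, since the text is split into lines before each line is parsed.

```php
<?php
// Stand-in for file_get_contents("files/data.csv"); real files may use \r\n line endings.
$contents = "Paul;Blueberryroad 85;us;Flashlight,Bag;November 20, 2008, 4:39 pm\n"
          . "Hellen;Blueberryroad 85;us;lens13mm,Flashlight,Bag,ExtraBatteries;November 20, 2008, 16:41:32\n";

$rows = [];
// Split into lines first (\R matches any line ending), then parse each line as CSV.
foreach (preg_split('/\R/', $contents, -1, PREG_SPLIT_NO_EMPTY) as $line) {
    $rows[] = str_getcsv($line, ';', '"', '\\');
}

print_r($rows);
```

Each element of $rows is one record with five fields, so the commas inside the accessories and date fields stay intact.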
The file() function reads a file into an array; every line is one entry of the array.
So you can do something like:
$rows = array();
$name = array();
$street = array();
$country = array();
$rows = file("file.csv");
foreach($rows as $r) {
$data = explode(";", $r);
$name[] = $data[0];
$street[] = $data[1];
$country[] = $data[2];
}
The remark about fgetcsv is correct.
I will still answer your question, for educational purposes. First, I don't understand the difference between your data (with commas) and the "string read by PHP" (it substitutes semicolons for some spaces, but not all?).
PS: I looked at the source of your message; it looks like an odd mix of TSV (tabs) and CSV (commas).
Besides, if you want to go this way, you need to split the file into lines first, then the lines into fields.
The best way is of course fgetcsv() as pointed out.
$f = fopen ('test.csv', 'r');
while (false !== $data = fgetcsv($f, 0, ';'))
$arr[] = $data;
fclose($f);
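Testing the return value of fgetcsv() rather than feof() is also what avoids a trailing boolean: with a while(!feof(...)) loop, the final fgetcsv() call at end of file returns false and gets stored as an extra element. The sketch below repeats the loop on an in-memory stream (an assumption made so that it runs without a file) and then builds the per-column arrays the question asked for.

```php
<?php
// Write the sample rows to an in-memory stream so the sketch needs no file on disk.
$fo = fopen('php://memory', 'w+');
fwrite($fo, "Paul;Blueberryroad 85;us\nHellen;Blueberryroad 85;us\n");
rewind($fo);

$rows = [];
// Stop on false (end of file or error) instead of testing feof() before each read.
while (($data = fgetcsv($fo, 0, ';', '"', '\\')) !== false) {
    $rows[] = $data;
}
fclose($fo);

// Turn rows into per-column arrays, as the question asked for.
$name    = array_column($rows, 0);
$street  = array_column($rows, 1);
$country = array_column($rows, 2);
print_r($name);
```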
But if you have the contents in a variable and want to split it, and str_getcsv is unavailable you can use this:
function str_split_csv($text, $seperator = ';') {
$regex = '#' . preg_quote($seperator) . '|\v#';
preg_match('|^.*$|m', $text, $firstline);
$chunks = substr_count($firstline[0], $seperator) + 1;
$split = array_chunk(preg_split($regex, $text), $chunks);
$c = count($split) - 1;
if (isset($split[$c]) && ((count($split[$c]) < $chunks) || (($chunks == 1) && ($split[$c][0] == ''))))
unset($split[$c]);
return $split;
}
