How to correctly use array_diff() - php

I have the following code:
$l1 = file($file1['tmp_name']);// get file 1 contents
$l2 = file($file2['tmp_name']);// get file 2 contents
$l3 = array_diff($l1, $l2);// create diff array
Here are the files:
File 1:
6974527983
6974527984
6974527985
File 2:
6974527983
$l3 should be:
6974527984
6974527985
But, instead it is just spitting out the values from File 1:
6974527983
6974527984
6974527985
Am I setting this up right?
Update -
Using print_r(), I have verified that the files being loaded are being properly parsed into arrays:
File 1 -
Array ( [0] => 6974527983 [1] => 6974527984 [2] => 6974527985 ) 1
File 2 -
Array ( [0] => 6974527983 ) 1
So I don't believe there are any issues with the newlines in the text files.

If each number is on a new line, you could try reading each file as a string, splitting it on line breaks, and comparing the arrays that way.
$l1 = explode("\n", file_get_contents($file1['tmp_name']));
$l2 = explode("\n", file_get_contents($file2['tmp_name']));
$l3 = array_diff($l1, $l2);

Using the following example you can see that array_diff() works as expected:
$a = array(
6974527983,
6974527984,
6974527985
);
$b = array(
6974527983
);
var_dump(array_diff($a, $b));
Output:
array(2) {
[1] =>
int(6974527984)
[2] =>
int(6974527985)
}
This shows that file($file2['tmp_name']) is the problem in your case. Try:
var_dump(file($file2['tmp_name']));
to check the file's contents.

Okay, I will post an answer as I think this will solve your issue.
Without knowing more about the structure of your files, we can only assume that there is a possible issue with line endings. There are three possible line endings:
Unix: \n
Windows: \r\n
Classic mac: \r
I see two possible scenarios here:
The line endings in each file are different to each other
The line endings in both files are \r (classic mac)
As Mark Baker pointed out, you should use the FILE_IGNORE_NEW_LINES flag as the second argument for each of your file() calls. This, as far as I can make out from quickly experimenting here, should resolve the issue if one file had Unix and the other had Windows line endings.
However, it does not seem to deal well in cases where at least one file has '\r' line endings. In this case, there's an ini setting that might help:
ini_set('auto_detect_line_endings', true);
Consulting the docs for auto_detect_line_endings:
When turned on, PHP will examine the data read by fgets() and file() to see if it is using Unix, MS-Dos or Macintosh line-ending conventions.
This enables PHP to interoperate with Macintosh systems, but defaults to Off, as there is a very small performance penalty when detecting the EOL conventions for the first line, and also because people using carriage-returns as item separators under Unix systems would experience non-backwards-compatible behaviour.
So, TL;DR: debug your line endings to make sure you know what's going on (with file or hexdump or similar), and use a combination of auto_detect_line_endings and FILE_IGNORE_NEW_LINES.
Hope this helps :)
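To make the advice above concrete, here is a minimal sketch that sidesteps file() entirely: it reads each file as a string and splits on all three line-ending conventions before the arrays are compared. The helper name readLines() is hypothetical and not from the question, so treat this as an illustration under that assumption rather than the definitive fix.

```php
<?php
// Hypothetical helper (not from the original post): read a file and split it
// on Windows (\r\n), classic Mac (\r) and Unix (\n) line endings alike.
function readLines(string $path): array
{
    $contents = file_get_contents($path);
    if ($contents === false) {
        throw new RuntimeException("Could not read $path");
    }
    // \r\n is listed first so a Windows line ending is consumed as one separator.
    // PREG_SPLIT_NO_EMPTY drops blank lines, including the empty piece that
    // would otherwise follow a trailing newline.
    return preg_split('/\r\n|\r|\n/', $contents, -1, PREG_SPLIT_NO_EMPTY);
}
```

With this in place, array_diff(readLines($file1['tmp_name']), readLines($file2['tmp_name'])) compares the bare numbers, so a stray \r or \n can no longer make otherwise identical lines look different.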

Related

PHP Associative array strange behavior

I am using an associative array which I initialized like this:
$img_captions = array();
Then, later in the code I am filling it in a while loop with keys and values coming in from a .txt file (every line in that .txt file contains a pair - a string - separated by '|') looking like this:
f1.jpg|This is a caption for this specific file
f2.jpg|Yea, also this one
f3.jpg|And this too for sure
...
I am filling the associative array with those data like this:
if (file_exists($currentdir ."/captions.txt"))
{
    $file_handle = fopen($currentdir ."/captions.txt", "rb");
    while (!feof($file_handle) )
    {
        $line_of_text = fgets($file_handle);
        $parts = explode('/n', $line_of_text);
        foreach($parts as $img_capts)
        {
            list($img_filename, $img_caption) = explode('|', $img_capts);
            $img_captions[$img_filename] = $img_caption;
        }
    }
    fclose($file_handle);
}
When I test whether that associative array actually contains keys and values, like this:
print_r(array_keys($img_captions));
print_r(array_values($img_captions));
...I see it contains them as expected, BUT when I try to access them directly, for instance:
echo $img_captions['f1.jpg'];
I get a PHP error saying:
Notice: Undefined index: f1.jpg in...
I am clueless what is going on here - can anyone tell, please?
BTW I am using USBWebserver with PHP 5.3.
UPDATE 1: Exploring the output of print_r(array_keys($img_captions)); more closely in Chrome's developer tools (F12), I noticed something strange. The first entry, '[0] => f1.jpg', looks normal when displayed as print_r() output on the page, but in the page source (F12) it is actually encoded like this:
Array
(
[0] => &#65279;f1.jpg
[1] => f2.jpg
[2] => f3.jpg
[3] => f4.jpg
[4] => f5.jpg
[5] => f6.jpg
[6] => f7.jpg
[7] => f8.jpg
[8] => f9.jpg
[9] => f10.jpg
)
So when I test anything other than the first line, it works OK. I tried deleting the file completely and rewriting it from scratch, but the same thing still occurs...
DISCLAIMER: Guys, just to clarify: THIS IS NOT MY ORIGINAL CODE (that is, not 'done completely by me'). It is actually the MiniGal Nano PHP photo gallery, which I modified to suit my needs, but the specific parts we are talking about are FROM THE ORIGINAL AUTHOR.
I recommend using file() along with trim().
Your code becomes short, readable and easy to understand.
$parts = file('your text file url', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$img_captions = [];
foreach ($parts as $img_capts) {
    list($img_filename, $img_caption) = explode('|', $img_capts);
    $img_captions[trim(preg_replace("/&#?[a-z0-9]+;/i", "", $img_filename))] = trim(preg_replace("/&#?[a-z0-9]+;/i", "", $img_caption));
}
print_r($img_captions);
So after a while I realized there is something wrong with my .txt file itself:
IT ALWAYS PUTS SOME STRANGE CHARACTERS IN FRONT OF THE FIRST LINE. WHATEVER I DID, THEY WERE ALWAYS THERE, EVEN WITH A NEW FILE CREATED FROM SCRATCH (although they are INVISIBLE unless viewed in the page source!)
So I decided to test it with another format, this time a .log file, and all of a sudden everything works just fine.
I do not know if it is just a local problem of some sort (most probably it is) or something else I am not aware of.
But my solution was changing the type of the file holding the string pairs (.txt => .log), which solved this 'problem' for me.
Another possible solution, as @AbraCadaver said:
(Those strange signs: [0] => &#65279;f1.jpg) That's the HTML entity for a BYTE ORDER MARK or BOM; save your file with no BOM in whatever editor you're using.
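If re-saving the file without a BOM is not an option, the BOM can also be removed in code before the keys are built. The following is only a sketch under that assumption; stripUtf8Bom() and parseCaptions() are hypothetical names, and it assumes the file is UTF-8 (other encodings use different BOM bytes).

```php
<?php
// Hypothetical helper: remove a leading UTF-8 byte order mark, if present.
function stripUtf8Bom(string $text): string
{
    $bom = "\xEF\xBB\xBF"; // UTF-8 encoding of U+FEFF
    return strncmp($text, $bom, 3) === 0 ? substr($text, 3) : $text;
}

// Hypothetical helper: turn "filename|caption" lines into an associative array.
function parseCaptions(array $lines): array
{
    $captions = [];
    $first = true;
    foreach ($lines as $line) {
        if ($first) {
            $line = stripUtf8Bom($line); // a BOM can only sit at the very start of the file
            $first = false;
        }
        $parts = explode('|', $line, 2); // limit 2: captions may contain '|'
        if (count($parts) === 2) {
            $captions[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $captions;
}
```

Called as parseCaptions(file($currentdir . "/captions.txt", FILE_IGNORE_NEW_LINES)), the first key becomes plain 'f1.jpg', so $img_captions['f1.jpg'] should no longer raise the undefined index notice.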

CSV file line count not working in PHP

I have a webpage that needs to count the number of lines in a CSV file, but the following code isn't working:
$linecount = count(file("sample.csv"));
var_dump($linecount);
When I run this code, the code returns the number 1, but there are 8 lines in sample.csv. Does anybody know why this is happening and how to fix it?
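If sample.csv was created on a Mac, it may use bare carriage returns (\r) as line endings, which file() does not split on by default. In that case you might want to consider setting auto_detect_line_endings to On.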
From the manual:
auto_detect_line_endings boolean
When turned on, PHP will examine the data read by fgets() and file() to see if it is using
Unix, MS-Dos or Macintosh line-ending conventions.
Another option (if you don't want to use this) is to read the file and split the lines by all new-line options (\r\n|\r|\n):
$linecount = count(preg_split("/\r\n|\r|\n/", file_get_contents("sample.csv")));
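One thing to watch with the split-based count: if the file ends with a newline, the split yields a final empty element, so the count comes out one too high. A small sketch that accounts for this (countLines() is a hypothetical name, and counting a trailing newline as "no extra line" is my assumption about what is wanted):

```php
<?php
// Hypothetical helper: count lines in a string regardless of \r\n, \r or \n endings.
function countLines(string $contents): int
{
    if ($contents === '') {
        return 0;
    }
    $lines = preg_split('/\r\n|\r|\n/', $contents);
    // A trailing line ending produces one empty element at the end; drop it.
    if (end($lines) === '') {
        array_pop($lines);
    }
    return count($lines);
}
```

Usage would be $linecount = countLines(file_get_contents("sample.csv")); note that this still loads the whole file into memory, like the one-liner above.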

fgetcsv returns too many entries

I have the following code:
while (!feof($file)) {
$arrayOfIdToBodyPart = fgetcsv($file,0, "\t");
if (count($arrayOfIdToBodyPart)==2){
The problem is, the contents of the file look like this:
39 ankle
40 tibia
41 Vastus Intermedius
and so on
Sometimes the test in the if will show three entries, with the first being the number, the second being the name, and the third being just... empty.
This causes the if block to fail, and me to be sad. I know I can just make the if block test for >=2, but is there any way I can get it to just recognise the fact that there are two items? I don't like that fgetcsv is finding "mystery" characters at the end of the line.
Is this possibly a case of a Unix server reading a Windows-formatted file? If so, and I'm running an Ubuntu server without dos2unix, where do I get it?
You probably have tabs at the end of a line:
value<tab>value<tab><newline>
If that's the case, dos2unix won't help you. You might have to do something like read each line into a variable, trim() the variable, and then use str_getcsv() to split it.
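A rough sketch of that read, trim, split approach is below. parseTabLine() is a hypothetical name; I use rtrim() rather than trim() so that only trailing tabs and line endings are removed and a leading empty field would survive.

```php
<?php
// Hypothetical helper: split one tab-separated line, ignoring trailing tabs
// and any \r or \n left over from the line ending.
function parseTabLine(string $line): array
{
    $clean = rtrim($line, "\t\r\n");
    // Enclosure and escape are passed explicitly to match the documented defaults.
    return str_getcsv($clean, "\t", '"', '\\');
}
```

In the loop from the question this would replace the fgetcsv() call with something like parseTabLine(fgets($file)), after checking that fgets() did not return false.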
Is it possible that you have a tab at the end of those lines? They are invisible and often hard to spot... you might want to double check.
Also, if you are working with CSV files while running Windows locally and Unix on the server, I found that this line:
ini_set('auto_detect_line_endings', true);
saves a lot of headaches.

Merge two large CSV files with PHP

I want to merge two large CSV files with PHP. These files are too big to fit into memory all at once. In pseudocode, I can think of something like this:
for i in file1
file3.write(file1.line(i) + ',' + file2.line(i))
end
But when I'm looping through a file using fgetcsv, it's not really clear how I would grab line n from a certain file without loading the whole thing into memory first.
Any ideas?
Edit: I forgot to mention that each of the two files has the same number of lines and they have a one-to-one relationship. That is, line 62,324 in file1 goes with line 62,324 in file2.
Not sure what operating system you're on, but if you're using Linux, using the paste command is probably a lot easier than trying to do this in PHP.
If this is a viable solution and you don't absolutely need to do it in PHP, you could try the following:
paste -d ',' file1 file2 > combined_file
Take a look at the fgets function. You could read a single line of each file, process them, and write them to your new file, then move on to the next line until you've reached the end of your file.
PHP: fgets
Specifically, look at the example titled Example #1 Reading a file line by line in the PHP manual. It's also important to note the return value of the fgets function:
Returns a string of up to length - 1 bytes read from the file pointed to by handle. If there is no more data to read in the file pointer, then FALSE is returned.
So, if it doesn't return FALSE you know you still have more lines to process.
You can use fgets().
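$file1 = fopen('file1.txt', 'r');
$file2 = fopen('file2.txt', 'r');
$merged = fopen('merged.txt', 'w');
while (
    ($line1 = fgets($file1)) !== false
    && ($line2 = fgets($file2)) !== false
) {
    // fgets() keeps the line ending, so strip it from the first line;
    // otherwise each merged pair would be split across two output lines.
    fwrite($merged, rtrim($line1, "\r\n") . ',' . $line2);
}
fclose($file1);
fclose($file2);
fclose($merged);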
fgets() reads one line from a file. As you can see, this code uses it on both files at the same time, writing the merged lines to a third file. The manual here:
http://php.net/fgets
http://php.net/fopen
http://php.net/fwrite
Try using fgets() to read one line from each file at a time.
I think one solution is to first record the byte offset at which each line begins (plus some kind of key, if you need one), and then build the new CSV using fseek, fread and fwrite (since we then know where each line begins and ends, we only need to seek and read).
Another way is to load both files into MySQL (if that is possible) and then export the result back to a new CSV.
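For the offset-mapping idea, a minimal sketch might look like the following. Both function names are hypothetical, and the offsets array still costs one integer per line, so this only pays off when lines have to be fetched out of order; for the one-to-one merge in the question, reading both files sequentially with fgets() is simpler.

```php
<?php
// Hypothetical helper: record the byte offset at which every line starts.
function indexLineOffsets($handle): array
{
    $offsets = [];
    rewind($handle);
    while (true) {
        $pos = ftell($handle);
        if (fgets($handle) === false) {
            break; // end of file: $pos was not the start of a line
        }
        $offsets[] = $pos;
    }
    return $offsets;
}

// Hypothetical helper: fetch line $n (0-based) without its line ending.
function readLineAt($handle, array $offsets, int $n): string
{
    fseek($handle, $offsets[$n]);
    return rtrim(fgets($handle), "\r\n");
}
```

Only the line currently being read is held in memory; the rest of the file stays on disk.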

Read in text file line by line php - newline not being detected

I have a php function I wrote that will take a text file and list each line as its own row in a table.
The problem is the classic "works fine on my machine", but of course when I ask somebody else to generate the .txt file I am looking for, it keeps reading in the whole file as one line. When I open it in my text editor, it looks just as I would expect, with a new name on each line, but it's the newline character or something that is throwing it off.
So far I have come to the conclusion it might have something to do with whatever text editor they are using on their Mac system.
Does this make sense? And is there any easy way to just detect the character that the text editor is recognizing as a new line and replace it with a standard one that PHP will recognize?
UPDATE: Adding the following line solved the issue.
ini_set('auto_detect_line_endings',true);
Function:
function displayTXTList($fileName) {
    if(file_exists($fileName)) {
        $file = fopen($fileName,'r');
        while(!feof($file)) {
            $name = fgets($file);
            echo('<tr><td align="center">'.$name.'</td></tr>');
        }
        fclose($file);
    } else {
        echo('<tr><td align="center">placeholder</td></tr>');
    }
}
This doesn't work for you?
http://us2.php.net/manual/en/filesystem.configuration.php#ini.auto-detect-line-endings
What's wrong with file()?
foreach (file($fileName) as $name) {
echo('<tr><td align="center">'.$name.'</td></tr>');
}
From the man page of fgets:
Note: If PHP is not properly recognizing the line endings when reading files either on or created by a Macintosh computer, enabling the auto_detect_line_endings run-time configuration option may help resolve the problem.
Also, have you tried the file function? It returns an array; each element in the array corresponds to a line in the file.
Edit: if you don't have access to the php.ini, what web server are you using? In Apache, you can change PHP settings using a .htaccess file. There is also the ini_set function which allows changing settings at runtime.
This is a classic case of the newline problem.
ASCII defines several different "newline" characters. The two specific ones we care about are ASCII 10 (line feed, LF) and 13 (carriage return, CR).
All Unix-based systems, including OS X, Linux, etc. will use LF as a newline. Mac OS Classic used CR just to be different, and Windows uses CR LF (that's right, two characters for a newline - see why no one likes Windows? Just kidding) as a newline.
Hence, text files from someone on a Mac (assuming it's a modern OS) would all have LF as their line ending. If you're trying to read them on Windows, and Windows expects CR LF, it won't find it. Now, it has already been mentioned that PHP has the ability to sort this mess out for you, but if you prefer, here's a memory-hogging solution:
$file = file_get_contents("filename");
$array = preg_split("/\015?\012/", $file); # won't work for Mac Classic
Of course, you can do the same thing with file() (as has already been mentioned).
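If you would rather not depend on the ini setting (auto_detect_line_endings has been deprecated since PHP 8.1), another option is to normalize the line endings yourself before splitting. This is just a sketch; normalizeLineEndings() and splitLines() are hypothetical names, and like the snippet above it loads the whole file into memory.

```php
<?php
// Hypothetical helper: convert Windows (\r\n) and classic Mac (\r) endings to \n.
function normalizeLineEndings(string $text): string
{
    // \r\n is replaced first so it does not become two newlines via the \r rule.
    return str_replace(["\r\n", "\r"], "\n", $text);
}

// Hypothetical helper: split text into lines, ignoring trailing newlines.
function splitLines(string $text): array
{
    return explode("\n", rtrim(normalizeLineEndings($text), "\n"));
}
```

In displayTXTList() this would mean looping over splitLines(file_get_contents($fileName)) instead of calling fgets() in a while loop.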
