fgetcsv returns too many entries - php

I have the following code:
while (!feof($file)) {
$arrayOfIdToBodyPart = fgetcsv($file,0, "\t");
if (count($arrayOfIdToBodyPart)==2){
the problem is, the contents of the file look like this:
39 ankle
40 tibia
41 Vastus Intermedius
and so on
sometimes, the test in the if will show three entries, with the first being the number, the second being the name, and the third being just... emtpy.
This causes the if block to fail, and me to be sad. I know i can just make the if block test for >=2, but is there any way i can get it to just recognise the fact that there are two items? I don't like that the fgetcsv is finding "mystery" characters at the end of the line.
Is this possibly a unix server running a windows-based file error? If so, and i'm running an ubuntu server without dos2unix, where do i get it?

You probably have tabs at the end of a line:
value<tab>value<tab><newline>
If that's the case, dos2unix won't help you. You might have to do something like read each line into a variable, trim() the variable, and then use str_getcsv() to split it.

Is it possible that you have a tab at the end of those lines? They are invisible and often hard to spot... you might want to double check.
Also if you are working with csv files, while you are running windows locally and the server is unix, I found this line:
ini_set('auto_detect_line_endings', true);
saves a lot of headaches.

Related

remove last line(s) if they are empty from a bunch of files

Some hosting provider I work with is very strict, if the last line(s) of a php file are empty it throws a fatal error and the site is broken.
The problem is that I've got a lot of php files where the last line might be empty and now I am looking for a way to automatically remove those lines...
I've tried to use sed:
sed -i '${/^$/d}' bla.php
but all i get is an error
sed: 1: "bla.php": extra characters at the end of d command
btw: i am using iTerm on the Mac
You can do something like this within your loop.
file_put_contents('[target_file]', implode('', file('[file_being_processed]', FILE_SKIP_EMPTY_LINES)));
The target_file would be your new copy of the file you're processing and the file_being_processed is self explanatory. You would need to add more logic into the loop to keep the same names.
Bare in mind the above code will exclude all blank lines no matter where they are, so you will also need to add some logic to only apply the above code to the last line of the file.
Have you tried turning off error_reporting?
Insert the following on the first line of your php files
<?php error_reporting(0); ?>

Getting different output for same PHP code

(Can't paste the exact question as the contest is over and I am unable to access the question. Sorry.)
Hello, recently I took part in a programming contest (PHP). I tested the code on my PC and got the desired output but when I checked my code on the contest website and ideone, I got wrong output. This is the 2nd time the same thing has happened. Same PHP code but different output.
It is taking input from command line. The purpose is to bring substrings that contact the characters 'A','B','C','a','b','c'.
For example: Consider the string 'AaBbCc' as CLI input.
Substrings: A,a,B,b,C,c,Aa,AaB,AaBb,AaBbC,AaBbCc,aB,aBb,aBbC,aBbCc,Bb,BbC,BbCc,bC,bCc,Cc.
Total substrings: 21 which is the correct output.
My machine:
Windows 7 64 Bit
PHP 5.3.13 (Wamp Server)
Following is the code:
<?php
$stdin = fopen('php://stdin', 'r');
while(true) {
$t = fread($stdin,3);
$t = trim($t);
$t = (int)$t;
while($t--) {
$sLen=0;
$subStringsNum=0;
$searchString="";
$searchString = fread($stdin,20);
$sLen=strlen($searchString);
$sLen=strlen(trim($searchString));
for($i=0;$i<$sLen;$i++) {
for($j=$i;$j<$sLen;$j++) {
if(preg_match("/^[A-C]+$/i",substr($searchString,$i,$sLen-$j))) {$subStringsNum++;}
}
}
echo $subStringsNum."\n";
}
die;
}
?>
Input:
2
AaBbCc
XxYyZz
Correct Output (My PC):
21
0
Ideone/Contest Website Output:
20
0
You have to keep in mind that your code is also processing the newline symbols.
On Windows systems, newline is composed by two characters, which escaped representation is \r\n.
On UNIX systems including Linux, only \n is used, and on MAC they use \r instead.
Since you are relying on the standard output, it will be susceptible to those architecture differences, and even if it was a file you are enforcing the architecture standard by using the flag "r" when creating the file handle instead of "rb", explicitly declaring you don't want to read the file in binary safe mode.
You can see in in this Ideone.com version of your code how the PHP script there will give the expected output when you enforce the newline symbols used by your home system, while in this other version using UNIX newlines it gives the "wrong" output.
I suppose you should be using fgets() to read each string separetely instead of fread() and then trim() them to remove those characters before processing.
I tried to analyse this code and that's what I know:
It seems there are no problems with input strings. If there were any it would be impossible to return result 20
I don't see any problem with loops, I usually use pre-incrementation but it shouldn't affect result at all
There are only 2 possibilities for me that cause unexpected result:
One of the loops iteration isn't executed - it could be only the last one inner loop (when $i == 5 and then $j == 5 because this loop is run just once) so it will match difference between 21 and 20.
preg_match won't match this string in one of occurrences (there are 21 checks of preg_match and one of them - possible the last one doesn't match).
If I had to choose I would go for the 1st possible cause. If I were you I would contact concepts author and ask them about version and possibility to test other codes. In this case the most important is how many times preg_match() is launched at all - 20 or 21 (using simple echo or extra counter would tell us that) and what are the strings that preg_match() checks. Only this way you can find out why this code doesn't work in my opinion.
It would be nice if you could put here any info when you find out something more.
PS. Of course I also get result 21 so it's hard to say what could be wrong

How to keep file to 1000 lines with Linux or PHP?

I have a file that I'm using to log IP addresses for a client. They want to keep the last 500 lines of the file. It is on a Linux system with PHP4 (oh no!).
I was going to add to the file one line at a time with new IP addresses. We don't have access to cron so I would probably need to make this function do the line-limit cleanup as well.
I was thinking either using like exec('tail [some params]') or maybe reading the file in with PHP, exploding it on newlines into an array, getting the last 1000 elements, and writing it back. Seems kind of memory intensive though.
What's a better way to do this?
Update:
Per #meagar's comment below, if I wanted to use the zip functionality, how would I do that within my PHP script? (no access to cron)
if(rand(0,10) == 10){
shell_exec("find . logfile.txt [where size > 1mb] -exec zip {} \;")
}
Will zip enumerate the files automatically if there is an existing file or do I need to do that manually?
The fastest way is probably, as you suggested, to use tail:
passthru("tail -n 500 $filename");
(passthru does the same as exec only it outputs the entire program output to stdout. You can capture the output using an output buffer)
[edit]
I agree with a previous comment that a log rotate would be infinitely better... but you did state that you don't have access to cron so I'm assuming you can't do logrotate either.
logrotate
This would be the "proper" answer, and it's not difficult to set this up either.
You may get the number of lines using count(explode("\n", file_get_contents("log.txt"))) and if it is equal to 1000, get the substring starting from the first \n to the end, add the new IP address and write the whole file again.
It's almost the same as writing the new IP by opening the file in a+ mode.

Why might my PHP log file not entirely be text?

I'm trying to debug a plugin-bloated Wordpress installation; so I've added a very simple homebrew logger that records all the callbacks, which are basically listed in a single, ultimately 250+ row multidimensional array in Wordpress (I can't use print_r() because I need to catch them right before they are called).
My logger line is $logger->log("\t" . $callback . "\n");
The logger produces a dandy text file in normal situations, but at two points during this particular task it is adding something which causes my log file to no longer be encoded properly. Gedit (I'm on Ubuntu) won't open the file, claiming to not understand the encoding. In vim, the culprit corrupt callback (which I could not find in the debugger, looking at the array) is about in the middle and printed as ^#lambda_546 and at the end of file there's this cute guy ^M. The ^M and ^# are blue in my vim, which has no color theme set for .txt files. I don't know what it means.
I tried adding an is_string($callback) condition, but I get the same results.
Any ideas?
^# is a NUL character (\0) and ^M is a CR (\r). No idea why they're being generated though. You'd have to muck through the source and database to find out. geany should be able to open the file easily enough though.
Seems these cute guys are a result of your callback formatting for windows.
Mystery over. One of the callbacks was an anonymous function. Investigating the PHP create_function documentation, I saw that a commenter had noted that the created function has a name like so: chr(0) . lambda_n. Thanks PHP.
As for the \r. Well, that is more embarrassing. My logger reused some older code that I previously written which did end lines in \r\n.

Read in text file line by line php - newline not being detected

I have a php function I wrote that will take a text file and list each line as its own row in a table.
The problem is the classic "works fine on my machine", but of course when I ask somebody else to generate the .txt file I am looking for, it keeps on reading in the whole file as 1 line. When I open it in my text editor, it looks just as how I would expect it with a new name on each line, but its the newline character or something throwing it off.
So far I have come to the conclusion it might have something to do with whatever text editor they are using on their Mac system.
Does this make sense? and is there any easy way to just detect this character that the text editor is recognizing as a new line and replace it with a standard one that php will recognize?
UPDATE: Adding the following line solved the issue.
ini_set('auto_detect_line_endings',true);
Function:
function displayTXTList($fileName) {
if(file_exists($fileName)) {
$file = fopen($fileName,'r');
while(!feof($file)) {
$name = fgets($file);
echo('<tr><td align="center">'.$name.'</td></tr>');
}
fclose($file);
} else {
echo('<tr><td align="center">placeholder</td></tr>');
}
}
This doesn't work for you?
http://us2.php.net/manual/en/filesystem.configuration.php#ini.auto-detect-line-endings
What's wrong with file()?
foreach (file($fileName) as $name) {
echo('<tr><td align="center">'.$name.'</td></tr>');
}
From the man page of fgets:
Note: If PHP is not properly recognizing the line endings when reading files either on or created by a Macintosh computer, enabling the auto_detect_line_endings run-time configuration option may help resolve the problem.
Also, have you tried the file function? It returns an array; each element in the array corresponds to a line in the file.
Edit: if you don't have access to the php.ini, what web server are you using? In Apache, you can change PHP settings using a .htaccess file. There is also the ini_set function which allows changing settings at runtime.
This is a classic case of the newline problem.
ASCII defines several different "newline" characters. The two specific ones we care about are ASCII 10 (line feed, LF) and 13 (carriage return, CR).
All Unix-based systems, including OS X, Linux, etc. will use LF as a newline. Mac OS Classic used CR just to be different, and Windows uses CR LF (that's right, two characters for a newline - see why no one likes Windows? Just kidding) as a newline.
Hence, text files from someone on a Mac (assuming it's a modern OS) would all have LF as their line ending. If you're trying to read them on Windows, and Windows expects CR LF, it won't find it. Now, it has already been mentioned that PHP has the ability to sort this mess out for you, but if you prefer, here's a memory-hogging solution:
$file = file_get_contents("filename");
$array = split("/\012\015?/", $file); # won't work for Mac Classic
Of course, you can do the same thing with file() (as has already been mentioned).

Categories