gzgets reaches end of file early - php

I have a gzipped text file that I'm trying to read within PHP (using gzopen/gzgets). The file is somewhat large, around 158,000 lines. The script works fine except when it gets to line 157,237 of the file, it reads in part of the line then acts as if it's reached EOF. I'm able to unzip the file and confirm the rest of the file does exist. I wrote a simple script to test:
<?php
$handle = gzopen('/path/to/file.gz','r');
while(true) {
echo gzgets($handle,4096);
}
?>
It reads in everything perfectly then suddenly gets to this line and prints:
GUAN XIN 508|R34745|CH|CGO|100|
and nothing else. It just sits there (the version that loops on while(!gzeof($handle)) instead of while(true) exits at this point).
If I gunzip the file and go to that line, I see:
GUAN XIN 508|R34745|CH|CGO|100| | | | |BEGS| | | | |133|19| | | | | | | | | | | | |413669000|1|
So the data is there. Is there some sort of size limitation on the zlib functions that I'm not aware of?
UPDATE: I ran it through a 'cat -vet' to look for special characters... nothing.

Solved by updating zlib to 1.2.7. We had been running 1.2.3, and "large file" support was apparently added in 1.2.4.
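To check whether gzgets really stops early on a given system, one option is to count the lines it yields and compare that with the output of gunzip -c file.gz | wc -l. The following is a minimal sketch; the function name countGzLines is hypothetical and not part of the original post.

```php
<?php
// Hypothetical helper: counts the lines gzgets() can read from a gzip file.
function countGzLines(string $path, int $chunk = 4096): int
{
    $handle = gzopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $lines = 0;
    $pending = false; // true while we are inside a line with no newline seen yet
    while (!gzeof($handle)) {
        $buf = gzgets($handle, $chunk);
        if ($buf === false || $buf === '') {
            break;
        }
        if (substr($buf, -1) === "\n") {
            $lines++;
            $pending = false;
        } else {
            $pending = true; // long line split across chunks, or last line without newline
        }
    }
    gzclose($handle);
    return $lines + ($pending ? 1 : 0);
}
```

If the count is lower than what wc -l reports for the decompressed file, the zlib build that PHP is linked against is a likely suspect.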

How can I find out what rule produces an error?

I'm setting up PHP CodeSniffer as a linter for my code and I have an error which I want to ignore.
In order to do that, I should be able to put the line // phpcs:ignore Name.Of.The.Rule before the line that is an exception to that rule.
Unfortunately, I don't know how I can find which of the rules I have to ignore. Is there a way to display the rule producing the error?
For now, I have searched for the error message in my vendor folder, which returned 4 different entries. I'm not sure which of them is the one being triggered.
The error summary looks like this:
----------------------------------------------------------------------
FOUND 1 ERROR AFFECTING 1 LINE
----------------------------------------------------------------------
14 | ERROR | Method name "ActivityRules::is_after" is not in camel
| | caps format
----------------------------------------------------------------------
I would love to have something like this:
---------------------------------------------------------------------------------------------------
FOUND 1 ERROR AFFECTING 1 LINE
---------------------------------------------------------------------------------------------------
14 | ERROR | Method name "ActivityRules::is_after" is not in camel | SomeStandard.Category
| | caps format | .NameOfTheSniff.RuleCalled
---------------------------------------------------------------------------------------------------
EDIT:
I don't need you to tell me it's PSR1.Methods.CamelCapsMethodName.NotCamelCaps, I know how to find the rule by hand the hard way, by trial and error. I'd like to know if there is an easy way to do it.
Use phpcs -s
The output of phpcs --help includes the available options:
phpcs --help
...
-s Show sniff codes in all reports
...
The 'Show sniff codes in all reports' option results in this output format:
➜ /tmp phpcs -s example.php
FILE: /private/tmp/example.php
----------------------------------------------------------------------
FOUND 1 ERROR AFFECTING 1 LINE
----------------------------------------------------------------------
14 | ERROR | [ ] Method name "ActivityRules::is_after" is not in camel caps format
| | (PEAR.NamingConventions.ValidFunctionName.NotCamelCaps)
----------------------------------------------------------------------
Time: 30ms; Memory: 6MB
➜ /tmp
Which is hopefully close enough to what you're looking for here.
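Once the sniff code is known, it can be used in the ignore comment described in the question. A minimal sketch, assuming the code reported by phpcs -s above; the method body is made up for illustration:

```php
<?php
class ActivityRules
{
    // The code after "phpcs:ignore" is the one printed in parentheses by phpcs -s.
    // phpcs:ignore PEAR.NamingConventions.ValidFunctionName.NotCamelCaps
    public function is_after(int $a, int $b): bool
    {
        // Hypothetical body; only the comment above matters here.
        return $a > $b;
    }
}
```

If your standard reports a different code for the same message (for example a PSR1 one), use whatever phpcs -s prints for your run.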

How to catch only the PHP extensions that are necessary for the app to work

I need to identify only the PHP extensions that the app needs in order to work. The idea is to remove all the PHP extensions that are not necessary for the app. Does anyone have an idea how I can do that?
The app is on PHP 8.0.14 - Laravel 8
I am not sure what the question is.
Try this code:
$full = explode("-", "PHP 8.0.14 - Laravel 8"); // first argument is the separator (here the "-" sign), second argument is the string to split
$extension = end($full); // gives everything after the last "-" sign
$noextension = reset($full); // gives everything before the first "-" sign
I think you need one of those.
Take a look at this project: https://github.com/maglnet/ComposerRequireChecker
It does what you are asking, based on both the dependencies you required through Composer and the code you wrote.
You can use it by running composer-require-checker check <path/to/composer.json>
The output will be something like this:
$ composer-require-checker check composer.json
ComposerRequireChecker 4.1.x-dev#fb2a69aa2b7307541233536f179275e99451b339
The following 2 unknown symbols were found:
+----------------------------------------------------------------------------------+--------------------+
| Unknown Symbol | Guessed Dependency |
+----------------------------------------------------------------------------------+--------------------+
| hash | ext-hash |
| Psr\Http\Server\RequestHandlerInterface | |
+----------------------------------------------------------------------------------+--------------------+
Looking at this output, you can tell that I must add ext-hash and psr/http-server-handler to the require section of my composer.json.
Note that although ext-hash has been shipped with standard PHP distributions for a while, it is good practice to declare it anyway, in case your software is executed on a non-standard/custom distribution.
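Once the ext-* requirements are declared, they can be compared with what the runtime actually loads. The sketch below is only an illustration: compareExtensions is a hypothetical helper, it looks at the root composer.json only, and extensions it reports as undeclared may still be needed by packages in vendor, so treat them as candidates to review rather than as safe to remove.

```php
<?php
// Hypothetical helper: compares ext-* entries in a decoded composer.json
// with a list of loaded extension names (e.g. from get_loaded_extensions()).
function compareExtensions(array $composer, array $loaded): array
{
    // Composer names extensions in lowercase with spaces replaced by hyphens.
    $loaded = array_map(
        fn(string $name): string => str_replace(' ', '-', strtolower($name)),
        $loaded
    );
    $declared = [];
    foreach (array_keys($composer['require'] ?? []) as $package) {
        if (strpos((string) $package, 'ext-') === 0) {
            $declared[] = substr((string) $package, 4);
        }
    }
    return [
        'missing'    => array_values(array_diff($declared, $loaded)), // declared but not loaded
        'undeclared' => array_values(array_diff($loaded, $declared)), // loaded but not declared
    ];
}

// Possible usage (paths are assumptions):
// $composer = json_decode(file_get_contents('composer.json'), true);
// print_r(compareExtensions($composer, get_loaded_extensions()));
```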

How to get the EXIF shutter count (imageNumber) using PHP?

I've been able to get EXIF data by using exif_read_data(). According to the EXIF documentation provided on PHP docs, there has to be an imageNumber tag (I understand it's not guaranteed), but I haven't been able to see anything like that on my test image (Unedited JPG from a Nikon D5100). The same image seems to carry information about the shutter count as per online shutter count websites.
I'd really appreciate it if you could shed some light on what I might be doing wrong. Or is there some other place or method used to store the shutter count in the image metadata?
EDIT:
Here's the code I tried. I'm trying to get imageNumber, which is apparently not available, yet online tools show the shutter count for the same image. I'd like to get the same result using PHP (or even another language). Any help is appreciated.
$exif_data = exif_read_data($_FILES['fileToUpload']['tmp_name']);
print_r($exif_data);
Judging by your example file, the shutter count is stored in Nikon's MakerNote, in a structure specific to the D5100 model. Using ExifTool in verbose mode shows the structure:
> exiftool -v DSC_8725.JPG
...
JPEG APP1 (65532 bytes):
ExifByteOrder = MM
+ [IFD0 directory with 11 entries]
| 0) Make = NIKON CORPORATION
| 1) Model = NIKON D5100
...
| 9) ExifOffset (SubDirectory) -->
| + [ExifIFD directory with 41 entries]
...
| | 16) MakerNoteNikon (SubDirectory) -->
| | + [MakerNotes directory with 55 entries]
...
| | | 38) ShotInfoD5100 (SubDirectory) -->
| | | + [BinaryData directory, 8902 bytes]
...
| | | | ShutterCount = 41520
JPEG explained, see segment APP1:
Exif explained, see tag 0x927c:
Nikon's MakerNote explained, see tag 0x0091:
ShotInfoD5100 explained, see index 801
MakerNotes are proprietary: how data is stored there is up to each manufacturer. Documentation from the manufacturers is rare, and most of what is known was reverse engineered by hobbyists. That is why only selected software can read MakerNotes at all, and only for selected models. Dozens of manufacturers with dozens of models exist, and the bytes would have to be interpreted differently for each of them, which is a lot of work. The exif_read_data() changelog entry for PHP 7.2.0 nowhere claims to support Nikon.
You have to either parse the MakerNote yourself or find a PHP library or other software that already does it for you. As a last resort you could execute non-PHP software (such as ExifTool) to get what you want.
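The last-resort route could look roughly like this. It is a sketch under assumptions: exiftool must be installed and on the PATH, shell_exec must be enabled, and the function names are hypothetical. It relies on ExifTool's -j option (JSON output) and on requesting the ShutterCount tag by name.

```php
<?php
// Hypothetical helper: extracts ShutterCount from ExifTool's JSON output,
// which is an array with one object per file.
function parseShutterCount(string $json): ?int
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data[0]['ShutterCount'])) {
        return null; // tag missing, or output was not valid JSON
    }
    return (int) $data[0]['ShutterCount'];
}

// Hypothetical wrapper: runs exiftool on one file and parses the result.
function readShutterCount(string $file): ?int
{
    $output = shell_exec('exiftool -j -ShutterCount ' . escapeshellarg($file));
    return is_string($output) ? parseShutterCount($output) : null;
}
```

For an upload, readShutterCount($_FILES['fileToUpload']['tmp_name']) would be the call to try; it returns null for cameras whose MakerNote ExifTool cannot decode.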

Read from live data feed php

I am using something called DAP (https://github.com/rapid7/dap) which helps deal with large file handling and outputs an ever growing list of data.
For example:
curl -s https://scans.io/data/rapid7/sonar.http/20141209-http.gz | zcat | head -n 10 | dap json + select vhost + lines
This command works correctly and outputs 10 lines of IP addresses.
My question is how can I read this data from PHP - in effect where a data feed is continuous/live (it will end at some point) how can I process each line I'm given?
I've tried piping to it but I don't get passed the output. I don't want to use exec because the data is constantly growing. I think it could be a stream but not sure that is the case.
For anyone else who finds themselves in the same situation, here is the answer that works for me (it can also be run directly from the command line):
curl -s 'https://scans.io/data/rapid7/sonar.http/20141209-http.gz' | zcat | head -n 1000 | dap json + select vhost + lines | while read -r line ; do php /your_script/path/file.php "$line" ; done
Then read $argv[1] in the script and the data is all yours. Quoting "$line" keeps a line that contains spaces in a single argument.
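An alternative to starting one PHP process per line is to pipe the whole feed into a single script and read it from standard input as it arrives. This is a sketch of that different approach, not the method used above; processLines is a hypothetical name.

```php
<?php
// Hypothetical helper: reads a stream line by line until EOF and passes
// each non-empty line to $handler. Returns the number of lines handled.
function processLines($stream, callable $handler): int
{
    $count = 0;
    while (($line = fgets($stream)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === '') {
            continue; // skip blank lines
        }
        $handler($line);
        $count++;
    }
    return $count;
}

// Possible usage inside file.php, fed by: ... | dap json + select vhost + lines | php file.php
// processLines(STDIN, function (string $vhost): void {
//     echo "got: $vhost\n";
// });
```

Because fgets blocks until the next line is available, the script handles lines as the feed produces them and ends when the pipe closes.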

extending phpcodesniffer to filter report based on error codes

I am trying to extend PHPCodeSniffer. What I am trying to achieve is to filter the report using error codes.
To explain, let's say I have an error message like "error code : 603 , function is not compatible".
When I run PHPCS from the command line, I should be able to pass an "error code" argument so that the report is filtered by it (only show results for error code 603, say).
e.g.
$ phpcs --standard=mystanderd /path/to/code/myfile.php --errorcode=603
and output will be
FILE: /path/to/code/myfile.php
--------------------------------------------------------------------------------
FOUND 4 ERROR(S) AFFECTING 4 LINE(S)
--------------------------------------------------------------------------------
2 | ERROR | 603 | function is not compatible
20 | ERROR | 603 | function is not compatible
51 | ERROR | 603 | function is not compatible
88 | ERROR | 603 | function is not compatible
--------------------------------------------------------------------------------
What is the best way to achieve this? As far as I understand, we can only filter by severity, since that has built-in support.
I would like to avoid modifying the core of PHPCodeSniffer. What I am thinking of doing is writing a wrapper script that accepts the argument from the CLI, executes PHPCS, captures the output and manipulates it before printing it to the console. However, I don't think that is the best solution.
A bash script utilising grep and wc comes to mind.
You could also use a PHP script like this (let's say this is called my_wrapper.php):
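<?php
// Error codes that should remain in the report.
$legal_codes = array(
    '603' => true
);
$f = fopen('php://stdin', 'r');
// Compare against false so that a falsy line such as "0" does not end the loop early.
while (($line = fgets($f)) !== false) {
    // Match report rows of the form "line | TYPE | code | message".
    if (preg_match("/^\s*(\d+)\s*\|\s*([A-Z]+)\s*\|\s*(\d+)\s*\|\s*(.*)/", $line, $match)) {
        $code = trim($match[3]);
        if (!isset($legal_codes[$code])) {
            continue; // drop rows whose code is not in the list
        }
    }
    echo $line; // headers, separators and matching rows pass through unchanged
}
?>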
Which when called like this:
php my_wrapper.php < cs_out.txt
With cs_out.txt like this:
FILE: /path/to/code/myfile.php
--------------------------------------------------------------------------------
FOUND 4 ERROR(S) AFFECTING 4 LINE(S)
--------------------------------------------------------------------------------
2 | ERROR | 601 | function is not compatible
20 | ERROR | 602 | function is not compatible
51 | ERROR | 603 | function is not compatible
88 | ERROR | 604 | function is not compatible
--------------------------------------------------------------------------------
Produces output like this:
FILE: /path/to/code/myfile.php
--------------------------------------------------------------------------------
FOUND 4 ERROR(S) AFFECTING 4 LINE(S)
--------------------------------------------------------------------------------
51 | ERROR | 603 | function is not compatible
--------------------------------------------------------------------------------
Making the keys of the $legal_codes array specifiable via command line parameter to my_wrapper.php is left as an exercise for the reader.
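One way to approach that exercise is sketched below. The function name filterReport and the calling convention (codes given as plain arguments) are assumptions, not part of the answer above.

```php
<?php
// Hypothetical helper: keeps every line that is not a report row, plus the
// report rows whose error code is in $codes.
function filterReport(array $lines, array $codes): array
{
    $allowed = array_flip($codes);
    $kept = [];
    foreach ($lines as $line) {
        $isRow = preg_match('/^\s*\d+\s*\|\s*[A-Z]+\s*\|\s*(\d+)\s*\|/', $line, $m);
        if ($isRow && !isset($allowed[$m[1]])) {
            continue; // a report row with a code that was not requested
        }
        $kept[] = $line;
    }
    return $kept;
}

// Possible usage in my_wrapper.php: php my_wrapper.php 603 604 < cs_out.txt
// $codes = array_slice($argv, 1);
// echo implode('', filterReport(file('php://stdin'), $codes));
```

As with the original wrapper, the "FOUND n ERROR(S)" header is passed through untouched, so the count it shows still refers to the unfiltered report.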
