Cut string and gunzip using PHP

I have a router config export file, which contains a header of 20 bytes followed by zlib-compressed data. Once uncompressed, it should contain plain XML content.
My code strips the first 20 bytes and decompresses the file, but the output is still binary. I first used file_get_contents and file_put_contents, but assumed (wrongly) that they weren't binary-safe. I've tried changing the 20 to anything from 1 to 1000, to no avail.
<?php
$fp_orig = fopen('config.cfg', "rb");
$data_orig = fread($fp_orig, filesize('config.cfg'));
fclose($fp_orig);
$bytes = 20; // tried 1-1000
$data_gz = substr($data_orig,$bytes);
$fp_gz = fopen('config.cfg.gz', 'wb'); // 'wb', not 'w', to keep the write binary-safe
fwrite($fp_gz, $data_gz);
fclose($fp_gz);
$fp_gz = gzopen('config.cfg.gz', 'rb');
$fp_xml = fopen('config.cfg.xml', 'wb');
while(!gzeof($fp_gz))
{
fwrite($fp_xml, gzread($fp_gz, 4096));
}
fclose($fp_xml);
gzclose($fp_gz);
echo file_get_contents('config.cfg.xml'); // gives binary data
?>
I'm not particularly looking for turnkey working code, but rather a push into the right direction.
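For what it's worth, one direction to try (a sketch, assuming the payload really is zlib/deflate data): decompress in memory with PHP's zlib functions instead of round-tripping through a .gz file. gzopen()/gzread() expect a gzip wrapper, which zlib-compressed data does not have, so writing the stripped bytes to a .gz file and reading them back will fail. The decode_config() helper below is hypothetical; the 20-byte offset is taken from the question.

```php
<?php
// Sketch: decompress the export in memory instead of writing a .gz file first.
// The 20-byte header offset comes from the question; adjust as needed.
function decode_config(string $raw, int $headerBytes = 20)
{
    $payload = substr($raw, $headerBytes);

    // zlib_decode() (PHP >= 5.4) auto-detects zlib, gzip and raw-deflate framing.
    $xml = @zlib_decode($payload);
    if ($xml === false) {
        // Fall back to trying each framing explicitly.
        $xml = @gzuncompress($payload)   // zlib wrapper (RFC 1950)
            ?: @gzdecode($payload)       // gzip wrapper (RFC 1952)
            ?: @gzinflate($payload);     // raw deflate (RFC 1951)
    }
    return $xml;
}

// file_get_contents() IS binary-safe, so reading the whole file is fine:
// echo decode_config(file_get_contents('config.cfg'));
```

If all three calls return false, the header length guess is probably wrong, or the payload uses some proprietary framing.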

Related

PHP File Handling (Download Counter): reading file data as a number, writing it back as that plus 1

I'm trying to make a download counter in a website for a video game in PHP, but for some reason, instead of incrementing the contents of the downloadcount.txt file by 1, it takes the number, increments it, and appends it to the end of the file. How could I just make it replace the file contents instead of appending it?
Here's the source:
<?php
ob_start();
$newURL = 'versions/v1.0.0aplha/Dungeon1UP.zip';
//header('Location: '.$newURL);
//increment download counter
$file = fopen("downloadcount.txt", "w+") or die("Unable to open file!");
$content = fread($file,filesize("downloadcount.txt"));
echo $content;
$output = (int) $content + 1;
//$output = 'test';
fwrite($file, $output);
fclose($file);
ob_end_flush();
?>
The number in the file is supposed to increase by one every time, but instead, it gives me numbers like this: 101110121011101310111012101110149.2233720368548E+189.2233720368548E+189.2233720368548E+18
As correctly pointed out in one of the comments, for your specific case you can use fseek($file, 0) right before writing, such as:
fseek ( $file, 0 );
fwrite($file, $output);
Or, even simpler, you can rewind($file) before writing; this ensures that the next write happens at byte 0, i.e. the start of the file.
The reason the file gets appended to is that after fread() the file pointer sits at the end of what was just read, so fwrite() continues from there. Note also that "w+" truncates the file when it is opened, so it is not the mode you want when the contents must survive. Open the file in read/write mode without truncation, i.e. "r+" on your fopen, such as:
fopen("downloadcount.txt", "r+")
Just make sure the file exists before writing!
Please see fopen modes here:
https://www.php.net/manual/en/function.fopen.php
And working code here:
https://bpaste.net/show/iasj
It will be much simpler to use file_get_contents/file_put_contents:
// update with more precise path to file:
$content = file_get_contents(__DIR__ . "/downloadcount.txt");
echo $content;
$output = (int) $content + 1;
// by default `file_put_contents` overwrites file content
file_put_contents(__DIR__ . "/downloadcount.txt", $output);
That appending is really a file-pointer problem rather than a typecasting one, but either way I would not encourage you to handle counts the file way. To count downloads for a file, it is better to update a database row inside a transaction, which handles concurrency properly; doing it the file way can compromise accuracy.
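If you do stay with a file, flock() at least prevents two concurrent requests from clobbering each other's writes. A minimal sketch (the incrementCounter() helper is hypothetical; a database is still the more robust choice):

```php
<?php
// Sketch: a file-based counter guarded with flock() so concurrent
// requests don't lose updates. Returns the new count.
function incrementCounter(string $path): int
{
    $fp = fopen($path, 'c+');          // 'c+': read/write, create if missing, do NOT truncate
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }
    flock($fp, LOCK_EX);               // exclusive lock: one writer at a time
    $count = (int) stream_get_contents($fp) + 1;
    rewind($fp);                       // write from byte 0 ...
    ftruncate($fp, 0);                 // ... and drop any leftover old digits
    fwrite($fp, (string) $count);
    flock($fp, LOCK_UN);
    fclose($fp);
    return $count;
}

// echo incrementCounter('downloadcount.txt');
```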
You can get the content and check whether the file has data. If not, initialise it to 0, and then just replace the content.
$fileContent = file_get_contents("downloadcount.txt");
$content = (!empty($fileContent) ? $fileContent : 0);
$content++;
file_put_contents('downloadcount.txt', $content);
Check $fileContent, or look directly at the content inside the file.

How to delete first 11 lines in a file using PHP?

I have a CSV file in which I want the first 11 lines to be removed. The file looks something like:
"MacroTrends Data Download"
"GOOGL - Historical Price and Volume Data"
"Historical prices are adjusted for both splits and dividends"
"Disclaimer and Terms of Use: Historical stock data is provided 'as is' and solely for informational purposes, not for trading purposes or advice."
"MacroTrends LLC expressly disclaims the accuracy, adequacy, or completeness of any data and shall not be liable for any errors, omissions or other defects in, "
"delays or interruptions in such data, or for any actions taken in reliance thereon. Neither MacroTrends LLC nor any of our information providers will be liable"
"for any damages relating to your use of the data provided."
date,open,high,low,close,volume
2004-08-19,50.1598,52.1911,48.1286,50.3228,44659000
2004-08-20,50.6614,54.7089,50.4056,54.3227,22834300
2004-08-23,55.5515,56.9157,54.6938,54.8694,18256100
2004-08-24,55.7922,55.9728,51.9454,52.5974,15247300
2004-08-25,52.5422,54.1672,52.1008,53.1641,9188600
I want only the stocks data and not anything else. So I wish to remove the first 11 lines. Also, there will be several text files for different tickers. So str_replace doesn't seem to be a viable option. The function I've been using to get CSV file and putting the required contents to a text file is
function getCSVFile($url, $outputFile)
{
$content = file_get_contents($url);
$content = str_replace("date,open,high,low,close,volume", "", $content);
$content = trim($content);
file_put_contents($outputFile, $content);
}
I want a general solution which can remove the first 11 lines from the CSV file and put the remaining contents to a text file. How do I do this?
None of the other examples here will work for large/huge files. People don't care about memory nowadays, but you, as a great programmer, want your code to be efficient, with a low memory footprint.
Instead, parse the file line by line:
function saveStrippedCsvFile($inputFile, $outputFile, $lineCountToRemove)
{
$inputHandle = fopen($inputFile, 'r');
$outputHandle = fopen($outputFile, 'w');
// make sure you handle errors as well
// files may be unreadable, unwritable etc…
$counter = 0;
while (($line = fgets($inputHandle)) !== false) {
if ($counter < $lineCountToRemove) {
++$counter;
continue;
}
// fgets() keeps the trailing newline, so don't append PHP_EOL again
fwrite($outputHandle, $line);
}
fclose($inputHandle);
fclose($outputHandle);
}
I have a CSV file in which I want the first 11 lines to be removed.
I always prefer to use explode to do that.
$string = file_get_contents($file);
$lines = explode("\n", $string); // double quotes: in single quotes, '\n' is a literal backslash-n
for($i = 0; $i < 11; $i++) { //First key = 0 - 0,1,2,3,4,5,6,7,8,9,10 = 11 lines
unset($lines[$i]);
}
This will remove it and with implode you can create a new 'file' out of it
$new = implode("\n", $lines);
$new will contain the new file
Didn't test it, but I'm pretty sure this will work.
Be careful! I will quote @emix's comment:
This will fail spectacularly if the file content exceeds available PHP memory.
Be sure the file isn't too 'huge'.
Use file() to read it as an array and simply trim the first 11 lines:
$content = file($url, FILE_IGNORE_NEW_LINES);
$newContent = array_slice($content, 11); // drop elements 0-10, i.e. the first 11 lines
file_put_contents($outputFile, implode(PHP_EOL, $newContent));
But answer these questions:
Why is there additional content in this CSV?
How will you know how many lines to cut off? What if it's more than 11 lines?
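Another memory-friendly option worth mentioning: SplFileObject iterates a file lazily, one line at a time, so skipping a header never loads the whole file. A sketch (the stripHeaderLines() helper name and paths are placeholders):

```php
<?php
// Sketch: drop the first $skip lines of a file without loading it all
// into memory, using SplFileObject's lazy line iteration.
function stripHeaderLines(string $inputFile, string $outputFile, int $skip = 11): void
{
    $out = fopen($outputFile, 'w');
    $in  = new SplFileObject($inputFile, 'r');
    foreach ($in as $i => $line) {     // one line in memory at a time
        if (!is_string($line)) {
            break;                     // guard against a trailing non-string at EOF
        }
        if ($i < $skip) {
            continue;                  // indexes 0-10 are the 11 header lines
        }
        fwrite($out, $line);           // $line keeps its trailing newline
    }
    fclose($out);
}
```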

PHP script to convert csv to XML not working for large file (around 1GB)

We have written the following PHP script to convert a CSV file to an XML file, but it gets stuck in the while loop and never reaches saveXML.
The CSV file is around 1 GB and contains roughly 100,000 rows.
Because of the large number of rows, it does not work.
My question is: how can we modify the following code so that it works for a large file?
<?php
$delimit = "," ;
$row_count = 0 ;
$inputFilename = "feed.csv" ;
$outputFilename = 'output.xml';
$inputFile = fopen($inputFilename, 'rt');
$headers = fgetcsv($inputFile);
$doc = new DomDocument();
$doc->formatOutput = true;
$root = $doc->createElement('rows');
$root = $doc->appendChild($root);
while (($row = fgetcsv($inputFile)) !== FALSE)
{
$container = $doc->createElement('row');
foreach ($headers as $i => $header)
{
$arr = explode($delimit, $header);
foreach ($arr as $j => $ar)
{
$child = $doc->createElement(preg_replace("/[^A-Za-z0-9]/","",$ar));
$child = $container->appendChild($child);
$whole = explode($delimit, $row[$i]);
$value = $doc->createTextNode(ltrim( rtrim($whole[$j], '"') ,'"'));
$value = $child->appendChild($value);
}
}
$root->appendChild($container);
echo "." ;
}
echo "Saving the XML now" ;
$result = $doc->saveXML();
echo "Writing to XML file now" ;
$handle = fopen($outputFilename, "w");
fwrite($handle, $result);
fclose($handle);
return $outputFilename;
?>
Edited:
In php.ini, memory_limit and the execution time are set to unlimited/maximum. I am executing from the command line.
As you noticed, you run into resource problems with such big input/output.
The input handling you use, fgetcsv(), is already quite efficient, as it reads one line at a time.
The output is the problem in this case. You store the whole 1 GB of raw text in a DOMDocument object, which adds considerable overhead to the needed memory.
But according to your code, you only write the XML back to a file, so you don't really need a DOMDocument at runtime.
The simplest solution would be to build the XML as a string and write it to the output file for each line of the CSV:
open the handle for the output file in append mode with fopen($outputFilename, "a");
write the XML header before the loop;
fwrite() each CSV-row-turned-XML element inside the loop;
write the XML footer after the loop.
It's most probably the (mis)usage of DOMDocument that causes your memory issues (as already answered by @cypherabe).
But instead of the proposed string-concatenation solution, I would urge you to take a look at XMLWriter: http://php.net/manual/en/book.xmlwriter.php
The XmlWriter extension represents a writer that provides a non-cached, forward-only means of generating streams or files containing XML data.
This extension can be used in an object oriented style or a procedural one.
It has been bundled with PHP since version 5.2.1.
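A minimal sketch of the XMLWriter approach, wrapped in a hypothetical csvToXml() helper. It keeps the question's element naming (header names sanitized with the same regex) but streams each row straight to disk, so memory use stays flat regardless of file size:

```php
<?php
// Sketch: stream CSV to XML with XMLWriter; nothing is kept in a DOM.
function csvToXml(string $csvPath, string $xmlPath): void
{
    $in = fopen($csvPath, 'r');
    $headers = fgetcsv($in);
    foreach ($headers as &$h) {
        $h = preg_replace('/[^A-Za-z0-9]/', '', $h); // same sanitizing as the question
    }
    unset($h);

    $writer = new XMLWriter();
    $writer->openUri($xmlPath);        // writes straight to the file
    $writer->setIndent(true);
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('rows');

    while (($row = fgetcsv($in)) !== false) {
        $writer->startElement('row');
        foreach ($headers as $i => $name) {
            $writer->writeElement($name, $row[$i] ?? '');
        }
        $writer->endElement();         // </row>
        $writer->flush();              // push buffered bytes to disk each row
    }

    $writer->endElement();             // </rows>
    $writer->endDocument();
    $writer->flush();
    fclose($in);
}
```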
http://www.prestatraining.com/12-tips-to-optimise-your-php-ini-file-for-prestashop/
Look at the section Memory and Size Limits (ignore the fact it's about prestashop)
It sounds like your PHP settings on the server are timing out on execution. If you are trying to process a file that is 1GB I wouldn't be surprised if it fails if you have standard PHP.ini settings.

PHP SSH read file in chunks by pattern

I have a PHP script that takes a user-supplied string, then SSHs out to a remote server, reads the file into an array, then parses out the request/response blocks containing the string to return to the user.
This implementation does not work with large log files, because PHP runs out of memory trying to store the whole file in an array.
Example data:
*** REQUEST
request line 1
request line 2
request line 3
[...]
*** RESPONSE
response line 2
response line 2
response line 3
[...]
[blank line]
The length of the requests and responses vary, so I can never be sure how many lines there will be.
How can I read a file in chunks without storing the whole file in memory, while still ensuring I'll always be able to process a full request/response block of data from the log without truncating it?
I feel like I'm just being exceptionally dense about this, since my experience is usually working with whole files or arrays.
Here's my current code (with $search representing the user-supplied string we're looking for in the log), which is putting the whole file into an array first:
$stream = ssh2_exec($ssh, $command);
stream_set_blocking($stream, true);
$data = '';
while($buffer = fread($stream, 4096)) {
$data .= $buffer;
}
fclose($stream);
$rawlog = $data;
$logline = explode("\n",$rawlog);
reset($logline);
$results = [];
$block='';
foreach ( $logline as $k => $v ) {
if ( preg_match("/^\*\*\* REQUEST/",$v) && $block != '') { // note the closing "/" delimiter
if ( preg_match("/$search/i",$block) ) {
$results[] = $block;
}
$block=$v . "\n";
} else {
$block .= $v . "\n";
}
}
if ( preg_match("/$search/i",$block) ) {
$results[] = $block;
}
Any suggestions?
Hard to say if this would work for you, but if the logs are in files you could use phpseclib's SFTP implementation (latest Git version).
e.g.
If you do $sftp->get('filename.ext', false, 0, 1000) it'll download bytes 0-1000 from filename.ext and return a string with those bytes. If you do $sftp->get('filename.ext', false, 1000, 1000) it'll download bytes 1000-2000.
You can use commands like tail and head to fetch lines in ranges: lines 1-100, then 101-200, and so on.
This requires more SSH commands, but does not require you to store the whole result in memory.
Or you can first store all the output in a local file, and parse it afterwards.
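The line-at-a-time idea can also be applied directly to the ssh2 stream, so only the current request/response block is ever in memory. A sketch (the collectBlocks() helper is hypothetical; it uses stripos() instead of preg_match() so the user-supplied $search can't inject regex syntax):

```php
<?php
// Sketch: scan a stream line by line, keeping only the current
// REQUEST/RESPONSE block in memory. Works on an ssh2_exec() stream
// exactly like on a local file handle.
function collectBlocks($stream, string $search): array
{
    $results = [];
    $block   = '';
    while (($line = fgets($stream)) !== false) {
        if (strpos($line, '*** REQUEST') === 0 && $block !== '') {
            // A new block starts: decide whether to keep the finished one.
            if (stripos($block, $search) !== false) {
                $results[] = $block;
            }
            $block = '';
        }
        $block .= $line;
    }
    if ($block !== '' && stripos($block, $search) !== false) {
        $results[] = $block;           // don't forget the final block
    }
    return $results;
}

// $stream = ssh2_exec($ssh, $command);
// stream_set_blocking($stream, true);
// $matches = collectBlocks($stream, $search);
```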

Split big files using PHP

I want to split huge files (specifically, tar.gz files) into multiple parts from PHP code. The main reason is PHP's 2 GB limit on 32-bit systems.
So I want to split big files into multiple parts and process each part separately.
Is this possible? If yes, how?
My comment was voted up twice, so maybe my guess was onto something :P
If on a unix environment, try this...
exec('split -d -b 2048m file.tar.gz pieces');
split
Your pieces should be pieces00, pieces01, etc. (the -d flag gives numeric suffixes).
You could get the number of resulting pieces by using stat() in PHP to get the file size and then doing the simple math (int) ceil($stat['size'] / (2048 * 1024 * 1024)) (note the parentheses: written as / 2048*1024*1024 it would divide by 2048 and then multiply).
A simple method (if using Linux based server) is to use the exec command and to run the split command:
exec('split Large.tar.gz -b 4096k SmallParts'); // 4MB parts
/*
split        : the command
Large.tar.gz : the source file
-b 4096k     : the split size
SmallParts   : the output filename prefix
*/
See here for more details: http://www.computerhope.com/unix/usplit.htm
Or you can use: http://www.computerhope.com/unix/ucsplit.htm
exec('csplit -k -s -f part_ -n 3 LargeFile.tar.gz');
PHP runs in a single thread, and the only way to get more is to fork child processes (e.g. with pcntl_fork()).
This is not resource-friendly. What I would suggest is to look into a language that can do this quickly and effectively. I would suggest node.js.
Just install node on the server, then create a small script, called node_split for instance, that can do the job on its own for you.
But I strongly advise that you do not use PHP for this job; use exec() to let the host operating system do it.
HJSPLIT
http://www.hjsplit.org/php/
PHP itself might not be able to...
If you can figure out how to do this from your computer's command line,
you should be able to execute those commands using exec().
function split_file($source, $targetpath='/split/', $lines=1000){
$i=0;
$j=1;
$date = date("m-d-y");
$buffer='';
$handle = fopen($_SERVER['DOCUMENT_ROOT'].$source, "r");
while (!feof($handle)) {
$buffer .= fgets($handle, 4096);
$i++;
if ($i >= $lines) {
$fname = $_SERVER['DOCUMENT_ROOT'].$targetpath."part_".$date.$j.".txt";
$fhandle = fopen($fname, "w");
if (!$fhandle) {
echo "Cannot open file ($fname)";
exit;
}
if (!fwrite($fhandle, $buffer)) {
echo "Cannot write to file ($fname)";
exit;
}
fclose($fhandle);
$j++;
$buffer='';
$i=0;
}
}
// write whatever is left over (fewer than $lines lines)
if ($buffer !== '') {
file_put_contents($_SERVER['DOCUMENT_ROOT'].$targetpath."part_".$date.$j.".txt", $buffer);
}
fclose($handle);
}
$handle = fopen('source/file/path','r');
$f = 1; //new file number
while(!feof($handle))
{
$newfile = fopen('newfile/path/'.$f.'.txt','w'); //create new file to write to with file number
for($i = 1; $i <= 5000; $i++) //for 5000 lines
{
$import = fgets($handle);
if($import === false)
{break;} //If file ends, break loop before writing `false`
fwrite($newfile,$import);
}
fclose($newfile);
$f++; //Increment newfile number
}
fclose($handle);
If you want to split files which are already on the server, you can do it (simply use the file functions fopen, fread, fwrite and fseek to read/write part of the file). If you want to split files which are uploaded from the client, I am afraid you cannot.
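The fopen/fseek/fread approach mentioned above can be sketched like this: copy one byte range at a time, so the whole archive never sits in memory. The writePart() helper and the 100 MB part size are arbitrary choices for illustration:

```php
<?php
// Sketch of the fseek/fread approach: copy one part of a big file
// without ever holding the whole file in memory.
function writePart(string $source, string $dest, int $offset, int $length, int $chunk = 8192): void
{
    $in  = fopen($source, 'rb');
    $out = fopen($dest, 'wb');
    fseek($in, $offset);                       // jump straight to this part
    while ($length > 0 && !feof($in)) {
        $data = fread($in, min($chunk, $length));
        if ($data === '' || $data === false) {
            break;
        }
        fwrite($out, $data);
        $length -= strlen($data);
    }
    fclose($in);
    fclose($out);
}

// e.g. split into 100 MB parts:
// for ($i = 0; $i * 104857600 < filesize('big.tar.gz'); $i++) {
//     writePart('big.tar.gz', "big.part$i", $i * 104857600, 104857600);
// }
```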
This would probably be possible in PHP, but PHP was built for web development, and trying to do this whole operation in one request will result in the request timing out.
You could, however, use another language like Java or C# and build a background process that you notify from PHP to perform the operation, or even run it from PHP, depending on the security settings on the host.
Splits are named filename.part0, filename.part1, and so on:
<?php
function fsplit($file,$buffer=1024){
//open file to read
$file_handle = fopen($file,'r');
//get file size
$file_size = filesize($file);
//no of parts to split (round up so the final, smaller chunk is not lost)
$parts = ceil($file_size / $buffer);
//store all the file names
$file_parts = array();
//path to write the final files
$store_path = "splits/";
//name of input file
$file_name = basename($file);
for($i=0;$i<$parts;$i++){
//read buffer sized amount from file
$file_part = fread($file_handle, $buffer);
//the filename of the part
$file_part_path = $store_path.$file_name.".part$i";
//open the new file [create it] to write
$file_new = fopen($file_part_path,'w+');
//write the part of file
fwrite($file_new, $file_part);
//add the name of the file to part list [optional]
array_push($file_parts, $file_part_path);
//close the part file handle
fclose($file_new);
}
//close the main file handle
fclose($file_handle);
return $file_parts;
}
?>
