How do I convert my sed insert text script in PHP?

I have a working sed script that inserts text into a document at a given line number.
sed -i '35i \ NewPage,' file
I'm wondering if there's a way I can achieve the same result using PHP.
35i is the line number at which to insert
\ puts the inserted text on a new line
NewPage, is the text being inserted
file is the file location
Any suggestions?
Best regards
AT.

You can, but it can't be a one-liner like sed.
Sample Input
akshay@db-3325:/tmp$ seq 1 5 > test.txt
akshay@db-3325:/tmp$ cat test.txt
1
2
3
4
5
Output with sed at line number 4
akshay@db-3325:/tmp$ sed '4i \ NewPage,' test.txt
1
2
3
NewPage,
4
5
PHP Script
akshay@db-3325:/tmp$ cat test.php
<?php
$new_contents = " NewPage,";
$file = "test.txt";
$line = 4;
$contents = file($file);
array_splice($contents, $line-1, 0, $new_contents.PHP_EOL);
file_put_contents($file, implode("",$contents));
?>
Execution and Output
akshay@db-3325:/tmp$ php test.php
akshay@db-3325:/tmp$ cat test.txt
1
2
3
NewPage,
4
5
Or else you have to use exec. Be careful if you enable exec on your server; these functions are often disabled in php.ini (via disable_functions).
exec("sed -i '35i \ NewPage,' path/to/file 2>&1", $outputAndErrors, $return_value);
if (!$return_value) {
// Command executed successfully
}
Note: functions such as “exec” and “system” run external programs, including arbitrary shell commands. If they are enabled and user input reaches them unescaped, a user can execute any command on your server.

Related

How to grab input from user and modify file using echo or sed

I am using the below code to get input from user and modify my backup.php file:
#!/bin/bash
read -p "Enter hostname: " hostname
read -p "Enter cPanel username: " user
read -p "Enter password: " pass
echo "\$source_server_ip = \"$hostname\";" >> "backup.php"
echo "\$cpanel_account = \"$user\"; " >> "backup.php"
echo "\$cpanel_password = \"$pass\"; " >> "backup.php"
This works perfectly; however, I want to insert the user-provided details in backup.php at line numbers 4, 5 and 6, respectively.
backup.php contents:
Line no. 1: php
Line no. 2: include xmlapi.php
Line no. 3: (blank)
Line no. 4: $source_server_ip = " ";
Line no. 5: $cpanel_account = " ";
Line no. 6: $cpanel_password = " ";
Line no. 7: code continues...
I want to keep the rest of the code as it is and change lines 4, 5 and 6 only.
How can I do this? Do I need to use sed?

If I understand correctly, you need to replace/update your 4th, 5th and 6th line with some new lines. You can use sed's substitution command for this:
host_line="\$source_server_ip = \"$hostname\""
user_line="\$cpanel_account = \"$user\""
pass_line="\$cpanel_password = \"$pass\""
sed -i "4s/.*/$host_line/; 5s/.*/$user_line/; 6s/.*/$pass_line/" backup.php
#different notation:
sed -i -e "4s/.*/$host_line/" -e "5s/.*/$user_line/" -e "6s/.*/$pass_line/" backup.php
4s determines on which line the substitution command should take place (in this case 4th line)
.* matches the entire line and substitutes it with your user input
-i is in-place editing. It will edit your file instead of sending the result to stdout
Warning: using sed's s command is really straightforward but it can have some surprising and dangerous results if your variables contain unescaped special characters to sed, for example: &, /, newline, ...
Or make use of the sed's c command:
sed -i "4c \
$host_line
5c \
$user_line
6c \
$pass_line
" backup.php
#different notation:
sed -i -e "4c $host_line" -e "5c $user_line" -e "6c $pass_line" backup.php
Warning: you can manage to break this as well, with unescaped newline for example:
pass='mypass
2c oops'
You will either have to escape those special characters to sed or use what I consider the safest solution, awk:
awk -i inplace -v hl="$host_line" -v ul="$user_line" -v pl="$pass_line" '
NR==4 { $0=hl }
NR==5 { $0=ul }
NR==6 { $0=pl }
{ print }' backup.php
Do not forget to use -r option with read or it will treat backslashes specially.
Also, note that even backslashes in your variables will later get interpreted! You can add extra backslashes with parameter expansion if you want to prevent this:
host_line="\$source_server_ip = \"${hostname//\\/\\\\}\""
user_line="\$cpanel_account = \"${user//\\/\\\\}\""
pass_line="\$cpanel_password = \"${pass//\\/\\\\}\""
EDIT: If you just want to insert the 3 lines after the 3rd line, use this simple sed:
sed -i "4i $host_line\n$user_line\n$pass_line" backup.php
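The escaping that the warning about sed's s command calls for could look roughly like this. It is a minimal sketch and not part of the original answer: the helper names escape_sed_replacement and replace_line are hypothetical, embedded newlines are not handled, and a temporary file is used instead of -i so that the sketch does not depend on GNU sed.

```shell
#!/bin/sh
# Hypothetical helper: escape the characters that are special in the
# replacement part of sed's s command when "/" is the delimiter:
# backslash, slash and ampersand. Embedded newlines are NOT handled.
escape_sed_replacement() {
    printf '%s' "$1" | sed -e 's|[\\/&]|\\&|g'
}

# Hypothetical helper: replace line $2 of file $1 with the literal text $3.
replace_line() {
    escaped=$(escape_sed_replacement "$3")
    tmp=$(mktemp) || return 1
    # Write the edited copy first, then copy it back over the original
    # so the original keeps its permissions.
    if sed "${2}s/.*/${escaped}/" "$1" > "$tmp"; then
        cat "$tmp" > "$1"
    fi
    rm -f "$tmp"
}
```

With these in place, a call such as replace_line backup.php 6 "$pass_line" would write the password line literally even if it contains /, & or a backslash.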

AWK pipe output from TTY to PHP

I have a tty device (/dev/ttyUSB0), which occasionally outputs a string in the form of Cycle 1: 30662 ms, 117.41 W. I'm using a simple bash script to process it:
#!/bin/sh
stty -F /dev/ttyUSB0 57600
cd /home/pi
while true; do
cat /dev/ttyUSB0 | awk '{ print $0 > "/dev/stderr"; if (/^Cycle/) { print "update kWh.rrd N:" $5 } }' | php5 test.php
sleep 1
done
The test.php script looks like this:
<?php
stream_set_blocking(STDIN, 0);
$line = trim(fgets(STDIN));
$file = 'kwhoutput.txt';
$current = file_get_contents($file);
$current .= $line;
file_put_contents($file, $current);
?>
however, the kwhoutput.txt remains empty. Why is this not working?

awk is buffering your data. Use fflush() to flush the buffers after each output line:
awk '{
print $0 > "/dev/stderr";
if (/^Cycle/) {
print "update kWh.rrd N:" $5;
fflush();
}
}' < /dev/ttyUSB0 | php5 test.php
Also make sure that /dev/ttyUSB0 actually outputs a line (terminated by \n), and not just a string of data.
You should also fix up your php script to:
Read multiple lines and append them one by one (otherwise, the script will skip every other line).
Find out how to append to a file in php. Reading the whole file, concatenating a string in memory, then writing the whole file is not the way to go.
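The two points about the consumer (read every line, and append instead of rewriting the file) are illustrated below in shell rather than PHP, purely to show the shape of the loop. This is a sketch under assumptions: append_lines is a hypothetical name, and the real fix would make the same changes inside test.php.

```shell
#!/bin/sh
# Hypothetical helper: read standard input line by line and append each
# line to the file named in $1 as soon as it arrives. The file is opened
# in append mode, so its existing contents are never read or rewritten.
append_lines() {
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$1"
    done
}
```

It would sit at the end of the pipeline in place of the PHP script, for example: awk '...' < /dev/ttyUSB0 | append_lines kwhoutput.txt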

Can we use PHP in Bash Script?

I have a bash script abcd.sh
#!/bin/sh
for i in `seq 8`; do ssh w$i 'uptime;ps -elf|grep httpd|wc -l;free -m;mpstat'; done &
pid=$!
sleep 1
kill -9 $pid
I want to use PHP in my bash script.
eg: in bash script I want to set value of seq through PHP.

Mmm, if you are using bash, you should maybe use a bash shebang on line 1 so people know you are expecting bash features to be available. And if you are using bash, you can use a bash sequence anyway:
#!/bin/bash
for i in {1..8}; do echo $i; done
Update 1
If the number of servers is obtained through PHP, you can do something like this:
numservers=$(php -r 'echo 8;')
for i in $(seq $numservers); do echo $i; done
1
2
3
4
5
6
7
8
Update 2
Ok, you said the number of servers is dynamic, but then you say it is set in the script (which seems contradictory), but this is what you do:
numservers=10
for i in $(seq $numservers); do echo $i; done
1
2
3
4
5
6
7
8
9
10
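If the count really does come from PHP (or any other external command), it may be worth checking that the output is a plain number before handing it to seq. The following is a minimal sketch; count_from_command is a hypothetical helper, and echo stands in for the php call so the example runs without PHP installed.

```shell
#!/bin/sh
# Hypothetical helper: run the command given as arguments and print its
# output only if that output is a non-empty string of digits.
count_from_command() {
    n=$("$@") || return 1
    case $n in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf '%s\n' "$n"
}

# Example with a stand-in command; in practice this might be
#   count_from_command php -r 'echo 8;'
if numservers=$(count_from_command echo 3); then
    for i in $(seq "$numservers"); do echo "w$i"; done
fi
```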

Why don't you write the entire shell script in PHP?
#!/usr/bin/php
<?php
for ($i = 1; $i <= 8; $i++) { // hosts w1..w8, matching `seq 8` in the question
exec("ssh w$i 'uptime;ps -elf|grep httpd|wc -l;free -m;mpstat'");
}
?>
(code is untested, just an example)

It's not easy to understand what you want.
Maybe this helps. The script defines a variable PHP_VAR in bash and uses it in PHP. Then we call a PHP code snippet and put its output in the shell variable output. Finally we print the variable output (but you can do something else with it).
Attention: all the output from PHP ends up in the variable output.
#!/bin/bash
echo "I am a bash echo"
export PHP_VAR="I was in php"
# Here we start php and put the output in 'output'
output=$(php << EOF
<?php \$inner_var = getenv("PHP_VAR");
echo \$inner_var; ?>
EOF
)
# usage var 'output' with php output
echo "---$output---"

Get the number of pages in a PDF document

This question is for referencing and comparing. The solution is the accepted answer below.
Many hours have I searched for a fast and easy, but mostly accurate, way to get the number of pages in a PDF document. Since I work for a graphic printing and reproduction company that works a lot with PDFs, the number of pages in a document must be precisely known before they are processed. PDF documents come from many different clients, so they aren't generated with the same application and/or don't use the same compression method.
Here are some of the answers I found insufficient or simply NOT working:
Using Imagick (a PHP extension)
Imagick requires a lot of installation, apache needs to restart, and when I finally had it working, it took amazingly long to process (2-3 minutes per document) and it always returned 1 page in every document (haven't seen a working copy of Imagick so far), so I threw it away. That was with both the getNumberImages() and identifyImage() methods.
Using FPDI (a PHP library)
FPDI is easy to use and install (just extract files and call a PHP script), BUT many of the compression techniques are not supported by FPDI. It then returns an error:
FPDF error: This document (test_1.pdf) probably uses a compression technique which is not supported by the free parser shipped with FPDI.
Opening a stream and search with a regular expression:
This opens the PDF file in a stream and searches for some kind of string, containing the pagecount or something similar.
$f = "test1.pdf";
$stream = fopen($f, "r");
$content = fread ($stream, filesize($f));
if(!$stream || !$content)
return 0;
$count = 0;
// Regular Expressions found by Googling (all linked to SO answers):
$regex = "/\/Count\s+(\d+)/";
$regex2 = "/\/Page\W*(\d+)/";
$regex3 = "/\/N\s+(\d+)/";
if(preg_match_all($regex, $content, $matches))
$count = max($matches);
return $count;
/\/Count\s+(\d+)/ (looks for /Count <number>) doesn't work because only a few documents have the parameter /Count inside, so most of the time it doesn't return anything. Source.
/\/Page\W*(\d+)/ (looks for /Page<number>) doesn't get the number of pages, mostly contains some other data. Source.
/\/N\s+(\d+)/ (looks for /N <number>) doesn't work either, as the documents can contain multiple values of /N ; most, if not all, not containing the pagecount. Source.
So, what does work reliably and accurately?
See the answer below

A simple command line executable called: pdfinfo.
It is downloadable for Linux and Windows. You download a compressed file containing several little PDF-related programs. Extract it somewhere.
One of those files is pdfinfo (or pdfinfo.exe for Windows). An example of data returned by running it on a PDF document:
Title: test1.pdf
Author: John Smith
Creator: PScript5.dll Version 5.2.2
Producer: Acrobat Distiller 9.2.0 (Windows)
CreationDate: 01/09/13 19:46:57
ModDate: 01/09/13 19:46:57
Tagged: yes
Form: none
Pages: 13 <-- This is what we need
Encrypted: no
Page size: 2384 x 3370 pts (A0)
File size: 17569259 bytes
Optimized: yes
PDF version: 1.6
I haven't seen a PDF document where it returned a false pagecount (yet). It is also really fast: even with big documents of 200+ MB the response time is just a few seconds or less.
There is an easy way of extracting the pagecount from the output, here in PHP:
// Make a function for convenience
function getPDFPages($document)
{
$cmd = "/path/to/pdfinfo"; // Linux
// $cmd = "C:\\path\\to\\pdfinfo.exe"; // Windows: use this line instead
// Parse entire output
// Surround with double quotes if file name has spaces
exec("$cmd \"$document\"", $output);
// Iterate through lines
$pagecount = 0;
foreach($output as $op)
{
// Extract the number
if(preg_match("/Pages:\s*(\d+)/i", $op, $matches) === 1)
{
$pagecount = intval($matches[1]);
break;
}
}
return $pagecount;
}
// Use the function
echo getPDFPages("test 1.pdf"); // Output: 13
Of course this command line tool can be used in other languages that can parse output from an external program, but I use it in PHP.
I know it's not pure PHP, but external programs are way better in PDF handling (as seen in the question).
I hope this can help people, because I have spent a whole lot of time trying to find the solution to this and I have seen a lot of questions about PDF pagecount in which I didn't find the answer I was looking for. That's why I made this question and answered it myself.
Security Notice: Use escapeshellarg on $document if document name is being fed from user input or file uploads.
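For callers working from a shell rather than PHP, the same extraction could be sketched like this. It assumes pdfinfo prints a line starting with "Pages:" as in the sample output above; parse_pages and pdf_page_count are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: read pdfinfo-style "Key: value" output on stdin
# and print the number from the first line that starts with "Pages:".
parse_pages() {
    awk -F: '/^Pages:/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# Hypothetical wrapper: only calls pdfinfo if it is installed.
# The file name is quoted so that spaces in it are preserved.
pdf_page_count() {
    command -v pdfinfo >/dev/null 2>&1 || return 1
    pdfinfo "$1" 2>/dev/null | parse_pages
}
```

Keeping the parsing separate from the call makes it possible to check the parsing against saved pdfinfo output without having the tool installed.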

The simplest of all is using ImageMagick.
Here is some sample code:
$image = new Imagick();
$image->pingImage('myPdfFile.pdf');
echo $image->getNumberImages();
Otherwise you can also use PDF libraries like mPDF or TCPDF for PHP.

You can use qpdf like below. If a file file_name.pdf has 100 pages,
$ qpdf --show-npages file_name.pdf
100

Here is a simple example to get the number of pages in PDF with PHP.
<?php
function count_pdf_pages($pdfname) {
$pdftext = file_get_contents($pdfname);
$num = preg_match_all("/\/Page\W/", $pdftext, $dummy);
return $num;
}
$pdfname = 'example.pdf'; // Put your PDF path
$pages = count_pdf_pages($pdfname);
echo $pages;
?>

If you can't install any additional packages, you can use this simple one-liner:
foundPages=$(strings < "$PDF_FILE" | sed -n 's|.*Count -\{0,1\}\([0-9]\{1,\}\).*|\1|p' | sort -rn | head -n 1)

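This seems to work pretty well without parsing command output. It does need ImageMagick's identify command, which prints one line per page.
<?php
$target_pdf = "multi-page-test.pdf";
// escapeshellarg guards against spaces and shell metacharacters in the file name
$cmd = sprintf("identify %s", escapeshellarg($target_pdf));
exec($cmd, $output);
$pages = count($output);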

Since you're ok with using command line utilities, you can use cpdf (Microsoft Windows/Linux/Mac OS X). To obtain the number of pages in one PDF:
cpdf.exe -pages "my file.pdf"

I created a wrapper class for pdfinfo in case it's useful to anyone, based on Richard's answer:
/**
* Wrapper for pdfinfo program, part of xpdf bundle
* http://www.xpdfreader.com/about.html
*
* this will put all pdfinfo output into keyed array, then make them accessible via getValue
*/
class PDFInfoWrapper {
const PDFINFO_CMD = 'pdfinfo';
/**
* keyed array to hold all the info
*/
protected $info = array();
/**
* raw output in case we need it
*/
public $raw = "";
/**
* Constructor
* @param string $filePath - path to file
*/
public function __construct($filePath) {
exec(self::PDFINFO_CMD . ' ' . escapeshellarg($filePath), $output);
//loop each line and split into key and value
foreach($output as $line) {
$colon = strpos($line, ':');
if($colon) {
$key = trim(substr($line, 0, $colon));
$val = trim(substr($line, $colon + 1));
//use strtolower to make case insensitive
$this->info[strtolower($key)] = $val;
}
}
//store the raw output
$this->raw = implode("\n", $output);
}
/**
* get a value
* @param string $key - key name, case insensitive
* @return string value
*/
public function getValue($key) {
return @$this->info[strtolower($key)];
}
/**
* list all the keys
* @return array of key names
*/
public function getAllKeys() {
return array_keys($this->info);
}
}

This simple one-liner seems to do the job well:
strings "$path_to_pdf" | grep Kids | grep -o R | wc -l
There is a block in the PDF file which lists the pages in this funky string:
/Kids [3 0 R 4 0 R 5 0 R 6 0 R 7 0 R 8 0 R 9 0 R 10 0 R 11 0 R 12 0 R 13 0 R 14 0 R 15 0 R 16 0 R 17 0 R 18 0 R 19 0 R 20 0 R 21 0 R 22 0 R 23 0 R 24 0 R 25 0 R 26 0 R 27 0 R 28 0 R 29 0 R 30 0 R 31 0 R 32 0 R 33 0 R 34 0 R 35 0 R 36 0 R 37 0 R 38 0 R 39 0 R 40 0 R 41 0 R]
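The number of 'R' characters is the number of pages, provided the document has a single, flat /Kids array that is stored uncompressed; nested page trees will give a different count.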

You can use mutool.
mutool show FILE.pdf trailer/Root/Pages/Count
mutool is part of the MuPDF software package.

Here is an R function that reports the number of pages in a PDF file by using the pdfinfo command.
pdf.file.page.number <- function(fname) {
a <- pipe(paste("pdfinfo", shQuote(fname), "| grep Pages | cut -d: -f2"))
page.number <- as.numeric(readLines(a))
close(a)
page.number
}
if (F) {
pdf.file.page.number("a.pdf")
}

Here is a Windows command script using Ghostscript that reports the number of pages in a PDF file
@echo off
echo.
rem
rem this file: getlastpagenumber.cmd
rem version 0.1 from commander 2015-11-03
rem need Ghostscript e.g. download and install from http://www.ghostscript.com/download/
rem Install path "C:\prg\ghostscript" for using the script without changes \\ and have less problems with UAC
rem
:vars
set __gs__="C:\prg\ghostscript\bin\gswin64c.exe"
set __lastpagenumber__=1
set __pdffile__="%~1"
set __pdffilename__="%~n1"
set __datetime__=%date%%time%
set __datetime__=%__datetime__:.=%
set __datetime__=%__datetime__::=%
set __datetime__=%__datetime__:,=%
set __datetime__=%__datetime__:/=%
set __datetime__=%__datetime__: =%
set __tmpfile__="%tmp%\%~n0_%__datetime__%.tmp"
:check
if %__pdffile__%=="" goto error1
if not exist %__pdffile__% goto error2
if not exist %__gs__% goto error3
:main
%__gs__% -dBATCH -dFirstPage=9999999 -dQUIET -dNODISPLAY -dNOPAUSE -sstdout=%__tmpfile__% %__pdffile__%
FOR /F " tokens=2,3* usebackq delims=:" %%A IN (`findstr /i "number" %__tmpfile__%`) DO set __lastpagenumber__=%%A
set __lastpagenumber__=%__lastpagenumber__: =%
if exist %__tmpfile__% del %__tmpfile__%
:output
echo The PDF-File: %__pdffilename__% contains %__lastpagenumber__% pages
goto end
:error1
echo no pdf file selected
echo usage: %~n0 PDFFILE
goto end
:error2
echo no pdf file found
echo usage: %~n0 PDFFILE
goto end
:error3
echo.can not find the ghostscript bin file
echo. %__gs__%
echo.please download it from:
echo. http://www.ghostscript.com/download/
echo.and install to "C:\prg\ghostscript"
goto end
:end
exit /b

The R package pdftools and its function pdf_info() provide information on the number of pages in a PDF.
library(pdftools)
pdf_file <- file.path(R.home("doc"), "NEWS.pdf")
info <- pdf_info(pdf_file)
nbpages <- info[2]
nbpages
$pages
[1] 65

If you have access to a shell, the simplest (but not usable on 100% of PDFs) approach would be to use grep.
This should return just the number of pages:
grep -m 1 -aoP '(?<=\/N )\d+(?=\/)' file.pdf
Example: https://regex101.com/r/BrUTKn/1
Switches description:
-m 1 is necessary as some files can have more than one match of the regex pattern (volunteer needed to replace this with a match-only-first regex solution)
-a is necessary to treat the binary file as text
-o to show only the match
-P to use Perl regular expression
Regex explanation:
starting "delimiter": (?<=\/N ) lookbehind of /N (nb. space character not seen here)
actual result: \d+ any number of digits
ending "delimiter": (?=\/) lookahead of /
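Nota bene: if no match is found, this answer assumes that only 1 page exists. Treat that fallback with care: the /N value matched here typically comes from the linearization dictionary, so a multi-page PDF that is not linearized will also produce no match.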

I had problems with ImageMagick installations on a production server. After hours of attempts, I decided to get rid of IM, and found another approach:
Install poppler-utils:
$ sudo apt install poppler-utils [On Debian/Ubuntu & Mint]
$ sudo dnf install poppler-utils [On RHEL/CentOS & Fedora]
$ sudo zypper install poppler-tools [On OpenSUSE]
$ sudo pacman -S poppler [On Arch Linux]
Then execute via shell in your programming language (e.g. PHP):
shell_exec("pdfinfo " . escapeshellarg($filePath) . " | grep Pages | cut -f 2 -d':' | xargs");

This works fine in Imagemagick.
convert image.pdf -format "%n\n" info: | head -n 1

You often see the regex /\/Page\W/ recommended, but it didn't work for me on several PDF files.
So here is another regular expression that works for me.
$pdf = file_get_contents($path_pdf);
return preg_match_all("/[<|>][\r\n|\r|\n]*\/Type\s*\/Page\W/", $pdf, $dummy);

Remove first line in text file without allocating memory for entire text file

I have a very large text file and all I need to do is remove one single line from the top of the file. Ideally, it would be done in PHP, but any unix command would work fine. I'm thinking I can just stream through the beginning of the file till I reach \n, but I'm not sure how I do that.
Thanks,
Matt Mueller

You can use a variety of tools in *nix. Here is a comparison of some of the different methods on a file with more than 1.5 million lines.
$ wc -l < file4
1700589
$ time sed -n '2,$p' file4 > /dev/null
real 0m2.538s
user 0m1.787s
sys 0m0.282s
$ time awk 'NR>1' file4 > /dev/null
real 0m2.174s
user 0m1.706s
sys 0m0.293s
$ time tail -n +2 file4 >/dev/null
real 0m0.264s
user 0m0.067s
sys 0m0.194s
$ time more +2 file4 > /dev/null
real 0m11.771s
user 0m11.131s
sys 0m0.225s
$ time perl -ne 'print if $. > 1' file4 >/dev/null
real 0m3.592s
user 0m3.259s
sys 0m0.321s

sed -i -e '1d' file will do what you want.
-i indicates "in-place"
-e means "evaluate this expression"
'1d' means, delete the first line

If your file is flat, you can use sed '1d' file > newfile

Assuming tail from GNU coreutils:
tail -n +2 file > newfile

tail -n +2 < source > destination
With a leading +, tail -n +N outputs everything starting at line N, so +2 skips only the first line.
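Since the question asks for the file itself to lose its first line, the tail approach can be wrapped as below. This is a minimal sketch: drop_first_line is a hypothetical helper name, and it assumes there is enough disk space for a second copy of the file next to the original.

```shell
#!/bin/sh
# Hypothetical helper: remove the first line of the file named in $1.
# tail streams the file, so it is never held in memory as a whole, but
# a second copy exists on disk until the rename.
drop_first_line() {
    # Create the temporary file beside the original so that mv is a
    # rename on the same filesystem.
    tmp=$(mktemp "$1.XXXXXX") || return 1
    if tail -n +2 "$1" > "$tmp"; then
        mv "$tmp" "$1"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Note that the resulting file takes on the permissions of the temporary file created by mktemp, so they may need to be restored with chmod afterwards.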

Try the following command:
sed -n '2,$p' file

I don't know how big your file is, but did you try awk 'NR > 1 {print}' file ?

I'm a bit rusty on Perl, but this should do the trick:
#!/usr/bin/perl
# Perl has no true/false barewords (both would be truthy strings), so use 1 and 0
$first = 1;
while (<>)
{
if ($first)
{
# skip first line
$first = 0;
}
else
{
print;
}
}
and use this script as a filter:
perl removefirstline.pl < myfile.txt > myfile_2.txt

function cutline($filename,$line_no=-1) {
$strip_return=FALSE;
// Note: file() loads the whole file into memory, so this does not suit very large files
$data=file($filename);
$pipe=fopen($filename,'w');
$size=count($data);
// -1 means "delete the last line"
if($line_no==-1) $skip=$size-1;
else $skip=$line_no-1;
for($line=0;$line<$size;$line++)
if($line!=$skip)
fputs($pipe,$data[$line]);
else
$strip_return=TRUE;
fclose($pipe);
return $strip_return;
}
cutline('foo.txt',1); // deletes line 1 in foo.txt
