Preserve colored output using php's popen

When using popen in php, is there a way to preserve the colored output a program might generate? Is there maybe a way I can tell the shell to print all color escape sequences, instead of resolving them?

That depends on the program you are calling. Usually, if a program supports coloured output, it would ask the OS, "am I running on a terminal?" If yes, then it outputs colour codes. If not, it won't. If you run that program through popen(), then the OS would say "no, you're not running on a terminal" and the program would choose not to output the colour codes (because they would be confusing in the captured output).
Some programs may have an option to force the output of colour codes even if output is not being written to a terminal. However, that is not something you can force externally if the program doesn't already have a way to do it.
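A minimal sketch of the point above, assuming a POSIX shell with `printf` is available: popen() itself passes escape sequences through untouched, so the only question is whether the called program emits them. The helper name `captureOutput` is made up for illustration, and the `ls --color=always` line assumes GNU ls, which has such a force option.

```php
<?php
// Read everything a command writes to stdout through popen().
function captureOutput(string $command): string
{
    $handle = popen($command, 'r');
    if ($handle === false) {
        throw new RuntimeException("Could not start: $command");
    }
    $output = stream_get_contents($handle);
    pclose($handle);
    return $output === false ? '' : $output;
}

// printf always emits the colour codes, terminal or not,
// so they arrive intact in the captured string.
$raw = captureOutput("printf '\\033[31mred\\033[0m\\n'");
echo $raw;

// Assumption: GNU ls, which can be told to emit colour even when piped.
// $listing = captureOutput('ls --color=always');
```

If the program has no such option, the captured output simply contains no colour codes to preserve.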

Related

Is it safe to use raw emojis in PHP source code?

Example :
$fire = '🔥';
I know PHP 5+ supports this functionality natively but is it best practice or should I be storing them using their codepoints instead and if so, why?

As far as your editor and the PHP compiler are concerned, it's all just text, and '🔥' is no different from 'fire' or 'Φωτιά'.
When PHP runs, it will read the bytes in from the file and put them in memory, without caring what they mean. This leads to the most likely problem you'll have: if you save the file in your text editor as UTF-16, and then echo the string to a browser telling it that it's UTF-8, the browser won't show the right thing. But that's easily avoided by making sure your editor always uses UTF-8, and your output headers tell the browser that's what you're using.
If you don't trust your editor to do that, and you're running PHP 7 or later, you could write it in the escaped notation "\u{1f525}" (inside a double-quoted string), but when it runs, the same bytes will end up in memory.
You might have similar problems if you send the text elsewhere - to a database, for instance - and that somewhere else doesn't know to handle it as UTF-8. How you write the string in your source file won't make any difference to that, though, that's just a case of making sure everything is configured to match.
Note: you don't actually have to use UTF-8 for this, you could use UTF-16, or some other encoding, as long as you're consistent; but UTF-8 is by far the most common these days, particularly on the web.
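As a quick check of the claim that both notations end up as the same bytes, here is a small sketch. It assumes the source file is saved as UTF-8 and that PHP 7 or later is used (the `\u{...}` escape does not exist before that).

```php
<?php
// Assumes this file is saved as UTF-8.
$literal = '🔥';
$escaped = "\u{1F525}";          // PHP 7+ codepoint escape, always produces UTF-8

var_dump($literal === $escaped); // same bytes in memory if the file is UTF-8
echo bin2hex($escaped), "\n";    // the UTF-8 encoding of U+1F525
echo strlen($escaped), "\n";     // strlen() counts bytes, not characters
```

If the comparison is false, the file was most likely saved in an encoding other than UTF-8.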

Go back up a line in a linux console?

I know I can go back to the start of the line and overwrite its contents with \r.
Now how can I go up into the previous line to change that?
Or is there even a way to print to a specific cursor location in the console window?
My goal is to create some self-refreshing multiline console app with PHP.

Use ANSI escape codes to move the cursor. For example: Esc [ 1 F moves the cursor to the start of the previous line. To put the Escape character in a string you'll need to specify its value numerically, for example "\x1B[1F".
As sujoy suggests, you can use PHP ncurses for a more abstract way to move the cursor.
Whilst most "consoles" allow ANSI escape codes, other sorts of terminal use different character sequences; ncurses provides a standardised API that is terminal independent. Have a quick look at /etc/termcap (and then man terminfo) if you are interested.
Update: Lars Wirzenius' answer has a useful summary of the background. Some years ago I also wrote a short article on terminals.
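A rough sketch of a self-refreshing multiline display built on those escape codes. It assumes a terminal that understands the ANSI sequences CSI n F (cursor to the start of the line n lines up) and CSI 2 K (erase the current line); the function name `renderFrame` and the sample data are invented for illustration.

```php
<?php
// Build the byte string that redraws a block of lines in place.
function renderFrame(array $lines, int $previousLineCount): string
{
    $out = '';
    if ($previousLineCount > 0) {
        // Move the cursor to the start of the first line of the old frame.
        $out .= "\x1B[{$previousLineCount}F";
    }
    foreach ($lines as $line) {
        // Erase whatever was on this line, then write the new content.
        $out .= "\x1B[2K" . $line . "\n";
    }
    return $out;
}

// Made-up frames, drawn one over the other.
$frames = [
    ['CPU: 10%', 'MEM: 40%'],
    ['CPU: 55%', 'MEM: 41%'],
];
$drawn = 0;
foreach ($frames as $frame) {
    echo renderFrame($frame, $drawn);
    $drawn = count($frame);
    usleep(200000); // short pause so the update is visible
}
```

Erasing each line before writing avoids leftovers when a new line is shorter than the one it replaces.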

The Linux virtual consoles emulate an old-time display terminal, although not perfectly. See Wikipedia on VT-100 for an example of the hardware.
These terminals read data from a serial port, and displayed it on the screen. They also looked for special bytes in the input stream from the serial port and acted upon them in other ways. For example, the newline character ('\n', byte value 10) would go to the beginning of the next line, and the carriage return character ('\r', byte value 13) would go to the beginning of the current line.
More interestingly, an ASCII ESC byte (27) would start a command sequence which could do almost anything to the cursor or display. One such sequence might move the cursor to the top left of the screen, another to a given row and column. A third one might clear the screen, and a fourth one might make text be displayed in reverse colors.
Every manufacturer of terminals would invent their own command sequences (and they didn't always start with ESC either), and then change them depending on what they could make new versions of their hardware do. If a manufacturer added colors or simple graphics, those resulted in new sequences.
Adapting every application to every terminal and every change to the command sequences would have been a big task. Compare it with adapting every web application to a new browser version.
As usual, the solution is to add a layer of abstraction. In Unix, the initial abstraction was called termcap, and consisted of the file /etc/termcap, and a library to read the file. The file would specify the actual command sequences to send for each logical operation for each terminal model. So a vt102 terminal model would map the operation "clear the screen" to the sequence \033[2J. This allowed application programmers to think in terms of the logical operations, which was much simpler.
Of course, not simple enough... The termcap library was not as good as it might have been, so two other libraries were developed: curses provided a higher abstraction level, including user input, and terminfo made the terminal definitions and their use by programmers easier.
In modern times, ncurses is a free re-implementation of curses and terminfo has pretty much replaced termcap completely. Also, ANSI has defined some "standard" sequences, based on the Digital terminals, and almost every terminal emulator uses those, at least mostly, and the Linux virtual console is one of them. Very few people have actual physical terminals anymore.
For what you're trying to do, ncurses or the tput command may be most useful. Or you may decide that just clearing the whole screen (see clear(1)) and writing output then is easiest.

"My goal is to create some self-refreshing multiline console app with PHP"
For what you are trying to achieve, ncurses is the way to go.

You should read about ncurses. In the shell, you can go one line up with:
tput cuu1
See man terminfo for more options.
But executing a shell command every time you want to move the cursor around is quite desperate.
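One way to soften that cost, sketched under assumptions: ask tput for the sequence once, keep it in a variable, and print it directly afterwards. The function name `cursorUpSequence` is hypothetical, and the fallback to the ANSI sequence ESC [ 1 A is an assumption for the case where tput is missing or TERM is not set.

```php
<?php
// Ask terminfo (via tput) for the "cursor up one line" sequence once.
function cursorUpSequence(): string
{
    $seq = shell_exec('tput cuu1 2>/dev/null');
    if (!is_string($seq) || $seq === '') {
        // Assumption: fall back to the common ANSI sequence.
        return "\x1B[1A";
    }
    return $seq;
}

$up = cursorUpSequence(); // one shell call, reused for every redraw

echo "progress: 1\n";
echo $up . "\rprogress: 2\n"; // go up one line and overwrite it
```

This keeps the terminal independence of terminfo while paying for the external command only once.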

You can just use the up and down arrows on the keyboard to scroll through console history, but there is also the history command. Find out more using man history.

Explain this XSS string, it uses perl

I am trying to test one of my php sanitization classes against a few xss scripts available on
http://ha.ckers.org/xss.html
One of the scripts there has perl in it. Is this some kind of perl statement? And would it execute directly on the server, since perl is a server-side scripting language?
perl -e 'print "<IMG SRC=java\0script:alert(\"XSS\")>";' > out
Is the script that I am trying to work with. I have not tested it yet though, but I want to understand before I use it.

The \0 is the string termination character in the C language. Since perl is built on top of C, in the old days you could inject this "poisonous null byte" to make the C part read the line as
<IMG SRC=java
instead of the whole string, and thus maybe allow the whole thing through even though you were trying to strip stuff like SRC=javascript:
Mostly this doesn't work anymore because the higher-level languages have gotten pretty good at defeating attacks like this by stripping out stray control chars like \0 before sending the strings on to the lower-level routines.
You can read more on the poison nullbyte here: http://insecure.org/news/P55-07.txt or here: http://hakipedia.com/index.php/Poison_Null_Byte

The Perl isn't the attack, it just demonstrates how to generate the attack, since you can't see it in a plain string.
The point is that there is a null character (represented in Perl as \0) in the data.
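To see what the null character does to a filter, here is a small PHP sketch. The "naive filter" is a deliberately simplistic stand-in and not the asker's sanitization class; it only illustrates why the byte has to be removed before matching.

```php
<?php
// The payload from the Perl one-liner, written as a PHP string.
// "\0" inside double quotes is a single NUL byte.
$payload = "<IMG SRC=java\0script:alert(\"XSS\")>";

// A naive filter that looks for the literal word does not find it,
// because the NUL byte sits between "java" and "script".
$naiveHit = stripos($payload, 'javascript:') !== false;

// Stripping NUL bytes first makes the hidden word visible to the filter.
$cleaned  = str_replace("\0", '', $payload);
$cleanHit = stripos($cleaned, 'javascript:') !== false;

var_dump($naiveHit, $cleanHit);
```

Any real sanitizer would need to normalise the input like this before applying its own rules.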

using ≠ like != pros/cons

Is it ok to use ≠ instead of !=? I know it's an extra alt code and I've never used this on a project but I've tested it out and it works. Are there any pros/cons besides having to type Alt +8800?
Edit:
I'm not going to use this, I just want to know.
Tested language php.

You have not mentioned which programming language your question is about, but ≠ has a number of disadvantages:
It does not exist in ASCII. Code written in anything else but pure 7-bit ASCII is way too vulnerable to strange encoding errors and it forces unnecessary requirements upon the editing program. It might even be displayed incorrectly depending on the editor font etc, and you do NOT want that to happen when editing code.
Even if it does work, it is not widely used, which by itself is a good reason to avoid it.
It saves screen space, sure, at the expense of clarity. It might even be mistaken for an = if you are tired. Terse code is not always best.
It cannot be typed easily in a portable manner. As a matter of fact, I don't know how to produce it on my Linux system without a graphical character selector.
Except for the screen space (disk space is actually the same or worse than !=) I cannot think of any other "advantage", so why bother?
EDIT:
On my system (Mandriva Linux 2010.1) with PHP 5.3.4 the ≠ (U+2260, or 8800 in decimal) operator does not work. Are you certain that your editor does not implicitly convert it to !=?
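The remark about disk space is easy to check. A minimal sketch, assuming the source is UTF-8 and PHP 7 or later for the `\u{...}` escape:

```php
<?php
$notEqualSign = "\u{2260}"; // the single character ≠, encoded as UTF-8
$asciiOperator = '!=';

echo strlen($notEqualSign), "\n";   // bytes used by ≠ in UTF-8
echo strlen($asciiOperator), "\n";  // bytes used by !=
echo bin2hex($notEqualSign), "\n";  // the individual bytes
```

So the one-character form saves a screen column but takes more bytes on disk than the two-character ASCII operator.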

How to escape/strip special characters in the LaTeX document?

We implemented an online service where it is possible to generate a PDF with a predefined structure. The user can choose a LaTeX template and then compile it with appropriate inputs.
What worries us is security: a malicious user must not be able to gain shell access by injecting special instructions into the LaTeX document.
We need some workaround for this or at least a list of special characters that we should strip from the input data.
Preferred language would be PHP, but any suggestions, constructions and links are very welcome.
PS. In a few words, we're looking for mysql_real_escape_string for LaTeX.

Here's some code to implement the Geoff Reedy answer. I place this code in the public domain.
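<?php
// Demo: escape every LaTeX special character in a sample string.
$test = 'Test characters: # $ % & ~ _ ^ \\ { }.';
header( "content-type:text/plain" );
print latexSpecialChars( $test );
exit;

// Escape the ten characters that LaTeX treats specially.
function latexSpecialChars( $string )
{
    $map = array(
        "#"=>"\\#",
        "$"=>"\\$",
        "%"=>"\\%",
        "&"=>"\\&",
        "~"=>"\\~{}",
        "_"=>"\\_",
        "^"=>"\\^{}",
        // The trailing {} keeps the macro name from running into following letters.
        "\\"=>"\\textbackslash{}",
        "{"=>"\\{",
        "}"=>"\\}",
    );
    // strtr() replaces in a single pass, so escaped output is never escaped twice.
    // It stands in for preg_replace with the /e modifier, which was removed in PHP 7.
    return strtr( $string, $map );
}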

The only possibility (AFAIK) to perform harmful operations using LaTeX is to enable the possibility to call external commands using \write18. This only works if you run LaTeX with the --shell-escape or --enable-write18 argument (depending on your distribution).
So as long as you do not run it with one of these arguments you should be safe without the need to filter out any parts.
Besides that, one is still able to write other files using the \newwrite, \openout and \write commands. Letting the user create and (over)write files might be unwanted, so you could filter out occurrences of these commands. But keeping blacklists of certain commands is prone to fail, since someone with bad intentions can easily hide the actual command by obfuscating the input document.
Edit: Running the LaTeX command using a limited account (ie no writing to non latex/project related directories) in combination with disabling \write18 might be easier and more secure than keeping a blacklist of 'dangerous' commands.

According to http://www.tug.org/tutorials/latex2e/Special_Characters.html the special characters in latex are # $ % & ~ _ ^ \ { }. Most can be escaped with a simple backslash but ~ ^ and \ need special treatment.
For caret use \^{} (or \textasciicircum), for tilde use \~{} (or \textasciitilde) and for backslash use \textbackslash
If you want the user input to appear as typewriter text, there is also the \verb command which can be used like \verb+asdf$$&\~^+, the + can be any character but can't be in the text.

In general, achieving security purely through escaping command sequences is hard to do without drastically reducing expressivity, since there is no principled way to distinguish safe cs's from unsafe ones: Tex is just not a clean enough programming language to allow this. I'd say abandon this approach in favour of eliminating the existence of security holes.
Veger's summary of the security holes in Latex agrees with mine: i.e., the issues are shell escapes and file creation/overwriting, though he has missed a shell escape vulnerability. Some additional points follow, then some recommendations:
It is not enough to avoid actively invoking --shell-escape, since it can be implicitly enabled in texmf.cnf. You should explicitly pass --no-shell-escape to override texmf.cnf;
\write18 is a primitive of Etex, not Knuth's Tex. So you can avoid Latexes that implement it (which, unfortunately, is most of them);
If you are using Dvips, there is another risk: \special commands can create .dvi files that ask dvips to execute shell commands. So you should, if you use dvips, pass the -R2 option to forbid invoking of shell commands;
texmf.cnf allows you to specify where Tex can create files;
You might not be able to disable the creation of fonts if you want to give your clients much freedom in which fonts they use. Take a look at the notes on security for Kpathsea; the default behaviour seems reasonable to me, but you could have a per-user font tree, to prevent one user stepping on another user's toes.
Options:
Sandbox your client's Latex invocations, and allow them freedom to misbehave in the sandbox;
Trust in kpathsea's defaults, and forbid shell escapes in latex and any other executables used to build the PDF output;
Drastically reduce expressivity, forbidding your clients the ability to create font files or any new client-specified files. Run latex as a process that can only write to certain already existing files;
You can create a format file in which the \write18 cs, and the file creation css, are not bound, and only macros that invoke them safely, such as for font/toc/bbl creation, exist. This means you have to decide what functionality your clients have: they would not be able to freely choose which packages they import, but must make use of the choices you have imposed on them. Depending on what kind of 'templates' you have in mind, this could be a good option, allowing use of packages that use shell escapes, but you will need to audit the Tex/Latex code that goes into your format file.
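For the option of forbidding shell escapes explicitly, a hedged sketch of how the command line might be assembled in PHP. The helper name `buildLatexCommand` and the paths are hypothetical; the sketch assumes a TeX Live style pdflatex that accepts -no-shell-escape, -interaction=nonstopmode, -halt-on-error and -output-directory, and it only builds the command string without running it.

```php
<?php
// Hypothetical helper: build a pdflatex command with shell escape disabled.
function buildLatexCommand(string $texFile, string $outputDir): string
{
    return implode(' ', [
        'pdflatex',
        '-no-shell-escape',           // override any setting in texmf.cnf
        '-interaction=nonstopmode',   // never wait for user input on errors
        '-halt-on-error',             // stop at the first error
        '-output-directory=' . escapeshellarg($outputDir),
        escapeshellarg($texFile),     // quote paths before they reach the shell
    ]);
}

// Made-up paths, for illustration only.
echo buildLatexCommand('/tmp/job42/doc.tex', '/tmp/job42/out'), "\n";
```

Running the resulting command under a restricted account, as suggested earlier, would still be needed to limit where files can be written.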
Postscript
There's a TUGboat article, Server side PDF generation based on LATEX templates, which takes a different approach to the question from mine, namely generating PDFs from form input using Latex.

You'd probably want to make sure that your \write18 is disabled.
See http://www.fceia.unr.edu.ar/lcc/cdrom/Instalaciones/LaTex/MiKTex/doc/ch04s08.html and http://www.texdev.net/2009/10/06/what-does-write18-mean/
