Unicode to PHP exec - php

I have a Python file I'm calling with PHP's exec function. Python then outputs a string (apparently Unicode, based on using isinstance), which is echoed by PHP. The problem I'm running into is that if my string has any special characters in it (like the degree symbol), it won't output. I'm sure I need to do something to fiddle with the encoding, but I'm not really sure what to do, and why.
EDIT: To get an idea of how I am calling exec, please see the following code snippet:
$tables = shell_exec('/s/python-2.6.2/bin/python2.6 getWikitables.py '.$title);
Python properly outputs the string when I call getWikitables.py by itself.
EDIT: It definitely seems to be something either on the Python end, or in transmitting the results. When I run strlen on the returned values in PHP, I get 0. Can exec only accept a certain type of encoding?

Try setting the LANG environment variable immediately before executing the Python script per http://php.net/shell-exec#85095:
shell_exec(sprintf(
'LANG=en_US.utf-8; /s/python-2.6.2/bin/python2.6 getWikitables.py %s',
escapeshellarg($title)
));
(use of sprintf() to (hopefully) make it a little easier to follow the lengthy string)
You might also/instead need to do this before calling shell_exec(), per http://php.net/shell-exec#78279:
$locale = 'en_US.utf-8';
setlocale(LC_ALL, $locale);
putenv('LC_ALL='.$locale);

I have had a similar issue and solved it with the following. I don't understand why it is necessary, since I though all is already processed with UTF-8. Calling my Python script on the command line worked, but not with exec (shell_exec) via PHP and Apache.
According to a php forum entry this one is needed when you want to use escapeshellarg():
setlocale(LC_CTYPE, "en_US.UTF-8");
It needs to be called before escapeshellarg() is executed. Also, it was necessary to set a certain Python environment variable before the exec command (found an unrelated hint here):
putenv("PYTHONIOENCODING=utf-8");
My Python script evaluated the arguments like this:
sys.argv[1].decode("utf-8")
(Hint: That was required because I use a library to convert some arabic texts.)
So finally, I could imagine that the original question could be solved this way:
setlocale(LC_CTYPE, "en_US.UTF-8");
putenv("PYTHONIOENCODING=utf-8");
$tables = shell_exec('/s/python-2.6.2/bin/python2.6 getWikitables.py ' .
escapeshellarg($title));
But I cannot tell anything regarding the return value. In my case I could output it to the browser directly without any problems.
Spent many, many hours to find that out... One of the situations when I hate my job ;-)

This worked for me
setlocale(LC_CTYPE, "en_US.UTF-8");
putenv("PYTHONIOENCODING=utf-8");
$tables = shell_exec('/s/python-2.6.2/bin/python2.6 getWikitables.py ' .
escapeshellarg($title));

On php you can use methods like utf8_encode() or utf8_decode() to solve your problem.

Related

How can you use the "*" path wildcard via PHP exec() as part of a `cp` or `ls` command?

I just cannot fathom how to get the PHP exec() or shell_exec() functions to treat a '*' character as a wildcard. Is there some way to properly encode / escape this character so it makes it through to the shell?
This is on windows (via CLI shell script if that matters, Terminal or a git-bash yields the same results).
Take the following scenario:
C:\temp\ contains a bunch of png images.
echo exec('ls C:\temp\*');
// output: ls: cannot access 'C:\temp\*': No such file or directory
Permissions is not the problem:
echo exec('ls C:\temp\exmaple.png');
// output: C:\temp\example.png
Therefore the * character is the problem and is being treated as a literal filename rather than a wildcard. The file named * does not exist, so from that point of view, it's not wrong...
It also does not matter if I use double quotes to encase the command:
echo exec("ls C:\temp\*");
// output: ls: cannot access 'C:\temp\*': No such file or directory
I have also tried other things like:
exec(escapeshellcmd('ls C:\temp\*'));
exec('ls C:\temp\\\*');
exec('ls "C:\temp\*"');
exec('ls "C:\temp\"*');
And nothing works...
I'm pretty confused that I cannot find any other posts discussing this but maybe I'm just missing it. At this point I have already worked around the issue by manually programming a glob loop and using the internal copy() function on each file individually, but it's really bugging me that I do not understand how to make the wildcard work via shell command.
EDIT:
Thanks to #0stone0 - The answer provided did not particularly answer my initial question but I had not tried using forward slashes in the path and when I do:
exec('ls C:/temp/*')
It works correctly, and as 0stone0 said, it only returns the last line of the output, which is fine since this was just for proof of concept as I was not actually attempting to parse the output.
Also, on a side note, since posting this question my system had been updated to Win11 22H2 and now for some reason the original test code (with the backslashes) no longer returns the "Cannot access / no file" error message. Instead it just returns an empty string and has no output set to the &$output parameter either. That being said, I'm not sure if the forward slashes would have worked on my system prior to the 22H2 update.
exec() only returns the last output line by default.
The wildcard probably works, but the output is just truncated.
Pass an variable by ref to exec() and log that:
<?php
$output = [];
exec('ls -lta /tmp/*', $output);
var_dump($output);
Without any additional changes, this returns the same as when I run ls -lta /tmp/* in my Bash terminal
That said, glob() is still the preferred way of getting data like this especcially since
You shouldn't parse the output of ls

PHP exec command not having an $output but $return is 0

This is my code for executing a command from PHP:
$execQuery = sprintf("/usr/local/bin/binary -mode M \"%s\" %u %s -pathJson \"/home/ec2/fashion/jsonS/\" -pathJson2 \"/home/ec2/fashion/jsonS2/\"", $path, $pieces, $type);
exec($execQuery, $output, $return);
the $return value is always 0 but $output is empty. The $output should be a JSON.
If I execute the same but removing one letter to binary (for example /usr/local/bin/binar ) I get (correctly) a $return = 127.
If I write other parameters (like -mode R which doesn't exit) I got errors from the console (which are correct as well).
If I run the exact $execQuery (which I printf before to be sure about quotation marks) on the console, it executes correctly. It's only the PHP side where I've got the error.
What can be wrong?
Thank you in advance.
Well, a couple of things might be happening...
This binary you're running write to something else that STDOUT (for instance, STDERR)
The env vars available to the PHP user differ from the env vars available to the user running console (and those vars are required)
PHP User does not have permission to access some files involved.
In order to debug, it might be better to use proc_open instead of exec, and check the STDOUT and STDERR. This might give you additional information regarding what's happening.
Suggestion (and shameless advertising)
I wrote a small utility library for PHP that executes external programs in a safer way and provides aditional debug information. It might help you to, at least pinpoint the issue.

After update to 5.4, fopen can't read file

I have a website on a host that recently switched from PHP 5.2 to 5.4, and required us to chose a new php.ini file: 5.4 plain, 5.4 solo (just one php.ini file used throughout the site), and 5.4 fast.
I do not know which one I was using prior to making the switch, but when I did, (I chose 5.4 solo), I noticed that a part of my website that depends on mbstring (multibyte characters) no longer works.
In specific, it opens a text file that is full of characters and then that is used in an encryption script and it stores garbage in the mysql database. Then to retrieve it, it's again run through the script and decrypted, and displayed on the screen.
This worked just fine until the 5.4 change. Now it appears that it's unable to retrieve (open?) the text file. I have tested this with a non-multibyte character version and that works fine, so I don't think the issue is with the code, but rather with the way PHP is treating multibyte chars...and I suspect, just a hunch, that this is fixable by tweaking the PHP.ini file somehow. Zend.multibyte seems to be PHP's new thing.
My problem is that I have no idea what to tweak. I tried several different Zend.multibyte/mbstring combos and that didn't work.
I know that everything works up until a string is sent for encryption. It comes back as a null value, instead of a garbled string. I feel like something in the string is being rejected by PHP and thus it's failing...offering nothing instead of the string it should.
Does anyone have a thought as to what might be happening and why my script no-longer works with 5.4? I have checked and the mbstring module IS loaded, with default values in the php.ini.
Any suggestions would be great...I'm totally stumped. Even some additional reports or ways to test or narrow down the problem would be fantastic.
Thank you!
Here is some code, where I think the problem is:
$this->s1 = "";
$s1array = array("a1.txt", "a2.txt", "a3.txt");
foreach ($s1array as $i => $value) {
$myFile = "../a/dir/somewhere/$s1array[$i]";
$fh = fopen($myFile, 'r');
$theData = fgets($fh);
fclose($fh);
$this->s1 .= html_entity_decode($theData, ENT_NOQUOTES, 'UTF-8');
}
The files ../a/dir/somewhere/a1.txt and ../a/dir/somewhere/a2.txt (etc) are semi-comma delimited strings of html coded letters, for example: & #x0fb0f;& #x02c97;& #x00436;& #x10833;& #x00514; (I added the spaces so it would show code not the HTML values!).
But I guess now, for some reason, this above code isn't returning any results. If I assign the result to a variable and echo that variable, there's nothing. But if I assign $this->s1 = "abcde"; or a longer string and skip the "foreach" part, it will work. So something in this process, this code, no longer works in 5.4. Can anyone tell what's going on here? Thank you!
Why you use fopen and so on for text files when you could use file_put_contents and file_get_contents - they are mostly wrappers for fopen, freads and so on. I have NEVER ever had any problems with UTF8 using that two functions.
Also make sure everything (from php, to db if you are using it, and php files) are encoded or using utf8. There is nothing funnier than *.php files in for example latin2 and all the rest in utf8.

PHP exec change encoding

I need to address UTF-8 filenames with the php exec command. The problem is that the php exec command does not seem to understand utf-8. I use something like this:
echo exec('locale charmap');
returns ANSI_X3.4-1968
looking at this SO question, the solution lookes like that:
echo exec('LANG=de_DE.utf8; locale charmap');
But I still get the same output: ANSI_X3.4-1968
On the other hand - if I execute this php command on the bash command line:
php -r "echo exec('LANG=de_DE.UTF8 locale charmap');"
The output is UTF-8.
So the questions are:
Why is there an different result be executing the php command at bash and at apache_module/web page?
How to set UTF-8 for exec if it runs inside a website as apache module?
To answer my own question - i found the following solution:
setting the locale environment variable with PHP
$locale='de_DE.UTF-8';
setlocale(LC_ALL,$locale);
putenv('LC_ALL='.$locale);
echo exec('locale charmap');
This sets to / returns UTF-8. So i'm able to pass special characters and umlauts to linux shell commands.
This solves it for me (source: this comment here):
<?php
putenv('LANG=en_US.UTF-8');
$command = escapeshellcmd('python3 myscript.py');
$output = shell_exec($command);
echo $output;
?>
I had the similar problem. My program was returning me some German letters like: üäöß. Here is my code:
$programResult = shell_exec('my script');
Variable $programResult is containing German umlauts, but they were badly encoded. In order to encode it properly you can call utf8_encode() function.
$programResult = shell_exec('my script');
$programResult = utf8_encode($programResult);

PHP system returns 4

I want to convert a pdf file to an image with PHP, but i can't get the command worked. PHP returns a 4. I don't have any kind of idea what that can be.
I am using the next code:
$tmp = system("convert -version", $value);
var_dump($value);
Someone an idea?
try
exec("convert -version 2>&1", $out, $ret);
print_r($out);
it should tell you what's wrong
It looks like the -version flag is telling the convert software (looks like imagemagick) to respond with the major version number of that software. It looks like it is working correctly. You probably need to pass it the right flags to operate properly. I suggest reading the documentation to see what flags are required to convert PDFs.
try using some of the other system functions in PHP to get more detailed output.
exec("convert -version", $output, $value);
print_r($output);
The exec function above will give you all the output from the command in the $output parameter, as an array.
The return status (which will be held in the $value parameter in the exec call above or the system call in your original code) gives you the return value of the executed shell command.
In general, this will be zero for success, with non-zero integer return values indicating different kinds of error. So it appears there's something wrong with the command as you have it (possibly -version is not recognised: often you need a double hyphen before long-hand command-line options).
Incidentally, you may also find that the passthru function is more suited to your needs. If your convert program generates binary image data corresponding to the converted PDF, you can use passthru to send that image data directly to the browser (after setting the appropriate headers of course)
err... aren't you vardumping the wrong result? (I would var dump $tmp, not $value.)
I think the code should read:
$tmp = system("convert -version", $value);
var_dump($tmp);

Categories