Non ASCII Characters in filename are skipped - php

youtube-dl itself supports non-ASCII characters in filenames. It works flawlessly on my web server both as root and as the www-data user, but when I download a video with youtube-dl from PHP, the non-ASCII characters are dropped entirely.
For example, Stromae - bâtard is saved as Stromae - btard.mp4, and البث الحي is saved as .mp4.
I am using this code to run the CLI command:
function cmd($string) {
$descriptorspec = array(
0 => array("pipe", "r"), // stdin
1 => array("pipe", "w"), // stdout
2 => array("pipe", "w"), // stderr
);
$process = proc_open($string, $descriptorspec, $pipes);
$stdout = stream_get_contents($pipes[1]);
fclose($pipes[1]);
$stderr = stream_get_contents($pipes[2]);
fclose($pipes[2]);
$ret = proc_close($process);
return $stdout;
}
$value = ('youtube-dl https://some.valid/link');
echo cmd($value);
Kindly advise what I should do to fix this issue.

Check your phpinfo() output for the LC_ALL or LANG settings. I suspect this has nothing to do with PHP itself, but with the difference between the shell environment you are using and the one your web server is using. Try setting the locale explicitly in the command:
$value = ('LC_ALL=en_US.UTF-8 youtube-dl https://some.valid/link');
echo cmd($value);
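As an alternative to prefixing the command string, the locale can be handed to the child through the fifth ($env) argument of proc_open. The following is a minimal sketch, not a definitive fix: run_with_locale is a hypothetical helper name, en_US.UTF-8 is assumed to be installed on the server, and getenv() without arguments needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: run a shell command with an explicit locale.
function run_with_locale(string $command, string $locale = 'en_US.UTF-8'): array
{
    $descriptorspec = [
        0 => ['pipe', 'r'], // stdin
        1 => ['pipe', 'w'], // stdout
        2 => ['pipe', 'w'], // stderr
    ];

    // Keep the inherited environment (PATH etc.) and override only the locale.
    $env = array_merge(getenv(), ['LC_ALL' => $locale]);

    $process = proc_open($command, $descriptorspec, $pipes, null, $env);
    if (!is_resource($process)) {
        throw new RuntimeException("Could not start: $command");
    }

    fclose($pipes[0]); // nothing to send on stdin

    // Reading stdout to the end before stderr is fine for small outputs;
    // a child that writes a lot to stderr first could block here.
    $stdout = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    $stderr = stream_get_contents($pipes[2]);
    fclose($pipes[2]);

    $status = proc_close($process);

    return ['stdout' => $stdout, 'stderr' => $stderr, 'status' => $status];
}
```

With this helper, the call from the question would become run_with_locale('youtube-dl https://some.valid/link'). Merging getenv() into $env matters because proc_open replaces the child's whole environment when $env is given, so PATH and similar variables would otherwise be lost.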

By default, PHP uses the ISO-8859-1 charset. Configure PHP to use UTF-8. You can do this by adding
mb_internal_encoding("UTF-8");
at the beginning of your script.

Related

php exec grep -axv don't return anything

I have been stuck for days running grep from PHP: it works in the CLI but returns nothing over HTTP.
The command searches for files that contain non-UTF-8 characters.
In the CLI it returns ퟿������, but from the web it returns nothing (the array is empty).
<?php
exec("/sbin/grep -axv '.*' /srv/http/test 2>&1", $datareturn);
print_r($datareturn);
?>
disable_functions is empty in php.ini.
I also tried with proc_open:
<?php
$descriptorspec = array(
0 => array("pipe", "r"), // stdin is a pipe that the child will read from
1 => array("pipe", "w"), // stdout is a pipe that the child will write to
2 => array("file", "/tmp/error-output.txt", "a") // stderr is a file to write to
);
$process = proc_open(
"/sbin/grep -axv '.*' /srv/http/test",
$descriptorspec,
$pipes
);
if (is_resource($process)) {
// Closing $pipes[0] because we don't need it
fclose($pipes[0]);
echo stream_get_contents($pipes[1]);
fclose($pipes[1]);
//avoid a deadlock
$return_value = proc_close($process);
echo "command returned $return_value\n";
}
?>
In the CLI it returns:
"퟿������
command returned 0
Over HTTP it prints "command returned 1". error-output.txt is empty in both cases.

Get full error from PDFTK when using PHP exec

I'm using PHP exec() in a script to merge PDF files with PDFTK.
According to the PHP docs, the second argument of exec(), if provided, is filled with each line of the console output. All I get is an empty array, though.
Example of the code being used:
$output = array();
exec('pdftk "file1.pdf" "file2.pdf" Merged_File.pdf', $output, $result);
I can successfully get errors if I run the code in the console, but I'd like for my application to have access to the full text errors.
You are probably looking to get messages from stderr using proc_open. Something like this:
<?php
$cmd = "/path/to/script arguments here";
$cwd = dirname(__FILE__);
$descriptorspec = array(
0 => array("pipe", "r"), // stdin
1 => array("pipe", "w"), // stdout
2 => array("pipe", "w"), // stderr
);
if ( ($process = proc_open($cmd, $descriptorspec, $pipes, $cwd, null)) !== false )
{
// Standard output
$stdout = stream_get_contents($pipes[1]);
fclose($pipes[1]);
// Errors
$stderr = stream_get_contents($pipes[2]);
fclose($pipes[2]);
proc_close($process);
}
?>
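If proc_open feels heavy for this, a common alternative is to redirect stderr into stdout so that exec() captures both in its output array. This is a minimal sketch, assuming a POSIX shell; exec_with_stderr is a hypothetical helper name, not part of PHP or PDFTK.

```php
<?php
// Hypothetical helper: run a command with exec() and capture stderr as well.
function exec_with_stderr(string $command): array
{
    $output = [];
    $status = 0;

    // The parentheses run the command in a subshell so that 2>&1 applies to
    // the whole command line, not only to its last part.
    exec("($command) 2>&1", $output, $status);

    return ['lines' => $output, 'status' => $status];
}
```

For the question's case this would be called as exec_with_stderr('pdftk "file1.pdf" "file2.pdf" Merged_File.pdf'). The error text then arrives in 'lines', interleaved with any normal output, and the exit code in 'status'.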

Python ImportError while using youtube-dl by php

Traceback (most recent call last):
  File "/usr/bin/youtube-dl", line 3, in <module>
    import youtube_dl
  File "/usr/lib/python2.7/dist-packages/youtube_dl/__init__.py", line 65, in <module>
    from .utils import (
  File "/usr/lib/python2.7/dist-packages/youtube_dl/utils.py", line 18, in <module>
    import ssl
  File "/usr/lib/python2.7/ssl.py", line 61, in <module>
    import _ssl  # if we can't import it, let the error propagate
ImportError: /usr/lib/python2.7/lib-dynload/_ssl.x86_64-linux-gnu.so: symbol GENERAL_NAME_free, version OPENSSL_1.0.0 not defined in file libcrypto.so.1.0.0 with link time reference
The command I used:
youtube-dl --max-quality 2180 --write-thumbnail -x --audio-format mp3 -c -o "/home/bahaa/%(id)s.%(ext)s" "https://www.youtube.com/watch?v=xFQFMSNZW08&list=RDV-jLo0Ovems"
The PHP code I am using:
<?php
$url = 'https://www.youtube.com/watch?v=xFQFMSNZW08&list=RDV-jLo0Ovems';
// Quote the URL with escapeshellarg() so the shell does not treat "&" as a background operator
$string = 'youtube-dl --max-quality 2180 --write-thumbnail -x --audio-format mp3 -c -o "/home/bahaa/%(id)s.%(ext)s" ' . escapeshellarg($url);
$descriptorspec = array(
0 => array("pipe", "r"), // stdin
1 => array("pipe", "w"), // stdout
2 => array("pipe", "w"), // stderr
);
$process = proc_open($string, $descriptorspec, $pipes);
$stdout = stream_get_contents($pipes[1]);
fclose($pipes[1]);
$stderr = stream_get_contents($pipes[2]);
fclose($pipes[2]);
$ret = proc_close($process);
echo json_encode(
array(
'status' => $ret,
'errors' => str_replace("\n", "<br />", $stderr."<hr />"),
'output' => $stdout,
)
);
?>
I finally found a solution.
First, what is your server configuration? I'm using XAMPP, and in my case the Python interpreter and libraries used by Apache are not the same as the ones used by my Ubuntu system. So I built and installed Python 3 into /opt/lampp/bin and built OpenSSL myself. You can download OpenSSL 1.0.1 from its website and configure it with "./config shared" to build the shared libraries (.so files).
Next, copy the libssl.so* and libcrypto.so* files (4 files in total) into /opt/lampp/lib (this is where LD_LIBRARY_PATH points) and restart Apache.
It works for me; I hope it helps you.
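To check which environment the web server's child processes actually get, a small diagnostic like the following can help. It is only a sketch: child_environment is a hypothetical helper name, and it assumes a Unix-like system where the env command is available and shell_exec is not disabled.

```php
<?php
// Hypothetical diagnostic: report selected environment variables as seen by
// a child process started from PHP.
function child_environment(array $names): array
{
    $seen = array_fill_keys($names, null); // null means "not set in the child"

    // `env` prints one NAME=value pair per line.
    $output = (string) shell_exec('env');
    foreach (explode("\n", $output) as $line) {
        $parts = explode('=', $line, 2);
        if (count($parts) === 2 && array_key_exists($parts[0], $seen)) {
            $seen[$parts[0]] = $parts[1];
        }
    }

    return $seen;
}
```

Running print_r(child_environment(['PATH', 'LD_LIBRARY_PATH', 'LANG', 'LC_ALL'])) once from the CLI and once through the web server shows whether the two environments differ in the library path or locale.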

Environment is not passed to process opened by proc_open

I have a problem passing environment variables to processes that I open with proc_open.
I found the following example at http://us2.php.net/manual/en/function.proc-open.php:
<?php
$descriptorspec = array(
0 => array("pipe", "r"), // stdin is a pipe that the child will read from
1 => array("pipe", "w"), // stdout is a pipe that the child will write to
2 => array("file", "/tmp/error-output.txt", "a") // stderr is a file to write to
);
$cwd = '/tmp';
$env = array('some_option' => 'aeiou');
$process = proc_open('php', $descriptorspec, $pipes, $cwd, $env);
if (is_resource($process)) {
// $pipes now looks like this:
// 0 => writeable handle connected to child stdin
// 1 => readable handle connected to child stdout
// Any error output will be appended to /tmp/error-output.txt
fwrite($pipes[0], '<?php print_r($_ENV); ?>');
fclose($pipes[0]);
echo stream_get_contents($pipes[1]);
fclose($pipes[1]);
// It is important that you close any pipes before calling
// proc_close in order to avoid a deadlock
$return_value = proc_close($process);
echo "command returned $return_value\n";
}
?>
The example should echo the env array, as the documentation says. But on my machine (PHP 5.4.6-1ubuntu1.4 (cli)) the echoed array is empty. Are there Suhosin or php.ini restrictions that prevent passing environment variables to processes? I have no idea.
If $_ENV is empty, look at your variables_order ini setting and make sure the value contains E.
However, you can use $_SERVER instead:
fwrite($pipes[0], '<?php print_r($_SERVER); ?>');
It contains the environment variables too and should be available on virtually all servers (I guess).
Set
variables_order = "EGPCS"
in your php.ini
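Another way around an empty $_ENV is to read the variable in the child with getenv(), which is not governed by variables_order. This is a minimal sketch under stated assumptions: read_child_env is a hypothetical helper name, the array form of the command requires PHP 7.4 or later, and PHP_BINARY is assumed to point at a CLI-capable php executable (it may be empty under some web SAPIs).

```php
<?php
// Hypothetical helper: start a child PHP process with exactly one environment
// variable and return what the child reads back through getenv().
function read_child_env(string $name, string $value): string
{
    // Child script: print one environment variable via getenv().
    $childCode = 'echo getenv(' . var_export($name, true) . ');';

    $process = proc_open(
        [PHP_BINARY, '-r', $childCode], // array form bypasses the shell
        [1 => ['pipe', 'w']],           // only stdout is captured
        $pipes,
        null,
        [$name => $value]               // the child's entire environment
    );
    if (!is_resource($process)) {
        throw new RuntimeException('Could not start the child PHP process');
    }

    $output = (string) stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    proc_close($process);

    return $output;
}
```

Calling echo read_child_env('some_option', 'aeiou'); mirrors the manual's example but does not rely on $_ENV being populated in the child.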

Call a program via shell_exec with utf-8 text input

Prerequisites: hunspell and php5.
Test code from bash:
user#host ~/ $ echo 'sagadījās' | hunspell -d lv_LV,en_US
Hunspell 1.2.14
+ sagadīties
- works properly.
Test code (test.php):
$encoding = "lv_LV.utf-8";
setlocale(LC_CTYPE, $encoding); // test
putenv('LANG='.$encoding); // and another test
$raw_response = shell_exec("LANG=$encoding; echo 'sagadījās' | hunspell -d lv_LV,en_US");
echo $raw_response;
returns
Hunspell 1.2.14
& sagad 5 0: tagad, sagad?ties, sagaudo, sagand?, sagar?o
*
*
It seems that shell_exec cannot handle utf-8 correctly, or maybe some additional encoding/decoding is needed?
EDIT: I had to use en_US.utf-8 to get valid data.
Try this code:
<?php
// The word we are checking
$subject = 'sagadījās';
// We want file pointers for all 3 std streams
$descriptors = array (
0 => array("pipe", "r"), // STDIN
1 => array("pipe", "w"), // STDOUT
2 => array("pipe", "w") // STDERR
);
// An environment variable
$env = array(
'LANG' => 'lv_LV.utf-8'
);
// Try and start the process
if (!is_resource($process = proc_open('hunspell -d lv_LV,en_US', $descriptors, $pipes, NULL, $env))) {
die("Could not start Hunspell!");
}
// Put pipes into sensibly named variables
$stdIn = &$pipes[0];
$stdOut = &$pipes[1];
$stdErr = &$pipes[2];
unset($pipes);
// Write the data to the process and close the pipe
fwrite($stdIn, $subject);
fclose($stdIn);
// Display raw output
echo "STDOUT:\n";
while (!feof($stdOut)) echo fgets($stdOut);
fclose($stdOut);
// Display raw errors
echo "\n\nSTDERR:\n";
while (!feof($stdErr)) echo fgets($stdErr);
fclose($stdErr);
// Close the process pointer
proc_close($process);
?>
Don't forget to verify that the encoding of the file (and therefore the encoding of the data you are passing) actually is UTF-8 ;-)
