PHP passthru: unable to get full response from python script

I'm trying to get data from Python script:
import pymorphy2
import json
import sys
morph = pymorphy2.MorphAnalyzer()
butyavka = morph.parse(sys.argv[1])[0]
for item in butyavka.lexeme:
    print(item.word)
PHP code:
<?php
chdir('C:\\Users\Michael-PC\AppData\Local\Programs\Python\Python35-32');
$out;
passthru('python WordAnalizator.py "слово"', $out);
echo($out);
?>
If I run it from the console, it returns the correct response, like:
But in PHP I get only the first word:
What's wrong?

This is obviously an encoding problem (the Russian letters become unreadable). Try setting the encoding explicitly in the PHP code instead of relying on the default, e.g. send a header that declares Unicode:
header('Content-Type: text/html; charset=utf-8');
If charset=utf-8 does not help, try charset=windows-1251 instead.
UPDATE:
Do not forget to save your PHP file in the matching encoding (UTF-8 for charset=utf-8, or ANSI for charset=windows-1251).
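A related measure on the Python side is to make the script emit UTF-8 bytes explicitly, so that the result no longer depends on the Windows code page Python picks when its output goes to a pipe. The following is a minimal sketch, not the asker's script: the helper names encode_words and emit and the sample word list are assumptions made for illustration.

```python
import io
import sys


def encode_words(words, encoding="utf-8"):
    """Join the words one per line and encode them explicitly."""
    return "".join(word + "\n" for word in words).encode(encoding)


def emit(words, stream=None):
    """Write the encoded words as raw bytes, bypassing the console code page."""
    if stream is None:
        stream = sys.stdout.buffer  # binary layer underneath sys.stdout
    stream.write(encode_words(words))
    stream.flush()


# Hypothetical sample data standing in for the lexeme forms.
demo = io.BytesIO()
emit(["слово", "слова"], stream=demo)

# The same word encoded as windows-1251 is not valid UTF-8, which is why
# a charset mismatch between producer and page shows up as unreadable text.
mismatch = "слово".encode("cp1251").decode("utf-8", errors="replace")
```

With this in place the PHP page can declare charset=utf-8 and pass the bytes through unchanged.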

Related

PHP read a line from a csv file return wrong in charset

I have a CSV file. If I set the charset to ISO-8859-2 (Eastern Europe) in LibreOffice Calc, it renders the characters correctly, but the server's locale is set to EN-UK
and I cannot read the characters correctly, for example:
it returns: T�t instead of Tót.
I tried many things like:
echo (mb_detect_encoding("T�t","ISO-8859-2","UTF-8"));
I know probably the char does not exist in UTF-8 but I tried.
Also tried to setup the correct charset in the header:
header('Content-Type: text/html; charset=iso-8859-2');
echo "T�th";
but it returns: TÄĹźËth instead of Tóth.
Please help me solve this, thanks in advance

I advise against setting the header to charset=iso-8859-2. It is usual to work with UTF-8. If the data is available in a different encoding, it should be converted to UTF-8 and then processed as CSV. The following example can be kept this simple because the newline characters are encoded identically in UTF-8 and ISO-8859-2.
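$fileName = "yourpath/Iso8859_2.csv";
$fp = fopen($fileName, "r");
// Compare against false so that a line containing only "0" does not end the loop.
while (($row = fgets($fp)) !== false) {
    // Convert each raw line from ISO-8859-2 to UTF-8 before parsing it as CSV.
    $strUtf8 = mb_convert_encoding($row, 'UTF-8', 'ISO-8859-2');
    $arr = str_getcsv($strUtf8);
    var_dump($arr);
}
fclose($fp);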
The exact encoding of the CSV file must be known. mb_detect_encoding is not suitable for determining the encoding of a file.
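The same decode-first, parse-second order can be sketched in Python. This is only an illustration under assumptions: the function name read_latin2_csv and the sample rows are made up, and the source encoding is taken as known, as the answer requires.

```python
import csv
import io


def read_latin2_csv(raw: bytes):
    """Decode ISO-8859-2 bytes to text first, then parse the text as CSV."""
    text = raw.decode("iso-8859-2")
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical file content, encoded the way the CSV file is stored.
sample = "Tóth,Budapest\nKovács,Győr\n".encode("iso-8859-2")
rows = read_latin2_csv(sample)

# Reading the same bytes as UTF-8 produces the replacement character
# seen in the question instead of the accented letter.
misread = sample.decode("utf-8", errors="replace")
```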

Delphi and PHP UTF-8 Returned Data

I have a PHP application that calculates all my data and returns the answers to an HTML page via AJAX and jQuery.
At the top of my PHP code I add
header("Content-Type: text/html; charset=utf-8");
and return the data with
echo json_encode($returndata);
I also save my file in UTF-8 format.
Everything works well with HTML, but when I fetch the PHP response from another program, for example Delphi's IdHTTP, the Arabic characters show up like: \u06f2 \u0634\u0647\u0631\u06cc\u0648\u0631 \u06f1\u06f3\u06f9\u06f9 \u06f1\u06f4:\u06f1\u06f3
I run PHP on a server with IIS.
Here is my Delphi code:
try
  RespJson := IdHTTP1.Post('http://192.168.0.6:1000/allcalculate.php', data);
finally
  data.Free;
end;
delete(RespJson, length(RespJson), 1);
delete(RespJson, 1, 1);
RespJson := StripChars(RespJson, ['"']);
arrresp := splitstring(RespJson, ',');
arrresp := splitstring(arrresp[30], ':');
advedit48.Text := arrresp[1];
How can I fix this problem?
Thank you, and sorry for my bad English.
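For context, the \uXXXX sequences in that response are standard JSON string escapes rather than damaged text: json_encode escapes non-ASCII characters by default, and a JSON parser turns them back into characters, whereas splitting the raw string by hand leaves them escaped. The sketch below shows the round trip in Python using the fragment quoted above; PHP's JSON_UNESCAPED_UNICODE flag for json_encode is the counterpart of ensure_ascii=False here.

```python
import json

# Fragment copied from the response above; in a raw string the
# backslashes stay literal, exactly as they arrive over HTTP.
escaped = r'"\u06f2 \u0634\u0647\u0631\u06cc\u0648\u0631 \u06f1\u06f3\u06f9\u06f9 \u06f1\u06f4:\u06f1\u06f3"'

# A JSON parser resolves the escapes into the real characters.
decoded = json.loads(escaped)

# Default serialisation escapes non-ASCII again; ensure_ascii=False
# keeps the characters as they are.
re_escaped = json.dumps(decoded)
unescaped = json.dumps(decoded, ensure_ascii=False)
```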

Encoding bug with an image and file_get_contents in PHP

I use this code to retrieve and display an image:
header("Content-type: image/png");
echo file_get_contents(site_domain().image_asset_module_url("header.png",$this->name));
On my local WAMP it works, but on the remote server file_get_contents returns a wrongly encoded string:
Local:
‰PNG IHDR^jRÀ2¡ pHYsÒÝ~üÿÿIDATxÚì½˜Uõµþ¿`ŠŠÔéÃÕ¨¹&&ù'77¹i¦˜è‰=V:RlH‡™aAlH™B¯Jbh...
Remote:
�PNG IHDR^jR�2� pHYs��~���IDATx����U����`������ը�&&�'77�i��草=V:Rl...
If I use utf8_encode I get:
PNG IHDR^jRÀ2¡ pHYsÒÝ~üÿÿIDATxÚì½Uõµþ¿`ÔéÃÕ¨¹&&ù'77¹i¦è=V:RlHaAlHB¯Jbh...
So I always get a broken picture on my remote server. Why, and what is the solution?

The data is always the same. file_get_contents does not alter data in any way. You're also not dealing with text in some encoding, but with binary data. Any sort of text-encoding or conversion thereof does not apply here.
Your first sample is the binary image data as interpreted as Latin-1 encoded text.
Your second sample is the same binary data as interpreted as UTF-8 encoded text.
I.e., the data is fine, the interpretation is wrong. The interpretation should be set by the Content-Type header; perhaps this is not being set correctly on the remote server. For this problem, inspect the raw HTTP response headers and see How to fix "Headers already sent" error in PHP.
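The difference between the two samples can be reproduced with the first four bytes of any PNG file. This is a small sketch; it uses windows-1252 for the "Latin-1" reading, on the assumption that the page was rendered by a browser, since browsers treat the Latin-1 label as windows-1252.

```python
# The fixed 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Read as windows-1252 text, byte 0x89 becomes the per-mille sign.
as_cp1252 = PNG_SIGNATURE[:4].decode("cp1252")

# Read as UTF-8 text, byte 0x89 is invalid on its own and is shown
# as the replacement character U+FFFD.
as_utf8 = PNG_SIGNATURE[:4].decode("utf-8", errors="replace")

# The underlying bytes are identical in both cases.
same_bytes = as_cp1252.encode("cp1252") == PNG_SIGNATURE[:4]
```

Both strings come from the same bytes; only the declared text encoding differs, which matches the local and remote samples above.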

I would rather have used
<?php
$file = 'http://url/to_image.png';
$data = file_get_contents($file);
header('Content-type: image/png');
echo $data;
Or you can try this:
$remoteImage = "http://www.example.com/gifs/logo.gif";
$imginfo = getimagesize($remoteImage);
// Braces are required around a quoted array key inside a double-quoted string.
header("Content-type: {$imginfo['mime']}");
readfile($remoteImage);

shell_exec() to call a php file returns incorrect output character encoding

Ok, here's my scenario.
I have file.php that contains the following:
<?php
$output = shell_exec("php output.php");
echo $output;
?>
And the output.php contains the following:
<?php
echo "This is my output!";
?>
When I run file.php from a web browser, I get the following output:
‹ ÉÈ,V¢ÜJ…üÒ’‚ÒEÿÿp³*š
However, when I run the same php output.php directly from the shell, I get the correct output:
This is my output!
Now I'm well aware that this is some sort of encoding issue, but I cannot for the life of me figure out how to resolve it. I've tried setting the language using putenv('LANG=en_US.UTF-8');. I also tried using header('Content-Type: text/html; charset=UTF-8'); and even tried to determine what encoding is being output using mb_detect_encoding($out, 'UTF-8', true);, all without result.
exec() produces the same malformed output.
I would really appreciate it if anyone could shed some light on this and explain what happens between shell_exec and the output of the file that causes the output to be malformed.

The problem was the PHP output was being compressed twice, due to output compression being enabled.
The solution is to disable zlib.output_compression either by an entry in your .htaccess file, or by including the following at the top of your .php file:
ini_set('zlib.output_compression', 'Off');
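Why double compression looks like an encoding problem can be sketched in Python with the gzip module. This is an illustration under the assumption that the client undoes exactly one layer of compression, as a browser does for a gzip-encoded response.

```python
import gzip

body = b"This is my output!"

# The inner script's output is compressed, then the outer page
# compresses the already-compressed bytes again.
once = gzip.compress(body)
twice = gzip.compress(once)

# The client removes only one layer, so what it displays is still a
# gzip stream, recognisable by its magic bytes 0x1f 0x8b.
after_one_pass = gzip.decompress(twice)
still_compressed = after_one_pass[:2] == b"\x1f\x8b"

# Byte 0x8b read as windows-1252 is the angle quote at the start of
# the garbled output quoted in the question.
first_visible = b"\x8b".decode("cp1252")

# A second pass recovers the original text.
recovered = gzip.decompress(after_one_pass)
```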

fwrite() and UTF8

I am creating a file using php fwrite() and I know all my data is in UTF8 ( I have done extensive testing on this - when saving data to db and outputting on normal webpage all work fine and report as utf8.), but I am being told the file I am outputting contains non utf8 data :( Is there a command in bash (CentOS) to check the format of a file?
When using vim it shows the content as:
Donâ~#~Yt do anything .... Itâ~#~Ys a
great site with
everything....Weâ~#~Yve only just
launched/
Any help would be appreciated: Either confirming the file is UTF8 or how to write utf8 content to a file.
UPDATE
To clarify how I know my data is in UTF8, I have done the following:
The DB is set to utf8. When saving data to the database I run this first:
$enc = mb_detect_encoding($data);
$data = mb_convert_encoding($data, "UTF-8", $enc);
Just before I run fwrite I check the data with the following (note that each piece of data returns 'IS utf-8'):
if (strlen($data)==mb_strlen($data, 'UTF-8')) print 'NOT UTF-8';
else print 'IS utf-8';
Thanks!

If you know the data is in UTF8 then you want to set up the header.
I wrote a solution in answer to another thread.
The solution is the following: as the UTF-8 byte-order mark is \xef\xbb\xbf, we should add it to the start of the document.
<?php
function writeStringToFile($file, $string){
    $f = fopen($file, "wb");
    // Prepend the BOM to the content (not to the file name); this is what makes the magic.
    $string = "\xEF\xBB\xBF" . $string;
    fputs($f, $string);
    fclose($f);
}
?>
You can adapt it to your code; basically you just want to make sure that you write a UTF8 file (as you said, you know your content is UTF8 encoded).
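The same idea, sketched in Python so the resulting bytes can be checked: the BOM goes in front of the content, and the file is opened in binary mode. The function name write_utf8_with_bom, the temporary path, and the sample text are assumptions for illustration.

```python
import codecs
import os
import tempfile


def write_utf8_with_bom(path, text):
    """Write the UTF-8 BOM followed by the UTF-8 encoded text."""
    with open(path, "wb") as handle:
        handle.write(codecs.BOM_UTF8)        # b"\xef\xbb\xbf"
        handle.write(text.encode("utf-8"))


# Hypothetical target file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "bom_demo.txt")
write_utf8_with_bom(path, "Don\u2019t do anything")

with open(path, "rb") as handle:
    raw = handle.read()

# The "utf-8-sig" codec strips the BOM again when reading.
text_back = raw.decode("utf-8-sig")
```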

fwrite() itself is documented as binary-safe, but the mode the file was opened in can still matter. That means your data, correctly encoded or not, might get altered by the underlying routines if the stream is not in binary mode.
To be on the safe side, you should use fopen() with the binary mode flag, that is b. Afterwards, fwrite() will save your string data "as-is", which in PHP is binary data, because strings in PHP are binary strings.
Background: Some systems distinguish between text and binary data. The binary flag explicitly tells PHP on such systems to use binary output. When you deal with UTF-8 you should take care that the data does not get mangled. That is prevented by handling the string data as binary data.
However: if, contrary to what you said in your question, the UTF-8 encoding of the data was not preserved, then your encoding is already broken and even binary-safe handling will keep it broken. With the binary flag you at least ensure that it is not the fwrite() part of your application that is breaking things.
As has rightly been written in another answer here, you cannot know the encoding if you only have the data. However, you can check whether the data is valid UTF-8, which gives you at least some chance to check the encoding. I've posted a PHP function that does this in a UTF-8 related question, so it might be of use if you need to debug things: Answer to: SimpleXML and Chinese. Look for can_be_valid_utf8_statemachine; that's the name of the function.
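A validity check of this kind can be sketched in a few lines of Python. Note that this is not the state-machine function named above; it swaps in a strict decode attempt, and the name is_valid_utf8 is made up for the example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        data.decode("utf-8")  # strict by default: raises on malformed input
    except UnicodeDecodeError:
        return False
    return True


good = "Don\u2019t".encode("utf-8")   # curly apostrophe as UTF-8
bad = b"Don\x92t"                     # the same apostrophe as windows-1252

# Double-encoded text is still well-formed UTF-8, so it passes.
double = good.decode("latin-1").encode("utf-8")
```

Validity only shows that the bytes are well formed. The double-encoded sample passes the check even though it no longer says what the author wrote.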

//add BOM to fix UTF-8 in Excel
fputs($fp, $bom =( chr(0xEF) . chr(0xBB) . chr(0xBF) ));
I find this piece works for me :)

The problem is your data is double encoded. I assume your original text is something like:
Don’t do anything
with ’, i.e., not the straight apostrophe, but the right single quotation mark.
If you write a PHP script with this content and encoded in UTF-8:
<?php
//File in UTF-8
echo utf8_encode("Don’t"); //this will double encode
You will get something similar to your output.
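The byte-level effect can be sketched in Python, where decoding as Latin-1 and re-encoding as UTF-8 plays the role of utf8_encode (which converts from ISO-8859-1 to UTF-8). The variable names are chosen for the example only.

```python
original = "Don\u2019t"                 # right single quotation mark

# Correct UTF-8: the apostrophe is the three bytes E2 80 99.
utf8_bytes = original.encode("utf-8")

# Treating those bytes as Latin-1 and encoding again turns each of
# the three bytes into its own two-byte sequence.
double_encoded = utf8_bytes.decode("latin-1").encode("utf-8")

# Reversing the extra step restores the original text.
repaired = double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")
```

The first of the three resulting characters is â, which matches the start of the sequence vim displays in the question.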

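$handle = fopen($file, "w");
// Write the UTF-8 byte-order mark first.
fwrite($handle, pack("CCC", 0xef, 0xbb, 0xbf));
// Then write the content ($data: the UTF-8 string to store), not the file name.
fwrite($handle, $data);
fclose($handle);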

"I know all my data is in UTF8" - wrong.
An encoding is not the format of a file. So, check the charset in the headers of the page you are taking the data from:
header("Content-type: text/html; charset=utf-8;");
And check whether the data really contains multi-byte characters:
if (strlen($data)==mb_strlen($data, 'UTF-8')) print 'not UTF-8';
else print 'utf-8';
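What this length comparison can and cannot tell you is easy to see in a Python sketch. The function name has_multibyte_chars is made up here, and the check assumes the input is well-formed UTF-8.

```python
def has_multibyte_chars(data: bytes) -> bool:
    """Compare the byte count with the character count, like strlen vs mb_strlen."""
    return len(data) != len(data.decode("utf-8"))


ascii_only = b"plain ASCII text"            # valid UTF-8, one byte per character
with_quote = "Don\u2019t".encode("utf-8")   # contains a three-byte character
double = with_quote.decode("latin-1").encode("utf-8")  # double-encoded
```

Pure ASCII reports "not UTF-8" under this test although it is valid UTF-8, and double-encoded text reports "utf-8" although it is already damaged, so the comparison only detects the presence of multi-byte sequences.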

There may be a reason for this:
first, the information you get from the database may not be UTF-8.
If you are sure that it is, use this; I always use it and it works:
$file= fopen('../logs/logs.txt','a');
fwrite($file,PHP_EOL."_____________________output_____________________".PHP_EOL);
fwrite($file,print_r($value,true));

The only thing I had to do is add a UTF8 BOM to the CSV, the data was correct but the file reader (external application) couldn't read the file properly without the BOM

Try this simple method: add the following to the top of the page, before the <body> tag:
<head>
<meta charset="utf-8">
</head>
