Google Translate API and Character Encoding - php

I am using the below PHP code with Google's Translate API and I have read that json_encode requires UTF-8 input, so was wondering how do I know if Google is returning UTF-8 encoded characters to me?
// URL Encode string
$str = urlencode($str);
// Make request
$response = file_get_contents('https://www.googleapis.com/language/translate/v2?key=' . GTRAN_KEY . '&target=es&source=en&q=' . $str);
// Decode json response to array
$json = json_decode($response,true);

if json_decode fails (for any reason, including charset-problems) it will return null, so you could check for that. to specifically chck the encoding, you could use mb_detect_encoding.
if(!mb_detect_encoding($response, 'UTF-8', true)){
// error: no utf-8
}else{
$json = json_decode($response,true);
if($json === null){
// error: json_decode failed (or google returned 'null')
}else{
// ok, do great stuff here
}
}

Related

PHP escape unicode characters only

in Facebook validation documentation
Please note that we generate the signature using an escaped unicode
version of the payload, with lowercase hex digits. If you just
calculate against the decoded bytes, you will end up with a different
signature. For example, the string äöå should be escaped to
\u00e4\u00f6\u00e5.
I'm trying to make a unittest for the validation that I have, but I don't seem to be able to produce the signutre because I can't escape the payload. I've tried
mb_convert_encoding($payload, 'unicode')
But this encodes all the payload, and not just the needed string, as Facebook does.
My full code:
// on the unittest
$content = file_get_contents(__DIR__.'/../Responses/whatsapp_webhook.json');
// trim whitespace at the end of the file
$content = trim($content);
$secret = config('externals.meta.config.app_secret');
$signature = hash_hmac(
'sha256',
mb_convert_encoding($content, 'unicode'),
$secret
);
$response = $this->postJson(
route('whatsapp.webhook.message'),
json_decode($content, true),
[
'CONTENT_TYPE' => 'text/plain',
'X-Hub-Signature-256' => $signature,
]
);
$response->assertOk();
// on the request validation
/**
* #var string $signature
*/
$signature = $request->header('X-Hub-Signature-256');
if (!$signature) {
abort(Response::HTTP_FORBIDDEN);
}
$signature = Str::after($signature, '=');
$secret = config('externals.meta.config.app_secret');
/**
* #var string $content
*/
$content = $request->getContent();
$payloadSignature = hash_hmac(
'sha256',
$content,
$secret
);
if ($payloadSignature !== $signature) {
abort(Response::HTTP_FORBIDDEN);
}
For one, mb_convert_encoding($payload, 'unicode') converts the input to UTF-16BE, not UTF-8. You would want mb_convert_encoding($payload, 'UTF-8').
For two, using mb_convert_encoding() without specifying the source encoding causes the function to assume that the input is using the system's default encoding, which is frequently incorrect and will cause your data to be mangled. You would want mb_convert_encoding($payload, 'UTF-8', $source_encoding). [Also, you cannot reliably detect string encoding, you need to know what it is.]
For three, mb_convert_encoding() is entirely the wrong function to use to apply the desired escape sequences to the data. [and good lord are the google results for "php escape UTF-8" awful]
Unfortunately, PHP doesn't have a UTF-8 escape function that isn't baked into another function, but it's not terribly difficult to write in userland.
function utf8_escape($input) {
$output = '';
for( $i=0,$l=mb_strlen($input); $i<$l; ++$i ) {
$cur = mb_substr($input, $i, 1);
if( strlen($cur) === 1 ) {
$output .= $cur;
} else {
$output .= sprintf('\\u%04x', mb_ord($cur));
}
}
return $output;
}
$in = "asdf äöå";
var_dump(
utf8_escape($in),
);
Output:
string(23) "asdf \u00e4\u00f6\u00e5"
Instead of trying to re-assemble the payload from the already decoded JSON, you should take the data directly as you received it.
Facebook sends Content-Type: application/json, which means PHP will not populate $_POST to begin with - but you can read the entire request body using file_get_contents('php://input').
Try and calculate the signature based on that, that should work without having to deal with any hassles of encoding & escaping.

Get JSON object from URL Difficulties

I'm having difficulties grabbing any of the JSON information from this URL.
I've tried other JSON snippets and they seem to work so I'm not sure if it's the way that the URL is structured or something.
Basic example below.
<?php
$json = file_get_contents('http://nhs-sh.cfpreview.co.uk/api/version/fetchLatestData?dataType=Clinics&versionNumber=-1&uuID=website&dt=');
$obj = json_decode($json);
echo "Body: " . $obj->Body;
?>
The link provided starts with
{ data :
which is valid javascript but invalid json. You can test it on http://jsonlint.com. To fix this we can replace the data with "data" :
$json = file_get_contents('http://nhs-sh.cfpreview.co.uk/api/version/fetchLatestData?dataType=Clinics&versionNumber=-1&uuID=website&dt=');
$obj = json_decode($json);
if (json_last_error() !== JSON_ERROR_NONE) { //check if there was an error decoding json
$json = '{ "data" :'. substr(trim($json), 8); // replace the first 8-1 characters with { "data" :
$obj = json_decode($json);
}
print_r($obj->data); //show contents of data
Please note that this fix is dependent on the data source e.g. if they change data to dataset. The correct measure would be to ask the developers to fix their json implementation.

How to consume clean JSON from twitter in php

May be I am missing something very basic here, but this is what I tried and in some cases my JSON is not a valid JSON.
Code -
$hash_tag = $_POST['hash_tag'];
$url = 'https://api.twitter.com/1.1/search/tweets.json';
$getfield = "?q=#".$hash_tag."&count=30";
$requestMethod = 'GET';
$twitter = new TwitterAPIExchange($settings);
$data = $twitter->setGetfield($getfield)
->buildOauth($url, $requestMethod)
->performRequest();
$tdata = json_decode($data);
$newArr0 = array();
foreach($tdata as $k=>$v) {
if($k == "statuses")
$newArr0 = $v;
}
foreach ($newArr0 as &$nval1) {
unset($nval1->source);
if(isset($nval1->retweeted_status)) {
unset($nval1->retweeted_status);
}
}
//stripslashes($newArr0);
$myarray = array('response'=>'1','message'=>'Tweet result', 'tweet_data'=>$newArr0);
echo json_encode($myarray);
Like I searched for Britney but it gives me an Invalid JSON - JSON
Error -
Parse error on line 623:
... "location": "WVU 2017 \u
-----------------------^
Expecting 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '['
But If I search "sachin" I am getting Valid JSON, Let me know what I am doing wrong here, as I need to send the JSON encoded stream to mobile devices from back end.
The JSON response contains no errors. The error is that you are pasting the response into http://jsonlint.com/ or some other service which clearly cuts the string at exactly the offset you are showing the error.
Stop using the service for such long string and remember to check what you pasted to the original response you got in PHP.
Also, when you have problems like this, cut your logic into smaller parts. Such as
$tdata = json_decode($data);
after this do a
var_dump($tdata);
die;
as you want to cut out whatever you are doing after getting and decoding the response and narrowing the reason to the failure. The rest of the code is when debugging the response, just noise.
If you only run the code in top, and not manually copy pasting it, it works fine..

PHP: json_decode not working

This does not work:
$jsonDecode = json_decode($jsonData, TRUE);
However if I copy the string from $jsonData and put it inside the decode function manually it does work.
This works:
$jsonDecode = json_decode('{"id":"0","bid":"918","url":"http:\/\/www.google.com","md5":"6361fbfbee69f444c394f3d2fa062f79","time":"2014-06-02 14:20:21"}', TRUE);
I did output $jsonData copied it and put in like above in the decode function. Then it worked. However if I put $jsonData directly in the decode function it does not.
var_dump($jsonData) shows:
string(144) "{"id":"0","bid":"918","url":"http:\/\/www.google.com","md5":"6361fbfbee69f444c394f3d2fa062f79","time":"2014-06-02 14:20:21"}"
The $jsonData comes from a encrypted $_GET variable. To encrypt it I use this:
$key = "SOME KEY";
$iv_size = mcrypt_get_iv_size(MCRYPT_BLOWFISH, MCRYPT_MODE_ECB);
$iv = mcrypt_create_iv($iv_size, MCRYPT_RAND);
$enc = mcrypt_encrypt(MCRYPT_BLOWFISH, $key, $data, MCRYPT_MODE_ECB, $iv);
$iv = rawurlencode(base64_encode($iv));
$enc = rawurlencode(base64_encode($enc));
//To Decrypt
$iv = base64_decode(rawurldecode($_GET['i']));
$enc = base64_decode(rawurldecode($_GET['e']));
$data = mcrypt_decrypt(MCRYPT_BLOWFISH, $key, $enc, MCRYPT_MODE_ECB, $iv);
some time there is issue of html entities, for example \" it will represent like this \&quot, so you must need to parse the html entites to real text, that you can do using
html_entity_decode()
method of php.
$jsonData = stripslashes(html_entity_decode($jsonData));
$k=json_decode($jsonData,true);
print_r($k);
You have to use preg_replace for avoiding the null results from json_decode
here is the example code
$json_string = stripslashes(html_entity_decode($json_string));
$bookingdata = json_decode( preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $json_string), true );
Most likely you need to strip off the padding from your decrypted data. There are 124 visible characters in your string but var_dump reports 144. Which means 20 characters of padding needs to be removed (a series of "\0" bytes at the end of your string).
Probably that's 4 "\0" bytes at the end of a block + an empty 16-bytes block (to mark the end of the data).
How are you currently decrypting/encrypting your string?
Edit:
You need to add this to trim the zero bytes at the end of the string:
$jsonData = rtrim($jsonData, "\0");
Judging from the other comments, you could use,
$jsonDecode = json_decode(trim($jsonData), TRUE);
While moving on php 7.1 I encountered with json_decode error number 4 (json syntex error). None of the above solution on this page worked for me.
After doing some more searching i found solution at https://stackoverflow.com/a/15423899/1545384 and its working for me.
//Remove UTF8 Bom
function remove_utf8_bom($text)
{
$bom = pack('H*','EFBBBF');
$text = preg_replace("/^$bom/", '', $text);
return $text;
}
Be sure to set header to JSON
header('Content-type: application/json;');
str_replace("\t", " ", str_replace("\n", " ", $string))
because json_decode does not work with special characters. And no error will be displayed. Make sure you remove tab spaces and new lines.
Depending on the source you get your data, you might need also:
stripslashes(html_entity_decode($string))
Works for me:
<?php
$sql = <<<EOT
SELECT *
FROM `students`;
EOT;
$string = '{ "query" : "' . str_replace("\t", " ", str_replace("\n", " ", $sql)).'" }';
print_r(json_decode($string));
?>
output:
stdClass Object
(
[query] => SELECT * FROM `students`;
)
I had problem that json_decode did not work, solution was to change string encoding to utf-8. This is important in case you have non-latin characters.
Interestingly mcrypt_decrypt seem to add control characters other than \0 at the end of the resulting text because of its padding algorithm. Therefore instead of rtrim($jsonData, "\0")
it is recommended to use
preg_replace( "/\p{Cc}*$/u", "", $data)
on the result $data of mcrypt_decrypt. json_decode will work if all trailing control characters are removed. Pl refer to the comment by Peter Bailey at http://php.net/manual/en/function.mdecrypt-generic.php .
USE THIS CODE
<?php
$json = preg_replace('/[[:cntrl:]]/', '', $json_data);
$json_array = json_decode($json, true);
echo json_last_error();
echo json_last_error_msg();
print_r($json_array);
?>
Make sure your JSON is actually valid. For some reason I was convinced that this was valid JSON:
{ type: "block" }
While it is not. Point being, make sure to validate your string with a linter if you find json_decode not te be working.
Try the JSON validator.
The problem in my case was it used ' not ", so I had to replace it to make it working.
In notepad+ I changed encoding of json file on: "UTF-8 without BOM".
JSON started to work
TL;DR Be sure that your JSON not containing comments :)
I've taken a JSON structure from API reference and tested request using Postman. I've just copy-pasted the JSON and didn't pay attention that there was a comment inside it:
...
"payMethod": {
"type": "PBL" //or "CARD_TOKEN", "INSTALLMENTS"
},
...
Of course after deletion the comment json_decode() started working like a charm :)
Use following function:
If JSON_ERROR_UTF8 occurred :
$encoded = json_encode( utf_convert( $responseForJS ) );
Below function is used to encode Array data recursively
/* Use it for json_encode some corrupt UTF-8 chars
* useful for = malformed utf-8 characters possibly incorrectly encoded by json_encode
*/
function utf_convert( $mixed ) {
if (is_array($mixed)) {
foreach ($mixed as $key => $value) {
$mixed[$key] = utf8ize($value);
}
} elseif (is_string($mixed)) {
return mb_convert_encoding($mixed, "UTF-8", "UTF-8");
}
return $mixed;
}
Maybe it helps someone, check in your json string if you have any NULL values, json_decode will not work if a NULL is present as a value.
This super basic function may help you. I made the NULL in an array just in case I need to add more stuff in the future.
function jsonValueFix($json){
$json = str_replace( array('NULL'),'""',$json );
return $json;
}
I just used json_decode twice and it worked for me
$response = json_decode($apiResponse, true);
$response = json_decode($response, true);

Convert "Java source code characters" in JSON string using PHP

I have a JSON string that contains Dal\u00e9. When I use json_decode on the JSON, it is converted to Dalé, however the original string that the JSON is from is Dalé. Why is this not converted properly?
I have found that "\u00E9" is the C/C++/Java source code encoding for é. However, to me this doesn't answer why this is going wrong.
Example of incorrect PHP output:
<?php
$opts = array('http'=>array('ignore_errors' => true));
$context = stream_context_create($opts);
$jsonurl = "http://api.kivaws.org/v1/loans/552804.json";
$json = file_get_contents($jsonurl, false, $context);
$json_output = array(json_decode($json));
$json_error = $json_output[0]->error;
$json_message = $json_error->message;
foreach ($json_output[0]->{'loans'} as $loan) {
echo 'Name: '.$loan->{'name'};
}
?>
You need to tell the web browser what encoding you are giving it.
<?php
header('content-type: text/plain; charset=utf-8');
var_dump(json_decode($jsonStr));
if you are using php 5.4 you may use the function options of json_encode() like this :-
echo $b=json_encode('Dalé',JSON_UNESCAPED_UNICODE);
echo json_decode($b);

Categories