PHP string parameter best practice

I have a PHP function that internally builds an object graph using, among other things, one string parameter, then uses json_encode() to create a JSON string and posts that string to a remote web service.
Like this:
function send($text)
{
    $payload = array(
        'text' => $text
        // Set additional properties here ...
    );
    $payload_json = json_encode($payload);
    // Post $payload_json to remote service with Curl ...
}
From the manual of json_encode (http://php.net/manual/en/function.json-encode.php):
"All string data must be UTF-8 encoded."
I see a couple of options:
Attempt to validate that $text is in fact UTF-8 and throw an exception if it is not
Attempt to detect the encoding in $text and convert it to UTF-8 if necessary
Return false when $text is not UTF-8
Communicate with my API users in documentation that $text must be UTF-8
Check for error with json_last_error() and throw an exception if an error was encountered
What is the best practice?

You should always communicate to the users of the API what you're expecting.
If you expect UTF-8 encoded text, say so in your documentation.
Once that is documented, return a descriptive error when the input is not UTF-8, such as "Invalid encoding. For more information, read the documentation: link", where link points to the relevant page for the call that failed.
This way the responsibility is no longer yours: developers who use your API will know what went wrong, and you don't have to work around bad input inside your API.
You're the developer and it's your API, with its own rules. If people want to use it, they need to follow the rules you set.
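A minimal sketch of that advice, assuming the check lives in a helper of its own. The name encode_payload, the exception types, and the message wording are all hypothetical choices for illustration, not something the question prescribes. It rejects non-UTF-8 input with a descriptive error and still checks the return value of json_encode() as a backstop (json_last_error_msg() needs PHP 5.5 or later):

```php
<?php
// Hypothetical helper: validate the string, then encode the payload.
function encode_payload($text)
{
    // preg_match() with the /u modifier fails on byte sequences that are
    // not valid UTF-8, so an empty pattern works as an encoding check.
    if (preg_match('//u', $text) !== 1) {
        throw new InvalidArgumentException(
            'Invalid encoding: $text must be UTF-8. See the API documentation.'
        );
    }

    $payload_json = json_encode(array('text' => $text));

    // Backstop: json_encode() returns false when encoding fails.
    if ($payload_json === false) {
        throw new RuntimeException('json_encode failed: ' . json_last_error_msg());
    }

    return $payload_json;
}
```

If the mbstring extension is available, mb_check_encoding($text, 'UTF-8') is an alternative to the preg_match() check.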

Related

How to parse invalid JSON with PHP

I'm working on a careers listing page for our company. We are using an API to retrieve the information from our HR software provider. The JSON appears to be invalid. I'm using the below to test it.
<?php
//Testing if ADPs json is valid
$json = json_decode($jsondata);
if (json_last_error() === JSON_ERROR_NONE) {
    // $json contains a valid json string. It's ready to use.
    print_r($jobdata);
} else {
    // oops, it's not valid JSON.
    echo '<h2>'.'We\'re sorry. We are unable to list jobs right now. Please contact'.' careers#domain.com'.'</h2>';
}
?>
Is there a way I can parse the invalid JSON? It appears that there is an unexpected bracket somewhere.

json_decode is failing for a reason. Do you really want to deserialize malformed data and further base your logic on such data?
Before manually decoding anything, check the following:
Is the server returning data in a character encoding your application does not expect? This is especially likely if your backend runs on a Linux-based server and the third-party API runs on a Windows one.
Is the JSON being encoded a second time by your application's internal logic? A good test is to look for encoded HTML entities and decode them with html_entity_decode().
In my experience, if such problems persist, try a wrapper that automates the steps above for you (and translates semi-JSON expressions such as MongoDB queries and JavaScript expressions):
https://github.com/zendframework/zend-json
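The HTML-entity check described above could be sketched roughly as follows. The function name decode_api_json is hypothetical, and the sketch assumes the body is already UTF-8. It tries a plain json_decode() first and only falls back to html_entity_decode() when that fails, because decoding entities in a body that is already valid JSON could corrupt string values:

```php
<?php
// Hypothetical helper: decode a response body, retrying once after
// undoing HTML-entity encoding.
function decode_api_json($raw)
{
    $data = json_decode($raw, true);
    if (json_last_error() === JSON_ERROR_NONE) {
        return $data;
    }

    // Second attempt: the body may have been entity-encoded on the way in
    // (for example &quot; instead of a double quote).
    $data = json_decode(html_entity_decode($raw, ENT_QUOTES, 'UTF-8'), true);
    if (json_last_error() !== JSON_ERROR_NONE) {
        // Still malformed: report the error rather than guessing further.
        throw new UnexpectedValueException('Invalid JSON: ' . json_last_error_msg());
    }

    return $data;
}
```

Anything that still fails after this step is better reported to the API provider than patched by hand.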

Trouble decoding Json from Google

Well, I'm trying to retrieve some info, from the Google Suggest tool.
The thing is, the json returned after the request, doesn't seem decode-able (using json_decode) + JSONLint sees it as "invalid".
What's wrong?
{
e: "GooDUs7lFIeXO63LgBA",
c: 0,
u: "https://www.google.com/s?gs_rn\x3d24\x26gs_ri\x3dpsy-ab\x26tok\x3dt8ORbtI13MEFLoCQjPSv6w\x26cp\x3d2\x26gs_id\x3d3i\x26xhr\x3dt\x26q\x3dtemplate\x26es_nrs\x3dtrue\x26pf\x3dp\x26safe\x3doff\x26sclient\x3dpsy-ab\x26oq\x3d\x26gs_l\x3d\x26pbx\x3d1\x26bav\x3don.2,or.r_cp.r_qf.\x26bvm\x3dbv.50500085,d.bGE\x26fp\x3dc513cf9c63a02102\x26biw\x3d1304\x26bih\x3d437\x26tch\x3d1\x26ech\x3d20\x26psi\x3dFYkDUs-xCsrT4QTD9YGwDw.1375963413783.1",
p: true,
d: "[\x22template\x22,[[\x22template\\u003cb\\u003es\\u003c\\/b\\u003e\x22,0],[\x22template\\u003cb\\u003e monster\\u003c\\/b\\u003e\x22,0],[\x22template\\u003cb\\u003e c++\\u003c\\/b\\u003e\x22,0],[\x22template\\u003cb\\u003es for pages\\u003c\\/b\\u003e\x22,0]],{\x22t\x22:{\x22bpc\x22:false,\x22tlw\x22:false},\x22q\x22:\x22YjrI_EdhVrEkZrkqZwaGIJ_Ih4c\x22,\x22j\x22:\x223i\x22}]"
}
That's what JSONLint gives as an error :
Parse error on line 1:
{ e: "GooDUs7lFIeXO63L
-----^
Expecting 'STRING', '}'
P.S. Even after editing it like "e": and so on, it still gives out error regarding the value of u and claiming that it was expecting a STRING or NUMBER etc... :S

The code given in the question is not valid JSON.
To be valid JSON, the field names must be in double quotes. There are no quotes around the e name, or any of the others.
This is what the JSON decoder is complaining about: It is expecting to see "e", not e.
In addition, JSON does not accept the \x escaping format (a two-digit hexadecimal character reference); it only accepts the \u format (a Unicode code point written as four hexadecimal digits). The code you've provided includes escaped characters in both formats.
The question is, are you using an official Google API? Because they're usually pretty good at providing valid JSON. This isn't valid JSON, so it may be that you're not using the correct API. Another clue is that the variable names aren't very meaningful; official APIs would normally give more meaningful variable names. If it is the correct API, you should try raising a ticket with Google to fix it; broken JSON is not good, but it should be pretty trivial for them to fix.
Assuming you can't get them to fix it and we can't find an alternative API location that does give valid data, how do we deal with what we've got?
While this code may not be valid JSON, it is valid as a JavaScript object (the JSON rules are stricter than those of plain JavaScript). It could therefore be run in a JavaScript interpreter using eval(), if you trusted it enough for that.
The only other alternative is to fix the string prior to parsing it so that the variable names are quoted. That's a bit of a pain, but would be doable if the output was consistent. You'll have problems though if it ever changes (and again, if it's an unofficial API, that could happen at any time without warning).
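The two syntax rules can be checked directly with json_decode(). This is a small illustrative sketch: the sample strings are made up for the demonstration and are not taken from Google's response.

```php
<?php
// Single-quoted PHP strings keep backslashes as-is, so json_decode()
// receives the escape sequences literally.

// Unquoted key: rejected.
$unquoted_key = json_decode('{e: "x"}');
$unquoted_error = json_last_error();

// \x escape inside a string: rejected.
$hex_escape = json_decode('{"u": "a\x3db"}');
$hex_error = json_last_error();

// \u escape with four hex digits: accepted, \u003d is "=".
$unicode_escape = json_decode('{"u": "a\u003db"}', true);

var_dump($unquoted_key, $hex_escape, $unicode_escape);
```

The first two calls return null and set a syntax error, while the third yields an array whose "u" entry is the string a=b.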

The problem is with the backslashes in the strings (used for the escape characters).
In PHP 5.4 and later, you can use JSON_UNESCAPED_SLASHES:
echo json_encode(JSON_STRING, JSON_UNESCAPED_SLASHES);
Otherwise, you can do the replacement yourself:
str_replace('\\/', '/', json_encode(JSON_STRING));
This works because \/ is a valid way to represent / in JSON.

OK, so this is what I ended up doing (not elegant at all but it works) :
$content = preg_replace_callback(
    "(\\\\x([0-9a-f]{2}))i",
    function ($a) { return chr(hexdec($a[1])); },
    $content
);
$content = str_replace("e:","\"e\":",$content);
$content = str_replace("c:","\"c\":",$content);
$content = str_replace("u:","\"u\":",$content);
$content = str_replace("p:","\"p\":",$content);
$content = str_replace("d:","\"d\":",$content);
$content = str_replace("\"[","[",$content);
$content = str_replace("]\"","]",$content);
$content = json_decode($content);

Remove double-quotes from a json_encoded string on the keys

I have a json_encoded array which is fine.
I need to strip the double-quotes on all of the keys of the json string on returning it from a function call.
How would I go about doing this and returning it successfully?
Thanks!
I do apologise, here is a snippet of the json code:
{"start_date":"2011-01-01 09:00","end_date":"2011-01-01 10:00","text":"test"}
Just to add a little more info:
I will be retrieving the JSON via an AJAX request, so if it would be easier, I am open to ideas in how to do this on the javascript side.

EDITED as per anubhava's comment
$str = '{"start_date":"2011-01-01 09:00","end_date":"2011-01-01 10:00","text":"test"}';
$str = preg_replace('/"([^"]+)"\s*:\s*/', '$1:', $str);
echo $str;
This certainly works for the above string, although there may be some edge cases I haven't thought of for which it will not work. Whether this suits your purposes depends on how static the format of the string and the elements/values it contains will be.

TL;DR: Missing quotes are how Chrome shows that the response is a JSON object instead of a string. Make sure PHP's AJAX response sends header('Content-Type: application/json; charset=utf-8'); to solve the real problem.
DETAILS:
A common reason for wanting to solve this problem is noticing this difference while debugging how returned AJAX data is processed.
In my case I saw the difference using Chrome's debugging tools. When connected to the legacy system, the debugger showed no quotes around the keys in a successful response, and the response could be treated as an object immediately, without a JSON.parse() call. When debugging my new AJAX destination, the response was shown with quotes, and the variable was a string rather than an object.
I finally realized the true issue when I tested the AJAX response externally and saw that the legacy system actually DID have quotes around the keys. This was not what the Chrome dev tools showed.
The only difference was that on the legacy system there was a header specifying the content type. I added this to the new (WordPress) system and the calls were now fully compatible with the original script and the success function could handle the response as an object without any parsing required. Now I can switch between the legacy and new system without any changes except the destination URL.
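A minimal sketch of such a response on the PHP side, assuming a plain PHP endpoint. The function name send_json_response is hypothetical, and a WordPress handler would normally go through WordPress's own helpers instead. The keys stay quoted in the body, and the header is what lets the client treat the body as JSON:

```php
<?php
// Hypothetical helper: emit a JSON body with an explicit content type.
function send_json_response(array $data)
{
    // header() only works before any output has been sent.
    if (!headers_sent()) {
        header('Content-Type: application/json; charset=utf-8');
    }

    $body = json_encode($data);
    echo $body;

    return $body;
}
```

A call such as send_json_response(array('text' => 'test')) would be the last thing the AJAX handler does before exiting.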

Problems using PHP SoapClient to pass an encrypted value to a .Net SOAP service

I have a SOAP service I am calling with PHP 5.3.1's builtin SoapClient. The first operation I must perform on the service is a custom authentication operation, and one of the required parameters I must pass is a 3DES encrypted string which I am creating using PHP's mcrypt, like so:
$encryptionKey = '1234myKey1234';
$currentFormattedDate = date ("Y/m/d H:i");
$encryptedString = mcrypt_encrypt('tripledes', $encryptionKey, $currentFormattedDate, 'ecb');
If I try to just pass $encryptedString as I get it from mcrypt_encrypt() I get a fatal error on my side and no call is made:
Fatal error: SOAP-ERROR: Encoding: string 'd\xe0...' is not a valid utf-8 string in /path/to/file
However if I utf8_encode() the string as such:
$encryptedString = utf8_encode($encryptedString);
Then the call is made but their webservice responds with the following error:
The formatter threw an exception while trying to deserialize the message: There was an error while trying to deserialize parameter http://tempuri.org/:argStatusDate. The InnerException message was 'There was an error deserializing the object of type System.String. The byte 0x19 is not valid at this location. Line 2, position 318.'.
This is the closest I can get to success with this process after having tried so many things that I'm back to square one. I have verified I can just pass a bogus string which results in the expected response of not being able to authenticate.
I don't think this should make any difference since I believe the SOAP call is ultimately made as utf8, but I have tried setting 'encoding' => 'ISO-8859-1' when constructing my SoapClient in PHP and I get the same error. The call is made but the server responds with the deserialization error.
Does anyone know a better way for me to treat this encrypted string that will please both my PHP client and their .Net webservice?
Maybe the problem is on their end?
FWIW, I can also request that we change the encryption method to "Rijndael AES Block Cypher" per their documentation. Not sure if that would result in an easier to handle string.

You probably need to Base64-encode the data and place it in a CDATA section between the opening and closing tags. You might want to ask the creator of the service for a sample, or, if it is a web service, try to download the definition or even create a client through discovery. Note that the last link was found using Google search; I've been out of PHP for a while.
[EDIT] changing the cipher won't help for this, although anything is better than ECB encoding XML
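The Base64 step might look like the sketch below. It uses made-up stand-in bytes instead of real mcrypt output, and it assumes the .NET service is willing to Base64-decode the parameter before decrypting it, which would have to be confirmed with the service's owner:

```php
<?php
// Stand-in for the raw bytes mcrypt_encrypt() returns; real ciphertext is
// arbitrary binary and usually not valid UTF-8.
$encryptedString = "d\xe0\x19\x8f\x00\xff";

// Base64 output is plain ASCII, so it is always valid UTF-8 and safe
// to place inside an XML document.
$encodedForSoap = base64_encode($encryptedString);

// The receiving side would reverse this step before decrypting.
$roundTrip = base64_decode($encodedForSoap, true);
```

Unlike utf8_encode(), which reinterprets the bytes as ISO-8859-1 text and can leave control characters such as 0x19 in the XML, Base64 is fully reversible and produces only printable characters.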

node.js hmac.digest() output seems wrong

I'm trying to implement a Facebook library with node.js, and the request signing isn't working. I have the PHP example seen here translated into node. I'm trying it out with the example given there, where the secret is the string "secret". My code looks like this:
var signedRequest = request.signed_request.split('.');
var sig = b64url.decode(signedRequest[0]);
var expected = crypto.createHmac('sha256', 'secret').update(signedRequest[1]).digest();
console.log(sig == expected); // false
I can't console.log the decoded strings themselves, because they have special characters that cause the console to clear (if you have a suggestion to get around that please let me know) but I can output the b64url encodings of them.
The expected encoded sig, as you can see on the FB documentation, is
vlXgu64BQGFSQrY0ZcJBZASMvYvTHu9GQ0YM9rjPSso
My expected value, when encoded, is
wr5Vw6DCu8KuAUBhUkLCtjRlw4JBZATCjMK9wovDkx7Dr0ZDRgzDtsK4w49Kw4o
So why do I think it's digest that's wrong? Maybe the error is on my side? Well, if I execute the exact example in PHP given in the documentation, the correct result comes out. But if I change the hash_hmac call so the last parameter is false, outputting hex, I get
YmU1NWUwYmJhZTAxNDA2MTUyNDJiNjM0NjVjMjQxNjQwNDhjYmQ4YmQzMWVlZjQ2NDM0NjBjZjZiOGNmNGFjYQ==
Now, if I go back to my javascript code, and change my hmac code to .digest("hex") instead of the default "binary" and log the base64 encoding of the result, I get... surprise!
YmU1NWUwYmJhZTAxNDA2MTUyNDJiNjM0NjVjMjQxNjQwNDhjYmQ4YmQzMWVlZjQ2NDM0NjBjZjZiOGNmNGFjYQ
Same, except the == signs are missing off the end, but I think that's a console thing. I can't imagine that being the issue, without them it's not even a valid base64 string length.
So, how come the digest method outputs the correct result when using hex, but the wrong answer when using binary? Is the binary not quite the same as the "raw" output of the PHP equivalent? And if that's the case what is the correct way to call it?

We have discovered that this was indeed a bug in the crypto lib, and was a known issue logged on github. We will have to upgrade and get the fix.

I am Tesserex's partner. I believe the answer may have been a combination of both Tesserex's self-posted answer and Juicy Scripter's answer. We were still using Node ver. 0.4.7. The bug Tesserex mentioned can be found here: https://github.com/joyent/node/issues/324. I'm not entirely certain that this bug affected us, but it seems a good possibility. We updated Node to ver 0.6.5 and applied Juicy Scripter's solution and everything is now working. Thank you.
As a note about the suggestion of using existing libraries: most of the existing libraries require express, which is something we are trying to avoid due to some of the specifics of our application. Also, the existing libraries tend to assume that you're using node.js like a web server and answering a single user's request at a time. We are using persistent connections with websockets, and our Facebook client will be handling session data for multiple users simultaneously. Eventually I hope to make our Facebook client open source for use with applications like ours.

Actually, there is no problem with digest. The result of b64url.decode is in utf8 encoding by default (the encoding can be specified with the second parameter). If you use:
var sig = b64url.decode(signedRequest[0], 'binary');
var expected = crypto.createHmac('sha256', 'secret').update(signedRequest[1]).digest();
// sig === expected
the signature and the result of digest will be the same.
You may also check this by turning digest results into utf8 encoded string:
var sig = b64url.decode(signedRequest[0]);
var expected = crypto.createHmac('sha256', 'secret').update(signedRequest[1]).digest();
var expected_buffer = new Buffer(expected, 'binary');
// sig === expected_buffer.toString()
Also you may consider using existing libraries to do that kind of work (and probably more), to name a few:
facebook-wrapper
everyauth
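For reference, the raw-bytes comparison that the Node code is trying to reproduce can be sketched in PHP, the language of the example the question started from. This is an illustration and not Facebook's published code: the function names are hypothetical, and hash_equals() requires PHP 5.6 or later.

```php
<?php
// Decode base64url: swap the URL-safe characters back before decoding.
function base64_url_decode($input)
{
    return base64_decode(strtr($input, '-_', '+/'));
}

// Hypothetical helper: check the signature part of "signature.payload".
function verify_signed_request($signed_request, $secret)
{
    $parts = explode('.', $signed_request, 2);
    if (count($parts) !== 2) {
        return false;
    }
    list($encoded_sig, $payload) = $parts;

    // Both sides are raw bytes here: the decoded signature and the raw
    // (binary) HMAC output. No text encoding is involved.
    $sig = base64_url_decode($encoded_sig);
    $expected = hash_hmac('sha256', $payload, $secret, true);

    // Constant-time comparison of the two byte strings.
    return hash_equals($expected, $sig);
}
```

The point mirrored from the answer above is that both values must be compared as bytes. Decoding one side as utf8 text changes those bytes, and the comparison then fails.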
