Encoding an array to a URL using http_build_query() produces strange output when an array key starts with the name of an HTML character entity.
For example:
return http_build_query([
'id' => ['my', 'data', 'here'], // no problem
'class' => ['my', 'data', 'here'], // no problem
'yen' => ['my', 'data', 'here'], // &yen is the HTML entity for ¥
'parameter' => ['my', 'data', 'here'], // &para is the HTML entity for ¶
]);
and the encoded result is:
id[0]=my&id[1]=data&id[2]=here&class[0]=my&class[1]=data&class[2]=here¥[0]=my¥[1]=data¥[2]=here¶meter[0]=my¶meter[1]=data¶meter[2]=here
What's going on here? It can't be possible that I cannot use the word parameter as an array key.
If you view the source of the HTML output, you will see
id%5B0%5D=my&id%5B1%5D=data&id%5B2%5D=here&class%5B0%5D=my&class%5B1%5D=data&class%5B2%5D=here&yen%5B0%5D=my&yen%5B1%5D=data&yen%5B2%5D=here&parameter%5B0%5D=my&parameter%5B1%5D=data&parameter%5B2%5D=here
which is correct. Only when displaying the page does the browser interpret malformed (semicolon-less) entities such as &yen as ¥ and &para as ¶. There is nothing to worry about on the server side.
HTML entities reference
Demo: IDEOne
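If the query string is going to be written into an HTML page (for example inside an href), escaping it for HTML avoids the misleading rendering. The following is a minimal sketch of that idea, not something from the original answer; the page name page.php is hypothetical.

```php
<?php
$params = [
    'yen'       => ['my', 'data', 'here'],
    'parameter' => ['my', 'data', 'here'],
];

// Pass the separator explicitly so the result does not depend on
// the arg_separator.output ini setting.
$query = http_build_query($params, '', '&');

// Escape for the HTML context: every & becomes &amp;, so the browser
// no longer sees "&yen" or "&para" as entity names.
$href = htmlspecialchars('page.php?' . $query, ENT_QUOTES, 'UTF-8');

echo '<a href="' . $href . '">link</a>', "\n";
```

The raw $query is still what should be used in redirects or HTTP clients; only the copy embedded in HTML needs htmlspecialchars().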
In my controller, I access the comment data with $this->request->data['Comment']['text']. I use CakePHP's formhelper to build the form, and a plugin called Summernote to transform the textarea into a WYSIWYG editor. I save the comment as HTML in my database.
In this case, I am trying to submit a comment with just '>'
$data = $this->request->data['Comment']['text'];
pr($data);
//returns >
pr(mb_strlen($data, 'utf-8'));
//returns 4
pr(mb_strlen('>', 'utf-8'));
//returns 1
//that is the one that confuses me the most,
//it seems that there's a difference between $data and '>'
mb_detect_encoding($data);
//returns ASCII
I'm already using jQuery to check the number of characters entered on the front-end, so I can deactivate the submit-button when the user goes over the limit. This uses .innerText.length and works like a charm, but if I make that the only check people can just go into the element editor and re-enable the submit button to send however long comments they like.
EDIT:
var_dump($this->request->data['Comment']['text']) gave me the following result:
Note that unlike in the examples above, I am trying to send '>>>' here
array (size=1)
'text' => string '>>>' (length=12)
EDIT:
Alex_Tartan figured out the problem: I needed to do html_entity_decode() on my string before counting it with mb_strlen()!
I've tested the case here: https://3v4l.org/VLr9e
What might be the case is an untrimmed $data (white spaces won't show up in regular print - you can use var_dump($data)).
The textarea tag will enclose formatting spaces into the value.
Check out Why is textarea filled with mysterious white spaces?
so for that, you can do:
$data = '>   '; // '>' followed by three trailing spaces
// before trimming, var_dump($data) will output:
// string(4) ">   "
$data = trim($data);
echo $data."\n";
//returns >
echo mb_strlen($data, 'UTF-8')."\n";
//returns 1
echo mb_strlen('>', 'UTF-8')."\n";
//returns 1
Update (from comments):
The problem was encoded html characters which needed to be decoded:
$data = html_entity_decode($data);
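A small sketch of the difference, assuming the editor submits the three characters '>>>' as HTML entities (which matches the length of 12 reported in the question) and that the mbstring extension is available:

```php
<?php
// What the WYSIWYG editor is assumed to submit for the visible text ">>>".
$data = '&gt;&gt;&gt;';

// Decode entities back to the characters the user actually typed.
$decoded = html_entity_decode($data, ENT_QUOTES | ENT_HTML5, 'UTF-8');

$rawLength     = mb_strlen($data, 'UTF-8');    // counts the entity text
$visibleLength = mb_strlen($decoded, 'UTF-8'); // counts the visible characters

echo $rawLength, "\n";     // 12
echo $visibleLength, "\n"; // 3
```

Because the comment is stored as HTML, any tags the editor adds would still be counted; whether to run strip_tags() before decoding depends on what the length limit is meant to cover.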
I'm encoding an array of image URLs into a JSON string and storing it in the database (utf8_general_ci).
When I insert data into the table and retrieve it, json_decode() is capable of decoding it.
However, when I copy data from one table to another (INSERT INTO ... SELECT statement), the data retrieved from the database cannot be decoded anymore.
Instead, I get a corrupted JSON-encoded string. Even an empty array [] cannot be properly decoded.
It converts from http://pl.tinypic.com/r/fwoiol/8
into http://pl.tinypic.com/r/bgea05/8
(had to make images since those squares cannot be copied as text).
Edit: After checking a bit more, I ran bin2hex() on both strings from the database.
Both seem to be exactly the same.
However, one decodes and the other does not. The
5b22687474703a5c2f5c2f7777772e
changes into
0022687474703a5c2f5c2f7777772e
So json_decode only changes the leading 5b into 00 in the string.
It's as if the encoding is being lost somewhere.
Edit 2
static public function jsonDecodeFieldsArray($entries, $fields = array('features','images')){
foreach($entries as $key => $entry){
$entries[$key] = self::jsonDecodeFields($entry, $fields);
}
return $entries;
}
static public function jsonDecodeFields($entry, $fields = array('features','images')){
foreach($fields as $field){
if(isset($entry[$field])){
$entry[$field] = json_decode((string) $entry[$field], true);
}
}
return $entry;
}
I'm using the code above to decode the keys of the array specified by $fields. However, it not only decodes them incorrectly, it also affects keys that are not listed in $fields, corrupting their encoding.
One more thing: if I don't use those functions and call json_decode directly on the field, json_decode($array[0]['images'], true), it works fine.
I am writing this answer to make clear that I found the solution.
The cause of this error was not SQL, and the data was fine. I had example arrays like these:
$many_entries = array(
array(
'features' => 'json_encoded_string',
'images' => 'json_encoded_string'
),
array(
'features' => 'json_encoded_string',
'images' => 'json_encoded_string'
)
);
// And
$one_entry = array(
'features' => 'json_encoded_string',
'images' => 'json_encoded_string'
);
I had two functions: one to parse the $many_entries structure (jsonDecodeFieldsArray) and one to parse the $one_entry structure (jsonDecodeFields).
The problem was that I called jsonDecodeFieldsArray on $one_entry, which made jsonDecodeFields iterate over strings instead of arrays.
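One way to catch this kind of mix-up early is to reject non-array entries in the list variant. The sketch below is a hedged variation on the functions from the question, not the author's code; the names decodeJsonFields and decodeJsonFieldsList are hypothetical.

```php
<?php
// Decode the listed JSON fields of ONE entry.
function decodeJsonFields(array $entry, array $fields = ['features', 'images']): array
{
    foreach ($fields as $field) {
        if (isset($entry[$field]) && is_string($entry[$field])) {
            $entry[$field] = json_decode($entry[$field], true);
        }
    }
    return $entry;
}

// Decode the listed JSON fields of MANY entries.
function decodeJsonFieldsList(array $entries, array $fields = ['features', 'images']): array
{
    foreach ($entries as $key => $entry) {
        if (!is_array($entry)) {
            // A single entry was probably passed where a list was expected.
            throw new InvalidArgumentException('Expected a list of entries, got a non-array element.');
        }
        $entries[$key] = decodeJsonFields($entry, $fields);
    }
    return $entries;
}

$oneEntry = ['features' => '["a","b"]', 'images' => '[]'];

$decodedOne  = decodeJsonFields($oneEntry);
$decodedMany = decodeJsonFieldsList([$oneEntry, $oneEntry]);
```

With this guard, calling decodeJsonFieldsList($oneEntry) throws instead of silently touching the strings.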
It is odd that the character encoding is changing through the transmission. I would say check your charset(s) in PHP but you said the issue is only occurring in a table => table SQL transfer. I would still check the charset of the column / table.
You can fix the issue by running a str_replace() upon decoding. For example:
$DB_ARRAY = $DB_QUERY->fetch_array();
// "\x93" stands in for the wrongly encoded byte; replace it with the byte you actually observe
$CORRECT_ENCODING = json_decode(str_replace("\x93", '[', $DB_ARRAY['urlstring']), true);
You would, of course, need to know what the wrongly encoded character is, or its byte value.
I have something like this inside a *.txt file.
function_name({"one": {"id": "id_for_one", "value": "value_for_one"}, ...});
And I am getting the file like this:
$source = 'FILE_NAME.txt';
$json = json_decode(file_get_contents($source),true);
echo $json['one']['value'];
It doesn't work, but when I remove function_name( and ); it works.
How to parse it without removing these strings?
You can't. It is not valid JSON with those. Take a substring that excludes them.
You will have to remove those strings. With the function_name portion it is not valid JSON.
A JSON string will typically either begin with { (object notation) or [ (array notation), but can also be scalar values such as a string or number. You cannot parse it without first making sure the string is valid JSON.
You are reading the contents of a file and trying to decode them as JSON.
The 'function_name(' wrapper isn't valid JSON; only the part inside it is.
How to parse it without removing these strings?
There is no way.
This should work for you.
$data = file_get_contents($source);
$data = substr($data, strlen("function_name("));
$data[strlen($data)-1] = $data[strlen($data)-2] = " ";
$json = json_decode($data,true);
Square brackets access individual characters of a string. The older {} syntax did the same, but it was deprecated in PHP 7.4 and removed in PHP 8.0.
The function call wrapped around the data in your text file means that it isn't a JSON file.
Remove the wrapper, with a regular expression or plain string functions, and your problem is fixed.
If the function is a fixed name, do something like this:
$source = 'FILE_NAME.txt';
$json_content = str_replace('function_name(', '', file_get_contents($source));
$json_content = substr($json_content,0,-2);
$json = json_decode($json_content,true);
echo $json['one']['value'];
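If the wrapper name is not fixed, a regular expression can strip it instead. This is a minimal sketch under the assumption that the file contains exactly one call of the form name(...) with an optional trailing semicolon; the helper name unwrapJsonp is hypothetical.

```php
<?php
// Return the text between the outermost "name(" and ")" or null if the
// input does not look like a single wrapped call.
function unwrapJsonp(string $text): ?string
{
    // identifier, "(", payload (greedy, may span lines), ")", optional ";"
    if (preg_match('/^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$/s', $text, $m) === 1) {
        return $m[1];
    }
    return null;
}

$raw = 'function_name({"one": {"id": "id_for_one", "value": "value_for_one"}});';

$payload = unwrapJsonp($raw);
$json    = $payload !== null ? json_decode($payload, true) : null;

echo $json['one']['value'], "\n"; // value_for_one
```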
I have this code here:
case 'resource_list':
if(file_exists('content.php')){
include('../ajax/content.php');
} else {
die('does not exist');
}
$html = render_content_page($array[1],$array[2]);
$slate = 'info_slate';
$reply_array = array(
'html' => $html,
'slate' => $slate
);
echo json_encode($reply_array);
break;
I have debugged every level right up until json_encode() is called, but the data I receive back in my AJAX call is null for the html key. This code is essentially a copy and paste of another case that just calls a function other than render_content_page(), and that one works perfectly fine.
$reply_array var_exports to:
array (
'html' => '<ol>
<li unit="quiz" identifier=""><img src="img/header/notifications.png"/>Fran�ois Vase Volute Krater</li>
</ol>',
'slate' => 'info_slate',
)
My initial thought is that the problem is the special character in Fran�ois Vase Volute Krater, as json_encode only works with UTF-8 encoded data.
Try UTF-8 encoding it before JSON encoding it like so:
json_encode(utf8_encode("Fran�ois Vase Volute Krater"));
Maybe the problem is with the encoding?
As the manual states, json_encode() works only with UTF-8 encoded data:
This function only works with UTF-8 encoded data.
http://php.net/json_encode
As documented, json_encode expects its input text in UTF-8. Most likely, your input (the ç) is not in UTF-8.
Use utf8_encode (if you're currently using ISO-8859-1) or mb_convert_encoding (otherwise) to convert input strings to UTF-8.
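A short sketch of the conversion, assuming the source string is ISO-8859-1 (the byte 0xE7 is ç in that encoding) and that the mbstring extension is loaded. It uses mb_convert_encoding() rather than utf8_encode(), since the latter is deprecated as of PHP 8.2.

```php
<?php
// ISO-8859-1 bytes for "François ..." (assumed source encoding).
$latin1 = "Fran\xE7ois Vase Volute Krater";

// Not valid UTF-8, so json_encode() fails and returns false.
$bad      = json_encode(['html' => $latin1]);
$badError = json_last_error();

// Convert to UTF-8 first, then encode.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
$good = json_encode(['html' => $utf8], JSON_UNESCAPED_UNICODE);

var_dump($bad);   // bool(false)
echo $good, "\n";
```

Checking json_last_error() (or json_last_error_msg()) right after a failed call is a quick way to confirm that encoding is the cause.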
I am retrieving json from a php file that connects to the database and then creates the json to be imported into my page via an ajax call.
Some of the database columns have file paths in them, ie images/myfolder/myfile
The json that I see when I open the page in a web browser formats the file path like this:
\/images\/myfolder\/myfile
Are the escaped characters going to break the json?
i.e., if I do var icon_image = myData.file
will icon_image hold \/images\/myfolder\/myfile or /images/myfolder/myfile?
I am hoping for the second option, but if it isn't, how do I get it to display as /images/myfolder/myfile?
I have put mb_internal_encoding( 'UTF-8' ); at the top of the php page
The script that generates the json is as follows:
mb_internal_encoding( 'UTF-8' );
mysql_select_db($database_growth_conn, $growth_conn);
$query_rs_icons = sprintf("SELECT * FROM icons_ico ORDER BY name_ico");
//echo($query_rs_icons);
$rs_icons = mysql_query($query_rs_icons, $growth_conn) or die(mysql_error());
$row_rs_icons = mysql_fetch_assoc($rs_icons);
$totalRows_rs_icons = mysql_num_rows($rs_icons);
$rows = array();
while($r = mysql_fetch_assoc($rs_icons)) {
$rows[] = $r;
}
$jsondata = json_encode($rows);
echo '{"icons":'.$jsondata.'}';
are the escaped characters going to break the json?
No. A backslash-escaped non-special punctuation is just the same as the punctuation itself. The JS string literals "a/b" and "a\/b" result in the same string value, a/b.
json_encode escapes the forward slash character for you so that if you try to include a string with the sequence </script> in it in a script element, it doesn't prematurely end the script block.
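The same round trip can be checked on the PHP side. A brief sketch, using a made-up row with a file key as in the question:

```php
<?php
$row = ['file' => '/images/myfolder/myfile'];

// Default behaviour: forward slashes are escaped.
$escaped = json_encode($row);

// Optional flag: leave slashes as they are.
$plain = json_encode($row, JSON_UNESCAPED_SLASHES);

// Decoding either form yields the original path.
$fromEscaped = json_decode($escaped, true)['file'];
$fromPlain   = json_decode($plain, true)['file'];

echo $escaped, "\n"; // {"file":"\/images\/myfolder\/myfile"}
echo $plain, "\n";   // {"file":"/images/myfolder/myfile"}
```

Both forms are valid JSON and decode to the same value, so JSON_UNESCAPED_SLASHES is purely cosmetic unless the output is embedded in a script element.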