My PHP_EOL is "\r\n", however, when I do print_r on an array each new line has a "\n" - not a "\r\n" - placed after it.
Any idea if it's possible to change this behavior?
If you look the source code of print_r you'll find:
PHP_FUNCTION(print_r)
{
zval *var;
zend_bool do_return = 0;
if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z|b", &var, &do_return) == FAILURE) {
RETURN_FALSE;
}
if (do_return) {
php_output_start_default(TSRMLS_C);
}
zend_print_zval_r(var, 0 TSRMLS_CC);
if (do_return) {
php_output_get_contents(return_value TSRMLS_CC);
php_output_discard(TSRMLS_C);
} else {
RETURN_TRUE;
}
}
ultimately you can ignore the stuff arround zend_print_zval_r(var, 0 TSRMLS_CC); for your question.
If you follow the stacktrace, you'll find:
ZEND_API void zend_print_zval_r(zval *expr, int indent TSRMLS_DC) /* {{{ */
{
zend_print_zval_r_ex(zend_write, expr, indent TSRMLS_CC);
}
which leads to
ZEND_API void zend_print_zval_r_ex(zend_write_func_t write_func, zval *expr, int indent TSRMLS_DC) /* {{{ */
{
switch (Z_TYPE_P(expr)) {
case IS_ARRAY:
ZEND_PUTS_EX("Array\n");
if (++Z_ARRVAL_P(expr)->nApplyCount>1) {
ZEND_PUTS_EX(" *RECURSION*");
Z_ARRVAL_P(expr)->nApplyCount--;
return;
}
print_hash(write_func, Z_ARRVAL_P(expr), indent, 0 TSRMLS_CC);
Z_ARRVAL_P(expr)->nApplyCount--;
break;
From this point on, you could continue to find the relevant line - but since there is already a hardcoded "Array\n" - i'll assume the rest of the print_r implementation uses the same hardcoded \n linebreak-thing.
So, to answer your question: You cannot change it to use \r\n.
Use one of the provided workarounds.
Sidenode: Since print_r is mainly used for debugging, this will do the job as well:
echo "<pre>";
print_r($object);
echo "</pre>";
Use second param in print_r (set true), read DOC:
http://www.php.net/manual/en/function.print-r.php
See:
mixed print_r ( mixed $expression [, bool $return = false ] );
Example:
$eol = chr(10); //Break line in like unix
$weol = chr(13) . $eol; //Break line with "carriage return" (required by some text editors)
$data = print_r(array(...), true);
$data = str_replace(eol, weol, $data);
echo $data;
Like pointed out elsewhere on this page, the newlines are hardcoded in the PHP source, so you have to replace them manually.
You could use your own version of print_r like this:
namespace My;
function print_r($expression, $return = false)
{
$out = \print_r($expression, true);
$out = \preg_replace("#(?<!\r)\n#", PHP_EOL, $out);
if ($return) {
return $out;
}
echo $out;
return true;
}
Whenever you want to use it, you just import it with
// aliasing a function (PHP 5.6+)
use My\print_r as print_r;
print_r("A string with \r\n is not replaced");
print_r("A string with \n is replaced");
This will then use PHP_EOL for newlines. Note that it will only substitute newlines, e.g. \n, but not any \r\n you might have in the $expression. This is to prevent any \r\n to become \r\r\n.
The benefit of doing it this way is that it will work as a drop-in replacement of the native function. So any code that already uses the native print_r can be replaced simply by adding the use statement.
This may not be the most elegant solution, but you could capture the print_r() output using buffer output, then use str_replace() to replace existences of \n with your PHP_EOL. In this example I've replaced it with x to show that it's working...
ob_start();
$test_array = range('A', 'Z');
print_r($test_array);
$dump = ob_get_contents();
ob_end_clean();
As pointed out by dognose, since PHP 4.3 you can return the result of print_r() into a string (more elegant):
$dump = print_r($test_array, true);
Then replace line endings:
$dump = str_replace("\n", "x" . PHP_EOL, $dump);
echo $dump;
Output:
Arrayx
(x
[0] => Ax
[1] => Bx
[2] => Cx
[3] => Dx
[4] => Ex
[5] => Fx
[6] => Gx
... etc
[25] => Zx
)x
Question Is it possible to change the behavior of PHP's print_r function was marked was duplicated of this one . I'd like to answer more how is possible change the behavior of print_r. My propose is do another function with another name that do the print_r customized . And we just need replace print_r functions with print_r_pretty ...
function print_r_pretty($in, $saveToString = false) {
$out = print_r($in, true);
$out = str_replace("\n", "\r\n", $out);
switch ($saveToString) {
case true: return $out;
default: echo $out;
}
}
But line :
$out = str_replace("\n", "\r\n", $out);
can be replaced by another line that do another changes to print_r like this :
$out = explode("\n", $out, 2)[1];
Related
I am being sent a csv file that is tab delimited. Here is a sample of what I see:
Invoice: Invoice Date Account: Name Bill To: First Name Bill To: Last Name Bill To: Work Email Rate Plan Charge: Name Subscription: Device Serial Number
2021-03-10 Test Company Wally Kolcz test#test.com Sample plan A0H1234567890A
I wrote a script to open, read and loop over the values but I get weird stuff after:
if (($handle = fopen($user_file, "r")) !== FALSE) {
while (($data = fgetcsv($handle, 1000, "\t")) !== FALSE) {
if($line >1 && isset($data[1])){
$user = [
'EmailAddress' => $data[4],
'Name' => $data[2].' '.$data[3],
];
}
$line++;
}
fclose($handle);
}
Here is what I get when I dump the first line.
array:7 [▼
0 => b"ÿþI\x00n\x00v\x00o\x00i\x00c\x00e\x00:\x00 \x00I\x00n\x00v\x00o\x00i\x00c\x00e\x00 \x00D\x00a\x00t\x00e\x00"
1 => "\x00A\x00c\x00c\x00o\x00u\x00n\x00t\x00:\x00 \x00N\x00a\x00m\x00e\x00"
2 => "\x00B\x00i\x00l\x00l\x00 \x00T\x00o\x00:\x00 \x00F\x00i\x00r\x00s\x00t\x00 \x00N\x00a\x00m\x00e\x00"
3 => "\x00B\x00i\x00l\x00l\x00 \x00T\x00o\x00:\x00 \x00L\x00a\x00s\x00t\x00 \x00N\x00a\x00m\x00e\x00"
4 => "\x00B\x00i\x00l\x00l\x00 \x00T\x00o\x00:\x00 \x00W\x00o\x00r\x00k\x00 \x00E\x00m\x00a\x00i\x00l\x00"
5 => "\x00R\x00a\x00t\x00e\x00 \x00P\x00l\x00a\x00n\x00 \x00C\x00h\x00a\x00r\x00g\x00e\x00:\x00 \x00N\x00a\x00m\x00e\x00"
6 => "\x00S\x00u\x00b\x00s\x00c\x00r\x00i\x00p\x00t\x00i\x00o\x00n\x00:\x00 \x00D\x00e\x00v\x00i\x00c\x00e\x00 \x00S\x00e\x00r\x00i\x00a\x00l\x00 \x00N\x00u\x00m\x00b\x00e\x00r\x00 ◀"
]
I tried adding:
header('Content-Type: text/html; charset=UTF-8');
$data = array_map("utf8_encode", $data);
setlocale(LC_ALL, 'en_US.UTF-8');
And when I dump mb_detect_encoding($data[2]), I get 'ASCII'...
Any way to fix this so I don't have to manually update the file each time I receive it? Thanks!
Looks like the file is in UTF-16 (every other byte is null).
You probably need to convert the whole file with something like mb_convert_encoding($data, "UTF-8", "UTF-16");
But you can't really use fgetcsv() in that case…
As #Andrea already mentioned, your data is encoded as UTF-16LE and you need to convert it to an encoding compatible with what you want to do. That said, it is possible to do in-flight with PHP stream filters.
abstract class TranslateCharset extends php_user_filter {
protected $in_charset, $out_charset;
private $buffer = '';
private $total_consumed = 0;
public function filter($in, $out, &$consumed, $closing) {
$output = '';
while ($bucket = stream_bucket_make_writeable($in)) {
$input = $this->buffer . $bucket->data;
for( $i=0, $p=0; ($c=mb_substr($input, $i, 1, $this->in_charset)) !== ""; ++$i, $p+=strlen($c) ) {
$output .= mb_convert_encoding($c, $this->out_charset, $this->in_charset);
}
$this->buffer = substr($input, $p);
$consumed += $p;
}
// this means that there's unconverted data at the end of the bridage.
if( $closing && strlen($this->buffer) > 0 ) {
$this->raise_error( sprintf(
"Likely encoding error at offset %d in input stream, subsequent data may be malformed or missing.",
$this->total_consumed += $consumed)
);
$consumed += strlen($this->buffer);
// give it the ol' college try
$output .= mb_convert_encoding($this->buffer, $this->out_charset, $this->in_charset);
}
$this->total_consumed += $consumed;
if ( ! isset($bucket) ) {
$bucket = stream_bucket_new($this->stream, $output);
} else {
$bucket->data = $output;
}
stream_bucket_append($out, $bucket);
return PSFS_PASS_ON;
}
protected function raise_error($message) {
user_error( sprintf(
"%s[%s]: %s",
__CLASS__, get_class($this), $message
), E_USER_WARNING);
}
}
class UTF16LEtoUTF8 extends TranslateCharset {
protected $in_charset = 'UTF-16LE';
protected $out_charset = 'UTF-8';
}
stream_filter_register('UTF16LEtoUTF8', 'UTF16LEtoUTF8');
// properly-encoded UTF-16BE example input "Invoice:,a"
$in = "\xFE\xFFI\x00n\x00v\x00o\x00i\x00c\x00e\x00:\x00,\x00a\x00";
// prep example pipe, in practice this would simple be your fopen() call.
$fh = fopen('php://memory', 'rwb+');
fwrite($fh, $in);
rewind($fh);
// skip BOM
fseek($fh, 2);
stream_filter_append($fh, 'UTF16LEtoUTF8', STREAM_FILTER_READ);
var_dump(fgetcsv($fh, 4096));
Output:
array(2) {
[0]=>
string(8) "Invoice:"
[1]=>
string(1) "a"
}
In practice there is no "magic bullet" to detect the encoding of an input file or string. In this case there is a Byte Order Mark [BOM] of 0xFF 0xFE that denotes that this in UTF-16LE but the BOM is frequently omitted, or may simply occur naturally at the beginning of any arbitrary string, or is simply not required for most encodings, or is simply not used by whoever encoded the data.
That last bit is the exact reason why everyone should avoid the utf8_encode() and utf8_decode() functions like the plague, because they simply assume that you only ever want to go between UTF-8 and ISO-8859-1 [western european], and make no effort to avoid corrupting your data when used incorrectly because they can't possibly know any better.
TLDR: You must explicitly know the encoding of your input data, or you're going to have a bad time.
Edit: Since I've gone and put a proper spitshine on this I've put it up as a Composer package, in case anyone else needs something like this.
https://packagist.org/packages/wrossmann/costrenc
I ended up with is as working code:
$f = file_get_contents($user_file);
$f = mb_convert_encoding($f, 'UTF8', 'UTF-16LE');
$f = preg_split("/\R/", $f);
$f = array_map('str_getcsv', $f);
$line = 0;
foreach($f as $record){
if($line !== 0 && isset($record[0])){
$pieces = preg_split('/[\t]/',$record[0]);
//My work here
}
}
Thank you everyone for your examples and suggestions!
Using PHP5 (cgi) to output template files from the filesystem and having issues spitting out raw HTML.
private function fetch($name) {
$path = $this->j->config['template_path'] . $name . '.html';
if (!file_exists($path)) {
dbgerror('Could not find the template "' . $name . '" in ' . $path);
}
$f = fopen($path, 'r');
$t = fread($f, filesize($path));
fclose($f);
if (substr($t, 0, 3) == b'\xef\xbb\xbf') {
$t = substr($t, 3);
}
return $t;
}
Even though I've added the BOM fix I'm still having problems with Firefox accepting it. You can see a live copy here: http://ircb.in/jisti/ (and the template file I threw at http://ircb.in/jisti/home.html if you want to check it out)
Any idea how to fix this? o_o
you would use the following code to remove utf8 bom
//Remove UTF8 Bom
function remove_utf8_bom($text)
{
$bom = pack('H*','EFBBBF');
$text = preg_replace("/^$bom/", '', $text);
return $text;
}
try:
// -------- read the file-content ----
$str = file_get_contents($source_file);
// -------- remove the utf-8 BOM ----
$str = str_replace("\xEF\xBB\xBF",'',$str);
// -------- get the Object from JSON ----
$obj = json_decode($str);
:)
Another way to remove the BOM which is Unicode code point U+FEFF
$str = preg_replace('/\x{FEFF}/u', '', $file);
b'\xef\xbb\xbf' stands for the literal string "\xef\xbb\xbf". If you want to check for a BOM, you need to use double quotes, so the \x sequences are actually interpreted into bytes:
"\xef\xbb\xbf"
Your files also seem to contain a lot more garbage than just a single leading BOM:
$ curl http://ircb.in/jisti/ | xxd
0000000: efbb bfef bbbf efbb bfef bbbf efbb bfef ................
0000010: bbbf efbb bf3c 2144 4f43 5459 5045 2068 .....<!DOCTYPE h
0000020: 746d 6c3e 0a3c 6874 6d6c 3e0a 3c68 6561 tml>.<html>.<hea
...
if anybody using csv import then below code useful
$header = fgetcsv($handle);
foreach($header as $key=> $val) {
$bom = pack('H*','EFBBBF');
$val = preg_replace("/^$bom/", '', $val);
$header[$key] = $val;
}
This global funtion resolve for UTF-8 system base charset. Tanks!
function prepareCharset($str) {
// set default encode
mb_internal_encoding('UTF-8');
// pre filter
if (empty($str)) {
return $str;
}
// get charset
$charset = mb_detect_encoding($str, array('ISO-8859-1', 'UTF-8', 'ASCII'));
if (stristr($charset, 'utf') || stristr($charset, 'iso')) {
$str = iconv('ISO-8859-1', 'UTF-8//TRANSLIT', utf8_decode($str));
} else {
$str = mb_convert_encoding($str, 'UTF-8', 'UTF-8');
}
// remove BOM
$str = urldecode(str_replace("%C2%81", '', urlencode($str)));
// prepare string
return $str;
}
An extra method to do the same job:
function remove_utf8_bom_head($text) {
if(substr(bin2hex($text), 0, 6) === 'efbbbf') {
$text = substr($text, 3);
}
return $text;
}
The other methods I found cannot work in my case.
Hope it helps in some special case.
A solution without pack function:
$a = "1";
var_dump($a); // string(4) "1"
function deleteBom($text)
{
return preg_replace("/^\xEF\xBB\xBF/", '', $text);
}
var_dump(deleteBom($a)); // string(1) "1"
I'm not so fond of using preg_replace or preg_match for simple tasks. What about this alternative method of detecting and removing the BOM?
function remove_utf8_bom(string $text): string
{
$bomStart = mb_substr($text, 0, 1);
return ($bomStart == pack('H*','EFBBBF')) ?
mb_substr($text, 1) :
$text;
}
If you are reading some API using file_get_contents and got an inexplicable NULL from json_decode, check the value of json_last_error(): sometimes the value returned from file_get_contents will have an extraneous BOM that is almost invisible when you inspect the string, but will make json_last_error() to return JSON_ERROR_SYNTAX (4).
>>> $json = file_get_contents("http://api-guiaserv.seade.gov.br/v1/orgao/all");
=> "\t{"orgao":[{"Nome":"Tribunal de Justi\u00e7a","ID_Orgao":"59","Condicao":"1"}, ...]}"
>>> json_decode($json);
=> null
>>>
In this case, check the first 3 bytes - echoing them is not very useful because the BOM is invisible on most settings:
>>> substr($json, 0, 3)
=> " "
>>> substr($json, 0, 3) == pack('H*','EFBBBF');
=> true
>>>
If the line above returns TRUE for you, then a simple test may fix the problem:
>>> json_decode($json[0] == "{" ? $json : substr($json, 3))
=> {#204
+"orgao": [
{#203
+"Nome": "Tribunal de Justiça",
+"ID_Orgao": "59",
+"Condicao": "1",
},
],
...
}
When working with faulty software it happens that the BOM part gets multiplied with every saving.
So I am using this to get rid of it.
function remove_utf8_bom($text) {
$bom = pack('H*','EFBBBF');
while (preg_match("/^$bom/", $text)) {
$text = preg_replace("/^$bom/", '', $text);
}
return $text;
}
How about this:
function removeUTF8BomHeader($data) {
if (substr($data, 0, 3) == pack('CCC', 0xef, 0xbb, 0xbf)) {
$data = substr($data, 3);
}
return $data;
}
tested a lot and it works perfect without any issue
I use the PHP zip:// stream wrapper to parse large XML files line by line. For example:
$stream_uri = 'zip://' . __DIR__ . '/archive.zip#foo.xml';
$reader = new XMLReader();
$reader->open( $stream_uri, null );
$reader->read();
while ( true ) {
echo( $reader->readInnerXml() . PHP_EOL );
if ( ! $reader->next() ) {
break;
}
}
Quite often an XML file will include dodgy UTF control characters XMLReader doesn't like. So I'd like to implement a custom stream wrapper I can pass the output of the zip:// stream to, which will run a preg_replace on each line to remove those characters.
My dream is to be able to do this:
stream_wrapper_register( 'xmlchars', 'XML_Chars' );
$stream_uri = 'xmlchars://zip://' . __DIR__ . '/archive.zip#foo.xml';
and have XMLReader happily read the tidied-up nodes. I've figured out a way to reconstruct the zip stream URI based on the path passed to my wrapper:
class XML_Chars {
protected $stream_uri = '';
protected $handle;
function stream_open( $path, $mode, $options, &$opened_path ) {
$parsed_url = parse_url( $path );
$this->stream_uri = 'zip:' . $parsed_url['path'] . '#' . $parsed_url['fragment'];
return true;
}
}
But I'm puzzled about the best way to open the zip:// stream so I can modify its output and pass the result through to the XMLReader. Can anyone give me any pointers about how to implement that?
In case useful to anybody else, I've found a different way to solve the problem: a stream filter. You define it like this:
class UTF_Character_Filter extends php_user_filter {
public function filter( $in, $out, &$consumed, $closing ) {
while ( $bucket = stream_bucket_make_writeable( $in ) ) {
$consumed += $bucket->datalen;
// Remove characters in the hex range 0 - 8, B and C, E to 1F
// i.e. all control characters except newline, tab and return
$bucket->data = preg_replace( '|[\x0-\x8\xB-\xC\xE-\x1F]|ms', '', $bucket->data );
stream_bucket_append( $out, $bucket );
}
return PSFS_PASS_ON;
}
}
stream_filter_register( 'utf_character_filter', 'UTF_Character_Filter' );
And use it like this:
php://filter/read=utf_character_filter/resource=zip://archive.zip#import.xml
I'd still be interested to know if anyone's figured out how to make a stream wrapper that can accept the input of another stream wrapper though, as it could be a handy tool.
This does not work:
$jsonDecode = json_decode($jsonData, TRUE);
However if I copy the string from $jsonData and put it inside the decode function manually it does work.
This works:
$jsonDecode = json_decode('{"id":"0","bid":"918","url":"http:\/\/www.google.com","md5":"6361fbfbee69f444c394f3d2fa062f79","time":"2014-06-02 14:20:21"}', TRUE);
I did output $jsonData copied it and put in like above in the decode function. Then it worked. However if I put $jsonData directly in the decode function it does not.
var_dump($jsonData) shows:
string(144) "{"id":"0","bid":"918","url":"http:\/\/www.google.com","md5":"6361fbfbee69f444c394f3d2fa062f79","time":"2014-06-02 14:20:21"}"
The $jsonData comes from a encrypted $_GET variable. To encrypt it I use this:
$key = "SOME KEY";
$iv_size = mcrypt_get_iv_size(MCRYPT_BLOWFISH, MCRYPT_MODE_ECB);
$iv = mcrypt_create_iv($iv_size, MCRYPT_RAND);
$enc = mcrypt_encrypt(MCRYPT_BLOWFISH, $key, $data, MCRYPT_MODE_ECB, $iv);
$iv = rawurlencode(base64_encode($iv));
$enc = rawurlencode(base64_encode($enc));
//To Decrypt
$iv = base64_decode(rawurldecode($_GET['i']));
$enc = base64_decode(rawurldecode($_GET['e']));
$data = mcrypt_decrypt(MCRYPT_BLOWFISH, $key, $enc, MCRYPT_MODE_ECB, $iv);
some time there is issue of html entities, for example \" it will represent like this \", so you must need to parse the html entites to real text, that you can do using
html_entity_decode()
method of php.
$jsonData = stripslashes(html_entity_decode($jsonData));
$k=json_decode($jsonData,true);
print_r($k);
You have to use preg_replace for avoiding the null results from json_decode
here is the example code
$json_string = stripslashes(html_entity_decode($json_string));
$bookingdata = json_decode( preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $json_string), true );
Most likely you need to strip off the padding from your decrypted data. There are 124 visible characters in your string but var_dump reports 144. Which means 20 characters of padding needs to be removed (a series of "\0" bytes at the end of your string).
Probably that's 4 "\0" bytes at the end of a block + an empty 16-bytes block (to mark the end of the data).
How are you currently decrypting/encrypting your string?
Edit:
You need to add this to trim the zero bytes at the end of the string:
$jsonData = rtrim($jsonData, "\0");
Judging from the other comments, you could use,
$jsonDecode = json_decode(trim($jsonData), TRUE);
While moving on php 7.1 I encountered with json_decode error number 4 (json syntex error). None of the above solution on this page worked for me.
After doing some more searching i found solution at https://stackoverflow.com/a/15423899/1545384 and its working for me.
//Remove UTF8 Bom
function remove_utf8_bom($text)
{
$bom = pack('H*','EFBBBF');
$text = preg_replace("/^$bom/", '', $text);
return $text;
}
Be sure to set header to JSON
header('Content-type: application/json;');
str_replace("\t", " ", str_replace("\n", " ", $string))
because json_decode does not work with special characters. And no error will be displayed. Make sure you remove tab spaces and new lines.
Depending on the source you get your data, you might need also:
stripslashes(html_entity_decode($string))
Works for me:
<?php
$sql = <<<EOT
SELECT *
FROM `students`;
EOT;
$string = '{ "query" : "' . str_replace("\t", " ", str_replace("\n", " ", $sql)).'" }';
print_r(json_decode($string));
?>
output:
stdClass Object
(
[query] => SELECT * FROM `students`;
)
I had problem that json_decode did not work, solution was to change string encoding to utf-8. This is important in case you have non-latin characters.
Interestingly mcrypt_decrypt seem to add control characters other than \0 at the end of the resulting text because of its padding algorithm. Therefore instead of rtrim($jsonData, "\0")
it is recommended to use
preg_replace( "/\p{Cc}*$/u", "", $data)
on the result $data of mcrypt_decrypt. json_decode will work if all trailing control characters are removed. Pl refer to the comment by Peter Bailey at http://php.net/manual/en/function.mdecrypt-generic.php .
USE THIS CODE
<?php
$json = preg_replace('/[[:cntrl:]]/', '', $json_data);
$json_array = json_decode($json, true);
echo json_last_error();
echo json_last_error_msg();
print_r($json_array);
?>
Make sure your JSON is actually valid. For some reason I was convinced that this was valid JSON:
{ type: "block" }
While it is not. Point being, make sure to validate your string with a linter if you find json_decode not te be working.
Try the JSON validator.
The problem in my case was it used ' not ", so I had to replace it to make it working.
In notepad+ I changed encoding of json file on: "UTF-8 without BOM".
JSON started to work
TL;DR Be sure that your JSON not containing comments :)
I've taken a JSON structure from API reference and tested request using Postman. I've just copy-pasted the JSON and didn't pay attention that there was a comment inside it:
...
"payMethod": {
"type": "PBL" //or "CARD_TOKEN", "INSTALLMENTS"
},
...
Of course after deletion the comment json_decode() started working like a charm :)
Use following function:
If JSON_ERROR_UTF8 occurred :
$encoded = json_encode( utf_convert( $responseForJS ) );
Below function is used to encode Array data recursively
/* Use it for json_encode some corrupt UTF-8 chars
* useful for = malformed utf-8 characters possibly incorrectly encoded by json_encode
*/
function utf_convert( $mixed ) {
if (is_array($mixed)) {
foreach ($mixed as $key => $value) {
$mixed[$key] = utf8ize($value);
}
} elseif (is_string($mixed)) {
return mb_convert_encoding($mixed, "UTF-8", "UTF-8");
}
return $mixed;
}
Maybe it helps someone, check in your json string if you have any NULL values, json_decode will not work if a NULL is present as a value.
This super basic function may help you. I made the NULL in an array just in case I need to add more stuff in the future.
function jsonValueFix($json){
$json = str_replace( array('NULL'),'""',$json );
return $json;
}
I just used json_decode twice and it worked for me
$response = json_decode($apiResponse, true);
$response = json_decode($response, true);
I can't seem to find a real answer to this problem so here I go:
How do you parse raw HTTP request data in multipart/form-data format in PHP? I know that raw POST is automatically parsed if formatted correctly, but the data I'm referring to is coming from a PUT request, which is not being parsed automatically by PHP. The data is multipart and looks something like:
------------------------------b2449e94a11c
Content-Disposition: form-data; name="user_id"
3
------------------------------b2449e94a11c
Content-Disposition: form-data; name="post_id"
5
------------------------------b2449e94a11c
Content-Disposition: form-data; name="image"; filename="/tmp/current_file"
Content-Type: application/octet-stream
�����JFIF���������... a bunch of binary data
I'm sending the data with libcurl like so (pseudo code):
curl_setopt_array(
CURLOPT_POSTFIELDS => array(
'user_id' => 3,
'post_id' => 5,
'image' => '#/tmp/current_file'),
CURLOPT_CUSTOMREQUEST => 'PUT'
);
If I drop the CURLOPT_CUSTOMREQUEST bit, the request is handled as a POST on the server and everything is parsed just fine.
Is there a way to manually invoke PHPs HTTP data parser or some other nice way of doing this?
And yes, I have to send the request as PUT :)
Edit - please read first: this answer is still getting regular hits 7 years later. I have never used this code since then and do not know if there is a better way to do it these days. Please view the comments below and know that there are many scenarios where this code will not work. Use at your own risk.
--
Ok, so with Dave and Everts suggestions I decided to parse the raw request data manually. I didn't find any other way to do this after searching around for about a day.
I got some help from this thread. I didn't have any luck tampering with the raw data like they do in the referenced thread, as that will break the files being uploaded. So it's all regex. This wasnt't tested very well, but seems to be working for my work case. Without further ado and in the hope that this may help someone else someday:
function parse_raw_http_request(array &$a_data)
{
// read incoming data
$input = file_get_contents('php://input');
// grab multipart boundary from content type header
preg_match('/boundary=(.*)$/', $_SERVER['CONTENT_TYPE'], $matches);
$boundary = $matches[1];
// split content by boundary and get rid of last -- element
$a_blocks = preg_split("/-+$boundary/", $input);
array_pop($a_blocks);
// loop data blocks
foreach ($a_blocks as $id => $block)
{
if (empty($block))
continue;
// you'll have to var_dump $block to understand this and maybe replace \n or \r with a visibile char
// parse uploaded files
if (strpos($block, 'application/octet-stream') !== FALSE)
{
// match "name", then everything after "stream" (optional) except for prepending newlines
preg_match('/name=\"([^\"]*)\".*stream[\n|\r]+([^\n\r].*)?$/s', $block, $matches);
}
// parse all other fields
else
{
// match "name" and optional value in between newline sequences
preg_match('/name=\"([^\"]*)\"[\n|\r]+([^\n\r].*)?\r$/s', $block, $matches);
}
$a_data[$matches[1]] = $matches[2];
}
}
Usage by reference (in order not to copy around the data too much):
$a_data = array();
parse_raw_http_request($a_data);
var_dump($a_data);
I used Chris's example function and added some needed functionality, such as R Porter's need for array's of $_FILES. Hope it helps some people.
Here is the class & example usage
<?php
include_once('class.stream.php');
$data = array();
new stream($data);
$_PUT = $data['post'];
$_FILES = $data['file'];
/* Handle moving the file(s) */
if (count($_FILES) > 0) {
foreach($_FILES as $key => $value) {
if (!is_uploaded_file($value['tmp_name'])) {
/* Use getimagesize() or fileinfo() to validate file prior to moving here */
rename($value['tmp_name'], '/path/to/uploads/'.$value['name']);
} else {
move_uploaded_file($value['tmp_name'], '/path/to/uploads/'.$value['name']);
}
}
}
I would suspect the best way to go about it is 'doing it yourself', although you might find inspiration in multipart email parsers that use a similar (if not the exact same) format.
Grab the boundary from the Content-Type HTTP header, and use that to explode the various parts of the request. If the request is very large, keep in mind that you might store the entire request in memory, possibly even multiple times.
The related RFC is RFC2388, which fortunately is pretty short.
I'm surprised no one mentioned parse_str or mb_parse_str:
$result = [];
$rawPost = file_get_contents('php://input');
mb_parse_str($rawPost, $result);
var_dump($result);
http://php.net/manual/en/function.mb-parse-str.php
I haven't dealt with http headers much, but found this bit of code that might help
function http_parse_headers( $header )
{
$retVal = array();
$fields = explode("\r\n", preg_replace('/\x0D\x0A[\x09\x20]+/', ' ', $header));
foreach( $fields as $field ) {
if( preg_match('/([^:]+): (.+)/m', $field, $match) ) {
$match[1] = preg_replace('/(?<=^|[\x09\x20\x2D])./e', 'strtoupper("\0")', strtolower(trim($match[1])));
if( isset($retVal[$match[1]]) ) {
$retVal[$match[1]] = array($retVal[$match[1]], $match[2]);
} else {
$retVal[$match[1]] = trim($match[2]);
}
}
}
return $retVal;
}
From http://php.net/manual/en/function.http-parse-headers.php
Here is a universal solution working with arbitrary multipart/form-data content and tested for POST, PUT, and PATCH:
/**
* Parse arbitrary multipart/form-data content
* Note: null result or null values for headers or value means error
* #return array|null [{"headers":array|null,"value":string|null}]
* #param string|null $boundary
* #param string|null $content
*/
function parse_multipart_content(?string $content, ?string $boundary): ?array {
if(empty($content) || empty($boundary)) return null;
$sections = array_map("trim", explode("--$boundary", $content));
$parts = [];
foreach($sections as $section) {
if($section === "" || $section === "--") continue;
$fields = explode("\r\n\r\n", $section);
if(preg_match_all("/([a-z0-9-_]+)\s*:\s*([^\r\n]+)/iu", $fields[0] ?? "", $matches, PREG_SET_ORDER) === 2) {
$headers = [];
foreach($matches as $match) $headers[$match[1]] = $match[2];
} else $headers = null;
$parts[] = ["headers" => $headers, "value" => $fields[1] ?? null];
}
return empty($parts) ? null : $parts;
}
Update
The function was updated to support arrays in form fields. That is fields like level1[level2] will be translated into proper (multidimensional) arrays.
I've just added a small function to my HTTP20 library, that can help with this. It is made to parse form data for PUT, DELETE and PATCH and add it to respective static variable to simulate $_POST global.
For now it's just for text fields, though, no binary support, since I currently do not have a good use case in my project to properly test it and I'd prefer not to share something I can't test extensively. But if I do get to it at some point - I will update this answer.
Here is the code:
public function multiPartFormParse(): void
{
#Get method
$method = $_SERVER['HTTP_ACCESS_CONTROL_REQUEST_METHOD'] ?? $_SERVER['REQUEST_METHOD'] ?? null;
#Get Content-Type
$contentType = $_SERVER['CONTENT_TYPE'] ?? '';
#Exit if not one of the supported methods or wrong content-type
if (!in_array($method, ['PUT', 'DELETE', 'PATCH']) || preg_match('/^multipart\/form-data; boundary=.*$/ui', $contentType) !== 1) {
return;
}
#Get boundary value
$boundary = preg_replace('/(^multipart\/form-data; boundary=)(.*$)/ui', '$2', $contentType);
#Get input stream
$formData = file_get_contents('php://input');
#Exit if failed to get the input or if it's not compliant with the RFC2046
if ($formData === false || preg_match('/^\s*--'.$boundary.'.*\s*--'.$boundary.'--\s*$/muis', $formData) !== 1) {
return;
}
#Strip ending boundary
$formData = preg_replace('/(^\s*--'.$boundary.'.*)(\s*--'.$boundary.'--\s*$)/muis', '$1', $formData);
#Split data into array of fields
$formData = preg_split('/\s*--'.$boundary.'\s*Content-Disposition: form-data;\s*/muis', $formData, 0, PREG_SPLIT_NO_EMPTY);
#Convert to associative array
$parsedData = [];
foreach ($formData as $field) {
$name = preg_replace('/(name=")(?<name>[^"]+)("\s*)(?<value>.*$)/mui', '$2', $field);
$value = preg_replace('/(name=")(?<name>[^"]+)("\s*)(?<value>.*$)/mui', '$4', $field);
#Check if we have multiple keys
if (str_contains($name, '[')) {
#Explode keys into array
$keys = explode('[', trim($name));
$name = '';
#Build JSON array string from keys
foreach ($keys as $key) {
$name .= '{"' . rtrim($key, ']') . '":';
}
#Add the value itself (as string, since in this case it will always be a string) and closing brackets
$name .= '"' . trim($value) . '"' . str_repeat('}', count($keys));
#Convert into actual PHP array
$array = json_decode($name, true);
#Check if we actually got an array and did not fail
if (!is_null($array)) {
#"Merge" the array into existing data. Doing recursive replace, so that new fields will be added, and in case of duplicates, only the latest will be used
$parsedData = array_replace_recursive($parsedData, $array);
}
} else {
#Single key - simple processing
$parsedData[trim($name)] = trim($value);
}
}
#Update static variable based on method value
self::${'_'.strtoupper($method)} = $parsedData;
}
Obviously you can safely remove method check and assignment to a static, if you do not those.
Have you looked at fopen("php://input", "r") for parsing the content?
Headers can also be found as $_SERVER['HTTP_*'], names are always uppercased and dashes become underscores, eg $_SERVER['HTTP_ACCEPT_LANGUAGE'].