While evaluating the performance of PHP frameworks, I came across a strange problem:
Sending JSON as application/json seems to be much slower than sending it with no extra header (which appears to fall back to text/html).
Example #1 (application/json)
header('Content-Type: application/json');
echo json_encode($data);
Example #2 (text/html)
echo json_encode($data);
Testing with Apache Bench (ab -c10 -n1000) gives me:
Example #1: 350 requests/sec
Example #2: 440 requests/sec
which suggests that setting the extra header makes things a little slower.
But:
Getting the same JSON via "ajax" (jQuery.getJSON('url', function(j){console.log(j)});) makes the difference much bigger (timing as seen in the Chrome Web Inspector):
Example #1: 340 ms / request
Example #2: 980 ms / request
What is the reason for this difference?
Is there a reason to use application/json despite the performance difference?
I'll take up the last part of the question:
Is there a reason to use application/json despite the performance difference?
Answer: Yes
Why:
1) With text/html, malformed JSON often goes uncaught until you try to parse it. With application/json, the response will fail outright, so you can easily debug whenever the JSON is malformed.
2) If you are viewing the JSON in a browser, the header will present it in user-friendly formatting; text/html shows it more as a blob.
3) If you are consuming this JSON on your web page, application/json is immediately converted into a JS object, and you can access its members as obj.firstnode.childnode, etc.
4) The callback feature can work with application/json, but not with text/html.
Note:
Using gzip will largely alleviate the performance problem. text/html will still be a bit faster, but it is not the recommended way to serve JSON objects.
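If nothing at the server level compresses responses yet, a minimal sketch in PHP (ob_gzhandler only compresses when the client sends a suitable Accept-Encoding header):
ob_start('ob_gzhandler');
header('Content-Type: application/json');
echo json_encode($data);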
I would like to see more insight on the performance side, though. The header length is definitely not causing the performance issue; it more likely has to do with how your web server analyzes the header format.
Does your server handle gzip/deflate differently depending on the content type? Mine does. I believe ab does not accept gzip by default (you can set this in ab with a custom header via the -H flag), but Chrome will always say it accepts gzip.
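For example, something like this should make ab request compressed responses, closer to what Chrome does (the URL is a placeholder):
ab -c10 -n1000 -H "Accept-Encoding: gzip,deflate" http://www.example.com/whatever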
You can use curl test to see if the files are different sizes:
curl http://www.example.com/whatever --silent -H "Accept-Encoding: gzip,deflate" --write-out "size_download=%{size_download}\n" --output /dev/null
You can also look at the headers to see if gzipping is applied:
curl http://www.example.com/whatever -I -H "Accept-Encoding: gzip,deflate"
Related
I seem to be stuck sending compressed messages from PHP to Node.js over Amazon SQS.
Over on the PHP side I have:
$SQS->sendMessage(Array(
    'QueueUrl' => $queueUrl,
    'MessageBody' => 'article',
    'MessageAttributes' => Array(
        'json' => Array(
            'BinaryValue' => bzcompress(json_encode(Array('type' => 'article', 'data' => $vijest))),
            'DataType' => 'Binary'
        )
    )
));
NOTE 1: I also tried putting the compressed data directly in the message body, but the library gave me an error about invalid byte data.
On the Node side, I have:
body = decodeBzip(message.MessageAttributes.json.BinaryValue);
where message comes from an sqs.receiveMessage() call. That part works, since it worked for raw (uncompressed) messages.
What I am getting is TypeError: improper format
I also tried using:
PHP - Node
gzcompress() - zlib.inflateRaw()
gzdeflate() - zlib.inflate()
gzencode() - zlib.gunzip()
Each of those pairs gave me its own version of the same error (essentially, that the input data is wrong).
Given all that, I started to suspect that the error is somewhere in the message transmission.
What am I doing wrong?
EDIT 1: It seems that the error is somewhere in transmission, since bin2hex() in PHP and .toString('hex') in Node return totally different values. It appears that the Amazon SQS API in PHP transfers a BinaryValue using base64, but Node fails to decode it. I managed to partially decode it by turning off automatic conversion in the Amazon AWS config file and then manually decoding the base64 in Node, but it still could not be decoded.
EDIT 2: I managed to accomplish the same thing by using base64_encode() on the PHP side and sending the base64 as the MessageBody (not using MessageAttributes). On the Node side I used new Buffer(messageBody, 'base64') and then decodeBzip on that. It all works, but I would still like to know why MessageAttributes are not working as they should. The current base64 adds overhead, and I prefer to use services as they are intended, not via workarounds.
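For reference, the EDIT 2 workaround looks roughly like this on the PHP side ($SQS, $queueUrl and $vijest as above):
$SQS->sendMessage(Array(
    'QueueUrl' => $queueUrl,
    'MessageBody' => base64_encode(bzcompress(json_encode(Array('type' => 'article', 'data' => $vijest))))
));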
This is what all the SQS libraries do under the hood. You can get the PHP source code of the SQS library and see for yourself: binary data will always be base64 encoded (whether you use MessageAttributes or not) as a way to satisfy the API requirement of form-url-encoded messages.
I do not know how long the data in your $vijest is, but I am willing to bet that after zipping and then base64 encoding it will be bigger than before.
So my answer to you would be two parts (plus a third if you are really stubborn):
When looking at the underlying raw API, it is absolutely clear that not using MessageAttributes does NOT add additional base64 overhead. Instead, using MessageAttributes adds slight additional overhead because of the data structure enforced by the SQS PHP library. So not using MessageAttributes is clearly NOT a workaround, and you should do it if you want to zip the data yourself and you got it to work that way.
Because of the nature of an HTTP POST request, it is a very bad idea to compress your data inside your application. The base64 overhead will likely nullify the compression advantage, and you are probably better off sending plain text.
If you absolutely do not believe me, the API spec, or the HTTP spec and want to proceed, then I would advise sending a simple short string 'teststring' in the BinaryValue parameter and comparing what you sent with what you received. That will make it very easy to understand the transformations the SQS library applies to the BinaryValue parameter.
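A minimal sketch of that experiment, reusing $SQS and $queueUrl from the question:
$SQS->sendMessage(Array(
    'QueueUrl' => $queueUrl,
    'MessageBody' => 'test',
    'MessageAttributes' => Array(
        'json' => Array(
            'BinaryValue' => 'teststring',
            'DataType' => 'Binary'
        )
    )
));
Then dump message.MessageAttributes.json.BinaryValue on the Node side and compare it byte for byte with 'teststring'.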
gzcompress() would be decoded by zlib.inflate(). gzdeflate() would be decoded by zlib.inflateRaw(). gzencode() would be decoded by zlib.gunzip(). So out of the three pairs you listed, two are wrong, but one should work.
I use WordPress 4.1.1.
I tried to install the JSON API plugin.
Strange letters are displayed above the JSON content, and they change after each refresh of the page.
I tried printing an extra letter from the plugin code; it appeared below these figures. So is the problem in the WordPress system itself?
Please help me to understand and to remove them, because I can't parse my JSON.
On localhost it works fine with the same properties and data...
The letters are: 7b00c, 78709, 6eb3d... and they change with each refresh.
The strange characters are probably chunk-sizes.
Content-Length
When a server-side process sends a response through an HTTP server, the data will typically be stored in a buffer before it is transmitted to the client (browser). If the entire response fits in the buffer in a timely manner, the server will declare the size in a Content-Length: header, and send the response as-is to the client.
Chunked Transfer Coding
If the response does not fit in the buffer, or the server decides to vacate the buffer for other reasons before the full size is known, it will instead send the response in chunks. This is indicated by the Transfer-Encoding: chunked header. Each chunk is preceded by its length in hexadecimal (followed by a CRLF pair). The end of the response is indicated by a 0 chunk-size. The exact syntax is detailed below.
Solution
If you are parsing the HTTP response yourself, there are all sorts of intricacies that you need to consider. Chunked encoding is one of them. You need to check for the Transfer-Encoding: chunked header and assemble the response by parsing and stripping out the chunk-size parts.
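A rough sketch of that stripping in PHP, assuming the headers have already been cut off and $body holds only the chunked payload (trailer fields, if any, are ignored):
function dechunk($body) {
    $out = '';
    $pos = 0;
    while (true) {
        $eol = strpos($body, "\r\n", $pos);
        if ($eol === false) {
            break; // malformed body
        }
        $line = substr($body, $pos, $eol - $pos);
        // Drop any chunk extension after ';' before reading the hex size.
        $semi = strpos($line, ';');
        if ($semi !== false) {
            $line = substr($line, 0, $semi);
        }
        $size = hexdec(trim($line));
        if ($size === 0) {
            break; // last-chunk reached
        }
        $out .= substr($body, $eol + 2, $size);
        $pos = $eol + 2 + $size + 2; // skip chunk-data and its trailing CRLF
    }
    return $out;
}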
It's much easier to use a library such as cURL which will handle all the details for you.
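With PHP's cURL bindings, for instance, something like this should hand you a fully de-chunked (and decompressed) body; the URL is a placeholder:
$ch = curl_init('http://www.example.com/whatever');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_ENCODING, ''); // accept and transparently decode any supported encoding
$body = curl_exec($ch);
curl_close($ch);
$data = json_decode($body, true);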
One hack to avoid chunks is to send the response using HTTP/1.0 rather than HTTP/1.1. In HTTP/1.0, the length is indicated either by the Content-Length: header, or by closing the connection.
Syntax
This is the syntax for chunked bodies specified in RFC 7230 - "Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing" (ABNF notation):
4.1. Chunked Transfer Coding
chunked-body   = *chunk
                 last-chunk
                 trailer-part
                 CRLF

chunk          = chunk-size [ chunk-ext ] CRLF
                 chunk-data CRLF
chunk-size     = 1*HEXDIG
last-chunk     = 1*("0") [ chunk-ext ] CRLF

chunk-data     = 1*OCTET ; a sequence of chunk-size octets

trailer-part   = *( header-field CRLF )
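For illustration (this example is not from the RFC), the body "Wikipedia" sent as two chunks would look like this on the wire, with each CRLF written out as \r\n:
4\r\n
Wiki\r\n
5\r\n
pedia\r\n
0\r\n
\r\n
The receiver strips the hex sizes and CRLFs and is left with "Wikipedia". The figures you see (7b00c, 78709, 6eb3d, ...) are exactly such chunk-size lines.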
How do I correctly compress a string so that PHP will be able to decompress it?
I tried this:
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

public static byte[] compress(String string) throws IOException {
    ByteArrayOutputStream os = new ByteArrayOutputStream(string.length());
    DeflaterOutputStream gos = new DeflaterOutputStream(os);
    // ALSO TRIED GZIPOutputStream, same results!
    gos.write(string.getBytes());
    gos.close();
    byte[] compressed = os.toByteArray();
    os.close();
    return compressed;
}
But PHP does not recognize the output as a valid gzip-compressed string...
The problem seems to be some headers/footers that Android adds...
For example, when I compress the word something via PHP with gzcompress() I get results similar to Android's, but not similar enough for PHP to read them:
something (HEX DUMP):
Android: 1f8b08000000000000002bcecf4d2dc9c8cc4b0700fb31da0909000000
PHP: 789c2bcecf4d2dc9c8cc4b0700134703cf
The weirdest thing is that changing GZIPOutputStream to DeflaterOutputStream fixed the problem for the word something, but the problem still appears with longer strings...
PS. Removing the leading 10 bytes from the Android-generated data does not help at all.
EDIT: I tried to decompress it in PHP with:
gzdecode() - this function does not exist in the standard Debian PHP5 build.
gzuncompress() - does not work.
And some functions emulating gzdecode() from the php.net comments that don't really do much.
All of the above, both with the first 10 bytes removed and with them left in.
PS2. I have tried every single solution from Stack Overflow and other sources, and still nothing. This is not a duplicate.
EDIT2 (BINARY DUMP): Sample data generated by Android that can't be decompressed by gzuncompress() or the pseudo-gzdecode() functions from php.net: data.compressed.
It is supposed to be some JSON after decompression.
The Android data that starts with 1f8b is a gzip stream. In PHP you use gzdecode() for that; gzencode() in PHP makes gzip streams.
The PHP data that starts with 789c is a zlib stream. You used gzcompress() to make that, and you would use gzuncompress() to decode it.
The compressed data contained within both of those streams, starting with 2bce, is raw deflate data. You can use gzinflate() to decode that if you happen to make it somewhere, and gzdeflate() to generate raw deflate.
Just to rant: gzencode(), gzcompress(), and gzdeflate() are some of the most misleading function names ever concocted, since only one of them is related to gzip yet all start with gz, and nothing in the name gzcompress() indicates zlib.
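A quick round trip showing the three matched pairs (gzdecode() requires PHP 5.4+):
// gzip container (starts 1f 8b): gzencode() <-> gzdecode()
var_dump(gzdecode(gzencode('something')));       // string(9) "something"
// zlib container (starts 78 9c): gzcompress() <-> gzuncompress()
var_dump(gzuncompress(gzcompress('something'))); // string(9) "something"
// raw deflate (no container): gzdeflate() <-> gzinflate()
var_dump(gzinflate(gzdeflate('something')));     // string(9) "something"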
Update:
The "EDIT2" data is, for some reason, doubly compressed. It was compressed first to the zlib format, and then that zlib stream was compressed to the gzip format. (Though gzip couldn't compress the already compressed data, so it's a little bigger.)
You should repair whatever is making it doubly compressed. Or, if you have no control over that, you can doubly decompress it: first strip the gzip header following the RFC 1952 specification and apply gzinflate() to the raw deflate data, then use gzuncompress() on the result.
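On a PHP version where gzdecode() is available (5.4+), the double decompression collapses to a two-liner; a sketch, with $doublyCompressed standing in for the EDIT2 data:
$zlibStream = gzdecode($doublyCompressed); // undo the outer gzip layer
$json = gzuncompress($zlibStream);         // undo the inner zlib layer
var_dump(json_decode($json, true));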
Sometimes when downloading the source of a web page and trying to decode it, I get an error: gzdecode() insufficient memory (the memory limit is 500M; usage is far below that).
I include the headers with my cURL output; those are separated from the content correctly before decoding. The Content-Encoding header of the pages is clearly gzip. I read on php.net that including a length argument could cause such a crash, but I do not use the length argument with gzdecode().
So while seemingly everything should be fine, I still get the error. Last time I hit it with this page: https://ahmia.fi/address/.
Is there perhaps something about HTTPS I do not know about? My cURL setting is \CURLOPT_SSL_VERIFYPEER => false.
Any help appreciated!
CURLOPT_ENCODING The contents of the "Accept-Encoding: " header. This enables decoding of the response. Supported encodings are "identity", "deflate", and "gzip". If an empty string, "", is set, a header containing all supported encoding types is sent.
try this:
CURLOPT_ENCODING => ""
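In context, a minimal sketch (the SSL option is copied from the question):
$ch = curl_init('https://ahmia.fi/address/');
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING       => '',    // send all supported encodings; curl decodes the response itself
    CURLOPT_SSL_VERIFYPEER => false, // as in the question
));
$html = curl_exec($ch);
curl_close($ch);
// No manual gzdecode() needed: curl has already inflated the body.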
I'm trying to mimic an application that sends octet streams to and from a server. The data contained in the body looks like raw bytes, and I'm fairly certain the data being sent for each command is static, so I'm hoping to map the bytes to something more readable in my application. For example, I'll have an array like "test" => "&^D^^&#*#dgkel", so I can look up "test" and get the real bytes that need to be sent.
The trouble is, PHP seems to convert these bytes. I'm not sure if it is an encoding problem or what, but what has been happening is this: I give it some bytes (raw binary, unprintable here) with a length of 67, I believe, but PHP will say (when I do a var_dump of the HTTP request) that the headers sent contained Content-Length: 174 or something close to that, and the bytes themselves look mangled as well.
So I'm not really sure how to fix this... Anyone have any ideas? Cheers!
Edit, a little PHP:
$request = new HttpRequest($this->GetMessageURL(), HTTP_METH_POST);
$request->addHeaders($headers);
$request->addRawPostData($buttonMapping[$button]);
$request->send();
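If the mismatch comes from the raw bytes being re-encoded (e.g. as UTF-8) somewhere along the way, one illustrative fix is to define the payload with explicit hex escapes, so the PHP string contains exactly the intended bytes. The byte values below are placeholders, not the real payload:
// Hypothetical mapping: each command name points to its exact raw bytes.
$buttonMapping = array(
    'test' => "\x26\x5E\x44\x00\xDA\xBE\x23", // placeholder bytes
);
$request = new HttpRequest($this->GetMessageURL(), HTTP_METH_POST);
$request->addHeaders($headers);
$request->addRawPostData($buttonMapping['test']);
$request->send();
// strlen() counts bytes, so strlen($buttonMapping['test']) should now match
// the Content-Length that PHP reports.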