We have an iOS app that sends data in an encoded format. In PHP the following code will decode it properly.
bson_decode(pack("H*", $hex_string));
In Python, the following code will create a valid encoded object that the PHP code can then decode (data is a dict in this).
from bson import BSON
def encode(data):
    return str(BSON.encode(data)).encode('hex')
The following Python code will decode a string that was encoded by the above Python code:
from bson import BSON
def parse(str):
    hexed = str.decode('hex')
    return BSON.decode(BSON(hexed))
In theory that should decode data sent from the app as well, but it throws the following exception:
bson.errors.InvalidBSON: bad eoo
It looks like the Objective-C code that encodes the data in the app adds some extra padding. If I remove the last characters from the app-encoded string, it works. Is there anything I can do to account for this? Changing the app code is NOT possible. Even if it were, there are millions of devices running the old code which I need to support, so I still need a fix for this.
According to the BSON specification, BSON documents must be terminated with a NULL byte (\x00). Have you checked whether the byte string you are trying to decode is NULL-terminated? If not, you may need to append a NULL byte at the end.
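Since the question describes extra trailing padding rather than a missing terminator, another option is to trim the input to the length the document declares for itself: a BSON document starts with a little-endian int32 holding its total size, and its last byte must be \x00. A minimal sketch, assuming the padding only appears after the declared length (the helper names trim_to_declared_length and hex_to_bson_bytes are made up for illustration):

```python
import binascii
import struct


def trim_to_declared_length(raw):
    """Cut a byte string down to the size declared in its BSON header.

    Assumption: any padding added by the sender sits after the document.
    """
    if len(raw) < 5:
        raise ValueError("too short to be a BSON document")
    # The first four bytes are the total document size (little-endian int32).
    (declared,) = struct.unpack("<i", raw[:4])
    if declared < 5 or declared > len(raw):
        raise ValueError("declared size %d does not fit the input" % declared)
    doc = raw[:declared]
    # A well-formed document ends with a single NULL byte.
    if doc[-1:] != b"\x00":
        raise ValueError("document is not NULL-terminated")
    return doc


def hex_to_bson_bytes(hex_string):
    """Hex string -> raw bytes, with trailing padding removed."""
    return trim_to_declared_length(binascii.unhexlify(hex_string))
```

The trimmed bytes can then be handed to the BSON decoder in place of the raw decoded hex string.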
Related
I am trying to fetch a byte array from a C# Web API. A C# client can fetch the byte array perfectly, but when it arrives in PHP it shows up as a random-looking string. The string looks like it is encoded.
I have tried the same API with POSTMAN also. Postman also provided same encoded string. How can I fetch byte array from C# web API in PHP?
I am using an HTTP request with a content type of application/x-www-form-urlencoded. The API is supposed to return a byte array for the following text:
Required Byte array of content: This is demo file.
Actual response: VGhpcyBpcyBkZW1vIGZpbGUuCg==
The Web API returns the byte array as a base64-encoded string. You need to decode that string from base64 to get the bytes back.
string str = yourbytestring;
byte[] cnvbyte = Convert.FromBase64String(str.ToString());
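The same decoding step can be checked against the sample response from the question. A small sketch in Python, only to show that the "random string" is plain base64 (PHP's base64_decode() performs the same operation):

```python
import base64

# The "Actual response" string quoted in the question.
encoded = "VGhpcyBpcyBkZW1vIGZpbGUuCg=="

# Decoding base64 recovers the original bytes, including a trailing newline.
decoded = base64.b64decode(encoded)
print(decoded)  # b'This is demo file.\n'
```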
I seem to be stuck at sending the compressed messages from PHP to NodeJS over Amazon SQS.
Over on the PHP side I have:
$SQS->sendMessage(Array(
    'QueueUrl' => $queueUrl,
    'MessageBody' => 'article',
    'MessageAttributes' => Array(
        'json' => Array(
            'BinaryValue' => bzcompress(json_encode(Array('type'=>'article','data'=>$vijest))),
            'DataType' => 'Binary'
        )
    )
));
NOTE 1: I also tried putting compressed data directly in the message, but the library gave me an error with some invalid byte data
On the Node side, I have:
body = decodeBzip(message.MessageAttributes.json.BinaryValue);
Where message is from sqs.receiveMessage() call and that part works since it worked for raw (uncompressed messages)
What I am getting is TypeError: improper format
I also tried using:
PHP - NODE
gzcompress() - zlib.inflateraw()
gzdeflate() - zlib.inflate()
gzencode() - zlib.gunzip()
And each of those pairs gave me their version of the same error (essentially, input data is wrong)
Given all that I started to suspect that an error is somewhere in message transmission
What am I doing wrong?
EDIT 1: It seems that the error is somewhere in transmission, since bin2hex() in php and .toString('hex') in Node return totally different values. It seems that Amazon SQS API in PHP transfers BinaryAttribute using base64 but Node fails to decode it. I managed to partially decode it by turning off automatic conversion in amazon aws config file and then manually decoding base64 in node but it still was not able to decode it.
EDIT 2: I managed to accomplish the same thing by using base64_encode() on the PHP side and sending the base64 as the MessageBody (not using MessageAttributes). On the Node side I used new Buffer(messageBody,'base64') and then decodeBzip on that. It all works, but I would still like to know why MessageAttributes is not working as it should. The current base64 adds overhead, and I like to use services as they are intended, not through workarounds.
This is what all the SQS libraries do under the hood. You can get the php source code of the SQS library and see for yourself. Binary data will always be base64 encoded (when using MessageAttributes or not, does not matter) as a way to satisfy the API requirement of having form-url-encoded messages.
I do not know how long the data in your $vijest is, but I am willing to bet that after zipping and then base64 encoding it will be bigger than before.
So my answer to you would be two parts (plus a third if you are really stubborn):
When looking at the underlying raw API it is absolutely clear that not using MessageAttributes does NOT add additional overhead from base64. Instead, using MessageAttributes adds some slight additional overhead because of the structure of the data enforced by the SQS php library. So not using MessageAttributes is clearly NOT a workaround and you should do it if you want to zip the data yourself and you got it to work that way.
Because of the nature of an HTTP POST request, it is a very bad idea to compress your data inside your application. Base64 overhead will likely nullify the compression advantage, and you are probably better off sending plain text.
If you absolutely do not believe me or the API spec or the HTTP spec and want to proceed, then I would advise to send a simple short string 'teststring' in the BinaryValue parameter and compare what you sent with what you got. That will make it very easy to understand the transformations the SQS library is doing on the BinaryValue parameter.
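Whether compression still pays off after base64 depends on the payload, so it is easier to measure than to argue. A rough sketch in Python (the wire_sizes helper and the sample payload are hypothetical, not part of any SQS library) that reports the size at each stage; base64 always emits 4 output bytes per 3 input bytes:

```python
import base64
import bz2
import json


def wire_sizes(payload):
    """Return byte counts for the plain, compressed and base64 forms."""
    plain = json.dumps(payload).encode("utf-8")
    compressed = bz2.compress(plain)
    # Base64 turns every 3 bytes into 4 ASCII characters (plus padding).
    encoded = base64.b64encode(compressed)
    return {
        "plain": len(plain),
        "compressed": len(compressed),
        "base64": len(encoded),
    }


if __name__ == "__main__":
    # Hypothetical message, shaped like the one in the question.
    sample = {"type": "article", "data": "some article text " * 50}
    print(wire_sizes(sample))
```

Comparing "plain" with "base64" for a realistic message shows whether the compression step is worth keeping.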
gzcompress() would be decoded by zlib.Inflate(). gzdeflate() would be decoded by zlib.InflateRaw(). gzencode() would be decoded by zlib.Gunzip(). So out of the three you listed, two are wrong, but one should work.
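The three PHP functions produce three different containers around the same DEFLATE stream: gzcompress() writes the zlib format, gzdeflate() writes raw DEFLATE, and gzencode() writes gzip. A sketch in Python that reproduces the pairing with the standard zlib module (make_samples and decode are illustrative names); a mismatched decoder fails in the same way as the errors described in the question:

```python
import gzip
import zlib


def make_samples(data):
    """Compress the same bytes into the three container formats."""
    raw = zlib.compressobj(9, zlib.DEFLATED, -15)  # negative wbits = raw DEFLATE
    return {
        "zlib": zlib.compress(data),             # like PHP gzcompress()
        "raw": raw.compress(data) + raw.flush(),  # like PHP gzdeflate()
        "gzip": gzip.compress(data),             # like PHP gzencode()
    }


def decode(kind, blob):
    """Decompress using the wbits value that matches the container."""
    wbits = {"zlib": 15, "raw": -15, "gzip": 31}[kind]
    return zlib.decompress(blob, wbits)


if __name__ == "__main__":
    samples = make_samples(b"hello hello hello hello")
    for kind, blob in samples.items():
        print(kind, decode(kind, blob))
    try:
        decode("zlib", samples["raw"])  # wrong pairing
    except zlib.error as exc:
        print("mismatch:", exc)
```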
I have been trying to unserialize PHP session data in Python using the phpserialize and Serek's modules (found via "Unserialize PHP data in python"), but it seems impossible.
Both modules expect PHP session data to be like:
a:2:{s:3:"Usr";s:5:"AxL11";s:2:"Id";s:1:"2";}
But the data stored in the session file is:
Id|s:1:"2";Usr|s:5:"AxL11";
Any help would be very much appreciated.
The default algorithm used for PHP session serialization is not the one used by serialize, but another, internal and broken, format called php, which cannot store numeric indexes, or string indexes containing the special characters | and !, in $_SESSION.
The correct solution is to change the crippled default session serialization format to the one supported by Armin Ronacher's original phpserialize library, or even to serialize and deserialize as JSON, by changing the session.serialize_handler INI setting.
I decided to use the former for maximal compatibility on the PHP side by using
ini_set('session.serialize_handler', 'php_serialize');
which makes the new sessions compatible with standard phpserialize.
After reaching page 3 on Google, I found a fork of the original application phpserialize that worked with the string that I provided:
>>> loads('Id|s:1:"2";Usr|s:5:"AxL11";')
{'Id': '2', 'Usr': 'AxL11'}
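For reference, the session format is just a run of key|value; pairs whose values use the normal serialize encoding. A minimal sketch of a reader, assuming ASCII data and scalar values only (strings, integers, floats, booleans, null); it is not the fork's implementation, and arrays or objects are deliberately not handled:

```python
def _read_scalar(data, pos):
    """Read one serialized scalar starting at pos; return (value, next_pos)."""
    kind = data[pos]
    if kind == "N":  # N;
        return None, pos + 2
    if kind == "s":  # s:<length>:"<chars>";
        colon = data.index(":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2  # skip the ':' and the opening quote
        # Assumption: ASCII text, so PHP's byte length equals character count.
        return data[start:start + length], start + length + 2
    end = data.index(";", pos)
    raw = data[pos + 2:end]
    if kind == "i":
        return int(raw), end + 1
    if kind == "b":
        return raw == "1", end + 1
    if kind == "d":
        return float(raw), end + 1
    raise ValueError("unsupported type %r at offset %d" % (kind, pos))


def parse_php_session(data):
    """Parse 'key|value;key|value;' session data into a dict."""
    result = {}
    pos = 0
    while pos < len(data):
        bar = data.index("|", pos)
        key = data[pos:bar]
        result[key], pos = _read_scalar(data, bar + 1)
    return result


if __name__ == "__main__":
    print(parse_php_session('Id|s:1:"2";Usr|s:5:"AxL11";'))
```

Reading the declared string length, rather than searching for the closing quote, is what lets values contain ;, | or " without confusing the parser.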
This is how I do it in a stupid way:
At first, convert Id|s:1:"2";Usr|s:5:"AxL11"; to a query string Id=2&Usr=AxL11& then use parse_qs:
import sys
import re
if sys.version_info >= (3, 0):
    from urllib.parse import parse_qs, quote
else:
    from urlparse import parse_qs
    from urllib import quote

def parse_php_session(path):
    with open(path, 'r') as sess:
        return parse_qs(
            re.sub(r'\|s:([0-9]+):"?(.*?)(?=[^;|]+\|s:[0-9]+:|$)',
                   lambda m: '=' + quote(m.group(2)[:int(m.group(1))]) + '&',
                   sess.read().rstrip().rstrip(';') + ';')
        )

print(parse_php_session('/session-save-path/sess_0123456789abcdef'))
# {'Id': ['2'], 'Usr': ['AxL11']}
It used to work without replacing ; with & (both were accepted as separators), but since Python 3.10 the default separator for parse_qs is & only.
I have a PHP webservice which currently returns a zip archive as its only output. I'm reading the zip archive from disk using file_get_contents and sending it back as the body of the response.
I'd like it to return some additional metadata, in a JSON format:
{
"generatedDate": "2012-11-28 12:00:00",
"status": "unchanged",
"rawData": <zip file in raw form>
}
The iOS app which talks to this service will receive this response, parse the JSON, and then store the zip file locally for its own use.
However, if I try to stuff the result of file_get_contents into json_encode, it rightfully complains that the string is not in UTF-8 format. If I UTF-8-encode it using mb_convert_encoding($rawData, 'UTF-8', mb_detect_encoding($rawData, 'UTF-8, ISO-8859-1', true));, it will encode it happily, but I can't find a way to reverse the operation on the client (calling [dataString dataUsingEncoding:NSUTF8StringEncoding] and then treating the result as a zip file fails with BOM could not extract archive: Couldn't read pkzip local header).
Can anyone suggest a good way to insert a blob of raw data as one field in a JSON response?
Surely if you successfully included the raw data in the JSON then you'd have the opposite problem at the other end, when you try to decode the JSON and whatever you use to decode can't handle the raw data?
Instead, I would suggest that you send the raw data only in the response body, and use headers to send the metadata.
Strike this question.
It turns out that UTF-8 encoding raw data like this is nonstandard at best, and the standard solution is base-64 encoding it and then using a base-64 decoder to recover it on the client:
$this->response(200, array('rawData' => base64_encode($rawData)));
...
NSString *rawDataString = [[response responseJSON] objectForKey:@"rawData"];
NSData *rawData = [Base64 decode:rawDataString];
ZIP archives are not text—they are binary files! Trying to convert your archive from ISO-8859-1 to UTF-8 makes as much sense as trying to rotate it.
There're several algorithms to serialize binary streams as text but they'll all increase the file size. If that's not an issue, have a look at:
base64_encode()
bin2hex()
unpack()
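The base64 route from the accepted fix can be sketched end to end. This is an illustration in Python rather than the PHP/Objective-C pair used in the thread, and build_response and extract_raw_data are made-up names; the point is that the binary field survives the JSON round trip byte for byte:

```python
import base64
import json


def build_response(raw_data, status):
    """Server side: wrap binary data in a JSON document via base64."""
    return json.dumps({
        "status": status,
        # b64encode returns bytes; JSON needs text, so decode to ASCII.
        "rawData": base64.b64encode(raw_data).decode("ascii"),
    })


def extract_raw_data(body):
    """Client side: parse the JSON and recover the original bytes."""
    return base64.b64decode(json.loads(body)["rawData"])


if __name__ == "__main__":
    # Bytes that are not valid UTF-8, standing in for a zip archive.
    blob = b"PK\x03\x04\xff\xfe\x00"
    body = build_response(blob, "unchanged")
    print(extract_raw_data(body) == blob)
```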
I have to deserialize a dictionary in PHP that was serialized using cPickle in Python.
In this specific case I probably could just regexp the wanted information, but is there a better way? Any extensions for PHP that would allow me to deserialize more natively the whole dictionary?
Apparently it is serialized in Python like this:
import cPickle as pickle
data = { 'user_id' : 5 }
pickled = pickle.dumps(data)
print pickled
Contents of such serialization cannot be pasted easily to here, because it contains binary data.
If you want to share data objects between programs written in different languages, it might be easier to serialize/deserialize using something like JSON instead. Most major programming languages have a JSON library.
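For the dictionary in the question, the JSON route is a one-line change on the Python side. A small sketch; the resulting string can be read in PHP with json_decode():

```python
import json

data = {"user_id": 5}

# json.dumps produces plain text that any JSON library can read back.
serialized = json.dumps(data)
print(serialized)  # {"user_id": 5}
```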
Can you do a system call? You could use a python script like this to convert the pickle data into json:
# pickle2json.py
import sys, optparse, cPickle, os
try:
    import json
except ImportError:
    import simplejson as json

# Setup the arguments this script can accept from the command line
parser = optparse.OptionParser()
parser.add_option('-p','--pickled_data_path',dest="pickled_data_path",type="string",help="Path to the file containing pickled data.")
parser.add_option('-j','--json_data_path',dest="json_data_path",type="string",help="Path to where the json data should be saved.")
opts,args=parser.parse_args()

# Load in the pickled data from either a file or the standard input stream
if opts.pickled_data_path:
    # Open in binary mode so binary pickle protocols are read unchanged
    unpickled_data = cPickle.loads(open(opts.pickled_data_path, 'rb').read())
else:
    unpickled_data = cPickle.loads(sys.stdin.read())

# Output the json version of the data either to another file or to the standard output
if opts.json_data_path:
    open(opts.json_data_path, 'w').write(json.dumps(unpickled_data))
else:
    print json.dumps(unpickled_data)
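This way, if you're getting the data from a file, you could do something like this:
<?php
// exec() fills $output (passed by reference) with one entry per line of output
exec("python pickle2json.py -p pickled_data.txt", $output);
$json_data = json_decode(implode("\n", $output), true);
?>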
or if you want to save it out to a file this:
<?php
system("python pickle2json.py -p pickled_data.txt -j p_to_j.json");
?>
All the code above probably isn't perfect (I'm not a PHP developer), but would something like this work for you?
I know this is ancient, but I've just needed to do this for a Django 1.3 app (circa 2012) and found this:
https://github.com/terryf/Phpickle
So just in case, one day, someone else needs the same solution.
If the pickle is being created by the code that you showed, then it won't contain binary data -- unless you are calling newlines "binary data". See the Python docs. The following code was run with Python 2.6.
>>> import cPickle
>>> data = {'user_id': 5}
>>> for protocol in (0, 1, 2): # protocol 0 is the default
...     print protocol, repr(cPickle.dumps(data, protocol))
...
0 "(dp1\nS'user_id'\np2\nI5\ns."
1 '}q\x01U\x07user_idq\x02K\x05s.'
2 '\x80\x02}q\x01U\x07user_idq\x02K\x05s.'
>>>
Which of the above looks most like what you are seeing? Can you post the pickled file contents as displayed by a hex editor/dumper or whatever is the PHP equivalent of Python's repr()? How many items in a typical dictionary? What data types other than "integer" and "string of 8-bit bytes" (what encoding?)?