I'm trying to make a PHP socket server and I found two functions that mask and unmask a text message (frame).
I don't think I clearly understand how they work.
These are the functions:
//encode message for transfer to client
function mask($text)
{
$b1 = 0x80 | (0x1 & 0x0f);
$length = strlen($text);
if ($length <= 125)
$header = pack('CC', $b1, $length);
elseif ($length > 125 && $length < 65536)
$header = pack('CCn', $b1, 126, $length);
elseif ($length >= 65536)
$header = pack('CCJ', $b1, 127, $length); // 'J' = unsigned 64-bit big-endian; needs 64-bit PHP 5.6.3+
return $header . $text;
}
//unmask incoming framed message
function unmask($text)
{
$length = ord($text[1]) & 127;
if ($length == 126) {
$masks = substr($text, 4, 4);
$data = substr($text, 8);
} elseif ($length == 127) {
$masks = substr($text, 10, 4);
$data = substr($text, 14);
} else {
$masks = substr($text, 2, 4);
$data = substr($text, 6);
}
$text = "";
for ($i = 0; $i < strlen($data); ++$i) {
$text .= $data[$i] ^ $masks[$i % 4];
}
return $text;
}
What I think I've understood:
mask takes the message and builds a frame by prepending a header whose size depends on the message length (the header bytes are produced with pack(), right?).
unmask is the reverse process.
What I don't understand:
What is the purpose of the variable $b1 used in mask? The syntax of this line is not clear to me.
$b1 = 0x80 | (0x1 & 0x0f);
That line is written in a slightly odd way, but here is what is going on. & is the bitwise AND operator: a bit in the result is 1 only where both operands have a 1. 0x1 is 00000001 and 0x0f is 00001111 in binary.
00000001
&00001111
=00000001
so (0x1 & 0x0f) is just 0x1 or 1.
The | operator is the bitwise OR: a bit in the result is 1 if either operand has a 1 in that position. 0x80 is 10000000, so
10000000
|00000001
=10000001
So the total result is 0x81. Why not just write $b1 = 0x81? I'm guessing that the author of this code copied it from some C code where the 0x1 part was a variable:
byte b1 = 0x80 | (someVariable & 0x0f);
In this case, the bitwise & with 0x0f ensures that only the low 4 bits of someVariable are used, and the high 4 bits of b1 are always 0x8. In the WebSocket frame format (RFC 6455), the top bit is the FIN flag marking the final fragment of a message, and the low 4 bits are the opcode, where 0x1 means a text frame.
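For illustration, here is a minimal C sketch of how that first byte could be assembled when the opcode really is a variable. The function name ws_first_byte and its parameters are hypothetical and not taken from the code in the question; the bit layout (FIN in the top bit, opcode in the low four bits) is the one described in RFC 6455.

```c
#include <stdint.h>

/* Hypothetical helper: build the first byte of a WebSocket frame header.
 * fin    - nonzero if this is the final fragment of the message
 * opcode - frame type; only the low 4 bits are used (0x1 = text, 0x2 = binary)
 * The three reserved bits between FIN and the opcode are left at zero. */
static uint8_t ws_first_byte(int fin, unsigned int opcode)
{
    uint8_t b1 = (uint8_t)(opcode & 0x0f); /* keep only the opcode bits */
    if (fin)
        b1 |= 0x80;                        /* set the FIN flag */
    return b1;
}
```

With fin set and opcode 0x1 this gives 0x81, the same value the PHP line computes.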
I have some PHP code, but I don't understand why it's written this way. Can somebody please explain it to me?
function unmask($text)
{
$length = ord($text[1]) & 127;
if ($length == 126) {
$masks = substr($text, 4, 4);
$data = substr($text, 8);
} elseif ($length == 127) {
$masks = substr($text, 10, 4);
$data = substr($text, 14);
} else {
$masks = substr($text, 2, 4);
$data = substr($text, 6);
}
$text = "";
for ($i = 0; $i < strlen($data); ++$i) {
$text .= $data[$i] ^ $masks[$i % 4];
}
return $text;
}
Specifically, what does:
$length = ord($text[1]) & 127
...mean, exactly?
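As a rough illustration of what that line and the branches after it are doing, the C sketch below reads the same header fields. It assumes a client-to-server frame as described in RFC 6455, where the second byte holds a mask flag in its top bit and a 7-bit length code in the remaining bits. The struct and function names are hypothetical, and the caller is assumed to pass a buffer that already contains the whole header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: where the masking key and the payload start. */
struct ws_header {
    unsigned int len_code; /* low 7 bits of byte 1: 0..125, 126 or 127 */
    int masked;            /* top bit of byte 1 */
    size_t mask_offset;    /* offset of the 4-byte masking key */
    size_t data_offset;    /* offset of the first payload byte */
};

static struct ws_header ws_parse_header(const uint8_t *frame)
{
    struct ws_header h;
    h.len_code = frame[1] & 127;       /* same as ord($text[1]) & 127 */
    h.masked = (frame[1] & 0x80) != 0;
    if (h.len_code == 126)
        h.mask_offset = 4;             /* 2 header bytes + 16-bit length */
    else if (h.len_code == 127)
        h.mask_offset = 10;            /* 2 header bytes + 64-bit length */
    else
        h.mask_offset = 2;             /* length fits in the 7 bits */
    h.data_offset = h.mask_offset + 4; /* payload follows the key */
    return h;
}

/* XOR each payload byte with the key byte at position i mod 4. */
static void ws_unmask(uint8_t *data, size_t len, const uint8_t key[4])
{
    for (size_t i = 0; i < len; i++)
        data[i] ^= key[i % 4];
}
```

So & 127 simply clears the mask flag and leaves the length code, and the three substr() offsets in the PHP function correspond to the three possible header sizes.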
I'm trying to write a function that encodes text in the correct format for sending over a WebSocket. I have this function in PHP, but I can't work out how to translate it to C.
private function encode($text) {
// 0x1 text frame (FIN + opcode)
$b1 = 0x80 | (0x1 & 0x0f);
$length = strlen($text);
if($length <= 125)
$header = pack('CC', $b1, $length);
elseif($length > 125 && $length < 65536)
$header = pack('CCS', $b1, 126, $length);
elseif($length >= 65536)
$header = pack('CCN', $b1, 127, $length);
return $header.$text;
}
There's less need for bureaucracy, so it's much simpler to implement. Normally you expect the caller to provide the output buffer and the text size, so let's do that.
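Also, be extra careful with the endianness: your second use of pack in that PHP code should be with 'CCn' instead of 'CCS'. The third use has a similar problem: when the length byte is 127, the frame format expects a 64-bit big-endian length, but 'CCN' only writes 32 bits.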
Example implementation:
static char *hdr_fill(char *buf, unsigned int hlen, size_t plen)
{
*buf++ = (char)0x81; /* b1: FIN + text opcode */
switch (hlen) {
case 10:
*buf++ = 127; /* a 64-bit length follows */
break;
case 4:
*buf++ = 126; /* a 16-bit length follows */
}
/* Store length in big endian order */
switch (hlen) {
case 10:
/* upper 6 bytes of the 64-bit length */
for (int shift = 56; shift >= 16; shift -= 8)
*buf++ = (char)((unsigned long long)plen >> shift);
/* fall through */
case 4:
*buf++ = (char)(plen >> 8);
/* fall through */
case 2:
*buf++ = (char)plen;
}
return buf;
}
size_t encode(char *buf, size_t bufsize, const char *text, size_t len)
{
/* 2 bytes up to 125, 4 bytes up to 65535, otherwise 10 bytes */
const unsigned int hdrlen = len > 65535 ? 10 : (len > 125 ? 4 : 2);
if (bufsize < hdrlen + len)
return 0;
buf = hdr_fill(buf, hdrlen, len);
memcpy(buf, text, len);
return hdrlen + len;
}
For convenience it returns the result's size. You will need at least <stddef.h> (unless you replace size_t) and <string.h>.
However, you may want to use a scatter-gather approach instead, as it avoids the copying. In that case you just need to build the header and adjust its vector. Also, IMO it's more elegant:
int setup_header(struct iovec *v, int n)
{
/* We expect a valid buffer in v[0], and the payload in v[1+] */
if (n < 2 || v[0].iov_len < 10)
return -1;
size_t len = v[1].iov_len;
for (int i = 2; i < n; i++)
len += v[i].iov_len;
const unsigned int hdrlen = len > 65535 ? 10 : (len > 125 ? 4 : 2);
v[0].iov_len = hdrlen;
hdr_fill(v[0].iov_base, hdrlen, len);
return 0;
}
In this case the return value is zero if successful. For this one you need <sys/uio.h> or equivalent.
I have set up a PHP WebSocket server that is able to read string data from clients. My question is on how to handle binary data types. Below is the code on the client side that records microphone input as a Float32Array object and sends the data over a WebSocket connection in binary.
websocket = new WebSocket("ws://...");
websocket.binaryType = "arraybuffer";
recorder.onaudioprocess = function(stream) {
var inputData = stream.inputBuffer.getChannelData(0);
websocket.send(inputData);
}
As for the server side, I am using the following functions which I've found online for encoding/unmasking.
function mask($text)
{
$b1 = 0x80 | (0x1 & 0x0f);
$length = strlen($text);
if( $length <= 125)
$header = pack('CC', $b1, $length);
elseif ($length > 125 && $length < 65536)
$header = pack('CCn', $b1, 126, $length); // 16-bit length, big-endian
elseif ($length >= 65536)
$header = pack('CCJ', $b1, 127, $length); // 64-bit length, big-endian (PHP 5.6.3+)
return $header.$text;
}
function unmask($payload)
{
$length = ord($payload[1]) & 127;
if($length == 126) {
$masks = substr($payload, 4, 4);
$data = substr($payload, 8);
$len = (ord($payload[2]) << 8) + ord($payload[3]);
}
elseif($length == 127) {
$masks = substr($payload, 10, 4);
$data = substr($payload, 14);
$len = (ord($payload[2]) << 56) + (ord($payload[3]) << 48) +
(ord($payload[4]) << 40) + (ord($payload[5]) << 32) +
(ord($payload[6]) << 24) +(ord($payload[7]) << 16) +
(ord($payload[8]) << 8) + ord($payload[9]);
}
else {
$masks = substr($payload, 2, 4);
$data = substr($payload, 6);
$len = $length;
}
$text = '';
for ($i = 0; $i < $len; ++$i) {
$text .= $data[$i] ^ $masks[$i%4];
}
return $text;
}
The code works well, but only for string data, not for binary data. How do I handle binary data and push it to clients?
I don't really get what you are trying to do. WebRTC can be used to create peer-to-peer connections, in which case you would use your PHP server as a signaling server. The actual data is not passed through your server unless it has to be relayed because a peer-to-peer connection could not be established. Is that what you want to do?
Otherwise, create an RTCPeerConnection and send the data from peer to peer, as WebRTC intends.
I'm writing an app based on WebSockets (RFC 6455). Unfortunately, it looks like the client (tested on Chrome 18) doesn't receive data, even though the server says it is sending it.
Chrome doesn't report anything.
Here are the main server methods:
private function decode($payload) {
$length = ord($payload[1]) & 127;
if ($length == 126) {
$masks = substr($payload, 4, 4);
$data = substr($payload, 8);
} elseif ($length == 127) {
$masks = substr($payload, 10, 4);
$data = substr($payload, 14);
} else {
$masks = substr($payload, 2, 4);
$data = substr($payload, 6);
}
$text = '';
for ($i = 0; $i < strlen($data); ++$i) {
$text .= $data[$i] ^ $masks[$i % 4];
}
$text = base64_decode($text);
return $text;
}
private function encode($text) {
$text = base64_encode($text);
// 0x1 text frame (FIN + opcode)
$b1 = 0x80 | (0x1 & 0x0f);
$length = strlen($text);
if ($length <= 125)
$header = pack('CC', $b1, $length);
elseif ($length > 125 && $length < 65536)
$header = pack('CCS', $b1, 126, $length);
else
$header = pack('CCN', $b1, 127, $length);
return $header . $text;
}
protected function process($user, $msg) {
echo '<< '.$msg.N;
if (empty($msg)) {
$this->send($user->socket, $msg);
return;
}
}
protected function send($client, $msg) {
$msg = $this->encode($msg);
echo '>> '.$msg.N;
socket_write($client, $msg, strlen($msg));
}
If you're sending a test message >125 bytes but <65536, your problem might be caused by a faulty format string to pack. I think this one should be 'CCn' (your current code writes the 2 bytes of the length in the wrong order).
If that doesn't help, you could try some client-side logging:
Does the onopen callback run to prove that the initial handshake completed successfully?
Do the onerror or onclose callbacks run, either after connection or after your server sends its message?
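To illustrate the byte-order point, here is a small C sketch. The 16-bit extended length has to go on the wire most significant byte first, which is what pack's 'n' produces, whereas 'S' uses the machine's native order (little-endian on x86). The function name put_be16 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: write a 16-bit length in network (big-endian) order. */
static void put_be16(uint8_t *out, uint16_t value)
{
    out[0] = (uint8_t)(value >> 8);   /* most significant byte first */
    out[1] = (uint8_t)(value & 0xff); /* then the least significant byte */
}
```

For a 300-byte payload this writes 0x01 0x2C. Native little-endian packing would write 0x2C 0x01 instead, which the browser would read as a length of 11265 and then wait for data that never arrives.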
I'm trying to decode a long dash from its numeric entity back to a string, but I can't find a function that does this properly.
The best I have found is mb_decode_numericentity(); however, for some reason it fails to decode the long dash and some other special characters.
$str = '&#x2013;';
$str = mb_decode_numericentity($str, array(0xFF, 0x2FFFF, 0, 0xFFFF), 'ISO-8859-1');
This will return "?".
Does anyone know how to solve this problem?
The following code snippet (mostly stolen from here and improved) will work for named, decimal numeric, and hexadecimal numeric entities:
header("content-type: text/html; charset=utf-8");
/**
* Decodes all HTML entities, including numeric and hexadecimal ones.
*
* @param mixed $string
* @return string decoded HTML
*/
function html_entity_decode_numeric($string, $quote_style = ENT_COMPAT, $charset = "utf-8")
{
$string = html_entity_decode($string, $quote_style, $charset);
$string = preg_replace_callback('~&#x([0-9a-fA-F]+);~i', "chr_utf8_callback", $string);
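// The /e modifier was removed in PHP 7, so use a callback here as well
$string = preg_replace_callback('~&#([0-9]+);~', function ($m) { return chr_utf8((int) $m[1]); }, $string);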
return $string;
}
/**
* Callback helper
*/
function chr_utf8_callback($matches)
{
return chr_utf8(hexdec($matches[1]));
}
/**
* Multi-byte chr(): Will turn a numeric argument into a UTF-8 string.
*
* @param mixed $num
* @return string
*/
function chr_utf8($num)
{
if ($num < 128) return chr($num);
if ($num < 2048) return chr(($num >> 6) + 192) . chr(($num & 63) + 128);
if ($num < 65536) return chr(($num >> 12) + 224) . chr((($num >> 6) & 63) + 128) . chr(($num & 63) + 128);
if ($num < 2097152) return chr(($num >> 18) + 240) . chr((($num >> 12) & 63) + 128) . chr((($num >> 6) & 63) + 128) . chr(($num & 63) + 128);
return '';
}
$string = "&#x201D;";
echo html_entity_decode_numeric($string);
Improvement suggestions are welcome.
mb_decode_numericentity does not handle hexadecimal, only decimal. Do you get the expected result with:
$str = '&#8211;';
$str = mb_decode_numericentity ( $str , Array(255, 3145727, 0, 65535) , 'ISO-8859-1');
You can use hexdec to convert your hexadecimal to decimal.
Also, out of curiosity, does the following work:
$str = '&#x2013;';
$str = html_entity_decode($str);