Rare strange readings with fsockopen - php

I'm using fsockopen in a small cronjob to read and parse feeds on different servers. For the most part, this works very well. Yet on some servers, I get very weird lines in the response, like this:
<language>en</language>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
11
<item>
<title>
1f
July 8th, 2010</title>
<link>
32
http://darkencomic.com/?p=2406</link>
<comments>
3e
But when I open the feed in e.g. notepad++, it works just fine, showing:
<language>en</language>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<item>
<title>July 8th, 2010</title>
<link>http://darkencomic.com/?p=2406</link>
<comments>
...just to show an excerpt. So, am I doing anything wrong here or is this beyond my control? I'm grateful for any idea to fix this.
Here's part of the code I'm using to retrieve the feeds:
$fp = @fsockopen($url["host"], 80, $errno, $errstr, 5);
if (!$fp) {
    throw new UrlException("($errno) $errstr ~~~ on opening ".$url["host"]);
} else {
    $out = "GET ".$path." HTTP/1.1\r\n"
        ."Host: ".$url["host"]."\r\n"
        ."Connection: Close\r\n\r\n";
    fwrite($fp, $out);
    $contents = '';
    while (!feof($fp)) {
        $contents .= stream_get_contents($fp, 128);
    }
    fclose($fp);
}

This looks like HTTP chunked transfer encoding, which is HTTP's way of splitting a response into several smaller parts. Quoting:
Each non-empty chunk starts with the number of octets of the data it embeds (size written in hexadecimal) followed by a CRLF (carriage return and line feed), and the data itself. The chunk is then closed with a CRLF. In some implementations, white space characters (0x20) are padded between chunk-size and the CRLF.
When working with fsockopen and the like, you have to deal with the HTTP protocol yourself... which is not always as easy as one might think ;-)
A solution to avoid having to deal with such stuff would be to use something like curl: it already knows the HTTP protocol, which means you won't have to reinvent the wheel ;-)
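If curl is not an option, the chunk format quoted above can be undone by hand. Here is a minimal sketch, assuming the headers have already been stripped and the whole body is in memory; `decode_chunked` is a made-up name, and trailers or malformed input are only handled in the crudest way:

```php
<?php
/**
 * Decode an HTTP/1.1 chunked body into the plain payload.
 * Hypothetical helper for illustration only.
 */
function decode_chunked(string $body): string
{
    $decoded = '';
    $offset = 0;
    $length = strlen($body);

    while ($offset < $length) {
        // The chunk-size line ends with CRLF.
        $lineEnd = strpos($body, "\r\n", $offset);
        if ($lineEnd === false) {
            break; // truncated input: stop with what we have
        }
        // Drop optional chunk extensions (";name=value") and padding spaces.
        $sizeLine = substr($body, $offset, $lineEnd - $offset);
        $sizeHex = trim(explode(';', $sizeLine, 2)[0]);
        if ($sizeHex === '' || !ctype_xdigit($sizeHex)) {
            break; // not a chunk-size line
        }
        $size = (int) hexdec($sizeHex);
        if ($size === 0) {
            break; // the zero-size chunk marks the end
        }
        $decoded .= substr($body, $lineEnd + 2, $size);
        // Skip the chunk data and the CRLF that closes it.
        $offset = $lineEnd + 2 + $size + 2;
    }
    return $decoded;
}

echo decode_chunked("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"), "\n"; // prints "Wikipedia"
```

The stray `11`, `1f`, `32` lines in the feed are exactly those hexadecimal size lines, which is why they disappear once the body is decoded.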

I don't see anything strange that could cause that kind of behaviour. Is there any way you can use cURL to do this for you? It might solve the problem altogether :)

Related

Weird character needed to end a TCP/IP stream, can't figure out what it is

I'm connecting to a DMX controller using fsockopen (or even telnet via PuTTY).
It sends the following character at the end of a command, and I cannot for the life of me figure out what it is...
Annoyingly, their documentation doesn't say, and I need to use it as the "end of stream" marker because the returned byte count varies.
Putty shows : ▒
Web browser shows as : ?
Hex editor shows as (FF) ÿ
What on earth could it be?
My code:
<?php
$fp = fsockopen("localhost", 3333);
fputs($fp,"FSBC017000");
$content = stream_get_line($fp, 10000, "**<Whatever that char is?>**");
echo $content;
?>
Thanks to anyone in advance. Brain is broken.
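Going by the hex editor output, the byte is 0xFF, which PHP can express as "\xFF" inside a double-quoted string. A small sketch of using it as the delimiter, assuming that reading is right; the in-memory stream and the reply text are made up so the snippet runs without a controller:

```php
<?php
// Stand-in for the controller connection: an in-memory stream holding a
// made-up reply that ends with the byte 0xFF (assumption for illustration).
$fp = fopen('php://memory', 'w+b');
fwrite($fp, "OK 017000\xFF");
rewind($fp);

// "\xFF" is the PHP escape for the single byte FF seen in the hex editor.
// stream_get_line() returns everything before the delimiter, without it.
$content = stream_get_line($fp, 10000, "\xFF");
fclose($fp);

echo $content, "\n";
```

With the real socket, only the `stream_get_line()` call would be needed in place of the placeholder delimiter in the question.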

Get all contents of stream socket with fgets() in blocking mode

In order to complete the handshake for WebSockets over SSL, the socket must be read in blocking mode. Using stream sockets, the PHP backend communicates with the (JavaScript) client using fwrite() and fgets(). In blocking mode, fgets() waits until the next line comes in and grabs one line. Once the socket connection is made, the client sends PHP some headers so that the handshake can be completed. The problem is, I can't think of a way to find where the headers end, since their order depends on the browser being used.
I used this workaround for Chrome (since the Sec-WebSocket-Extensions line is the last header sent):
stream_set_blocking($lsSocketNew, true);
$lcHeader = "";
while ($lcLine = fgets($lsSocketNew)) {
    $lcHeader .= $lcLine;
    if (strstr($lcLine, "Sec-WebSocket-Extensions")) {
        break;
    }
}
but this doesn't work in other browsers like Firefox, where this header is the first one sent. :P
(I think fread() is supposed to do what I am looking for -- in blocking mode it is supposed to get "everything" on the socket when it comes in... but when I tried fread instead, it was returning a blank string. :P stream_get_contents() was the same )
Although I can't give you PHP-specific advice, there are a couple of things that you may want to consider:
I. What kind of "everything" are you looking for? There are no message boundaries in TCP, so "everything in the stream" is equivalent to "an arbitrary amount of data". Unfortunately, you aren't going to magically read all HTTP headers and stop there.
II. Given point I, you have to find something that separates the HTTP headers from the HTTP body. This is actually rather simple, because the headers end with a blank line. So, just read the data until you receive CRLF CRLF*. PHP does not translate line endings on a socket, so with fgets() each header line arrives ending in \r\n and the blank line is a line containing only \r\n.
III. If you're implementing WebSockets, using fgets is questionable, because the rest of the protocol (after the HTTP handshake) is binary. You may want to use PHP's dedicated sockets extension and socket_recv instead of fread. I can't say how these two functions differ, but the socket_* functions are just a wrapper around BSD sockets, which are implemented in a wide variety of languages. Since they're mostly language agnostic, you will find more support and tutorials on the internet.
* Per the HTTP standard:
CR = <US-ASCII CR, carriage return (13)>
LF = <US-ASCII LF, linefeed (10)>
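Point II can be sketched in PHP roughly like this. It is only an illustration: `read_http_headers` is a made-up name, and an in-memory stream stands in for the client socket so the snippet runs on its own:

```php
<?php
/**
 * Read HTTP header lines from a stream until the blank line that ends them.
 * Hypothetical helper; returns the raw header block without the blank line.
 */
function read_http_headers($stream): string
{
    $headers = '';
    while (($line = fgets($stream)) !== false) {
        // rtrim() leaves '' for the terminating line, whether it is "\r\n" or "\n".
        if (rtrim($line, "\r\n") === '') {
            break;
        }
        $headers .= $line;
    }
    return $headers;
}

// Demo on an in-memory stream standing in for the client socket (assumption).
$socket = fopen('php://memory', 'w+b');
fwrite($socket, "GET /chat HTTP/1.1\r\nHost: example.com\r\nUpgrade: websocket\r\n\r\nBODY");
rewind($socket);

$header = read_http_headers($socket);
$rest = stream_get_contents($socket); // whatever follows the headers is still unread
fclose($socket);
```

Because the loop stops on the blank line rather than on a particular header name, the order in which a browser sends its headers no longer matters.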

PHP fread hangs when using SSL

I'm using fsockopen to connect to an OpenVAS manager and send XML. The code I am using is:
$connection = fsockopen('ssl://'.$server_data['host'], $server_data['port']);
stream_set_timeout($connection, 5);
fwrite($connection, $xml);
while ($chunk = fread($connection, 2048)) {
$response .= $chunk;
}
However after reading the first two chunks of data, PHP hangs on fread and doesn't time out after 5 seconds. I have tried using stream_get_contents, which gives the same result, BUT if I only use one fread, it works ok, just that I want to read everything, regardless of length.
I am guessing, it is an issue with OpenVAS, which doesn't end the stream the way PHP expects it to, but that's a shot in the dark. How do I read the stream?
I believe that fread is hanging because on that last chunk it is expecting 2048 bytes and is probably getting less than that, so it waits until it times out.
You could try to refactor your code like this:
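$response = '';
$bytes_to_read = 2048;
// Stop once nothing is buffered: fread() rejects a length of 0
while ($bytes_to_read > 0 && ($chunk = fread($connection, $bytes_to_read))) {
    $response .= $chunk;
    // socket_get_status() is an alias of stream_get_meta_data()
    $status = stream_get_meta_data($connection);
    // Only ask for what is already buffered, so fread() does not wait for more
    $bytes_to_read = $status["unread_bytes"];
}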
That way, you'll read everything in two chunks.... I haven't tested this code, but I remember having a similar issue a while ago and fixing it with something like this.
Hope it helps!
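Another angle, sketched under assumptions: after each fread() you can ask stream_get_meta_data() whether the read timed out and stop there. This only helps if the timeout set with stream_set_timeout() is actually honoured on the connection, which the question suggests may not be the case for this SSL stream. The function name is made up, and an in-memory stream stands in for the real connection:

```php
<?php
/**
 * Read from a blocking stream until EOF or until a read times out.
 * Hypothetical helper; $chunkSize is an arbitrary choice.
 */
function read_until_eof_or_timeout($stream, int $chunkSize = 2048): array
{
    $response = '';
    $timedOut = false;
    while (!feof($stream)) {
        $chunk = fread($stream, $chunkSize);
        if ($chunk !== false) {
            $response .= $chunk;
        }
        // 'timed_out' becomes true when the stream_set_timeout() limit expires.
        $meta = stream_get_meta_data($stream);
        if ($meta['timed_out']) {
            $timedOut = true;
            break;
        }
        if ($chunk === false || $chunk === '') {
            break; // read error, or nothing more to read
        }
    }
    return [$response, $timedOut];
}

// Demo on an in-memory stream standing in for the SSL connection (assumption).
$connection = fopen('php://memory', 'w+b');
fwrite($connection, str_repeat('x', 5000));
rewind($connection);
[$response, $timedOut] = read_until_eof_or_timeout($connection);
fclose($connection);
```

If the protocol has its own end-of-message marker, stopping on that marker is more reliable than waiting for a timeout.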

Getting started on FIX protocol with PHP sockets

I have pretty basic knowledge of PHP sockets and the FIX protocol altogether. I have an account that allows me to connect to a server and retrieve currency prices.
I adapted this code to connect and figure out what I receive back from the remote server:
$host = "the-server.com";
$port = "2xxxx";
$fixv = "8=FIX.4.2";
$clid = "client-name";
$tid = "target-name";
$fp = fsockopen($host, $port, $errno, $errstr, 30);
if (!$fp) {
echo "$errstr ($errno)<br />\n";
} else {
$out = "$fixv|9=70|35=A|49=$clid|56=$tid|34=1|52=20000426-12:05:06|98=0|108=30|10=185|";
echo "\n".$out."\n";
fwrite($fp, $out);
while (!feof($fp)) {
echo ".";
echo fgets($fp, 1024);
}
fclose($fp);
}
and I get nothing back. The host is good because I'm getting an error when I use a random one.
Is the message I'm sending not generating a reply ?
I might not be very good at finding things in Google but I could not find any simple tutorial on how to do this with php (at least nothing that puts together fix and php).
Any help is greatly appreciated.
The FIX separator character is actually SOH ('\001'), not '|', so you have to replace that when sending.
Some links for you:
FIX protocol - formal specs
Onixs FIX dictionary - very useful site for tag lookup
Edit 0:
From that same Wikipedia article you mention:
The message fields are delimited using the ASCII 01 character.
...
Example of a FIX message : Execution Report (Pipe character is used to represent SOH character) ...
Edit 1:
Couple more points:
Tag 9 (BodyLength) holds the message length without counting tags 8 (BeginString), 9 (BodyLength), and 10 (CheckSum).
Tag 10 (CheckSum) has to be the modulo 256 sum of the ASCII values of all message characters, including all SOH separators but not including tag 10 itself (I know, it's stupid to have checksums on top of TCP, but ...)
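To make the two rules above concrete, here is a rough sketch that assembles a message and fills in tags 9 and 10. `build_fix_message` is a made-up helper, the field values are the placeholders from the question, and whether the server accepts this particular logon is not something the sketch can promise:

```php
<?php
const SOH = "\x01"; // FIX field delimiter (ASCII 01)

/**
 * Build a FIX message from body fields given as tag => value pairs.
 * Hypothetical helper: computes tag 9 (BodyLength) and tag 10 (CheckSum).
 */
function build_fix_message(string $beginString, array $bodyFields): string
{
    $body = '';
    foreach ($bodyFields as $tag => $value) {
        $body .= $tag . '=' . $value . SOH;
    }
    // Tag 9 counts the bytes after the tag 9 field, up to and including
    // the SOH that precedes tag 10.
    $head = '8=' . $beginString . SOH . '9=' . strlen($body) . SOH . $body;
    // Tag 10 is the byte sum of everything before it, modulo 256,
    // written as three digits.
    $checksum = array_sum(unpack('C*', $head)) % 256;
    return $head . '10=' . sprintf('%03d', $checksum) . SOH;
}

// Logon message using the placeholder values from the question.
$message = build_fix_message('FIX.4.2', [
    35 => 'A',
    49 => 'client-name',
    56 => 'target-name',
    34 => 1,
    52 => '20000426-12:05:06',
    98 => 0,
    108 => 30,
]);

echo str_replace(SOH, '|', $message), "\n"; // pipes only for display
```

Computing both values instead of hard-coding them matters here because the 9=70 and 10=185 in the question belong to a different message, so they no longer match once the client and target names change.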
The issue is the use of fgets(...): it expects a \n, which does not exist in the FIX protocol.
On top of that, a length of 1024 is specified, which the response is unlikely to reach, so fgets(...) does not return on the length limit either.
To cap it off, since the server doesn't terminate the connection, fgets(...) hangs there "forever".
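One way around this, as a sketch only: read field by field with stream_get_line(), using SOH as the delimiter, and stop once tag 10 has arrived. `read_fix_message` is a made-up name, the sample message is invented, and its checksum is a placeholder that is not validated here:

```php
<?php
/**
 * Read one FIX message from a stream, field by field, stopping after tag 10.
 * Hypothetical helper; returns a list of [tag, value] pairs, or null on EOF.
 */
function read_fix_message($stream): ?array
{
    $fields = [];
    // stream_get_line() splits on SOH ("\x01") instead of waiting for "\n".
    while (($field = stream_get_line($stream, 4096, "\x01")) !== false) {
        [$tag, $value] = array_pad(explode('=', $field, 2), 2, '');
        $fields[] = [$tag, $value];
        if ($tag === '10') {
            return $fields; // CheckSum is always the last field
        }
    }
    return null; // stream ended before a complete message arrived
}

// Demo on an in-memory stream standing in for the server socket (assumption).
// The checksum "000" is a placeholder; this sketch does not verify it.
$fp = fopen('php://memory', 'w+b');
fwrite($fp, "8=FIX.4.2\x019=5\x0135=0\x0110=000\x01");
rewind($fp);

$first = read_fix_message($fp);
$second = read_fix_message($fp); // nothing left, so null
fclose($fp);
```

Keeping the fields as a list of pairs rather than a tag-indexed array avoids losing repeated tags.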

PHP fsockopen() / fread() returns messed up data

I read some URL with fsockopen() and fread(), and I get this kind of data:
<li
10
></li>
<li
9f
>asd</li>
d
<li
92
Which is totally messed up O_O
--
While using the file_get_contents() function I get this kind of data:
<li></li>
<li>asd</li>
Which is correct! So, what the HELL is wrong? I tried on my Windows server and my Linux server, and both behave the same. And they don't even have the same PHP version.
--
My PHP code is:
$fp = @fsockopen($hostname, 80, $errno, $errstr, 30);
if (!$fp) {
    return false;
} else {
    $out = "GET /$path HTTP/1.1\r\n";
    $out .= "Host: $hostname\r\n";
    $out .= "Accept-language: en\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);
    $data = "";
    while (!feof($fp)) {
        $data .= fread($fp, 1024);
    }
    fclose($fp);
}
Any help/tips are appreciated, I've been wondering about this the whole day now :/
Oh, and I can't use fopen() or file_get_contents() because the server where my script runs doesn't have fopen wrappers enabled >_<
I really want to know how to fix this, just out of curiosity. And I don't think I can use any extra libraries on this server anyway.
About your "strange data" problem: this might be because the server you are requesting data from is transferring it in chunked mode.
You can take a look at the HTTP headers when calling the same URL in your browser; one of those headers might look like this:
Transfer-Encoding: chunked
Quoting Wikipedia's article on the matter:
Each non-empty chunk starts with the number of octets of the data it embeds (size written in hexadecimal) followed by a CRLF (carriage return and line feed), and the data itself. The chunk is then closed with a CRLF. In some implementations, white space characters (0x20) are padded between chunk-size and the CRLF.
The last chunk is a single line, simply made of the chunk-size (0), some optional padding white spaces and the terminating CRLF. It is not followed by any data, but optional trailers can be sent using the same syntax as the message headers.
The message is finally closed by a final CRLF combination.
This looks close to what you are getting... So I'm guessing this is the problem.
As far as I remember, curl knows how to deal with that, so the easy way would be to use curl instead of fsockopen and the like.
And using curl is often a better idea than using sockets: it will deal with many problems you might encounter, like this one ;-)
Another idea, if you don't have curl enabled on your server, would be to use an existing library based on fsockopen, hoping it already takes care of that kind of thing for you.
For instance, I've worked with Snoopy a couple of times; maybe it already knows how to deal with that?
(Not sure: you'll have to test it yourself, or take a look at the documentation to find out.)
Still, if you want to deal with the mysteries of the HTTP protocol by yourself... Well, I wish you luck!
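If neither curl nor a library is available, one workaround that is not mentioned above (so treat it as an assumption to verify against your servers) is to send the request as HTTP/1.0. Chunked encoding is an HTTP/1.1 feature, and a compliant server should not use it when answering an HTTP/1.0 request. A sketch with a made-up helper name:

```php
<?php
/**
 * Build a GET request that declares HTTP/1.0, so a compliant server should
 * not reply with chunked transfer encoding. Hypothetical helper.
 */
function build_http10_request(string $hostname, string $path): string
{
    return "GET /" . ltrim($path, '/') . " HTTP/1.0\r\n"
        . "Host: $hostname\r\n"
        . "Connection: Close\r\n\r\n";
}

// The result would be passed to fwrite($fp, ...) in place of $out.
$out = build_http10_request('www.example.com', 'feed.xml');
```

The trade-off is that the response then relies on Content-Length or on the connection closing to mark its end, which the existing feof() loop already handles.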
You probably want to use cURL.
<?php
// create a new cURL resource
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// fetch the URL; with CURLOPT_RETURNTRANSFER set, the response is returned as a string
$output = curl_exec($ch);
// close cURL resource, and free up system resources
curl_close($ch);
?>
With fsockopen(), you get the raw TCP data, not the HTTP contents. I assume you also see the HTTP headers, right? If it's in chunked encoding, you will get all the chunk headers.
This is a known issue. Someone posted a solution here on how to remove chunk headers.
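To check whether a raw response really is chunked before touching the body, you can split off the header block and look for the Transfer-Encoding header. A minimal sketch, assuming the complete response is already in a string; `split_http_response` is a made-up name:

```php
<?php
/**
 * Split a raw HTTP response into header block and body, and report whether
 * the headers announce chunked transfer encoding. Hypothetical helper.
 */
function split_http_response(string $raw): array
{
    // The headers end at the first blank line (CRLF CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);
    $headers = $parts[0];
    $body = $parts[1] ?? '';
    // Header names are case-insensitive, hence the "i" modifier.
    $chunked = (bool) preg_match('/^Transfer-Encoding:.*\bchunked\b/mi', $headers);
    return ['headers' => $headers, 'body' => $body, 'chunked' => $chunked];
}

$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nTransfer-Encoding: chunked\r\n\r\n"
     . "3\r\nabc\r\n0\r\n\r\n";
$response = split_http_response($raw);
```

Only when `$response['chunked']` is true do the chunk size lines need to be removed from `$response['body']`; otherwise the body can be used as it is.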
