I have a PHP websocket server, written using the socket module (socket_bind(), socket_accept(), socket_recv(), etc.).
It was written and tested and is working well on Linux. However, the implementation needs to be cross-platform.
I have done a quick test running it on Windows, and there were no immediate errors, but I would like a bit more certainty that my implementation is sound rather than just assuming it is OK because I haven't yet spotted any errors (which would be the case even after more in-depth testing, as sockets have a lot of edge-cases).
To what extent do these PHP functions abstract the underlying architecture (Unix sockets vs. Winsock)? For example,
Are there differences in the effect/availability of various flags or options?
Do the error codes differ, and if so is there a way of mapping them across?
Are there architectural differences that mean similar functions may behave differently (e.g. different rules about blocking)?
I'm not looking for a specific analysis of my code, as there is far too much of it, but more for a general idea of what differences I need to be aware of in order to make this code portable.
Note that, for the purposes of this question, I am only really interested in TCP/IP sockets (though more general answers are also fine).
I'm starting to consider websockets as a solution to replace long polling in a new build PHP app I am commissioning.
I have a few questions which I wonder if people could help me out with.
Can a Nodejs server call PHP and if it did wouldn't it suffer the same shortcomings as just going through Apache in terms of the connections? We all know nodejs is non blocking and Apache etc isn't but if Nodejs is just making a call to a PHP server in its own procedure would that not bottleneck in a similar way?
Are PHP and websockets a good match?
Are there any good js libraries besides socketio which apparently only works with Nodejs?
Has anyone found a good tutorial which uses websockets and a PHP backend maybe using something like that Ratchet PHP library which might help me get on my way?
Thoughts would be muchly appreciated.
Please excuse my paraphrasing of your questions.
1: Can Node.js call PHP, and wouldn't that have the same shortcomings as Apache?
Calling a run-once PHP script will have the same general shortcomings as calling a web page, except that you are removing an extra layer of processing. Apache or any web server itself is such a thin layer that, while you'll save some time, the savings will be insignificant.
If PHP is more effective at gathering data for your clients than Node.js, for whatever reason, then it might be wise to include PHP in your application.
2: Are PHP and WebSockets a good match?
Traditional PHP scripts are normally intended to be run once per request. The vast majority of PHP developers are unfamiliar with event driven development, and PHP itself does not (yet) have support for asynchronous processing.
PHP is a fast, mature scripting language that is only getting faster, even with all of its many warts and shortcomings. (Some say that its weak typing is a shortcoming. Others say that it's a shortcoming that its typing isn't weak enough.)
That said, the minimum that any language needs in order to implement WebSockets is the ability to open up a basic TCP port and listen for requests. For PHP, it is implemented as a thin wrapper around the C sockets library, and there are additional extensions and frameworks available that can also change the feel of working in TCP sockets with PHP.
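To make the "basic TCP port" point concrete, here is a minimal sketch of the one WebSocket-specific step in the opening handshake: computing the Sec-WebSocket-Accept header as described in RFC 6455. It is written in Python purely for brevity (my choice, not something from the question), and the function names are made up for illustration. The same steps should carry over to PHP as a SHA-1 digest followed by base64 encoding.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept_key(client_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(client_key: str) -> bytes:
    """Build the raw HTTP 101 response a server writes back to the socket."""
    lines = [
        "HTTP/1.1 101 Switching Protocols",
        "Upgrade: websocket",
        "Connection: Upgrade",
        "Sec-WebSocket-Accept: " + websocket_accept_key(client_key),
        "",  # the two empty entries produce the blank line ending the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

Everything after this response is frame parsing on the same TCP connection, which is where the existing servers and frameworks earn their keep.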
PHP's garbage collector has also matured. Memory leaks come either from gross disregard for the memory space (I'm looking at you, Zend Framework) or from intentional sabotage of the garbage collection system by developers who think they're clever or want to prove how easy it is to defeat the GC. (Spoiler: It's easy in every language, if you know the details!)
It is quite possible and very easy to set up a daemon (long running background process) in PHP. It's even possible to make it well behaved enough to gracefully restart and hand its connections off to a new version of the same script, or even the same script on the same server running different versions of PHP, though this is treading out of scope just a tiny little bit.
As for whether it's a good match, that is completely up to the developer. Are you willing, able, and happy to work with PHP to write a WebSockets server, or to use one of the existing servers? Yes? Then you're a good match for PHP and WebSockets.
3: JS Libraries for WebSockets
I honestly haven't researched them.
4: Tutorials for using PHP and Websockets
I'm personally fond of this tutorial: http://www.phpbuilder.com/articles/application-architecture/optimization/creating-real-time-applications-with-php-and-websockets.html
Although I have it on good authority that the specifics of that tutorial will soon be obsolete for that specific WebSockets server. (There will still be an actively maintained legacy branch for that server, though.)
In case of link rot:
Using the PHP-Websockets server (available on GitHub, will be homed soon), extend the base WebSocketServer abstract class and implement the abstract methods process(), connected(), and closed().
There's much better information at the link above, though, so follow it as long as the link exists.
It would hit the same bottleneck if you go through Apache. This can be remedied by using a different web server, such as lighttpd or nginx. You won't even need Node at all.
PHP does not have decent shared memory, making the biggest advantages of WebSockets irrelevant. It should be decent enough if you don't want interaction between users, but even then I would have to frown upon the usage of PHP. PHP is great for a lot of things, but real-time communication is not one of them.
You might want to look at https://github.com/einaros/ws.
PHP is not a good back-end. Anything with an execution model that isn't run-and-forget in its own sandbox, such as Node, .NET, C/C++ and Java, is a good match. PHP is suited for short running executions, such as actual web sites and even web services -- but not real time connections.
We want to write a Linux service in PHP and compile it with HIPHOP. Since we started the project with PHP and we could do all the programming in-house instead of hiring a C++ programmer etc. we would love to stick to PHP. Speed in execution is not (so) relevant for us since the daemon is just doing some monitoring but we would like to close up the code to obfuscate it. The daemon will do some network communication and logging to a db. Is this a viable route to go? In another post someone described that hiphop needs special attention in programming since not all PHP features are implemented. Is this still the case? I would love to hear your overall opinion on our idea.
HIPHOP is quite a beast to handle. It is very limited, so it depends specifically on your application and where it will be deployed. Remember, at present it only runs on 64-bit architectures, so if you wanted to deploy on a 32-bit machine, you are immediately stuck in the mud.
You may have to build many different binaries for different Linux distros depending on the nature of your application. Since HipHop only works well on Fedora and CentOS, you are severely limiting your scope. Once you move off of the PHP interpreter, you lose a very large amount of interchangeability between operating systems. (Think about it: Windows, virtually all Linux, all major BSD distributions, ...)
Also keep in mind, I'm not sure to what extent you want to "obfuscate" your code. If you want to make network calls, etc and keep those hidden as well, a packet sniffer can see exactly how you are communicating with the outside world extremely easily.
Likewise, a debugger and a reasonably seasoned programmer will be able to reverse engineer your binary to a larger degree than you may be aware.
Alternatives such as Zend Encoder or IonCube Encoder would be the preferred way to go about things, but these are non-free options. There are other encoders out there as well that you may want to look into.
I'm not exactly sure what you're doing other than "monitoring", so I can't say for sure. But a secondary option would be simply to severely limit the amount of code that is being run on the client machines (assuming they are reporting to a server machine) and let the server machines, which are assumed in your total control, handle more processing if any way possible.
I invite you to simply explore the idea yourself by testing, since once again, it's extremely dependent on the nature of your application and where you intend to deploy it. (And for many people, something like "where to deploy" can change rapidly.) HipHop was created with a very narrow scope: run PHP code as fast as possible. It isn't designed to be highly flexible or highly interchangeable between OSes and CPU architectures. Please consider this before you write a large application reliant on it, and please make sure you fully understand every implication of using HipHop. Test, test, test.
Per this post here there are 3 ways:
(1) do the whole thing in C++, making your program a standalone web server (possibly proxying through apache to provide things like ssl, static media, authentication etc.)
(2) run C++ in a cgi-bin, through apache
(3) make a PHP wrapper that shells out to the C++ part (this is a nice option if the performance-critical part is small, as you can still use the comfort that PHP's garbage collection and string manipulation gives you)
I'm not sure which is best so I looked at what a high volume site does. Here is a post from Facebook in 2010
They use a static analysis tool, HipHop, to convert PHP to C++.
I don't need the static analysis tool as I only have about 1500 lines and can convert by hand...but I need a starting point.
Right now I run a Lamp stack and want to stay on it minus the (P)HP.
Here is a link that explains how Facebook works. Not sure how accurate it is.
Thanks
As the comments note, Facebook is almost certainly using a highly-customized solution that involves high administration costs in return for very high efficiency. It is unlikely that this is actually what you want.
Since what you want is simply to replace the "P" in your LAMP stack, that implies that you probably want to keep the "LAM" -- the Linux, Apache, and MySQL (if relevant) parts. That's a good idea; while there are advantages at Facebook's scale to running a custom web server, it is extremely unlikely that it will actually be useful for you, and continuing to run Apache is certainly much easier and simpler. (And probably more secure, since you don't have to think about the security and fix bugs all by yourself.)
And you're planning to translate all your PHP, not just part of it, so calling C++ from PHP doesn't make sense.
Thus, in your case, the best solution is most likely to be running the C++ application via cgi-bin with your existing Apache server.
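For a starting point, it may help to see how small the CGI contract is. Apache passes the request in environment variables such as REQUEST_METHOD and QUERY_STRING (plus the request body on stdin), and the program writes header lines, a blank line, and then the body to stdout. The sketch below is in Python only to keep it short. A C++ binary in cgi-bin would follow the same contract using getenv and standard output. The function name is hypothetical.

```python
import os
import sys


def build_cgi_response(environ) -> str:
    """Build a complete CGI response (headers, blank line, body) as text."""
    # These variable names come from the CGI specification (RFC 3875).
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    body = f"method={method}\nquery={query}\n"
    # The blank line after the headers is what separates them from the body.
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    # When run by the web server, the real environment carries the request.
    sys.stdout.write(build_cgi_response(os.environ))
```

The per-request process start is the main cost of this model, and it matters little for a small site.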
FastCGI is a much better option than CGI, and can act like CGI in certain circumstances. If you only want to work with Apache, you can also develop an Apache module, and there's an excellent book on the subject: The Apache Modules Book. This describes many elements of C development with Apache acting in many ways like a (sort of) application server.
With careful C/C++ coding, you can achieve remarkable performance with limited memory. Not for everything, but in some circumstances, very powerful.
I would like to be able to write database driven applications (i.e. standalone apps that are not web based and don't require a browser or Apache server to run)
I have attempted to do this in Codegear C++ Builder in the past but even though my 'background' is in that (C++ OOP with Borland Builder) it is so far removed from doing the same sort of thing with PHP/mysql and other web technologies that I found that I couldn't get very far for a lot of effort getting it to work. It was a while ago now but I was using the built-in database engine that comes with Builder and I just found it frustrating and difficult.
In other words - Is there something out there that will allow me to make use of web based languages to write standalone applications (specifically PHP/Javascript/mysql)?
You can stick with PHP if you want. There are Qt bindings, GTK bindings, OSX/Cocoa bindings, and you can call Win32 functions. I don't know how stable all of those are, but you can do GUI in PHP as well as the command line stuff.
As for other languages... PHP is very C-like. It started as basically a scripting wrapper around C (IIRC), which is why you have functions named after the C standard library (like strstr). C-like languages will feel quite familiar.
I would think Python would probably be the closest to PHP. It's a scripting language, syntax is somewhat close, it has a ton of libraries, and is very well supported and commonly used. I'd imagine it would feel pretty familiar. Using indenting instead of brackets for blocks can throw some people, but it fit the way I already indented my code.
Ruby is quite popular, and is also a scripting language. I think it's farther away, syntax wise, than Python, but I've never really used it so I can't promise that. I know it has at least GTK bindings.
Perl has a lot of resources and bindings, but isn't as easy to read as PHP since you have to learn the special variables like $_. It was never really my cup of tea.
You do have the C/C++ stuff, and Java has its large library. You may want to go that way since you say you've used C++ before. If you're on a Mac (or willing to use GNUstep) you could go with Objective-C/Cocoa. That's getting rather far from PHP syntax though.
All of these languages have database connections. You don't mention what platform you're on.
But for easy to work with, quick to pick up, works all sorts of places, and can definitely do GUIs... Python would be a good choice to look at.
You could always use PHP. It is decent for command-line programming. Other than that, all I can say is that your knowledge of database access and storage will be helpful, but at the end of the day you are going to need to learn a new language.
Most languages have libraries for database access. Just pick one that you like the feel of. It is also a good idea to choose one that is popular (for the community support) and free libraries are always nice. Also look for good documentation and one that is fairly standard.
A nice thing to know is that JavaScript and PHP syntax are very similar to many other languages. (JavaScript looks almost identical to C and C++.) Just read the main language tutorial, then the database API tutorial, and you should be good to go.
I want to make the most lightweight possible HTTP server in C that supports PHP and possibly FastCGI if it will make a huge difference.
I'm not sure how to implement PHP support. Does it just call PHP.exe with the path to a .php file and read the output? What about things like header() in PHP? How are those handled by the server?
And another question, is it ideal to use separate threads for each request? I don't expect a very heavy load, but I'm not 100% sure on the design aspect of this...
I'm still pretty new to C and C++ and this is a learning experience.
Firstly let me say that if the goal is a lightweight HTTP server that serves PHP pages, this has already been done. Have a look at nginx.
As for a learning experience, you've chosen something that's actually fairly tough.
Multithreading is hard at the best of times. In C/C++ (anything with manual memory allocation really) it's an order of magnitude harder.
Added to this is network communication. There are quirks to deal with, different versions of HTTP (mostly a non-issue now), all sorts of HTTP headers to deal with and so on.
The most intuitive solution to this problem is to have a process that listens to a port. When it receives a request, it spawns a process, which may exec to a PHP process if required.
This however does not scale. The first (obvious) optimization is to use threads instead of processes and some form of interthread communication. While this helps, it will still only scale so far.
Go beyond that and you're looking at async socket handling, which is fairly low level.
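To show what "async socket handling" looks like in outline, here is a minimal single-threaded echo loop built on readiness notification. It uses Python's selectors module rather than C, purely to keep the sketch short. In C the same shape would sit on top of select, poll, or epoll. The function name and the stop condition (return once every accepted client has disconnected) are my own, added so the loop terminates.

```python
import selectors
import socket


def echo_until_idle(listener: socket.socket) -> int:
    """Echo data back to clients from one thread; return the number served.

    Returns once at least one client has connected and all have disconnected.
    """
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    open_clients = 0
    served = 0
    while served == 0 or open_clients > 0:
        # Block until at least one registered socket is readable.
        for key, _events in sel.select():
            sock = key.fileobj
            if sock is listener:
                conn, _addr = listener.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
                open_clients += 1
                served += 1
            else:
                data = sock.recv(4096)
                if data:
                    # Simplification: sendall on a non-blocking socket can
                    # fail if the send buffer is full. A real server would
                    # queue the data and wait for EVENT_WRITE instead.
                    sock.sendall(data)
                else:
                    # An empty read means the peer closed the connection.
                    sel.unregister(sock)
                    sock.close()
                    open_clients -= 1
    sel.unregister(listener)
    sel.close()
    return served
```

No thread or process is created per connection. The cost is that all per-connection state (partial requests, pending writes) has to be tracked by hand, which is what makes this style low level.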
All of these however are fairly big projects.
Is there any particular reason you're doing this in C/C++? Or any particular reason you're learning one or both of those languages? These languages certainly have their place but they're increasingly becoming niche languages. Managed (garbage collected) languages/platforms have almost completely taken over. Joel argues that garbage collection is about the only huge productivity increase in programming in about the last 20 years and I tend to agree.
For a learning experience regarding HTTP code written in C you may also take a look at:
http://hping.org/wbox/
To make your own HTTP server, I recommend getting inspiration from other people's code. The programmer ry, famous for the Node.js framework, has written simple, elegant code regarding this matter.
Check out his libebb library. It has a parser generated with Ragel, an easy yet powerful state machine compiler (it's based on Zed Shaw's Mongrel parser). Also check the example usage. It is really clean and usable code.
libebb is a lightweight HTTP server library for C.
It lays the foundation for writing a web server
by providing the socket juggling and request parsing.
By implementing the HTTP/1.1 grammar provided
in RFC 2616, libebb understands most valid HTTP/1.1
connections (persistent, pipelined, and
chunked requests included) and rejects invalid or
malicious requests. libebb supports SSL over HTTP.
Regarding PHP-server coupling, the easiest way is CGI, but if you feel adventurous, dig into the PHP source code under the SAPI (Server API) modules to see how to do it.
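On the server side, CGI coupling mostly comes down to running the CGI binary with the request described in environment variables and then splitting what it prints. As far as I know, calls to header() in the script show up as ordinary header lines before the first blank line, and a non-default status arrives as a Status line, which is the CGI convention from RFC 3875. Below is a rough sketch of that splitting step, in Python rather than C to keep it short. The function name is hypothetical.

```python
def parse_cgi_output(raw: bytes):
    """Split CGI output into (status, headers, body).

    `status` defaults to "200 OK" when the script sent no Status header.
    """
    # The header block ends at the first blank line; accept CRLF or bare LF.
    found = [(raw.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    found = [(pos, sep) for pos, sep in found if pos != -1]
    if not found:
        raise ValueError("no blank line separating headers from body")
    pos, sep = min(found)
    head, body = raw[:pos], raw[pos + len(sep):]

    status = "200 OK"
    headers = []
    for line in head.splitlines():
        name, _, value = line.decode("latin-1").partition(":")
        name, value = name.strip(), value.strip()
        if name.lower() == "status":
            # Status is consumed by the server and becomes the HTTP status line.
            status = value
        else:
            headers.append((name, value))
    return status, headers, body
```

The server then writes its own "HTTP/1.1 <status>" line, forwards the remaining headers, and streams the body to the client.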
Similar to libebb, see http://www.gnu.org/software/libmicrohttpd/. It too uses GnuTLS for optional SSL.