I'd like to be able to run JavaScript and get the results with PHP, and am wondering if there is a library for PHP that allows me to parse it out. My first thought was to use node.js, but since node.js has access to sockets, files and so on, I think I'd prefer to avoid that.
Rationale: I'm doing screen scraping in PHP and have encountered many scenarios where the data is being produced by JavaScript on the frontend, and I would like to avoid writing specialized filtering functions to act on the JavaScript on a per-case basis since that takes a lot of time. The more general case would be to parse the JavaScript directly.
Downvoting: I don't really see what's so controversial about this question, modern web crawlers are known to do it, the only difference is that they tend to not be written in PHP. [1]
[1] http://blogs.forbes.com/velocity/2010/06/25/google-isnt-just-reading-your-links-its-now-running-your-code/
It's an interesting question and the down-voters are being unimaginative about potential use-cases. Page archiving tools, printing scripts, preview images - all valid reasons to want to manipulate a document with the JavaScript included within the page.
I'm not aware of any existing PHP implementations, but you could probably adapt Mozilla's SpiderMonkey as a PHP module, or as a standalone tool to manipulate a DOMDocument and return the result.
I haven't had experience with server-side JavaScript, but some issues that I believe might need to be dealt with:
Host objects like document and window are not part of the ECMAScript specification (these are objects provided by the implementing browser) so you need to make sure that the library provides equivalent host objects.
You might have security issues around executing client-side scripts within a server-side environment. This is a lot like allowing the user to submit a PHP script to be evaluated, so you need to make sure the security sandbox is tight.
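A rough illustration of the host-object point, assuming Node.js and its built-in `vm` module (my choice for the sketch, not something this answer prescribes). A bare script context has the ECMAScript built-ins but no `document`, so whoever embeds the engine has to supply one. The `fakeDocument` stub below is hypothetical and nowhere near a real DOM.

```javascript
const vm = require('node:vm');

// A fresh context contains only ECMAScript built-ins (Math, JSON, ...).
const bare = vm.createContext({});
const builtinType = vm.runInContext('typeof Math', bare);   // part of the spec
const hostType = vm.runInContext('typeof document', bare);  // host object, absent

// Hypothetical stand-in for a host object: just enough for document.write().
const written = [];
const fakeDocument = { write: (html) => { written.push(String(html)); } };
const withHost = vm.createContext({ document: fakeDocument });
vm.runInContext('document.write("<p>" + (1 + 2) + "</p>")', withHost);

console.log(builtinType, hostType, written.join(''));
```

This only shows where host objects come from. It does nothing for the sandboxing concern: Node's own documentation says the `vm` module is not a security mechanism.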
Another, perhaps safer and easier-to-implement, option might be to use a modified Firefox or WebKit instance that runs as a browser, loading up the target pages and returning the modified source to your application.
From PHP 5.3 you can use the V8Js extension. It's a native library that uses Google's V8 JavaScript engine to execute JS and return the result.
It's good because you can pass variables in as PHP arrays and they are interpreted very well.
NodeJS (or some other derivative of Google's V8) might actually be the best way to go here. If you're concerned about the various things NodeJS can do (e.g. sockets), you can probably "strip it down" by removing modules and/or addons -- I think even the built-in stuff is ultimately implemented in such a way that it could be stripped out fairly easily.
An alternate approach might be to simply replace, override, or remove the require function from node.js.
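As a sketch of what "no `require`" looks like in practice, assuming Node's built-in `vm` module: a script run in a freshly created context never receives `require` or `process` unless you pass them in. The helper name `runWithoutRequire` and the 100 ms timeout are made up for this example.

```javascript
const vm = require('node:vm');

// Hypothetical helper: run a script in a context that was never given
// `require`, `process`, etc. NOT a real security boundary (see note below).
function runWithoutRequire(source) {
  const context = vm.createContext({}); // empty global object
  try {
    // timeout (in ms) stops runaway loops such as `while (true) {}`
    const value = vm.runInContext(source, context, { timeout: 100 });
    return { ok: true, value };
  } catch (err) {
    return { ok: false, error: String(err.message) };
  }
}

const sum = runWithoutRequire('[1, 2, 3].reduce((a, b) => a + b, 0)');
const fsAttempt = runWithoutRequire('require("fs").readdirSync(".")');
console.log(sum, fsAttempt);
```

The second call fails because `require` is simply not defined inside the context. Treat this as a convenience rather than a sandbox: Node's documentation states that `vm` is not a security mechanism and should not be used to run untrusted code, so a hostile page script would still need process-level isolation.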
There's also envjs, which should make it easier to run JS that was designed to run in the browser.
I am curious. I am creating a Flickr plugin for WordPress, and I have noticed that the PHP I have written is noticeably slower than the equivalent JavaScript I have written.
I know that Javascript is run client side so it will be faster as long as there aren't numerous processes already hogging the processor. With PHP running remotely I know that it is all based on connection and what is going on with the server. I was wondering if one was better to use than the other and if DOM is maybe not the best way to go grab XML. In this case in PHP I am using DOM to go and get the XML and then parse it out. With Javascript I am using SOAP to parse the same XML.
Assumption
JavaScript is required for this plugin.
JavaScript testing was only done on your development machine.
I think you need to rethink your metrics. In your particular case JavaScript is faster than PHP, but I don't see that being the case across the board. I'm assuming you're on shared hosting, as are probably most end users of your plugin, so your PHP will not be on the fastest servers. Like Rory said above, it is best to diagnose why your PHP is slow. With JavaScript you have to take into account the average user's device speed, which could range anywhere from awful to amazing. My guess is your PC is somewhere near the higher end of the spectrum.
Without any code provided it's tough to give a definitive answer. I would recommend trying your JavaScript version of the plugin on as wide a range of devices and browsers as possible, including things like iPads and cellphones.
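For comparing devices, a small timing helper along these lines could be dropped into the page. This is only a sketch: `medianRuntimeMs` and `sampleTask` are invented names, the workload is a stand-in for the plugin's real XML handling, and it only measures synchronous code.

```javascript
// Hypothetical timing helper: runs `task` several times and returns the
// middle sample in milliseconds, which is less noisy than a single run.
function medianRuntimeMs(task, runs = 5) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    task();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// Stand-in workload; replace with the plugin's actual parsing code.
function sampleTask() {
  let total = 0;
  for (let i = 0; i < 100000; i++) total += i % 7;
  return total;
}

console.log(`median: ${medianRuntimeMs(sampleTask).toFixed(3)} ms`);
```

Running the same snippet on a desktop, a tablet and a phone gives numbers you can compare directly, instead of relying on how fast it feels on the development machine.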
Because of JavaScript's potential performance pitfalls on low-end devices, I would probably perform the task on the server unless investigation shows that, in your case, the JavaScript is performant across the board.
You can also run JavaScript on the server side with the V8Js class, available in PHP since version 5.3.3: http://ar.php.net/manual/en/book.v8js.php
There have been great things happening in the Haskell web development world, and some of the available frameworks (Yesod and Snap server) seem quite mature. However the learning curve can be a bit steep, and perhaps building web apps cannot quite be considered Haskell's forte.
The answer to another SO question of mine indicates that writing PHP extensions in Haskell should be possible. In fact, I'm currently in the process of trying to convert a small Haskell program to a PHP extension as a proof of concept.
So, the question is - is there a case for creating a Haskell web framework that is meant to be run as a PHP extension and leaves all the request/response / cookies etc. handling to PHP?
What would be the design decisions involved in creating such a framework? Right now, the only thing I can think of is that it would probably expose an XML/JSON API accessible by the PHP pages using GET and SET function calls.
I can't think of a use case where this makes any sense. If you want something else to handle the HTTP request/response, you'd be better off writing to the Apache API directly.
Introducing PHP gives you argument parsing and cookie handling but also introduces a lot of other silliness. Not only are many of the common practices very unsafe or insecure, but you are limited to content generation -- if you want to dispatch to other parts of code based on the URL you have to write all that yourself. Many mature PHP programs end up just having one "start" PHP script. You also will have problems if you want to do anything interesting with uploaded files, because PHP handles that in a suboptimal way.
You could theoretically do something very processor intensive in your Haskell extension, but you might as well just write a C extension for PHP in that case. And PHP invocations are never supposed to hang around for very long anyway.
Seems like you are limiting yourself to PHP's brain-damaged model of a web application for the very trivial benefit of argument and header parsing.
Writing a Haskell interface to the Apache API could potentially be liberating. You could rely on a battle-tested web server, and also hook into every phase of the Apache request cycle. Apache's way of preforking and killing children every now and then might be a way of dealing with Haskell space leaks, although it's a sledgehammer approach.
What frameworks/tools are there to help run javascript from PHP? Is there anything like the harmony project for PHP? I am hoping to perform JS unit tests (or even better, BDD) directly from PHP, inspired by this post (for Ruby). Am I asking for too much?
There is in fact the Spidermonkey PECL extension, which embeds the Mozilla JavaScript interpreter in PHP. It will, however, not provide the document.whatever object tree that browsers have, so I'm not sure which kind of JS unit tests you could possibly accomplish with it.
Maybe you can utilize env.js and co. like that Ruby project does. But I'm unaware whether a pre-made setup or framework for such purposes exists (most likely not). So much for the inconclusive answer.
If you just want to probe the user interface with jQuery-like features, then phpQuery might be an option.
If you need to run some Javascript code from PHP, on the server, the spidermonkey extension might be what you are looking for (quoting) :
This extension allow you to embed Mozilla's Javascript engine Spidermonkey in PHP.
I've used it -- for fun -- a couple of times, and it worked reasonably well; but note that I have never used it in a production environment (and know no one who does).
You should give Mozilla's Rhino a try if you want server-side execution of JavaScript. It is a sister project to SpiderMonkey, written in Java. It was designed to be used in cases where you want syntactically valid client code to run on the server (and, FYI, it provides the foundation for Google's Closure Compiler).
It's not an instant solution for javascript-in-php, but as demonstrated here http://ejohn.org/blog/bringing-the-browser-to-the-server/, it can be used for server side testing of client code.
When constructing a website, say a Q&A site or just a forum site for a community, is knowing HTML, CSS, PHP, MySQL, and JavaScript enough to make the site dynamic?
I am asking because when I talked with my teacher, he said that major sites use many languages combined. He also said that a site shouldn't be designed only in PHP.
So is it possible to create a good website, not e-commerce, with only HTML, CSS, and PHP?
Yes. There is no reason you have to use more than one language internally. Sticking to one makes it much easier to get everything working together in a server environment, where the extra load of IPC compared with plain function calls can slow things down considerably.
Ofc! :)
Lots of large/enterprise portals use only HTML, CSS, Javascript, PHP & MySQL.
But don't forget that there's always a right tool for the right job... A simple site (even an e-commerce) will run very well on PHP & MySQL.
Short answer: yes, it is possible.
Longer answer...
HTML, CSS, PHP and MySQL are already many languages, but I guess he means that most major sites have a heterogeneous back-end. This is probably not out of choice though -- more often it is historical. As people change and new technologies emerge, new pieces are built with different languages and frameworks.
I have built a forum and Q&A sites with HTML+CSS+PHP+MySQL and many other people have done the same, so this set of tools is without a doubt adequate for building something like this. In fact, I would argue that you could build almost anything on the web with that combination.
A more interesting question (that will generate a more heated response) is what framework you use on top of that: a CMS like WordPress or Drupal, or an MVC framework like Zend, CakePHP or CodeIgniter.
Or whether you should be dropping PHP entirely and using something like Django or Ruby on Rails. Knowing more than PHP will definitely help you to be ready for newer approaches.
The dynamism of a website comes from a server-side language that can create HTML output on the fly; that's it. You can add a DB, simple JS or AJAX, but those are merely optional.
Now, as for your teacher: languages like PHP, Python and ASP are, in the end, the same. It would be ridiculous to have the include files in ASP, the main files in PHP and the configuration files in Python; that makes absolutely no sense. Maybe, hopefully, he was talking about using JS in conjunction with PHP and SQL, which is a natural recommendation.
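To make "HTML output on the fly" concrete, here is a minimal sketch. It is written in JavaScript purely for illustration (the same idea applies in PHP), and `escapeHtml` and `renderQuestionList` are invented names: one function turns whatever data arrives at request time into markup.

```javascript
// Escape text before placing it into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical renderer: the same template yields different HTML per request.
function renderQuestionList(questions) {
  const items = questions.map((q) => `<li>${escapeHtml(q)}</li>`).join('');
  return `<ul>${items}</ul>`;
}

console.log(renderQuestionList(['Is PHP enough?', 'What is <b>?']));
// prints: <ul><li>Is PHP enough?</li><li>What is &lt;b&gt;?</li></ul>
```

In a real site the list would come from the database, which is where the optional DB mentioned above fits in.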
The fact that your teacher is obsessed with another technology doesn't mean that you can't work with PHP only.
As the others said, it is perfectly reasonable and possible.
(I'm personally obsessed with ASP.NET, but still, I won't say that it is the only way to go for everyone. And PHP is just great for beginners.)
Is he referring to the front-end or back-end?
The front-end of a website - the part that the user sees and interacts with - must be written in HTML.
(It may optionally include CSS and Javascript to enhance it.)
The back-end is what generates the front-end, and also determines the structure and control-flow of the application.
There's absolutely no need to use more than one language for the back-end, and it's often simpler to stick to one language.
However, for the front-end, you have no choice but to use HTML; otherwise it isn't a web application.
The front end must output HTML. These days it should use CSS for formatting. It will likely use JavaScript to provide client-side capability. Requiring JavaScript will likely create accessibility issues for some users.
PHP is one of several languages used to handle requests. It is an interpreted language, requiring that the source be placed on the server. This opens up a significant security risk if someone gets access to the server. Several hosting sites have had major problems with PHP-based sites over the last few weeks.
Java is run in a compiled form and code is not required to be on the server. This provides a layer of security as it is not simple to modify the code. Java runs in a container, and usually a framework. Developing using the Spring framework in a Tomcat container is an option. The learning curve is higher than with PHP. It also has strong support for accessing remote resources which allows it to integrate with legacy applications.
With any of the languages, there is a risk that developers will use available functionality when they shouldn't. The Java J2EE model is appropriate for some sites, but was often implemented because it was the fashion, and there are a lot of tutorials on using it.
I'm a student learning PHP. I basically make the stuff work, but I have never wondered about how php.exe (on Windows) works.
Does it compile the stuff it has to handle or not?
The reason I'm asking is that someone told me that ASP.NET has to compile all website-dependent data it has/receives, like everything that gets uploaded through a form.
He claimed that PHP is faster on that subject, since it does not have to compile anything.
I can't find any good information on either subject, so I'm asking here.
PHP is a scripting language, and so is not compiled as such. It is parsed into an internal representation which is in turn "run" by the PHP runtime.
Your friend is correct insofar as ASP.NET is compiled. However, it's only the actual program instructions that are compiled, not data. The ways PHP and ASP.NET treat incoming (and outgoing) data are pretty similar in principle. If anything, ASP.NET will be faster than PHP because it is compiled, since compiled code generally runs faster.
As #chaos has said, ASP.NET does not "compile" data received through a form.
Most likely what your associate is referring to is called ViewState in ASP.NET, and if that's what he's talking about, he's correct, although he mislabeled it as "compiling". ViewState does encode and store the state of the form, and the server does need to decode this data and apply it to the object model. It uses this information to raise events that programmers can hook server-side, providing a much richer model for programming.
And, yes, this is a performance hit. PHP can be faster than ASP.NET; I've worked as a PHP developer and as an ASP.NET developer and I can attest to this.
But performance is not everything--more time is spent in data transit than it is in processing on a web server for all but a very few niche cases. And there are other aspects of your system that matter more than raw pushing power. ASP.NET trades that raw performance for other things.
This is where ASP.NET shines and PHP fails horribly. PHP cannot offer nearly the capability of ASP.NET for things like modularity, maintainability, security, re-usability, and general base library capability. Yes, PHP can be faster than ASP.NET. But ASP.NET is still superior.
Of course, ASP.NET sucks, too, IMO, but that's more because of some design decisions that I frankly disagree with. But I'd much rather use it than PHP any day of the week.
Neither PHP nor ASP.NET normally compiles (interprets as program instructions and converts to executable code) data received through forms.
Possibly your associate may be confused about the difference between compilation and data sanitization, or something. I really don't know.
No. The type of data you are talking about is passed through to PHP as environment data from Apache. You can do it yourself with command line options to php.exe if you want.
Data is rarely compiled unless it is part of a resource for a program.
PHP itself is an interpreted language, which means that the code is not compiled ahead of time into a machine-friendly format; it is simply scanned and parsed by the interpreter in order to be executed.
Tor Haugen is right: PHP is an interpreted language, meaning the files remain as plain text on the server and are interpreted as they are requested. ASP.Net is a bit of a hybrid because the *.aspx, *.ashx, *.ascx, etc. files are all interpreted, while external libraries are compiled into DLL files that are then linked in like a normal desktop application. So if you have, for instance, several projects, one of which is an ASP.Net web application that relies on several class libraries, you would have several plain text files (web app files) and several DLL files that are generated and used by the server. You can use DLL files with PHP, but it isn't as seamless. Usually such "class libraries" would simply be "included" as additional plain text files.