I need a way to get the full HTML content of a page after its JavaScript has loaded and executed.
It must be built as a server-side application (Linux) and can include any third-party software (a browser without a GUI, or anything else that could help).
Can this be done using PHP or C++? If not, what other options do I have?
This is a strange subject - I'm having a hard time finding information on it.
Thank you for any help.
So I found something on the subject - Any way to run Firefox with GreaseMonkey scripts without a GUI/X session - but if anyone has something to add, I'm still open to suggestions.
It seems that Node.js could help.
You want PhantomJS: http://phantomjs.org/. It is scripted with JavaScript but lets you do anything you could do in a web browser (including viewing the current state of the DOM).
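To give an idea of what that looks like, here is a minimal sketch of a PhantomJS script - the target URL and the fixed two-second wait are placeholders you would adapt - that loads a page, lets its JavaScript run, and prints the resulting DOM:

```javascript
// dump.js - run with: phantomjs dump.js > rendered.html
// Minimal sketch: the URL and the fixed 2s wait are placeholders.
var page = require('webpage').create();

page.open('http://example.com/', function (status) {
    if (status !== 'success') {
        console.log('Failed to load the page');
        phantom.exit(1);
        return;
    }
    // Give the page's own JavaScript a moment to finish before serializing the DOM.
    window.setTimeout(function () {
        console.log(page.content);   // the rendered HTML, after JavaScript has run
        phantom.exit();
    }, 2000);
});
```

The output can then be post-processed by PHP or anything else running on the server.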
Related
Is it possible to hide the .php file on the server...?
I have a website which sometimes calls PHP files inside iframes. I wouldn't like it if somebody copied that code, so how would I hide it?
Or do I have to encrypt it?
Speed matters a lot in my case, so anything that doesn't affect performance is appreciated!
Thanks
With a correctly configured web server, the PHP code isn't visible to your website visitors. For the PHP code to be accessible by people who visit your website, the server would have to be configured to display it as text instead of processing it as PHP code.
So, in other words, if you visit your website and you see an HTML page and not PHP code, your server is working correctly and no one can get to the PHP code.
Which code? Your PHP source code? The only code a user sees is your HTML code; PHP is processed on the server side!
If your PHP files are parsed by the HTTP server, nobody can get them.
If you're still paranoid after the assurances provided here, you can make your code much more difficult for someone else to read by "obfuscating" it (Wikipedia link).
If you Google "php obfuscator", you'll find tons of PHP obfuscator products, many of them free.
Some examples:
PHP Obfuscator
Code Eclipse
Professional PHP Obfuscator/Encoder
Obfuscation does not affect performance, only readability for humans.
If someone accesses a PHP file on your site, all they will see is the output of the PHP script (e.g. any HTML or JavaScript) - they won't see the source of the PHP page itself (and will have no way to access it).
If you are concerned about them seeing the output (e.g. the HTML the PHP script generates), then from a practical point of view there isn't anything you can do about that (the most you can do is obfuscate it, but that is largely pointless).
"I have a website which sometimes calls PHP files inside iframes. I wouldn't like it if somebody copied that code, so how would I hide it? Or do I have to encrypt it?"
No, that makes no sense and would not work. You have to realize that the PHP code is executed on your server to serve an HTTP request, and that the iframe results in a separate HTTP request from the main page.
If you want to prevent others from including the iframe in their own page, you could check the Referer header and have the iframe page show an error if the referrer is not from your site, but that could cause problems for some legitimate users and can also be circumvented.
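For illustration, a rough sketch of that check at the top of the iframe's PHP file - "example.com" is a placeholder for your own domain, and remember the Referer header can be missing or forged:

```php
<?php
// Serve the iframe content only when the embedding page came from our own site.
// "example.com" is a placeholder; the Referer header can be absent or spoofed.
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$host    = parse_url($referer, PHP_URL_HOST);

if ($host !== 'example.com' && $host !== 'www.example.com') {
    http_response_code(403);
    exit('This content can only be embedded from example.com.');
}
// ...otherwise render the iframe's content as usual.
```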
Alternative solution: do not use iframes; instead, integrate the PHP code that currently produces the iframe's content directly into your main page. This will work for all users and cannot be circumvented.
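As a quick illustration (the file path is made up), swapping the iframe for an include keeps everything in a single request:

```php
<!-- Before: the browser makes a separate request for the iframe -->
<!-- <iframe src="/widgets/latest-news.php"></iframe> -->

<!-- After: pull the same PHP straight into the main page -->
<?php include __DIR__ . '/widgets/latest-news.php'; ?>
```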
Of course, you still can't prevent others from requesting your page, extracting the content from the HTML and displaying it on their page - that's just how the internet works.
Put your important files (passwords, login credentials, etc.) into a folder outside the web folder - e.g. somewhere directly under C: on Windows - and set that include path in the php.ini file. Then you are pretty safe. You should definitely store your MySQL access credentials outside the htdocs folder and pull them in with includes - check it yourself. Good luck.
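A small sketch of that layout - the paths are examples only, not a required structure:

```php
<?php
// Example layout (illustrative, not required):
//   /var/www/example/htdocs/index.php    <- inside the document root, web-accessible
//   /var/www/example/config/db.php       <- outside the document root, never served
require __DIR__ . '/../config/db.php';    // defines $dbUser, $dbPass, ... (assumed names)

// Or add the folder to include_path in php.ini and just require 'db.php':
//   include_path = ".:/var/www/example/config"
```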
So I am using AJAX to call a server file which uses WordPress to populate a page's content and return it, which I then use to populate fields. What I am confused about is: how do I create the snapshot, and what do I have to do, besides using #!, to let Google know I am creating one? And why do I do this at all? The escaped_fragment part is a little unclear to me too, and I hope I can get a more detailed explanation. Does anyone have any tutorials that walk you through a process similar to what I am doing?
David
Google's crawlers don't typically run your JavaScript. They hit your page, scrape your HTML, and move on. This is much more efficient than loading your page and all of its resources, running your JavaScript, guessing at when everything finished loading, and then scraping data out of the DOM.
If your site uses AJAX to populate the page with content, this is a problem for Google and others. Your page is effectively empty... void of any content... in its HTML state. It requires your JavaScript to fill it in. Since the crawlers don't run your JavaScript, your page isn't all that useful to the crawler.
These days, there are an awful lot of sites that blur the line between web-based applications and content-driven sites. These sites (like yours) require client-side code to run in order to produce the content. Google doesn't have the resources to do this on every site they encounter, but they did provide an option. That's the info you found about escaped fragments.
Google has given you the opportunity to do the work of scraping the full finished DOM for them. They have put the CPU and memory burden of running your JavaScript back on you. You can signal to Google that this is encouraged by using links with #!. Google sees this and knows that it can then request the same page, but convert everything after #! (which isn't sent to the server) to ?_escaped_fragment_= and make a request to your server. At that point, your server should generate a snapshot of the complete finished DOM, after the JavaScript has run.
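As a rough sketch of that hand-off (the snapshot renderer's address and query format below are assumptions, not part of Google's scheme), the PHP front controller can detect _escaped_fragment_ and return the pre-rendered page:

```php
<?php
// Sketch only: divert crawler requests carrying _escaped_fragment_ to a
// snapshot renderer. Its address and query format are assumed, and
// file_get_contents() over HTTP requires allow_url_fopen to be enabled.
if (isset($_GET['_escaped_fragment_'])) {
    $fragment = $_GET['_escaped_fragment_'];   // whatever followed #! in the original URL
    $snapshot = 'http://127.0.0.1:8080/?fragment=' . urlencode($fragment);
    echo file_get_contents($snapshot);         // hand the crawler the rendered DOM
    exit;
}
// Normal visitors fall through to the usual JavaScript-driven page.
```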
The good news is that these days you don't have to hack a lot of code in place to do it. I've written a server to do this using PhantomJS. (I'm trying to get permission to open the source code up, but it's in legal limbo, sorry!) Basically, PhantomJS is a full WebKit browser, but it runs without a GUI. You can use PhantomJS to load your site, run all the JavaScript, and then, when it's ready, scrape the HTML back out of the page and send that version to Google. This doesn't require you to do anything special, other than fixing your routing to point requests containing _escaped_fragment_ at your snapshot server.
You can do this in about 20 lines of code. PhantomJS even has a mini web server built into it, but they recommend not using it for production code.
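To give a feel for those "20 lines", here is a hedged sketch built on PhantomJS's webserver module - the port, target host and fixed wait are placeholders, and as noted above this is not meant for production:

```javascript
// snapshot.js - run with: phantomjs snapshot.js
// Sketch only: the port, the localhost target and the fixed 2s wait are placeholders.
var webserver = require('webserver').create();

webserver.listen(8080, function (request, response) {
    // Turn the ?_escaped_fragment_= form back into the #! URL the app expects.
    var url  = 'http://localhost' + request.url.replace('?_escaped_fragment_=', '#!');
    var page = require('webpage').create();

    page.open(url, function (status) {
        window.setTimeout(function () {
            response.statusCode = (status === 'success') ? 200 : 502;
            response.write(page.content);   // the DOM after the page's JavaScript has run
            response.close();
            page.close();
        }, 2000);
    });
});
console.log('Snapshot server listening on port 8080');
```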
I hope this helps clear up some confusion!
I have created an HTML page which sends custom data to a PHP file, which then processes and evaluates it.
My next task is to make this into a GUI with the requirements:
1. A box for a custom search with a button (it then posts this into the PHP)
2. A box where the XML/JSON request can be seen
3. A box where the XML/JSON response can be seen
4. A box where the parsed version is translated and made to look pretty.
*** It must connect to the internet; PHP establishes the connection, but I don't want the GUI to be an issue.
Any suggestions on programs, languages, etc. which could help me communicate with PHP in GUI form? It needs to be able to access the internet!
I was thinking perhaps Visual Basic, as that's the only one I've ever used that really uses GUIs, but I'm wondering what you all think!
Thanks!
Basically, what you're asking for is a web browser with a very simple HTML/JavaScript front-end page to make the PHP calls and display the results. I'm not entirely sure what it is about a browser environment that makes you think it's unsuitable, but it's exactly what you're asking for.
If a full-blown web browser really isn't suitable, you could try using a web browser control inside a simple GUI app. This would still work exactly the same, but would be without the browser controls, such as the URL bar.
Just use a browser.
If you don't want to do that -- build a browser.
If you are just looking for basically a web-based REST testing tool, try the Firefox RESTClient plug-in.
Why don't you use a framework?
You may take a look here:
AppJS: for Linux, Windows and Mac, using HTML, CSS and JavaScript
Adobe AIR: cross-platform, using ActionScript/Flex or HTML/JavaScript
Titanium: HTML/CSS (no longer supported)
PhoneGap: mainly used for cross-phone-platform development, but here's a Windows implementation of it (you should read the README.md ...)
You may also check this from Mozilla
I have a simple PHP-driven website running and I'm trying to figure out how it treats PHP pages. Some of my PHP documents are routing logic and some are just includes for individual pages. How do I go about making this work offline?
What I thought was that I'd have to re-create the routing logic in JavaScript. Is that my only option? In that case, is it even possible to have the site be driven by PHP while online and switch to JS offline? I can't make sense of it.
If your site is fairly static, HTML5's cache manifest may get you most of the way there. Have PHP output a cache.manifest file in the correct format with all your routing system's URLs and those URLs will be stored locally in a compliant browser. Attempting to access them will pull them out of the cache if possible.
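For example, a small sketch of such a PHP-generated manifest (the URL list is illustrative); you would then point the pages at it with <html manifest="cache.php">:

```php
<?php
// cache.php - sketch of a PHP-generated appcache manifest; the URL list is illustrative.
header('Content-Type: text/cache-manifest');

$urls = array('/', '/about', '/contact', '/css/site.css', '/js/app.js');

echo "CACHE MANIFEST\n";
echo "# v1 - change this comment to force clients to re-download everything\n\n";
echo "CACHE:\n";
foreach ($urls as $url) {
    echo $url, "\n";
}
echo "\nNETWORK:\n*\n";   // anything not listed above requires a connection
```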
If you're looking for something more dynamic, though, you're going to have to do more legwork.
Here's some good info on offline caching.
It is important to remember that PHP is processed on the server. The result of your PHP code is all that is sent to your browser. Your browser has absolutely no knowledge that PHP was even used to make the page!
If you have some dynamic code that must run offline, then you must use JavaScript. If this is just for testing on your own machine, put a web server running PHP on your dev machine and access it via http://localhost.
HTML5 offline caching will not make your pages interactive; it only makes particular pages available offline. Basically, it works on a URL-by-URL basis. If you absolutely need offline functionality, you will be forced to make it work in JS.
Also, make sure your manifest includes all resources used by all pages.
Hope this helps!
It seems obvious that you cannot rely on a server-side scripting language file while caching pages in your browser. PHP/JSP/ASP etc. are all server-side languages: requests for content that needs to be generated dynamically cannot be fulfilled offline, and most importantly there is no server running on the client side. So I think we should go with JS whenever we want to do such things.