Use php function unless Javascript is enabled - php

I have a site that will scrape for new data on the first page visit. I would like to use AJAX to do this, so that I can present the user with at least some loading.gifs during the scrape, but that is only if Javascript is enabled.
My site utilizes a PHP template engine, so I thought to put the scrape function in <noscript> tags in the html template. Since this would occur after all the PHP code, I would have to reload the page so I can render/parse the scraped data with PHP.
This method seems a little sloppy, I was wondering if there is an efficient way to do this.

Is there any reason why you need to do this with Javascript? Why not just do it with PHP, using cURL to get the content and other PHP extensions such as SimpleXML to get hold of the content? You could even cache the scraped content in a database table for a set amount of time so that each page load doesn't force PHP to do the scraping all over again.

The <noscript> tag will only interpreted by the client (web browser) after the page has been sent, which means it's too late to trigger a php function. I would probably give up on the dual approach if I were you, but there might be some other (hacky) ways to accomplish this...
Maybe you could put an iframe in the <noscript> tag, whose src is no-js separate scraping script.
Or you could test js capability on a landing page, then send people to the page made for their setup.
Possible helpful links:
Check if JavaScript is enabled with PHP
Can I let PHP know that a user does not have javascript enabled?

Related

How to make the 1st part of the site loads first? (Like in Google PageSpeed)

I have a very large site and it takes pretty long time to load. It takes around 120 seconds. What I'm trying to do is loads 1st half of the site loads 1st. Then user can surf while others parts are being loaded.
What I'm trying to do is below.
1st of all is this possible ?
According to my knowledge Yes since Google PageSpeed does that. But the problem is if I use PageSpeed I would have to change my DNS server settings and etc. I would like to do this myself.
How can I get it done ?
What type of technology should I use ?
Given that pages have the .php extension and written in PHP language.
You can use the concept of lazy loading.
You can load only content that is necessary during the load then using jquery and ajax you can load the remaining content.
In this way user can surf and interact easily with the the part already loaded while the other part will be loading asynchronously.
jQuery ajax or post method can help you on this.
A simple example could be,
If There are 5 parts of contents in your page, 2 needs to be loaded immediately
The page will be loaded with 2 parts loaded, so it will take quite less time than 5 parts loading
After document is loaded you will use ajax to load the remaining 3 parts.
Ajax will send request to the specific page of your website(can be possibly named AjaxRequestHandler.php) with some parameters, and this page will process your request and generate html for this and will send it back to your main page which will just show this returned html and this all be happening asynchronously, so the user will be able to communicate with the initially loaded 2 parts
And even if you are new to web technologies, I suppose you have to have the knowledge of atleast ajax and asynchronous calls etc. to achieve lazy loading.
Edit :
For your this question
Except AJAX Is there way around for this?
I think you can try iframes if they can help.
Loading the main content in the page load without iframe while loading other contents in the iframes after pages is loaded.
This jsFiddle
jsfiddle.net/cGDuV/
can help you understand lazy loading with iframe, mentioned in this post of stackoverflow.
You can use javascript for the same if you want to avoid jquery.
You can manipulate the output buffer such that it flushes early thus achieving what your after in the screenshot you posted in your question.
http://www.stevesouders.com/blog/2013/01/31/http-archive-adding-flush/
You can lazyload all your images. Here's a jquery plugin that does it easily
http://www.appelsiini.net/projects/lazyload
You can combine all your js in one file. Same with your css files. This will help the speed.
You can incorporate caching, expires headers and gzip/deflate compression
https://github.com/h5bp/html5-boilerplate/blob/master/dist/.htaccess
I would suggest you load your 3rd party javascript widget garbage (Google+ buttons, fabebook like buttons, social, twitter stuff) in a non blocking asynchronous way so it does not slow down the page in the beginning.
http://css-tricks.com/thinking-async/
Optimize your images as much as possible.
http://kraken.io/
Use a CDN
http://www.maxcdn.com/
Finally test your site and see where is the big bottleneck and where you can improve the site for speed optimization. Use the waterfall chart feature
http://www.webpagetest.org/
One of the things you can do is to load all the essential (top half) of the page normally, then use javascript/ajax to load the second half of the page. This is a very common technique (and is often used to load images).
Here is an excellent tutorial from jQuery for Designers, walking through how to use jQuery to load images asynchronously after the page loads. http://jqueryfordesigners.com/image-loading/
Having said that, a two minute load time seems very excessive. Maybe you should check if there is anything that could be slowing down your server.
You need to determine why the site is loading slow. What is the size of the data you are sending? Google and Firefox have web developer tools to help you determine which elements are taking the longest too load. Once you've determined the culprit, try to load the worst offenders asynchronously.
Check out this article on aync requests: https://segment.io/blog/how-to-make-async-requests-in-php/
in my opinion you need an endless scrolling solution. That is, have a fixed amount of content per "page" (could be an estimated 1500px worth of height). Use jQuery to load another "page" when user scrolls down by a set amount.
If you really want to unconditionally load all the content, just use the same approach, and on document ready trigger the next page to load. The loop the page loader until the whole thing is loaded. That way, you load the first "page", and defer the content "below the fold" to subsequent requests.
What you want is what Facebook does Bigpipe and here is a relevant SO post: Facebook Bigpipe Technique Algorithm
There are other solutions involving all sorts of Javascript but since you want PHP and Facebook uses PHP you should read up on Bigpipe. Juho even has an example written in PHP so that should meet your PHP requirements (but yes it still requires js but not AJAX).
Prefetching Resources the web page require large files for loading can often benefited from changing the order that those files are requested from the server. Sometimes, it makes sense to download files before they are necessary, so that they are instantly available once requested. When the resources required for a page can be loaded in advance, the user-perceived network latency for that page can be significantly reduced or even eliminated. When you run Google pagespeed insights and see the result, you will see how the fix the problems in your website.
Some tips to load site faster:
Make fewer HTTP requests
Add a far-future expires header
Gzip your page's components
Minify your JavaScript, CSS and HTML
One more thing when loading a webpage and if you are using php with smarty you can use this plugin which reduces the number of http requests to you server and makes the site load faster by combining all the js and css resource's request into one single HTTP request.
Alternatively you might be looking for these plugins.
http://masonry.desandro.com/
http://isotope.metafizzy.co/
http://www.wookmark.com/jquery-plugin
Does all this stuff have to be on the same page? Does it make sense to split the content over multiple pages? Can some of it be delayed until the person requests it? Can it be grouped into tabs? Hidden tabs could be lazy loaded for instance.
Give serious thought to restructuring the content in other ways. You might be able to come up with an alternate arrangement that simplifies the problem.
Having in mind all that was mentioned above you may think of caching parts of your data/html code with memcache or in any other way possible so you skip its generation every time. Of course this depends pretty much on how often the data changes.
Don't browsers render the document as it comes in? Whatever you put at the top of the file will be received by the client first, and therefore will be displayed first. For example, when you try to view a very large image file online, it loads from top to bottom. The same is true for web pages. Just put the content you want to load first at the top of the page!
Answer to question one: yes
Answer to question two: above
Answer to question three: Nothing, just put the page in the correct order.
Well the idea is more or less the same as described by Pawan Nogariya above. You will need to fetch views and data asynchronously and then display these. But this means that you will never redirect or post back to any other page rather will get every view via ajaz. This will make you application SPA (Single Page Application) like Gmail. And, this will also mean you need to keep track of what has been renedered and what not, leaving you in a mess. So, instead of doing everything your way there are already developed and popular frameworks available that let you do that but they also make it SPA. Which means that your application doesnt "posts" to the server as in redirection but everything is doen using Ajax.
You can use Backbone (Backbone.js), Knockout (Knockout.js) and may others to achieve this. These are javascript based frameworks that help achieving what you have just asked and may expample and tutorials are also easily available. You can use it with any language as we are using it with C# (MVC) for a relatively large applicaiton.
this is going to be ugly! You should definitely consider using ajax calls to load page fragments AFTER a first content stage is loaded!
This is going to break almost all known web standards, but it might render the website in parts....
this being said: here's the ugly stuff
First: get rid of the <html> tag of your website, start with the <head> DO NOT use a <body> tag either.
Now send your html-code in the order you want it to be loaded (top first) using echo ...
after each closing tag of a group (say </table> or </div>) use flush(); ob_flush(); this will send all known content to the browser immediately.
The browser now decides if it can render the known content or not and if it will (based on the browser specifics and user settings) but with few exceptions it will.
some browsers like to wait for the closing body-tag that's why we dropped it, others even wait for the closing html tag (safari afair) that's why we dropped that too.
If you use the echo-flush scenario wisely you should be able to split the page into renderable parts which most browsers will display without an error.
Again... don't do it this way.. it's bad, ugly and not even near any web standards
But you asked for it.
For your this question
Except AJAX Is there way around for this?
I think you can try iframes if they can help.
Loading the main content in the page load without iframe while loading other contents in the iframes after pages is loaded.
This jsFiddle
jsfiddle.net/cGDuV/
can help you understand lazy loading with iframe, mentioned in this post of stackoverflow.
You can use javascript for the same if you want to avoid jquery.
With pure PHP? Not smart.
$(function() {
$('#body').delay(1).fadeOut();
});
Fiddle example: http://jsfiddle.net/r7MgY/

Get contents of DOM via PHP

I need to get the contents of a website through PHP, however, the content is only available when JavaScript is enabled. The workaround that I am using now is making an applescript to open the website in Safari, and selecting all of the page content, copying it to the clipboard, and pasting it.
That will be really hard to achieve I guess. If you observe the JS on that page that is responsible for getting the content ready, you may discover its just another AJAX call that you may be able to call directly from your PHP script.
best possible solution: ask the website owner for api/export access ;)
If that is not possible, you can only pray that you can analyze the requests that are initialized via JavaScript and imitate them.
(possible tools: firefox with firebug or tamper data plugin).
Warning the owner of the website might not like this approach, in fact, it may be disallowed to scrape the data automatically
What do you mean by:
the content is only available when JavaScript is enabled
Does the page pull data from somewhere via JS? Would it be easier to analyse where the data is coming from and access that place directly from PHP?

How to include the static HTML results of a dynamic Javascript page in PHP?

I have a small script that pulls HTML from another site using Javascript.
I want to include that static HTML that gets pulled in a PHP page without any of the Javascript code appearing in the final PHP page that gets displayed.
I tried doing an include of the file with the Javascript code in the PHP page, but it just included the actual Javascript and not the results of the Javascript.
So how would I go about doing this?
You would need to fetch the page, execute the JavaScript in it, then extract the data you wanted from the generated DOM.
The usual approach to this is to use a web automation tool such as Selenium.
You simply can't.
You need to understand that PHP and Javascript operate on different places, PHP on the server and Javascript on the client.
Your only solution is to change the way all this is done and use "file_get_contents(url)" from PHP to get the same content your javascript used to get. This way, there is no javascript anymore and you can still pre-process your page with distant content.
You wouldn't be able to do this directly from within PHP, since you'd need to run Javascript code.
I'd suggest passing the URL (and any required actions such as click event, etc) to a headless browser such as Phantom or Zombie, and capturing the DOM from it once the JS engine has done it's work.
You could also use a real browser, but of course you don't need a UI in your case, and it might actually get in the way of what you're trying to do, so a headless browser might be better.
This sort of thing would normally be used for automated testing of a site (ie Functional Testing).
There is a PHP tool named Mink which can run these sorts of scripts from within a PHP program. It is aimed at writing test scripts, but I would imagine you could use it for your purposes.
Hope that helps.

How to parse content loaded by javascript after the dom is complete

I have been working on parsing some of the data from the wow armory and have come into a bit of a snag. When it comes to the site serving up the achievements that players have received, it uses javascript to intemperate a string such as #73:1283 to display the requested information. (I made this number up but the data for the requests are formated like this).
Is it possible to pull data from a page that requires javascript to display its data with php?
How do you parse data from a site that has been loaded after the dom is ready or complete using php?
By using Firebug, I was able to look at the HTTP headers to see what AJAX calls were being made to generate the content on these pages: http://us.battle.net/wow/en/character/black-dragonflight/glitchshot/achievement#96:14861 and http://us.battle.net/wow/en/character/black-dragonflight/glitchshot/achievement#96
It looks the page is making an asynchronous call to load this page: http://us.battle.net/wow/en/character/black-dragonflight/glitchshot/achievement/14861 when the part after the hash is 96:14861, and a call to http://us.battle.net/wow/en/character/black-dragonflight/glitchshot/achievement/96 when the part after the hash is just 96. Both of those pages return XML that can be parsed to render HTML.
So generally speaking, if there's just one number after the hash, just put http://.../achievement/<number here> as the URL. If there are two numbers, put the second number at the end of the URL instead.
What you'll need to do, rather than pulling the Javascript and interpreting it, is make HTTP requests to those URLs by yourself in PHP (using cURL, for example) and parse the data on your own.
I would really recommend learning JavaScript and jQuery, since it will be very hard for you to really build a good site that pulls information from the WoW Armory without understanding all the AJAX loads that are going on in the background.
I would recommend seeing if you can replicate the query sent by JavaScript in PHP. While I don't believe there is a way to process JavaScript in PHP, there definitely isn't a simple or scalable way.
I would attempt to scan the first page's source that you downloaded with PHP for strings of that format you mention. Then if the JS on their site is querying something like http://www.wow.com/armory.php?id=#72:1284 you can just download the source of that next. You can find out how the JS is querying the server with something like FireBug or the Inspector in Chrome or Safari.
So in summary:
Check to find the JS URL format and if you can replicate it.
Create PHP to get main page and extract all strings.
Create PHP to loop through these strings and get these pages (with URL that JS requests).
Do whatever you wanted to with that information.
You can try jquery's $(document).onready function which helps
to run java script code when the web page loads up.
ex
<div id="wowoData">#4325325</div>
<script>
$(document).ready(
function(){
$("#wowoData").css("border","1px solid red");
}
)
</script>

how to get page HTML at client side or javascript

how to get page HTML at client side or through javascript in Asp.net Application. Means if I want to get the html of http://www.yahoo.com on client side through javascript or any other
You can't get the HTML source of a page on a different hostname from JavaScript, for security reasons (the Same Origin Policy).
So unless you're Yahoo, you would have to run a proxy on the server-side that will fetch http://www.yahoo.com/ and then return its content to the client side via a string in a <script> block, or in the response to an XMLHttpRequest (also best JSON-encoded). This is known as a cross-domain proxy.
If you mean get the page html as a string in javascript, you can use:
var s = document.body.innerHTML;
Though you need to note that this does not give you the html exactly as sent to the browser, it gives you the html constructed from the DOM - essentially meaning any errors will have been fixed, as well as that it will include any dynamically created elements.
link :
http://www.boutell.com/newfaq/creating/include.html
there are two ways to create client side includes:
JavaScript and iframe. Let's look at the advantages and disadvantages of both before we tackle how to do it.
The JavaScript method is the more seamless of the two. JavaScript code can fetch a fragment of a page from any URL and insert it into another page at any point. The end result looks as good as a server-side include— but only if JavaScript is turned on. And search engines don't see the included text at all, which is a serious problem.
The iframe method is simpler. The iframe element can be used to force a second page to "embed" inside the first page, in much the same way that Flash movies, videos and MP3 players are embedded with the object element. And JavaScript doesn't have to be turned on. But there are disadvantages here too. The iframe element has a fixed width and height, no matter how big the content is. That can mean scrollbars inside your page. And, as of this writing, Google doesn't appear to index the separate page referenced by the iframe so that searchers can find your page.
You use Ajax.
I recommend using the jQuery Ajax javascript library for this.
Do you mean a PHP function similar to file_get_contents($url) ?

Categories