I am using a data scraping (parsing) technique in PHP to get data from a web page with the html_dom class. The page uses AJAX to load more data as you scroll down, but the page source contains only the data that is loaded when the page is first opened.
So my question is: how do I get the full page content, including everything that is loaded through AJAX?
Thanks
If you are using Chrome, open the developer tools (F12). Go to the Network tab, tick the "Preserve log" checkbox, and select the XHR filter to see which requests the page makes to load its data.
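Once the Network tab has revealed the request that delivers the extra data, a script can call that endpoint directly instead of parsing the initial page source. Here is a minimal sketch in JavaScript (the same idea carries over to PHP). The base URL, the "page" parameter name, and the "empty body means no more data" rule are all assumptions; copy the real values from the request you see in the XHR list.

```javascript
// Builds the URL for one page of results.
// "page" is an assumed parameter name; use whatever the real XHR request sends.
function buildPageUrl(baseUrl, page) {
  const url = new URL(baseUrl);
  url.searchParams.set("page", String(page));
  return url.toString();
}

// Requests page after page until the endpoint returns an empty body
// (an assumed end-of-data signal) or maxPages is reached.
// fetchText is any function (url) => Promise<string>.
async function collectAllPages(baseUrl, fetchText, maxPages = 50) {
  const chunks = [];
  for (let page = 1; page <= maxPages; page += 1) {
    const body = await fetchText(buildPageUrl(baseUrl, page));
    if (body.trim() === "") break;
    chunks.push(body);
  }
  return chunks;
}

// A fetcher based on the global fetch (browsers and Node 18+).
const httpGetText = (url) => fetch(url).then((response) => response.text());
```

Calling collectAllPages("https://example.com/items", httpGetText) would then return one string per page, which can be parsed in the same way as the initial page. The example.com URL is a placeholder.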
How can I give feedback to users by showing a loading bar, similar to the one the browser displays in place of the favicon?
The thing is, I am not using JavaScript or AJAX. I am calling an API that takes time to respond, and I just want to give the user feedback by showing a loading bar that stops as soon as the page has finished loading.
In other words, I want to show exactly what the browser shows, a loading animation, but inside the page, so that the user knows what is going on.
How can I do this? I am using PHP on the backend.
I think what you're looking for is a preloader. Check out this tutorial to see the exact code to accomplish it. Essentially, you create a large overlay in HTML/CSS that disappears once JS code detects that the page has finished loading.
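A minimal sketch of that preloader idea, assuming the page contains an overlay element with the id "preloader" (a made-up name) that is styled to cover the viewport:

```javascript
// Builds show/hide controls for an overlay element
// (any object with a style property works, which keeps this testable).
function createPreloader(overlay) {
  return {
    show() {
      overlay.style.display = "block";
    },
    hide() {
      overlay.style.display = "none";
    },
  };
}

// Browser-only wiring; "preloader" is an assumed element id.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  const overlay = document.getElementById("preloader");
  if (overlay) {
    const preloader = createPreloader(overlay);
    preloader.show();
    // The load event fires once the page and its resources have finished loading.
    window.addEventListener("load", () => preloader.hide());
  }
}
```

The script has to run after the overlay element exists, for example from a script tag placed right after it. Note that the overlay can only appear once the browser has received that part of the page, so it covers the wait from that point on.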
I'm looking to get information from an external website by taking it from a div in its markup. But file_get_contents() doesn't work because the information isn't in the page's source code. It only shows up after the page loads (it's available if you use Inspect Element in the web browser).
Is there a way to do this, or am I just out of luck?
I am making an Ajax request to get variables containing HTML code from a PHP file. I am currently using this method to display the content retrieved from the Ajax request:
document.getElementById('my_div').innerHTML = request.responseText;
However, when I view the page source (in Chrome) to see the HTML code on the page, the HTML retrieved from the Ajax response is not shown, even though the effect of that HTML is visible on the page.
Does anyone know how to make the Ajax response show up as HTML in view page source?
View source only shows the HTML of the document as it was when the page was loaded. Any DOM manipulation done after that will not be visible in view source.
You can see it in the Elements tab of the Chrome developer tools (F12), though.
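If you need the current markup as text rather than in the Elements tab, the live DOM can be serialized from script. A small sketch (the function name currentMarkup is made up for illustration):

```javascript
// Returns the markup of the document as it exists right now,
// including nodes that scripts added after the initial load.
function currentMarkup(doc) {
  return doc.documentElement.outerHTML;
}

// In a browser (for example, pasted into the console) this prints the live DOM.
if (typeof document !== "undefined") {
  console.log(currentMarkup(document));
}
```

Unlike view source, this output includes whatever was assigned through innerHTML after the Ajax response arrived.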
Page source is the source as it was when the page was loaded. You cannot get AJAX content to display there.
You can inspect the markup using the WebKit Inspector (right-click -> Inspect Element) in Chrome/Safari or Firebug in Firefox, or use a debugging proxy such as Charles to watch the data sent and received (including any request data and the response HTML).
You can use your browser's development tools to monitor network activity. This usually lets you see the requests your browser has issued for the current web page and the responses it received, including AJAX calls.
I'm trying to code an SEO-friendly Ajax portfolio right now. My goal is to provide JavaScript effects to users and plain HTML to bots and to users without JS.
Files:
index.php (starting point of my program)
aboutme.php (contains html code for "about me")
contact.php (contains html code for "contact")
The idea:
User visits index.php and clicks on "About me" -> loading animation appears -> aboutme.php gets loaded with Ajax -> history.pushState rewrites the URL to aboutme.php.
-> When the user shares the current URL on fb/twitter/g+, the bots will get the correct title, body, etc., as it is a normal HTML page without any JavaScript.
But my problem is: if other users open that page, they see the content directly. I want to show them a loading animation first, until the data has been loaded with Ajax (just as when they click on a link).
How can I achieve this? Thank you very much!
The best way to do this is to create a JavaScript file and reference it with a script tag in the "head" part. That way the JavaScript file is downloaded before the content. The JavaScript shows the animation while the browser is already downloading the content in the background. On the "document ready" event, stop the animation.
This lets bots access the content directly, because the JavaScript won't run for them.
To make your Ajax content crawlable, see https://developers.google.com/webmasters/ajax-crawling/ (Bing supports this as well). Or use HTML5 pushState; see http://www.seomoz.org/blog/create-crawlable-link-friendly-ajax-websites-using-pushstate, https://github.com/blog/760-the-tree-slider etc.
I've always thought this is more effort than it's worth (generally), but to answer your question:
index.php, aboutme.php, and contact.php should each deliver full HTML.
Certain links should have JS event handlers that intercept the click and, instead of loading aboutme.php, load aboutme-content-only.php in the background, then update the DOM, push the state, etc.
This way the site degrades gracefully for first-time visitors, as well as for users whose browsers don't support pushState or JavaScript.
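The approach above could be sketched roughly as follows. The "-content-only.php" naming follows the example in this answer; the "a.ajax" selector and the "content" container id are assumptions made for illustration.

```javascript
// Maps a full page URL to its content-only counterpart,
// e.g. "aboutme.php" -> "aboutme-content-only.php".
function contentOnlyUrl(href) {
  return href.replace(/\.php$/, "-content-only.php");
}

// Browser-only wiring. The "a.ajax" selector and the "content" id are assumed names.
if (typeof document !== "undefined") {
  document.querySelectorAll("a.ajax").forEach((link) => {
    link.addEventListener("click", async (event) => {
      event.preventDefault(); // stop the normal full page load
      const href = link.getAttribute("href");
      try {
        const response = await fetch(contentOnlyUrl(href));
        if (!response.ok) throw new Error("HTTP " + response.status);
        document.getElementById("content").innerHTML = await response.text();
        history.pushState({ href }, "", href); // address bar shows the full page URL
      } catch {
        window.location.assign(href); // fall back to ordinary navigation
      }
    });
  });
}
```

Because the href attributes still point at the full pages, bots and browsers without JavaScript simply follow the links as usual.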
I don't think this is a problem at all. Keep the href of your links as usual, then use JavaScript or jQuery to override the default click behavior so that the linked content is loaded with Ajax.
My website is set up like this: when the page loads, a jQuery animation (in the template) is played which, on completion, makes an Ajax call to fetch the content of the page. Now I want to deploy hashbang URLs like http://com-address/#!page, and I need to retain the animation at page load as well. But I assume the problem with this setup is that when the web crawler visits the page, it doesn't wait for the animation to complete and the Ajax call to be made. It needs the state of the page with the content loaded (which, in my case, is only reached after the animation has completed).
Given the above scenario, which way is better:
1. Change the entire flow: load the page content preemptively and hide it until the animation has played.
2. Only when the hashbang or _escaped_fragment_ is found in the URL:
   a. follow step 1, or
   b. load the page with the content and without the animation.
3. Assume my understanding of the web crawler is incorrect and leave the current flow as it is.
Any heads-up advice?
EDIT
@kdzwinel, thanks for the tip about text browsers!
On second thought, I'll go with option 2(a), because when the crawler visits the resource with a fragmented URL, it should get the full resulting content on the page. And if a user navigates directly to the fragmented URL, the user experience with the animation stays intact too (by removing the content of the dynamic fragment between page load and the start of the animation).
For all other scenarios, we will keep the old flow (animate, then fetch via Ajax), because we don't want to refresh the page: the user is already on the website and navigating seamlessly through anchors with fragmented URLs (whose click events are bound to start the animation).
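For reference, here is a rough sketch of how a hashbang URL relates to the _escaped_fragment_ form under the AJAX crawling scheme, and how a request carrying that parameter might be detected. The function names and the example.com URLs are made up for illustration, and the conversion is only meant for simple fragments without special characters.

```javascript
// Returns the fragment requested via ?_escaped_fragment_=...,
// or null when the parameter is absent (an ordinary visitor).
function escapedFragment(requestUrl) {
  return new URL(requestUrl).searchParams.get("_escaped_fragment_");
}

// Converts a hashbang URL into the form a crawler following the scheme requests.
// URLs without "#!" are returned unchanged.
function toCrawlerUrl(hashbangUrl) {
  const url = new URL(hashbangUrl);
  if (!url.hash.startsWith("#!")) return hashbangUrl;
  const fragment = url.hash.slice(2); // drop the leading "#!"
  url.hash = "";
  url.searchParams.set("_escaped_fragment_", fragment);
  return url.toString();
}
```

With this, toCrawlerUrl("http://example.com/#!page") gives "http://example.com/?_escaped_fragment_=page", and a non-null result from escapedFragment is the cue to serve the page with its content already in place.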
Web crawlers don't execute JavaScript. From the webmaster guidelines:
"If fancy features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing all of your site in a text browser, then search engine spiders may have trouble crawling your site."
If you want your content indexed, go with option #1: load the content when the page is opened and, if the browser supports JavaScript, hide the content and show the animation.
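A minimal sketch of option #1, assuming the content is already in the HTML inside an element with the id "content" (an assumed name). The fixed delay stands in for the real jQuery animation.

```javascript
// Hides content that is already present in the HTML, plays an animation,
// then reveals the content. playAnimation is any function that calls its
// callback when the animation has finished.
function animateThenReveal(content, playAnimation) {
  content.style.visibility = "hidden";
  playAnimation(() => {
    content.style.visibility = "visible";
  });
}

// Browser-only wiring; "content" is an assumed element id.
if (typeof document !== "undefined") {
  const content = document.getElementById("content");
  if (content) {
    // Stand-in for the real animation: a fixed 800 ms delay.
    animateThenReveal(content, (done) => setTimeout(done, 800));
  }
}
```

Clients that do not run JavaScript never execute the hiding step, so they see the content straight away.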