I'm trying to come up with a way to get all the HTML/text that a user sees on any given URL, even though much of what they see may be produced dynamically (on page load, for example) and is not in the initial DOM. My idea is to manually load the JavaScript and put the resulting data back into the page.
My thinking is this:
(naively) return array of all javascript files by scraping all the <script> tags src attribute.
return array of all on-page hard-coded javascripts like: <script> var example = true; </script>
create a function to determine the real URLs encountered in the internal and external page javascripts. For example, when encountering $.ajax({ url: '/relative-js-file.js', it would figure out the absolute URL so PHP can access that page.
using PHP, load all of the javascript that was found on the page in a way that resembles it being loaded on the actual page itself (the page it came from).
take whatever data the javascript returns (plain, html, etc.), and inject this new plain-text and/or HTML back into the original page <body>.
I do realize this will not work a lot of the time, but my hope is that it would at least be a good starting point until I can find a better solution or create a more advanced function to handle unrecognizable/inaccessible javascript. For example, the javascript itself may prevent it from being loaded on any page other than its own.
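The URL-resolution step in the plan above can be sketched with the standard URL class, which applies the same relative-to-absolute rules a browser uses. This is only an illustration in JavaScript (the plan itself uses PHP, where the same rules would apply); the helper name resolveUrl and the example.com addresses are made up for the example.

```javascript
// Resolve a URL found in a page's markup or scripts against the URL of the
// page it came from. Returns null when the input cannot be parsed as a URL.
function resolveUrl(found, pageUrl) {
  try {
    return new URL(found, pageUrl).href;
  } catch (err) {
    return null;
  }
}

// Hypothetical page address, used only for illustration.
const pageUrl = 'http://example.com/articles/page.html';

console.log(resolveUrl('/relative-js-file.js', pageUrl));
// http://example.com/relative-js-file.js
console.log(resolveUrl('js/app.js', pageUrl));
// http://example.com/articles/js/app.js
console.log(resolveUrl('//cdn.example.com/lib.js', pageUrl));
// http://cdn.example.com/lib.js
```

Finding the URL strings inside arbitrary JavaScript is the hard part; this only covers what to do once one has been found.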
My Question
Do you think this is a good approach to getting dynamic content that is not in the DOM, and forcing it in the DOM? Or can you think of a better approach? I appreciate your feedback and thoughts.
Related
I have recently been debating whether to use AJAX for my site navigation, so that only the HTML elements that actually need updating are transferred, or, if there is no significant difference between the new elements and the current ones, to just load the page in its entirety (php generated or static html).
However, I thought that if the new content was not large relative to the current page, perhaps I should send it as a hidden div (via CSS) along with the current page.
This 3rd way seems like a simple solution. For example just send the entire current page + any additional content that might be requested by the user as hidden divs.
When the user selects the content just hide the current content, and display the hidden content...
All in all, each way (normal, Ajax, CSS) would look the same to the user, but a CSS / Javascript solution would provide the quickest interface and be the simplest. Ajax might cut down on download size, for example if the content is never used.
This is a validation question. Is this a valid way to navigate a web application? By hiding/displaying divs using the display property or opacity property to flip through content?
Notes (response to answers)
The Hidden divs would be static data that is not changed by the user. I thought this would be obvious but now I've made it explicit.
Thanks!
You need to think about it this way.
If the hidden content is dynamic (changing) there is a need for AJAX because AJAX would usually be used to fetch updated content from the DB.
If your content is static (not changing) then how much new content are we talking about? Does the size of the new content have a significant impact on render time if it is in a hidden DIV? If it's very little, then I would say use a hidden DIV. If it's a lot, then maybe it's time to consider AJAX to load it in from an external page.
Here is a simple solution to get you started using a hidden DIV:
<script>
function setVisibility(id, visibility){
document.getElementById(id).style.display=visibility;}
</script>
<div id="message1" onclick="setVisibility('message1', 'none');setVisibility('message2', 'inline');">Hey What's Up?</div>
<div style="display:none;" id="message2">Not Much You?</div>
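With more than two blocks of content, the same idea can be generalized so that one call shows the requested div and hides the rest. The following is a rough sketch rather than a drop-in solution; the function name showOnly and the class name "panel" are assumptions made for the example.

```javascript
// Show the panel whose id matches activeId and hide all the others.
// `panels` is any iterable of elements (or element-like objects).
function showOnly(panels, activeId) {
  for (const panel of panels) {
    panel.style.display = panel.id === activeId ? 'block' : 'none';
  }
}

// Browser wiring; skipped when no DOM is available.
if (typeof document !== 'undefined') {
  // Assumes each switchable div carries the (hypothetical) class "panel".
  showOnly(document.querySelectorAll('.panel'), 'message1');
}
```

Calling showOnly(document.querySelectorAll('.panel'), 'message2') from a click handler would then flip to the second message.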
It does depend on the data being shown. Facebook could not do this because the data is updated often, so there would be out-of-sync problems.
Since I do most of the development by myself, having a fully ajaxy site seems like a large task for me, so I usually keep page flips, but some of the content inside the pages is ajax generated.
Say there is a form to create something: I have that form submit in the background (there is an excellent jquery plugin for that), then I have the new data displayed at the top of the list ($.prepend). This way, things are still ajaxed, but not to a level that is hard to manage for a single programmer.
One thing you need to keep in mind is to keep your site as non-breakable and accessible as possible. Because Ajax content isn't on the page, it can't be indexed by spiders, etc. Ajax requires more error testing/catching, etc than "standard" CSS alteration, so make sure you're going to make it work in ALL browsers. There's no excuse for bad or broken navigation.
Your idea is definitely a valid approach. You're basically looking for a trade-off between an initial, longer request, and many shorter ones down the line. As long as the amount of initially hidden content is not gratuitous, the user experience will likely benefit by getting all of the markup first, and hiding/showing it with JavaScript + CSS as needed.
Suggested Approach: Consider using jQuery's hide(), show() or toggle() methods, and combining them with a history manager to emulate real browser history changes as the user navigates page states. This is the approach I use on my website: paislee.net. IMO it's clean and lightning fast because there are no new HTTP requests.
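As a rough illustration of pairing show/hide with browser history, the sketch below uses the plain hashchange event instead of a dedicated history manager library, so it is a simpler stand-in for the approach described, not the same thing. The panel ids and the helper name panelIdFromHash are assumptions.

```javascript
// Map a location hash such as "#about" to a panel id, falling back to a
// default when the hash is empty or names an unknown panel.
function panelIdFromHash(hash, knownIds, fallbackId) {
  const id = hash.replace(/^#/, '');
  return knownIds.includes(id) ? id : fallbackId;
}

// Browser wiring; only runs when a DOM and jQuery are present.
if (typeof window !== 'undefined' && typeof jQuery !== 'undefined') {
  const ids = ['home', 'about', 'contact']; // hypothetical panel ids
  const render = function () {
    const active = panelIdFromHash(window.location.hash, ids, 'home');
    ids.forEach(function (id) {
      // toggle(true) shows the element, toggle(false) hides it.
      jQuery('#' + id).toggle(id === active);
    });
  };
  window.addEventListener('hashchange', render);
  jQuery(render); // run once when the DOM is ready
}
```

Ordinary links such as href="#about" then create history entries, so the back and forward buttons switch panels without any new HTTP request.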
I have been working on parsing some of the data from the WoW Armory and have hit a bit of a snag. When the site serves up the achievements that players have received, it uses javascript to interpret a string such as #73:1283 to display the requested information. (I made this number up, but the data for the requests is formatted like this.)
Is it possible to pull data from a page that requires javascript to display its data with php?
How do you parse data from a site that has been loaded after the dom is ready or complete using php?
By using Firebug, I was able to look at the HTTP headers to see what AJAX calls were being made to generate the content on these pages: http://us.battle.net/wow/en/character/black-dragonflight/glitchshot/achievement#96:14861 and http://us.battle.net/wow/en/character/black-dragonflight/glitchshot/achievement#96
It looks like the page is making an asynchronous call to load this page: http://us.battle.net/wow/en/character/black-dragonflight/glitchshot/achievement/14861 when the part after the hash is 96:14861, and a call to http://us.battle.net/wow/en/character/black-dragonflight/glitchshot/achievement/96 when the part after the hash is just 96. Both of those pages return XML that can be parsed to render HTML.
So generally speaking, if there's just one number after the hash, just put http://.../achievement/<number here> as the URL. If there are two numbers, put the second number at the end of the URL instead.
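The rule above can be written down as a small function. This is a sketch in JavaScript for illustration (the same string handling ports directly to PHP), and it assumes the hash only ever has the one-number or two-number form observed here; the name achievementUrl is made up.

```javascript
// Build the URL the page requests in the background for a given hash.
// "...achievement#96:14861" -> "...achievement/14861"
// "...achievement#96"       -> "...achievement/96"
function achievementUrl(pageUrl) {
  const [base, hash = ''] = pageUrl.split('#');
  const numbers = hash.split(':').filter(Boolean);
  if (numbers.length === 0) {
    return base; // no hash: nothing extra to load
  }
  // With two numbers the second one is used; with one, that one.
  return base + '/' + numbers[numbers.length - 1];
}

const page = 'http://us.battle.net/wow/en/character/black-dragonflight/glitchshot/achievement';
console.log(achievementUrl(page + '#96:14861')); // ends in /achievement/14861
console.log(achievementUrl(page + '#96'));       // ends in /achievement/96
```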
What you'll need to do, rather than pulling the Javascript and interpreting it, is make HTTP requests to those URLs by yourself in PHP (using cURL, for example) and parse the data on your own.
I would really recommend learning JavaScript and jQuery, since it will be very hard for you to really build a good site that pulls information from the WoW Armory without understanding all the AJAX loads that are going on in the background.
I would recommend seeing if you can replicate in PHP the query sent by the JavaScript. I don't believe there is a way to process JavaScript in PHP, and there definitely isn't a simple or scalable one.
I would attempt to scan the first page's source that you downloaded with PHP for strings of that format you mention. Then if the JS on their site is querying something like http://www.wow.com/armory.php?id=#72:1284 you can just download the source of that next. You can find out how the JS is querying the server with something like FireBug or the Inspector in Chrome or Safari.
So in summary:
Check to find the JS URL format and if you can replicate it.
Create PHP to get main page and extract all strings.
Create PHP to loop through these strings and get these pages (with URL that JS requests).
Do whatever you wanted to with that information.
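The "extract all strings" step might look like the sketch below. It is written in JavaScript for illustration, and the pattern assumes the references always look like #number:number, as in the made-up #73:1283 example; the real page source may differ, and the function name is hypothetical.

```javascript
// Collect every distinct "#<number>:<number>" reference found in a page's
// source, in order of first appearance.
function extractAchievementRefs(source) {
  const refs = new Set();
  for (const match of source.matchAll(/#(\d+):(\d+)/g)) {
    refs.add(match[0]);
  }
  return [...refs];
}

// Made-up snippet of page source, for illustration only.
const sample = '<a href="#73:1283">One</a> <a href="#96:14861">Two</a> <a href="#73:1283">One again</a>';
console.log(extractAchievementRefs(sample)); // [ '#73:1283', '#96:14861' ]
```

Requiring the colon keeps things like the CSS colour #000 from being picked up by mistake.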
You can try jQuery's $(document).ready function, which runs JavaScript code once the web page's DOM has loaded.
Example:
<div id="wowoData">#4325325</div>
<script>
$(document).ready(
function(){
$("#wowoData").css("border","1px solid red");
}
)
</script>
I don't know where I got the idea before but a year ago I wrote this in php:
<script type="text/javascript" src="http://www.mydomain.com/getmarkers.php"></script>
Now I'm ready to convert this website to an ASP.NET MVC website and I'm wondering what the best way is to convert this into something more 'normal?'.
The options I could think of were:
Custom HttpHandler for .js files
Keep using the script tag but with a custom route to an action
Modify the javascript to load the serverside data using an ajax call
What the getmarkers.php currently does is generate javascript to add markers to a google map. The benefit of referencing the php inside the script tag is that
It's never cached, the markers are always up to date (I know there are alternatives)
It keeps my html clear of any javascript
Very easy to add/remove certain fields for the generated markers
An example of what is being generated :
infoWindows[0] = new google.maps.InfoWindow({
content: '<div style="width:250px;color:#000;">...html content for this specific marker...</div>'
});
google.maps.event.addListener(markers[0], 'click', function() {
infoWindows[0].open(map, markers[0]);
});
What changes is the index (0 in this example) and the content of the html.
Question
1. What solution do you think fits best?
2. Is it bad to reference a script by calling a serverside 'file'?
I can see no downside to embedding server-generated .js files. I personally would choose the "custom route" option so it's clear it's a generated file, leaving the .js extension to static resources.
However, serving the markers in a pure data format like JSON, and loading them using Ajax would have the advantages that
you have the data in a neutral "meta format" that you can re-use elsewhere without having to build a new data source
you can keep the Loading / HTML generating process in one place, the parent page or one of its static scripts, instead of controlling the way the HTML looks in the server-side script
the amount of transferred data is probably reduced
It's cleaner overall
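A rough sketch of the JSON-plus-Ajax variant is shown below. The route /markers, the field names lat, lng and title, and the existing global map object are all assumptions for illustration; the HTML template is the one from the generated example in the question.

```javascript
// Escape text before placing it inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Turn one marker record into the HTML shown in its info window.
function infoWindowContent(marker) {
  return '<div style="width:250px;color:#000;">' +
    escapeHtml(marker.title) + '</div>';
}

// Browser wiring. Assumes jQuery and the Google Maps API are loaded, a
// `map` object already exists, and a hypothetical route /markers returns
// JSON like [{"lat": 52.1, "lng": 4.3, "title": "..."}].
if (typeof jQuery !== 'undefined' && typeof google !== 'undefined') {
  jQuery.getJSON('/markers', function (markers) {
    markers.forEach(function (data) {
      const marker = new google.maps.Marker({
        position: { lat: data.lat, lng: data.lng },
        map: map
      });
      const infoWindow = new google.maps.InfoWindow({
        content: infoWindowContent(data)
      });
      google.maps.event.addListener(marker, 'click', function () {
        infoWindow.open(map, marker);
      });
    });
  });
}
```

Because each marker and info window is captured in its own closure, the running index from the generated script is no longer needed.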
This question is difficult to explain, but I'm currently passing variables in a php page to some html hidden inputs.
I'm fetching those values from the hidden inputs with a javascript function. This function gets called like this:
<body onload="function();">
It works on my system now, but is there any chance that the value passed from php might not get through because body has called the function BEFORE the php code sets the input type hidden?
Thanks
You may have mixed up which part does what.
PHP generates the HTML page on the server side. When the HTML page arrives at the browser, PHP has done its job. There is no way for PHP to do something after it has rendered the HTML.
Javascript is executed in the user's browser after the page has been generated and loaded. (Or during; as theraccoonbear points out, Javascript can run in the browser before the page has loaded completely.)
A Javascript command can not communicate with the PHP script rendering the page, because when Javascript comes into play, PHP is already gone.
So the answer to your question is: No, the JS function can not execute before PHP is done. As several commentators point out, that is not entirely true. A Javascript could come into action before the input HTML elements have been rendered. In your example however, the Javascript triggers only when the document is completely loaded. In that constellation, the answer is no, it can't happen.
That shouldn't be an issue, as you are using the body's onload property, which will ensure the dom and all images etc have loaded.
Using jQuery to do it like below would be better in my opinion: it fires as soon as the dom is ready, rather than waiting for all images etc.
$(document).ready(function() {
// do stuff here
});
This is also easily done from an external JS file if required, which helps you logically separate your code.
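For comparison, roughly the same behaviour can be had without jQuery through the standard DOMContentLoaded event. This is a minimal sketch; the helper name onReady and the input id "phpValue" are assumptions made for the example.

```javascript
// Run `callback` once the document has been parsed: immediately if parsing
// has already finished, otherwise when DOMContentLoaded fires.
function onReady(doc, callback) {
  if (doc.readyState === 'loading') {
    doc.addEventListener('DOMContentLoaded', callback);
  } else {
    callback();
  }
}

// Browser wiring; skipped when no DOM is available.
if (typeof document !== 'undefined') {
  onReady(document, function () {
    // "phpValue" is a hypothetical id for a hidden input written by PHP.
    const input = document.getElementById('phpValue');
    if (input) {
      console.log(input.value);
    }
  });
}
```

Either way, the hidden inputs are part of the HTML that PHP has already sent, so they exist by the time the callback runs.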
How can I get a page's HTML on the client side, through javascript, in an ASP.NET application? That is, if I want to get the HTML of http://www.yahoo.com on the client side through javascript or any other
You can't get the HTML source of a page on a different hostname from JavaScript, for security reasons (the Same Origin Policy).
So unless you're Yahoo, you would have to run a proxy on the server-side that will fetch http://www.yahoo.com/ and then return its content to the client side via a string in a <script> block, or in the response to an XMLHttpRequest (also best JSON-encoded). This is known as a cross-domain proxy.
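If the proxy hands the fetched HTML back as a string inside a <script> block, a literal </script> in that HTML would end the block early. One common precaution, sketched here with a made-up helper name, is to JSON-encode the string and additionally escape every "<"; the proxy itself is not shown and could be written in any server-side language.

```javascript
// JSON-encode a value so that the result can be embedded in a <script>
// block: every "<" is written as the escape \u003c, which JSON parsers and
// JavaScript both read back as "<".
function toScriptSafeJson(value) {
  return JSON.stringify(value).replace(/</g, '\\u003c');
}

// Made-up fragment standing in for the HTML fetched by the proxy.
const fetched = '<p>Hello</p><script>alert(1)</script>';
const encoded = toScriptSafeJson({ html: fetched });

console.log(encoded.includes('<'));               // false
console.log(JSON.parse(encoded).html === fetched); // true
```

When the content is returned as the response to an XMLHttpRequest instead, plain JSON.stringify is enough.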
If you mean get the page html as a string in javascript, you can use:
var s = document.body.innerHTML;
Though you need to note that this does not give you the html exactly as sent to the browser, it gives you the html constructed from the DOM - essentially meaning any errors will have been fixed, as well as that it will include any dynamically created elements.
link :
http://www.boutell.com/newfaq/creating/include.html
there are two ways to create client side includes:
JavaScript and iframe. Let's look at the advantages and disadvantages of both before we tackle how to do it.
The JavaScript method is the more seamless of the two. JavaScript code can fetch a fragment of a page from any URL and insert it into another page at any point. The end result looks as good as a server-side include— but only if JavaScript is turned on. And search engines don't see the included text at all, which is a serious problem.
The iframe method is simpler. The iframe element can be used to force a second page to "embed" inside the first page, in much the same way that Flash movies, videos and MP3 players are embedded with the object element. And JavaScript doesn't have to be turned on. But there are disadvantages here too. The iframe element has a fixed width and height, no matter how big the content is. That can mean scrollbars inside your page. And, as of this writing, Google doesn't appear to index the separate page referenced by the iframe so that searchers can find your page.
You use Ajax.
I recommend using the jQuery Ajax javascript library for this.
Do you mean a PHP function similar to file_get_contents($url) ?