Fetching content from Website on another Server - php

What I basically want to do is get content from a website and load it into a div on another website. That should be no problem so far.
The problem is that the content to be fetched is located on a different server, and I have no source access to it.
I'd prefer a solution using JavaScript or jQuery.
Can I use an .htaccess redirect to fetch the content from a remote server with client-side (JS) techniques?
I'm open to other solutions as well.
Thanks a lot in advance!

You can't execute an AJAX call against a different domain, due to the same-origin policy. You can, however, add a <script> tag to the DOM that points at a JavaScript file on another domain. If this JS file contains some JSON data that you can use, you're all set.
The only problem is you need to get at the JSON data somehow, which is where JSON-P callbacks come into the picture. If the foreign resource supports JSON-P, it will give you something that looks like
your_callback({ /* JSON data */ });
You then specify your code in the callback.
See JSONP for more.
If JSONP isn't an option, then your best bet is probably to fetch the data server-side, say with a cron job every few minutes, and store it locally on your own site.
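The JSON-P mechanism described above can be sketched roughly as follows. This is only an illustration: the function names are made up, and it assumes the remote service reads the callback name from a query parameter called "callback", which you would need to check against the real service.

```javascript
// Sketch of a JSONP request. The "callback" parameter name is an assumption;
// check what the remote service actually expects.
let jsonpCounter = 0;

// Builds the URL for the <script> tag, including the callback name.
function buildJsonpUrl(baseUrl, params, callbackName) {
  const url = new URL(baseUrl);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  url.searchParams.set("callback", callbackName);
  return url.toString();
}

// Browser-only: injects a <script> tag and hands the data to onData.
function loadJsonp(baseUrl, params, onData) {
  jsonpCounter += 1;
  const callbackName = "jsonpCallback" + jsonpCounter;
  const script = document.createElement("script");
  // The remote script calls this temporary global function with the data.
  window[callbackName] = function (data) {
    delete window[callbackName];
    script.remove();
    onData(data);
  };
  script.src = buildJsonpUrl(baseUrl, params, callbackName);
  document.head.appendChild(script);
}
```

In a page you would call something like loadJsonp("http://example.com/data", { id: 5 }, function (data) { ... }); the URL here is a placeholder.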

You can use a server-side XMLHTTP request to grab the content from the other server. You can then parse it on your server (a.k.a. screen scraping) and serve up the portion you want along with your web page.

If the content from the other website is just an HTML doc that you want to display on your site, you could also use an iframe to pull it in. Your scripts won't have access to any of its content, though, because of browser security rules.

You will likely have to "scrape" the data you need and store it on your server.
This is a great tutorial on how to cache data from an external site. It is actually written to fetch and store XML, so it'll need some modification. Also, if your host doesn't allow file_get_contents on remote URLs, you may have to modify it to use cURL.
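The "fetch once, then serve your stored copy" idea can be sketched like this. It is not the tutorial's code, just a minimal in-memory illustration in JavaScript; createCachedFetcher, fetchFn and ttlMs are hypothetical names, and a real setup would more likely store the copy in a file or database.

```javascript
// Wraps a fetch-like function so repeated requests for the same URL
// reuse the stored result until it is older than ttlMs milliseconds.
// "now" is injectable so the time source can be replaced in tests.
function createCachedFetcher(fetchFn, ttlMs, now = Date.now) {
  const cache = new Map(); // url -> { fetchedAt, promise }
  return function cachedFetch(url) {
    const entry = cache.get(url);
    if (entry && now() - entry.fetchedAt < ttlMs) {
      return entry.promise; // still fresh: no request to the remote site
    }
    const promise = Promise.resolve(fetchFn(url)).catch((error) => {
      // Do not keep failed requests in the cache.
      const current = cache.get(url);
      if (current && current.promise === promise) {
        cache.delete(url);
      }
      throw error;
    });
    cache.set(url, { fetchedAt: now(), promise });
    return promise;
  };
}
```

With a five-minute ttlMs this behaves much like the cron approach: the remote site is contacted at most once per interval, however many visitors you have.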

Related

Parsing - can't get data from PHP file

I'm trying to parse data from http://skytech.si/
I looked around a bit and found out that the site uses http://skytech.si/skytechsys/data.php?c=tabela to show data. When I open this file in my browser I get nothing. Is the file protected so that it can only be run from the server side, or something like that?
Is there any way to get data from it? If I could get HTML data (perhaps in a table?) I would probably know how to parse it.
If not, would it be still possible to parse website and how?
I had a look at the requests made;
http://skytech.si/skytechsys/?c=graf&l=bf0b3c12e9b2c2d65bd5ae8925886b57
http://skytech.si/skytechsys/?c=tabela
Forbidden
You don't have permission to access /skytechsys/ on this server.
This website doesn't allow 'outside' GET requests. You could try fetching the page via file_get_contents, but I don't think you will be able to get specific data tables (aside from those on the home page), because they are loaded through AJAX requests. I believe the /data? endpoint is the controller that handles data which is not exposed via the API.
When you open this URL in your browser, you send a GET request. The data at this address is returned only in response to a POST request with the following parameters: c: tabela, l: undefined, x: undefined. Next time, analyze the headers and look at the Network log if you are using Chrome/Chromium.
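A rough sketch of that POST request, written for server-side JavaScript (it assumes Node 18 or newer for the global fetch). The data.php URL, the form-encoded content type, and sending the literal string "undefined" are assumptions based on the request described above, so compare them with what the Network log actually shows. The same request can be made from PHP with cURL.

```javascript
// Builds the POST request described above.
// Assumption: the endpoint expects form-encoded parameters.
function buildTableRequest() {
  const body = new URLSearchParams({
    c: "tabela",
    l: "undefined",
    x: "undefined",
  });
  return {
    url: "http://skytech.si/skytechsys/data.php", // assumed endpoint
    options: {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      body: body.toString(),
    },
  };
}

// Sends the request and returns the raw response text.
async function fetchTable() {
  const { url, options } = buildTableRequest();
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error("HTTP " + response.status);
  }
  return response.text();
}
```

If the server also checks headers such as Referer or cookies, those would have to be copied from the Network log as well.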
If that website does not expose an API, it is not recommended to parse the data, as their HTML structure is prone to change.
See:
http://php.net/manual/en/function.file-get-contents.php
And then you can interpret it with an HTML-parsing engine or with a regular expression (not recommended).

Get a JavaScript variable from a remote page?

I am currently working on developing an API for a company. Specifically, here's my issue. They have JavaScript arrays on a webpage that their webmaster updates. I have to pull these arrays into either a simple JS script or PHP file and get the contents of these arrays, which I can then arrange according to the API's specifications and output it as JSON.
How do I pull a JavaScript variable in from a remote page in either PHP or jQuery/JS and make it usable for other applications?
No, I don't have access to the company's website. I have to work off of page scraping for this one.
Thank you!
You can't access private JavaScript variables remotely, due to the same-origin policy. They would have to output the arrays in some kind of readable format that you could access using AJAX, probably as JSON.
Edit: As mentioned below, if the array is explicitly defined as text in a JavaScript file, you could grab the contents of that file using cURL in PHP.
If I were you, I'd use PHP to file_get_contents (or cURL, depending on the server config) the page and then parse it based on whatever the markers are to find the value of the variable, assuming it's written out to the page in the first place.
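Once the page source has been downloaded, pulling the array out could look something like the sketch below. extractArrayLiteral is a made-up helper, and it assumes the array is written as a plain literal that also happens to be valid JSON (double-quoted strings, no trailing commas) and that no string inside it contains "];".

```javascript
// Finds "var NAME = [ ... ];" in the page source and parses the literal.
// Returns null when the variable is missing or the literal is not valid JSON.
function extractArrayLiteral(pageSource, variableName) {
  // Escape regex metacharacters in the variable name.
  const escaped = variableName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    "(?:var|let|const)\\s+" + escaped + "\\s*=\\s*(\\[[\\s\\S]*?\\])\\s*;"
  );
  const match = pageSource.match(pattern);
  if (!match) {
    return null;
  }
  try {
    return JSON.parse(match[1]);
  } catch (error) {
    return null;
  }
}
```

The parsed array can then be rearranged and re-encoded as JSON for the API. If the webmaster writes the arrays with single quotes or comments, a real JavaScript parser would be needed instead.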

Get contents of DOM via PHP

I need to get the contents of a website through PHP, however, the content is only available when JavaScript is enabled. The workaround that I am using now is making an applescript to open the website in Safari, and selecting all of the page content, copying it to the clipboard, and pasting it.
That will be really hard to achieve, I guess. If you observe the JS on that page that is responsible for getting the content ready, you may discover it's just another AJAX call that you may be able to make directly from your PHP script.
Best possible solution: ask the website owner for API/export access ;)
If that is not possible, you can only hope that you can analyze the requests that are initiated via JavaScript and imitate them.
(Possible tools: Firefox with Firebug or the Tamper Data plugin.)
Warning: the owner of the website might not like this approach; in fact, scraping the data automatically may be disallowed.
What do you mean by:
the content is only available when JavaScript is enabled
Does the page pull data from somewhere via JS? Would it be easier to analyse where the data is coming from and access that place directly from PHP?

php crawler for website with ajax content and https

I'm trying to grab the content of a website that is based on AJAX and HTTPS, but with no luck.
Is this possible?
The website i'm trying to crawl is this:
https://www.bet3000.com/en/html/home.html#!https://www.bet3000.com/html/en/eventssportsbook.html?category_id=2117
Thanks
If you take a look at the HTTP requests that this page is doing (using, for example, Firebug for Firefox), you'll notice it makes several Ajax requests.
Instead of trying to execute the Javascript code, a possible solution could be for you to request one of those URLs, and get the data -- you'd also not have to parse the HTML, this way.
In this specific case, one of those requests is made to the following URL :
https://www.bet3000.com/ajax/en/sportsbook.json.html?category_id=2117&offset=&live=&sportsbook_id=0
This URL seems to return some JSON data, that should interest you quite a bit ;-)
(There are a few characters before and after the JSON that will need to be removed, but, aside from that, I don't see anything that doesn't look good.)
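Removing those extra characters could be done along these lines. This is only a sketch: parseWrappedJson is a hypothetical helper, and it assumes the wrapper characters sit strictly outside the outermost { } or [ ] pair and contain no brackets of their own.

```javascript
// Cuts out everything before the first "{" or "[" and after the
// matching last "}" or "]", then parses what is left as JSON.
function parseWrappedJson(text) {
  const starts = [text.indexOf("{"), text.indexOf("[")].filter(
    (index) => index !== -1
  );
  if (starts.length === 0) {
    throw new SyntaxError("No JSON object or array found");
  }
  const start = Math.min(...starts);
  const closer = text[start] === "{" ? "}" : "]";
  const end = text.lastIndexOf(closer);
  if (end < start) {
    throw new SyntaxError("JSON start found but no matching end");
  }
  return JSON.parse(text.slice(start, end + 1));
}
```

In PHP the same idea works with strpos, strrpos and substr followed by json_decode.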

Create a php/mysql form to be embedded on many different websites

I am not sure where to start, and would appreciate it if someone could point me in the right direction. I would like to create a simple form 'widget' for embedding on different websites.
The idea is that the form reside on my server, and the form information will be submitted to the database on my server, but will be embedded on other sites.
** The form has dynamic drop down menus that populate based on $_GET variables. For example, if I were using an iframe it would look like this...
<iframe src="http://www.example.com/form.php?id=555"></iframe>
Should I use an iframe or would javascript be better for this, is there a better way? What are the security concerns that I need to look out for?
Your best solution for this would be to use an iframe.
The reason you cannot do this with JavaScript is most browsers' security policy regarding cross-site scripting.
With an iframe, you will be able to provide the end user a URL and then they would be able to position the frame anywhere they'd like. I imagine you would provide a URL with a specific path for each user, or a variable to define the user.
Something like:
<iframe src="http://yourdomain.com/form/?clientid=12345&style=woodgrain"></iframe>
One of the problems with the browser origin policy is that the website owner will not be able to style your forms themselves, nor will they be able to manipulate the DOM within that iframe in any way. This might actually be a blessing or a curse for you, depends on the circumstance.
If you need an action after the form is submitted, you can always have the site use a script with a function that does nothing on the first iteration, but on the second iteration changes the iframe source, or even removes it from the DOM of the parent site. This would be done via an onLoad="" action in the iframe tag.
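The onload idea above might be sketched like this. createLoadHandler and the "widget-frame" id are hypothetical, and the sketch assumes that submitting the form navigates the iframe, so that its load event fires a second time.

```javascript
// Returns a load handler that ignores the first load (the empty form)
// and runs onSubmitted on the second load (the page shown after submitting).
function createLoadHandler(onSubmitted) {
  let loadCount = 0;
  return function handleLoad() {
    loadCount += 1;
    if (loadCount === 2) {
      onSubmitted();
    }
  };
}

// Browser-only wiring; "widget-frame" is a hypothetical element id.
if (typeof document !== "undefined") {
  const frame = document.getElementById("widget-frame");
  if (frame) {
    frame.addEventListener("load", createLoadHandler(() => frame.remove()));
  }
}
```

The handler has to be attached before the iframe finishes loading for the first time, otherwise the count is off by one; using the onLoad attribute on the iframe tag itself avoids that.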
As mentioned above, cross-domain security restrictions limit your alternatives.
There are 4 alternatives I know of to get around this. JSONP is probably the most flexible, but I've included them all here for completeness.
1) iframe is the easiest, but your widget will have limited access to the website that contains it, and vice versa
2) JSONP = most flexible - this works by using the <script> tag. Your server-side code takes a callback parameter and prepends it to any JSON it passes back.
Example in php
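<?php
// A JSONP response is JavaScript, not bare JSON.
header('Content-Type: application/javascript');
$json = array('example' => 'results');
// Accept only identifier-like callback names so the parameter cannot inject script.
$callback = isset($_GET['callback']) ? $_GET['callback'] : 'callback';
if (!preg_match('/^[A-Za-z_$][\w$.]*$/', $callback)) {
    $callback = 'callback';
}
// Wrap the JSON-encoded object in a call to the function named by the 'callback' URL parameter:
echo $callback . '(' . json_encode($json) . ');';
?>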
And the JQuery code would look like this
$.ajax({
    url: 'http://yourserver.com/ajax.php',
    dataType: 'jsonp',
    success: function (data) {
        alert(data.example);
    }
});
Your widget consumers can either copy and paste the JavaScript they need or, better yet, load it directly off of your web server with a script src call.
3) DNS alias - Require all users of your widget to make an entry in their DNS pointing to your server, so it's in the same top-level domain, i.e. point widgetprovider.consumersdomain.com to your server. (You'll need a fixed IP, as setting up a virtual host for all the domains would be troublesome.) You can then load the JavaScript with a script tag as above, but you don't have to worry about JSONP and can use standard AJAX calls to interact with the site.
4) Flash, Silverlight - Can get around the cross-domain policy by including an XML file on your server.
Bonus - I think you'll be able to do this with WebSockets once that rolls out for real.
I've never done anything like that before. But you could use jQuery to load your form from an external link.
$("#feeds").load("feeds.html");
You could use some PHP too.
include 'your external path';
Then your form could look like the following:
<form action="yourExternalActionLink" method="post or get">
some tags...
</form>
I don't think you have any other option other than going with an iFrame.
Most of the modern browsers don't even allow accessing websites other than your own domain using ajax/Javascript.
You have to go with an iframe, as long as you want the stuff to reside on your own server for easy updates.
I haven't actually tried it but there are a lot of techniques to do cross-domain ajax requests. Here's one: http://james.padolsey.com/javascript/cross-domain-requests-with-jquery/ . The javascript solution to this would be something like this:
$.ajax({
    url: 'http://yoursite.com/forms/272.json?param1=23&param3=df',
    type: 'get',
    success: function (response) {
        // populate a form with the response data
    }
});
So you cook up an API on your server that throws back JSON about what the form should look like, pass it whatever params you need. You get JSON back and can build the form however you like. That would be the javascript solution anyway.
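Building the form from that JSON might look roughly like the following. The shape of the JSON (action, fields, type, name, label, options) is invented for this sketch; your API can describe the form however you like.

```javascript
// Escapes text so values from the JSON cannot break out of the HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turns a form description (hypothetical shape) into an HTML string.
function renderFormHtml(spec) {
  const fields = spec.fields.map((field) => {
    const name = escapeHtml(field.name);
    const label = escapeHtml(field.label);
    if (field.type === "select") {
      const options = field.options
        .map((o) => `<option value="${escapeHtml(o)}">${escapeHtml(o)}</option>`)
        .join("");
      return `<label>${label} <select name="${name}">${options}</select></label>`;
    }
    return `<label>${label} <input type="text" name="${name}"></label>`;
  });
  return (
    `<form action="${escapeHtml(spec.action)}" method="post">` +
    fields.join("") +
    `<button type="submit">Send</button></form>`
  );
}
```

Inside the success callback you could then do something like $("#form-container").html(renderFormHtml(response)), where "#form-container" is a placeholder for wherever the widget should appear.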
But as others have mentioned cross-domain ajax isn't something you're supposed to be able to do, or so I'd thought. So if you were interested in trying this way I'd look into YQL (what the mod uses to do this) a bit more: http://developer.yahoo.com/yql/
If you want to do something out of the box... why don't you try Zoho Creator forms?!
It's easy and handy to use.
http://creator.zoho.com
