How does google analytics avoid same origin policy? - php

I had an idea for a project involving a Javascript terminal utilising a specified PHP script as a server to carry out remote functions. I understand that the same origin policy would be an obstacle with such a project, but looking at google analytics, which I use every day, it seems they have a way of avoiding the problem on a huge scale.

Google Analytics, Google AdWords and practically all other analytics/web-marketing platforms use <img> tags.
They load their JS programs, those programs handle whatever tracking you put on the page, then they create an image and set the source of the image to be equal to whatever their server's domain is, plus add all of your tracking information to the query string.
The crux is that it doesn't matter how it gets there:
the server is only concerned about the data which is inside of the URL being called, and the client is only concerned about making a call to a specific URL, and not in getting any return value.
Thus, somebody chose <img> years and years ago, and companies have been using it ever since.
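The image technique described above can be sketched in a few lines of JavaScript. This is only an illustration: the endpoint https://tracker.example.com/pixel.gif, the field names, and the helper names buildBeaconUrl and sendPixel are made up for the example and are not Google's actual code.

```javascript
// Build a tracking URL: all of the data travels in the query string.
function buildBeaconUrl(endpoint, data) {
  const url = new URL(endpoint);
  for (const [key, value] of Object.entries(data)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// Fire-and-forget request. The response (a tiny image) is never read,
// so the same-origin policy does not get in the way.
function sendPixel(endpoint, data) {
  const src = buildBeaconUrl(endpoint, data);
  if (typeof Image !== "undefined") {
    new Image().src = src; // in a browser this triggers the cross-origin GET
  }
  return src;
}

// Hypothetical endpoint and fields, for illustration only.
const sent = sendPixel("https://tracker.example.com/pixel.gif", {
  page: "/pricing",
  screen: "1280x1024",
});
```

The server on the other end only has to log the query string of each request it receives.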

The modern way to allow cross-domain requests is for the server to respond with the following header to any requests:
Access-Control-Allow-Origin: *
This allows requests from any host; alternatively, a specific origin can be named instead of *. This mechanism is called Cross-Origin Resource Sharing (CORS). Unfortunately it's not supported in older browsers, so you need workarounds in that case (as a commenter suggested, perhaps by requesting an image).
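As a rough sketch of the "specific host" variant, a server can compare the request's Origin header against an allow-list and echo it back only when it matches. The origins and the corsHeaders helper below are assumptions for the example, and the code is written in JavaScript rather than PHP purely for illustration.

```javascript
// Hypothetical allow-list; sending "*" instead would admit every origin.
const ALLOWED_ORIGINS = new Set([
  "https://app.example.com",
  "https://www.example.com",
]);

// Return the response headers to add for a given request Origin header.
function corsHeaders(requestOrigin) {
  if (requestOrigin && ALLOWED_ORIGINS.has(requestOrigin)) {
    return {
      "Access-Control-Allow-Origin": requestOrigin,
      // Tell caches that the response differs per requesting origin.
      "Vary": "Origin",
    };
  }
  return {}; // no CORS headers: the browser will block cross-origin reads
}
```

An empty result means the browser will refuse to hand the response to a script running on another origin.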

You can load code from third-party sites, but collecting data with it is restricted by the policy.
Google collects data through the "_gaq" command array embedded by the first-party (origin) site, and then sends the collected data as parameters of an HTTP request:
http://www.google-analytics.com/__utm.gif?utmwv=4&utmn=769876874&utmhn=example.com&utmcs=ISO-8859-1&utmsr=1280x1024&utmsc=32-bit&utmul=en-us&utmje=1&utmfl=9.0%20%20r115&utmcn=1&utmdt=GATC012%20setting%20variables&utmhid=2059107202&utmr=0&utmp=/auto/GATC012.html?utm_source=www.gatc012.org&utm_campaign=campaign+gatc012&utm_term=keywords+gatc012&utm_content=content+gatc012&utm_medium=medium+gatc012&utmac=UA-30138-1&utmcc=__utma%3D97315849.1774621898.1207701397.1207701397.1207701397.1%3B...
Google demonstrates clearly how tracking works.
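To see what such a request carries, the query string can simply be parsed back into fields. The sketch below uses a shortened version of the URL above; the comments about what each parameter means are my reading of the names, not documented facts.

```javascript
// A shortened copy of the tracking request shown above.
const hit = new URL(
  "http://www.google-analytics.com/__utm.gif" +
    "?utmwv=4&utmhn=example.com&utmcs=ISO-8859-1" +
    "&utmsr=1280x1024&utmul=en-us&utmac=UA-30138-1"
);

// Turn the query string into a plain object of name/value pairs.
const fields = Object.fromEntries(hit.searchParams);

// Presumably: utmhn = host name, utmsr = screen resolution,
// utmul = browser language, utmac = account ID.
console.log(fields.utmhn, fields.utmsr, fields.utmul, fields.utmac);
```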

Related

Track Web Traffic with PHP

Is there an effective way to track web traffic (or at least the origin of web traffic) with PHP?
I was thinking of using custom canonical links for each search engine and other websites, which would mean anybody who visits mywebsite.com without a parameter is likely direct traffic. But then I would somehow need to change the href value of the link rel='canonical' element for each engine crawler (e.g. https://mywebsite.com/?ref=google, https://mywebsite.com/?ref=duckduckgo, etc), and I'm not exactly sure how to go about this (through robots.txt, meta tags or?).
I really don't want to use Google Analytics if I don't have to. I'd prefer to have all of my analytics under one roof, so to speak, but I'm stuck for ideas on how to achieve this, and most of my searches on SO seem to pull up stuff related to GA.
"Well, I've read all over SO about how in many cases the header can be, and is, simply omitted for various reasons such as AV software, browser extensions, switching from http to https, etc. Is this often the case?"
Yes, this can happen. How often for your particular site's visitors is anyone's guess.
"Does GA rely on the Referer header?"
Not quite... as Google Analytics runs client-side, it's getting that information from document.referrer, which contains the same value as what is sent in the Referer header.
"But I would of course like to have numbers that are as accurate as possible."
With any web analytics, there are things you simply can't measure. The best way is to use a client-side analytics script to send data to your server. There are a handful of reasons why this is better than simply looking at the data you get in the HTTP request data in PHP:
Pages can be cached, so you'll be able to see page loads at times when the browser never even checked in with your server to load the page.
The Performance API is available, allowing you to track specific load timings that you can work to improve on over time.
In most browsers, you can use the Beacon API to get a sense for when the user leaves the page, so you have accurate time-on-page measurements.
"I'd like to get an idea of what traffic is direct and what traffic is not direct, and where non-direct traffic is coming from."
document.referrer is what you want, and gets you as close to accurate as you can get.
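A minimal sketch of how a client-side script might sort visits into direct and non-direct traffic from that value. The classifyReferrer helper and its category names are invented for the example; in a browser you would call it as classifyReferrer(document.referrer, location.hostname) and send the result to your own server.

```javascript
// Classify a visit from the referrer string (document.referrer in a browser).
function classifyReferrer(referrer, ownHost) {
  // An empty referrer usually means a typed URL or bookmark, or that the
  // header was stripped, so "direct" is only a best guess.
  if (!referrer) return { type: "direct", source: null };

  let host;
  try {
    host = new URL(referrer).hostname;
  } catch {
    return { type: "unknown", source: null }; // not a parseable URL
  }

  if (host === ownHost) return { type: "internal", source: host };
  return { type: "referral", source: host };
}

console.log(classifyReferrer("https://duckduckgo.com/?q=widgets", "mywebsite.com"));
```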

Is there any secure way to allow cross-site AJAX requests?

I am currently working on a script that website owners could install that would allow users to highlight a word and see the definition of the word in a small popup div. I am only doing this as a hobby in my spare time and have no intention of selling it or anything, but nevertheless I want it to be secure.
When the text is highlighted it sends an AJAX request to my domain to a PHP page that then looks up the word in a database and outputs a div containing the information. I understand that the same-origin policy prohibits me from accomplishing this with normal AJAX, but I also cannot use JSONP because I need to return HTML, not JSON.
The other option I've looked into is adding
header("Access-Control-Allow-Origin: *");
to my PHP page.
Since I really don't have much experience in security, being that I do this as a hobby, could someone explain to me the security risks in using Access-Control-Allow-Origin: * ?
Or is there a better way I should look into to do this?
Cross-Origin Resource Sharing (CORS), the specification behind the Access-Control-Allow-Origin header field, was established to allow cross-origin requests via XMLHttpRequest while still protecting users from malicious sites reading the response. It does this by providing an interface that lets the server define which cross-origin requests are allowed and which are not. So CORS is more than simply Access-Control-Allow-Origin: *, which denotes that XHR requests are allowed from any origin.
Now to your question: Assuming that your service is public and doesn’t require any authentication, using Access-Control-Allow-Origin: * to allow XHR requests from any origin is secure. But make sure to send that header field only in those scripts to which you want that access policy to apply.
"When the text is highlighted it sends an AJAX request to my domain to a PHP page that then looks up the word in a database and outputs a div containing the information. I understand that the same-origin policy prohibits me from accomplishing this with normal AJAX, but I also cannot use JSONP because I need to return HTML, not JSON."
As hek2mgl notes, JSONP would work fine for this. All you'd need to do is wrap your HTML in a JSONP wrapper, like this:
displayDefinition({"word": "example", "definition": "<div>HTML text...</div>"});
where displayDefinition() is a JS function that shows a popup with the given HTML code (and maybe caches it for later use).
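For illustration, here is roughly how the server side could build such a response. It is written in JavaScript rather than PHP, and the wrapJsonp helper is made up for the example. The one design point worth noting is that the callback name, which normally comes from the request, is validated before it is echoed into the script.

```javascript
// Wrap a payload in a JSONP call to the named callback.
function wrapJsonp(callbackName, payload) {
  // Only accept a plain identifier, so a caller cannot inject arbitrary
  // script through the callback parameter.
  if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
    throw new Error("invalid callback name");
  }
  return `${callbackName}(${JSON.stringify(payload)});`;
}

const body = wrapJsonp("displayDefinition", {
  word: "example",
  definition: "<div>HTML text...</div>",
});
console.log(body);
```

The resulting string would be served with a JavaScript content type and loaded by the page through a script element.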
"The other option I've looked into is adding header("Access-Control-Allow-Origin: *"); to my PHP page. Since I really don't have much experience in security, being that I do this as a hobby, could someone explain to me the security risks in using Access-Control-Allow-Origin: *?"
The risks are essentially the same as for JSONP; in either case, you're allowing any website to make arbitrary GET requests to your script (which they can actually do anyway) and read the results (which, using normal JSON, they generally cannot, although older browsers may have some security holes that can allow this). In particular, if a user visits a malicious website while being logged into your site, and if your site may expose sensitive user data through JSONP or CORS, then the malicious site could gain access to such data.
For the use case you describe, either method should be safe, as long as you only use it for that particular script, and as long as the script only does what you describe it doing (looks up words and returns their definitions).
Of course, you should not use either CORS or JSONP for scripts you don't want any website to access, like bank transfer forms. Such scripts, if they can modify data on the server, generally also need to employ additional defenses such as anti-CSRF tokens to prevent "blind" CSRF attacks where the attacker doesn't really care about the response, but only about the side effects of the request. Obviously, the anti-CSRF tokens themselves are sensitive user-specific data, and so should not be obtainable via CORS, JSONP or any other method that bypasses same-origin protections.
"Or is there a better way I should look into to do this?"
One other (though not necessarily better) way could be for your PHP script to return the definitions as HTML, and for the popups to consist of just an iframe element pointing to the script.
JSONP should fit your needs. It is a widely deployed web technique that aims to solve cross domain issues. Also you should know about CORS which addresses some disadvantages of JSONP. The links I gave you will also contain information about security considerations about these techniques.
You wrote:
but I also cannot use JSONP because I need to return HTML, not JSON.
Why not? You could use a JSONP response like this:
callback({'content':'<div class="myclass">...</div>'});
and then inject result.content into the current page using DOM manipulation.
The concept of CSRF (Cross-Site Request Forgery) may be a concern for you:
http://en.wikipedia.org/wiki/Cross-site_request_forgery
There are multiple ways to limit this issue; the most commonly used technique is a CSRF token.
Further, you should also put an IP-based rate limiter in place, limiting the number of requests that can be made from a certain IP, to mitigate DoS attacks if you become a target. You can find some help in the question "How do I throttle my site's API users?"
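The IP-based rate limiter mentioned above could look something like this fixed-window counter. It is a simplified sketch: createRateLimiter and its parameters are invented for the example, and the in-memory map is never pruned, which a real deployment would have to handle.

```javascript
// Allow at most maxRequests per IP within each window of windowMs milliseconds.
function createRateLimiter(maxRequests, windowMs) {
  const windows = new Map(); // ip -> { start, count }

  return function allow(ip, now = Date.now()) {
    const entry = windows.get(ip);
    // First request from this IP, or the previous window has expired.
    if (!entry || now - entry.start >= windowMs) {
      windows.set(ip, { start: now, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= maxRequests;
  };
}

// Example: at most 60 lookups per minute per IP.
const allowLookup = createRateLimiter(60, 60 * 1000);
console.log(allowLookup("203.0.113.7"));
```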
CORS issues are simple - do you want anyone to be able to remotely AJAX your stuff on your domain? This could be extremely dangerous if you've got forms that are prone to CSRF. Here is an example plucked straight out of my head.
The set-up:
A bank whose online banking server has CORS headers set to accept all (ACAO: *) (call it A)
A legitimate customer who is logged in (call them B)
A hacker who happens to be able to make the client run anything (call it E)
A<->B conversation is deemed lawful. However, if the hacker can manage to make the mark (B) load a site with a bit of JS that can fire off AJAX requests (easy through permanent XSS flaws on big sites), he/she can get B to fire requests to A by JSON, which will be allowed and treated as normal requests!
You could do so many horrible things with this. Imagine that the bank has a form where the inputs are as follows:
POST:
* targetAccountID -> the account that will receive money
* money -> the amount to be transferred
If the mark is logged in, I could inject:
$.ajax({ url: "http://A/money.transfer.php", type: "POST", data: { targetAccountID: 123, money: 9999 } });
And suddenly, anyone who visits the site and is logged in to A will see their account drained of 9999 units.
THIS is why CORS is to be taken with a pinch of salt - in practice, DO NOT open more than you need to open. Open your API and that is it.
A side note: CORS via XMLHttpRequest does not work in anything before IE10 (IE8 and IE9 only offer the more limited XDomainRequest object). So you'll need to build a fallback, possibly iframes or JSONP.
I wrote about this very topic a short while back: http://www.sebrenauld.co.uk/en/index/view/access-json-apis-remotely-using-javascript in a slightly happier form than Wikipedia, by the way. It's a topic I hold dear to my heart, as I've had to contend with API development a couple of times.

Is there a way to send tracking info to Google Analytics from PHP?

I have PHP code that returns an image.
The link is given to third parties, so I need to keep track of where the PHP requests are coming from. Because the PHP only returns the image, I cannot use the JavaScript code for Google Analytics.
I know that I can get the information from the access.log, but I think I can't dump the access.log to GA for analysis, right?
So, is there a way in PHP (e.g. sending a cURL request) to send something to Google Analytics for tracking?
In practice, what GA will do is to issue a HTTP GET for a 1-pixel sized GIF image, in which the GET parameters will contain the information to store in GA servers. If you figure out the format of the GET request, you may be able to store the information you want to. You can use any net monitoring tool or browser plugin of similar functionality (like Firebug, etc) to understand the parameters that are passed to GA servers. These are nowhere to be found in GA documentation, although the architecture of this process is.
In practice, what you're trying to accomplish is the same as enabling GA for a javascript-disabled client. By limiting the information you can provide to GA to the one that the server obtains from browser requests you won't be able to get some detailed info such as the screen resolution, etc. On the bright side, the information that won't be accessible by this method is actually very little (and probably of little significance) and the web is full of resources on using GA for the javascript-impaired that you can use as example, eventually adapting to PHP and to your particular case.
Galvanize is an open source project that does what Miguel is describing. This is the blog post introducing Galvanize.

Possible to use Javascript to get data from other sites?

Is it possible for a web page using Javascript to get data from another website? In my case I want to get it for calculations and graphing a chart. But I'm not sure if this is possible or not due to security concerns. If it is considered a no no but there is a work around I would appreciate being told the work around. I don't want to have to gather this information on the server side if possible.
Any and all help is appreciated.
Learn about JSONP format and cross-site requests (http://en.wikipedia.org/wiki/JSON#JSONP).
You may need to use a "PHP proxy" script on your server side, which fetches the information from the websites and provides it to your JavaScript.
The only reliable way is to let "your" webserver act as a proxy. In PHP you can use the cURL functions (curl_init(), curl_exec()) to fire an HTTP request to an external site and then just echo the response.
You can't pull data from another server due to the same origin policy. You can do some tricks to get around it, such as putting the URL in a <script> tag, but in your case it wouldn't work for just parsing HTML.
Use PHP Simple HTML DOM Parser (simple_html_dom) to parse your data server side. It is much easier than doing it in JavaScript anyway.
A simple way you might be able to do this is to use an inline iframe. If the web page you are getting the data from has no headers, or you can isolate the data being pulled in (to say an image or SWF), this might work.
cross-domain javascript used to be impossible, using a (php-)proxy was a workaround for that.
jsonp changes this entirely, it allows to request javascript from another server (if it has an API that supports jsonp, a lot of the bigger webplayers like google, twitter, yahoo, ... do), specifying the callback-function in your code that needs to be triggered to act on the response.
the response in javascript will contain:
a call to a callback-function you defined
the actual payload as a javascript-object.
frameworks like jquery offer easy support for jsonp out of the box.
once you have the raw data you could tie into google chart tools to create graphs on the fly and insert them in your webapp.
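A bare-bones version of the JSONP mechanism described above, without a framework, might look like this. The endpoint in the usage comment and the query parameter name "callback" are assumptions; each API documents its own parameter name.

```javascript
// Build the request URL, telling the server which function to call back.
function jsonpUrl(endpoint, params, callbackName) {
  const url = new URL(endpoint);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  url.searchParams.set("callback", callbackName); // parameter name varies per API
  return url.toString();
}

// Browser-only: inject a <script> tag and wait for the callback to fire.
function loadJsonp(endpoint, params, onData) {
  const callbackName = "jsonp_cb_" + Date.now();
  const script = document.createElement("script");
  window[callbackName] = (data) => {
    delete window[callbackName]; // clean up the temporary global
    script.remove();
    onData(data); // the payload arrives as a JavaScript object
  };
  script.src = jsonpUrl(endpoint, params, callbackName);
  document.head.appendChild(script);
}

// Usage in a page (hypothetical endpoint):
// loadJsonp("https://api.example.com/quotes", { symbol: "ABC" }, drawChart);
```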
Also worth considering is support for XMLHttpRequest Access Control, which is supported in some modern browsers.
If the service provider that you are trying to access via a web page has this set up, it is a very simple call to XMLHttpRequest and you will get access to the resources on that site without the need for JSONP (especially useful for requests that are not GET, i.e. POST, HEAD etc)

show google map markers with php

Hi everyone!
I'm working on a Google Maps project where the user can type in an address and get nearby restaurants plotted on a Google map.
So far no problems. I've created an AJAX call where the backend outputs XML, and then with jQuery I create the markers.
But now to my problem.
With this AJAX solution anyone can easily, with Firebug or another web developer tool, access the XML result that contains all the names, latitudes and longitudes of the restaurants I have.
I want to somehow protect the data that is shown.
How can I do this?
How can I plot google map markers with php without jquery? Can it be done?
thx in advance!
Google Maps markers for an interactive map (using the GMap2 object in the API) must be created on the client side (in JavaScript) and are therefore vulnerable to reverse engineering of the data.
If you want to generate the map data on the server, then you are limited to static functionality on the client. You can use the Google Static Maps API to build a URL on the server, which includes the information about the markers you want to display and the region that the static map will show. This approach sacrifices some usability for the client (no dynamic zooming, panning, marker popups etc...) to protect your data.
N.B. A determined engineer will still be able to access your data (albeit with some difficulty) by:
Parsing your static maps URL to determine the map region
Analyzing the image data to find markers and determine their locations.
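As a sketch of the static-map approach described above, the server would assemble a single image URL that already contains the markers. The code is JavaScript for illustration only, the staticMapUrl helper and the coordinates are made up, and the parameter names (center, zoom, size, markers, key) follow my understanding of the current Static Maps API, so check them against Google's documentation before relying on this.

```javascript
// Build a Static Maps image URL with one marker per location.
function staticMapUrl(center, zoom, size, markers, apiKey) {
  const url = new URL("https://maps.googleapis.com/maps/api/staticmap");
  url.searchParams.set("center", `${center.lat},${center.lng}`);
  url.searchParams.set("zoom", String(zoom));
  url.searchParams.set("size", size); // e.g. "600x400" in pixels
  // Marker locations are separated by "|" inside a single parameter.
  url.searchParams.set(
    "markers",
    markers.map((m) => `${m.lat},${m.lng}`).join("|")
  );
  url.searchParams.set("key", apiKey);
  return url.toString();
}

// Made-up coordinates and a placeholder key.
const mapUrl = staticMapUrl(
  { lat: 59.33, lng: 18.06 },
  14,
  "600x400",
  [{ lat: 59.33, lng: 18.06 }, { lat: 59.34, lng: 18.07 }],
  "YOUR_API_KEY"
);
```

The page then only needs an img element whose src is that URL.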
The only way to protect the data is to render the map before sending it to the browser. Doing that will take most or all the cool features of google maps away since you'd have to display just an image.
Any data that is accessible by Google Maps is accessible by someone with Firebug.
Some things you can do to make life difficult for someone trying to grab your data:
In your server code, examine the headers to see if the request came from your client page. If the request came from anywhere else, return nothing.
Encode the data that you return from the server. Decode it as late as possible in your client code, so that you only have the plaintext for one restaurant in Javascript variables at any one time. That way someone with Firebug can only directly read one restaurant at a time.
Have your server only return a limited number of locations at once, even if somebody uses Firebug to change the request parameters so that it asks for restaurants within a huge radius. That way they can only grab the cyphertext for that many locations at once to paste into their own code in which they've placed a copy of your decoding function.
Instead of grabbing the cyphertext for even that limited number of locations in a single call, send multiple requests that each return a very small number of locations, with an extra parameter specifying which chunk of restaurants is requested.
It's not foolproof, but grabbing substantial quantities of your data will either take someone a long time or require fairly sophisticated attack techniques, such as spoofing the request headers.
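The "limited number of locations per request" idea from the list above could be enforced on the server along these lines. This is only a sketch in JavaScript: the cap of 5 and the pageOfLocations helper are arbitrary choices for the example.

```javascript
// Hypothetical hard cap on how many locations one request may return.
const MAX_PER_REQUEST = 5;

// Return one chunk of locations, ignoring any larger size the client asks for.
function pageOfLocations(locations, chunkIndex, requestedSize) {
  // Clamp the client-supplied values; never trust the request parameters.
  const size = Math.min(Math.max(1, requestedSize | 0), MAX_PER_REQUEST);
  const index = Math.max(0, chunkIndex | 0);
  const start = index * size;
  return locations.slice(start, start + size);
}

// Twelve dummy restaurants; asking for 1000 still yields only 5.
const restaurants = Array.from({ length: 12 }, (_, i) => ({ id: i }));
console.log(pageOfLocations(restaurants, 0, 1000).length);
```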
Simple answer - you can't.
Long answer
You could draw an image overlay on server-side, kinda like Wikipedia overlay in Google maps, but I don't think it's worth the effort.
You could also store a key in the PHP session, pass it to JavaScript on initial page load, and then refuse to return the data unless it is requested through Ajax with the correct key (which is unique per browser session). This would only protect you from simple bots that don't support cookies. More mess than gain.
Also remember that if someone were to write a competing site using your server as a data source, they would still have to tunnel Ajax requests through their own server, because you can't do cross-domain requests with JavaScript. You would therefore see a lot of requests from the same IP (their web server) in your web logs, and you could easily ban that IP. (Unless they download everything at once and then serve it from their own server.)
And is it really necessary? It's not like restaurant locations are top secret.
