Extract fan database from Facebook page

Extract fan database from Facebook page - php

I am trying to export Facebook Page Fans.
The closest thing I found was this article.
It states:
Getting fans from a Facebook page is
not yet supported by the Facebook API.
Luckily, the Facebook Web interface
uses a simple AJAX/JSON call to supply
the data when you view the page.
And he explains what he does like this:
My strategy to set this data free was
to sniff the network traffic with the
Wireshark tool, then replay the HTTP
calls with a ruby script.
I don't know anything about Ruby, so I started with a PHP script left in one of the comments, the one by "Etienne Bley".
The script goes like this.
The script says you can download Charles Proxy to find these variables:
$cookie
$node_id
$post_form_id
$fb_dtsg
When I use Charles Proxy and log in as administrator, I get this:
And from there I get what I guess is the cookie variable:
BTW, is it safe to share the whole cookie? Would it be helpful? (If it is, I'll edit ASAP.)
The script also says:
// set settings in these 4 lines from results of charles when getting the 2nd page of "Get All Fans" in FB ( you need to be admin of fan page to do this )
I can't understand what he means by "getting the 2nd page".
So, my questions:
1) What are these variables?
2) What are their values? How should/can I get them?
3) Is setting these variables correctly the only thing I need for this script to work?
I hope the question is clear enough, if not please ask any questions you need!
Thanks in advance!

I don't know about Charles Proxy Soft, but I used Chrome's excellent Inspector to trace the request.
Steps:
Use Chrome to navigate to the Facebook Page you're interested in
Open up the Inspector (CTRL+Shift+J on Windows), go to the "Resources" tab and "Enable Resource Tracking".
On the Facebook page, click "See all" in the Fans box on the left side of the page.
Scroll to the bottom of the fan list, and click "Next"
In the Resources tab, you'll have a request to /ajax/social_graph/fetch.php. Click on that, and in the Headers tab you'll see what you need. In my example:
I'm sure you can do that with a hundred different other programs, I find it easier to use Chrome since it's already there :)

Alright, so it seems this is all simple. I recommend getting a copy of Fiddler to inspect this yourself.
I opened up a fan page, went to view the fans, and hit next page. I saw a POST request for http://www.facebook.com/ajax/social_graph/fetch.php?__a=1. What I got back was a really nice JSON array, containing all of the fans.
If we inspect the variables posted, it becomes obvious...
edge_type = fan
page = 1
limit = 100
node_id = 123123123123123123123 (ID of the fan page I'm assuming)
class = FanManager
post_form_id = 97823498723498 (No idea, but I bet you can get this from the dialog)
fb_dtsg = a1s3d5f (No idea)
lsd =
post_form_id_source = AsyncRequest
Anyway, what you are interested in is page and limit. I bet if you set page to 0 and limit to 500 or whatever, you will get what you are looking for. In the event you can't change limit reliably, just leave it at 100 and keep incrementing page. Also, I have my cookies in there, with the session information. How you will get those and post from PHP I don't know, but I hope this gives you some things to go on.
Again, get Fiddler, inspect what happens when you browse the page.
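The request described above could be replayed from PHP roughly as follows. This is a minimal sketch, not a tested client: the field names and the URL are the ones listed in this answer, while the function names `build_fan_request_body` and `fetch_fan_page` are made up for illustration, and every token value has to be copied from your own captured request. It assumes the cURL extension is installed and that the endpoint still behaves as described here.

```php
<?php
// Build the POST body for one page of the fan list.
// Field names come from the captured request above; the values passed in
// are whatever your own capture shows.
function build_fan_request_body($nodeId, $page, $postFormId, $fbDtsg, $limit = 100)
{
    return http_build_query(array(
        'edge_type'           => 'fan',
        'page'                => $page,
        'limit'               => $limit,
        'node_id'             => $nodeId,
        'class'               => 'FanManager',
        'post_form_id'        => $postFormId,
        'fb_dtsg'             => $fbDtsg,
        'lsd'                 => '',
        'post_form_id_source' => 'AsyncRequest',
    ), '', '&');
}

// Send one request with the session cookie copied from the browser.
// Returns the raw response body, or false on failure.
function fetch_fan_page($cookie, $body)
{
    $ch = curl_init('http://www.facebook.com/ajax/social_graph/fetch.php?__a=1');
    curl_setopt_array($ch, array(
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $body,
        CURLOPT_COOKIE         => $cookie,
        CURLOPT_RETURNTRANSFER => true,
    ));
    return curl_exec($ch);
}
```

Following the paging advice above, you would leave limit at 100 and call `build_fan_request_body` with page 0, 1, 2, and so on until a response contains no more fans.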

Related

website screenshot on my server depending on USER

I found many similar questions, but no simple answer. My goal is to create a thumbnail of my web page on my server for a particular user (depending on the SESSION). The website is dynamic, meaning the content changes for every user, much like users' content on Facebook.
What I need to do here is generate a screenshot when the user experiences a problem with the application and clicks the capture button.
I found several options, such as
libwkhtmltox
wkhtmltopdf
but I can't tell which one I should use. Please also suggest others if they are better.
I have a Linux server, use core PHP, and have shell access to it.
Please don't refer me to external sites, as they are unable to take the snapshot in my case (as I said, a SESSION variable is maintained for every user).
Please help me with a tutorial.
Thanks in advance

libwkhtmltox and wkhtmltopdf are both great technologies for capturing images of web pages. However, the problem is that it's really hard, if not impossible, to get these technologies to share the same session as your user. Additionally, many errors users experience aren't reproducible on a second request (errors caused by db connection failures, caching, etc.), so doing something like this will have limited value. One alternative would be to show a popup, when they click your "send error page snapshot" button, that explains how to take a screenshot.
If you absolutely want to go down this path of automating the screenshot, here's a crazy, probably stupidly insecure idea. As wkhtmltopdf is built on webkit, there are options to set cookies. As long as your php session is cookie based, you could pass the user's session_id to wkhtmltopdf, and hijack your own user's session, thereby recreating the page when wkhtmltopdf makes the request. I'm so getting downvoted for suggesting this...
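The cookie idea above could be sketched like this. It is only an illustration under assumptions: `build_snapshot_command` is a hypothetical helper, the binary name and output path depend on your server, and it relies on wkhtmltopdf's `--cookie <name> <value>` option. The security concern raised above still applies, since the session ID ends up on a command line.

```php
<?php
// Build a wkhtmltopdf command that sends the current user's session cookie.
// $binary and the output path are assumptions; adjust them to your server.
function build_snapshot_command($url, $sessionName, $sessionId, $outFile, $binary = 'wkhtmltopdf')
{
    return sprintf(
        '%s --cookie %s %s %s %s',
        escapeshellcmd($binary),
        escapeshellarg($sessionName),
        escapeshellarg($sessionId),
        escapeshellarg($url),
        escapeshellarg($outFile)
    );
}

// Typical use inside the "capture" request handler (not run here):
//   $cmd = build_snapshot_command($pageUrl, session_name(), session_id(), '/tmp/snap.pdf');
//   session_write_close();   // release the session lock before the second request
//   exec($cmd, $output, $exitCode);
```

The `session_write_close()` call matters with PHP's default file-based sessions: the session file stays locked while the capture request is running, so the request made by wkhtmltopdf with the same session ID would otherwise wait for it.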

how to track from where visitor come to my site php

I want to track the URL of the site from which a user reached my site,
i.e. where they came from: Google, Gmail, Facebook, etc.
I tried $_SERVER['HTTP_REFERER'], but it is empty when a user clicks a link to my site from an external site. It only holds a value when I navigate between pages of my own site, and it cannot be trusted either.
So, what can I do from here?
Is there any other way to track the external URL through PHP?
Any idea?
EDIT: HTTP_REFERER now gives me the URL for most sites, but not when the user comes from Gmail or AOL. What could be the cause?

HTTP_REFERER is the only way to get any information about the previous site.
It is also up to the browser whether it supplies that information; most do by default.
It's a header that the browser sets in the request to your server. If it is not present, you will never know where the user came from.
If the browser is sending it and you still do not get anything on the server, check whether you have any code that interferes with the $_SERVER variable.
Try this URL; it's a Google search result that goes to a page that just dumps the HTTP_REFERER.
As the page indicates, if the box lists (none), then your browser is not sending HTTP_REFERER, but if you get a result, then the problem is on your server.
http://www.google.com/url?sa=t&source=web&cd=1&sqi=2&ved=0CBIQFjAA&url=http%3A%2F%2Fkarmak.org%2F2004%2Freftest%2Ftest&rct=j&q=http_referer%20test&ei=cNQ2TdGYGsmUOp_ExPoD&usg=AFQjCNFVSmYmQBUcL2l3_ZpmZzVWZztjWg&cad=rja
You can compare it with loading the page directly, without Google redirecting you:
http://karmak.org/2004/reftest/test
Here is their own start page with link:
http://karmak.org/2004/reftest/
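On the PHP side, reading the header defensively might look like the sketch below. `referrer_host` is a hypothetical helper name. It treats a missing or unparsable header as "unknown" rather than as an error, because the browser decides whether to send the header at all.

```php
<?php
// Return the host of the referring page, or null when the browser sent
// no usable Referer header. Pass $_SERVER in normal use.
function referrer_host(array $server)
{
    if (empty($server['HTTP_REFERER'])) {
        return null;
    }
    $host = parse_url($server['HTTP_REFERER'], PHP_URL_HOST);
    return is_string($host) ? strtolower($host) : null;
}

// Example use in a page (not run here):
//   $from = referrer_host($_SERVER);
//   if ($from === null) { /* direct visit, or the header was withheld */ }
```

Since the value comes from the client, it should be treated as untrusted input, for example escaped before being printed or stored.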

Have you tried it in a variety of browsers? It's down to the browser (As far as I'm aware) to set HTTP_REFERER and sometimes privacy settings can prevent this.

Visitors coming from Google can be tracked using Google Analytics; it gives you the search query terms they used.
This solution also tracks a lot of other things about your visitors. I understand it's not PHP-based, but it's the only other kind of solution I know of if HTTP_REFERER is not enough for you, and since you mentioned Google...

Get Browser Tab Index/Id

So most current browsers have tabs. Is there a way to get the tab index?
Say Tab 1 has www.google.com open and Tab 2 also has www.google.com open. Is there a way to identify which tab index each one has?
Pseudo Code:
if($tab == 2) {
alert "Tab 2 is active\n";
}
if($tab == 1) {
alert "Please use Tab 2 as this is Tab 1\n";
}
Funny as everything I search for about tabs is related to the tab index of the webpage itself, sigh...

Strictly speaking, tabs are on the end user's machine. PHP works on the server. PHP can't see what the end user's machine is doing; it can only serve the end user PHP-generated pages.
Google does this with JavaScript and Cookies. For every instance of the page opened, increment a cookie counter. If the counter > 1, use AJAX to display an error message. Also, prohibit the page from functioning if cookies or JavaScript is disabled.
Look into jQuery.

As far as determining the absolute tab index, I know of no way to do it with Javascript. You can identify windows by their names, but not anything else.
In your example of two tabs containing the same web page, you should be able to uniquely identify them by making them aware of each other. You'd need to use cookies for this. Essentially, when a page is loaded, it would check for a cookie that tells it about other instances of the page that are currently loaded, and make decisions accordingly.
In this scenario, your onload handler would check the cookies, and register the loading page. You'd also need an onunload handler to unset the cookie pertaining to the page being unloaded.
See Javascript communication between browser tabs/windows for more information on how to use cookies to communicate between windows with Javascript.

in php: definitely not - it's executed on your server without access to the client's browser.
maybe there's a solution using javascript (but i've never heard of one, and i'm pretty sure this isn't possible either - at least not as a cross-browser solution).
i think the best chance you'll have (if there even is one) is using other client-side technologies like flash, silverlight or a java plugin, as these can do a lot more than javascript - but i'm sorry, i don't know any of them well enough to give more information or hints.

Don't waste any more time on this, mate. It isn't possible, mainly because a webpage inside a browser cannot get this kind of information due to security restrictions.
Try looking for an alternative approach, as some of the other guys have suggested in their comments.

I am sure there is no global variable that exposes that information. But browsers such as Firefox or Google Chrome might support something like it. I did a quick search on the net and came up with these.
First, check Mozilla Tab Helper, which can work with Mozilla. But remember, this will never be a cross-browser solution. Also, I think there is no cross-browser solution.
Second, if you want this for your own use, then this might be useful, though I haven't tested it. It is an add-on: the Open Tab Count Mozilla Addon.
Open Tab Preview

PHP Redirector Counter for a link

I need a way to count how many times a link is clicked, and I was thinking of creating a PHP script to redirect to that does the counting. Is there a better way to do this? How would I count each time a user visits the link, and would it be best to save the count in a database somewhere? Any suggestions?

Yes, it must be a PHP script - JavaScript for example won't work all the time.
So - instead of a link to
http://some.site.com/page2.php
You would link to
http://some.site.com/redirect.php?page2.php
And in redirect.php you record the click, for example in a database, and at the end send this header:
header("Location: http://some.site.com/".$_SERVER["QUERY_STRING"]);
To redirect to the path after ?...
// yeah - logs might work... a little bit more work, though and it is also very server specific.
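A slightly fuller version of that redirect.php could look like the sketch below. The function names, the whitelist, and the `clicks` table are all assumptions made for illustration. The whitelist is an addition to the answer's approach: it stops the script from redirecting to arbitrary paths taken straight from the query string.

```php
<?php
// Map the query string of redirect.php?page2.php to an allowed target.
// $allowed is a hypothetical whitelist; anything else yields null.
function resolve_target($queryString, array $allowed)
{
    $target = rawurldecode($queryString);
    return in_array($target, $allowed, true) ? $target : null;
}

// Increment the counter for one target.
// Assumes a table: clicks(target VARCHAR PRIMARY KEY, hits INTEGER).
function record_click(PDO $db, $target)
{
    $update = $db->prepare('UPDATE clicks SET hits = hits + 1 WHERE target = ?');
    $update->execute(array($target));
    if ($update->rowCount() === 0) {
        // First click on this target: create its row.
        $insert = $db->prepare('INSERT INTO clicks (target, hits) VALUES (?, 1)');
        $insert->execute(array($target));
    }
}

// In redirect.php itself (not run here):
//   $target = resolve_target($_SERVER['QUERY_STRING'], array('page2.php'));
//   if ($target === null) { http_response_code(404); exit; }
//   record_click($db, $target);
//   header('Location: http://some.site.com/' . $target);
//   exit;
```

Counting before sending the Location header keeps the script simple; if the database is slow, the visitor waits for the write before being redirected.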

I would analyze your web log files as this will work whether it's a static page or a script.
If the page you need to count is a script, you could insert code that updates a table.
Website statistics is a big industry and there are many free and pay solutions out there to explore and get ideas from.

If you need to track clicks on a specific link, then you'll probably need to use JavaScript to capture the click and send a notification to a tracking server. If you need to track page views, then you're best off looking at your server logs. Remember that a page can have many links pointing to it, so you have to differentiate between link click events and page impressions. Another possibility, depending on your application, is to use Google tracking, or a similar third-party tracking app.

how does facebook/digg get all of the images for a webpage?

Do they use a PHP page to analyze the link and return all of the images as JSON?
Is there a way to do this with just JavaScript, so you don't have to go to the server to analyze the page?

I don't know how they do it. I'd implement a small service for that purpose: given a URL, it returns some relevant image (or generates a screenshot). This service could also cache results for better performance. But still, the page needs to be accessed in order to grab the <img src=... or to take the screenshot.

Facebook calls back to the server. If you use Firebug (or, as I did, the Web Inspector in Safari), you can inspect the ajax calls. Facebook calls back to a script at /ajax/composer/attachment.php, and the response contains some JavaScript with HTML that gets inserted into the page. Here is what it looks like in Safari Web Inspector when I point the Facebook attach link dialogue to the BBC News homepage:
Facebook JavaScript response when you attach a link in Safari Web Inspector http://tommorris.org/files/Facebook-20100529-181745.jpg
I put up the full JavaScript response on Gist (it is all one-line and minified originally, so I just flung it through TextMate to wrap it).
I'm not sure if you could do it on the client-side - because of browser protections on cross-site scripting - and even if you could, you probably ought not to because of this potential security problem: imagine if someone puts in a URL that points to a page which only they have access to. You don't necessarily want to put what's on someone else's customised or private page up on your Facebook/Digg type site. Imagine if it was something like Flickr and there were private pics - or worse, a porno site. No, better to proxy it back to your server and then grab the images. Plus, it'll probably be faster. No need to tax your end user's potentially slow connection downloading a page when your server will probably be able to do it quicker...
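The server-side step of grabbing the images could be sketched as below. This is not how Facebook or Digg actually do it, which the answers here don't reveal; it is just one way to pull `<img>` sources out of a page your server has already downloaded. `extract_image_urls` is a hypothetical name, and the sketch assumes PHP's DOM extension is available.

```php
<?php
// Collect the src attribute of every <img> element in an HTML string.
function extract_image_urls($html)
{
    if (trim($html) === '') {
        return array();
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = array();
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = trim($img->getAttribute('src'));
        if ($src !== '') {
            $urls[] = $src;
        }
    }
    return $urls;
}
```

The returned values are whatever the page wrote in its markup, so relative paths such as `/b.jpg` still need to be resolved against the page URL before they are sent back to the browser as JSON.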
