Grabbing the first image from a Facebook Page using cURL and PHP DOM - php

I am trying to fetch the first image on a Facebook Page. It works on other websites using:
$image = $doc->getElementsByTagName('img')->item(0);
But for some reason, Facebook has wrapped the <div>s I need in a comment, like this:
<code class="hidden_elem" id="u_0_7"><!-- <div class="timelineLoggedOutSignUp"><div class="_5h60" id="pagelet_loggedout_sign_up" data-referrer="pagelet_loggedout_sign_up"></div></div><div class="fbTimelineTopSectionBase fbTimelineLoggedOutTopSection"><div class="_5h60" id="pagelet_above_header_timeline" data-referrer="pagelet_above_header_timeline"></div><div id="above_header_timeline_placeholder"></div><div class="fbTimelineSection mtm fbTimelineTopSection"><div id="fbProfileCover"><div class="cover" id="u_0_4"><a class="coverWrap coverImage" href="https://www.facebook.com/photo.php?fbid=632540440113248&set=a.540825239284769.1073741827.540818775952082&type=1" rel="theater" ajaxify="https://www.facebook.com/photo.php?fbid=632540440113248&set=a.540825239284769.1073741827.540818775952082&type=1&src=https%3A%2F%2Fscontent-b.xx.fbcdn.net%2Fhphotos-ash3%2F579116_632540440113248_872174037_n.png&size=851%2C315&source=10" title="Coverbillede" id="fbCoverImageContainer"><img class="coverPhotoImg photo img" src="https://scon
Note that it is wrapped in a comment: <!-- -->.
Is there some way I can avoid this? Maybe by changing the user agent to an older browser, where they don't use the <!-- --> wrappers? I can do this using CURLOPT_USERAGENT in my cURL settings.
Any ideas? I am quite lost here.

All of this data is available via the Facebook Graph API so you don't need to fiddle around with the DOM or scrape the page - and you don't need to be authenticated to get it. This means you don't need Facebook's SDK or need to worry about registering an application if you are just grabbing public info. Also, Facebook change their HTML all the time so scraping the content will slowly drive you mad.
Here is a quick JS example that gets the cover photo for your page:
$('#GetCoverImage').click(function() {
    $.getJSON(
        'https://graph.facebook.com/EduKarmaDK',
        function(pageData) {
            console.log(pageData.cover.source);
        }
    );
});
Other public info about the page is available in the pageData object. Have a play around with the Graph API Explorer to see what else is available.
PHP example:
<?php
$pageData = json_decode(
    file_get_contents('https://graph.facebook.com/EduKarmaDK')
);
echo($pageData->cover->source);
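If the markup still has to be scraped, one option is to read the comment node inside each hidden <code> element and parse its text as a second document. This is only a minimal sketch: it assumes the page source is already in a string, and the sample HTML, the example.com URL, and the function name firstImageInCommentedMarkup are made up for illustration.

```php
<?php
// Hypothetical sample standing in for the fetched page source.
$pageSource = '<html><body><code class="hidden_elem" id="u_0_7"><!-- '
    . '<div id="fbProfileCover"><img class="coverPhotoImg photo img" '
    . 'src="https://example.com/cover.png"></div> --></code></body></html>';

/**
 * Return the src of the first <img> found inside commented-out markup,
 * or null when no such image exists.
 */
function firstImageInCommentedMarkup(string $pageSource): ?string
{
    // Real-world HTML triggers many parser warnings; collect them quietly.
    libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($pageSource);
    $xpath = new DOMXPath($doc);

    // Each hidden <code> element carries its markup as a comment node.
    $comments = $xpath->query('//code[contains(@class, "hidden_elem")]/comment()');
    foreach ($comments as $comment) {
        $markup = trim($comment->nodeValue);
        if ($markup === '') {
            continue; // nothing to parse in an empty comment
        }

        // Parse the comment text as its own HTML document.
        $inner = new DOMDocument();
        $inner->loadHTML($markup);

        $image = $inner->getElementsByTagName('img')->item(0);
        if ($image !== null) {
            return $image->getAttribute('src');
        }
    }

    return null;
}

echo firstImageInCommentedMarkup($pageSource), PHP_EOL;
```

This keeps the original getElementsByTagName('img')->item(0) approach and only changes where the markup comes from. It will still break whenever Facebook changes its HTML, which is why the Graph API route above is the safer choice.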

Related

Scraping data from a website with Simple HTML Dom

I am working on finishing an API for a website (https://rushwallet.com/) for GitHub.
I am using PHP and attempting to retrieve the wallet address from this URL: https://rushwallet.com/#n3GjsndjdCURphhsqJ4mQH7AjiXlGI.
Can anyone help me?
My code so far:
$url = "https://rushwallet.com/#n3GjsndjdCURphhsqJ4mQH7AjiXlGI";
$open_url = str_get_html(file_get_contents($url));
$content_url = $open_url->find('span[id=btcBalance]', 0)->innertext;
die(var_dump($content_url));
You cannot read the content this way. file_get_contents returns the raw page source, before any JavaScript has run, and the balance is filled in by JavaScript after the page has loaded, so you always read an empty string. The relevant part of the page source is:
฿<span id="btcBalance"></span>
If you want to scrape the data in this case, you need a rendering engine that can execute JavaScript. One option is PhantomJS, a headless browser that can scrape the data after the page has rendered.

Accessing data from external websites to 'create' own application of that data

Wow, I hope I have written the title correctly, because I really have no idea what this is called.
Let me explain what I am looking for.
I have a simple application. And it contains the following:
- Frontpage (salespage)
- Admin area
- Member area
- Database to provide the app with data
I have hosted the basics of this application on a server, let's call it 'www.the.app'. I have written it in PHP using Laravel.
Now I want to use the functions of the app hosted on www.the.app, such as the admin area, the member area, and the frontpage, and create my own database on 'www.awesome.app'.
What would be the best way to make such a thing happen?
I am not looking for direct solutions. I am just looking for information to point me in the right direction to make the above a reality. Anything would be appreciated: information, a name I can search for, whatever is related to this.
And if there is any more information needed, let me know please :)
Here is how you can do it:
1) Log in to the site automatically:
Note: You will probably need data from a site that requires a login, so you can log in automatically through cURL using your user credentials.
To learn how to log in through cURL, take a look at:
Using PHP & Curl to login to my websites form
2) Extract data after logging in through cURL:
Note: After logging in to the site, you need to parse the HTML to extract the data. You can do this with the PHP Simple HTML DOM Parser library.
PHP Simple HTML DOM Parser:
Link: http://simplehtmldom.sourceforge.net/
Sample code for downloading all images from a link:
Note: The following code downloads all the images found at the URL given in the code.
<?php
// Make sure to include the library PHP file
include('simple_html_dom.php');
// URL to download images from
$url = "http://www.facebook.com/";
// Create DOM from URL or file
$html = file_get_html($url);
// Find all images and save each one under a numbered file name
$i = 1;
foreach ($html->find('img') as $element) {
    $imageUrl = $element->src;
    $img = "/my_folder/image_" . $i . ".png";
    file_put_contents($img, file_get_contents($imageUrl));
    $i++;
}
?>
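Step 1 above can be sketched with PHP's cURL extension and a shared cookie file, so that the session started by the login request is reused by the next request. Everything specific here is an assumption: the two URLs, the form field names email and password, and the helper names sessionOptions, request, and fetchMemberPage are hypothetical. A Laravel login form normally also expects a CSRF token, which this sketch leaves out.

```php
<?php
// Hypothetical URLs; replace them with the real login and member pages.
const LOGIN_URL = 'https://www.the.app/login';
const MEMBER_URL = 'https://www.the.app/member';

/** Build cURL options that make every request share one cookie file. */
function sessionOptions(string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,     // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,     // follow the redirect after login
        CURLOPT_COOKIEJAR => $cookieFile,   // cookies are saved here when the handle is released
        CURLOPT_COOKIEFILE => $cookieFile,  // cookies are read from here for each request
    ];
}

/** Run one request and return the response body, or throw on failure. */
function request(string $url, array $options): string
{
    $handle = curl_init($url);
    curl_setopt_array($handle, $options);

    $body = curl_exec($handle);
    if ($body === false) {
        throw new RuntimeException("Request to $url failed: " . curl_error($handle));
    }

    // The handle is released when this function returns, which writes the cookie jar.
    return $body;
}

/** Log in, then fetch a page that is only visible to logged-in users. */
function fetchMemberPage(string $user, string $password): string
{
    $cookieFile = tempnam(sys_get_temp_dir(), 'cookies');

    // Hypothetical field names; copy the real ones from the login form.
    $loginOptions = sessionOptions($cookieFile) + [
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => http_build_query(['email' => $user, 'password' => $password]),
    ];

    request(LOGIN_URL, $loginOptions);
    return request(MEMBER_URL, sessionOptions($cookieFile));
}
```

The string returned by fetchMemberPage could then be handed to str_get_html() from Simple HTML DOM for step 2.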

Simple HTML DOM only returns partial html of website

I had a big PHP script written out to scrape images from this site: "http://www.mcso.us/paid/", but when it didn't work I butchered my code to simply echo the whole page.
I found that the table with the image links I want doesn't show up. I believe it's because the remote site uses ASP to generate the table. Is there a way around this? Am I wrong? Please help.
<?php
include("simple_html_dom.php");
set_time_limit(0);
$baseURL = "http://www.mcso.us/paid/";
$html = file_get_html($baseURL);
echo $html;
?>
There's no obvious reason why their use of ASP would cause this. Have you tried loading the page with JavaScript turned off? It's more likely that the tables are generated through JS.
Do note that the search results are retrieved through Ajax by making a POST request to http://www.mcso.us/paid/default.aspx. You can replicate that request with cURL (http://php.net/manual/en/book.curl.php). To see what to send, open the page in Chrome, right-click, choose Inspect Element, go to the Network tab, and run a search. All the details (POST variables etc.) will show up there.
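A rough sketch of replaying that POST request from PHP follows. The field names lastName and firstName and the helper names buildPostOptions and postSearch are hypothetical; the real variables have to be copied from the Network tab. ASP.NET pages often also expect hidden form fields such as __VIEWSTATE, which would need to be read from the page first and are not handled here.

```php
<?php
const SEARCH_URL = 'http://www.mcso.us/paid/default.aspx';

/** Build the cURL options for a form-encoded POST request. */
function buildPostOptions(array $fields): array
{
    return [
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => http_build_query($fields), // e.g. a=1&b=2
        CURLOPT_RETURNTRANSFER => true,                  // return the body as a string
    ];
}

/** Send the search request and return the raw response body. */
function postSearch(array $fields): string
{
    $handle = curl_init(SEARCH_URL);
    curl_setopt_array($handle, buildPostOptions($fields));

    $body = curl_exec($handle);
    if ($body === false) {
        throw new RuntimeException('Search request failed: ' . curl_error($handle));
    }

    return $body;
}

// Hypothetical usage; the field names are placeholders:
// $html = str_get_html(postSearch(['lastName' => 'Smith', 'firstName' => '']));
```

Keeping the option building separate from the request makes it easy to compare the encoded fields against what the browser sends.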

Get data from a facebook page wall or group wall for use on personal website

I want to connect to a public Facebook page or group and list all entries from the wall on a personal website. I will use PHP on my server, so that would be the best solution for me. JavaScript would also work.
Could anyone explain or perhaps give working code for how to do this? Or just list all the steps necessary for making this work?
If it's possible to handle information about the person, date, description ... for each post, that would be great, so that my layout can be customized.
Thanks for helping me out here!
You need to run FQL on the stream table and provide the ID of the page or group you are interested in as source_id (the Facebook docs have some explanation and examples). Once you have the stream data, you can dig deeper and get the user who left each post, or any other data you need, again through FQL.
There are many ways of running FQL: through the JS API, the PHP API, or the old REST API.
Use the Facebook Graph API URLs that they provide.
Python code using the simplejson parser:
# Python 2 (urllib2)
import urllib
import urllib2
import simplejson

keyword = "old spice"
# URL-encode the keyword so the space survives in the query string
searchurl = 'http://graph.facebook.com/search?q=' + urllib.quote(keyword)
resp = urllib2.urlopen(searchurl)
pageData = resp.read()
result = simplejson.loads(pageData)
posts = result['data']
for p in posts:
    postid = p['id']
    username = p['from']['name']
    posterimg = p['icon']
    comment = p['message']
In JavaScript (jQuery).
You can use my spare access_token for viewing public groups or pages ;)
To get your own access token, the Facebook Graph Explorer can generate one for you (and lets you test queries).
In JavaScript we make a request to the Facebook Graph API, which returns a JSON object. The response looks like this.
The code below iterates through each entry and prints out the message. If you look at the link above, it gives you the naming convention for the other data fields.
for example:
data.data[0].created_time;
data.data[0].from.name;
etc..
Hope that Helps!
<!DOCTYPE html>
<html>
<head>
    <script src="http://code.jquery.com/jquery-1.6.2.min.js"></script>
</head>
<body>
    <ul id="list"></ul>
    <script>
        var graphQuery = 'https://graph.facebook.com/2228101777/feed';
        var authToken = '145634995501895|477bb3c939123a5845afe90d.1-100002565213903|F1VA26jsYL7yBeq2iU6SZX_XXrs';
        // callback=? makes jQuery issue the request as JSONP
        var url = graphQuery + '?access_token=' + authToken + '&callback=?';
        $.getJSON(url, function(data) {
            for (var i = 0; i < data.data.length; i++) {
                $("#list").append('<li>' + data.data[i].message + '</li>');
                // add some more here if needed
            }
        });
    </script>
</body>
</html>
What you are talking about, as far as I can tell, is Web Scraping. What you would do is get the URL of the group, use the file_get_contents($url) command in PHP to get the file, and then analyze it in PHP.
I'd suggest brushing up on your regular expressions for this, as it'll be important to review the HTML that Facebook uses for the wall posts. You'll be able to get the information that you're looking for from the HTML.
I would post some example code, but that's on another computer, far far away. Still, should be a good start.
Edit: Adding in some example code:
// or whatever the URL structure is for the Facebook group
$URL = "http://facebook.com/group=5343242";
$groupPage = file_get_contents($URL);
Here's the link to the PHP pages on Regular Expressions:
http://www.php.net/manual/en/book.pcre.php

Embedded SWF in Facebook with MochiAds Loader, access FB flashvars?

I have a Flash game embedded on Facebook and need access to the flashvars Facebook passes to all embedded games. However, I am using the MochiAds preloader, which means that _root.fb_sig_user is always undefined.
How do I get to the variables?
stage.loaderInfo.parameters.fb_sig_user
Was my best guess and it doesn't seem to have worked.
Try this..
paramList = LoaderInfo(this.root.loaderInfo).parameters;
trace(paramList["fb_sig_user"]);
fb_session = new FacebookSessionUtil("api_key","api_secret", stage.loaderInfo);
