I am creating an application through which a user shares a specific post on their Facebook wall or timeline. This is done via the JavaScript SDK and the Facebook Graph API.
I need to collect all the comments and likes on that shared post, whose ID I store in the database.
At the moment I run a cron job that uses the Graph API again to fetch the posts and comments for a specific feed item (ID from the database) on Facebook.
What I want to know is whether there is any way to get real-time updates. For example, when someone comments on the post, Facebook would send a request to my URL, and that URL would save or update the comment in my database.
If not, is my cron approach the best way to do this, or is there another way?
Facebook does indeed give you the ability to get real-time updates, as discussed in this document.
According to this document, however, it doesn't look like you can be notified about the comments or likes on a post. You can only get updates for specific fields/collections of the User object, not for a specific post.
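For the User-object updates that are supported, the callback URL first has to answer a verification request. A minimal sketch of that step, assuming the documented `hub.mode`, `hub.verify_token` and `hub.challenge` query parameters; the function name and the token value are hypothetical:

```php
<?php
/**
 * Returns the challenge string to echo back to Facebook, or null when the
 * request is not a valid subscription verification.
 *
 * PHP rewrites dots in query parameter names to underscores, so
 * "hub.mode" arrives as $query['hub_mode'] when $query is $_GET.
 */
function verification_response(array $query, string $expectedToken): ?string
{
    $mode      = $query['hub_mode'] ?? null;
    $token     = $query['hub_verify_token'] ?? null;
    $challenge = $query['hub_challenge'] ?? null;

    if ($mode === 'subscribe'
        && is_string($token)
        && hash_equals($expectedToken, $token)
        && $challenge !== null
    ) {
        return (string) $challenge;
    }

    return null;
}

// Possible use inside the callback script (token value is made up):
//   $reply = verification_response($_GET, 'my-verify-token');
//   if ($reply !== null) { echo $reply; } else { http_response_code(403); }
```

This only covers the handshake; the update notifications themselves arrive later as POST requests to the same URL.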
There is no way to update this in real time. You can do it with cron, or refresh the comment and like counts when the user clicks a refresh button.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $POST_URL);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.1) Gecko/20100101 Firefox/10.0.1");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
$file_content = curl_exec($ch);
curl_close($ch);
if ($file_content === false) {
    // the request itself failed (network error, timeout, etc.)
} else {
    // a deleted or inaccessible post comes back as a JSON error object
    $post_data = json_decode($file_content, true);
}
Set $POST_URL to https://graph.facebook.com/ followed by the post ID.
$post_data['likes']['count'] will then hold the like count.
$post_data['comments']['count'] will then hold the comment count.
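The decoding step can be wrapped in a small helper that the cron job calls for each stored post ID. This is only a sketch: it assumes the response shape described above (`likes.count` and `comments.count`) and that failures arrive as a JSON object with an `error` key; the function name is hypothetical, and other API versions may return a different structure.

```php
<?php
/**
 * Extracts like and comment counts from a raw Graph API response body.
 * Returns null when the body is not usable (invalid JSON or an error object).
 */
function extract_post_counts(string $json): ?array
{
    $post_data = json_decode($json, true);

    // Invalid JSON decodes to null; an error reply carries an "error" key.
    if (!is_array($post_data) || isset($post_data['error'])) {
        return null;
    }

    return [
        'likes'    => (int) ($post_data['likes']['count'] ?? 0),
        'comments' => (int) ($post_data['comments']['count'] ?? 0),
    ];
}
```

A null result can then be treated as "skip this post for now" rather than overwriting the counts already stored in the database.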
I'm running into an issue with cURL while getting customer review data from Google (without the API). Previously my cURL request worked just fine, but it seems Google now redirects all requests to a cookie consent page.
Below you'll find my current code:
$ch = curl_init('https://www.google.com/maps?cid=4493464801819550785');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
curl_close($ch);
print_r($result);
$result now just prints "302 Moved. The document had moved here."
I also tried setting curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0); but that didn't help either.
Does anyone have an idea how to overcome this? Can I programmatically deny (or accept) Google's cookies somehow? Or is there a better way of handling this?
What you need is the following:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
The above option tells curl to follow redirects. However, I am not sure the result will be of much use for the specific URL you are trying to fetch. With this option you will obtain the HTML source of the final page Google redirects to, but that page contains scripts which, when executed, load the map and the other content ultimately displayed in your browser. If the data you need is loaded by JavaScript, you will not find it in the returned HTML. In that case you should look into a tool like Selenium with PHP (you might take a look at this post).
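To see where the redirects actually end up, the final URL can be read back with curl_getinfo. A minimal sketch; the function names are hypothetical, and the consent host used in the usage comment (consent.google.com) is an assumption about where Google sends such requests:

```php
<?php
/**
 * Fetches a URL, following redirects, and reports where the request ended up.
 */
function fetch_following_redirects(string $url, string $userAgent): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5); // avoid endless redirect loops
    $body     = curl_exec($ch);
    $finalUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);

    return ['body' => $body, 'final_url' => $finalUrl];
}

/**
 * True when the final URL's host matches the given host (case-insensitive).
 */
function landed_on_host(string $finalUrl, string $host): bool
{
    $actual = parse_url($finalUrl, PHP_URL_HOST);

    return is_string($actual) && strcasecmp($actual, $host) === 0;
}

// Possible use:
//   $result = fetch_following_redirects('https://www.google.com/maps?cid=...', $agent);
//   if (landed_on_host($result['final_url'], 'consent.google.com')) {
//       // the body is the consent page, not the page that was requested
//   }
```

Checking the final host makes it obvious when the returned HTML is a consent page rather than the requested content.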
I had a simple parser for an external site. It is needed to confirm that a link submitted by a user leads to an account that user owns (by parsing a link to their profile from the linked page). It worked for a good long while with just this WordPress function:
function fetch_body_url($fetch_link){
$response = wp_remote_get($fetch_link, array('timeout' => 120));
return wp_remote_retrieve_body($response);
}
But then the website changed something in their Cloudflare protection, and now this returns Cloudflare's "Please wait..." page with no way to get past it.
The thing is, I don't even need it done automatically: if there were a captcha, the user could have completed it. But it shows nothing other than an endlessly spinning "checking your browser".
Googled a bunch of curl examples, and best I could get so far is this:
<?php
$url='https://ficbook.net/authors/1000'; //random profile from requested website
$agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36';
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookies.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookies.txt');
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 120);
curl_setopt($ch, CURLOPT_TIMEOUT, 120);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_REFERER, 'https://facebook.com/');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
$response = curl_exec($ch);
curl_close($ch);
echo '<textarea>'.$response.'</textarea>';
?>
Yet it still returns the browser check screen. Adding a random free proxy doesn't seem to work either, or maybe I wasn't lucky enough to find a working one (or couldn't figure out how to set it correctly in this case). Is there any way around it? Or perhaps there is some other way to see whether a specific keyword/link is on the page?
Ok, I've spent most of the day on this problem, and seems like I got it more or less sorted. Not exactly the way I expected, but hey, it works... sort of.
Instead of solving this on the server side, I ended up looking for a solution that parses the page on my own PC (it has better uptime than my hosting's server anyway). It turns out there are plenty of ready-to-use open source scrapers, including ones that know how to bypass Cloudflare being extra defensive for no good reason.
Solution for python dummies like myself:
Install Anaconda if you don't have python installed yet.
In cmd type pip install cloudscraper
Open Spyder (it comes along with Anaconda) and paste this:
import cloudscraper
scraper = cloudscraper.create_scraper()
print(scraper.get("https://your-parse-target/").text)
Save it anywhere and poke at run button to test. If it works, you got your data in the console window of same app.
Replace print with whatever you're gonna do with that data.
For my specific case I also had to install mysql-connector-python and enable remote access to the MySQL database (and my hosting had it available for free all this time, huh?). So instead of directly verifying that the user owns the profile they entered, there's now a queue. That isn't perfect, but oh well, they'll have to wait.
First, the user's request is saved to MySQL. My local Python script checks that table every now and then to see if anything is waiting to be verified. It fetches the page's content and saves it back to MySQL. Then the old PHP parser does its job like before, but reads from MySQL instead of the actual website.
Perhaps there are better solutions that don't require resorting to a separate local parser, but maybe this will help someone running into a similar issue.
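The last step of that queue, the PHP parser checking the stored page for a link to the user's profile, could look roughly like this. It is a sketch, not the original parser: the function name is hypothetical, and it assumes the page HTML has already been read from the MySQL table into a string.

```php
<?php
/**
 * Checks whether a stored HTML page contains an <a> element whose href
 * matches the expected profile URL (ignoring a trailing slash).
 */
function html_contains_link(string $html, string $expectedHref): bool
{
    if (trim($html) === '') {
        return false; // nothing was fetched for this queue entry yet
    }

    // Real-world HTML is rarely valid; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if (rtrim($anchor->getAttribute('href'), '/') === rtrim($expectedHref, '/')) {
            return true;
        }
    }

    return false;
}
```

Because the function only sees a string, it behaves the same whether the HTML came from wp_remote_get or from the queue table.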
I am just playing around trying to learn PHP and decided to write a PHP page that pulls info from the League of Legends boards. The problem I am having is that the site needs me to log in first. I've tried just
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://forums.euw.leagueoflegends.com/board');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.6 (KHTML, like Gecko) Chrome/16.0.897.0 Safari/535.6');
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_REFERER, "http://leagueoflegends.com");
$html = curl_exec($ch);
curl_close($ch);
echo $html;
and I have tried
file_get_contents('http://forums.euw.leagueoflegends.com/board/')
but every time I get nowhere. I was hoping that being logged in on another tab would allow me to get the source of pages on the forums, but that doesn't seem to be the case. I honestly don't even know where to go from here or what I should be searching for to give me a clue. Normally I like to post a little more info, but like I said, I am trying to learn PHP, and I seem to learn best by just jumping in.
First, good luck on your path of learning PHP! Curl is mighty powerful, but lately I've been using Guzzle instead (guzzlephp.org) for its ease of use.
Most sites that have login mechanisms do in fact use sessions or cookies to map users so you are on the right path. What you have above will simply retrieve the main board page. From here, you'll submit a second curl request to login. The login page there is:
https://account.leagueoflegends.com/login
That actually pops up a modal window though and uses a captcha. You'll submit the following form fields:
username
password
recaptcha_response_field
to: https://account.leagueoflegends.com/auth
Since this has a captcha, your best bet may be to log in as yourself, export your cookie data for this domain, and see if you can reuse it in your script. The cookie will expire at some point, so this won't be fully automated.
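Reusing exported cookies can be as simple as sending them in a Cookie header via CURLOPT_COOKIE. A minimal sketch; the helper name is hypothetical, and the cookie names in the usage comment are placeholders, since the real names depend on what the site sets in your browser:

```php
<?php
/**
 * Turns name => value pairs exported from the browser into the
 * "name=value; name2=value2" string that CURLOPT_COOKIE expects.
 */
function build_cookie_header(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }

    return implode('; ', $pairs);
}

// Possible use (cookie names and values are placeholders):
//   $ch = curl_init('http://forums.euw.leagueoflegends.com/board');
//   curl_setopt($ch, CURLOPT_COOKIE, build_cookie_header([
//       'session_cookie_name' => 'value-copied-from-browser',
//   ]));
//   curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
//   $html = curl_exec($ch);
```

When the forum starts returning the login page again, the exported values have expired and need to be copied from the browser once more.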
I have a small web page that, every day, displays a one word answer - either Yes or No - depending on some other factor that changes daily.
Underneath this, I have a Facebook like button. I want this button to post, in the title/description, either "Yes" or "No", depending on the verdict that day.
I have set up the OG metadata dynamically using php to echo the correct string into the og:title etc. But Facebook caches the value, so someone sharing my page on Tuesday can easily end up posting the wrong content to Facebook.
I have confirmed this is the issue by using the Facebook object debugger. As soon as I force a refresh, all is well. I attempted to automate this using curl, but this doesn't seem to work.
$ch = curl_init();
$timeout = 30;
curl_setopt($ch, CURLOPT_URL, "http://developers.facebook.com/tools/lint/?url={http://ispizzahalfprice.com}");
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$data = curl_exec($ch);
curl_close($ch);
echo $data;
Am I missing some easy fix here? Or do I need to re-evaluate my website structure to achieve what I am looking for (e.g. use two separate pages)?
Here's the page in case it's useful: http://ispizzahalfprice.com
Using two separate URLs would be the safe bet. As you have observed, Facebook caches URL scrapes quite heavily. You've also seen that you, as the admin of the app, can flush and refresh Facebook's cache by pulling the page through the debugger again.
Using two URLs solves this issue because Facebook can cache the results as much as it wants: there will always be one URL for "yes" and a separate one for "no", and each keeps the same metadata.
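With two URLs, the page only has to decide which one the like button points at on a given day. A minimal sketch; the /yes and /no paths and the function name are hypothetical, so use whatever URLs your site actually serves:

```php
<?php
/**
 * Picks the URL the like button should share for today's verdict.
 * Each URL serves fixed og:title/og:description values, so Facebook's
 * cached scrape of either one never goes stale.
 */
function share_url_for_verdict(bool $isHalfPrice): string
{
    $base = 'http://ispizzahalfprice.com';

    return $base . ($isHalfPrice ? '/yes' : '/no');
}

// Possible use in the template:
//   <div class="fb-like" data-href="<?= htmlspecialchars(share_url_for_verdict($verdict)) ?>"></div>
```

The main page can keep showing the daily answer; only the shared link changes.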
I am trying to request this URL with cURL so that it automatically adds a product to a basket:
http://www.juno.co.uk/cart/add/440551/01/
When I follow the URL in the browser, it adds the product to the basket.
When I request it with cURL, it doesn't add it.
This is my cURL code:
$url = "http://www.juno.co.uk/cart/add/440551/01/";
$c = curl_init();
curl_setopt($c, CURLOPT_URL,"$url");
$file_path = 'cookies.txt';
curl_setopt($c,CURLOPT_POST,true);
curl_setopt($c, CURLOPT_CONNECTTIMEOUT, 50);
curl_setopt($c,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($c, CURLOPT_RETURNTRANSFER,1);
curl_setopt($c, CURLOPT_COOKIEJAR, $file_path);
$complete = curl_exec($c);
curl_close($c);
Any ideas? CURL is definitely set up on my server as I am successfully using it for other scripts.
You can see the output here http://soundshelter.net/addjuno.php?id=440551 - it is redirecting to the page that I expect it to (i.e. adding the item to basket) but I do not want to redirect the user to this page - only ping the page so that the item is added to basket but the user remains on my page. Any ideas?
Thanks in advance
The cart (or something about it, such as its ID or content) is stored in a session. You would need a custom function to which you can pass the ID of the cart in order to update it.
EDIT:
If this were possible, it would be a security risk (adding items to anybody's cart?).
The user is identified by a session ID. You would need to "steal" it from your visitor and call the URL via curl as if you were the user (I think you can create cookies for the curl session and set the session ID). Of course, this is much the same as stealing cookie/session data, and there are defences against it.
In my opinion, the only possible solution is for juno.co.uk to offer a public API for such operations.
Answer may be as simple as you shouldn't need to POST, that might be causing problems since you aren't sending/specifying any data. What I mean is to comment out that line:
//curl_setopt($c,CURLOPT_POST,true);
sidebar: Can you show the output that you do get?